{"id":2389,"date":"2024-06-04T12:28:04","date_gmt":"2024-06-04T12:28:04","guid":{"rendered":"https:\/\/mvermeulen.org\/perf\/?page_id=2389"},"modified":"2024-06-07T00:50:29","modified_gmt":"2024-06-07T00:50:29","slug":"531-deepsjeng_r","status":"publish","type":"page","link":"https:\/\/mvermeulen.org\/perf\/workloads\/cpu2017\/531-deepsjeng_r\/","title":{"rendered":"531.deepsjeng_r"},"content":{"rendered":"\n<p>deepsjeng is a SPEC CPU(R) benchmark written in C++ and described <a href=\"https:\/\/spec.org\/cpu2017\/Docs\/benchmarks\/531.deepsjeng_r.html\">here<\/a>. The workload runs on all logical cores.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/06\/systemtime-24.png\" alt=\"\" class=\"wp-image-2462\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/06\/systemtime-24.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/06\/systemtime-24-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/06\/systemtime-24-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>Topdown profile shows a workload with mixed levels of backend stalls, frontend stalls and retiring instructions.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/06\/amdtopdown-25.png\" alt=\"\" class=\"wp-image-2463\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/06\/amdtopdown-25.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/06\/amdtopdown-25-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/06\/amdtopdown-25-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>AMD metrics on 7840 confirm the balance between frontend, backend and retiring instructions.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed              811.223\non_cpu               0.986          # 15.78 \/ 16 cores\nutime                12773.395\nstime                28.115\nnvcsw                18695          # 14.28%\nnivcsw               112210         # 85.72%\ninblock              0              # 0.00\/sec\nonblock              30064          # 37.06\/sec\ncpu-clock            12802138749308 # 12802.139 seconds\ntask-clock           12802230919689 # 12802.231 seconds\npage faults          9202017        # 718.782\/sec\ncontext switches     130342         # 10.181\/sec\ncpu migrations       155            # 0.012\/sec\nmajor page faults    1031           # 0.081\/sec\nminor page faults    9200986        # 718.702\/sec\nalignment faults     0              # 0.000\/sec\nemulation faults     0              # 0.000\/sec\nbranches             9272134301352  # 123.840 branches per 1000 inst\nbranch misses        370304637120   # 3.99% branch miss\nconditional          7301483191614  # 97.520 conditional branches per 1000 inst\nindirect             67884778718    # 0.907 indirect branches per 1000 inst\ncpu-cycles           52817901328679 # 4.08 GHz\ninstructions         74866236272888 # 1.42 IPC\nslots                105654379952364 #\nretiring             24901505427526 # 23.6% (30.2%)\n-- ucode             337138071      #     0.0%\n-- fastpath          24901168289455 #    23.6%\nfrontend             24245524942663 # 22.9% (29.4%)\n-- latency           15873166991802 #    15.0%\n-- bandwidth         8372357950861  #     7.9%\nbackend              28818020778073 # 27.3% (34.9%)\n-- cpu               3887120280115  #     3.7%\n-- memory            24930900497958 #    23.6%\nspeculation          4558465612069  #  4.3% ( 5.5%)\n-- branch mispredict 4439266810809  #     4.2%\n-- pipeline restart  119198801260   #     0.1%\nsmt-contention       23130785988881 # 21.9% ( 0.0%)\ncpu-cycles           52806133923811 # 4.07 GHz\ninstructions         74880184852645 # 1.42 IPC\ninstructions         24960402824569 # 23.537 l2 access per 1000 inst\nl2 hit from l1       449020291198   # 4.85% l2 miss\nl2 miss from l1      17110339701    #\nl2 hit from l2 pf    127086650332   #\nl3 hit from l2 pf    1955075744     #\nl3 miss from l2 pf   9419542641     #\ninstructions         24950191292383 # 21.274 float per 1000 inst\nfloat 512            240            # 0.000 AVX-512 per 1000 inst\nfloat 256            6813033600     # 0.273 AVX-256 per 1000 inst\nfloat 128            523978621400   # 21.001 AVX-128 per 1000 inst\nfloat MMX            0              # 0.000 MMX per 1000 inst\nfloat scalar         9              # 0.000 scalar per 1000 inst\ninstructions         74868153303309 #\nopcache              13003747392264 # 173.689 opcache per 1000 inst\nopcache miss         2298953449524  # 17.7% opcache miss rate\nl1 dTLB miss         33051193481    # 0.441 L1 dTLB per 1000 inst\nl2 dTLB miss         17262144796    # 0.231 L2 dTLB per 1000 inst\ninstructions         74868187361005 #\nicache               3249314680210  # 43.400 icache per 1000 inst\nicache miss          532103183743   # 16.4% icache miss rate\nl1 iTLB miss         139714605      # 0.002 L1 iTLB per 1000 inst\nl2 iTLB miss         0              # 0.000 L2 iTLB per 1000 inst\ntlb flush            67264          # 0.000 TLB flush per 1000 inst\n<\/code><\/pre>\n\n\n\n<p>Process summary shows time spent in deepsjeng_r_bas<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>581 processes\n\t 48 deepsjeng_r_bas      12714.00    22.13\n\t 69 specperl                 9.48     1.49\n\t  1 lsb_release              0.01     0.00\n\t 11 ps                       0.00     0.01\n\t  1 clang++                  0.00     0.01\n\t173 sh                       0.00     0.00\n\t 54 specrxp                  0.00     0.00\n\t 48 bash                     0.00     0.00\n\t 41 specinvoke               0.00     0.00\n\t 21 grep                     0.00     0.00\n\t 20 cat                      0.00     0.00\n\t 12 uniq                     0.00     0.00\n\t 11 sort                     0.00     0.00\n\t 10 expand                   0.00     0.00\n\t  6 pwd                      0.00     0.00\n\t  5 basename                 0.00     0.00\n\t  5 specmake                 0.00     0.00\n\t  5 systemctl                0.00     0.00\n\t  4 specpp                   0.00     0.00\n\t  4 uname                    0.00     0.00\n\t  3 dirname                  0.00     0.00\n\t  3 dmidecode                0.00     0.00\n\t  3 lscpu                    0.00     0.00\n\t  2 df                       0.00     0.00\n\t  2 dpkg                     0.00     0.00\n\t  2 rm                       0.00     0.00\n\t  2 runcpu                   0.00     0.00\n\t  2 specsha512sum            0.00     0.00\n\t  2 specxz                   0.00     0.00\n\t  2 who                      0.00     0.00\n\t  1 cpupower                 0.00     0.00\n\t  1 head                     0.00     0.00\n\t  1 logname                  0.00     0.00\n\t  1 ls                       0.00     0.00\n\t  1 numactl                  0.00     0.00\n\t  1 sysctl                   0.00     0.00\n\t  1 w                        0.00     0.00\n\t  1 wc                       0.00     0.00\n\t  1 which                    0.00     0.00\n0 processes running\n53 maximum processes\n<\/code><\/pre>\n\n\n\n<p>specinvoke fires off copies on each logical core<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>    52747) specinvoke       cpu=2 start=3.11  finish=269.45\n      52749) sh               cpu=2 start=3.11  finish=268.60\n        52757) bash             cpu=0 start=3.11  finish=268.60\n          52782) deepsjeng_r_bas  cpu=0 start=3.11  finish=268.56\n      52750) sh               cpu=1 start=3.11  finish=266.98\n        52759) bash             cpu=1 start=3.11  finish=266.98\n          52783) deepsjeng_r_bas  cpu=1 start=3.11  finish=266.93\n      52751) sh               cpu=1 start=3.11  finish=267.81\n        52760) bash             cpu=2 start=3.11  finish=267.81\n          52781) deepsjeng_r_bas  cpu=2 start=3.11  finish=267.75\n      52752) sh               cpu=13 start=3.11  finish=268.54\n        52762) bash             cpu=3 start=3.11  finish=268.53\n          52785) deepsjeng_r_bas  cpu=3 start=3.11  finish=268.48\n      52753) sh               cpu=5 start=3.11  finish=269.45\n        52763) bash             cpu=4 start=3.11  finish=269.45\n          52786) deepsjeng_r_bas  cpu=4 start=3.11  finish=269.42\n      52754) sh               cpu=10 start=3.11  finish=267.95\n        52765) bash             cpu=5 start=3.11  finish=267.95\n          52784) deepsjeng_r_bas  cpu=5 start=3.11  finish=267.90\n      52755) sh               cpu=13 start=3.11  finish=268.64\n        52776) bash             cpu=6 start=3.11  finish=268.64\n          52790) deepsjeng_r_bas  cpu=6 start=3.12  finish=268.60\n      52756) sh               cpu=6 start=3.11  finish=268.72\n        52767) bash             cpu=7 start=3.11  finish=268.72\n          52788) deepsjeng_r_bas  cpu=7 start=3.11  finish=268.68\n      52758) sh               cpu=10 start=3.11  finish=268.28\n        52769) bash             cpu=8 start=3.11  finish=268.28\n          52789) deepsjeng_r_bas  cpu=8 start=3.11  finish=268.22\n      52761) sh               cpu=1 start=3.11  finish=267.11\n        52771) bash             cpu=9 start=3.11  finish=267.11\n          52787) deepsjeng_r_bas  cpu=9 start=3.11  finish=267.07\n      52764) sh               cpu=9 start=3.11  finish=267.90\n        52778) bash             cpu=10 start=3.11  finish=267.90\n          52793) deepsjeng_r_bas  cpu=10 start=3.12  finish=267.85\n      52766) sh               cpu=10 start=3.11  finish=268.60\n        52774) bash             cpu=11 start=3.11  finish=268.60\n          52794) deepsjeng_r_bas  cpu=11 start=3.12  finish=268.56\n      52768) sh               cpu=11 start=3.11  finish=269.38\n        52775) bash             cpu=12 start=3.11  finish=269.38\n          52791) deepsjeng_r_bas  cpu=12 start=3.12  finish=269.33\n      52770) sh               cpu=15 start=3.11  finish=268.34\n        52777) bash             cpu=13 start=3.11  finish=268.33\n          52792) deepsjeng_r_bas  cpu=13 start=3.12  finish=268.30\n      52772) sh               cpu=13 start=3.11  finish=268.57\n        52779) bash             cpu=14 start=3.11  finish=268.57\n          52795) deepsjeng_r_bas  cpu=14 start=3.12  finish=268.51\n      52773) sh               cpu=10 start=3.11  finish=268.20\n        52780) bash             cpu=15 start=3.11  finish=268.20\n          52796) deepsjeng_r_bas  cpu=15 start=3.12  finish=268.14\n<\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>deepsjeng is a SPEC CPU(R) benchmark written in C++ and described here. The workload runs on all logical cores. Topdown profile shows a workload with mixed levels of backend stalls, frontend stalls and retiring instructions. AMD metrics on 7840 confirm <span class=\"excerpt-dots\">&hellip;<\/span> <a class=\"more-link\" href=\"https:\/\/mvermeulen.org\/perf\/workloads\/cpu2017\/531-deepsjeng_r\/\"><span class=\"more-msg\">Continue reading &rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"parent":2297,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-2389","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/2389","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/comments?post=2389"}],"version-history":[{"count":3,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/2389\/revisions"}],"predecessor-version":[{"id":2465,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/2389\/revisions\/2465"}],"up":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/2297"}],"wp:attachment":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/media?parent=2389"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}