{"id":2375,"date":"2024-06-04T12:18:02","date_gmt":"2024-06-04T12:18:02","guid":{"rendered":"https:\/\/mvermeulen.org\/perf\/?page_id=2375"},"modified":"2024-06-06T11:11:42","modified_gmt":"2024-06-06T11:11:42","slug":"500-perlbench_r","status":"publish","type":"page","link":"https:\/\/mvermeulen.org\/perf\/workloads\/cpu2017\/500-perlbench_r\/","title":{"rendered":"500.perlbench_r"},"content":{"rendered":"\n<p>perlbench is a SPEC CPU(R) benchmark written in C and described <a href=\"https:\/\/spec.org\/cpu2017\/Docs\/benchmarks\/500.perlbench_r.html\">here<\/a>. The workload runs on all logical cores.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/06\/systemtime-18.png\" alt=\"\" class=\"wp-image-2437\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/06\/systemtime-18.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/06\/systemtime-18-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/06\/systemtime-18-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>Topdown profile shows two different regions, one with higher retirement rate and one with high backend stalls and lower retirement rate.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/06\/amdtopdown-19.png\" alt=\"\" class=\"wp-image-2438\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/06\/amdtopdown-19.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/06\/amdtopdown-19-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/06\/amdtopdown-19-768x576.png 768w\" sizes=\"auto, 
(max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>AMD metrics on 7840 show an overall composite. Backend stalls are memory but overall L2 access is only 15 per 1000 instructions.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed              1272.531\non_cpu               0.984          # 15.75 \/ 16 cores\nutime                20004.904\nstime                32.379\nnvcsw                29226          # 13.54%\nnivcsw               186610         # 86.46%\ninblock              24             # 0.02\/sec\nonblock              689944         # 542.18\/sec\ncpu-clock            20038316721690 # 20038.317 seconds\ntask-clock           20038447908679 # 20038.448 seconds\npage faults          7803717        # 389.437\/sec\ncontext switches     214838         # 10.721\/sec\ncpu migrations       320            # 0.016\/sec\nmajor page faults    1292           # 0.064\/sec\nminor page faults    7802425        # 389.373\/sec\nalignment faults     0              # 0.000\/sec\nemulation faults     0              # 0.000\/sec\nbranches             24359519286108 # 184.569 branches per 1000 inst\nbranch misses        152497684614   # 0.63% branch miss\nconditional          17499844582222 # 132.594 conditional branches per 1000 inst\nindirect             1715179974141  # 12.996 indirect branches per 1000 inst\ncpu-cycles           83438427398438 # 4.10 GHz\ninstructions         132027755278274 # 1.58 IPC\nslots                166857301734108 #\nretiring             42263589541657 # 25.3% (33.3%)\n-- ucode             201051299473   #     0.1%\n-- fastpath          42062538242184 #    25.2%\nfrontend             25063748165431 # 15.0% (19.7%)\n-- latency           14691714821874 #     8.8%\n-- bandwidth         10372033343557 #     6.2%\nbackend              56668583336440 # 34.0% (44.7%)\n-- cpu               5216347027859  #     3.1%\n-- memory            51452236308581 #    30.8%\nspeculation          2911204089454  #  1.7% ( 2.3%)\n-- branch mispredict 
2777740218186  #     1.7%\n-- pipeline restart  133463871268   #     0.1%\nsmt-contention       39950017975885 # 23.9% ( 0.0%)\ncpu-cycles           83613374302452 # 4.10 GHz\ninstructions         131994334545013 # 1.58 IPC\ninstructions         44000032762874 # 15.736 l2 access per 1000 inst\nl2 hit from l1       636632609297   # 11.62% l2 miss\nl2 miss from l1      44881747863    #\nl2 hit from l2 pf    20160423661    #\nl3 hit from l2 pf    5353507321     #\nl3 miss from l2 pf   30232769175    #\ninstructions         43983522979100 # 16.803 float per 1000 inst\nfloat 512            273            # 0.000 AVX-512 per 1000 inst\nfloat 256            7288           # 0.000 AVX-256 per 1000 inst\nfloat 128            739053515060   # 16.803 AVX-128 per 1000 inst\nfloat MMX            0              # 0.000 MMX per 1000 inst\nfloat scalar         0              # 0.000 scalar per 1000 inst\ninstructions         131977660622241 #\nopcache              23067837172026 # 174.786 opcache per 1000 inst\nopcache miss         1291932014372  #  5.6% opcache miss rate\nl1 dTLB miss         628831435258   # 4.765 L1 dTLB per 1000 inst\nl2 dTLB miss         17878142548    # 0.135 L2 dTLB per 1000 inst\ninstructions         131977627392115 #\nicache               1717802692184  # 13.016 icache per 1000 inst\nicache miss          730826491765   # 42.5% icache miss rate\nl1 iTLB miss         177044140214   # 1.341 L1 iTLB per 1000 inst\nl2 iTLB miss         0              # 0.000 L2 iTLB per 1000 inst\ntlb flush            969590         # 0.000 TLB flush per 1000 inst\n<\/code><\/pre>\n\n\n\n<p>Process overview shows spec harness and almost all computation in perlbench_r_bas<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>1061 processes\n\t144 perlbench_r_bas      20009.51    23.32\n\t165 specperl                40.37     3.87\n\t  1 clang                    0.01     0.00\n\t  1 lsb_release              0.01     0.00\n\t 41 specinvoke               0.00     0.02\n\t365 sh           
            0.00     0.00\n\t144 bash                     0.00     0.00\n\t 54 specrxp                  0.00     0.00\n\t 21 grep                     0.00     0.00\n\t 20 cat                      0.00     0.00\n\t 12 uniq                     0.00     0.00\n\t 11 ps                       0.00     0.00\n\t 11 sort                     0.00     0.00\n\t 10 expand                   0.00     0.00\n\t  6 pwd                      0.00     0.00\n\t  5 basename                 0.00     0.00\n\t  5 specmake                 0.00     0.00\n\t  5 systemctl                0.00     0.00\n\t  4 specpp                   0.00     0.00\n\t  4 uname                    0.00     0.00\n\t  3 dirname                  0.00     0.00\n\t  3 dmidecode                0.00     0.00\n\t  3 lscpu                    0.00     0.00\n\t  2 df                       0.00     0.00\n\t  2 dpkg                     0.00     0.00\n\t  2 rm                       0.00     0.00\n\t  2 runcpu                   0.00     0.00\n\t  2 specsha512sum            0.00     0.00\n\t  2 specxz                   0.00     0.00\n\t  2 who                      0.00     0.00\n\t  1 cpupower                 0.00     0.00\n\t  1 head                     0.00     0.00\n\t  1 logname                  0.00     0.00\n\t  1 ls                       0.00     0.00\n\t  1 numactl                  0.00     0.00\n\t  1 sysctl                   0.00     0.00\n\t  1 w                        0.00     0.00\n\t  1 wc                       0.00     0.00\n\t  1 which                    0.00     0.00\n0 processes running\n53 maximum processes\n<\/code><\/pre>\n\n\n\n<p>specinvoke starts each process separately. It looks like the separate regions are three separate invocations.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>    12849) specinvoke       cpu=4 start=5.15  finish=428.38\n      12851) sh               cpu=0 start=5.15  finish=183.62\n        12861) bash             cpu=0 start=5.15  finish=183.62\n          12883) perlbench_r_bas  cpu=0 
start=5.16  finish=183.59\n      12852) sh               cpu=1 start=5.15  finish=184.93\n        12862) bash             cpu=1 start=5.15  finish=184.93\n          12891) perlbench_r_bas  cpu=1 start=5.16  finish=184.91\n      12853) sh               cpu=2 start=5.15  finish=183.88\n        12865) bash             cpu=2 start=5.15  finish=183.88\n          12889) perlbench_r_bas  cpu=2 start=5.16  finish=183.86\n      12854) sh               cpu=3 start=5.15  finish=184.63\n        12864) bash             cpu=3 start=5.15  finish=184.63\n          12884) perlbench_r_bas  cpu=3 start=5.16  finish=184.61\n      12855) sh               cpu=4 start=5.15  finish=184.26\n        12866) bash             cpu=4 start=5.15  finish=184.26\n          12885) perlbench_r_bas  cpu=4 start=5.16  finish=184.24\n      12856) sh               cpu=5 start=5.15  finish=187.82\n        12867) bash             cpu=5 start=5.15  finish=187.82\n          12887) perlbench_r_bas  cpu=5 start=5.16  finish=187.80\n      12857) sh               cpu=14 start=5.15  finish=190.94\n        12870) bash             cpu=6 start=5.15  finish=190.94\n          12892) perlbench_r_bas  cpu=6 start=5.16  finish=190.92\n      12858) sh               cpu=7 start=5.15  finish=184.17\n        12869) bash             cpu=7 start=5.15  finish=184.17\n          12886) perlbench_r_bas  cpu=7 start=5.16  finish=184.14\n      12859) sh               cpu=8 start=5.15  finish=184.46\n        12872) bash             cpu=8 start=5.15  finish=184.45\n          12888) perlbench_r_bas  cpu=8 start=5.16  finish=184.43\n      12860) sh               cpu=9 start=5.15  finish=184.66\n        12876) bash             cpu=9 start=5.15  finish=184.66\n          12895) perlbench_r_bas  cpu=9 start=5.16  finish=184.64\n      12863) sh               cpu=10 start=5.15  finish=183.79\n        12874) bash             cpu=10 start=5.15  finish=183.79\n          12890) perlbench_r_bas  cpu=10 start=5.16  finish=183.77\n      12868) sh    
           cpu=11 start=5.15  finish=183.38\n        12878) bash             cpu=11 start=5.15  finish=183.38\n          12893) perlbench_r_bas  cpu=11 start=5.16  finish=183.36\n      12871) sh               cpu=12 start=5.15  finish=184.34\n        12879) bash             cpu=12 start=5.15  finish=184.34\n          12897) perlbench_r_bas  cpu=12 start=5.16  finish=184.32\n      12873) sh               cpu=13 start=5.15  finish=187.88\n        12882) bash             cpu=13 start=5.16  finish=187.88\n          12896) perlbench_r_bas  cpu=13 start=5.16  finish=187.86\n      12875) sh               cpu=6 start=5.15  finish=190.94\n        12880) bash             cpu=14 start=5.16  finish=190.94\n          12894) perlbench_r_bas  cpu=14 start=5.16  finish=190.92\n      12877) sh               cpu=15 start=5.15  finish=183.64\n        12881) bash             cpu=15 start=5.16  finish=183.64\n          12898) perlbench_r_bas  cpu=15 start=5.16  finish=183.62\n      12900) sh               cpu=11 start=183.38 finish=277.85\n        12901) bash             cpu=11 start=183.38 finish=277.85\n          12902) perlbench_r_bas  cpu=11 start=183.38 finish=277.83\n      12903) sh               cpu=0 start=183.62 finish=278.63\n        12904) bash             cpu=0 start=183.62 finish=278.63\n          12905) perlbench_r_bas  cpu=0 start=183.62 finish=278.61\n      12906) sh               cpu=15 start=183.64 finish=278.19\n        12907) bash             cpu=15 start=183.64 finish=278.19\n          12908) perlbench_r_bas  cpu=15 start=183.65 finish=278.17\n      12909) sh               cpu=10 start=183.79 finish=280.01\n        12910) bash             cpu=10 start=183.79 finish=280.01\n          12911) perlbench_r_bas  cpu=10 start=183.79 finish=280.00\n      12912) sh               cpu=2 start=183.88 finish=278.49\n        12913) bash             cpu=2 start=183.88 finish=278.49\n          12914) perlbench_r_bas  cpu=2 start=183.89 finish=278.48\n      12915) sh               
cpu=7 start=184.17 finish=278.86\n        12916) bash             cpu=7 start=184.17 finish=278.86\n          12917) perlbench_r_bas  cpu=7 start=184.17 finish=278.85\n      12918) sh               cpu=4 start=184.26 finish=278.69\n        12919) bash             cpu=4 start=184.26 finish=278.69\n          12920) perlbench_r_bas  cpu=4 start=184.26 finish=278.68\n      12921) sh               cpu=12 start=184.34 finish=278.65\n        12922) bash             cpu=12 start=184.35 finish=278.65\n          12923) perlbench_r_bas  cpu=12 start=184.35 finish=278.64\n      12924) sh               cpu=8 start=184.46 finish=279.29\n        12925) bash             cpu=8 start=184.46 finish=279.29\n          12926) perlbench_r_bas  cpu=8 start=184.46 finish=279.28\n      12927) sh               cpu=3 start=184.63 finish=278.84\n        12928) bash             cpu=3 start=184.63 finish=278.84\n          12929) perlbench_r_bas  cpu=3 start=184.63 finish=278.83\n      12930) sh               cpu=9 start=184.66 finish=279.02\n        12931) bash             cpu=9 start=184.66 finish=279.02\n          12932) perlbench_r_bas  cpu=9 start=184.66 finish=279.01\n      12933) sh               cpu=1 start=184.93 finish=279.64\n        12934) bash             cpu=1 start=184.93 finish=279.64\n          12935) perlbench_r_bas  cpu=1 start=184.93 finish=279.62\n      12936) sh               cpu=5 start=187.82 finish=282.27\n        12937) bash             cpu=5 start=187.83 finish=282.27\n          12938) perlbench_r_bas  cpu=5 start=187.83 finish=282.26\n      12939) sh               cpu=13 start=187.88 finish=282.54\n        12940) bash             cpu=13 start=187.88 finish=282.53\n          12941) perlbench_r_bas  cpu=13 start=187.88 finish=282.52\n      12942) sh               cpu=14 start=190.94 finish=285.52\n        12944) bash             cpu=14 start=190.94 finish=285.52\n          12947) perlbench_r_bas  cpu=14 start=190.94 finish=285.50\n      12943) sh               cpu=6 
start=190.94 finish=285.98\n        12945) bash             cpu=6 start=190.94 finish=285.98\n          12946) perlbench_r_bas  cpu=6 start=190.94 finish=285.97\n      12948) sh               cpu=11 start=277.85 finish=421.37\n        12949) bash             cpu=11 start=277.85 finish=421.37\n          12950) perlbench_r_bas  cpu=11 start=277.85 finish=421.36\n      12951) sh               cpu=15 start=278.19 finish=422.62\n        12952) bash             cpu=15 start=278.19 finish=422.62\n          12953) perlbench_r_bas  cpu=15 start=278.19 finish=422.61\n      12954) sh               cpu=2 start=278.49 finish=421.35\n        12955) bash             cpu=2 start=278.50 finish=421.35\n          12956) perlbench_r_bas  cpu=2 start=278.50 finish=421.34\n      12957) sh               cpu=0 start=278.63 finish=422.26\n        12958) bash             cpu=0 start=278.63 finish=422.26\n          12959) perlbench_r_bas  cpu=0 start=278.63 finish=422.25\n      12960) sh               cpu=12 start=278.65 finish=422.32\n        12961) bash             cpu=12 start=278.65 finish=422.32\n          12962) perlbench_r_bas  cpu=12 start=278.66 finish=422.31\n      12963) sh               cpu=4 start=278.69 finish=422.31\n        12964) bash             cpu=4 start=278.70 finish=422.31\n          12965) perlbench_r_bas  cpu=4 start=278.70 finish=422.30\n      12966) sh               cpu=3 start=278.84 finish=422.38\n        12967) bash             cpu=3 start=278.85 finish=422.38\n          12968) perlbench_r_bas  cpu=3 start=278.85 finish=422.37\n      12969) sh               cpu=7 start=278.86 finish=422.89\n        12970) bash             cpu=7 start=278.86 finish=422.89\n          12971) perlbench_r_bas  cpu=7 start=278.87 finish=422.88\n      12972) sh               cpu=9 start=279.02 finish=421.74\n        12973) bash             cpu=9 start=279.03 finish=421.74\n          12974) perlbench_r_bas  cpu=9 start=279.03 finish=421.73\n      12975) sh               cpu=8 
start=279.29 finish=422.45\n        12976) bash             cpu=8 start=279.29 finish=422.45\n          12977) perlbench_r_bas  cpu=8 start=279.30 finish=422.44\n      12978) sh               cpu=1 start=279.64 finish=422.96\n        12979) bash             cpu=1 start=279.64 finish=422.96\n          12980) perlbench_r_bas  cpu=1 start=279.65 finish=422.95\n      12981) sh               cpu=10 start=280.01 finish=422.42\n        12982) bash             cpu=10 start=280.01 finish=422.42\n          12983) perlbench_r_bas  cpu=10 start=280.01 finish=422.42\n      12984) sh               cpu=5 start=282.27 finish=426.79\n        12985) bash             cpu=5 start=282.27 finish=426.79\n          12986) perlbench_r_bas  cpu=5 start=282.28 finish=426.79\n      12987) sh               cpu=13 start=282.54 finish=426.17\n        12988) bash             cpu=13 start=282.54 finish=426.17\n          12989) perlbench_r_bas  cpu=13 start=282.54 finish=426.16\n      12990) sh               cpu=14 start=285.52 finish=428.38\n        12991) bash             cpu=14 start=285.52 finish=428.38\n          12992) perlbench_r_bas  cpu=14 start=285.52 finish=428.38\n      12993) sh               cpu=6 start=285.98 finish=428.15\n        12994) bash             cpu=6 start=285.99 finish=428.15\n          12995) perlbench_r_bas  cpu=6 start=285.99 finish=428.14\n<\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>perlbench is a SPEC CPU(R) benchmark written in C and described here. The workload runs on all logical cores. Topdown profile shows two different regions, one with higher retirement rate and one with high backend stalls and lower retirement rate. 
<span class=\"excerpt-dots\">&hellip;<\/span> <a class=\"more-link\" href=\"https:\/\/mvermeulen.org\/perf\/workloads\/cpu2017\/500-perlbench_r\/\"><span class=\"more-msg\">Continue reading &rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"parent":2297,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-2375","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/2375","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/comments?post=2375"}],"version-history":[{"count":2,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/2375\/revisions"}],"predecessor-version":[{"id":2440,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/2375\/revisions\/2440"}],"up":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/2297"}],"wp:attachment":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/media?parent=2375"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}