{"id":2383,"date":"2024-06-04T12:23:29","date_gmt":"2024-06-04T12:23:29","guid":{"rendered":"https:\/\/mvermeulen.org\/perf\/?page_id=2383"},"modified":"2024-06-06T11:26:32","modified_gmt":"2024-06-06T11:26:32","slug":"520-omnetpp_r","status":"publish","type":"page","link":"https:\/\/mvermeulen.org\/perf\/workloads\/cpu2017\/520-omnetpp_r\/","title":{"rendered":"520.omnetpp_r"},"content":{"rendered":"\n<p>omnetpp is a SPEC CPU(R) benchmark written in C++ and described here. The workload runs on all logical cores.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/06\/systemtime-21.png\" alt=\"\" class=\"wp-image-2450\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/06\/systemtime-21.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/06\/systemtime-21-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/06\/systemtime-21-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>Topdown profile shows a backend-bound workload.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/06\/amdtopdown-22.png\" alt=\"\" class=\"wp-image-2452\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/06\/amdtopdown-22.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/06\/amdtopdown-22-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/06\/amdtopdown-22-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>AMD metrics on 7840 show ~2\/3 of time waiting in memory stalls.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed              2024.583\non_cpu               0.990          # 15.83 \/ 16 cores\nutime                32034.007\nstime                19.267\nnvcsw                42794          # 13.33%\nnivcsw               278139         # 86.67%\ninblock              0              # 0.00\/sec\nonblock              1763320        # 870.95\/sec\ncpu-clock            32055936743326 # 32055.937 seconds\ntask-clock           32056180818475 # 32056.181 seconds\npage faults          3453095        # 107.720\/sec\ncontext switches     320372         # 9.994\/sec\ncpu migrations       193            # 0.006\/sec\nmajor page faults    897            # 0.028\/sec\nminor page faults    3452198        # 107.692\/sec\nalignment faults     0              # 0.000\/sec\nemulation faults     0              # 0.000\/sec\nbranches             8305779946694  # 196.963 branches per 1000 inst\nbranch misses        250499781336   # 3.02% branch miss\nconditional          6062522008938  # 143.767 conditional branches per 1000 inst\nindirect             496970309851   # 11.785 indirect branches per 1000 inst\ncpu-cycles           148966833040595 # 4.58 GHz\ninstructions         42177169458361 # 0.28 IPC low\nslots                297908418460326 #\nretiring             14503312646284 #  4.9% ( 5.6%) low\n-- ucode             38380239382    #     0.0%\n-- fastpath          14464932406902 #     4.9%\nfrontend             24834374620841 #  8.3% ( 9.6%)\n-- latency           13244967852348 #     4.4%\n-- bandwidth         11589406768493 #     3.9%\nbackend              212881726950791 # 71.5% (81.9%) high\n-- cpu               14319596152813 #     4.8%\n-- memory            198562130797978 #    66.7%\nspeculation          7684515471591  #  2.6% ( 3.0%)\n-- branch mispredict 7390004514273  #     2.5%\n-- pipeline restart  294510957318   #     0.1%\nsmt-contention       38004401958954 # 12.8% ( 0.0%)\ncpu-cycles           149229662422972 # 4.57 GHz\ninstructions         42174841732306 # 0.28 IPC low\ninstructions         14057642378395 # 72.990 l2 access per 1000 inst\nl2 hit from l1       801165822037   # 44.62% l2 miss\nl2 miss from l1      305524457589   #\nl2 hit from l2 pf    72570092841    #\nl3 hit from l2 pf    19197570229    #\nl3 miss from l2 pf   133135850187   #\ninstructions         14058151814347 # 16.375 float per 1000 inst\nfloat 512            266            # 0.000 AVX-512 per 1000 inst\nfloat 256            302723         # 0.000 AVX-256 per 1000 inst\nfloat 128            230203085403   # 16.375 AVX-128 per 1000 inst\nfloat MMX            0              # 0.000 MMX per 1000 inst\nfloat scalar         32             # 0.000 scalar per 1000 inst\ninstructions         42167979704725 #\nopcache              12148761755871 # 288.104 opcache per 1000 inst\nopcache miss         283691907486   #  2.3% opcache miss rate\nl1 dTLB miss         1871111836272  # 44.373 L1 dTLB per 1000 inst\nl2 dTLB miss         236293697504   # 5.604 L2 dTLB per 1000 inst\ninstructions         42167934060175 #\nicache               440419443915   # 10.444 icache per 1000 inst\nicache miss          125095740477   # 28.4% icache miss rate\nl1 iTLB miss         37916440996    # 0.899 L1 iTLB per 1000 inst\nl2 iTLB miss         0              # 0.000 L2 iTLB per 1000 inst\ntlb flush            120050         # 0.000 TLB flush per 1000 inst\n<\/code><\/pre>\n\n\n\n<p>Process overview shows most time spent in omnetpp_r_base.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>581 processes\n\t 48 omnetpp_r_base.      31711.38    12.68\n\t 69 specperl               174.02     3.59\n\t  1 clang++                  0.01     0.01\n\t 11 ps                       0.00     0.02\n\t  1 lsb_release              0.00     0.01\n\t173 sh                       0.00     0.00\n\t 54 specrxp                  0.00     0.00\n\t 48 bash                     0.00     0.00\n\t 41 specinvoke               0.00     0.00\n\t 21 grep                     0.00     0.00\n\t 20 cat                      0.00     0.00\n\t 12 uniq                     0.00     0.00\n\t 11 sort                     0.00     0.00\n\t 10 expand                   0.00     0.00\n\t  6 pwd                      0.00     0.00\n\t  5 basename                 0.00     0.00\n\t  5 specmake                 0.00     0.00\n\t  5 systemctl                0.00     0.00\n\t  4 specpp                   0.00     0.00\n\t  4 uname                    0.00     0.00\n\t  3 dirname                  0.00     0.00\n\t  3 dmidecode                0.00     0.00\n\t  3 lscpu                    0.00     0.00\n\t  2 df                       0.00     0.00\n\t  2 dpkg                     0.00     0.00\n\t  2 rm                       0.00     0.00\n\t  2 runcpu                   0.00     0.00\n\t  2 specsha512sum            0.00     0.00\n\t  2 specxz                   0.00     0.00\n\t  2 who                      0.00     0.00\n\t  1 cpupower                 0.00     0.00\n\t  1 head                     0.00     0.00\n\t  1 logname                  0.00     0.00\n\t  1 ls                       0.00     0.00\n\t  1 numactl                  0.00     0.00\n\t  1 sysctl                   0.00     0.00\n\t  1 w                        0.00     0.00\n\t  1 wc                       0.00     0.00\n\t  1 which                    0.00     0.00\n0 processes running\n53 maximum processes\n<\/code><\/pre>\n\n\n\n<p>specinvoke fires up separate copies on each logical core.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>    40845) specinvoke       cpu=1 start=3.25  finish=667.87\n      40847) sh               cpu=1 start=3.25  finish=664.94\n        40857) bash             cpu=0 start=3.25  finish=664.94\n          40883) omnetpp_r_base.  cpu=0 start=3.26  finish=664.91\n      40848) sh               cpu=1 start=3.25  finish=662.74\n        40860) bash             cpu=1 start=3.25  finish=662.74\n          40885) omnetpp_r_base.  cpu=1 start=3.26  finish=662.72\n      40849) sh               cpu=3 start=3.25  finish=662.84\n        40861) bash             cpu=2 start=3.25  finish=662.84\n          40880) omnetpp_r_base.  cpu=2 start=3.26  finish=662.81\n      40850) sh               cpu=3 start=3.25  finish=662.69\n        40865) bash             cpu=3 start=3.25  finish=662.69\n          40882) omnetpp_r_base.  cpu=3 start=3.26  finish=662.66\n      40851) sh               cpu=10 start=3.25  finish=666.60\n        40872) bash             cpu=4 start=3.26  finish=666.60\n          40889) omnetpp_r_base.  cpu=4 start=3.26  finish=666.58\n      40852) sh               cpu=1 start=3.25  finish=665.06\n        40873) bash             cpu=5 start=3.26  finish=665.06\n          40887) omnetpp_r_base.  cpu=5 start=3.26  finish=665.03\n      40853) sh               cpu=0 start=3.25  finish=667.21\n        40859) bash             cpu=6 start=3.25  finish=667.21\n          40879) omnetpp_r_base.  cpu=6 start=3.26  finish=667.19\n      40854) sh               cpu=0 start=3.25  finish=667.07\n        40864) bash             cpu=7 start=3.25  finish=667.07\n          40881) omnetpp_r_base.  cpu=7 start=3.26  finish=667.04\n      40855) sh               cpu=8 start=3.25  finish=665.72\n        40867) bash             cpu=8 start=3.25  finish=665.72\n          40884) omnetpp_r_base.  cpu=8 start=3.26  finish=665.70\n      40856) sh               cpu=1 start=3.25  finish=663.34\n        40869) bash             cpu=9 start=3.25  finish=663.34\n          40886) omnetpp_r_base.  cpu=9 start=3.26  finish=663.32\n      40858) sh               cpu=9 start=3.25  finish=664.85\n        40871) bash             cpu=10 start=3.26  finish=664.85\n          40888) omnetpp_r_base.  cpu=10 start=3.26  finish=664.83\n      40862) sh               cpu=11 start=3.25  finish=664.09\n        40874) bash             cpu=11 start=3.26  finish=664.09\n          40891) omnetpp_r_base.  cpu=11 start=3.26  finish=664.07\n      40863) sh               cpu=8 start=3.25  finish=665.78\n        40875) bash             cpu=12 start=3.26  finish=665.78\n          40890) omnetpp_r_base.  cpu=12 start=3.26  finish=665.76\n      40866) sh               cpu=9 start=3.25  finish=665.49\n        40876) bash             cpu=13 start=3.26  finish=665.49\n          40892) omnetpp_r_base.  cpu=13 start=3.26  finish=665.47\n      40868) sh               cpu=14 start=3.25  finish=666.85\n        40877) bash             cpu=14 start=3.26  finish=666.85\n          40893) omnetpp_r_base.  cpu=14 start=3.26  finish=666.82\n      40870) sh               cpu=14 start=3.26  finish=667.86\n        40878) bash             cpu=15 start=3.26  finish=667.86\n          40894) omnetpp_r_base.  cpu=15 start=3.26  finish=667.85\n\n<\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>omnetpp is a SPEC CPU(R) benchmark written in C++ and described here. The workload runs on all logical cores. Topdown profile shows a backend-bound workload. AMD metrics on 7840 show ~2\/3 of time waiting in memory stalls. Process overview shows <span class=\"excerpt-dots\">&hellip;<\/span> <a class=\"more-link\" href=\"https:\/\/mvermeulen.org\/perf\/workloads\/cpu2017\/520-omnetpp_r\/\"><span class=\"more-msg\">Continue reading &rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"parent":2297,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-2383","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/2383","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/comments?post=2383"}],"version-history":[{"count":3,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/2383\/revisions"}],"predecessor-version":[{"id":2453,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/2383\/revisions\/2453"}],"up":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/2297"}],"wp:attachment":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/media?parent=2383"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}