{"id":1507,"date":"2024-02-04T10:26:32","date_gmt":"2024-02-04T10:26:32","guid":{"rendered":"https:\/\/mvermeulen.org\/perf\/?page_id=1507"},"modified":"2024-02-10T00:41:20","modified_gmt":"2024-02-10T00:41:20","slug":"byte","status":"publish","type":"page","link":"https:\/\/mvermeulen.org\/perf\/workloads\/phoronix\/byte\/","title":{"rendered":"byte"},"content":{"rendered":"\n<p>This workload runs the BYTE magazine benchmarks. It has one test, labeled Dhrystone2, which reports an &#8220;LPS&#8221; score. Topdown metrics show that it is single-threaded.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/02\/systemtime-47.png\" alt=\"\" class=\"wp-image-1651\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/02\/systemtime-47.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/02\/systemtime-47-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/02\/systemtime-47-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>The topdown profile shows a high retirement rate and low backend stalls.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/02\/amdtopdown-49.png\" alt=\"\" class=\"wp-image-1653\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/02\/amdtopdown-49.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/02\/amdtopdown-49-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/02\/amdtopdown-49-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>The AMD metrics show little floating-point or L2 access.
The retirement rate is high and backend stalls are low.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed              376.144\non_cpu               0.030          # 0.48 \/ 16 cores\nutime                181.239\nstime                0.842\nnvcsw                2871           # 77.14%\nnivcsw               851            # 22.86%\ninblock              0              # 0.00\/sec\nonblock              13224          # 35.16\/sec\ncpu-clock            182136835292   # 182.137 seconds\ntask-clock           182143833055   # 182.144 seconds\npage faults          169796         # 932.208\/sec\ncontext switches     5163           # 28.346\/sec\ncpu migrations       309            # 1.696\/sec\nmajor page faults    2              # 0.011\/sec\nminor page faults    169794         # 932.197\/sec\nalignment faults     0              # 0.000\/sec\nemulation faults     0              # 0.000\/sec\nbranches             524735697772   # 211.506 branches per 1000 inst\nbranch misses        96962116       # 0.02% branch miss\nconditional          174354415594   # 70.277 conditional branches per 1000 inst\nindirect             10331405266    # 4.164 indirect branches per 1000 inst\ncpu-cycles           818505851767   # 0.14 GHz\ninstructions         2460257662364  # 3.01 IPC high\nslots                1643099602008  #\nretiring             959887401847   # 58.4% (58.4%) high\n-- ucode             3447437753     #     0.2%\n-- fastpath          956439964094   #    58.2%\nfrontend             614844455750   # 37.4% (37.4%)\n-- latency           231180899748   #    14.1%\n-- bandwidth         383663556002   #    23.3%\nbackend              67559726976    #  4.1% ( 4.1%) low\n-- cpu               11680342079    #     0.7%\n-- memory            55879384897    #     3.4%\nspeculation          606848834      #  0.0% ( 0.0%) low\n-- branch mispredict 601459009      #     0.0%\n-- pipeline restart  5389825        #     0.0%\nsmt-contention       200847683      #  0.0% ( 
0.0%)\ncpu-cycles           818794666497   # 0.14 GHz\ninstructions         2474199493126  # 3.02 IPC high\ninstructions         826840623538   # 0.150 l2 access per 1000 inst\nl2 hit from l1       106675842      # 19.72% l2 miss\nl2 miss from l1      14876369       #\nl2 hit from l2 pf    7586711        #\nl3 hit from l2 pf    4462654        #\nl3 miss from l2 pf   5067724        #\ninstructions         826429055369   # 33.886 float per 1000 inst\nfloat 512            141            # 0.000 AVX-512 per 1000 inst\nfloat 256            608            # 0.000 AVX-256 per 1000 inst\nfloat 128            28003985855    # 33.886 AVX-128 per 1000 inst\nfloat MMX            0              # 0.000 MMX per 1000 inst\nfloat scalar         0              # 0.000 scalar per 1000 inst\ninstructions         2690625        #\nopcache              995497         # 369.987 opcache per 1000 inst\nopcache miss         534201         # 53.7% opcache miss rate\nl1 dTLB miss         6519           # 2.423 L1 dTLB per 1000 inst\nl2 dTLB miss         1233           # 0.458 L2 dTLB per 1000 inst\ninstructions         2706981        #\nicache               1318482        # 487.067 icache per 1000 inst\nicache miss          110888         #  8.4% icache miss rate\nl1 iTLB miss         16             # 0.006 L1 iTLB per 1000 inst\nl2 iTLB miss         0              # 0.000 L2 iTLB per 1000 inst\ntlb flush            19             # 0.007 TLB flush per 1000 inst\n<\/code><\/pre>\n\n\n\n<p>Intel metrics<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed              376.121\non_cpu               0.030          # 0.48 \/ 16 cores\nutime                181.045\nstime                0.504\nnvcsw                2808           # 75.69%\nnivcsw               902            # 24.31%\ninblock              1016           # 2.70\/sec\nonblock              1984           # 5.27\/sec\ncpu-clock            181606889909   # 181.607 seconds\ntask-clock           181615759582   # 181.616 seconds\npage 
faults          152124         # 837.615\/sec\ncontext switches     5162           # 28.423\/sec\ncpu migrations       294            # 1.619\/sec\nmajor page faults    13             # 0.072\/sec\nminor page faults    152111         # 837.543\/sec\nalignment faults     0              # 0.000\/sec\nemulation faults     0              # 0.000\/sec\nbranches             479534884128   # 208.329 branches per 1000 inst\nbranch misses        21110946       # 0.00% branch miss\nconditional          479534904608   # 208.329 conditional branches per 1000 inst\nindirect             9581080556     # 4.162 indirect branches per 1000 inst\nslots                4118598327020  #\nretiring             2550938357460  # 61.9% (61.9%) high\n-- ucode             348610784208   #     8.5%\n-- fastpath          2202327573252  #    53.5%\nfrontend             1496772644013  # 36.3% (36.3%)\n-- latency           503365377121   #    12.2%\n-- bandwidth         993407266892   #    24.1%\nbackend              58209963637    #  1.4% ( 1.4%) low\n-- cpu               24924613936    #     0.6%\n-- memory            33285349701    #     0.8%\nspeculation          12739638981    #  0.3% ( 0.3%) low\n-- branch mispredict 4540195336     #     0.1%\n-- pipeline restart  8199443645     #     0.2%\nsmt-contention       0              #  0.0% ( 0.0%)\ncpu-cycles           686567174342   # 0.11 GHz\ninstructions         2312687887380  # 3.37 IPC high\nl2 access            309209566      # 0.134 l2 access per 1000 inst\nl2 miss              103542227      # 33.49% l2 miss\ncpu-cycles           686548172908   #  2.9% memory latency\nload stalls          19650125766    #  2.7% l1 bound\nl1 miss              972656265      #  0.1% l2 bound\nl2 miss              509358137      #  0.0% l3 bound\nl3 miss              348875035      #  0.1% dram bound\nstore_stalls         84906014       #  0.0% store bound\n<\/code><\/pre>\n\n\n\n<p>Process summary confirms invocations of Dhrystone<\/p>\n\n\n\n<pre 
class=\"wp-block-code\"><code>639 processes\n\t 18 dhry2                  179.96     0.00\n\t 68 clinfo                  16.88     5.55\n\t 38 vulkaninfo               1.14     1.13\n\t  4 vulkani:disk$0           0.12     0.12\n\t  6 glxinfo:gdrv0            0.09     0.07\n\t  6 glxinfo:gl0              0.09     0.07\n\t  6 php                      0.07     0.11\n\t  2 llvmpipe-0               0.06     0.06\n\t  2 llvmpipe-1               0.06     0.06\n\t  2 llvmpipe-10              0.06     0.06\n\t  2 llvmpipe-11              0.06     0.06\n\t  2 llvmpipe-12              0.06     0.06\n\t  2 llvmpipe-13              0.06     0.06\n\t  2 llvmpipe-14              0.06     0.06\n\t  2 llvmpipe-15              0.06     0.06\n\t  2 llvmpipe-2               0.06     0.06\n\t  2 llvmpipe-3               0.06     0.06\n\t  2 llvmpipe-4               0.06     0.06\n\t  2 llvmpipe-5               0.06     0.06\n\t  2 llvmpipe-6               0.06     0.06\n\t  2 llvmpipe-7               0.06     0.06\n\t  2 llvmpipe-8               0.06     0.06\n\t  2 llvmpipe-9               0.06     0.06\n\t  2 glxinfo                  0.05     0.03\n\t  2 glxinfo:cs0              0.05     0.03\n\t  2 glxinfo:disk$0           0.05     0.03\n\t  2 glxinfo:sh0              0.05     0.03\n\t  2 glxinfo:shlo0            0.05     0.03\n\t  6 clang                    0.04     0.08\n\t  3 rocminfo                 0.03     0.00\n\t  1 lspci                    0.00     0.02\n\t  1 ps                       0.00     0.01\n\t109 sh                       0.00     0.00\n\t 57 Run                      0.00     0.00\n\t 36 sync                     0.00     0.00\n\t 21 time                     0.00     0.00\n\t 18 rm                       0.00     0.00\n\t 18 sleep                    0.00     0.00\n\t 15 cat                      0.00     0.00\n\t 14 gsettings                0.00     0.00\n\t 13 date                     0.00     0.00\n\t 13 gcc                      0.00     0.00\n\t 12 cleanup.sh       
        0.00     0.00\n\t 10 wc                       0.00     0.00\n\t  9 awk                      0.00     0.00\n\t  8 stat                     0.00     0.00\n\t  8 systemd-detect-          0.00     0.00\n\t  7 sed                      0.00     0.00\n\t  6 chmod                    0.00     0.00\n\t  6 llvm-link                0.00     0.00\n\t  6 make                     0.00     0.00\n\t  6 who                      0.00     0.00\n\t  5 phoronix-test-s          0.00     0.00\n\t  5 uname                    0.00     0.00\n\t  4 grep                     0.00     0.00\n\t  4 sort                     0.00     0.00\n\t  3 byte                     0.00     0.00\n\t  3 join                     0.00     0.00\n\t  2 cc                       0.00     0.00\n\t  2 lscpu                    0.00     0.00\n\t  2 which                    0.00     0.00\n\t  2 xset                     0.00     0.00\n\t  1 dconf worker             0.00     0.00\n\t  1 dirname                  0.00     0.00\n\t  1 dmesg                    0.00     0.00\n\t  1 dmidecode                0.00     0.00\n\t  1 gmain                    0.00     0.00\n\t  1 ifconfig                 0.00     0.00\n\t  1 ip                       0.00     0.00\n\t  1 lsmod                    0.00     0.00\n\t  1 mktemp                   0.00     0.00\n\t  1 qdbus                    0.00     0.00\n\t  1 readlink                 0.00     0.00\n\t  1 realpath                 0.00     0.00\n\t  1 stty                     0.00     0.00\n\t  1 systemctl                0.00     0.00\n\t  1 template.sh              0.00     0.00\n\t  1 xrandr                   0.00     0.00\n0 processes running\n47 maximum processes\n<\/code><\/pre>\n\n\n\n<p>Computation blocks have some extra processes<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>      3425010) byte             cpu=13 start=5.56  finish=125.70\n        3425011) rm               cpu=7 start=5.56  finish=5.56 \n        3425012) Run              cpu=11 start=5.56  finish=125.70\n      
    3425013) Run              cpu=2 start=5.57  finish=5.57 \n            3425014) time             cpu=4 start=5.57  finish=5.57 \n              3425017) date             cpu=7 start=5.57  finish=5.57 \n            3425015) wc               cpu=6 start=5.57  finish=5.57 \n            3425016) sed              cpu=0 start=5.57  finish=5.57 \n          3425018) make             cpu=1 start=5.57  finish=5.59 \n            3425019) make             cpu=0 start=5.57  finish=5.59 \n              3425020) sh               cpu=2 start=5.58  finish=5.58 \n              3425021) sh               cpu=4 start=5.58  finish=5.58 \n              3425022) sh               cpu=7 start=5.58  finish=5.58 \n              3425023) sh               cpu=6 start=5.58  finish=5.58 \n              3425024) sh               cpu=13 start=5.58  finish=5.58 \n              3425025) sh               cpu=2 start=5.58  finish=5.58 \n              3425026) sh               cpu=7 start=5.58  finish=5.59 \n                3425027) chmod            cpu=4 start=5.59  finish=5.59 \n          3425028) Run              cpu=10 start=5.59  finish=5.59 \n          3425029) Run              cpu=6 start=5.59  finish=5.59 \n          3425030) Run              cpu=13 start=5.59  finish=5.59 \n          3425031) Run              cpu=10 start=5.59  finish=5.59 \n          3425032) Run              cpu=0 start=5.59  finish=5.59 \n          3425033) Run              cpu=7 start=5.59  finish=5.59 \n          3425034) Run              cpu=12 start=5.59  finish=5.59 \n          3425035) cat              cpu=1 start=5.59  finish=5.59 \n          3425036) rm               cpu=7 start=5.60  finish=5.60 \n          3425037) chmod            cpu=0 start=5.60  finish=5.60 \n          3425038) date             cpu=12 start=5.60  finish=5.60 \n          3425039) cat              cpu=7 start=5.60  finish=5.60 \n          3425040) rm               cpu=2 start=5.60  finish=5.60 \n          3425041) uname            cpu=6 
start=5.60  finish=5.60 \n          3425042) date             cpu=0 start=5.61  finish=5.61 \n          3425043) Run              cpu=7 start=5.61  finish=5.61 \n            3425044) who              cpu=4 start=5.61  finish=5.61 \n            3425045) wc               cpu=1 start=5.61  finish=5.61 \n          3425046) Run              cpu=2 start=5.61  finish=5.61 \n            3425047) Run              cpu=0 start=5.61  finish=5.61 \n            3425048) sed              cpu=6 start=5.61  finish=5.61 \n          3425049) rm               cpu=0 start=5.61  finish=5.61 \n          3425050) sync             cpu=7 start=5.61  finish=5.62 \n          3425052) sync             cpu=0 start=5.62  finish=5.62 \n          3425053) sleep            cpu=4 start=5.62  finish=15.62\n          3425056) Run              cpu=5 start=15.62 finish=15.62\n          3425057) time             cpu=6 start=15.62 finish=25.63\n            3425058) dhry2            cpu=7 start=15.63 finish=25.63\n          3425061) sync             cpu=5 start=25.63 finish=25.63\n          3425062) sync             cpu=6 start=25.63 finish=25.63\n          3425063) sleep            cpu=5 start=25.63 finish=35.63\n          3425064) Run              cpu=6 start=35.63 finish=35.63\n          3425065) time             cpu=6 start=35.63 finish=45.64\n            3425066) dhry2            cpu=15 start=35.64 finish=45.64\n          3425067) sync             cpu=4 start=45.64 finish=45.64\n          3425068) sync             cpu=5 start=45.64 finish=45.64\n          3425069) sleep            cpu=15 start=45.64 finish=55.64\n          3425070) Run              cpu=12 start=55.65 finish=55.65\n          3425071) time             cpu=5 start=55.65 finish=65.65\n            3425072) dhry2            cpu=0 start=55.65 finish=65.65\n          3425073) sync             cpu=12 start=65.65 finish=65.65\n          3425074) sync             cpu=6 start=65.65 finish=65.65\n          3425075) sleep            cpu=7 
start=65.65 finish=75.65\n          3425076) Run              cpu=5 start=75.66 finish=75.66\n          3425077) time             cpu=4 start=75.66 finish=85.66\n            3425078) dhry2            cpu=6 start=75.66 finish=85.66\n          3425079) sync             cpu=5 start=85.66 finish=85.66\n          3425080) sync             cpu=6 start=85.66 finish=85.66\n          3425081) sleep            cpu=5 start=85.66 finish=95.67\n          3425082) Run              cpu=6 start=95.67 finish=95.67\n          3425083) time             cpu=7 start=95.67 finish=105.67\n            3425084) dhry2            cpu=0 start=95.67 finish=105.67\n          3425085) sync             cpu=12 start=105.67 finish=105.67\n          3425086) sync             cpu=5 start=105.67 finish=105.67\n          3425087) sleep            cpu=6 start=105.67 finish=115.68\n          3425088) Run              cpu=12 start=115.68 finish=115.68\n          3425089) time             cpu=13 start=115.68 finish=125.68\n            3425090) dhry2            cpu=6 start=115.68 finish=125.68\n          3425091) cleanup.sh       cpu=4 start=125.68 finish=125.69\n            3425092) cleanup.sh       cpu=13 start=125.68 finish=125.68\n            3425093) cleanup.sh       cpu=7 start=125.68 finish=125.68\n            3425094) awk              cpu=0 start=125.68 finish=125.69\n            3425095) cat              cpu=6 start=125.69 finish=125.69\n            3425096) rm               cpu=7 start=125.69 finish=125.69\n            3425097) cleanup.sh       cpu=1 start=125.69 finish=125.69\n          3425098) date             cpu=2 start=125.69 finish=125.69\n          3425099) Run              cpu=13 start=125.69 finish=125.69\n            3425100) who              cpu=6 start=125.69 finish=125.69\n            3425101) wc               cpu=0 start=125.69 finish=125.69\n          3425102) sh               cpu=4 start=125.69 finish=125.69\n            3425103) awk              cpu=7 start=125.69 finish=125.69\n 
         3425104) sh               cpu=9 start=125.69 finish=125.70\n            3425105) sort             cpu=7 start=125.70 finish=125.70\n            3425106) join             cpu=2 start=125.70 finish=125.70\n            3425107) awk              cpu=5 start=125.70 finish=125.70\n            3425108) rm               cpu=4 start=125.70 finish=125.70\n          3425109) cat              cpu=6 start=125.70 finish=125.70\n        3425110) cat              cpu=0 start=125.70 finish=125.70\n        3425111) grep             cpu=7 start=125.70 finish=125.70\n<\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>This workload tries the BYTE magazine benchmarks. There is one test which reports a &#8220;LPS&#8221; score and is labeled Dhrystone2. Topdown metrics show this is single-threaded Topdown profile shows a high retirement rate and low backend stalls. AMD metrics show <span class=\"excerpt-dots\">&hellip;<\/span> <a class=\"more-link\" href=\"https:\/\/mvermeulen.org\/perf\/workloads\/phoronix\/byte\/\"><span class=\"more-msg\">Continue reading 
&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"parent":58,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-1507","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/1507","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/comments?post=1507"}],"version-history":[{"count":3,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/1507\/revisions"}],"predecessor-version":[{"id":1654,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/1507\/revisions\/1654"}],"up":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/58"}],"wp:attachment":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/media?parent=1507"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}