{"id":988,"date":"2024-01-27T17:25:23","date_gmt":"2024-01-27T17:25:23","guid":{"rendered":"https:\/\/mvermeulen.org\/perf\/?page_id=988"},"modified":"2024-01-28T19:19:09","modified_gmt":"2024-01-28T19:19:09","slug":"glibc-bench","status":"publish","type":"page","link":"https:\/\/mvermeulen.org\/perf\/workloads\/phoronix\/glibc-bench\/","title":{"rendered":"glibc-bench"},"content":{"rendered":"\n<p>A quick-running test of 15 glibc functions, mostly math routines plus pthread creation. These run single-threaded. This benchmark is an outlier: the Intel CPU reports results ~2x faster than AMD, whereas AMD is faster in almost every other case, with a median of AMD being 1.25x faster than my Intel CPU.<\/p>\n\n\n\n<p>These look like single-threaded tests that also do not use the CPU much while they run.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/systemtime-74.png\" alt=\"\" class=\"wp-image-1000\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/systemtime-74.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/systemtime-74-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/systemtime-74-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>The topdown profile is all over the map, but frontend bound is generally the highest.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-111.png\" alt=\"\" class=\"wp-image-1001\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-111.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-111-1024x768.png 1024w, 
https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-111-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>AMD metrics confirm low CPU utilization: on average fewer than 0.25 of the 16 cores are busy. This is floating-point code in which about 1\/6 of the instructions are branches.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed              361.394\non_cpu               0.014          # 0.22 \/ 16 cores\nutime                79.680\nstime                1.329\nnvcsw                2624           # 86.03%\nnivcsw               426            # 13.97%\ninblock              0              # 0.00\/sec\nonblock              590512         # 1633.98\/sec\ncpu-clock            81084003036    # 81.084 seconds\ntask-clock           81091329790    # 81.091 seconds\npage faults          171284         # 2112.236\/sec\ncontext switches     4566           # 56.307\/sec\ncpu migrations       308            # 3.798\/sec\nmajor page faults    2              # 0.025\/sec\nminor page faults    171282         # 2112.211\/sec\nalignment faults     0              # 0.000\/sec\nemulation faults     0              # 0.000\/sec\nbranches             143004053928   # 168.158 branches per 1000 inst\nbranch misses        147238834      # 0.10% branch miss\nconditional          76874647961    # 90.397 conditional branches per 1000 inst\nindirect             18182592456    # 21.381 indirect branches per 1000 inst\ncpu-cycles           296276857093   # 0.05 GHz\ninstructions         835134997575   # 2.82 IPC\nslots                595808733960   #\nretiring             292824575828   # 49.1% (49.2%)\n-- ucode             489790524      #     0.1%\n-- fastpath          292334785304   #    49.1%\nfrontend             75678185011    # 12.7% (12.7%)\n-- latency           33401593776    #     5.6%\n-- bandwidth         42276591235    #     7.1%\nbackend              225177966311   # 37.8% (37.8%)\n-- cpu               177289153913   #    29.8%\n-- memory           
 47888812398    #     8.0%\nspeculation          1976038812     #  0.3% ( 0.3%) low\n-- branch mispredict 1551074100     #     0.3%\n-- pipeline restart  424964712      #     0.1%\nsmt-contention       151682832      #  0.0% ( 0.0%)\ncpu-cycles           308438408523   # 0.05 GHz\ninstructions         873920649612   # 2.83 IPC\ninstructions         291987017136   # 0.711 l2 access per 1000 inst\nl2 hit from l1       186290923      # 15.32% l2 miss\nl2 miss from l1      20387235       #\nl2 hit from l2 pf    9865651        #\nl3 hit from l2 pf    4957233        #\nl3 miss from l2 pf   6466175        #\ninstructions         291912058167   # 241.214 float per 1000 inst\nfloat 512            99             # 0.000 AVX-512 per 1000 inst\nfloat 256            686            # 0.000 AVX-256 per 1000 inst\nfloat 128            70413333814    # 241.214 AVX-128 per 1000 inst\nfloat MMX            0              # 0.000 MMX per 1000 inst\nfloat scalar         0              # 0.000 scalar per 1000 inst\n<\/code><\/pre>\n\n\n\n<p>Intel metrics<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed              348.282\non_cpu               0.014          # 0.23 \/ 16 cores\nutime                78.891\nstime                0.979\nnvcsw                2392           # 86.26%\nnivcsw               381            # 13.74%\ninblock              8              # 0.02\/sec\nonblock              579000         # 1662.44\/sec\ncpu-clock            79962687901    # 79.963 seconds\ntask-clock           79973477567    # 79.973 seconds\npage faults          159208         # 1990.760\/sec\ncontext switches     4242           # 53.043\/sec\ncpu migrations       348            # 4.351\/sec\nmajor page faults    0              # 0.000\/sec\nminor page faults    159208         # 1990.760\/sec\nalignment faults     0              # 0.000\/sec\nemulation faults     0              # 0.000\/sec\nbranches             184610022112   # 183.279 branches per 1000 inst\nbranch misses        48479139     
  # 0.03% branch miss\nconditional          184610037632   # 183.279 conditional branches per 1000 inst\nindirect             24923776264    # 24.744 indirect branches per 1000 inst\nslots                1796340876272  #\nretiring             1014873631013  # 56.5% (56.5%) high\n-- ucode             80587003382    #     4.5%\n-- fastpath          934286627631   #    52.0%\nfrontend             48346621350    #  2.7% ( 2.7%) low\n-- latency           4806798326     #     0.3%\n-- bandwidth         43539823024    #     2.4%\nbackend              727802271913   # 40.5% (40.5%)\n-- cpu               547583708462   #    30.5%\n-- memory            180218563451   #    10.0%\nspeculation          5601582618     #  0.3% ( 0.3%) low\n-- branch mispredict 5416805254     #     0.3%\n-- pipeline restart  184777364      #     0.0%\nsmt-contention       0              #  0.0% ( 0.0%)\ncpu-cycles           356432878563   # 0.05 GHz\ninstructions         1213906374887  # 3.41 IPC high\nl2 access            416472360      # 0.343 l2 access per 1000 inst\nl2 miss              129752775      # 31.16% l2 miss\n<\/code><\/pre>\n\n\n\n<p>The process overview attributes the largest share of time to the dynamic linker, ld.so, while glibc-bench itself shows zero time.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>470 processes\n\t 47 ld.so                   76.06     0.00\n\t 68 clinfo                  17.19     8.32\n\t 38 vulkaninfo               1.51     0.76\n\t  4 vulkani:disk$0           0.16     0.08\n\t  6 php                      0.12     0.59\n\t  6 glxinfo:gdrv0            0.12     0.08\n\t  6 glxinfo:gl0              0.12     0.08\n\t  2 llvmpipe-0               0.08     0.04\n\t  2 llvmpipe-1               0.08     0.04\n\t  2 llvmpipe-10              0.08     0.04\n\t  2 llvmpipe-11              0.08     0.04\n\t  2 llvmpipe-12              0.08     0.04\n\t  2 llvmpipe-13              0.08     0.04\n\t  2 llvmpipe-14              0.08     0.04\n\t  2 llvmpipe-15              0.08     0.04\n\t  2 llvmpipe-2 
              0.08     0.04\n\t  2 llvmpipe-3               0.08     0.04\n\t  2 llvmpipe-4               0.08     0.04\n\t  2 llvmpipe-5               0.08     0.04\n\t  2 llvmpipe-6               0.08     0.04\n\t  2 llvmpipe-7               0.08     0.04\n\t  2 llvmpipe-8               0.08     0.04\n\t  2 llvmpipe-9               0.08     0.04\n\t  2 glxinfo                  0.06     0.04\n\t  2 glxinfo:cs0              0.06     0.04\n\t  2 glxinfo:disk$0           0.06     0.04\n\t  2 glxinfo:sh0              0.06     0.04\n\t  2 glxinfo:shlo0            0.06     0.04\n\t  6 clang                    0.05     0.07\n\t  3 rocminfo                 0.03     0.00\n\t  1 lspci                    0.00     0.03\n\t  1 ps                       0.00     0.01\n\t110 sh                       0.00     0.00\n\t 47 glibc-bench              0.00     0.00\n\t 13 gcc                      0.00     0.00\n\t 11 gsettings                0.00     0.00\n\t  8 stat                     0.00     0.00\n\t  8 systemd-detect-          0.00     0.00\n\t  6 llvm-link                0.00     0.00\n\t  5 phoronix-test-s          0.00     0.00\n\t  3 gmain                    0.00     0.00\n\t  2 cc                       0.00     0.00\n\t  2 dconf worker             0.00     0.00\n\t  2 lscpu                    0.00     0.00\n\t  2 uname                    0.00     0.00\n\t  2 which                    0.00     0.00\n\t  2 xset                     0.00     0.00\n\t  1 date                     0.00     0.00\n\t  1 dirname                  0.00     0.00\n\t  1 dmesg                    0.00     0.00\n\t  1 dmidecode                0.00     0.00\n\t  1 grep                     0.00     0.00\n\t  1 ifconfig                 0.00     0.00\n\t  1 ip                       0.00     0.00\n\t  1 lsmod                    0.00     0.00\n\t  1 mktemp                   0.00     0.00\n\t  1 qdbus                    0.00     0.00\n\t  1 readlink                 0.00     0.00\n\t  1 realpath                 0.00    
 0.00\n\t  1 sed                      0.00     0.00\n\t  1 sort                     0.00     0.00\n\t  1 stty                     0.00     0.00\n\t  1 systemctl                0.00     0.00\n\t  1 template.sh              0.00     0.00\n\t  1 wc                       0.00     0.00\n\t  1 xrandr                   0.00     0.00\n0 processes running\n47 maximum processes\n<\/code><\/pre>\n\n\n\n<p>The process tree shows that &#8220;ld.so&#8221; is the proxy for where the tests actually run: each glibc-bench process is paired with an ld.so child covering the same time interval.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>      1103623) glibc-bench      cpu=2 start=5.48  finish=6.62 \n        1103624) ld.so            cpu=3 start=5.48  finish=6.62 \n      1103625) glibc-bench      cpu=1 start=10.63 finish=11.77\n        1103626) ld.so            cpu=10 start=10.63 finish=11.77\n      1103629) glibc-bench      cpu=10 start=15.77 finish=16.91\n        1103630) ld.so            cpu=11 start=15.77 finish=16.91\n      1103631) glibc-bench      cpu=1 start=20.91 finish=22.06\n        1103632) ld.so            cpu=2 start=20.92 finish=22.05\n      1103633) sh               cpu=1 start=22.06 finish=22.06\n        1103634) sh               cpu=11 start=22.06 finish=22.06\n      1103635) glibc-bench      cpu=1 start=32.21 finish=36.35\n        1103636) ld.so            cpu=10 start=32.22 finish=36.35\n      1103637) glibc-bench      cpu=9 start=40.35 finish=44.48\n        1103638) ld.so            cpu=10 start=40.35 finish=44.48\n      1103639) glibc-bench      cpu=9 start=48.49 finish=52.62\n        1103640) ld.so            cpu=2 start=48.49 finish=52.62\n      1103641) sh               cpu=11 start=52.62 finish=52.63\n        1103642) sh               cpu=12 start=52.62 finish=52.63\n      1103643) glibc-bench      cpu=2 start=62.79 finish=63.91\n        1103644) ld.so            cpu=3 start=62.79 finish=63.91\n      1103645) glibc-bench      cpu=10 start=67.92 finish=69.05\n        1103646) ld.so            cpu=11 start=67.92 finish=69.05\n      1103647) 
glibc-bench      cpu=9 start=73.05 finish=74.18\n        1103648) ld.so            cpu=10 start=73.05 finish=74.18\n      1103650) sh               cpu=10 start=74.18 finish=74.18\n        1103651) sh               cpu=3 start=74.18 finish=74.18\n      1103652) glibc-bench      cpu=1 start=84.34 finish=86.49\n        1103653) ld.so            cpu=10 start=84.35 finish=86.49\n      1103654) glibc-bench      cpu=9 start=90.49 finish=92.64\n        1103655) ld.so            cpu=10 start=90.50 finish=92.64\n      1103656) glibc-bench      cpu=9 start=96.64 finish=98.78\n        1103657) ld.so            cpu=10 start=96.64 finish=98.78\n<\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>Quick-running test of 15 glibc functions, mostly math routines and also pthread creation. These run single-threaded. This benchmark is an outlier with Intel CPU reporting ~2X faster than AMD while almost every case has AMD faster and a median <span class=\"excerpt-dots\">&hellip;<\/span> <a class=\"more-link\" href=\"https:\/\/mvermeulen.org\/perf\/workloads\/phoronix\/glibc-bench\/\"><span class=\"more-msg\">Continue reading 
&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"parent":58,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-988","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/988","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/comments?post=988"}],"version-history":[{"count":3,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/988\/revisions"}],"predecessor-version":[{"id":1002,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/988\/revisions\/1002"}],"up":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/58"}],"wp:attachment":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/media?parent=988"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}