{"id":794,"date":"2024-01-21T23:35:00","date_gmt":"2024-01-21T23:35:00","guid":{"rendered":"https:\/\/mvermeulen.org\/perf\/?page_id=794"},"modified":"2024-01-21T23:35:01","modified_gmt":"2024-01-21T23:35:01","slug":"build-mplayer","status":"publish","type":"page","link":"https:\/\/mvermeulen.org\/perf\/workloads\/phoronix\/build-mplayer\/","title":{"rendered":"build-mplayer"},"content":{"rendered":"\n<p>Another build test, this time for the mplayer media player. A fairly quick build taking less than a minute.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/systemtime-50.png\" alt=\"\" class=\"wp-image-795\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/systemtime-50.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/systemtime-50-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/systemtime-50-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>The topdown overview looks similar to other build workloads: generally heavier on frontend stalls and lighter on backend stalls, 
with slight differences toward the end, at the link stage.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-88.png\" alt=\"\" class=\"wp-image-796\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-88.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-88-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-88-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>AMD metrics show ~12 of 16 cores kept busy, and about 1\/5th of the instructions are branches.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed              131.083\non_cpu               0.749          # 11.98 \/ 16 cores\nutime                1434.909\nstime                136.095\nnvcsw                184881         # 54.88%\nnivcsw               151998         # 45.12%\ninblock              4840           # 36.92\/sec\nonblock              815992         # 6225.02\/sec\ncpu-clock            1569995918518  # 1569.996 seconds\ntask-clock           1570022200562  # 1570.022 seconds\npage faults          33042137       # 21045.650\/sec\ncontext switches     311123         # 198.165\/sec\ncpu migrations       32456          # 20.672\/sec\nmajor page faults    704            # 0.448\/sec\nminor page faults    33041433       # 21045.201\/sec\nalignment faults     0              # 0.000\/sec\nemulation faults     0              # 0.000\/sec\nbranches             1588505172838  # 207.892 branches per 1000 inst\nbranch misses        43921596872    # 2.76% branch miss\nconditional          1230296354350  # 161.012 conditional branches per 1000 inst\nindirect             34954151096    # 4.575 indirect branches per 1000 inst\ncpu-cycles           5992955401481  # 2.86 GHz\ninstructions         
7521843661027  # 1.26 IPC\nslots                12295701158958 #\nretiring             2473184787261  # 20.1% (25.0%)\n-- ucode             3277908923     #     0.0%\n-- fastpath          2469906878338  #    20.1%\nfrontend             4267191925642  # 34.7% (43.1%)\n-- latency           3145065275880  #    25.6%\n-- bandwidth         1122126649762  #     9.1%\nbackend              2704586566512  # 22.0% (27.3%)\n-- cpu               373789605992   #     3.0%\n-- memory            2330796960520  #    19.0%\nspeculation          458055998059   #  3.7% ( 4.6%)\n-- branch mispredict 453301995104   #     3.7%\n-- pipeline restart  4754002955     #     0.0%\nsmt-contention       2392662345903  # 19.5% ( 0.0%)\ncpu-cycles           5999256771224  # 2.86 GHz\ninstructions         7518589349203  # 1.25 IPC\ninstructions         2545551521414  # 39.154 l2 access per 1000 inst\nl2 hit from l1       86386798530    # 17.25% l2 miss\nl2 miss from l1      10772079364    #\nl2 hit from l2 pf    6864784791     #\nl3 hit from l2 pf    3645118667     #\nl3 miss from l2 pf   2771144343     #\ninstructions         2542523986390  # 24.173 float per 1000 inst\nfloat 512            8375           # 0.000 AVX-512 per 1000 inst\nfloat 256            567122         # 0.000 AVX-256 per 1000 inst\nfloat 128            61459757300    # 24.173 AVX-128 per 1000 inst\nfloat MMX            0              # 0.000 MMX per 1000 inst\nfloat scalar         1              # 0.000 scalar per 1000 inst\n<\/code><\/pre>\n\n\n\n<p>Intel metrics show a larger overall elapsed time, perhaps because of additional variability depending on what cores are used.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed              702.937\non_cpu               0.830          # 13.27 \/ 16 cores\nutime                8850.134\nstime                480.073\nnvcsw                805769         # 51.24%\nnivcsw               766621         # 48.76%\ninblock              37360          # 53.15\/sec\nonblock              
2924240        # 4160.03\/sec\ncpu-clock            9327209381783  # 9327.209 seconds\ntask-clock           9327273727003  # 9327.274 seconds\npage faults          150700356      # 16156.957\/sec\ncontext switches     1466331        # 157.209\/sec\ncpu migrations       144812         # 15.526\/sec\nmajor page faults    2783           # 0.298\/sec\nminor page faults    150697573      # 16156.658\/sec\nalignment faults     0              # 0.000\/sec\nemulation faults     0              # 0.000\/sec\nbranches             7741484482358  # 207.389 branches per 1000 inst\nbranch misses        191113104298   # 2.47% branch miss\nconditional          7741488432022  # 207.389 conditional branches per 1000 inst\nindirect             1540448949106  # 41.268 indirect branches per 1000 inst\nslots                9270230760800  #\nretiring             3740816186972  # 40.4% (40.4%)\n-- ucode             254059658497   #     2.7%\n-- fastpath          3486756528475  #    37.6%\nfrontend             3249083635174  # 35.0% (35.0%)\n-- latency           1538385778881  #    16.6%\n-- bandwidth         1710697856293  #    18.5%\nbackend              700103408302   #  7.6% ( 7.6%)\n-- cpu               352231579537   #     3.8%\n-- memory            347871828765   #     3.8%\nspeculation          1600749496422  # 17.3% (17.3%)\n-- branch mispredict 1561048913198  #    16.8%\n-- pipeline restart  39700583224    #     0.4%\nsmt-contention       0              #  0.0% ( 0.0%)\ncpu-cycles           3783910784078  # 1.61 GHz\ninstructions         5236575955741  # 1.38 IPC\nl2 access            158201882092   # 39.489 l2 access per 1000 inst\nl2 miss              35363291724    # 22.35% l2 miss\n<\/code><\/pre>\n\n\n\n<p>The process summary shows more compilation than I expected, with ~5900 invocations of the C compiler front end (cc1) dominating the time.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>28756 processes\n\t5918 cc1                   1292.43    69.61\n\t331 yasm                    56.10     0.40\n\t 
68 clinfo                  16.53     5.99\n\t 30 make                     6.38     2.36\n\t5917 as                       3.86     0.09\n\t157 ld                       1.67     1.16\n\t  1 xz                       0.80     0.05\n\t 38 vulkaninfo               0.76     1.34\n\t137 awk                      0.61     0.01\n\t 22 ar                       0.33     0.58\n\t  6 glxinfo:gdrv0            0.14     0.07\n\t  6 php                      0.09     0.21\n\t  3 codec-cfg                0.09     0.00\n\t  4 vulkani:disk$0           0.08     0.14\n\t  6 clang                    0.07     0.05\n\t  2 glxinfo                  0.07     0.03\n\t  2 glxinfo:cs0              0.07     0.03\n\t  2 glxinfo:disk$0           0.07     0.03\n\t  2 glxinfo:shlo0            0.07     0.03\n\t  2 glxinfo:sh0              0.06     0.03\n\t2351 configure                0.04     0.32\n\t  2 llvmpipe-0               0.04     0.07\n\t  2 llvmpipe-1               0.04     0.07\n\t  2 llvmpipe-10              0.04     0.07\n\t  2 llvmpipe-11              0.04     0.07\n\t  2 llvmpipe-12              0.04     0.07\n\t  2 llvmpipe-13              0.04     0.07\n\t  2 llvmpipe-14              0.04     0.07\n\t  2 llvmpipe-15              0.04     0.07\n\t  2 llvmpipe-2               0.04     0.07\n\t  2 llvmpipe-3               0.04     0.07\n\t  2 llvmpipe-4               0.04     0.07\n\t  2 llvmpipe-5               0.04     0.07\n\t  2 llvmpipe-6               0.04     0.07\n\t  2 llvmpipe-7               0.04     0.07\n\t  2 llvmpipe-8               0.04     0.07\n\t  2 llvmpipe-9               0.04     0.07\n\t  3 rocminfo                 0.03     0.00\n\t  1 tar                      0.02     0.41\n\t  1 lspci                    0.01     0.01\n\t431 rm                       0.00     0.33\n\t5643 sh                       0.00     0.07\n\t  1 ps                       0.00     0.01\n\t5934 cc                       0.00     0.00\n\t536 cat                      0.00     0.00\n\t526 tr             
          0.00     0.00\n\t157 collect2                 0.00     0.00\n\t 46 sed                      0.00     0.00\n\t 24 pkg-config               0.00     0.00\n\t 21 true                     0.00     0.00\n\t 14 cut                      0.00     0.00\n\t 14 tmp                      0.00     0.00\n\t 13 gcc                      0.00     0.00\n\t 13 grep                     0.00     0.00\n\t 12 cmp                      0.00     0.00\n\t 12 gsettings                0.00     0.00\n\t 10 cp                       0.00     0.00\n\t 10 head                     0.00     0.00\n\t  8 stat                     0.00     0.00\n\t  8 systemd-detect-          0.00     0.00\n\t  8 version.sh               0.00     0.00\n\t  6 llvm-link                0.00     0.00\n\t  5 phoronix-test-s          0.00     0.00\n\t  4 bash                     0.00     0.00\n\t  4 help_create.sh           0.00     0.00\n\t  4 uname                    0.00     0.00\n\t  3 basename                 0.00     0.00\n\t  3 mv                       0.00     0.00\n\t  3 time-compile-mp          0.00     0.00\n\t  3 touch                    0.00     0.00\n\t  2 dconf worker             0.00     0.00\n\t  2 git                      0.00     0.00\n\t  2 gmain                    0.00     0.00\n\t  2 lscpu                    0.00     0.00\n\t  2 tail                     0.00     0.00\n\t  2 which                    0.00     0.00\n\t  2 xset                     0.00     0.00\n\t  1 date                     0.00     0.00\n\t  1 dirname                  0.00     0.00\n\t  1 dmesg                    0.00     0.00\n\t  1 dmidecode                0.00     0.00\n\t  1 ifconfig                 0.00     0.00\n\t  1 ip                       0.00     0.00\n\t  1 lsmod                    0.00     0.00\n\t  1 mkdir                    0.00     0.00\n\t  1 mktemp                   0.00     0.00\n\t  1 nm                       0.00     0.00\n\t  1 qdbus                    0.00     0.00\n\t  1 readlink                 0.00     
0.00\n\t  1 realpath                 0.00     0.00\n\t  1 sort                     0.00     0.00\n\t  1 strings                  0.00     0.00\n\t  1 stty                     0.00     0.00\n\t  1 systemctl                0.00     0.00\n\t  1 template.sh              0.00     0.00\n\t  1 wc                       0.00     0.00\n\t  1 xrandr                   0.00     0.00\n77 processes running\n158 maximum processes\n<\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>Another build test, this time for the mplayer media player. A fairly quick build taking less than a minute. Topdown overview shows a similar overview to other build processes. Generally heavier on front-end stalls and not as much backend stalls. <span class=\"excerpt-dots\">&hellip;<\/span> <a class=\"more-link\" href=\"https:\/\/mvermeulen.org\/perf\/workloads\/phoronix\/build-mplayer\/\"><span class=\"more-msg\">Continue reading &rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"parent":58,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-794","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/794","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/comments?post=794"}],"version-history":[{"count":1,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/794\/revisions"}],"predecessor-version":[{"id":797,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/794\/revisions\/797"}],"up":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/58"}],"w
p:attachment":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/media?parent=794"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}