{"id":810,"date":"2024-01-22T10:57:14","date_gmt":"2024-01-22T10:57:14","guid":{"rendered":"https:\/\/mvermeulen.org\/perf\/?page_id=810"},"modified":"2024-01-24T12:07:43","modified_gmt":"2024-01-24T12:07:43","slug":"build-php","status":"publish","type":"page","link":"https:\/\/mvermeulen.org\/perf\/workloads\/phoronix\/build-php\/","title":{"rendered":"build-php"},"content":{"rendered":"\n<p>This test measures how long it takes to build PHP. It is a quick-running test, taking not much more than a minute per run. The profile below appears to show a parallel build step followed by a link step.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/systemtime-57.png\" alt=\"\" class=\"wp-image-857\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/systemtime-57.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/systemtime-57-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/systemtime-57-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>The topdown profile has some noise but, as with other build benchmarks, frontend stalls are high and backend stalls are lower. 
This also seems to show some branch misprediction, more so in the link phases.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-95.png\" alt=\"\" class=\"wp-image-859\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-95.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-95-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-95-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>AMD metrics show little floating point and a lower-than-average L2 rate. On-CPU time shows about half of the cores kept busy, reflecting the balance between parallel compiles and a single link.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed              221.438\non_cpu               0.495          # 7.92 \/ 16 cores\nutime                1555.675\nstime                197.680\nnvcsw                450612         # 65.88%\nnivcsw               233377         # 34.12%\ninblock              0              # 0.00\/sec\nonblock              6228496        # 28127.56\/sec\ncpu-clock            1748873750683  # 1748.874 seconds\ntask-clock           1748972352894  # 1748.972 seconds\npage faults          58117746       # 33229.654\/sec\ncontext switches     528516         # 302.187\/sec\ncpu migrations       79690          # 45.564\/sec\nmajor page faults    2405           # 1.375\/sec\nminor page faults    58115341       # 33228.279\/sec\nalignment faults     0              # 0.000\/sec\nemulation faults     0              # 0.000\/sec\nbranches             1594432602942  # 207.684 branches per 1000 inst\nbranch misses        52058772559    # 3.27% branch miss\nconditional          1213061060088  # 158.008 conditional branches per 1000 inst\nindirect          
   34178953667    # 4.452 indirect branches per 1000 inst\ncpu-cycles           6605765875438  # 1.87 GHz\ninstructions         7229655455835  # 1.09 IPC\nslots                14201053487148 #\nretiring             2501769169927  # 17.6% (20.7%)\n-- ucode             3564851676     #     0.0%\n-- fastpath          2498204318251  #    17.6%\nfrontend             4796047097332  # 33.8% (39.8%)\n-- latency           3588596456616  #    25.3%\n-- bandwidth         1207450640716  #     8.5%\nbackend              4211510261156  # 29.7% (34.9%)\n-- cpu               469036196670   #     3.3%\n-- memory            3742474064486  #    26.4%\nspeculation          554865158779   #  3.9% ( 4.6%)\n-- branch mispredict 548480394414   #     3.9%\n-- pipeline restart  6384764365     #     0.0%\nsmt-contention       2136780015843  # 15.0% ( 0.0%)\ncpu-cycles           6604211246538  # 1.86 GHz\ninstructions         7236674856639  # 1.10 IPC\ninstructions         2553785601077  # 43.692 l2 access per 1000 inst\nl2 hit from l1       93760895583    # 21.62% l2 miss\nl2 miss from l1      14637262233    #\nl2 hit from l2 pf    8330227096     #\nl3 hit from l2 pf    4566519708     #\nl3 miss from l2 pf   4923249837     #\ninstructions         2553295973680  # 24.210 float per 1000 inst\nfloat 512            25045          # 0.000 AVX-512 per 1000 inst\nfloat 256            3210309        # 0.001 AVX-256 per 1000 inst\nfloat 128            61812154017    # 24.209 AVX-128 per 1000 inst\nfloat MMX            0              # 0.000 MMX per 1000 inst\nfloat scalar         0              # 0.000 scalar per 1000 inst\n<\/code><\/pre>\n\n\n\n<p>Intel metrics<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed              235.695\non_cpu               0.506          # 8.09 \/ 16 cores\nutime                1765.203\nstime                141.292\nnvcsw                440121         # 65.10%\nnivcsw               235947         # 34.90%\ninblock              28440          # 
120.66\/sec\nonblock              6217808        # 26380.75\/sec\ncpu-clock            1902685896087  # 1902.686 seconds\ntask-clock           1902723588935  # 1902.724 seconds\npage faults          57383846       # 30158.793\/sec\ncontext switches     526263         # 276.584\/sec\ncpu migrations       70667          # 37.140\/sec\nmajor page faults    1502           # 0.789\/sec\nminor page faults    57382344       # 30158.003\/sec\nalignment faults     0              # 0.000\/sec\nemulation faults     0              # 0.000\/sec\nbranches             1557014667338  # 204.660 branches per 1000 inst\nbranch misses        37368996589    # 2.40% branch miss\nconditional          1557020297226  # 204.661 conditional branches per 1000 inst\nindirect             257048243067   # 33.787 indirect branches per 1000 inst\nslots                13590541943186 #\nretiring             4548295163703  # 33.5% (33.5%)\n-- ucode             358429660714   #     2.6%\n-- fastpath          4189865502989  #    30.8%\nfrontend             4396200016791  # 32.3% (32.3%)\n-- latency           2292355074576  #    16.9%\n-- bandwidth         2103844942215  #    15.5%\nbackend              2506228334418  # 18.4% (18.4%)\n-- cpu               822785143049   #     6.1%\n-- memory            1683443191369  #    12.4%\nspeculation          2187866665844  # 16.1% (16.1%)\n-- branch mispredict 2120136787339  #    15.6%\n-- pipeline restart  67729878505    #     0.5%\nsmt-contention       0              #  0.0% ( 0.0%)\ncpu-cycles           4992606491814  # 1.33 GHz\ninstructions         6885561939212  # 1.38 IPC\nl2 access            216898743213   # 45.077 l2 access per 1000 inst\nl2 miss              66417480784    # 30.62% l2 miss\n<\/code><\/pre>\n\n\n\n<p>The process overview shows that the C compiler proper (cc1) takes most of the time, with ~2300 compilations over three runs and 221 invocations of ld. Even so, about 175k processes were started. 
The high number of processes still listed as &#8220;running&#8221; suggests that some events are still being lost.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>175042 processes\n\t2307 cc1                   1142.99    76.63\n\t2208 as                      42.90     4.12\n\t 68 clinfo                  16.20     6.65\n\t91992 bash                    10.43    12.96\n\t  3 minilua                  4.18     0.02\n\t221 ld                       2.43     1.31\n\t 38 vulkaninfo               1.15     0.96\n\t  1 xz                       0.70     0.06\n\t  6 make                     0.59     0.15\n\t  4 vulkani:disk$0           0.13     0.11\n\t  6 glxinfo:gdrv0            0.11     0.10\n\t 13 php                      0.09     0.05\n\t  6 clang                    0.08     0.04\n\t  2 llvmpipe-0               0.06     0.05\n\t  2 llvmpipe-1               0.06     0.05\n\t  2 llvmpipe-10              0.06     0.05\n\t  2 llvmpipe-11              0.06     0.05\n\t  2 llvmpipe-12              0.06     0.05\n\t  2 llvmpipe-13              0.06     0.05\n\t  2 llvmpipe-14              0.06     0.05\n\t  2 llvmpipe-15              0.06     0.05\n\t  2 llvmpipe-2               0.06     0.05\n\t  2 llvmpipe-3               0.06     0.05\n\t  2 llvmpipe-4               0.06     0.05\n\t  2 llvmpipe-5               0.06     0.05\n\t  2 llvmpipe-6               0.06     0.05\n\t  2 llvmpipe-7               0.06     0.05\n\t  2 llvmpipe-8               0.06     0.05\n\t  2 llvmpipe-9               0.06     0.05\n\t  2 glxinfo                  0.05     0.04\n\t  2 glxinfo:cs0              0.05     0.04\n\t  2 glxinfo:disk$0           0.05     0.04\n\t  2 glxinfo:sh0              0.05     0.04\n\t  2 glxinfo:shlo0            0.05     0.04\n\t  1 tar                      0.04     0.66\n\t 15 find                     0.03     0.10\n\t  3 rocminfo                 0.03     0.00\n\t  1 lspci                    0.01     0.01\n\t4779 rm                       0.00     0.53\n\t2266 cc                       0.00     0.04\n\t  1 
ps                       0.00     0.01\n\t42477 sed                      0.00     0.00\n\t3813 cat                      0.00     0.00\n\t1095 mv                       0.00     0.00\n\t1055 tr                       0.00     0.00\n\t1010 shtool                   0.00     0.00\n\t238 sh                       0.00     0.00\n\t228 grep                     0.00     0.00\n\t216 collect2                 0.00     0.00\n\t 95 mkdir                    0.00     0.00\n\t 63 cut                      0.00     0.00\n\t 52 awk                      0.00     0.00\n\t 44 conftest                 0.00     0.00\n\t 34 wc                       0.00     0.00\n\t 28 expr                     0.00     0.00\n\t 18 dirname                  0.00     0.00\n\t 18 xargs                    0.00     0.00\n\t 17 uname                    0.00     0.00\n\t 14 gsettings                0.00     0.00\n\t 13 configure                0.00     0.00\n\t 13 gcc                      0.00     0.00\n\t 13 pkg-config               0.00     0.00\n\t  8 chmod                    0.00     0.00\n\t  8 stat                     0.00     0.00\n\t  8 systemd-detect-          0.00     0.00\n\t  7 basename                 0.00     0.00\n\t  7 ln                       0.00     0.00\n\t  6 cp                       0.00     0.00\n\t  6 llvm-link                0.00     0.00\n\t  5 nawk                     0.00     0.00\n\t  5 phoronix-test-s          0.00     0.00\n\t  5 rmdir                    0.00     0.00\n\t  5 sort                     0.00     0.00\n\t  4 diff                     0.00     0.00\n\t  4 hostname                 0.00     0.00\n\t  3 ldconfig.real            0.00     0.00\n\t  3 mktemp                   0.00     0.00\n\t  3 nm                       0.00     0.00\n\t  3 time-compile-ph          0.00     0.00\n\t  2 arch                     0.00     0.00\n\t  2 echo                     0.00     0.00\n\t  2 ls                       0.00     0.00\n\t  2 lscpu                    0.00     0.00\n\t  2 tail            
         0.00     0.00\n\t  2 which                    0.00     0.00\n\t  2 xset                     0.00     0.00\n\t  1 bison                    0.00     0.00\n\t  1 cmp                      0.00     0.00\n\t  1 date                     0.00     0.00\n\t  1 dconf worker             0.00     0.00\n\t  1 dmesg                    0.00     0.00\n\t  1 dmidecode                0.00     0.00\n\t  1 file                     0.00     0.00\n\t  1 getconf                  0.00     0.00\n\t  1 gmain                    0.00     0.00\n\t  1 head                     0.00     0.00\n\t  1 ifconfig                 0.00     0.00\n\t  1 ip                       0.00     0.00\n\t  1 ldd                      0.00     0.00\n\t  1 lsmod                    0.00     0.00\n\t  1 qdbus                    0.00     0.00\n\t  1 readlink                 0.00     0.00\n\t  1 realpath                 0.00     0.00\n\t  1 strip                    0.00     0.00\n\t  1 stty                     0.00     0.00\n\t  1 systemctl                0.00     0.00\n\t  1 template.sh              0.00     0.00\n\t  1 touch                    0.00     0.00\n\t  1 uniq                     0.00     0.00\n\t  1 xrandr                   0.00     0.00\n3586 processes running\n3641 maximum processes\n<\/code><\/pre>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>A test of how long it takes to build php. This is a quick running test taking not much more than a minute. Profile below seems to show a parallel build step followed by a link step. 
Topdown profile has <span class=\"excerpt-dots\">&hellip;<\/span> <a class=\"more-link\" href=\"https:\/\/mvermeulen.org\/perf\/workloads\/phoronix\/build-php\/\"><span class=\"more-msg\">Continue reading &rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"parent":58,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-810","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/810","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/comments?post=810"}],"version-history":[{"count":2,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/810\/revisions"}],"predecessor-version":[{"id":860,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/810\/revisions\/860"}],"up":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/58"}],"wp:attachment":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/media?parent=810"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}