{"id":1499,"date":"2024-02-04T10:22:51","date_gmt":"2024-02-04T10:22:51","guid":{"rendered":"https:\/\/mvermeulen.org\/perf\/?page_id=1499"},"modified":"2024-02-09T23:41:04","modified_gmt":"2024-02-09T23:41:04","slug":"build-python","status":"publish","type":"page","link":"https:\/\/mvermeulen.org\/perf\/workloads\/phoronix\/build-python\/","title":{"rendered":"build-python"},"content":{"rendered":"\n<p>This workload builds the Python reference implementation. There are two builds &#8211; a quick default build followed by a much longer release build with PGO and LTO. Both builds appear to run just a few processes much of the time; on average about 3.3 of the 16 cores are busy.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/02\/systemtime-43.png\" alt=\"\" class=\"wp-image-1636\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/02\/systemtime-43.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/02\/systemtime-43-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/02\/systemtime-43-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>The topdown profile shows a mishmash: a small section that is backend bound and a lot of frontend bound time, but still a moderate retirement rate.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/02\/amdtopdown-45.png\" alt=\"\" class=\"wp-image-1637\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/02\/amdtopdown-45.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/02\/amdtopdown-45-1024x768.png 1024w, 
https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/02\/amdtopdown-45-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>AMD metrics are a composite of the chart above. Frontend bound stalls dominate. 1\/5 of the instructions are branches and there is little floating point.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed              294.674\non_cpu               0.206          # 3.29 \/ 16 cores\nutime                893.905\nstime                76.692\nnvcsw                67142          # 42.74%\nnivcsw               89970          # 57.26%\ninblock              0              # 0.00\/sec\nonblock              6307048        # 21403.46\/sec\ncpu-clock            969640099315   # 969.640 seconds\ntask-clock           969714791934   # 969.715 seconds\npage faults          20445840       # 21084.385\/sec\ncontext switches     137572         # 141.869\/sec\ncpu migrations       3852           # 3.972\/sec\nmajor page faults    70             # 0.072\/sec\nminor page faults    20445770       # 21084.313\/sec\nalignment faults     0              # 0.000\/sec\nemulation faults     0              # 0.000\/sec\nbranches             1263933056973  # 209.294 branches per 1000 inst\nbranch misses        34718136087    # 2.75% branch miss\nconditional          978456026819   # 162.022 conditional branches per 1000 inst\nindirect             28886374282    # 4.783 indirect branches per 1000 inst\ncpu-cycles           3927940269465  # 0.83 GHz\ninstructions         5937421788873  # 1.51 IPC\nslots                8063145430038  #\nretiring             1967167817888  # 24.4% (27.2%)\n-- ucode             2070157350     #     0.0%\n-- fastpath          1965097660538  #    24.4%\nfrontend             2824239940812  # 35.0% (39.0%)\n-- latency           2017706742918  #    25.0%\n-- bandwidth         806533197894   #    10.0%\nbackend              1919470574265  # 23.8% (26.5%)\n-- cpu               258477109913   # 
    3.2%\n-- memory            1660993464352  #    20.6%\nspeculation          529853804360   #  6.6% ( 7.3%)\n-- branch mispredict 522073957926   #     6.5%\n-- pipeline restart  7779846434     #     0.1%\nsmt-contention       822390673299   # 10.2% ( 0.0%)\ncpu-cycles           3938096924311  # 0.83 GHz\ninstructions         5938426219684  # 1.51 IPC\ninstructions         2010234670211  # 37.745 l2 access per 1000 inst\nl2 hit from l1       64750413013    # 19.12% l2 miss\nl2 miss from l1      8916585026     #\nl2 hit from l2 pf    5535757794     #\nl3 hit from l2 pf    3443656207     #\nl3 miss from l2 pf   2146741781     #\ninstructions         2009987788704  # 20.783 float per 1000 inst\nfloat 512            5524           # 0.000 AVX-512 per 1000 inst\nfloat 256            39346          # 0.000 AVX-256 per 1000 inst\nfloat 128            41773787580    # 20.783 AVX-128 per 1000 inst\nfloat MMX            0              # 0.000 MMX per 1000 inst\nfloat scalar         97743          # 0.000 scalar per 1000 inst\ninstructions         2706172        #\nopcache              1010366        # 373.356 opcache per 1000 inst\nopcache miss         541242         # 53.6% opcache miss rate\nl1 dTLB miss         5369           # 1.984 L1 dTLB per 1000 inst\nl2 dTLB miss         1181           # 0.436 L2 dTLB per 1000 inst\ninstructions         2689543        #\nicache               1313506        # 488.375 icache per 1000 inst\nicache miss          109997         #  8.4% icache miss rate\nl1 iTLB miss         9              # 0.003 L1 iTLB per 1000 inst\nl2 iTLB miss         0              # 0.000 L2 iTLB per 1000 inst\ntlb flush            20             # 0.007 TLB flush per 1000 inst\n<\/code><\/pre>\n\n\n\n<p>Intel metrics<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed              324.771\non_cpu               0.203          # 3.25 \/ 16 cores\nutime                1005.209\nstime                49.719\nnvcsw                68644          # 43.09%\nnivcsw     
          90647          # 56.91%\ninblock              105168         # 323.82\/sec\nonblock              6299984        # 19398.21\/sec\ncpu-clock            1052934916576  # 1052.935 seconds\ntask-clock           1053092218263  # 1053.092 seconds\npage faults          20459766       # 19428.276\/sec\ncontext switches     140082         # 133.020\/sec\ncpu migrations       5057           # 4.802\/sec\nmajor page faults    415            # 0.394\/sec\nminor page faults    20459351       # 19427.882\/sec\nalignment faults     0              # 0.000\/sec\nemulation faults     0              # 0.000\/sec\nbranches             1252397412602  # 207.980 branches per 1000 inst\nbranch misses        30147723172    # 2.41% branch miss\nconditional          1252398109242  # 207.980 conditional branches per 1000 inst\nindirect             123489984664   # 20.507 indirect branches per 1000 inst\nslots                13345151418776 #\nretiring             4518175248187  # 33.9% (33.9%)\n-- ucode             338293584704   #     2.5%\n-- fastpath          4179881663483  #    31.3%\nfrontend             4419550588938  # 33.1% (33.1%)\n-- latency           2249693527456  #    16.9%\n-- bandwidth         2169857061482  #    16.3%\nbackend              1978497388105  # 14.8% (14.8%) low\n-- cpu               879770968330   #     6.6%\n-- memory            1098726419775  #     8.2%\nspeculation          2486337364015  # 18.6% (18.6%) high\n-- branch mispredict 2411739973223  #    18.1%\n-- pipeline restart  74597390792    #     0.6%\nsmt-contention       0              #  0.0% ( 0.0%)\ncpu-cycles           3207783156104  # 0.62 GHz\ninstructions         5611188459036  # 1.75 IPC\nl2 access            173010524412   # 35.971 l2 access per 1000 inst\nl2 miss              45961720723    # 26.57% l2 miss\ncpu-cycles           2745334048792  # 23.3% memory latency\nload stalls          618825196463   #  3.4% l1 bound\nl1 miss              525797505055   # 10.3% l2 bound\nl2 miss          
    243721523743   #  2.5% l3 bound\nl3 miss              175148141314   #  6.4% dram bound\nstore_stalls         19524283725    #  0.7% store bound\n<\/code><\/pre>\n\n\n\n<p>Process overview shows the largest portion of time spent in python with other processes mixed<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>21759 processes\n\t213 python                1725.33    47.47\n\t1960 cc1                    458.14    25.39\n\t151 lto1                   298.61     6.30\n\t1816 as                      59.87     5.13\n\t 68 clinfo                  16.42     5.98\n\t799 ld                       3.61     1.32\n\t 56 _testembed               2.27     0.15\n\t 38 vulkaninfo               1.14     1.14\n\t  2 gzip                     0.92     0.03\n\t5053 bash                     0.67     2.50\n\t  4 vulkani:disk$0           0.12     0.12\n\t  6 php                      0.11     0.16\n\t 14 make                     0.10     0.07\n\t  6 glxinfo:gdrv0            0.10     0.05\n\t  6 glxinfo:gl0              0.10     0.05\n\t  2 llvmpipe-0               0.06     0.06\n\t  2 llvmpipe-1               0.06     0.06\n\t  2 llvmpipe-10              0.06     0.06\n\t  2 llvmpipe-11              0.06     0.06\n\t  2 llvmpipe-12              0.06     0.06\n\t  2 llvmpipe-13              0.06     0.06\n\t  2 llvmpipe-14              0.06     0.06\n\t  2 llvmpipe-15              0.06     0.06\n\t  2 llvmpipe-2               0.06     0.06\n\t  2 llvmpipe-3               0.06     0.06\n\t  2 llvmpipe-4               0.06     0.06\n\t  2 llvmpipe-5               0.06     0.06\n\t  2 llvmpipe-6               0.06     0.06\n\t  2 llvmpipe-7               0.06     0.06\n\t  2 llvmpipe-8               0.06     0.06\n\t  2 llvmpipe-9               0.06     0.06\n\t  2 glxinfo                  0.06     0.03\n\t  2 glxinfo:cs0              0.06     0.03\n\t  2 glxinfo:disk$0           0.06     0.03\n\t  2 glxinfo:sh0              0.06     0.03\n\t  2 glxinfo:shlo0            0.06     0.03\n\t  6 
clang                    0.05     0.07\n\t  2 tar                      0.04     0.46\n\t  3 rocminfo                 0.03     0.00\n\t  6 print                    0.02     0.00\n\t  3 ar                       0.01     0.42\n\t 12 pkg-config               0.01     0.00\n\t3222 rm                       0.00     0.25\n\t151 lto-wrapper              0.00     0.13\n\t2362 gcc                      0.00     0.07\n\t 46 find                     0.00     0.06\n\t  1 lspci                    0.00     0.03\n\t2141 cat                      0.00     0.00\n\t1465 sed                      0.00     0.00\n\t797 collect2                 0.00     0.00\n\t353 grep                     0.00     0.00\n\t258 mv                       0.00     0.00\n\t223 sh                       0.00     0.00\n\t 84 conftest                 0.00     0.00\n\t 68 basename                 0.00     0.00\n\t 32 configure                0.00     0.00\n\t 32 uname                    0.00     0.00\n\t 21 bunzip2                  0.00     0.00\n\t 19 dirname                  0.00     0.00\n\t 18 expr                     0.00     0.00\n\t 15 ldd                      0.00     0.00\n\t 14 awk                      0.00     0.00\n\t 14 mkdir                    0.00     0.00\n\t 12 gsettings                0.00     0.00\n\t 12 tr                       0.00     0.00\n\t  9 ld-linux-x86-64          0.00     0.00\n\t  8 cc                       0.00     0.00\n\t  8 ln                       0.00     0.00\n\t  8 stat                     0.00     0.00\n\t  8 systemd-detect-          0.00     0.00\n\t  7 sort                     0.00     0.00\n\t  6 hostname                 0.00     0.00\n\t  6 ld-linux.so.2            0.00     0.00\n\t  6 llvm-link                0.00     0.00\n\t  6 rmdir                    0.00     0.00\n\t  5 mktemp                   0.00     0.00\n\t  5 phoronix-test-s          0.00     0.00\n\t  4 arch                     0.00     0.00\n\t  4 chmod                    0.00     0.00\n\t  4 diff              
       0.00     0.00\n\t  4 install                  0.00     0.00\n\t  4 ls                       0.00     0.00\n\t  3 gmain                    0.00     0.00\n\t  3 touch                    0.00     0.00\n\t  2 build-python             0.00     0.00\n\t  2 lscpu                    0.00     0.00\n\t  2 which                    0.00     0.00\n\t  2 xset                     0.00     0.00\n\t  1 date                     0.00     0.00\n\t  1 dconf worker             0.00     0.00\n\t  1 dmesg                    0.00     0.00\n\t  1 dmidecode                0.00     0.00\n\t  1 ifconfig                 0.00     0.00\n\t  1 ip                       0.00     0.00\n\t  1 lsmod                    0.00     0.00\n\t  1 ps                       0.00     0.00\n\t  1 qdbus                    0.00     0.00\n\t  1 readlink                 0.00     0.00\n\t  1 realpath                 0.00     0.00\n\t  1 stty                     0.00     0.00\n\t  1 systemctl                0.00     0.00\n\t  1 template.sh              0.00     0.00\n\t  1 true                     0.00     0.00\n\t  1 wc                       0.00     0.00\n\t  1 xrandr                   0.00     0.00\n0 processes running\n109 maximum processes\n<\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>This workload builds the python reference implementation. There are two builds &#8211; a quick default build followed by a longer running release build with PGO and LTO that takes much longer. 
These builds appear to have just a few processes <span class=\"excerpt-dots\">&hellip;<\/span> <a class=\"more-link\" href=\"https:\/\/mvermeulen.org\/perf\/workloads\/phoronix\/build-python\/\"><span class=\"more-msg\">Continue reading &rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"parent":58,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-1499","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/1499","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/comments?post=1499"}],"version-history":[{"count":3,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/1499\/revisions"}],"predecessor-version":[{"id":1638,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/1499\/revisions\/1638"}],"up":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/58"}],"wp:attachment":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/media?parent=1499"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}