{"id":1080,"date":"2024-01-29T11:26:14","date_gmt":"2024-01-29T11:26:14","guid":{"rendered":"https:\/\/mvermeulen.org\/perf\/?page_id=1080"},"modified":"2024-01-30T01:44:58","modified_gmt":"2024-01-30T01:44:58","slug":"pybench","status":"publish","type":"page","link":"https:\/\/mvermeulen.org\/perf\/workloads\/phoronix\/pybench\/","title":{"rendered":"pybench"},"content":{"rendered":"\n<p>PyBench is a benchmark suite for Python. The test is single-threaded and has a very high retirement rate and IPC. It is also short: each run finishes in roughly 16 seconds.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/systemtime-87.png\" alt=\"\" class=\"wp-image-1113\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/systemtime-87.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/systemtime-87-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/systemtime-87-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>The topdown profile shows a consistently high retirement rate.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-124.png\" alt=\"\" class=\"wp-image-1115\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-124.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-124-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-124-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>AMD metrics show very little floating point or L2 access. 
Both frontend and backend stalls are low.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed              63.702\non_cpu               0.049          # 0.78 \/ 16 cores\nutime                48.990\nstime                0.793\nnvcsw                1986           # 81.90%\nnivcsw               439            # 18.10%\ninblock              0              # 0.00\/sec\nonblock              12648          # 198.55\/sec\ncpu-clock            49805588772    # 49.806 seconds\ntask-clock           49808093050    # 49.808 seconds\npage faults          160691         # 3226.203\/sec\ncontext switches     2564           # 51.478\/sec\ncpu migrations       270            # 5.421\/sec\nmajor page faults    2              # 0.040\/sec\nminor page faults    160689         # 3226.162\/sec\nalignment faults     0              # 0.000\/sec\nemulation faults     0              # 0.000\/sec\nbranches             194912652988   # 186.412 branches per 1000 inst\nbranch misses        270411164      # 0.14% branch miss\nconditional          154479730117   # 147.743 conditional branches per 1000 inst\nindirect             18920311748    # 18.095 indirect branches per 1000 inst\ncpu-cycles           374557890559   # 0.22 GHz\ninstructions         1734702413427  # 4.63 IPC high\nslots                752332043412   #\nretiring             561257349214   # 74.6% (74.6%) high\n-- ucode             1483477360     #     0.2%\n-- fastpath          559773871854   #    74.4%\nfrontend             77108360889    # 10.2% (10.3%)\n-- latency           34005993816    #     4.5%\n-- bandwidth         43102367073    #     5.7%\nbackend              97631998057    # 13.0% (13.0%) low\n-- cpu               20007069354    #     2.7%\n-- memory            77624928703    #    10.3%\nspeculation          16192947256    #  2.2% ( 2.2%)\n-- branch mispredict 13204745243    #     1.8%\n-- pipeline restart  2988202013     #     0.4%\nsmt-contention       141103425      #  0.0% ( 0.0%)\ncpu-cycles           
225789349951   # 0.22 GHz\ninstructions         1039703315363  # 4.60 IPC high\ninstructions         348255417695   # 1.063 l2 access per 1000 inst\nl2 hit from l1       345411754      # 5.98% l2 miss\nl2 miss from l1      13213404       #\nl2 hit from l2 pf    15775251       #\nl3 hit from l2 pf    4367477        #\nl3 miss from l2 pf   4560360        #\ninstructions         347971571386   # 7.418 float per 1000 inst\nfloat 512            52             # 0.000 AVX-512 per 1000 inst\nfloat 256            580            # 0.000 AVX-256 per 1000 inst\nfloat 128            2581105775     # 7.418 AVX-128 per 1000 inst\nfloat MMX            0              # 0.000 MMX per 1000 inst\nfloat scalar         13             # 0.000 scalar per 1000 inst\n<\/code><\/pre>\n\n\n\n<p>Intel metrics<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed              63.672\non_cpu               0.049          # 0.78 \/ 16 cores\nutime                49.601\nstime                0.357\nnvcsw                1946           # 86.07%\nnivcsw               315            # 13.93%\ninblock              688            # 10.81\/sec\nonblock              1368           # 21.49\/sec\ncpu-clock            49970397576    # 49.970 seconds\ntask-clock           49972774658    # 49.973 seconds\npage faults          149794         # 2997.512\/sec\ncontext switches     2408           # 48.186\/sec\ncpu migrations       202            # 4.042\/sec\nmajor page faults    1              # 0.020\/sec\nminor page faults    149793         # 2997.492\/sec\nalignment faults     0              # 0.000\/sec\nemulation faults     0              # 0.000\/sec\nbranches             194468472774   # 185.970 branches per 1000 inst\nbranch misses        96408844       # 0.05% branch miss\nconditional          194468484966   # 185.970 conditional branches per 1000 inst\nindirect             18922715964    # 18.096 indirect branches per 1000 inst\nslots                1132078905056  #\nretiring             991835993221  
 # 87.6% (87.6%) high\n-- ucode             47158698511    #     4.2%\n-- fastpath          944677294710   #    83.4%\nfrontend             32564331849    #  2.9% ( 2.9%) low\n-- latency           7632247901     #     0.7%\n-- bandwidth         24932083948    #     2.2%\nbackend              61583929109    #  5.4% ( 5.4%) low\n-- cpu               45713589376    #     4.0%\n-- memory            15870339733    #     1.4%\nspeculation          31665658842    #  2.8% ( 2.8%)\n-- branch mispredict 9823086021     #     0.9%\n-- pipeline restart  21842572821    #     1.9%\nsmt-contention       0              #  0.0% ( 0.0%)\ncpu-cycles           188457635434   # 0.18 GHz\ninstructions         1043848855562  # 5.54 IPC high\nl2 access            693102729      # 0.664 l2 access per 1000 inst\nl2 miss              98568390       # 14.22% l2 miss\n<\/code><\/pre>\n\n\n\n<p>Process overview shows standard test overhead and four invocations of python<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>361 processes\n\t  4 python3                 48.35     0.02\n\t 68 clinfo                  18.20     4.66\n\t 38 vulkaninfo               0.95     1.34\n\t  6 glxinfo:gdrv0            0.12     0.07\n\t  6 glxinfo:gl0              0.12     0.07\n\t  4 vulkani:disk$0           0.10     0.15\n\t  6 php                      0.07     0.07\n\t  2 glxinfo                  0.06     0.03\n\t  2 glxinfo:cs0              0.06     0.03\n\t  2 glxinfo:disk$0           0.06     0.03\n\t  2 glxinfo:sh0              0.06     0.03\n\t  2 glxinfo:shlo0            0.06     0.03\n\t  2 llvmpipe-0               0.05     0.08\n\t  2 llvmpipe-1               0.05     0.08\n\t  2 llvmpipe-10              0.05     0.08\n\t  2 llvmpipe-11              0.05     0.08\n\t  2 llvmpipe-12              0.05     0.08\n\t  2 llvmpipe-13              0.05     0.08\n\t  2 llvmpipe-14              0.05     0.08\n\t  2 llvmpipe-15              0.05     0.08\n\t  2 llvmpipe-2               0.05     0.08\n\t  2 llvmpipe-3  
             0.05     0.08\n\t  2 llvmpipe-4               0.05     0.08\n\t  2 llvmpipe-5               0.05     0.08\n\t  2 llvmpipe-6               0.05     0.08\n\t  2 llvmpipe-7               0.05     0.08\n\t  2 llvmpipe-8               0.05     0.08\n\t  2 llvmpipe-9               0.05     0.08\n\t  6 clang                    0.03     0.09\n\t  3 rocminfo                 0.03     0.00\n\t  1 lspci                    0.00     0.01\n\t  1 ps                       0.00     0.01\n\t 83 sh                       0.00     0.00\n\t 12 gcc                      0.00     0.00\n\t  8 gsettings                0.00     0.00\n\t  8 stat                     0.00     0.00\n\t  8 systemd-detect-          0.00     0.00\n\t  6 llvm-link                0.00     0.00\n\t  5 gmain                    0.00     0.00\n\t  5 phoronix-test-s          0.00     0.00\n\t  5 uname                    0.00     0.00\n\t  3 dconf worker             0.00     0.00\n\t  3 file                     0.00     0.00\n\t  3 pybench                  0.00     0.00\n\t  2 lscpu                    0.00     0.00\n\t  2 which                    0.00     0.00\n\t  2 xset                     0.00     0.00\n\t  1 cc                       0.00     0.00\n\t  1 date                     0.00     0.00\n\t  1 dirname                  0.00     0.00\n\t  1 dmesg                    0.00     0.00\n\t  1 dmidecode                0.00     0.00\n\t  1 grep                     0.00     0.00\n\t  1 ifconfig                 0.00     0.00\n\t  1 ip                       0.00     0.00\n\t  1 lsmod                    0.00     0.00\n\t  1 mktemp                   0.00     0.00\n\t  1 python                   0.00     0.00\n\t  1 qdbus                    0.00     0.00\n\t  1 readlink                 0.00     0.00\n\t  1 realpath                 0.00     0.00\n\t  1 sed                      0.00     0.00\n\t  1 sort                     0.00     0.00\n\t  1 stty                     0.00     0.00\n\t  1 systemctl                0.00     
0.00\n\t  1 template.sh              0.00     0.00\n\t  1 wc                       0.00     0.00\n\t  1 xrandr                   0.00     0.00\n0 processes running\n47 maximum processes\n<\/code><\/pre>\n\n\n\n<p>Computation section<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>      1622774) pybench          cpu=7 start=5.46  finish=21.52\n        1622775) python3          cpu=2 start=5.46  finish=21.52\n          1622776) file             cpu=11 start=5.49  finish=5.49 \n          1622777) uname            cpu=6 start=5.49  finish=5.49 \n      1622780) pybench          cpu=5 start=25.52 finish=41.88\n        1622781) python3          cpu=6 start=25.52 finish=41.88\n          1622782) file             cpu=15 start=25.55 finish=25.55\n          1622783) uname            cpu=15 start=25.55 finish=25.55\n      1622787) pybench          cpu=5 start=45.89 finish=61.89\n        1622788) python3          cpu=6 start=45.89 finish=61.88\n          1622789) file             cpu=15 start=45.92 finish=45.92\n          1622790) uname            cpu=15 start=45.92 finish=45.92\n<\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>PyBench is a benchmark suite for Python. The test is single-threaded and has a very high retirement rate and IPC. It is also short: each run finishes in roughly 16 seconds. The topdown profile shows a consistently high retirement rate. 
AMD metrics show very little floating point <span class=\"excerpt-dots\">&hellip;<\/span> <a class=\"more-link\" href=\"https:\/\/mvermeulen.org\/perf\/workloads\/phoronix\/pybench\/\"><span class=\"more-msg\">Continue reading &rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"parent":58,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-1080","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/1080","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/comments?post=1080"}],"version-history":[{"count":2,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/1080\/revisions"}],"predecessor-version":[{"id":1116,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/1080\/revisions\/1116"}],"up":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/58"}],"wp:attachment":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/media?parent=1080"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}