{"id":381,"date":"2024-01-09T12:48:25","date_gmt":"2024-01-09T12:48:25","guid":{"rendered":"https:\/\/mvermeulen.org\/perf\/?page_id=381"},"modified":"2024-01-10T03:03:52","modified_gmt":"2024-01-10T03:03:52","slug":"y-cruncher","status":"publish","type":"page","link":"https:\/\/mvermeulen.org\/perf\/workloads\/phoronix\/y-cruncher\/","title":{"rendered":"y-cruncher"},"content":{"rendered":"\n<p>y-cruncher is a program that calculates many digits of Pi. My runs seem to hang. Initially it hung on both AMD and Intel, but at different places: the Intel run hung at 1B digits and the AMD run at 5B digits. The y-cruncher page also <a href=\"http:\/\/www.numberworld.org\/y-cruncher\/news\/2023.html#2023_12_27\">reports a hang<\/a> that was subsequently fixed, but its description of the symptoms differs from what I see. In any case, I only have AMD metrics here.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-33.png\" alt=\"\" class=\"wp-image-390\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-33.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-33-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-33-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>AMD metrics show this is very much a backend-bound application, with both CPU and memory contributing and almost no frontend stalls. 
Floating-point instructions are a small share of the mix (about 23 per 1000 instructions), and the L2 access rate is moderate (about 106 accesses per 1000 instructions) with an 11.75% miss rate.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed              679.121\non_cpu               0.920          # 14.72 \/ 16 cores\nutime                9967.348\nstime                31.191\nnvcsw                360008         # 42.57%\nnivcsw               485694         # 57.43%\ninblock              1280448        # 1885.45\/sec\nonblock              29298368       # 43141.61\/sec\ncpu-clock            10001201726708 # 10001.202 seconds\ntask-clock           10001591896343 # 10001.592 seconds\npage faults          184372         # 18.434\/sec\ncontext switches     848898         # 84.876\/sec\ncpu migrations       171430         # 17.140\/sec\nmajor page faults    7089           # 0.709\/sec\nminor page faults    177282         # 17.725\/sec\nalignment faults     0              # 0.000\/sec\nemulation faults     0              # 0.000\/sec\nbranches             760807352038   # 22.662 branches per 1000 inst\nbranch misses        4713539070     # 0.62% branch miss\nconditional          587069670473   # 17.487 conditional branches per 1000 inst\nindirect             9994645887     # 0.298 indirect branches per 1000 inst\ncpu-cycles           40453340253892 # 3.71 GHz\ninstructions         33575761816393 # 0.83 IPC\nslots                80902865789910 #\nretiring             11902956182674 # 14.7% (17.2%)\n-- ucode             4585744763     #     0.0%\n-- fastpath          11898370437911 #    14.7%\nfrontend             2041876738557  #  2.5% ( 3.0%)\n-- latency           1659454106610  #     2.1%\n-- bandwidth         382422631947   #     0.5%\nbackend              54990294148872 # 68.0% (79.6%)\n-- cpu               24297716954409 #    30.0%\n-- memory            30692577194463 #    37.9%\nspeculation          115427066513   #  0.1% ( 0.2%)\n-- branch mispredict 72716495043    #     0.1%\n-- pipeline restart  42710571470    #     0.1%\nsmt-contention       11852224998687 
# 14.6% ( 0.0%)\ncpu-cycles           40459239031625 # 3.71 GHz\ninstructions         33573330939113 # 0.83 IPC\ninstructions         11192891062647 # 106.431 l2 access per 1000 inst\nl2 hit from l1       897172713903   # 11.75% l2 miss\nl2 miss from l1      64301791449    #\nl2 hit from l2 pf    218454217312   #\nl3 hit from l2 pf    48863370442    #\nl3 miss from l2 pf   26783988521    #\ninstructions         11187826293958 # 22.845 float per 1000 inst\nfloat 512            60             # 0.000 AVX-512 per 1000 inst\nfloat 256            96781510168    # 8.651 AVX-256 per 1000 inst\nfloat 128            158800636747   # 14.194 AVX-128 per 1000 inst\nfloat MMX            0              # 0.000 MMX per 1000 inst\nfloat scalar         0              # 0.000 scalar per 1000 inst\n<\/code><\/pre>\n\n\n\n<p>Process statistics don&#8217;t seem to get much system or user time.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>548 processes\n\t 64 clinfo                  10.88     3.46\n\t 38 vulkaninfo               0.94     0.96\n\t  6 php                      0.16     0.11\n\t  4 vulkani:disk$0           0.10     0.11\n\t  6 glxinfo:gdrv0            0.08     0.09\n\t  2 llvmpipe-0               0.05     0.06\n\t  2 llvmpipe-1               0.05     0.06\n\t  2 llvmpipe-10              0.05     0.06\n\t  2 llvmpipe-11              0.05     0.06\n\t  2 llvmpipe-12              0.05     0.06\n\t  2 llvmpipe-13              0.05     0.06\n\t  2 llvmpipe-14              0.05     0.06\n\t  2 llvmpipe-15              0.05     0.06\n\t  2 llvmpipe-2               0.05     0.06\n\t  2 llvmpipe-3               0.05     0.06\n\t  2 llvmpipe-4               0.05     0.06\n\t  2 llvmpipe-5               0.05     0.06\n\t  2 llvmpipe-6               0.05     0.06\n\t  2 llvmpipe-7               0.05     0.06\n\t  2 llvmpipe-8               0.05     0.06\n\t  2 llvmpipe-9               0.05     0.06\n\t  2 glxinfo                  0.05     0.04\n\t  2 glxinfo:cs0              0.05     
0.04\n\t  2 glxinfo:disk$0           0.05     0.04\n\t  2 glxinfo:shlo0            0.05     0.04\n\t  2 glxinfo:sh0              0.04     0.03\n\t  6 clang                    0.03     0.04\n\t  1 lspci                    0.01     0.02\n\t189 22-ZN4 ~ Kizuna          0.00     0.00\n\t 91 sh                       0.00     0.00\n\t 12 gcc                      0.00     0.00\n\t  9 stty                     0.00     0.00\n\t  8 gsettings                0.00     0.00\n\t  8 stat                     0.00     0.00\n\t  8 systemd-detect-          0.00     0.00\n\t  6 llvm-link                0.00     0.00\n\t  6 y-cruncher               0.00     0.00\n\t  5 gmain                    0.00     0.00\n\t  5 phoronix-test-s          0.00     0.00\n\t  4 sed                      0.00     0.00\n\t  3 dconf worker             0.00     0.00\n\t  2 lscpu                    0.00     0.00\n\t  2 uname                    0.00     0.00\n\t  2 which                    0.00     0.00\n\t  2 xset                     0.00     0.00\n\t  1 cc                       0.00     0.00\n\t  1 date                     0.00     0.00\n\t  1 dirname                  0.00     0.00\n\t  1 dmesg                    0.00     0.00\n\t  1 dmidecode                0.00     0.00\n\t  1 grep                     0.00     0.00\n\t  1 ifconfig                 0.00     0.00\n\t  1 ip                       0.00     0.00\n\t  1 lsmod                    0.00     0.00\n\t  1 mktemp                   0.00     0.00\n\t  1 ps                       0.00     0.00\n\t  1 qdbus                    0.00     0.00\n\t  1 readlink                 0.00     0.00\n\t  1 realpath                 0.00     0.00\n\t  1 sort                     0.00     0.00\n\t  1 systemctl                0.00     0.00\n\t  1 template.sh              0.00     0.00\n\t  1 wc                       0.00     0.00\n\t  1 xrandr                   0.00     0.00\n0 processes running\n76 maximum processes<\/code><\/pre>\n\n\n\n<p>The core computation starts many 
threads. Interestingly, the program runs a binary named &#8220;ZN4&#8221;, and the archive seems to ship with a static set of binaries. This suggests the binaries may already be tuned for specific cores such as Zen4.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>      430509) y-cruncher       cpu=13 start=5.87  finish=224.89\n        430510) y-cruncher       cpu=15 start=5.88  finish=224.89\n          430512) sh               cpu=11 start=5.88  finish=224.89\n            430513) 22-ZN4 ~ Kizuna  cpu=0 start=5.88  finish=224.89\n              430514) 22-ZN4 ~ Kizuna  cpu=0 start=5.88  finish=224.89\n              430515) 22-ZN4 ~ Kizuna  cpu=0 start=5.88  finish=224.89\n              430516) 22-ZN4 ~ Kizuna  cpu=0 start=5.88  finish=224.89\n              430517) 22-ZN4 ~ Kizuna  cpu=0 start=5.88  finish=224.88\n              430518) 22-ZN4 ~ Kizuna  cpu=0 start=5.88  finish=224.88\n              430519) 22-ZN4 ~ Kizuna  cpu=0 start=5.88  finish=224.88\n              430520) 22-ZN4 ~ Kizuna  cpu=0 start=5.88  finish=224.88\n              430521) 22-ZN4 ~ Kizuna  cpu=0 start=5.88  finish=224.88\n              430522) 22-ZN4 ~ Kizuna  cpu=0 start=5.88  finish=224.88\n              430523) 22-ZN4 ~ Kizuna  cpu=0 start=5.88  finish=224.87\n              430524) 22-ZN4 ~ Kizuna  cpu=0 start=5.88  finish=224.87\n              430525) 22-ZN4 ~ Kizuna  cpu=0 start=5.88  finish=224.87\n              430526) 22-ZN4 ~ Kizuna  cpu=0 start=5.88  finish=224.87\n              430527) 22-ZN4 ~ Kizuna  cpu=0 start=5.88  finish=224.87\n              430528) 22-ZN4 ~ Kizuna  cpu=0 start=5.88  finish=224.87\n              430529) 22-ZN4 ~ Kizuna  cpu=0 start=5.88  finish=224.86\n              430530) 22-ZN4 ~ Kizuna  cpu=0 start=5.88  finish=224.86\n              430531) 22-ZN4 ~ Kizuna  cpu=0 start=5.88  finish=224.86\n              430532) 22-ZN4 ~ Kizuna  cpu=0 start=5.88  finish=224.86\n              430533) 22-ZN4 ~ Kizuna  cpu=0 start=5.88  finish=224.85\n              
430534) 22-ZN4 ~ Kizuna  cpu=0 start=5.88  finish=224.85\n              430535) 22-ZN4 ~ Kizuna  cpu=0 start=5.88  finish=224.85\n              430536) 22-ZN4 ~ Kizuna  cpu=0 start=5.88  finish=224.85\n              430537) 22-ZN4 ~ Kizuna  cpu=0 start=5.88  finish=224.85\n              430538) 22-ZN4 ~ Kizuna  cpu=0 start=5.88  finish=224.84\n              430539) 22-ZN4 ~ Kizuna  cpu=0 start=5.88  finish=224.84\n              430540) 22-ZN4 ~ Kizuna  cpu=0 start=5.88  finish=224.84\n              430541) 22-ZN4 ~ Kizuna  cpu=0 start=5.88  finish=224.84\n              430542) 22-ZN4 ~ Kizuna  cpu=0 start=5.88  finish=224.84\n              430543) 22-ZN4 ~ Kizuna  cpu=0 start=5.88  finish=224.83\n              430544) 22-ZN4 ~ Kizuna  cpu=0 start=5.88  finish=224.83\n              430545) 22-ZN4 ~ Kizuna  cpu=0 start=6.06  finish=223.51\n              430546) 22-ZN4 ~ Kizuna  cpu=0 start=6.06  finish=223.51\n              430547) 22-ZN4 ~ Kizuna  cpu=0 start=6.06  finish=223.51\n              430548) 22-ZN4 ~ Kizuna  cpu=0 start=6.06  finish=223.51\n              430549) 22-ZN4 ~ Kizuna  cpu=0 start=6.06  finish=223.51\n              430550) 22-ZN4 ~ Kizuna  cpu=0 start=6.06  finish=223.51\n              430551) 22-ZN4 ~ Kizuna  cpu=0 start=6.06  finish=223.51\n              430552) 22-ZN4 ~ Kizuna  cpu=0 start=6.06  finish=223.51\n              430553) 22-ZN4 ~ Kizuna  cpu=0 start=6.06  finish=223.51\n              430554) 22-ZN4 ~ Kizuna  cpu=0 start=6.06  finish=223.51\n              430555) 22-ZN4 ~ Kizuna  cpu=0 start=6.06  finish=223.51\n              430556) 22-ZN4 ~ Kizuna  cpu=0 start=6.06  finish=223.50\n              430557) 22-ZN4 ~ Kizuna  cpu=0 start=6.06  finish=223.50\n              430558) 22-ZN4 ~ Kizuna  cpu=0 start=6.06  finish=223.50\n              430559) 22-ZN4 ~ Kizuna  cpu=0 start=6.06  finish=223.50\n              430560) 22-ZN4 ~ Kizuna  cpu=0 start=6.06  finish=223.50\n              430561) 22-ZN4 ~ Kizuna  cpu=0 start=6.06  
finish=223.50\n              430562) 22-ZN4 ~ Kizuna  cpu=0 start=6.06  finish=223.50\n              430563) 22-ZN4 ~ Kizuna  cpu=0 start=6.06  finish=223.50\n              430564) 22-ZN4 ~ Kizuna  cpu=0 start=6.06  finish=223.50\n              430565) 22-ZN4 ~ Kizuna  cpu=0 start=6.06  finish=223.50\n              430566) 22-ZN4 ~ Kizuna  cpu=0 start=6.06  finish=223.50\n              430567) 22-ZN4 ~ Kizuna  cpu=0 start=6.06  finish=223.50\n              430568) 22-ZN4 ~ Kizuna  cpu=0 start=6.06  finish=223.50\n              430569) 22-ZN4 ~ Kizuna  cpu=0 start=6.06  finish=223.50\n              430570) 22-ZN4 ~ Kizuna  cpu=0 start=6.06  finish=223.50\n              430571) 22-ZN4 ~ Kizuna  cpu=0 start=6.06  finish=223.50\n              430572) 22-ZN4 ~ Kizuna  cpu=0 start=6.06  finish=223.50\n              430573) 22-ZN4 ~ Kizuna  cpu=0 start=6.06  finish=223.50\n              430574) 22-ZN4 ~ Kizuna  cpu=0 start=6.06  finish=223.50\n              430575) 22-ZN4 ~ Kizuna  cpu=0 start=6.06  finish=223.50\n        430511) sed              cpu=6 start=5.88  finish=224.89\n<\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>y-cruncher is a program that calculates many digits of Pi. My runs seem to hang. Initially it hung on both AMD and Intel, but at different places. The Intel run hung at 1B digits and the AMD run at 5B digits. 
<span class=\"excerpt-dots\">&hellip;<\/span> <a class=\"more-link\" href=\"https:\/\/mvermeulen.org\/perf\/workloads\/phoronix\/y-cruncher\/\"><span class=\"more-msg\">Continue reading &rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"parent":58,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-381","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/381","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/comments?post=381"}],"version-history":[{"count":2,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/381\/revisions"}],"predecessor-version":[{"id":391,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/381\/revisions\/391"}],"up":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/58"}],"wp:attachment":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/media?parent=381"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}