{"id":1916,"date":"2024-03-02T16:04:44","date_gmt":"2024-03-02T16:04:44","guid":{"rendered":"https:\/\/mvermeulen.org\/perf\/?page_id=1916"},"modified":"2024-03-03T18:18:00","modified_gmt":"2024-03-03T18:18:00","slug":"cpuminer-opt","status":"publish","type":"page","link":"https:\/\/mvermeulen.org\/perf\/workloads\/phoronix\/cpuminer-opt\/","title":{"rendered":"cpuminer-opt"},"content":{"rendered":"\n<p>A test of mining cryptocurrency algorithms. There are 11 different subtests, which appear to run on all cores.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/03\/systemtime-9.png\" alt=\"\" class=\"wp-image-1932\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/03\/systemtime-9.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/03\/systemtime-9-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/03\/systemtime-9-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>The topdown profile varies by workload but generally shows a high retirement rate.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/03\/amdtopdown-10.png\" alt=\"\" class=\"wp-image-1935\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/03\/amdtopdown-10.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/03\/amdtopdown-10-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/03\/amdtopdown-10-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>AMD metrics confirm a high retirement rate and low backend stalls. 
These are heavy users of floating point and have low l2 access.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed              1186.942\non_cpu               0.829          # 13.26 \/ 16 cores\nutime                15726.834\nstime                13.029\nnvcsw                256859         # 60.07%\nnivcsw               170746         # 39.93%\ninblock              696            # 0.59\/sec\nonblock              16936          # 14.27\/sec\ncpu-clock            15740358273528 # 15740.358 seconds\ntask-clock           15740514667117 # 15740.515 seconds\npage faults          511031         # 32.466\/sec\ncontext switches     433255         # 27.525\/sec\ncpu migrations       926            # 0.059\/sec\nmajor page faults    3              # 0.000\/sec\nminor page faults    511028         # 32.466\/sec\nalignment faults     0              # 0.000\/sec\nemulation faults     0              # 0.000\/sec\nbranches             2090369343773  # 14.338 branches per 1000 inst\nbranch misses        3813560343     # 0.18% branch miss\nconditional          1647812055510  # 11.302 conditional branches per 1000 inst\nindirect             43048271602    # 0.295 indirect branches per 1000 inst\ncpu-cycles           67447012867666 # 3.36 GHz\ninstructions         153480915748738 # 2.28 IPC\nslots                134919266203302 #\nretiring             53357281273014 # 39.5% (67.7%) high\n-- ucode             6047913102     #     0.0%\n-- fastpath          53351233359912 #    39.5%\nfrontend             11600223754461 #  8.6% (14.7%)\n-- latency           6449312850150  #     4.8%\n-- bandwidth         5150910904311  #     3.8%\nbackend              13668219129691 # 10.1% (17.4%) low\n-- cpu               6385106468801  #     4.7%\n-- memory            7283112660890  #     5.4%\nspeculation          137310283792   #  0.1% ( 0.2%) low\n-- branch mispredict 103973737424   #     0.1%\n-- pipeline restart  33336546368    #     0.0%\nsmt-contention       56155929162151 # 41.6% ( 
0.0%)\ncpu-cycles           86598124912935 # 3.39 GHz\ninstructions         194339496558672 # 2.24 IPC\ninstructions         64776067771495 # 9.871 l2 access per 1000 inst\nl2 hit from l1       502652839432   # 1.96% l2 miss\nl2 miss from l1      6399129863     #\nl2 hit from l2 pf    130677289731   #\nl3 hit from l2 pf    6090078866     #\nl3 miss from l2 pf   15348144       #\ninstructions         64773315824514 # 412.281 float per 1000 inst\nfloat 512            105            # 0.000 AVX-512 per 1000 inst\nfloat 256            396            # 0.000 AVX-256 per 1000 inst\nfloat 128            26704775475676 # 412.281 AVX-128 per 1000 inst\nfloat MMX            0              # 0.000 MMX per 1000 inst\nfloat scalar         0              # 0.000 scalar per 1000 inst\ninstructions         150080388093265 #\nopcache              12215943573405 # 81.396 opcache per 1000 inst\nopcache miss         2349395900729  # 19.2% opcache miss rate\nl1 dTLB miss         9781304650     # 0.065 L1 dTLB per 1000 inst\nl2 dTLB miss         59821905       # 0.000 L2 dTLB per 1000 inst\ninstructions         145591861587551 #\nicache               2424868764773  # 16.655 icache per 1000 inst\nicache miss          349158176071   # 14.4% icache miss rate\nl1 iTLB miss         9568505376     # 0.066 L1 iTLB per 1000 inst\nl2 iTLB miss         0              # 0.000 L2 iTLB per 1000 inst\ntlb flush            39405          # 0.000 TLB flush per 1000 inst\n<\/code><\/pre>\n\n\n\n<p>Intel metrics confirm the high retirement rate and most of the load stalls are in L1.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed              1727.587\non_cpu               0.828          # 13.25 \/ 16 cores\nutime                22841.922\nstime                41.821\nnvcsw                6141143        # 96.75%\nnivcsw               206599         # 3.25%\ninblock              21224          # 12.29\/sec\nonblock              5984           # 3.46\/sec\ncpu-clock            22880383776525 # 
22880.384 seconds\ntask-clock           22881586483761 # 22881.586 seconds\npage faults          659851         # 28.838\/sec\ncontext switches     6356052        # 277.780\/sec\ncpu migrations       1202           # 0.053\/sec\nmajor page faults    108            # 0.005\/sec\nminor page faults    659743         # 28.833\/sec\nalignment faults     0              # 0.000\/sec\nemulation faults     0              # 0.000\/sec\nbranches             5064640396537  # 27.581 branches per 1000 inst\nbranch misses        8449691387     # 0.17% branch miss\nconditional          5064640439545  # 27.581 conditional branches per 1000 inst\nindirect             2271735861178  # 12.371 indirect branches per 1000 inst\nslots                87294144025490 #\nretiring             61295422610381 # 70.2% (70.2%) high\n-- ucode             741043489965   #     0.8%\n-- fastpath          60554379120416 #    69.4%\nfrontend             22874749087179 # 26.2% (26.2%)\n-- latency           16506085956484 #    18.9%\n-- bandwidth         6368663130695  #     7.3%\nbackend              2902802335508  #  3.3% ( 3.3%) low\n-- cpu               2302827560561  #     2.6%\n-- memory            599974774947   #     0.7%\nspeculation          201710698859   #  0.2% ( 0.2%) low\n-- branch mispredict 175915295288   #     0.2%\n-- pipeline restart  25795403571    #     0.0%\nsmt-contention       0              #  0.0% ( 0.0%)\ncpu-cycles           26530424318137 # 1.40 GHz\ninstructions         55820306624790 # 2.10 IPC\nl2 access            240757767124   # 4.314 l2 access per 1000 inst\nl2 miss              14209076422    # 5.90% l2 miss\ncpu-cycles           26539736731262 #  8.5% memory latency\nload stalls          2218592855843  #  7.3% l1 bound\nl1 miss              272119502032   #  0.7% l2 bound\nl2 miss              93202531099    #  0.3% l3 bound\nl3 miss              476989596      #  0.0% dram bound\nstore_stalls         40924997470    #  0.2% store 
bound\n<\/code><\/pre>\n\n\n\n<p>Process overview shows time spent in the cpuminer driver application.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>1028 processes\n\t594 cpuminer             282960.95   136.17\n\t 68 clinfo                  17.53     4.65\n\t 38 vulkaninfo               1.14     1.33\n\t  6 php                      0.13     0.20\n\t  4 vulkani:disk$0           0.12     0.14\n\t  6 glxinfo:gdrv0            0.09     0.10\n\t  6 glxinfo:gl0              0.09     0.10\n\t  2 llvmpipe-0               0.06     0.07\n\t  2 llvmpipe-1               0.06     0.07\n\t  2 llvmpipe-10              0.06     0.07\n\t  2 llvmpipe-11              0.06     0.07\n\t  2 llvmpipe-12              0.06     0.07\n\t  2 llvmpipe-13              0.06     0.07\n\t  2 llvmpipe-14              0.06     0.07\n\t  2 llvmpipe-15              0.06     0.07\n\t  2 llvmpipe-2               0.06     0.07\n\t  2 llvmpipe-3               0.06     0.07\n\t  2 llvmpipe-4               0.06     0.07\n\t  2 llvmpipe-5               0.06     0.07\n\t  2 llvmpipe-6               0.06     0.07\n\t  2 llvmpipe-7               0.06     0.07\n\t  2 llvmpipe-8               0.06     0.07\n\t  2 llvmpipe-9               0.06     0.07\n\t  2 glxinfo                  0.05     0.04\n\t  2 glxinfo:cs0              0.05     0.04\n\t  2 glxinfo:disk$0           0.05     0.04\n\t  2 glxinfo:sh0              0.05     0.04\n\t  2 glxinfo:shlo0            0.05     0.04\n\t  6 clang                    0.04     0.08\n\t  1 lspci                    0.00     0.02\n\t  1 ps                       0.00     0.01\n\t102 sh                       0.00     0.00\n\t 34 grep                     0.00     0.00\n\t 33 cpuminer-opt             0.00     0.00\n\t 13 gcc                      0.00     0.00\n\t 10 gsettings                0.00     0.00\n\t  8 stat                     0.00     0.00\n\t  8 systemd-detect-          0.00     0.00\n\t  6 llvm-link                0.00     0.00\n\t  5 phoronix-test-s          0.00    
 0.00\n\t  4 gmain                    0.00     0.00\n\t  3 rocminfo                 0.00     0.00\n\t  2 cc                       0.00     0.00\n\t  2 dconf worker             0.00     0.00\n\t  2 lscpu                    0.00     0.00\n\t  2 uname                    0.00     0.00\n\t  2 which                    0.00     0.00\n\t  2 xset                     0.00     0.00\n\t  1 date                     0.00     0.00\n\t  1 dirname                  0.00     0.00\n\t  1 dmesg                    0.00     0.00\n\t  1 dmidecode                0.00     0.00\n\t  1 ifconfig                 0.00     0.00\n\t  1 ip                       0.00     0.00\n\t  1 lsmod                    0.00     0.00\n\t  1 mktemp                   0.00     0.00\n\t  1 qdbus                    0.00     0.00\n\t  1 readlink                 0.00     0.00\n\t  1 realpath                 0.00     0.00\n\t  1 sed                      0.00     0.00\n\t  1 sort                     0.00     0.00\n\t  1 stty                     0.00     0.00\n\t  1 systemctl                0.00     0.00\n\t  1 template.sh              0.00     0.00\n\t  1 wc                       0.00     0.00\n\t  1 xrandr                   0.00     0.00\n0 processes running\n47 maximum processes\n<\/code><\/pre>\n\n\n\n<p>An example of the computation block<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>      865661) cpuminer-opt     cpu=4 start=6.16  finish=36.12\n        865662) cpuminer         cpu=1 start=6.16  finish=36.12\n          865664) cpuminer         cpu=6 start=6.17  finish=36.12\n          865665) cpuminer         cpu=0 start=6.17  finish=36.12\n          865666) cpuminer         cpu=1 start=6.17  finish=36.12\n          865667) cpuminer         cpu=2 start=6.17  finish=36.12\n          865668) cpuminer         cpu=3 start=6.17  finish=36.12\n          865669) cpuminer         cpu=4 start=6.17  finish=36.12\n          865670) cpuminer         cpu=5 start=6.17  finish=36.12\n          865671) cpuminer         cpu=6 
start=6.17  finish=36.12\n          865672) cpuminer         cpu=7 start=6.17  finish=36.12\n          865673) cpuminer         cpu=8 start=6.17  finish=36.12\n          865674) cpuminer         cpu=9 start=6.18  finish=36.12\n          865675) cpuminer         cpu=10 start=6.18  finish=36.12\n          865676) cpuminer         cpu=11 start=6.18  finish=36.12\n          865677) cpuminer         cpu=12 start=6.18  finish=36.12\n          865678) cpuminer         cpu=13 start=6.18  finish=36.12\n          865679) cpuminer         cpu=14 start=6.18  finish=36.12\n          865680) cpuminer         cpu=15 start=6.18  finish=36.12\n        865663) grep             cpu=0 start=6.16  finish=36.12\n<\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>A test of mining cryptocurrency algorithms. There are 11 different subtests, which appear to run on all cores. The topdown profile varies by workload but generally shows a high retirement rate. AMD metrics confirm a high retirement rate and low <span class=\"excerpt-dots\">&hellip;<\/span> <a class=\"more-link\" href=\"https:\/\/mvermeulen.org\/perf\/workloads\/phoronix\/cpuminer-opt\/\"><span class=\"more-msg\">Continue reading 
&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"parent":58,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-1916","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/1916","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/comments?post=1916"}],"version-history":[{"count":2,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/1916\/revisions"}],"predecessor-version":[{"id":1936,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/1916\/revisions\/1936"}],"up":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/58"}],"wp:attachment":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/media?parent=1916"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}