{"id":657,"date":"2024-01-17T11:59:11","date_gmt":"2024-01-17T11:59:11","guid":{"rendered":"https:\/\/mvermeulen.org\/perf\/?page_id=657"},"modified":"2024-01-17T11:59:12","modified_gmt":"2024-01-17T11:59:12","slug":"crafty","status":"publish","type":"page","link":"https:\/\/mvermeulen.org\/perf\/workloads\/phoronix\/crafty\/","title":{"rendered":"crafty"},"content":{"rendered":"\n<p>crafty is a quick-running, single-threaded chess benchmark.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/systemtime-31.png\" alt=\"\" class=\"wp-image-658\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/systemtime-31.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/systemtime-31-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/systemtime-31-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>Topdown shows very high branch misprediction (12.8% of slots on AMD, 21.9% on Intel) and low backend stalls (14.3% and 8.7%, respectively).<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-68.png\" alt=\"\" class=\"wp-image-659\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-68.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-68-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-68-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>AMD metrics<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed              85.026\non_cpu               0.049          # 0.79 \/ 16 cores\nutime                
66.219\nstime                1.031\nnvcsw                1987           # 80.06%\nnivcsw               495            # 19.94%\ninblock              0              # 0.00\/sec\nonblock              13520          # 159.01\/sec\ncpu-clock            67275625058    # 67.276 seconds\ntask-clock           67278460780    # 67.278 seconds\npage faults          254497         # 3782.741\/sec\ncontext switches     2732           # 40.607\/sec\ncpu migrations       277            # 4.117\/sec\nmajor page faults    2              # 0.030\/sec\nminor page faults    254495         # 3782.711\/sec\nalignment faults     0              # 0.000\/sec\nemulation faults     0              # 0.000\/sec\nbranches             103658642395   # 126.323 branches per 1000 inst\nbranch misses        3344279538     # 3.23% branch miss\nconditional          81346335549    # 99.132 conditional branches per 1000 inst\nindirect             699836262      # 0.853 indirect branches per 1000 inst\ncpu-cycles           227536355149   # 0.22 GHz\ninstructions         612422018555   # 2.69 IPC\nslots                457299337278   #\nretiring             207822186201   # 45.4% (45.4%)\n-- ucode             20268546       #     0.0%\n-- fastpath          207801917655   #    45.4%\nfrontend             124436967971   # 27.2% (27.2%)\n-- latency           79829054820    #    17.5%\n-- bandwidth         44607913151    #     9.8%\nbackend              65528959985    # 14.3% (14.3%)\n-- cpu               12232245149    #     2.7%\n-- memory            53296714836    #    11.7%\nspeculation          59488082175    # 13.0% (13.0%)\n-- branch mispredict 58719071362    #    12.8%\n-- pipeline restart  769010813      #     0.2%\nsmt-contention       22893938       #  0.0% ( 0.0%)\ncpu-cycles           227110461275   # 0.22 GHz\ninstructions         612833388726   # 2.70 IPC\ninstructions         204805663491   # 10.496 l2 access per 1000 inst\nl2 hit from l1       1989973299     # 4.61% l2 miss\nl2 miss from l1    
  56774383       #\nl2 hit from l2 pf    117293988      #\nl3 hit from l2 pf    27300597       #\nl3 miss from l2 pf   15007672       #\ninstructions         204922878110   # 17.716 float per 1000 inst\nfloat 512            50             # 0.000 AVX-512 per 1000 inst\nfloat 256            618            # 0.000 AVX-256 per 1000 inst\nfloat 128            3630465889     # 17.716 AVX-128 per 1000 inst\nfloat MMX            0              # 0.000 MMX per 1000 inst\nfloat scalar         10             # 0.000 scalar per 1000 inst\n\n<\/code><\/pre>\n\n\n\n<p>Intel metrics<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed              78.685\non_cpu               0.052          # 0.83 \/ 16 cores\nutime                64.831\nstime                0.552\nnvcsw                1881           # 81.78%\nnivcsw               419            # 18.22%\ninblock              8              # 0.10\/sec\nonblock              2008           # 25.52\/sec\ncpu-clock            65403411728    # 65.403 seconds\ntask-clock           65406142261    # 65.406 seconds\npage faults          213026         # 3256.972\/sec\ncontext switches     2523           # 38.574\/sec\ncpu migrations       300            # 4.587\/sec\nmajor page faults    0              # 0.000\/sec\nminor page faults    213026         # 3256.972\/sec\nalignment faults     0              # 0.000\/sec\nemulation faults     0              # 0.000\/sec\nbranches             77620340217    # 126.456 branches per 1000 inst\nbranch misses        2814845760     # 3.63% branch miss\nconditional          77620352217    # 126.456 conditional branches per 1000 inst\nindirect             546661136      # 0.891 indirect branches per 1000 inst\nslots                1479746226446  #\nretiring             586349958636   # 39.6% (39.6%)\n-- ucode             23262388203    #     1.6%\n-- fastpath          563087570433   #    38.1%\nfrontend             441928498060   # 29.9% (29.9%)\n-- latency           219742463361   #    14.9%\n-- 
bandwidth         222186034699   #    15.0%\nbackend              128165873654   #  8.7% ( 8.7%)\n-- cpu               73050345133    #     4.9%\n-- memory            55115528521    #     3.7%\nspeculation          330332085857   # 22.3% (22.3%)\n-- branch mispredict 324491614752   #    21.9%\n-- pipeline restart  5840471105     #     0.4%\nsmt-contention       0              #  0.0% ( 0.0%)\ncpu-cycles           246727230707   # 0.20 GHz\ninstructions         613593772132   # 2.49 IPC\nl2 access            4856487075     # 7.918 l2 access per 1000 inst\nl2 miss              738889264      # 15.21% l2 miss\n<\/code><\/pre>\n\n\n\n<p>Process structure shows just six invocations of crafty and a moderate percentage of overhead.<\/p>\n\n\n\n<p>Core computation is simple.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>      1954945) sh               cpu=4 start=5.57  finish=5.57 \n        1954946) stty             cpu=1 start=5.57  finish=5.57 \n      1954947) crafty-benchmar  cpu=3 start=5.57  finish=21.44\n        1954948) crafty           cpu=5 start=5.58  finish=21.43\n          1954951) crafty           cpu=7 start=21.29 finish=21.43\n      1954952) crafty-benchmar  cpu=3 start=25.44 finish=41.30\n        1954953) crafty           cpu=12 start=25.45 finish=41.29\n          1954956) crafty           cpu=13 start=41.14 finish=41.29\n      1954957) crafty-benchmar  cpu=11 start=45.30 finish=61.09\n        1954958) crafty           cpu=12 start=45.30 finish=61.09\n          1954959) crafty           cpu=14 start=60.96 finish=61.09\n<\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>crafty is a quick-running, single-threaded chess benchmark. Topdown shows very high branch misprediction and low backend stalls. AMD metrics Intel metrics Process structure shows just six invocations of crafty and a moderate percentage of overhead. 
Core <span class=\"excerpt-dots\">&hellip;<\/span> <a class=\"more-link\" href=\"https:\/\/mvermeulen.org\/perf\/workloads\/phoronix\/crafty\/\"><span class=\"more-msg\">Continue reading &rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"parent":58,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-657","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/657","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/comments?post=657"}],"version-history":[{"count":1,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/657\/revisions"}],"predecessor-version":[{"id":660,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/657\/revisions\/660"}],"up":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/58"}],"wp:attachment":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/media?parent=657"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}