{"id":1456,"date":"2024-02-03T23:24:12","date_gmt":"2024-02-03T23:24:12","guid":{"rendered":"https:\/\/mvermeulen.org\/perf\/?page_id=1456"},"modified":"2024-02-07T19:27:44","modified_gmt":"2024-02-07T19:27:44","slug":"brl-cad","status":"publish","type":"page","link":"https:\/\/mvermeulen.org\/perf\/workloads\/phoronix\/brl-cad\/","title":{"rendered":"brl-cad"},"content":{"rendered":"\n<p>brl-cad is a cross-platform solid modeling system. There is one workload and it returns a single result.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/02\/systemtime-33.png\" alt=\"\" class=\"wp-image-1571\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/02\/systemtime-33.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/02\/systemtime-33-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/02\/systemtime-33-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>Topdown profile shows a moderate retirement rate with backend stalls around 35% and frontend stalls a bit lower.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/02\/amdtopdown-35.png\" alt=\"\" class=\"wp-image-1573\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/02\/amdtopdown-35.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/02\/amdtopdown-35-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/02\/amdtopdown-35-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>AMD metrics confirm that this workload is on_cpu a high percentage of the time. 
This is floating-point code in which about one in five instructions is a branch.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed              639.225\non_cpu               0.974          # 15.59 \/ 16 cores\nutime                9958.646\nstime                6.480\nnvcsw                215292         # 68.36%\nnivcsw               99651          # 31.64%\ninblock              0              # 0.00\/sec\nonblock              71144          # 111.30\/sec\ncpu-clock            9966039733748  # 9966.040 seconds\ntask-clock           9966158602700  # 9966.159 seconds\npage faults          735554         # 73.805\/sec\ncontext switches     313640         # 31.471\/sec\ncpu migrations       1654           # 0.166\/sec\nmajor page faults    8              # 0.001\/sec\nminor page faults    735546         # 73.804\/sec\nalignment faults     0              # 0.000\/sec\nemulation faults     0              # 0.000\/sec\nbranches             14007495341612 # 211.445 branches per 1000 inst\nbranch misses        3341257044     # 0.02% branch miss\nconditional          11685179228188 # 176.389 conditional branches per 1000 inst\nindirect             525318798526   # 7.930 indirect branches per 1000 inst\ncpu-cycles           47112380829398 # 3.74 GHz\ninstructions         81128016902473 # 1.72 IPC\nslots                94276717844766 #\nretiring             27376809665597 # 29.0% (42.0%)\n-- ucode             11102347489    #     0.0%\n-- fastpath          27365707318108 #    29.0%\nfrontend             11901900325957 # 12.6% (18.2%)\n-- latency           6234972940752  #     6.6%\n-- bandwidth         5666927385205  #     6.0%\nbackend              24883562569187 # 26.4% (38.1%)\n-- cpu               8635199705225  #     9.2%\n-- memory            16248362863962 #    17.2%\nspeculation          1078411183751  #  1.1% ( 1.7%)\n-- branch mispredict 179175070523   #     0.2%\n-- pipeline restart  899236113228   #     1.0%\nsmt-contention       29035889927224 # 30.8% ( 0.0%)\ncpu-cycles       
    39243991456063 # 3.73 GHz\ninstructions         67559938891205 # 1.72 IPC\ninstructions         22520649909355 # 41.274 l2 access per 1000 inst\nl2 hit from l1       902034805920   # 1.60% l2 miss\nl2 miss from l1      13786700485    #\nl2 hit from l2 pf    26419658106    #\nl3 hit from l2 pf    1056269999     #\nl3 miss from l2 pf   13978899       #\ninstructions         22505317679834 # 218.552 float per 1000 inst\nfloat 512            982            # 0.000 AVX-512 per 1000 inst\nfloat 256            614            # 0.000 AVX-256 per 1000 inst\nfloat 128            4918591984823  # 218.552 AVX-128 per 1000 inst\nfloat MMX            0              # 0.000 MMX per 1000 inst\nfloat scalar         0              # 0.000 scalar per 1000 inst\ninstructions         2672234        #\nopcache              987136         # 369.405 opcache per 1000 inst\nopcache miss         524283         # 53.1% opcache miss rate\nl1 dTLB miss         6777           # 2.536 L1 dTLB per 1000 inst\nl2 dTLB miss         1199           # 0.449 L2 dTLB per 1000 inst\ninstructions         2700478        #\nicache               1309002        # 484.730 icache per 1000 inst\nicache miss          110099         #  8.4% icache miss rate\nl1 iTLB miss         9              # 0.003 L1 iTLB per 1000 inst\nl2 iTLB miss         0              # 0.000 L2 iTLB per 1000 inst\ntlb flush            20             # 0.007 TLB flush per 1000 inst\n<\/code><\/pre>\n\n\n\n<p>Intel metrics<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed              1076.187\non_cpu               0.984          # 15.75 \/ 16 cores\nutime                16941.117\nstime                10.424\nnvcsw                628917         # 82.99%\nnivcsw               128878         # 17.01%\ninblock              56528          # 52.53\/sec\nonblock              72224          # 67.11\/sec\ncpu-clock            16951994430620 # 16951.994 seconds\ntask-clock           16952153400873 # 16952.153 seconds\npage faults          
932561         # 55.011\/sec\ncontext switches     756965         # 44.653\/sec\ncpu migrations       3177           # 0.187\/sec\nmajor page faults    259            # 0.015\/sec\nminor page faults    932302         # 54.996\/sec\nalignment faults     0              # 0.000\/sec\nemulation faults     0              # 0.000\/sec\nbranches             20434183000659 # 211.784 branches per 1000 inst\nbranch misses        8516881008     # 0.04% branch miss\nconditional          20434183256243 # 211.784 conditional branches per 1000 inst\nindirect             4426944693177  # 45.882 indirect branches per 1000 inst\nslots                84159439116410 #\nretiring             51389887615992 # 61.1% (61.1%) high\n-- ucode             3059294803207  #     3.6%\n-- fastpath          48330592812785 #    57.4%\nfrontend             31103101493743 # 37.0% (37.0%)\n-- latency           10041258437846 #    11.9%\n-- bandwidth         21061843055897 #    25.0%\nbackend              1082659792903  #  1.3% ( 1.3%) low\n-- cpu               470510574004   #     0.6%\n-- memory            612149218899   #     0.7%\nspeculation          545808698563   #  0.6% ( 0.6%) low\n-- branch mispredict 239531142061   #     0.3%\n-- pipeline restart  306277556502   #     0.4%\nsmt-contention       0              #  0.0% ( 0.0%)\ncpu-cycles           34668355324230 # 2.20 GHz\ninstructions         66488539967899 # 1.92 IPC\nl2 access            790072912798   # 18.573 l2 access per 1000 inst\nl2 miss              16242071025    # 2.06% l2 miss\ncpu-cycles           22697715534731 # 20.7% memory latency\nload stalls          4415703252072  # 11.8% l1 bound\nl1 miss              1743541712815  #  6.6% l2 bound\nl2 miss              238627192184   #  1.0% l3 bound\nl3 miss              632705958      #  0.0% dram bound\nstore_stalls         287744462332   #  1.3% store bound\n<\/code><\/pre>\n\n\n\n<p>Process overview shows rt as the primary process. 
The overview appears incomplete and may not have captured every process.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>6394 processes\n\t1269 rt                   194461.77   119.96\n\t 68 clinfo                  19.53     5.96\n\t 38 vulkaninfo               1.15     1.14\n\t  4 vulkani:disk$0           0.13     0.12\n\t  6 glxinfo:gdrv0            0.11     0.09\n\t  6 glxinfo:gl0              0.11     0.09\n\t  2 llvmpipe-0               0.07     0.06\n\t  2 llvmpipe-1               0.07     0.06\n\t  2 llvmpipe-10              0.07     0.06\n\t  2 llvmpipe-11              0.07     0.06\n\t  2 llvmpipe-12              0.07     0.06\n\t  2 llvmpipe-13              0.07     0.06\n\t  2 llvmpipe-14              0.07     0.06\n\t  2 llvmpipe-15              0.07     0.06\n\t  2 llvmpipe-2               0.07     0.06\n\t  2 llvmpipe-3               0.07     0.06\n\t  2 llvmpipe-4               0.07     0.06\n\t  2 llvmpipe-5               0.07     0.06\n\t  2 llvmpipe-6               0.07     0.06\n\t  2 llvmpipe-7               0.07     0.06\n\t  2 llvmpipe-8               0.07     0.06\n\t  2 llvmpipe-9               0.07     0.06\n\t  6 clang                    0.07     0.05\n\t  2 glxinfo                  0.07     0.03\n\t  2 glxinfo:cs0              0.07     0.03\n\t  2 glxinfo:disk$0           0.07     0.03\n\t  2 glxinfo:sh0              0.07     0.03\n\t  2 glxinfo:shlo0            0.07     0.03\n\t  6 php                      0.04     0.10\n\t  3 rocminfo                 0.01     0.01\n\t648 benchmark                0.00     0.16\n\t  1 lspci                    0.00     0.03\n\t  1 ps                       0.00     0.01\n\t1453 expr                     0.00     0.00\n\t1328 elapsed.sh               0.00     0.00\n\t804 awk                      0.00     0.00\n\t137 date                     0.00     0.00\n\t 92 wc                       0.00     0.00\n\t 80 sh                       0.00     0.00\n\t 66 rm                       0.00     0.00\n\t 55 grep                     
0.00     0.00\n\t 46 cat                      0.00     0.00\n\t 45 mv                       0.00     0.00\n\t 31 dc                       0.00     0.00\n\t 13 gcc                      0.00     0.00\n\t 10 gsettings                0.00     0.00\n\t  8 stat                     0.00     0.00\n\t  8 systemd-detect-          0.00     0.00\n\t  7 pixcmp                   0.00     0.00\n\t  7 tr                       0.00     0.00\n\t  6 llvm-link                0.00     0.00\n\t  5 phoronix-test-s          0.00     0.00\n\t  5 sed                      0.00     0.00\n\t  4 gmain                    0.00     0.00\n\t  3 uname                    0.00     0.00\n\t  2 cc                       0.00     0.00\n\t  2 dconf worker             0.00     0.00\n\t  2 dirname                  0.00     0.00\n\t  2 lscpu                    0.00     0.00\n\t  2 which                    0.00     0.00\n\t  2 xset                     0.00     0.00\n\t  1 basename                 0.00     0.00\n\t  1 bc                       0.00     0.00\n\t  1 brl-cad                  0.00     0.00\n\t  1 dmesg                    0.00     0.00\n\t  1 dmidecode                0.00     0.00\n\t  1 hostname                 0.00     0.00\n\t  1 ifconfig                 0.00     0.00\n\t  1 ip                       0.00     0.00\n\t  1 lsmod                    0.00     0.00\n\t  1 mktemp                   0.00     0.00\n\t  1 qdbus                    0.00     0.00\n\t  1 readlink                 0.00     0.00\n\t  1 realpath                 0.00     0.00\n\t  1 sort                     0.00     0.00\n\t  1 stty                     0.00     0.00\n\t  1 systemctl                0.00     0.00\n\t  1 template.sh              0.00     0.00\n\t  1 touch                    0.00     0.00\n\t  1 xrandr                   0.00     0.00\n52 processes running\n99 maximum processes\n<\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>brl-cad is a cross-platform solid modeling system. 
There is one workload and it returns a single result. Topdown profile shows a moderate retirement rate with backend stalls around 35% and frontend stalls a bit lower. AMD metrics confirms this is <span class=\"excerpt-dots\">&hellip;<\/span> <a class=\"more-link\" href=\"https:\/\/mvermeulen.org\/perf\/workloads\/phoronix\/brl-cad\/\"><span class=\"more-msg\">Continue reading &rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"parent":58,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-1456","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/1456","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/comments?post=1456"}],"version-history":[{"count":3,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/1456\/revisions"}],"predecessor-version":[{"id":1574,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/1456\/revisions\/1574"}],"up":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/58"}],"wp:attachment":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/media?parent=1456"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}