{"id":710,"date":"2024-01-20T00:50:22","date_gmt":"2024-01-20T00:50:22","guid":{"rendered":"https:\/\/mvermeulen.org\/perf\/?page_id=710"},"modified":"2024-01-20T11:46:45","modified_gmt":"2024-01-20T11:46:45","slug":"appleseed","status":"publish","type":"page","link":"https:\/\/mvermeulen.org\/perf\/workloads\/phoronix\/appleseed\/","title":{"rendered":"appleseed"},"content":{"rendered":"\n<p>Appleseed is a rendering engine with three workloads that take almost 100% of the CPU.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/systemtime-41.png\" alt=\"\" class=\"wp-image-728\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/systemtime-41.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/systemtime-41-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/systemtime-41-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>Overall, topdown shows backend stalls dominating, but each workload has a slightly different profile, particularly in the amount of frontend\/backend stalls.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-79.png\" alt=\"\" class=\"wp-image-730\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-79.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-79-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-79-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>The grouped summary of AMD metrics shows floating point code 
with moderate number of branches.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed              1544.222\non_cpu               0.929          # 14.87 \/ 16 cores\nutime                22862.002\nstime                97.298\nnvcsw                5096488        # 74.55%\nnivcsw               1739487        # 25.45%\ninblock              256            # 0.17\/sec\nonblock              14856          # 9.62\/sec\ncpu-clock            22963044408516 # 22963.044 seconds\ntask-clock           22964554666466 # 22964.555 seconds\npage faults          763574         # 33.250\/sec\ncontext switches     6843527        # 298.004\/sec\ncpu migrations       105915         # 4.612\/sec\nmajor page faults    3              # 0.000\/sec\nminor page faults    763571         # 33.250\/sec\nalignment faults     0              # 0.000\/sec\nemulation faults     0              # 0.000\/sec\nbranches             9634462408321  # 87.422 branches per 1000 inst\nbranch misses        224077950950   # 2.33% branch miss\nconditional          5989445877683  # 54.348 conditional branches per 1000 inst\nindirect             615593307750   # 5.586 indirect branches per 1000 inst\ncpu-cycles           91048561979812 # 3.68 GHz\ninstructions         110241188818645 # 1.21 IPC\nslots                182062196321010 #\nretiring             39056699611588 # 21.5% (29.0%)\n-- ucode             68402854315    #     0.0%\n-- fastpath          38988296757273 #    21.4%\nfrontend             34087251629845 # 18.7% (25.3%)\n-- latency           24525417770772 #    13.5%\n-- bandwidth         9561833859073  #     5.3%\nbackend              54966957402417 # 30.2% (40.8%)\n-- cpu               18646175227647 #    10.2%\n-- memory            36320782174770 #    19.9%\nspeculation          6473364404284  #  3.6% ( 4.8%)\n-- branch mispredict 6109336851991  #     3.4%\n-- pipeline restart  364027552293   #     0.2%\nsmt-contention       47477116668028 # 26.1% ( 0.0%)\ncpu-cycles           90992949840803 # 
3.68 GHz\ninstructions         110221349618840 # 1.21 IPC\ninstructions         36742471260019 # 67.276 l2 access per 1000 inst\nl2 hit from l1       2395790225846  # 3.75% l2 miss\nl2 miss from l1      59983147918    #\nl2 hit from l2 pf    43433251732    #\nl3 hit from l2 pf    25336176898    #\nl3 miss from l2 pf   7343561271     #\ninstructions         36727462123726 # 340.122 float per 1000 inst\nfloat 512            66             # 0.000 AVX-512 per 1000 inst\nfloat 256            672            # 0.000 AVX-256 per 1000 inst\nfloat 128            12491831036792 # 340.122 AVX-128 per 1000 inst\nfloat MMX            0              # 0.000 MMX per 1000 inst\nfloat scalar         4              # 0.000 scalar per 1000 inst\n<\/code><\/pre>\n\n\n\n<p>Intel metrics<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed              2110.124\non_cpu               0.905          # 14.48 \/ 16 cores\nutime                30455.635\nstime                100.074\nnvcsw                6781397        # 79.56%\nnivcsw               1741812        # 20.44%\ninblock              619080         # 293.39\/sec\nonblock              3608           # 1.71\/sec\ncpu-clock            30553924914659 # 30553.925 seconds\ntask-clock           30555699410017 # 30555.699 seconds\npage faults          745731         # 24.406\/sec\ncontext switches     8533593        # 279.280\/sec\ncpu migrations       407712         # 13.343\/sec\nmajor page faults    480            # 0.016\/sec\nminor page faults    745251         # 24.390\/sec\nalignment faults     0              # 0.000\/sec\nemulation faults     0              # 0.000\/sec\nbranches             9629081309023  # 87.361 branches per 1000 inst\nbranch misses        246716502984   # 2.56% branch miss\nconditional          9629081324607  # 87.361 conditional branches per 1000 inst\nindirect             3046285794073  # 27.638 indirect branches per 1000 inst\nslots                150149147058038 #\nretiring             63179454555339 # 
42.1% (42.1%)\n-- ucode             3495132839894  #     2.3%\n-- fastpath          59684321715445 #    39.8%\nfrontend             34116202842742 # 22.7% (22.7%)\n-- latency           20312273735714 #    13.5%\n-- bandwidth         13803929107028 #     9.2%\nbackend              33291561383047 # 22.2% (22.2%)\n-- cpu               19378372968340 #    12.9%\n-- memory            13913188414707 #     9.3%\nspeculation          18977631365379 # 12.6% (12.6%)\n-- branch mispredict 18538903435909 #    12.3%\n-- pipeline restart  438727929470   #     0.3%\nsmt-contention       0              #  0.0% ( 0.0%)\ncpu-cycles           94207477479198 # 2.80 GHz\ninstructions         129518441410678 # 1.37 IPC\nl2 access            3838689652808  # 60.556 l2 access per 1000 inst\nl2 miss              274555835292   # 7.15% l2 miss\n<\/code><\/pre>\n\n\n\n<p>The process overview shows a set of named worker processes for the CLI.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>461 processes\n\t 19 appleseed.cli        194557.01   683.21\n\t  6 worker_002           34187.89   104.13\n\t  6 worker_003           34187.89   104.13\n\t  6 worker_004           34187.89   104.13\n\t  6 worker_005           34187.89   104.13\n\t  6 worker_006           34187.89   104.13\n\t  6 worker_007           34187.89   104.13\n\t  6 worker_008           34187.89   104.13\n\t  6 worker_009           34187.89   104.13\n\t  6 worker_010           34187.89   104.13\n\t  6 worker_011           34187.89   104.13\n\t  6 worker_012           34187.89   104.13\n\t  6 worker_013           34187.89   104.13\n\t  6 worker_014           34187.89   104.13\n\t  6 worker_015           34187.89   104.13\n\t  6 pass_manager         34187.88   104.13\n\t  6 worker_000           34187.88   104.13\n\t  6 worker_001           34187.88   104.13\n\t 68 clinfo                  16.53     6.32\n\t 38 vulkaninfo               0.95     1.23\n\t  6 glxinfo:gdrv0            0.12     0.10\n\t  4 vulkani:disk$0           0.10     
0.13\n\t  6 php                      0.07     0.16\n\t  6 clang                    0.06     0.06\n\t  2 glxinfo                  0.06     0.04\n\t  2 glxinfo:cs0              0.06     0.04\n\t  2 glxinfo:disk$0           0.06     0.04\n\t  2 glxinfo:sh0              0.06     0.04\n\t  2 glxinfo:shlo0            0.06     0.04\n\t  2 llvmpipe-0               0.05     0.07\n\t  2 llvmpipe-1               0.05     0.07\n\t  2 llvmpipe-10              0.05     0.07\n\t  2 llvmpipe-11              0.05     0.07\n\t  2 llvmpipe-12              0.05     0.07\n\t  2 llvmpipe-13              0.05     0.07\n\t  2 llvmpipe-14              0.05     0.07\n\t  2 llvmpipe-15              0.05     0.07\n\t  2 llvmpipe-2               0.05     0.07\n\t  2 llvmpipe-3               0.05     0.07\n\t  2 llvmpipe-4               0.05     0.07\n\t  2 llvmpipe-5               0.05     0.07\n\t  2 llvmpipe-6               0.05     0.07\n\t  2 llvmpipe-7               0.05     0.07\n\t  2 llvmpipe-8               0.05     0.07\n\t  2 llvmpipe-9               0.05     0.07\n\t  1 lspci                    0.01     0.02\n\t  3 rocminfo                 0.00     0.03\n\t  1 ps                       0.00     0.01\n\t 79 sh                       0.00     0.00\n\t 12 gcc                      0.00     0.00\n\t 12 gsettings                0.00     0.00\n\t  8 stat                     0.00     0.00\n\t  8 systemd-detect-          0.00     0.00\n\t  6 llvm-link                0.00     0.00\n\t  5 phoronix-test-s          0.00     0.00\n\t  3 appleseed-bench          0.00     0.00\n\t  2 dconf worker             0.00     0.00\n\t  2 gmain                    0.00     0.00\n\t  2 lscpu                    0.00     0.00\n\t  2 uname                    0.00     0.00\n\t  2 which                    0.00     0.00\n\t  2 xset                     0.00     0.00\n\t  1 cc                       0.00     0.00\n\t  1 date                     0.00     0.00\n\t  1 dirname                  0.00     0.00\n\t  1 dmesg     
               0.00     0.00\n\t  1 dmidecode                0.00     0.00\n\t  1 grep                     0.00     0.00\n\t  1 ifconfig                 0.00     0.00\n\t  1 ip                       0.00     0.00\n\t  1 lsmod                    0.00     0.00\n\t  1 mktemp                   0.00     0.00\n\t  1 qdbus                    0.00     0.00\n\t  1 readlink                 0.00     0.00\n\t  1 realpath                 0.00     0.00\n\t  1 sed                      0.00     0.00\n\t  1 sort                     0.00     0.00\n\t  1 stty                     0.00     0.00\n\t  1 systemctl                0.00     0.00\n\t  1 template.sh              0.00     0.00\n\t  1 wc                       0.00     0.00\n\t  1 xrandr                   0.00     0.00\n0 processes running\n47 maximum processes\n<\/code><\/pre>\n\n\n\n<p>An example computation structure<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>      2652887) appleseed-bench  cpu=13 start=5.82  finish=727.07\n        2652888) appleseed.cli    cpu=2 start=5.82  finish=727.01\n          2652889) appleseed.cli    cpu=7 start=6.03  finish=727.01\n          2652890) appleseed.cli    cpu=11 start=6.03  finish=727.01\n          2652891) appleseed.cli    cpu=3 start=6.03  finish=727.01\n          2652892) appleseed.cli    cpu=4 start=6.03  finish=727.01\n          2652893) appleseed.cli    cpu=10 start=6.03  finish=727.01\n          2652894) appleseed.cli    cpu=13 start=6.03  finish=727.01\n          2652895) appleseed.cli    cpu=8 start=6.03  finish=727.01\n          2652896) appleseed.cli    cpu=5 start=6.03  finish=727.01\n          2652897) appleseed.cli    cpu=9 start=6.03  finish=727.01\n          2652898) appleseed.cli    cpu=0 start=6.03  finish=727.01\n          2652899) appleseed.cli    cpu=1 start=6.03  finish=727.01\n          2652900) appleseed.cli    cpu=10 start=6.03  finish=727.01\n          2652901) appleseed.cli    cpu=15 start=6.03  finish=727.01\n          2652902) appleseed.cli    cpu=12 
start=6.03  finish=727.01\n          2652903) appleseed.cli    cpu=14 start=6.03  finish=727.01\n          2652904) appleseed.cli    cpu=6 start=6.03  finish=727.01\n          2652905) worker_000       cpu=13 start=8.24  finish=367.37\n          2652906) worker_001       cpu=10 start=8.24  finish=367.37\n          2652907) worker_002       cpu=2 start=8.24  finish=367.37\n          2652908) worker_003       cpu=6 start=8.24  finish=367.37\n          2652909) worker_004       cpu=0 start=8.24  finish=367.37\n          2652910) worker_005       cpu=4 start=8.24  finish=367.37\n          2652911) worker_006       cpu=1 start=8.24  finish=367.37\n          2652912) worker_007       cpu=15 start=8.24  finish=367.37\n          2652913) worker_008       cpu=12 start=8.24  finish=367.37\n          2652914) worker_009       cpu=5 start=8.24  finish=367.37\n          2652915) worker_010       cpu=14 start=8.24  finish=367.37\n          2652916) worker_011       cpu=8 start=8.24  finish=367.37\n          2652917) worker_012       cpu=3 start=8.24  finish=367.37\n          2652918) worker_013       cpu=10 start=8.24  finish=367.37\n          2652919) worker_014       cpu=11 start=8.24  finish=367.37\n          2652920) worker_015       cpu=13 start=8.24  finish=367.38\n          2652921) pass_manager     cpu=12 start=8.24  finish=367.37\n          2652925) worker_000       cpu=12 start=367.41 finish=726.97\n          2652926) worker_001       cpu=7 start=367.41 finish=726.97\n          2652927) worker_002       cpu=13 start=367.42 finish=726.97\n          2652928) worker_003       cpu=1 start=367.42 finish=726.97\n          2652929) worker_004       cpu=15 start=367.42 finish=726.97\n          2652930) worker_005       cpu=13 start=367.42 finish=726.97\n          2652931) worker_006       cpu=14 start=367.42 finish=726.97\n          2652932) worker_007       cpu=9 start=367.42 finish=726.97\n          2652933) worker_008       cpu=12 start=367.42 finish=726.97\n          
2652934) worker_009       cpu=0 start=367.42 finish=726.97\n          2652935) worker_010       cpu=8 start=367.42 finish=726.97\n          2652936) worker_011       cpu=11 start=367.42 finish=726.97\n          2652937) worker_012       cpu=3 start=367.42 finish=726.97\n          2652938) worker_013       cpu=4 start=367.42 finish=726.97\n          2652939) worker_014       cpu=6 start=367.42 finish=726.97\n          2652940) worker_015       cpu=5 start=367.42 finish=726.97\n          2652941) pass_manager     cpu=4 start=367.42 finish=726.96\n<\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>Appleseed is a rendering engine with three workloads taking almost 100% of the CPU Overall topdown shows backend stalls dominating but each workload with slightly different profiles particularly the amount of frontend\/backend stalls. AMD metrics grouped together summary shows floating <span class=\"excerpt-dots\">&hellip;<\/span> <a class=\"more-link\" href=\"https:\/\/mvermeulen.org\/perf\/workloads\/phoronix\/appleseed\/\"><span class=\"more-msg\">Continue reading 
&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"parent":58,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-710","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/710","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/comments?post=710"}],"version-history":[{"count":2,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/710\/revisions"}],"predecessor-version":[{"id":731,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/710\/revisions\/731"}],"up":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/58"}],"wp:attachment":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/media?parent=710"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}