Appleseed is a rendering engine. The benchmark runs three workloads that together keep the CPU more than 90% busy (on_cpu is 0.929 on AMD and 0.905 on Intel).

The overall topdown profile shows backend stalls dominating, but each workload has a slightly different profile, particularly in the balance between frontend and backend stalls.

The AMD metrics summary, with all counter groups combined, shows floating-point code with a moderate number of branches (about 340 float and 87 branches per 1000 instructions).

elapsed              1544.222
on_cpu               0.929          # 14.87 / 16 cores
utime                22862.002
stime                97.298
nvcsw                5096488        # 74.55%
nivcsw               1739487        # 25.45%
inblock              256            # 0.17/sec
onblock              14856          # 9.62/sec
cpu-clock            22963044408516 # 22963.044 seconds
task-clock           22964554666466 # 22964.555 seconds
page faults          763574         # 33.250/sec
context switches     6843527        # 298.004/sec
cpu migrations       105915         # 4.612/sec
major page faults    3              # 0.000/sec
minor page faults    763571         # 33.250/sec
alignment faults     0              # 0.000/sec
emulation faults     0              # 0.000/sec
branches             9634462408321  # 87.422 branches per 1000 inst
branch misses        224077950950   # 2.33% branch miss
conditional          5989445877683  # 54.348 conditional branches per 1000 inst
indirect             615593307750   # 5.586 indirect branches per 1000 inst
cpu-cycles           91048561979812 # 3.68 GHz
instructions         110241188818645 # 1.21 IPC
slots                182062196321010 #
retiring             39056699611588 # 21.5% (29.0%)
-- ucode             68402854315    #     0.0%
-- fastpath          38988296757273 #    21.4%
frontend             34087251629845 # 18.7% (25.3%)
-- latency           24525417770772 #    13.5%
-- bandwidth         9561833859073  #     5.3%
backend              54966957402417 # 30.2% (40.8%)
-- cpu               18646175227647 #    10.2%
-- memory            36320782174770 #    19.9%
speculation          6473364404284  #  3.6% ( 4.8%)
-- branch mispredict 6109336851991  #     3.4%
-- pipeline restart  364027552293   #     0.2%
smt-contention       47477116668028 # 26.1% ( 0.0%)
cpu-cycles           90992949840803 # 3.68 GHz
instructions         110221349618840 # 1.21 IPC
instructions         36742471260019 # 67.276 l2 access per 1000 inst
l2 hit from l1       2395790225846  # 3.75% l2 miss
l2 miss from l1      59983147918    #
l2 hit from l2 pf    43433251732    #
l3 hit from l2 pf    25336176898    #
l3 miss from l2 pf   7343561271     #
instructions         36727462123726 # 340.122 float per 1000 inst
float 512            66             # 0.000 AVX-512 per 1000 inst
float 256            672            # 0.000 AVX-256 per 1000 inst
float 128            12491831036792 # 340.122 AVX-128 per 1000 inst
float MMX            0              # 0.000 MMX per 1000 inst
float scalar         4              # 0.000 scalar per 1000 inst
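
The derived values in the comment column can be checked against the raw counters. The following is a minimal sketch, not the profiling tool's own code: the variable and function names are made up for illustration, and the formulas (CPU time over elapsed time, instructions over cycles, misses over branches) are the conventional definitions, assumed here to match what the tool reports.

```python
# Raw values copied from the AMD summary above.
ELAPSED = 1544.222            # wall-clock seconds
UTIME = 22862.002             # user CPU seconds
STIME = 97.298                # system CPU seconds
CORES = 16
CYCLES = 91048561979812
INSTRUCTIONS = 110241188818645
BRANCHES = 9634462408321
BRANCH_MISSES = 224077950950


def derived_metrics():
    """Recompute the headline ratios from the raw counters (assumed formulas)."""
    busy_cores = (UTIME + STIME) / ELAPSED
    return {
        "busy_cores": busy_cores,                 # average cores in use
        "on_cpu": busy_cores / CORES,             # fraction of the machine in use
        "ipc": INSTRUCTIONS / CYCLES,             # instructions per cycle
        "branch_miss_pct": 100.0 * BRANCH_MISSES / BRANCHES,
    }


if __name__ == "__main__":
    for name, value in derived_metrics().items():
        print(f"{name:16s} {value:.3f}")
```

Rounded to the precision of the report, these reproduce 14.87 busy cores, on_cpu 0.929, 1.21 IPC and a 2.33% branch miss rate.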

Intel metrics

elapsed              2110.124
on_cpu               0.905          # 14.48 / 16 cores
utime                30455.635
stime                100.074
nvcsw                6781397        # 79.56%
nivcsw               1741812        # 20.44%
inblock              619080         # 293.39/sec
onblock              3608           # 1.71/sec
cpu-clock            30553924914659 # 30553.925 seconds
task-clock           30555699410017 # 30555.699 seconds
page faults          745731         # 24.406/sec
context switches     8533593        # 279.280/sec
cpu migrations       407712         # 13.343/sec
major page faults    480            # 0.016/sec
minor page faults    745251         # 24.390/sec
alignment faults     0              # 0.000/sec
emulation faults     0              # 0.000/sec
branches             9629081309023  # 87.361 branches per 1000 inst
branch misses        246716502984   # 2.56% branch miss
conditional          9629081324607  # 87.361 conditional branches per 1000 inst
indirect             3046285794073  # 27.638 indirect branches per 1000 inst
slots                150149147058038 #
retiring             63179454555339 # 42.1% (42.1%)
-- ucode             3495132839894  #     2.3%
-- fastpath          59684321715445 #    39.8%
frontend             34116202842742 # 22.7% (22.7%)
-- latency           20312273735714 #    13.5%
-- bandwidth         13803929107028 #     9.2%
backend              33291561383047 # 22.2% (22.2%)
-- cpu               19378372968340 #    12.9%
-- memory            13913188414707 #     9.3%
speculation          18977631365379 # 12.6% (12.6%)
-- branch mispredict 18538903435909 #    12.3%
-- pipeline restart  438727929470   #     0.3%
smt-contention       0              #  0.0% ( 0.0%)
cpu-cycles           94207477479198 # 2.80 GHz
instructions         129518441410678 # 1.37 IPC
l2 access            3838689652808  # 60.556 l2 access per 1000 inst
l2 miss              274555835292   # 7.15% l2 miss
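
Each topdown line carries two percentages. The numbers are consistent with the first being the share of all slots and the parenthesized one being the share after smt-contention slots are excluded; that reading is an inference from the data, not documented behavior of the tool. A short sketch under that assumption, with illustrative names (`topdown_shares` is hypothetical):

```python
# Top-level topdown slot counts copied from the two summaries above.
AMD_SLOTS = 182062196321010
AMD = {
    "retiring": 39056699611588,
    "frontend": 34087251629845,
    "backend": 54966957402417,
    "speculation": 6473364404284,
    "smt-contention": 47477116668028,
}
INTEL_SLOTS = 150149147058038
INTEL = {
    "retiring": 63179454555339,
    "frontend": 34116202842742,
    "backend": 33291561383047,
    "speculation": 18977631365379,
    "smt-contention": 0,
}


def topdown_shares(counts, slots):
    """Return {category: (pct_of_all_slots, pct_excluding_smt_contention)}.

    Assumption: the parenthesized column divides by slots minus smt-contention.
    """
    usable = slots - counts.get("smt-contention", 0)
    return {
        name: (100.0 * value / slots, 100.0 * value / usable)
        for name, value in counts.items()
        if name != "smt-contention"
    }


if __name__ == "__main__":
    for label, counts, slots in (("AMD", AMD, AMD_SLOTS), ("Intel", INTEL, INTEL_SLOTS)):
        for name, (raw, adjusted) in topdown_shares(counts, slots).items():
            print(f"{label:5s} {name:12s} {raw:5.1f}% ({adjusted:5.1f}%)")
```

With smt-contention excluded, the AMD run is 40.8% backend-bound against 29.0% retiring, while the Intel run retires 42.1% of slots and spends 12.6% on speculation.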

The process overview shows a set of named worker processes (worker_000 through worker_015, plus pass_manager) for the CLI.

461 processes
	 19 appleseed.cli        194557.01   683.21
	  6 worker_002           34187.89   104.13
	  6 worker_003           34187.89   104.13
	  6 worker_004           34187.89   104.13
	  6 worker_005           34187.89   104.13
	  6 worker_006           34187.89   104.13
	  6 worker_007           34187.89   104.13
	  6 worker_008           34187.89   104.13
	  6 worker_009           34187.89   104.13
	  6 worker_010           34187.89   104.13
	  6 worker_011           34187.89   104.13
	  6 worker_012           34187.89   104.13
	  6 worker_013           34187.89   104.13
	  6 worker_014           34187.89   104.13
	  6 worker_015           34187.89   104.13
	  6 pass_manager         34187.88   104.13
	  6 worker_000           34187.88   104.13
	  6 worker_001           34187.88   104.13
	 68 clinfo                  16.53     6.32
	 38 vulkaninfo               0.95     1.23
	  6 glxinfo:gdrv0            0.12     0.10
	  4 vulkani:disk$0           0.10     0.13
	  6 php                      0.07     0.16
	  6 clang                    0.06     0.06
	  2 glxinfo                  0.06     0.04
	  2 glxinfo:cs0              0.06     0.04
	  2 glxinfo:disk$0           0.06     0.04
	  2 glxinfo:sh0              0.06     0.04
	  2 glxinfo:shlo0            0.06     0.04
	  2 llvmpipe-0               0.05     0.07
	  2 llvmpipe-1               0.05     0.07
	  2 llvmpipe-10              0.05     0.07
	  2 llvmpipe-11              0.05     0.07
	  2 llvmpipe-12              0.05     0.07
	  2 llvmpipe-13              0.05     0.07
	  2 llvmpipe-14              0.05     0.07
	  2 llvmpipe-15              0.05     0.07
	  2 llvmpipe-2               0.05     0.07
	  2 llvmpipe-3               0.05     0.07
	  2 llvmpipe-4               0.05     0.07
	  2 llvmpipe-5               0.05     0.07
	  2 llvmpipe-6               0.05     0.07
	  2 llvmpipe-7               0.05     0.07
	  2 llvmpipe-8               0.05     0.07
	  2 llvmpipe-9               0.05     0.07
	  1 lspci                    0.01     0.02
	  3 rocminfo                 0.00     0.03
	  1 ps                       0.00     0.01
	 79 sh                       0.00     0.00
	 12 gcc                      0.00     0.00
	 12 gsettings                0.00     0.00
	  8 stat                     0.00     0.00
	  8 systemd-detect-          0.00     0.00
	  6 llvm-link                0.00     0.00
	  5 phoronix-test-s          0.00     0.00
	  3 appleseed-bench          0.00     0.00
	  2 dconf worker             0.00     0.00
	  2 gmain                    0.00     0.00
	  2 lscpu                    0.00     0.00
	  2 uname                    0.00     0.00
	  2 which                    0.00     0.00
	  2 xset                     0.00     0.00
	  1 cc                       0.00     0.00
	  1 date                     0.00     0.00
	  1 dirname                  0.00     0.00
	  1 dmesg                    0.00     0.00
	  1 dmidecode                0.00     0.00
	  1 grep                     0.00     0.00
	  1 ifconfig                 0.00     0.00
	  1 ip                       0.00     0.00
	  1 lsmod                    0.00     0.00
	  1 mktemp                   0.00     0.00
	  1 qdbus                    0.00     0.00
	  1 readlink                 0.00     0.00
	  1 realpath                 0.00     0.00
	  1 sed                      0.00     0.00
	  1 sort                     0.00     0.00
	  1 stty                     0.00     0.00
	  1 systemctl                0.00     0.00
	  1 template.sh              0.00     0.00
	  1 wc                       0.00     0.00
	  1 xrandr                   0.00     0.00
0 processes running
47 maximum processes

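An example computation structure. In this run, appleseed.cli starts 16 workers and a pass_manager twice in succession, and each set runs for about 359 seconds.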

      2652887) appleseed-bench  cpu=13 start=5.82  finish=727.07
        2652888) appleseed.cli    cpu=2 start=5.82  finish=727.01
          2652889) appleseed.cli    cpu=7 start=6.03  finish=727.01
          2652890) appleseed.cli    cpu=11 start=6.03  finish=727.01
          2652891) appleseed.cli    cpu=3 start=6.03  finish=727.01
          2652892) appleseed.cli    cpu=4 start=6.03  finish=727.01
          2652893) appleseed.cli    cpu=10 start=6.03  finish=727.01
          2652894) appleseed.cli    cpu=13 start=6.03  finish=727.01
          2652895) appleseed.cli    cpu=8 start=6.03  finish=727.01
          2652896) appleseed.cli    cpu=5 start=6.03  finish=727.01
          2652897) appleseed.cli    cpu=9 start=6.03  finish=727.01
          2652898) appleseed.cli    cpu=0 start=6.03  finish=727.01
          2652899) appleseed.cli    cpu=1 start=6.03  finish=727.01
          2652900) appleseed.cli    cpu=10 start=6.03  finish=727.01
          2652901) appleseed.cli    cpu=15 start=6.03  finish=727.01
          2652902) appleseed.cli    cpu=12 start=6.03  finish=727.01
          2652903) appleseed.cli    cpu=14 start=6.03  finish=727.01
          2652904) appleseed.cli    cpu=6 start=6.03  finish=727.01
          2652905) worker_000       cpu=13 start=8.24  finish=367.37
          2652906) worker_001       cpu=10 start=8.24  finish=367.37
          2652907) worker_002       cpu=2 start=8.24  finish=367.37
          2652908) worker_003       cpu=6 start=8.24  finish=367.37
          2652909) worker_004       cpu=0 start=8.24  finish=367.37
          2652910) worker_005       cpu=4 start=8.24  finish=367.37
          2652911) worker_006       cpu=1 start=8.24  finish=367.37
          2652912) worker_007       cpu=15 start=8.24  finish=367.37
          2652913) worker_008       cpu=12 start=8.24  finish=367.37
          2652914) worker_009       cpu=5 start=8.24  finish=367.37
          2652915) worker_010       cpu=14 start=8.24  finish=367.37
          2652916) worker_011       cpu=8 start=8.24  finish=367.37
          2652917) worker_012       cpu=3 start=8.24  finish=367.37
          2652918) worker_013       cpu=10 start=8.24  finish=367.37
          2652919) worker_014       cpu=11 start=8.24  finish=367.37
          2652920) worker_015       cpu=13 start=8.24  finish=367.38
          2652921) pass_manager     cpu=12 start=8.24  finish=367.37
          2652925) worker_000       cpu=12 start=367.41 finish=726.97
          2652926) worker_001       cpu=7 start=367.41 finish=726.97
          2652927) worker_002       cpu=13 start=367.42 finish=726.97
          2652928) worker_003       cpu=1 start=367.42 finish=726.97
          2652929) worker_004       cpu=15 start=367.42 finish=726.97
          2652930) worker_005       cpu=13 start=367.42 finish=726.97
          2652931) worker_006       cpu=14 start=367.42 finish=726.97
          2652932) worker_007       cpu=9 start=367.42 finish=726.97
          2652933) worker_008       cpu=12 start=367.42 finish=726.97
          2652934) worker_009       cpu=0 start=367.42 finish=726.97
          2652935) worker_010       cpu=8 start=367.42 finish=726.97
          2652936) worker_011       cpu=11 start=367.42 finish=726.97
          2652937) worker_012       cpu=3 start=367.42 finish=726.97
          2652938) worker_013       cpu=4 start=367.42 finish=726.97
          2652939) worker_014       cpu=6 start=367.42 finish=726.97
          2652940) worker_015       cpu=5 start=367.42 finish=726.97
          2652941) pass_manager     cpu=4 start=367.42 finish=726.96
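
The per-set timing can be pulled out of a listing like the one above with a small parser. This is a sketch only: the regular expression assumes the exact `pid) name cpu=N start=S finish=F` layout shown here, the sample is a subset of the lines above, and `parse` and `pass_spans` are hypothetical helper names.

```python
import re

# A few lines copied from the listing above (first and last worker of each set).
SAMPLE = """\
2652905) worker_000       cpu=13 start=8.24  finish=367.37
2652920) worker_015       cpu=13 start=8.24  finish=367.38
2652921) pass_manager     cpu=12 start=8.24  finish=367.37
2652925) worker_000       cpu=12 start=367.41 finish=726.97
2652940) worker_015       cpu=5 start=367.42 finish=726.97
2652941) pass_manager     cpu=4 start=367.42 finish=726.96
"""

LINE = re.compile(
    r"(\d+)\)\s+(\S+)\s+cpu=(\d+)\s+start=([\d.]+)\s+finish=([\d.]+)"
)


def parse(text):
    """Turn listing lines into dicts; lines that do not match are skipped."""
    rows = []
    for match in LINE.finditer(text):
        pid, name, cpu, start, finish = match.groups()
        rows.append({
            "pid": int(pid),
            "name": name,
            "cpu": int(cpu),
            "start": float(start),
            "finish": float(finish),
        })
    return rows


def pass_spans(rows, tolerance=0.1):
    """For each pass_manager, report its lifetime and the workers started with it."""
    spans = []
    for mgr in (r for r in rows if r["name"] == "pass_manager"):
        workers = [
            r for r in rows
            if r["name"].startswith("worker_")
            and abs(r["start"] - mgr["start"]) <= tolerance
        ]
        spans.append({
            "start": mgr["start"],
            "finish": mgr["finish"],
            "duration": mgr["finish"] - mgr["start"],
            "workers": len(workers),
        })
    return spans


if __name__ == "__main__":
    for span in pass_spans(parse(SAMPLE)):
        print(f"start={span['start']:.2f} duration={span['duration']:.2f}s "
              f"workers={span['workers']}")
```

Matching workers to a pass_manager by start time is a heuristic chosen because the listing gives each set nearly identical start times; it would need adjusting if sets overlapped.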