povray is a SPEC CPU(R) benchmark written in C and C++ and described here. The workload runs one copy on each of the 16 logical cores.

The topdown profile shows a high retirement rate (31.7% of all slots, or 52.4% once SMT contention is excluded) with some backend stalls (23.6% of all slots).

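The AMD metrics confirm that the workload runs on all cores (on_cpu is 0.975, or 15.60 of 16 cores). The backend stalls are split almost evenly between memory (12.4%) and CPU (11.2%). L2 access is moderate (63.187 accesses per 1000 instructions) and there are almost no L2 misses (0.06%).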

elapsed              1207.797
on_cpu               0.975          # 15.60 / 16 cores
utime                18835.978
stime                7.810
nvcsw                26880          # 13.67%
nivcsw               169808         # 86.33%
inblock              0              # 0.00/sec
onblock              1750920        # 1449.68/sec
cpu-clock            18843959086829 # 18843.959 seconds
task-clock           18844041233899 # 18844.041 seconds
page faults          1200044        # 63.683/sec
context switches     196017         # 10.402/sec
cpu migrations       187            # 0.010/sec
major page faults    848            # 0.045/sec
minor page faults    1199196        # 63.638/sec
alignment faults     0              # 0.000/sec
emulation faults     0              # 0.000/sec
branches             20597032052711 # 157.689 branches per 1000 inst
branch misses        52441407427    # 0.25% branch miss
conditional          14324263755382 # 109.665 conditional branches per 1000 inst
indirect             1423684510013  # 10.900 indirect branches per 1000 inst
cpu-cycles           72261524343074 # 3.78 GHz
instructions         130633370200321 # 1.81 IPC
slots                144529562538048 #
retiring             45790610884003 # 31.7% (52.4%)
-- ucode             452654600211   #     0.3%
-- fastpath          45337956283792 #    31.4%
frontend             5742560763746  #  4.0% ( 6.6%)
-- latency           3599231892750  #     2.5%
-- bandwidth         2143328870996  #     1.5%
backend              34057388037444 # 23.6% (38.9%)
-- cpu               16179137901330 #    11.2%
-- memory            17878250136114 #    12.4%
speculation          1877350050617  #  1.3% ( 2.1%)
-- branch mispredict 1551312091401  #     1.1%
-- pipeline restart  326037959216   #     0.2%
smt-contention       57061496732727 # 39.5% ( 0.0%)
cpu-cycles           72310084126317 # 3.76 GHz
instructions         130637928317245 # 1.81 IPC
instructions         43547668810124 # 63.187 l2 access per 1000 inst
l2 hit from l1       2423517159190  # 0.06% l2 miss
l2 miss from l1      873514352      #
l2 hit from l2 pf    327458307998   #
l3 hit from l2 pf    606029657      #
l3 miss from l2 pf   59682300       #
instructions         43529508664435 # 244.832 float per 1000 inst
float 512            289            # 0.000 AVX-512 per 1000 inst
float 256            2219972971     # 0.051 AVX-256 per 1000 inst
float 128            10655211619067 # 244.781 AVX-128 per 1000 inst
float MMX            0              # 0.000 MMX per 1000 inst
float scalar         0              # 0.000 scalar per 1000 inst
instructions         130617322132691 #
opcache              20744462326405 # 158.819 opcache per 1000 inst
opcache miss         771836069050   #  3.7% opcache miss rate
l1 dTLB miss         658886665708   # 5.044 L1 dTLB per 1000 inst
l2 dTLB miss         8891149600     # 0.068 L2 dTLB per 1000 inst
instructions         130617394601927 #
icache               1101558221297  # 8.433 icache per 1000 inst
icache miss          230432987096   # 20.9% icache miss rate
l1 iTLB miss         73327277129    # 0.561 L1 iTLB per 1000 inst
l2 iTLB miss         0              # 0.000 L2 iTLB per 1000 inst
tlb flush            105996         # 0.000 TLB flush per 1000 inst
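
The headline figures in the listing above can be re-derived from the raw counters. The following is a minimal sketch, not the profiling tool's own code: the variable names are made up for illustration, and it assumes that the parenthesised topdown column is the share of slots left after subtracting smt-contention, which is what the arithmetic suggests.

```python
# Raw counters copied from the listing above.
elapsed = 1207.797                  # wall-clock seconds
utime, stime = 18835.978, 7.810     # CPU seconds in user and system mode
cores = 16                          # logical cores, from the "/ 16 cores" comment

cycles = 72261524343074
instructions = 130633370200321
slots = 144529562538048
smt_contention = 57061496732727
topdown = {
    "retiring": 45790610884003,
    "frontend": 5742560763746,
    "backend": 34057388037444,
    "speculation": 1877350050617,
}


def pct(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Fraction of the available core-seconds that were spent on CPU.
on_cpu = (utime + stime) / (elapsed * cores)
ipc = instructions / cycles
# Assumption: the parenthesised column excludes SMT-contended slots.
usable_slots = slots - smt_contention

print(f"on_cpu {on_cpu:.3f}  ({on_cpu * cores:.2f} / {cores} cores)")
print(f"IPC    {ipc:.2f}")
for name, count in topdown.items():
    print(f"{name:12s} {pct(count, slots):5.1f}% ({pct(count, usable_slots):5.1f}%)")
```

Under that assumption the four topdown categories sum to roughly 100% of the non-contended slots, which is why the parenthesised numbers are the more useful ones when comparing against a run with SMT disabled.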

The process overview shows that povray_r_base.m accounts for almost all of the time.

691 processes
	 48 povray_r_base.m      18713.39     2.26
	 71 specperl                41.33     1.60
	 48 imagevalidate_5          8.77     1.21
	  2 clang++                  0.02     0.01
	  2 clang                    0.01     0.01
	 10 ps                       0.00     0.01
	225 sh                       0.00     0.00
	 54 specrxp                  0.00     0.00
	 48 bash                     0.00     0.00
	 41 specinvoke               0.00     0.00
	 22 cat                      0.00     0.00
	 21 grep                     0.00     0.00
	 12 uniq                     0.00     0.00
	 11 sort                     0.00     0.00
	 10 expand                   0.00     0.00
	  7 specmake                 0.00     0.00
	  6 pwd                      0.00     0.00
	  5 basename                 0.00     0.00
	  5 systemctl                0.00     0.00
	  4 rm                       0.00     0.00
	  4 specpp                   0.00     0.00
	  4 uname                    0.00     0.00
	  3 dirname                  0.00     0.00
	  3 dmidecode                0.00     0.00
	  3 lscpu                    0.00     0.00
	  2 df                       0.00     0.00
	  2 dpkg                     0.00     0.00
	  2 runcpu                   0.00     0.00
	  2 specsha512sum            0.00     0.00
	  2 specxz                   0.00     0.00
	  2 who                      0.00     0.00
	  1 cpupower                 0.00     0.00
	  1 head                     0.00     0.00
	  1 logname                  0.00     0.00
	  1 ls                       0.00     0.00
	  1 lsb_release              0.00     0.00
	  1 numactl                  0.00     0.00
	  1 sysctl                   0.00     0.00
	  1 w                        0.00     0.00
	  1 wc                       0.00     0.00
	  1 which                    0.00     0.00
0 processes running
53 maximum processes
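
To put a number on "almost all", the overview rows can be parsed and summed. This is a rough sketch that assumes the third column is accumulated CPU seconds per command name (the listing does not label its columns) and ignores the fourth column; only the rows with non-zero time are copied in, and the function name is hypothetical.

```python
# Rows with non-zero time, copied from the process overview above.
overview = """\
 48 povray_r_base.m      18713.39     2.26
 71 specperl                41.33     1.60
 48 imagevalidate_5          8.77     1.21
  2 clang++                  0.02     0.01
  2 clang                    0.01     0.01
"""


def parse_overview(text):
    """Return (name, process_count, seconds) for each four-field row."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # skip summary lines such as "691 processes"
        count, name, seconds, _unlabeled = fields
        rows.append((name, int(count), float(seconds)))
    return rows


rows = parse_overview(overview)
total = sum(seconds for _, _, seconds in rows)
share = {name: seconds / total for name, _, seconds in rows}

for name, count, seconds in rows:
    print(f"{name:18s} {count:4d} procs {100 * share[name]:6.2f}%")
```

With these rows, povray_r_base.m comes out at about 99.7% of the summed time.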

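specinvoke launches a separate povray process, each through its own sh and bash wrapper, for each of the 16 logical cores. All copies start together and finish within about 5 seconds of each other.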

    400740) specinvoke       cpu=2 start=3.81  finish=395.96
      400742) sh               cpu=13 start=3.81  finish=393.80
        400749) bash             cpu=0 start=3.81  finish=393.80
          400775) povray_r_base.m  cpu=0 start=3.81  finish=393.79
      400743) sh               cpu=2 start=3.81  finish=393.56
        400752) bash             cpu=1 start=3.81  finish=393.56
          400776) povray_r_base.m  cpu=1 start=3.81  finish=393.56
      400744) sh               cpu=2 start=3.81  finish=391.13
        400751) bash             cpu=2 start=3.81  finish=391.13
          400777) povray_r_base.m  cpu=2 start=3.81  finish=391.13
      400745) sh               cpu=15 start=3.81  finish=393.58
        400753) bash             cpu=3 start=3.81  finish=393.58
          400774) povray_r_base.m  cpu=3 start=3.81  finish=393.58
      400746) sh               cpu=2 start=3.81  finish=392.22
        400759) bash             cpu=4 start=3.81  finish=392.22
          400779) povray_r_base.m  cpu=4 start=3.81  finish=392.22
      400747) sh               cpu=8 start=3.81  finish=395.02
        400754) bash             cpu=5 start=3.81  finish=395.02
          400778) povray_r_base.m  cpu=5 start=3.81  finish=395.02
      400748) sh               cpu=9 start=3.81  finish=395.96
        400757) bash             cpu=6 start=3.81  finish=395.96
          400780) povray_r_base.m  cpu=6 start=3.82  finish=395.96
      400750) sh               cpu=7 start=3.81  finish=393.05
        400764) bash             cpu=7 start=3.81  finish=393.05
          400782) povray_r_base.m  cpu=7 start=3.82  finish=393.05
      400755) sh               cpu=9 start=3.81  finish=394.66
        400765) bash             cpu=8 start=3.81  finish=394.65
          400781) povray_r_base.m  cpu=8 start=3.82  finish=394.65
      400756) sh               cpu=2 start=3.81  finish=393.30
        400763) bash             cpu=9 start=3.81  finish=393.30
          400783) povray_r_base.m  cpu=9 start=3.82  finish=393.30
      400758) sh               cpu=12 start=3.81  finish=394.22
        400768) bash             cpu=10 start=3.81  finish=394.22
          400784) povray_r_base.m  cpu=10 start=3.82  finish=394.22
      400760) sh               cpu=2 start=3.81  finish=392.57
        400769) bash             cpu=11 start=3.81  finish=392.57
          400787) povray_r_base.m  cpu=11 start=3.82  finish=392.57
      400761) sh               cpu=2 start=3.81  finish=393.60
        400770) bash             cpu=12 start=3.81  finish=393.60
          400785) povray_r_base.m  cpu=12 start=3.82  finish=393.60
      400762) sh               cpu=15 start=3.81  finish=393.13
        400771) bash             cpu=13 start=3.81  finish=393.13
          400789) povray_r_base.m  cpu=13 start=3.82  finish=393.13
      400766) sh               cpu=10 start=3.81  finish=395.03
        400772) bash             cpu=14 start=3.81  finish=395.03
          400786) povray_r_base.m  cpu=14 start=3.82  finish=395.03
      400767) sh               cpu=2 start=3.81  finish=393.01
        400773) bash             cpu=15 start=3.81  finish=393.01
          400788) povray_r_base.m  cpu=15 start=3.82  finish=393.01
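
How evenly the copies finish can be checked directly from the tree above. The sketch below assumes that start and finish are seconds since tracing began and, to stay short, copies in only three of the sixteen povray rows (the first, the earliest to finish and the latest to finish). The regular expression and the function name are illustrative, not part of any tool.

```python
import re

# Three of the sixteen povray rows from the tree above.
tree = """\
          400775) povray_r_base.m  cpu=0 start=3.81  finish=393.79
          400777) povray_r_base.m  cpu=2 start=3.81  finish=391.13
          400780) povray_r_base.m  cpu=6 start=3.82  finish=395.96
"""

ROW = re.compile(
    r"(?P<pid>\d+)\)\s+(?P<name>\S+)\s+cpu=(?P<cpu>\d+)"
    r"\s+start=(?P<start>[\d.]+)\s+finish=(?P<finish>[\d.]+)"
)


def run_times(text, name_prefix="povray"):
    """Map cpu number to (finish - start) for rows whose name matches."""
    times = {}
    for match in ROW.finditer(text):
        if match["name"].startswith(name_prefix):
            times[int(match["cpu"])] = float(match["finish"]) - float(match["start"])
    return times


times = run_times(tree)
fastest, slowest = min(times.values()), max(times.values())
print(f"{len(times)} copies, {fastest:.2f}s to {slowest:.2f}s, "
      f"spread {slowest - fastest:.2f}s")
```

Feeding the full tree through the same function would also confirm that each povray copy reports a distinct cpu number from 0 to 15.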