cam4 is a SPEC CPU(R) benchmark written in Fortran and C. The workload runs on all logical cores.

The topdown profile shows that the workload is dominated by backend stalls, although the profile varies over time.

AMD metrics confirm this: about 40% of slots are memory bound and about 20% are CPU bound. There are only about 60 L2 accesses per 1000 instructions, with a miss rate of about 20%.
elapsed 1312.508
on_cpu 0.988 # 15.81 / 16 cores
utime 20528.728
stime 225.953
nvcsw 29409 # 12.94%
nivcsw 197920 # 87.06%
inblock 9840 # 7.50/sec
onblock 1228368 # 935.89/sec
cpu-clock 20757774966166 # 20757.775 seconds
task-clock 20757987236593 # 20757.987 seconds
page faults 75899772 # 3656.413/sec
context switches 226656 # 10.919/sec
cpu migrations 203 # 0.010/sec
major page faults 1395 # 0.067/sec
minor page faults 75898377 # 3656.346/sec
alignment faults 0 # 0.000/sec
emulation faults 0 # 0.000/sec
branches 9015331689309 # 124.383 branches per 1000 inst
branch misses 103625466297 # 1.15% branch miss
conditional 6474016811013 # 89.321 conditional branches per 1000 inst
indirect 635818303619 # 8.772 indirect branches per 1000 inst
cpu-cycles 85797916011836 # 4.07 GHz
instructions 72472521907425 # 0.84 IPC
slots 171567233970366 #
retiring 24958749784066 # 14.5% (16.4%)
-- ucode 5475009509 # 0.0%
-- fastpath 24953274774557 # 14.5%
frontend 24472498750726 # 14.3% (16.0%)
-- latency 15983869130694 # 9.3%
-- bandwidth 8488629620032 # 4.9%
backend 102011177142788 # 59.5% (66.9%)
-- cpu 33453535278844 # 19.5%
-- memory 68557641863944 # 40.0%
speculation 1135201453063 # 0.7% ( 0.7%) low
-- branch mispredict 1086579188867 # 0.6%
-- pipeline restart 48622264196 # 0.0%
smt-contention 18989527666963 # 11.1% ( 0.0%)
cpu-cycles 86324947438235 # 4.07 GHz
instructions 72455714789886 # 0.84 IPC
instructions 24154762289200 # 62.695 l2 access per 1000 inst
l2 hit from l1 1211423015562 # 20.78% l2 miss
l2 miss from l1 176222621478 #
l2 hit from l2 pf 164446062366 #
l3 hit from l2 pf 60313265001 #
l3 miss from l2 pf 78199571898 #
instructions 24154893211351 # 189.588 float per 1000 inst
float 512 299 # 0.000 AVX-512 per 1000 inst
float 256 23880669948 # 0.989 AVX-256 per 1000 inst
float 128 4555597500473 # 188.599 AVX-128 per 1000 inst
float MMX 0 # 0.000 MMX per 1000 inst
float scalar 0 # 0.000 scalar per 1000 inst
instructions 72455795340647 #
opcache 11296118454852 # 155.904 opcache per 1000 inst
opcache miss 898107672809 # 8.0% opcache miss rate
l1 dTLB miss 224971601225 # 3.105 L1 dTLB per 1000 inst
l2 dTLB miss 9665338857 # 0.133 L2 dTLB per 1000 inst
instructions 72441840668917 #
icache 1522351345670 # 21.015 icache per 1000 inst
icache miss 394640134744 # 25.9% icache miss rate
l1 iTLB miss 13022382440 # 0.180 L1 iTLB per 1000 inst
l2 iTLB miss 0 # 0.000 L2 iTLB per 1000 inst
tlb flush 260298 # 0.000 TLB flush per 1000 inst
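
As a cross-check, several of the derived figures in the dump above can be recomputed from the raw counters. The sketch below is not the profiling tool's own code; the formulas are assumptions that happen to reproduce the printed numbers. In particular, it assumes that the parenthesized topdown percentages are the same slot counts renormalized after excluding SMT contention.

```python
# Values copied from the counter dump above.
elapsed = 1312.508   # wall-clock seconds
utime = 20528.728    # user CPU seconds
stime = 225.953      # system CPU seconds
cores = 16           # logical cores, from the "/ 16 cores" comment

# Average number of busy cores, and the fraction of all cores kept busy.
busy_cores = (utime + stime) / elapsed
on_cpu = busy_cores / cores

# Instructions per cycle from the first counter group.
cycles = 85797916011836
instructions = 72472521907425
ipc = instructions / cycles

# Top-level topdown categories, in pipeline slots.
slots = 171567233970366
topdown = {
    "retiring": 24958749784066,
    "frontend": 24472498750726,
    "backend": 102011177142788,
    "speculation": 1135201453063,
    "smt-contention": 18989527666963,
}


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Share of all slots (the first percentage on each line).
raw = {name: percent(count, slots) for name, count in topdown.items()}

# Assumption: the parenthesized value excludes slots lost to SMT contention.
non_smt_slots = slots - topdown["smt-contention"]
adjusted = {
    name: percent(count, non_smt_slots)
    for name, count in topdown.items()
    if name != "smt-contention"
}

print(f"on_cpu {on_cpu:.3f} ({busy_cores:.2f} / {cores} cores)")
print(f"IPC {ipc:.2f}")
for name in adjusted:
    print(f"{name:12s} {raw[name]:5.1f}% ({adjusted[name]:5.1f}%)")
```

Under these assumptions the backend share comes out at 59.5% of all slots and 66.9% once SMT contention is excluded, matching the dump.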
The process overview shows that almost all CPU time is spent in cam4_r_base.mev.
693 processes
48 cam4_r_base.mev 20627.37 219.18
71 specperl 13.77 2.95
48 cam4_validate_5 1.01 0.41
2 clang 0.01 0.01
2 flang 0.01 0.01
1 lsb_release 0.01 0.00
11 ps 0.00 0.01
226 sh 0.00 0.00
54 specrxp 0.00 0.00
48 bash 0.00 0.00
41 specinvoke 0.00 0.00
22 cat 0.00 0.00
21 grep 0.00 0.00
12 uniq 0.00 0.00
11 sort 0.00 0.00
10 expand 0.00 0.00
7 specmake 0.00 0.00
6 pwd 0.00 0.00
5 basename 0.00 0.00
5 systemctl 0.00 0.00
4 rm 0.00 0.00
4 specpp 0.00 0.00
4 uname 0.00 0.00
3 dirname 0.00 0.00
3 dmidecode 0.00 0.00
3 lscpu 0.00 0.00
2 df 0.00 0.00
2 dpkg 0.00 0.00
2 runcpu 0.00 0.00
2 specsha512sum 0.00 0.00
2 specxz 0.00 0.00
2 who 0.00 0.00
1 cpupower 0.00 0.00
1 head 0.00 0.00
1 logname 0.00 0.00
1 ls 0.00 0.00
1 numactl 0.00 0.00
1 sysctl 0.00 0.00
1 w 0.00 0.00
1 wc 0.00 0.00
1 which 0.00 0.00
0 processes running
53 maximum processes
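
To put a number on how much of the run belongs to cam4_r_base.mev, the rows above can be summed. This is a rough sketch that assumes the two numeric columns are user and system CPU seconds per command (the listing has no header, so that reading is an assumption) and that the first column is a process count. Only rows with nonzero time are included.

```python
# (process count, command, assumed user seconds, assumed system seconds)
rows = [
    (48, "cam4_r_base.mev", 20627.37, 219.18),
    (71, "specperl", 13.77, 2.95),
    (48, "cam4_validate_5", 1.01, 0.41),
    (2, "clang", 0.01, 0.01),
    (2, "flang", 0.01, 0.01),
    (1, "lsb_release", 0.01, 0.00),
    (11, "ps", 0.00, 0.01),
]

# Total CPU time across all listed commands.
total = sum(user + system for _, _, user, system in rows)

# CPU time and process count for the benchmark binary itself.
count, _, cam4_user, cam4_system = rows[0]
cam4_share = (cam4_user + cam4_system) / total

# Average user time per cam4 process, for comparison with per-copy runtimes.
avg_user_per_process = cam4_user / count

print(f"cam4_r_base.mev share of CPU time: {100 * cam4_share:.2f}%")
print(f"average user seconds per cam4 process: {avg_user_per_process:.2f}")
```

The average of roughly 430 user seconds per process is in line with the per-copy run times in the specinvoke listing, which lends some support to the assumed column meanings.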
specinvoke launches a separate copy of the benchmark on each core.
440422) specinvoke cpu=14 start=4.62 finish=438.07
440424) sh cpu=13 start=4.62 finish=435.22
440430) bash cpu=0 start=4.63 finish=435.22
440455) cam4_r_base.mev cpu=0 start=4.63 finish=435.11
440425) sh cpu=8 start=4.63 finish=434.75
440431) bash cpu=1 start=4.63 finish=434.75
440456) cam4_r_base.mev cpu=1 start=4.63 finish=434.62
440426) sh cpu=10 start=4.63 finish=436.92
440433) bash cpu=2 start=4.63 finish=436.91
440457) cam4_r_base.mev cpu=2 start=4.63 finish=436.84
440427) sh cpu=9 start=4.63 finish=437.71
440437) bash cpu=3 start=4.63 finish=437.71
440458) cam4_r_base.mev cpu=3 start=4.63 finish=437.64
440428) sh cpu=4 start=4.63 finish=438.07
440438) bash cpu=4 start=4.63 finish=438.07
440462) cam4_r_base.mev cpu=4 start=4.63 finish=437.99
440429) sh cpu=13 start=4.63 finish=436.44
440440) bash cpu=5 start=4.63 finish=436.44
440459) cam4_r_base.mev cpu=5 start=4.63 finish=436.37
440432) sh cpu=14 start=4.63 finish=435.07
440439) bash cpu=6 start=4.63 finish=435.07
440460) cam4_r_base.mev cpu=6 start=4.63 finish=434.98
440434) sh cpu=7 start=4.63 finish=437.53
440443) bash cpu=7 start=4.63 finish=437.53
440464) cam4_r_base.mev cpu=7 start=4.63 finish=437.43
440435) sh cpu=8 start=4.63 finish=432.12
440446) bash cpu=8 start=4.63 finish=432.12
440463) cam4_r_base.mev cpu=8 start=4.63 finish=431.97
440436) sh cpu=1 start=4.63 finish=435.17
440448) bash cpu=9 start=4.63 finish=435.17
440466) cam4_r_base.mev cpu=9 start=4.63 finish=435.07
440441) sh cpu=9 start=4.63 finish=436.55
440450) bash cpu=10 start=4.63 finish=436.55
440465) cam4_r_base.mev cpu=10 start=4.63 finish=436.46
440442) sh cpu=5 start=4.63 finish=437.18
440451) bash cpu=11 start=4.63 finish=437.18
440469) cam4_r_base.mev cpu=11 start=4.63 finish=437.08
440444) sh cpu=12 start=4.63 finish=438.04
440452) bash cpu=12 start=4.63 finish=438.04
440468) cam4_r_base.mev cpu=12 start=4.63 finish=437.94
440445) sh cpu=14 start=4.63 finish=435.20
440453) bash cpu=13 start=4.63 finish=435.20
440467) cam4_r_base.mev cpu=13 start=4.63 finish=435.07
440447) sh cpu=8 start=4.63 finish=432.74
440454) bash cpu=14 start=4.63 finish=432.74
440470) cam4_r_base.mev cpu=14 start=4.63 finish=432.62
440449) sh cpu=15 start=4.63 finish=437.93
440461) bash cpu=15 start=4.63 finish=437.92
440471) cam4_r_base.mev cpu=15 start=4.63 finish=437.86
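
The listing above can be summarized by parsing the cam4_r_base.mev lines and comparing run times across copies. The sketch below is only an illustration: the regular expression and the `parse` helper are hypothetical, written against the line format shown here, and are not part of any SPEC or profiling tool.

```python
import re

# The cam4_r_base.mev lines copied from the listing above.
listing = """\
440455) cam4_r_base.mev cpu=0 start=4.63 finish=435.11
440456) cam4_r_base.mev cpu=1 start=4.63 finish=434.62
440457) cam4_r_base.mev cpu=2 start=4.63 finish=436.84
440458) cam4_r_base.mev cpu=3 start=4.63 finish=437.64
440462) cam4_r_base.mev cpu=4 start=4.63 finish=437.99
440459) cam4_r_base.mev cpu=5 start=4.63 finish=436.37
440460) cam4_r_base.mev cpu=6 start=4.63 finish=434.98
440464) cam4_r_base.mev cpu=7 start=4.63 finish=437.43
440463) cam4_r_base.mev cpu=8 start=4.63 finish=431.97
440466) cam4_r_base.mev cpu=9 start=4.63 finish=435.07
440465) cam4_r_base.mev cpu=10 start=4.63 finish=436.46
440469) cam4_r_base.mev cpu=11 start=4.63 finish=437.08
440468) cam4_r_base.mev cpu=12 start=4.63 finish=437.94
440467) cam4_r_base.mev cpu=13 start=4.63 finish=435.07
440470) cam4_r_base.mev cpu=14 start=4.63 finish=432.62
440471) cam4_r_base.mev cpu=15 start=4.63 finish=437.86
"""

# Hypothetical pattern for one "pid) command cpu=N start=S finish=F" line.
LINE = re.compile(
    r"^(?P<pid>\d+)\)\s+(?P<cmd>\S+)\s+cpu=(?P<cpu>\d+)"
    r"\s+start=(?P<start>[\d.]+)\s+finish=(?P<finish>[\d.]+)$"
)


def parse(text):
    """Return a list of (cpu, runtime_seconds) for each matching line."""
    copies = []
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:
            runtime = float(match["finish"]) - float(match["start"])
            copies.append((int(match["cpu"]), runtime))
    return copies


copies = parse(listing)
fastest = min(copies, key=lambda item: item[1])
slowest = max(copies, key=lambda item: item[1])
spread = slowest[1] - fastest[1]

print(f"{len(copies)} copies")
print(f"fastest: cpu {fastest[0]} at {fastest[1]:.2f} s")
print(f"slowest: cpu {slowest[0]} at {slowest[1]:.2f} s")
print(f"spread: {spread:.2f} s")
```

For this listing the copies finish within about six seconds of each other, so the load is spread evenly across the 16 cores.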
