deepsjeng is a SPEC CPU(R) benchmark written in C++ and described here. The workload runs one copy on each of the 16 logical cores.

The topdown profile shows a mixed workload, with backend stalls, frontend stalls, and retiring instructions all taking a significant share.

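AMD metrics on the 7840 confirm the balance: retiring takes 23.6% of slots, frontend stalls 22.9%, and backend stalls 27.3%, with a further 21.9% lost to SMT contention. Most of the backend stalls are memory-bound (23.6% of slots) and most of the frontend stalls are latency-bound (15.0% of slots).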
elapsed 811.223
on_cpu 0.986 # 15.78 / 16 cores
utime 12773.395
stime 28.115
nvcsw 18695 # 14.28%
nivcsw 112210 # 85.72%
inblock 0 # 0.00/sec
onblock 30064 # 37.06/sec
cpu-clock 12802138749308 # 12802.139 seconds
task-clock 12802230919689 # 12802.231 seconds
page faults 9202017 # 718.782/sec
context switches 130342 # 10.181/sec
cpu migrations 155 # 0.012/sec
major page faults 1031 # 0.081/sec
minor page faults 9200986 # 718.702/sec
alignment faults 0 # 0.000/sec
emulation faults 0 # 0.000/sec
branches 9272134301352 # 123.840 branches per 1000 inst
branch misses 370304637120 # 3.99% branch miss
conditional 7301483191614 # 97.520 conditional branches per 1000 inst
indirect 67884778718 # 0.907 indirect branches per 1000 inst
cpu-cycles 52817901328679 # 4.08 GHz
instructions 74866236272888 # 1.42 IPC
slots 105654379952364 #
retiring 24901505427526 # 23.6% (30.2%)
-- ucode 337138071 # 0.0%
-- fastpath 24901168289455 # 23.6%
frontend 24245524942663 # 22.9% (29.4%)
-- latency 15873166991802 # 15.0%
-- bandwidth 8372357950861 # 7.9%
backend 28818020778073 # 27.3% (34.9%)
-- cpu 3887120280115 # 3.7%
-- memory 24930900497958 # 23.6%
speculation 4558465612069 # 4.3% ( 5.5%)
-- branch mispredict 4439266810809 # 4.2%
-- pipeline restart 119198801260 # 0.1%
smt-contention 23130785988881 # 21.9% ( 0.0%)
cpu-cycles 52806133923811 # 4.07 GHz
instructions 74880184852645 # 1.42 IPC
instructions 24960402824569 # 23.537 l2 access per 1000 inst
l2 hit from l1 449020291198 # 4.85% l2 miss
l2 miss from l1 17110339701 #
l2 hit from l2 pf 127086650332 #
l3 hit from l2 pf 1955075744 #
l3 miss from l2 pf 9419542641 #
instructions 24950191292383 # 21.274 float per 1000 inst
float 512 240 # 0.000 AVX-512 per 1000 inst
float 256 6813033600 # 0.273 AVX-256 per 1000 inst
float 128 523978621400 # 21.001 AVX-128 per 1000 inst
float MMX 0 # 0.000 MMX per 1000 inst
float scalar 9 # 0.000 scalar per 1000 inst
instructions 74868153303309 #
opcache 13003747392264 # 173.689 opcache per 1000 inst
opcache miss 2298953449524 # 17.7% opcache miss rate
l1 dTLB miss 33051193481 # 0.441 L1 dTLB per 1000 inst
l2 dTLB miss 17262144796 # 0.231 L2 dTLB per 1000 inst
instructions 74868187361005 #
icache 3249314680210 # 43.400 icache per 1000 inst
icache miss 532103183743 # 16.4% icache miss rate
l1 iTLB miss 139714605 # 0.002 L1 iTLB per 1000 inst
l2 iTLB miss 0 # 0.000 L2 iTLB per 1000 inst
tlb flush 67264 # 0.000 TLB flush per 1000 inst
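The derived values in the comment column can be reproduced from the raw counters. The following is a minimal Python sketch, not the profiling tool's own code: the function names are hypothetical, and the formulas are assumptions inferred from the numbers above. In particular it assumes that the parenthesized topdown percentages renormalize each category after excluding smt-contention slots, and that per-1000-instruction rates use the instructions count reported in the same counter group.

```python
# Raw counters copied from the report above.
ELAPSED, UTIME, STIME = 811.223, 12773.395, 28.115
LOGICAL_CORES = 16
CYCLES = 52817901328679
INSTRUCTIONS = 74866236272888
BRANCHES = 9272134301352
BRANCH_MISSES = 370304637120
OPCACHE_GROUP_INSTRUCTIONS = 74868153303309  # instructions line of the opcache group
OPCACHE = 13003747392264
SLOTS = 105654379952364
TOPDOWN = {
    "retiring": 24901505427526,
    "frontend": 24245524942663,
    "backend": 28818020778073,
    "speculation": 4558465612069,
    "smt-contention": 23130785988881,
}


def on_cpu(utime, stime, elapsed, cores):
    """Fraction of available core time spent on CPU (assumed formula)."""
    return (utime + stime) / elapsed / cores


def per_kilo_inst(count, instructions):
    """Events per 1000 instructions."""
    return 1000.0 * count / instructions


def topdown_share(name, exclude_smt=False):
    """Percent of slots for a category; optionally drop smt-contention slots
    from the denominator (assumed meaning of the parenthesized values)."""
    denom = SLOTS - TOPDOWN["smt-contention"] if exclude_smt else SLOTS
    return 100.0 * TOPDOWN[name] / denom


print(f"on_cpu      {on_cpu(UTIME, STIME, ELAPSED, LOGICAL_CORES):.3f}")
print(f"IPC         {INSTRUCTIONS / CYCLES:.2f}")
print(f"branch miss {100.0 * BRANCH_MISSES / BRANCHES:.2f}%")
print(f"opcache     {per_kilo_inst(OPCACHE, OPCACHE_GROUP_INSTRUCTIONS):.3f} per 1000 inst")
for name in ("retiring", "frontend", "backend", "speculation"):
    print(f"{name:12s}{topdown_share(name):5.1f}% ({topdown_share(name, True):4.1f}%)")
```

Under these assumptions the sketch matches the reported on_cpu, IPC, branch miss rate, opcache rate, and both columns of topdown percentages. The reported GHz figure is not reproduced here because the report does not show which time base it uses.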
The process summary shows that nearly all of the time is spent in deepsjeng_r_bas, which accounts for 48 of the 581 processes.
581 processes
48 deepsjeng_r_bas 12714.00 22.13
69 specperl 9.48 1.49
1 lsb_release 0.01 0.00
11 ps 0.00 0.01
1 clang++ 0.00 0.01
173 sh 0.00 0.00
54 specrxp 0.00 0.00
48 bash 0.00 0.00
41 specinvoke 0.00 0.00
21 grep 0.00 0.00
20 cat 0.00 0.00
12 uniq 0.00 0.00
11 sort 0.00 0.00
10 expand 0.00 0.00
6 pwd 0.00 0.00
5 basename 0.00 0.00
5 specmake 0.00 0.00
5 systemctl 0.00 0.00
4 specpp 0.00 0.00
4 uname 0.00 0.00
3 dirname 0.00 0.00
3 dmidecode 0.00 0.00
3 lscpu 0.00 0.00
2 df 0.00 0.00
2 dpkg 0.00 0.00
2 rm 0.00 0.00
2 runcpu 0.00 0.00
2 specsha512sum 0.00 0.00
2 specxz 0.00 0.00
2 who 0.00 0.00
1 cpupower 0.00 0.00
1 head 0.00 0.00
1 logname 0.00 0.00
1 ls 0.00 0.00
1 numactl 0.00 0.00
1 sysctl 0.00 0.00
1 w 0.00 0.00
1 wc 0.00 0.00
1 which 0.00 0.00
0 processes running
53 maximum processes
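As a rough cross-check, the summary rows can be parsed to get the average time per deepsjeng_r_bas process and its share of the total. This is a minimal sketch: parse_summary is a hypothetical helper, only the top five rows are included, and reading the two numeric columns as user and system CPU seconds is an assumption, since the report does not label them.

```python
# Top rows of the process summary, copied from the report:
# process count, command, and two time columns (assumed user and system seconds).
SUMMARY = """\
48 deepsjeng_r_bas 12714.00 22.13
69 specperl 9.48 1.49
1 lsb_release 0.01 0.00
11 ps 0.00 0.01
1 clang++ 0.00 0.01
"""


def parse_summary(text):
    """Split each row into (count, command, first_time, second_time)."""
    rows = []
    for line in text.splitlines():
        count, command, first, second = line.split()
        rows.append((int(count), command, float(first), float(second)))
    return rows


rows = parse_summary(SUMMARY)
count, command, first, _ = rows[0]
total_first = sum(row[2] for row in rows)

per_process = first / count              # average seconds per process
share = 100.0 * first / total_first      # percent of the first time column

print(f"{command}: {per_process:.3f} s per process, {share:.2f}% of the listed time")
```

The average of about 265 seconds per process lines up with the start and finish times of the individual copies in the launch listing.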
specinvoke fires off one copy on each logical core. All 16 copies start at about 3.11 seconds and finish between 266.93 and 269.42 seconds.
52747) specinvoke cpu=2 start=3.11 finish=269.45
52749) sh cpu=2 start=3.11 finish=268.60
52757) bash cpu=0 start=3.11 finish=268.60
52782) deepsjeng_r_bas cpu=0 start=3.11 finish=268.56
52750) sh cpu=1 start=3.11 finish=266.98
52759) bash cpu=1 start=3.11 finish=266.98
52783) deepsjeng_r_bas cpu=1 start=3.11 finish=266.93
52751) sh cpu=1 start=3.11 finish=267.81
52760) bash cpu=2 start=3.11 finish=267.81
52781) deepsjeng_r_bas cpu=2 start=3.11 finish=267.75
52752) sh cpu=13 start=3.11 finish=268.54
52762) bash cpu=3 start=3.11 finish=268.53
52785) deepsjeng_r_bas cpu=3 start=3.11 finish=268.48
52753) sh cpu=5 start=3.11 finish=269.45
52763) bash cpu=4 start=3.11 finish=269.45
52786) deepsjeng_r_bas cpu=4 start=3.11 finish=269.42
52754) sh cpu=10 start=3.11 finish=267.95
52765) bash cpu=5 start=3.11 finish=267.95
52784) deepsjeng_r_bas cpu=5 start=3.11 finish=267.90
52755) sh cpu=13 start=3.11 finish=268.64
52776) bash cpu=6 start=3.11 finish=268.64
52790) deepsjeng_r_bas cpu=6 start=3.12 finish=268.60
52756) sh cpu=6 start=3.11 finish=268.72
52767) bash cpu=7 start=3.11 finish=268.72
52788) deepsjeng_r_bas cpu=7 start=3.11 finish=268.68
52758) sh cpu=10 start=3.11 finish=268.28
52769) bash cpu=8 start=3.11 finish=268.28
52789) deepsjeng_r_bas cpu=8 start=3.11 finish=268.22
52761) sh cpu=1 start=3.11 finish=267.11
52771) bash cpu=9 start=3.11 finish=267.11
52787) deepsjeng_r_bas cpu=9 start=3.11 finish=267.07
52764) sh cpu=9 start=3.11 finish=267.90
52778) bash cpu=10 start=3.11 finish=267.90
52793) deepsjeng_r_bas cpu=10 start=3.12 finish=267.85
52766) sh cpu=10 start=3.11 finish=268.60
52774) bash cpu=11 start=3.11 finish=268.60
52794) deepsjeng_r_bas cpu=11 start=3.12 finish=268.56
52768) sh cpu=11 start=3.11 finish=269.38
52775) bash cpu=12 start=3.11 finish=269.38
52791) deepsjeng_r_bas cpu=12 start=3.12 finish=269.33
52770) sh cpu=15 start=3.11 finish=268.34
52777) bash cpu=13 start=3.11 finish=268.33
52792) deepsjeng_r_bas cpu=13 start=3.12 finish=268.30
52772) sh cpu=13 start=3.11 finish=268.57
52779) bash cpu=14 start=3.11 finish=268.57
52795) deepsjeng_r_bas cpu=14 start=3.12 finish=268.51
52773) sh cpu=10 start=3.11 finish=268.20
52780) bash cpu=15 start=3.11 finish=268.20
52796) deepsjeng_r_bas cpu=15 start=3.12 finish=268.14
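The listing can be checked mechanically for one copy per logical core and for the spread in finish times. This is a minimal sketch: parse_launches and the regular expression are hypothetical, the leading number is assumed to be a process ID, and only the deepsjeng_r_bas lines are included.

```python
import re

# deepsjeng_r_bas lines copied from the listing above.
LAUNCH_LOG = """\
52782) deepsjeng_r_bas cpu=0 start=3.11 finish=268.56
52783) deepsjeng_r_bas cpu=1 start=3.11 finish=266.93
52781) deepsjeng_r_bas cpu=2 start=3.11 finish=267.75
52785) deepsjeng_r_bas cpu=3 start=3.11 finish=268.48
52786) deepsjeng_r_bas cpu=4 start=3.11 finish=269.42
52784) deepsjeng_r_bas cpu=5 start=3.11 finish=267.90
52790) deepsjeng_r_bas cpu=6 start=3.12 finish=268.60
52788) deepsjeng_r_bas cpu=7 start=3.11 finish=268.68
52789) deepsjeng_r_bas cpu=8 start=3.11 finish=268.22
52787) deepsjeng_r_bas cpu=9 start=3.11 finish=267.07
52793) deepsjeng_r_bas cpu=10 start=3.12 finish=267.85
52794) deepsjeng_r_bas cpu=11 start=3.12 finish=268.56
52791) deepsjeng_r_bas cpu=12 start=3.12 finish=269.33
52792) deepsjeng_r_bas cpu=13 start=3.12 finish=268.30
52795) deepsjeng_r_bas cpu=14 start=3.12 finish=268.51
52796) deepsjeng_r_bas cpu=15 start=3.12 finish=268.14
"""

LINE = re.compile(r"(\d+)\) (\S+) cpu=(\d+) start=([\d.]+) finish=([\d.]+)")


def parse_launches(text):
    """Return one record per line that matches the listing format."""
    records = []
    for line in text.splitlines():
        match = LINE.fullmatch(line.strip())
        if match:
            pid, command, cpu, start, finish = match.groups()
            records.append({
                "pid": int(pid),          # assumed to be a process ID
                "command": command,
                "cpu": int(cpu),
                "start": float(start),
                "finish": float(finish),
            })
    return records


copies = parse_launches(LAUNCH_LOG)
cpus = sorted(copy["cpu"] for copy in copies)
finishes = [copy["finish"] for copy in copies]
spread = max(finishes) - min(finishes)

print(f"{len(copies)} copies on cpus {cpus[0]}..{cpus[-1]}")
print(f"finish times span {min(finishes):.2f} to {max(finishes):.2f} ({spread:.2f} s spread)")
```

A small spread relative to the run time of about 265 seconds suggests the copies progress at nearly the same rate on every logical core.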
