deepsjeng is a SPEC CPU(R) benchmark written in C++ and described here. The workload runs on all logical cores.

The topdown profile shows a workload with a mix of backend stalls, frontend stalls, and retiring instructions.

AMD metrics on the 7840 confirm the balance: retiring takes 23.6% of slots, frontend stalls 22.9%, and backend stalls 27.3%, with a further 21.9% attributed to SMT contention.

elapsed              811.223
on_cpu               0.986          # 15.78 / 16 cores
utime                12773.395
stime                28.115
nvcsw                18695          # 14.28%
nivcsw               112210         # 85.72%
inblock              0              # 0.00/sec
onblock              30064          # 37.06/sec
cpu-clock            12802138749308 # 12802.139 seconds
task-clock           12802230919689 # 12802.231 seconds
page faults          9202017        # 718.782/sec
context switches     130342         # 10.181/sec
cpu migrations       155            # 0.012/sec
major page faults    1031           # 0.081/sec
minor page faults    9200986        # 718.702/sec
alignment faults     0              # 0.000/sec
emulation faults     0              # 0.000/sec
branches             9272134301352  # 123.840 branches per 1000 inst
branch misses        370304637120   # 3.99% branch miss
conditional          7301483191614  # 97.520 conditional branches per 1000 inst
indirect             67884778718    # 0.907 indirect branches per 1000 inst
cpu-cycles           52817901328679 # 4.08 GHz
instructions         74866236272888 # 1.42 IPC
slots                105654379952364 #
retiring             24901505427526 # 23.6% (30.2%)
-- ucode             337138071      #     0.0%
-- fastpath          24901168289455 #    23.6%
frontend             24245524942663 # 22.9% (29.4%)
-- latency           15873166991802 #    15.0%
-- bandwidth         8372357950861  #     7.9%
backend              28818020778073 # 27.3% (34.9%)
-- cpu               3887120280115  #     3.7%
-- memory            24930900497958 #    23.6%
speculation          4558465612069  #  4.3% ( 5.5%)
-- branch mispredict 4439266810809  #     4.2%
-- pipeline restart  119198801260   #     0.1%
smt-contention       23130785988881 # 21.9% ( 0.0%)
cpu-cycles           52806133923811 # 4.07 GHz
instructions         74880184852645 # 1.42 IPC
instructions         24960402824569 # 23.537 l2 access per 1000 inst
l2 hit from l1       449020291198   # 4.85% l2 miss
l2 miss from l1      17110339701    #
l2 hit from l2 pf    127086650332   #
l3 hit from l2 pf    1955075744     #
l3 miss from l2 pf   9419542641     #
instructions         24950191292383 # 21.274 float per 1000 inst
float 512            240            # 0.000 AVX-512 per 1000 inst
float 256            6813033600     # 0.273 AVX-256 per 1000 inst
float 128            523978621400   # 21.001 AVX-128 per 1000 inst
float MMX            0              # 0.000 MMX per 1000 inst
float scalar         9              # 0.000 scalar per 1000 inst
instructions         74868153303309 #
opcache              13003747392264 # 173.689 opcache per 1000 inst
opcache miss         2298953449524  # 17.7% opcache miss rate
l1 dTLB miss         33051193481    # 0.441 L1 dTLB per 1000 inst
l2 dTLB miss         17262144796    # 0.231 L2 dTLB per 1000 inst
instructions         74868187361005 #
icache               3249314680210  # 43.400 icache per 1000 inst
icache miss          532103183743   # 16.4% icache miss rate
l1 iTLB miss         139714605      # 0.002 L1 iTLB per 1000 inst
l2 iTLB miss         0              # 0.000 L2 iTLB per 1000 inst
tlb flush            67264          # 0.000 TLB flush per 1000 inst
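
Several of the derived figures in the report can be reproduced from the raw counters. The sketch below recomputes a few of them in Python. The variable names are illustrative and not part of the profiling tool. Reading the parenthesized topdown percentages as shares of the slots left after removing smt-contention is an assumption, made because it matches the printed values.

```python
# Raw counters copied from the report above.
elapsed = 811.223            # seconds
utime = 12773.395            # seconds
stime = 28.115               # seconds
logical_cores = 16

cycles = 52817901328679
instructions = 74866236272888
branches = 9272134301352
branch_misses = 370304637120

slots = 105654379952364
smt_contention = 23130785988881
topdown = {
    "retiring": 24901505427526,
    "frontend": 24245524942663,
    "backend": 28818020778073,
    "speculation": 4558465612069,
}

# Average number of busy cores, and the fraction of the machine in use.
busy_cores = (utime + stime) / elapsed
on_cpu = busy_cores / logical_cores

# Instructions per cycle and branch misprediction rate.
ipc = instructions / cycles
branch_miss_pct = 100.0 * branch_misses / branches

# Topdown shares of all slots (first percentage column).
share_of_slots = {name: 100.0 * count / slots for name, count in topdown.items()}

# Assumed meaning of the parenthesized column: shares of the slots
# that remain once smt-contention is excluded.
own_slots = slots - smt_contention
share_excl_smt = {name: 100.0 * count / own_slots for name, count in topdown.items()}

print(f"busy cores {busy_cores:.2f}, on_cpu {on_cpu:.3f}, IPC {ipc:.2f}")
print(f"branch miss {branch_miss_pct:.2f}%")
for name in topdown:
    print(f"{name:12s} {share_of_slots[name]:5.1f}% ({share_excl_smt[name]:5.1f}%)")
```

Under that reading, roughly a fifth of the slots are lost to the sibling SMT thread, and the remaining slots split almost evenly between retiring, frontend stalls and backend stalls.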

The process summary shows that nearly all CPU time is spent in deepsjeng_r_bas.

581 processes
	 48 deepsjeng_r_bas      12714.00    22.13
	 69 specperl                 9.48     1.49
	  1 lsb_release              0.01     0.00
	 11 ps                       0.00     0.01
	  1 clang++                  0.00     0.01
	173 sh                       0.00     0.00
	 54 specrxp                  0.00     0.00
	 48 bash                     0.00     0.00
	 41 specinvoke               0.00     0.00
	 21 grep                     0.00     0.00
	 20 cat                      0.00     0.00
	 12 uniq                     0.00     0.00
	 11 sort                     0.00     0.00
	 10 expand                   0.00     0.00
	  6 pwd                      0.00     0.00
	  5 basename                 0.00     0.00
	  5 specmake                 0.00     0.00
	  5 systemctl                0.00     0.00
	  4 specpp                   0.00     0.00
	  4 uname                    0.00     0.00
	  3 dirname                  0.00     0.00
	  3 dmidecode                0.00     0.00
	  3 lscpu                    0.00     0.00
	  2 df                       0.00     0.00
	  2 dpkg                     0.00     0.00
	  2 rm                       0.00     0.00
	  2 runcpu                   0.00     0.00
	  2 specsha512sum            0.00     0.00
	  2 specxz                   0.00     0.00
	  2 who                      0.00     0.00
	  1 cpupower                 0.00     0.00
	  1 head                     0.00     0.00
	  1 logname                  0.00     0.00
	  1 ls                       0.00     0.00
	  1 numactl                  0.00     0.00
	  1 sysctl                   0.00     0.00
	  1 w                        0.00     0.00
	  1 wc                       0.00     0.00
	  1 which                    0.00     0.00
0 processes running
53 maximum processes
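
To quantify how the time is distributed, the summary can be parsed and converted into per-command shares. This is a minimal sketch. It assumes the four columns are process count, command name, user seconds and system seconds, which the summary itself does not label, and the function name `parse_summary` is hypothetical. Only the rows with non-zero time are included, since the others do not change the totals.

```python
# Rows with non-zero time, copied from the summary above.
# Assumed columns: count, command, user seconds, system seconds.
SUMMARY = """\
 48 deepsjeng_r_bas      12714.00    22.13
 69 specperl                 9.48     1.49
  1 lsb_release              0.01     0.00
 11 ps                       0.00     0.01
  1 clang++                  0.00     0.01
"""


def parse_summary(text):
    """Return (command, count, user_seconds, system_seconds) tuples."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # skip anything that is not a four-column data row
        count, command, user, system = fields
        rows.append((command, int(count), float(user), float(system)))
    return rows


rows = parse_summary(SUMMARY)
total_cpu = sum(user + system for _, _, user, system in rows)
shares = {
    command: 100.0 * (user + system) / total_cpu
    for command, _, user, system in rows
}

for command, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{command:18s} {share:6.2f}%")
```

With these rows the benchmark binary accounts for about 99.9% of the measured CPU time, and the SPEC harness (specperl) for most of the remainder.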

specinvoke launches one copy on each of the 16 logical cores.

    52747) specinvoke       cpu=2 start=3.11  finish=269.45
      52749) sh               cpu=2 start=3.11  finish=268.60
        52757) bash             cpu=0 start=3.11  finish=268.60
          52782) deepsjeng_r_bas  cpu=0 start=3.11  finish=268.56
      52750) sh               cpu=1 start=3.11  finish=266.98
        52759) bash             cpu=1 start=3.11  finish=266.98
          52783) deepsjeng_r_bas  cpu=1 start=3.11  finish=266.93
      52751) sh               cpu=1 start=3.11  finish=267.81
        52760) bash             cpu=2 start=3.11  finish=267.81
          52781) deepsjeng_r_bas  cpu=2 start=3.11  finish=267.75
      52752) sh               cpu=13 start=3.11  finish=268.54
        52762) bash             cpu=3 start=3.11  finish=268.53
          52785) deepsjeng_r_bas  cpu=3 start=3.11  finish=268.48
      52753) sh               cpu=5 start=3.11  finish=269.45
        52763) bash             cpu=4 start=3.11  finish=269.45
          52786) deepsjeng_r_bas  cpu=4 start=3.11  finish=269.42
      52754) sh               cpu=10 start=3.11  finish=267.95
        52765) bash             cpu=5 start=3.11  finish=267.95
          52784) deepsjeng_r_bas  cpu=5 start=3.11  finish=267.90
      52755) sh               cpu=13 start=3.11  finish=268.64
        52776) bash             cpu=6 start=3.11  finish=268.64
          52790) deepsjeng_r_bas  cpu=6 start=3.12  finish=268.60
      52756) sh               cpu=6 start=3.11  finish=268.72
        52767) bash             cpu=7 start=3.11  finish=268.72
          52788) deepsjeng_r_bas  cpu=7 start=3.11  finish=268.68
      52758) sh               cpu=10 start=3.11  finish=268.28
        52769) bash             cpu=8 start=3.11  finish=268.28
          52789) deepsjeng_r_bas  cpu=8 start=3.11  finish=268.22
      52761) sh               cpu=1 start=3.11  finish=267.11
        52771) bash             cpu=9 start=3.11  finish=267.11
          52787) deepsjeng_r_bas  cpu=9 start=3.11  finish=267.07
      52764) sh               cpu=9 start=3.11  finish=267.90
        52778) bash             cpu=10 start=3.11  finish=267.90
          52793) deepsjeng_r_bas  cpu=10 start=3.12  finish=267.85
      52766) sh               cpu=10 start=3.11  finish=268.60
        52774) bash             cpu=11 start=3.11  finish=268.60
          52794) deepsjeng_r_bas  cpu=11 start=3.12  finish=268.56
      52768) sh               cpu=11 start=3.11  finish=269.38
        52775) bash             cpu=12 start=3.11  finish=269.38
          52791) deepsjeng_r_bas  cpu=12 start=3.12  finish=269.33
      52770) sh               cpu=15 start=3.11  finish=268.34
        52777) bash             cpu=13 start=3.11  finish=268.33
          52792) deepsjeng_r_bas  cpu=13 start=3.12  finish=268.30
      52772) sh               cpu=13 start=3.11  finish=268.57
        52779) bash             cpu=14 start=3.11  finish=268.57
          52795) deepsjeng_r_bas  cpu=14 start=3.12  finish=268.51
      52773) sh               cpu=10 start=3.11  finish=268.20
        52780) bash             cpu=15 start=3.11  finish=268.20
          52796) deepsjeng_r_bas  cpu=15 start=3.12  finish=268.14
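
The per-copy run times can be extracted from the tree to check how evenly the copies finish. The sketch below is a minimal parser for lines of the form shown above. The regular expression and the name `parse_tree` are illustrative, not part of the tool that produced the listing. The sample holds three of the sixteen deepsjeng_r_bas lines, including the earliest finisher (cpu=1) and the latest (cpu=4).

```python
import re

# Three deepsjeng_r_bas lines copied from the tree above.
TREE = """\
          52782) deepsjeng_r_bas  cpu=0 start=3.11  finish=268.56
          52783) deepsjeng_r_bas  cpu=1 start=3.11  finish=266.93
          52786) deepsjeng_r_bas  cpu=4 start=3.11  finish=269.42
"""

LINE = re.compile(
    r"\s*(?P<pid>\d+)\)\s+(?P<command>\S+)\s+"
    r"cpu=(?P<cpu>\d+)\s+start=(?P<start>[\d.]+)\s+finish=(?P<finish>[\d.]+)"
)


def parse_tree(text):
    """Return one dict per process line, with a computed runtime in seconds."""
    processes = []
    for line in text.splitlines():
        match = LINE.match(line)
        if match is None:
            continue  # ignore lines that do not look like process entries
        start = float(match["start"])
        finish = float(match["finish"])
        processes.append({
            "pid": int(match["pid"]),
            "command": match["command"],
            "cpu": int(match["cpu"]),
            "runtime": finish - start,
        })
    return processes


copies = [p for p in parse_tree(TREE) if p["command"] == "deepsjeng_r_bas"]
runtimes = [p["runtime"] for p in copies]
spread = max(runtimes) - min(runtimes)

for p in copies:
    print(f"cpu {p['cpu']:2d}: {p['runtime']:.2f} s")
print(f"spread between fastest and slowest copy: {spread:.2f} s")
```

The gap between the fastest and slowest copy is about 2.5 seconds on a run of roughly 265 seconds, so the copies finish within about 1% of each other.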