xalancbmk is a SPEC CPU(R) benchmark written in C++. The workload runs on all 16 logical cores.

The topdown profile shows two distinct workload regions; the first has higher backend stalls.

AMD metrics on 7840 show that over 1/4 of instructions are branches (about 268 per 1000 instructions) and that backend memory stalls account for about 43% of pipeline slots.

elapsed              716.422
on_cpu               0.983          # 15.72 / 16 cores
utime                11200.976
stime                62.076
nvcsw                16868          # 14.21%
nivcsw               101844         # 85.79%
inblock              0              # 0.00/sec
onblock              5941288        # 8293.00/sec
cpu-clock            11264034913550 # 11264.035 seconds
task-clock           11264141447655 # 11264.141 seconds
page faults          10479547       # 930.346/sec
context switches     118151         # 10.489/sec
cpu migrations       156            # 0.014/sec
major page faults    1111           # 0.099/sec
minor page faults    10478436       # 930.247/sec
alignment faults     0              # 0.000/sec
emulation faults     0              # 0.000/sec
branches             10322703995232 # 267.659 branches per 1000 inst
branch misses        34592086095    # 0.34% branch miss
conditional          9047084912417  # 234.583 conditional branches per 1000 inst
indirect             287264629665   # 7.449 indirect branches per 1000 inst
cpu-cycles           49888709082342 # 4.32 GHz
instructions         38564460869272 # 0.77 IPC
slots                99786843927960 #
retiring             12243245308066 # 12.3% (19.0%)
-- ucode             71983786453    #     0.1%
-- fastpath          12171261521613 #    12.2%
frontend             5763112454235  #  5.8% ( 8.9%)
-- latency           3553925492466  #     3.6%
-- bandwidth         2209186961769  #     2.2%
backend              45856010835439 # 46.0% (71.1%) high
-- cpu               2998476011384  #     3.0%
-- memory            42857534824055 #    42.9%
speculation          590816854271   #  0.6% ( 0.9%) low
-- branch mispredict 530059406767   #     0.5%
-- pipeline restart  60757447504    #     0.1%
smt-contention       35333602096762 # 35.4% ( 0.0%)
cpu-cycles           49821898939519 # 4.31 GHz
instructions         38564401740635 # 0.77 IPC
instructions         12856131091504 # 75.229 l2 access per 1000 inst
l2 hit from l1       817320093125   # 15.96% l2 miss
l2 miss from l1      49304847483    #
l2 hit from l2 pf    44749019911    #
l3 hit from l2 pf    33478037235    #
l3 miss from l2 pf   71608977929    #
instructions         12853296360811 # 34.341 float per 1000 inst
float 512            183            # 0.000 AVX-512 per 1000 inst
float 256            334329         # 0.000 AVX-256 per 1000 inst
float 128            441397062278   # 34.341 AVX-128 per 1000 inst
float MMX            0              # 0.000 MMX per 1000 inst
float scalar         7              # 0.000 scalar per 1000 inst
instructions         38562362441417 #
opcache              5403832937953  # 140.132 opcache per 1000 inst
opcache miss         214599945034   #  4.0% opcache miss rate
l1 dTLB miss         245187730687   # 6.358 L1 dTLB per 1000 inst
l2 dTLB miss         7784051070     # 0.202 L2 dTLB per 1000 inst
instructions         38561867880756 #
icache               295547014102   # 7.664 icache per 1000 inst
icache miss          61133657855    # 20.7% icache miss rate
l1 iTLB miss         61677389585    # 1.599 L1 iTLB per 1000 inst
l2 iTLB miss         0              # 0.000 L2 iTLB per 1000 inst
tlb flush            87745          # 0.000 TLB flush per 1000 inst
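
The derived values in the right-hand column can be reproduced from the raw counters. The sketch below is a minimal illustration, not the profiler's actual code: the variable names are made up for this example, and the reading of the parenthesized topdown column (each category rescaled against slots not lost to SMT contention) is an assumption that happens to match the printed numbers.

```python
# Raw values copied from the report above.
elapsed, utime, stime = 716.422, 11200.976, 62.076
logical_cores = 16

instructions = 38564460869272
cycles = 49888709082342
branches = 10322703995232
branch_misses = 34592086095

slots = 99786843927960
topdown = {
    "retiring": 12243245308066,
    "frontend": 5763112454235,
    "backend": 45856010835439,
    "speculation": 590816854271,
}
backend_memory = 42857534824055
smt_contention = 35333602096762


def pct(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# CPU utilization: total CPU seconds divided by wall-clock seconds.
cores_busy = (utime + stime) / elapsed
on_cpu = cores_busy / logical_cores

# Instruction mix and pipeline efficiency.
ipc = instructions / cycles
branches_per_kinst = 1000.0 * branches / instructions
branch_miss_pct = pct(branch_misses, branches)
memory_pct = pct(backend_memory, slots)

# Assumption: the parenthesized column excludes SMT-contended slots
# from the denominator.
usable_slots = slots - smt_contention
rescaled = {name: pct(count, usable_slots) for name, count in topdown.items()}

print(f"cores busy       {cores_busy:.2f} of {logical_cores} (on_cpu {on_cpu:.3f})")
print(f"IPC              {ipc:.2f}")
print(f"branches/1000    {branches_per_kinst:.1f}")
print(f"branch miss      {branch_miss_pct:.2f}%")
print(f"backend memory   {memory_pct:.1f}% of slots")
for name, value in rescaled.items():
    print(f"{name:<16} {pct(topdown[name], slots):.1f}% ({value:.1f}%)")
```

Small differences in the last digit against the report (for example in branches per 1000 instructions) are expected, since the report contains several slightly different instruction counts and it is not stated which one each ratio uses.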

Process overview shows computation primarily in cpuxalan_r_base

581 processes
	 48 cpuxalan_r_base      11171.98    53.53
	 69 specperl                10.35     3.47
	  1 clang++                  0.01     0.00
	  1 lsb_release              0.01     0.00
	 11 ps                       0.00     0.01
	173 sh                       0.00     0.00
	 54 specrxp                  0.00     0.00
	 48 bash                     0.00     0.00
	 41 specinvoke               0.00     0.00
	 21 grep                     0.00     0.00
	 20 cat                      0.00     0.00
	 12 uniq                     0.00     0.00
	 11 sort                     0.00     0.00
	 10 expand                   0.00     0.00
	  6 pwd                      0.00     0.00
	  5 basename                 0.00     0.00
	  5 specmake                 0.00     0.00
	  5 systemctl                0.00     0.00
	  4 specpp                   0.00     0.00
	  4 uname                    0.00     0.00
	  3 dirname                  0.00     0.00
	  3 dmidecode                0.00     0.00
	  3 lscpu                    0.00     0.00
	  2 df                       0.00     0.00
	  2 dpkg                     0.00     0.00
	  2 rm                       0.00     0.00
	  2 runcpu                   0.00     0.00
	  2 specsha512sum            0.00     0.00
	  2 specxz                   0.00     0.00
	  2 who                      0.00     0.00
	  1 cpupower                 0.00     0.00
	  1 head                     0.00     0.00
	  1 logname                  0.00     0.00
	  1 ls                       0.00     0.00
	  1 numactl                  0.00     0.00
	  1 sysctl                   0.00     0.00
	  1 w                        0.00     0.00
	  1 wc                       0.00     0.00
	  1 which                    0.00     0.00
0 processes running
53 maximum processes

specinvoke launches a separate copy of cpuxalan_r_base on each logical core.

    47048) specinvoke       cpu=8 start=3.51  finish=239.04
      47050) sh               cpu=2 start=3.51  finish=235.55
        47056) bash             cpu=0 start=3.51  finish=235.54
          47082) cpuxalan_r_base  cpu=0 start=3.51  finish=235.48
      47051) sh               cpu=3 start=3.51  finish=237.87
        47058) bash             cpu=1 start=3.51  finish=237.87
          47083) cpuxalan_r_base  cpu=1 start=3.51  finish=237.83
      47052) sh               cpu=2 start=3.51  finish=235.00
        47059) bash             cpu=2 start=3.51  finish=235.00
          47080) cpuxalan_r_base  cpu=2 start=3.51  finish=234.93
      47053) sh               cpu=0 start=3.51  finish=236.37
        47060) bash             cpu=3 start=3.51  finish=236.37
          47081) cpuxalan_r_base  cpu=3 start=3.51  finish=236.30
      47054) sh               cpu=8 start=3.51  finish=236.94
        47068) bash             cpu=4 start=3.51  finish=236.94
          47090) cpuxalan_r_base  cpu=4 start=3.52  finish=236.88
      47055) sh               cpu=4 start=3.51  finish=238.19
        47066) bash             cpu=5 start=3.51  finish=238.19
          47088) cpuxalan_r_base  cpu=5 start=3.52  finish=238.15
      47057) sh               cpu=8 start=3.51  finish=236.78
        47063) bash             cpu=6 start=3.51  finish=236.77
          47086) cpuxalan_r_base  cpu=6 start=3.52  finish=236.71
      47061) sh               cpu=0 start=3.51  finish=239.04
        47067) bash             cpu=7 start=3.51  finish=239.04
          47087) cpuxalan_r_base  cpu=7 start=3.52  finish=239.01
      47062) sh               cpu=8 start=3.51  finish=235.89
        47072) bash             cpu=8 start=3.51  finish=235.88
          47093) cpuxalan_r_base  cpu=8 start=3.52  finish=235.83
      47064) sh               cpu=8 start=3.51  finish=236.63
        47071) bash             cpu=9 start=3.51  finish=236.63
          47089) cpuxalan_r_base  cpu=9 start=3.52  finish=236.56
      47065) sh               cpu=12 start=3.51  finish=237.21
        47075) bash             cpu=10 start=3.51  finish=237.21
          47091) cpuxalan_r_base  cpu=10 start=3.52  finish=237.16
      47069) sh               cpu=15 start=3.51  finish=237.53
        47077) bash             cpu=11 start=3.51  finish=237.53
          47092) cpuxalan_r_base  cpu=11 start=3.52  finish=237.49
      47070) sh               cpu=0 start=3.51  finish=236.97
        47078) bash             cpu=12 start=3.51  finish=236.97
          47094) cpuxalan_r_base  cpu=12 start=3.52  finish=236.92
      47073) sh               cpu=15 start=3.51  finish=237.90
        47079) bash             cpu=13 start=3.51  finish=237.90
          47096) cpuxalan_r_base  cpu=13 start=3.52  finish=237.84
      47074) sh               cpu=12 start=3.51  finish=237.30
        47085) bash             cpu=14 start=3.51  finish=237.30
          47097) cpuxalan_r_base  cpu=14 start=3.52  finish=237.26
      47076) sh               cpu=0 start=3.51  finish=237.23
        47084) bash             cpu=15 start=3.51  finish=237.23
          47095) cpuxalan_r_base  cpu=15 start=3.52  finish=237.17
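
To compare how long each copy ran, the process tree can be parsed with a small script. This is a sketch that assumes the line layout shown above (`pid) name cpu=N start=S finish=F`); the function names `parse_tree` and `copy_runtimes` are hypothetical, and only four lines of the listing are embedded to keep the example short.

```python
import re

# A few lines copied from the process tree above; paste the full
# listing here to cover all 16 copies.
TREE = """\
          47082) cpuxalan_r_base  cpu=0 start=3.51  finish=235.48
          47083) cpuxalan_r_base  cpu=1 start=3.51  finish=237.83
          47080) cpuxalan_r_base  cpu=2 start=3.51  finish=234.93
          47087) cpuxalan_r_base  cpu=7 start=3.52  finish=239.01
"""

# Matches "pid) name cpu=N start=S finish=F" at any indentation.
LINE = re.compile(
    r"(\d+)\)\s+(\S+)\s+cpu=(\d+)\s+start=([\d.]+)\s+finish=([\d.]+)"
)


def parse_tree(text):
    """Return one dict per process line found in the tree listing."""
    rows = []
    for line in text.splitlines():
        match = LINE.search(line)
        if match:
            pid, name, cpu, start, finish = match.groups()
            rows.append({
                "pid": int(pid),
                "name": name,
                "cpu": int(cpu),
                "runtime": float(finish) - float(start),
            })
    return rows


def copy_runtimes(text, name="cpuxalan_r_base"):
    """Map logical CPU number to the runtime (seconds) of the named process."""
    return {r["cpu"]: r["runtime"] for r in parse_tree(text) if r["name"] == name}


runtimes = copy_runtimes(TREE)
fastest = min(runtimes, key=runtimes.get)
slowest = max(runtimes, key=runtimes.get)
spread = runtimes[slowest] - runtimes[fastest]

for cpu in sorted(runtimes):
    print(f"cpu {cpu:2d}: {runtimes[cpu]:.2f} s")
print(f"fastest cpu {fastest}, slowest cpu {slowest}, spread {spread:.2f} s")
```

In the listing above, the copy on cpu 2 finishes first and the copy on cpu 7 finishes last, about 4 seconds apart over a run of roughly 230 seconds, so the 16 copies stay closely balanced.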