cactuBSSN is a SPEC CPU(R) benchmark described here and written in C, C++ and Fortran. The workload runs on all logical cores.

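The topdown profile shows this is a backend-bound workload: about 88% of pipeline slots are lost to backend stalls, and most of those (78.5% of all slots) are memory stalls.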

AMD metrics confirm the benchmark is memory bound. There are only ~55 floating-point instructions and ~49 branches per 1000 instructions. L2 traffic is heavy, at ~476 accesses per 1000 instructions, and ~10.8% of those accesses miss.

elapsed              609.922
on_cpu               0.981          # 15.69 / 16 cores
utime                9531.490
stime                37.821
nvcsw                15443          # 11.89%
nivcsw               114431         # 88.11%
inblock              0              # 0.00/sec
onblock              302096         # 495.30/sec
cpu-clock            9571161124089  # 9571.161 seconds
task-clock           9571308947593  # 9571.309 seconds
page faults          10865926       # 1135.260/sec
context switches     129126         # 13.491/sec
cpu migrations       250            # 0.026/sec
major page faults    1384           # 0.145/sec
minor page faults    10864542       # 1135.116/sec
alignment faults     0              # 0.000/sec
emulation faults     0              # 0.000/sec
branches             458027626592   # 48.787 branches per 1000 inst
branch misses        2930723831     # 0.64% branch miss
conditional          313602888445   # 33.403 conditional branches per 1000 inst
indirect             35057805161    # 3.734 indirect branches per 1000 inst
cpu-cycles           41339157355858 # 4.21 GHz
instructions         9387529160935  # 0.23 IPC low
slots                82669018078590 #
retiring             3387165858961  #  4.1% ( 4.3%) low
-- ucode             385306245      #     0.0%
-- fastpath          3386780552716  #     4.1%
frontend             3820302772231  #  4.6% ( 4.8%) low
-- latency           2938260448878  #     3.6%
-- bandwidth         882042323353   #     1.1%
backend              72418746717494 # 87.6% (90.9%) high
-- cpu               7511969518602  #     9.1%
-- memory            64906777198892 #    78.5%
speculation          54605694491    #  0.1% ( 0.1%) low
-- branch mispredict 33703731722    #     0.0%
-- pipeline restart  20901962769    #     0.0%
smt-contention       2988168305306  #  3.6% ( 0.0%)
cpu-cycles           41172189685093 # 4.21 GHz
instructions         9393665948727  # 0.23 IPC low
instructions         3125620100025  # 476.467 l2 access per 1000 inst
l2 hit from l1       1102457333145  # 10.76% l2 miss
l2 miss from l1      96542777657    #
l2 hit from l2 pf    323167259778   #
l3 hit from l2 pf    24513021077    #
l3 miss from l2 pf   39118477073    #
instructions         3128526163741  # 55.472 float per 1000 inst
float 512            308            # 0.000 AVX-512 per 1000 inst
float 256            1236272        # 0.000 AVX-256 per 1000 inst
float 128            173544139570   # 55.472 AVX-128 per 1000 inst
float MMX            0              # 0.000 MMX per 1000 inst
float scalar         1              # 0.000 scalar per 1000 inst
instructions         9383775361054  #
opcache              1267171993600  # 135.039 opcache per 1000 inst
opcache miss         604865222155   # 47.7% opcache miss rate
l1 dTLB miss         534977958244   # 57.011 L1 dTLB per 1000 inst
l2 dTLB miss         7159514212     # 0.763 L2 dTLB per 1000 inst
instructions         9383603598466  #
icache               648567980402   # 69.117 icache per 1000 inst
icache miss          271389699675   # 41.8% icache miss rate
l1 iTLB miss         569831957      # 0.061 L1 iTLB per 1000 inst
l2 iTLB miss         0              # 0.000 L2 iTLB per 1000 inst
tlb flush            106503         # 0.000 TLB flush per 1000 inst
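
The derived figures in the dump can be recomputed from the raw counters. The sketch below is a minimal illustration in Python, not the profiling tool's own code. The variable names are hypothetical, the reading of the parenthesized topdown value as "excluding SMT-contention slots" is an assumption, and the L2 grouping is simply one that reproduces the reported 476.467 and 10.76% values.

```python
# Raw counters copied from the report above.
elapsed, utime, stime, cores = 609.922, 9531.490, 37.821, 16
cycles, instructions = 41339157355858, 9387529160935
slots = 82669018078590
backend, backend_memory = 72418746717494, 64906777198892
smt_contention = 2988168305306

# Fraction of the available core time the workload spent on-CPU.
busy_cores = (utime + stime) / elapsed
on_cpu = busy_cores / cores

# Instructions retired per cycle.
ipc = instructions / cycles

# Topdown shares as a percentage of all pipeline slots.
backend_pct = 100 * backend / slots
memory_pct = 100 * backend_memory / slots
# Assumption: the parenthesized value excludes SMT-contention slots.
backend_pct_no_smt = 100 * backend / (slots - smt_contention)

# L2 counter group (it has its own instruction count in the dump).
l2_group_inst = 3125620100025
hit_from_l1 = 1102457333145
miss_from_l1 = 96542777657
pf_hit_l2 = 323167259778
pf_hit_l3 = 24513021077
pf_miss_l3 = 39118477073

# Assumption: this grouping is inferred from the reported numbers.
l2_access = hit_from_l1 + pf_hit_l2 + pf_hit_l3 + pf_miss_l3
l2_miss = miss_from_l1 + pf_hit_l3 + pf_miss_l3
l2_access_per_kinst = 1000 * l2_access / l2_group_inst
l2_miss_pct = 100 * l2_miss / l2_access

print(f"on_cpu  {on_cpu:.3f} ({busy_cores:.2f} / {cores} cores)")
print(f"IPC     {ipc:.2f}")
print(f"backend {backend_pct:.1f}% ({backend_pct_no_smt:.1f}%), memory {memory_pct:.1f}%")
print(f"L2      {l2_access_per_kinst:.3f} access per 1000 inst, {l2_miss_pct:.2f}% miss")
```

With the counters above, each printed value agrees with the corresponding figure in the dump to the precision shown there.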

The process overview shows cactusBSSN_r_ba as the dominant process; the remaining entries are SPEC harness and system utilities with negligible CPU time.

775 processes
	 48 cactusBSSN_r_ba       9598.22    27.75
	165 specperl                25.72     3.22
	 41 specinvoke               0.01     0.00
	  1 clang++                  0.01     0.00
	  1 flang                    0.01     0.00
	  1 lsb_release              0.01     0.00
	  1 clang                    0.00     0.01
	270 sh                       0.00     0.00
	 54 specrxp                  0.00     0.00
	 48 bash                     0.00     0.00
	 21 grep                     0.00     0.00
	 20 cat                      0.00     0.00
	 12 uniq                     0.00     0.00
	 11 sort                     0.00     0.00
	 10 expand                   0.00     0.00
	 10 ps                       0.00     0.00
	  6 pwd                      0.00     0.00
	  5 basename                 0.00     0.00
	  5 specmake                 0.00     0.00
	  5 systemctl                0.00     0.00
	  4 specpp                   0.00     0.00
	  4 uname                    0.00     0.00
	  3 dirname                  0.00     0.00
	  3 dmidecode                0.00     0.00
	  3 lscpu                    0.00     0.00
	  2 df                       0.00     0.00
	  2 dpkg                     0.00     0.00
	  2 rm                       0.00     0.00
	  2 runcpu                   0.00     0.00
	  2 specsha512sum            0.00     0.00
	  2 specxz                   0.00     0.00
	  2 who                      0.00     0.00
	  1 cpupower                 0.00     0.00
	  1 head                     0.00     0.00
	  1 logname                  0.00     0.00
	  1 ls                       0.00     0.00
	  1 numactl                  0.00     0.00
	  1 sysctl                   0.00     0.00
	  1 w                        0.00     0.00
	  1 wc                       0.00     0.00
	  1 which                    0.00     0.00
0 processes running
53 maximum processes

Computation blocks are simple: the SPEC harness invokes one copy on each logical core. All 16 copies start together and finish within about 3.5 seconds of one another.

    371157) specinvoke       cpu=10 start=3.69  finish=206.29
      371159) sh               cpu=5 start=3.69  finish=205.12
        371166) bash             cpu=0 start=3.69  finish=205.12
          371191) cactusBSSN_r_ba  cpu=0 start=3.69  finish=205.05
      371160) sh               cpu=9 start=3.69  finish=204.08
        371168) bash             cpu=1 start=3.69  finish=204.08
          371197) cactusBSSN_r_ba  cpu=1 start=3.69  finish=203.99
      371161) sh               cpu=2 start=3.69  finish=203.74
        371169) bash             cpu=2 start=3.69  finish=203.74
          371192) cactusBSSN_r_ba  cpu=2 start=3.69  finish=203.61
      371162) sh               cpu=7 start=3.69  finish=203.55
        371174) bash             cpu=3 start=3.69  finish=203.55
          371193) cactusBSSN_r_ba  cpu=3 start=3.69  finish=203.45
      371163) sh               cpu=13 start=3.69  finish=206.29
        371175) bash             cpu=4 start=3.69  finish=206.29
          371196) cactusBSSN_r_ba  cpu=4 start=3.69  finish=206.23
      371164) sh               cpu=9 start=3.69  finish=204.12
        371178) bash             cpu=5 start=3.69  finish=204.12
          371194) cactusBSSN_r_ba  cpu=5 start=3.69  finish=204.02
      371165) sh               cpu=10 start=3.69  finish=204.49
        371179) bash             cpu=6 start=3.69  finish=204.49
          371195) cactusBSSN_r_ba  cpu=6 start=3.69  finish=204.40
      371167) sh               cpu=7 start=3.69  finish=202.92
        371176) bash             cpu=7 start=3.69  finish=202.92
          371199) cactusBSSN_r_ba  cpu=7 start=3.69  finish=202.80
      371170) sh               cpu=9 start=3.69  finish=205.12
        371181) bash             cpu=8 start=3.69  finish=205.12
          371198) cactusBSSN_r_ba  cpu=8 start=3.69  finish=205.05
      371171) sh               cpu=10 start=3.69  finish=204.07
        371182) bash             cpu=9 start=3.69  finish=204.07
          371201) cactusBSSN_r_ba  cpu=9 start=3.69  finish=203.97
      371172) sh               cpu=3 start=3.69  finish=203.74
        371185) bash             cpu=10 start=3.69  finish=203.74
          371202) cactusBSSN_r_ba  cpu=10 start=3.69  finish=203.61
      371173) sh               cpu=11 start=3.69  finish=203.56
        371186) bash             cpu=11 start=3.69  finish=203.55
          371200) cactusBSSN_r_ba  cpu=11 start=3.69  finish=203.45
      371177) sh               cpu=10 start=3.69  finish=206.29
        371187) bash             cpu=12 start=3.69  finish=206.29
          371203) cactusBSSN_r_ba  cpu=12 start=3.69  finish=206.23
      371180) sh               cpu=10 start=3.69  finish=204.12
        371188) bash             cpu=13 start=3.69  finish=204.12
          371204) cactusBSSN_r_ba  cpu=13 start=3.70  finish=204.02
      371183) sh               cpu=1 start=3.69  finish=204.48
        371189) bash             cpu=14 start=3.69  finish=204.48
          371205) cactusBSSN_r_ba  cpu=14 start=3.70  finish=204.40
      371184) sh               cpu=14 start=3.69  finish=204.69
        371190) bash             cpu=15 start=3.69  finish=204.69
          371206) cactusBSSN_r_ba  cpu=15 start=3.70  finish=204.62
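
To check how evenly the copies are balanced, the tree above can be parsed for per-copy runtimes. The following is a minimal sketch in Python, assuming the line layout shown above (`pid) name cpu=N start=S finish=F`); the helper names `parse_tree` and `copy_runtimes` are hypothetical, and only a few lines from the tree are embedded as sample input.

```python
import re

# Sample lines copied from the process tree above (fastest and slowest copies included).
TREE = """\
    371157) specinvoke       cpu=10 start=3.69  finish=206.29
          371191) cactusBSSN_r_ba  cpu=0 start=3.69  finish=205.05
          371199) cactusBSSN_r_ba  cpu=7 start=3.69  finish=202.80
          371196) cactusBSSN_r_ba  cpu=4 start=3.69  finish=206.23
"""

# Assumed line layout: "<pid>) <name> cpu=<n> start=<sec> finish=<sec>"
LINE = re.compile(
    r"\s*(\d+)\)\s+(\S+)\s+cpu=(\d+)\s+start=([\d.]+)\s+finish=([\d.]+)"
)

def parse_tree(text):
    """Return one dict per recognized process line."""
    rows = []
    for line in text.splitlines():
        m = LINE.match(line)
        if m:
            pid, name, cpu, start, finish = m.groups()
            rows.append({
                "pid": int(pid),
                "name": name,
                "cpu": int(cpu),
                "start": float(start),
                "finish": float(finish),
            })
    return rows

def copy_runtimes(rows, name="cactusBSSN_r_ba"):
    """Map cpu -> runtime in seconds for the benchmark copies only."""
    return {r["cpu"]: r["finish"] - r["start"] for r in rows if r["name"] == name}

runtimes = copy_runtimes(parse_tree(TREE))
slowest_cpu = max(runtimes, key=runtimes.get)
spread = max(runtimes.values()) - min(runtimes.values())
print(f"slowest copy on cpu {slowest_cpu}, spread {spread:.2f} s")
```

On these sample lines the spread between the fastest and slowest copy is under two percent of the runtime, which is consistent with a well-balanced rate run.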