bwaves is a Fortran benchmark from the SPEC CPU(R) suite. The workload keeps all 16 logical cores busy for nearly the entire run (on_cpu is 0.990).

The topdown profile shows this as a backend-bound workload: about 93% of pipeline slots are lost to backend stalls, and IPC is only 0.13.

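AMD metrics for a 7840 processor confirm that the workload is backend-bound and, in particular, memory-bound (86.5% of slots). The L2 access rate is ~133 per 1000 instructions, and about half of these accesses (49.2%) are misses. There are very few branches (~19 per 1000 instructions). Approximately 1/4 of the instructions are floating point (~260 per 1000 instructions), almost all of them counted in the 128-bit category.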

elapsed              4632.332
on_cpu               0.990          # 15.85 / 16 cores
utime                73220.919
stime                181.889
nvcsw                96986          # 11.93%
nivcsw               715815         # 88.07%
inblock              0              # 0.00/sec
onblock              21864          # 4.72/sec
cpu-clock            73420734604801 # 73420.735 seconds
task-clock           73422148131842 # 73422.148 seconds
page faults          41516197       # 565.445/sec
context switches     811557         # 11.053/sec
cpu migrations       521            # 0.007/sec
major page faults    1855           # 0.025/sec
minor page faults    41514342       # 565.420/sec
alignment faults     0              # 0.000/sec
emulation faults     0              # 0.000/sec
branches             833760239357   # 19.172 branches per 1000 inst
branch misses        11492503371    # 1.38% branch miss
conditional          651248492065   # 14.975 conditional branches per 1000 inst
indirect             60811077930    # 1.398 indirect branches per 1000 inst
cpu-cycles           332144753197546 # 4.47 GHz
instructions         43479218645688 # 0.13 IPC low
slots                664161383718216 #
retiring             14604192045965 #  2.2% ( 2.3%) low
-- ucode             3068808550     #     0.0%
-- fastpath          14601123237415 #     2.2%
frontend             8982526992280  #  1.4% ( 1.4%) low
-- latency           7953765511854  #     1.2%
-- bandwidth         1028761480426  #     0.2%
backend              619986995494093 # 93.3% (96.3%) high
-- cpu               45180566558648 #     6.8%
-- memory            574806428935445 #    86.5%
speculation          473785697223   #  0.1% ( 0.1%) low
-- branch mispredict 231332716080   #     0.0%
-- pipeline restart  242452981143   #     0.0%
smt-contention       20113692014147 #  3.0% ( 0.0%)
cpu-cycles           332893296051422 # 4.47 GHz
instructions         43483758640393 # 0.13 IPC low
instructions         14498863832586 # 132.988 l2 access per 1000 inst
l2 hit from l1       1503844106419  # 49.20% l2 miss
l2 miss from l1      707659574769   #
l2 hit from l2 pf    183385007792   #
l3 hit from l2 pf    9300133264     #
l3 miss from l2 pf   231652380877   #
instructions         14491614030746 # 260.237 float per 1000 inst
float 512            580            # 0.000 AVX-512 per 1000 inst
float 256            2082           # 0.000 AVX-256 per 1000 inst
float 128            3771247626425  # 260.237 AVX-128 per 1000 inst
float MMX            0              # 0.000 MMX per 1000 inst
float scalar         2              # 0.000 scalar per 1000 inst
instructions         43471275755268 #
opcache              5086289745292  # 117.003 opcache per 1000 inst
opcache miss         495051701654   #  9.7% opcache miss rate
l1 dTLB miss         79670864160    # 1.833 L1 dTLB per 1000 inst
l2 dTLB miss         48345344515    # 1.112 L2 dTLB per 1000 inst
instructions         43470965049493 #
icache               600576925212   # 13.816 icache per 1000 inst
icache miss          39215141772    #  6.5% icache miss rate
l1 iTLB miss         578741019      # 0.013 L1 iTLB per 1000 inst
l2 iTLB miss         0              # 0.000 L2 iTLB per 1000 inst
tlb flush            272150         # 0.000 TLB flush per 1000 inst
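
The derived figures in the comment column can be recomputed from the raw counters. The following is a minimal sketch, not the profiling tool's own code: the formulas are assumptions inferred from the annotations (for example, on_cpu is taken here as (utime + stime) / elapsed / cores, though the tool may use cpu-clock instead), and all variable names are made up for illustration. Because each counter group in the listing reports its own instruction count, small differences in the last digit are expected.

```python
# Counter values copied from the profile listing above.
ELAPSED_S = 4632.332
UTIME_S = 73220.919
STIME_S = 181.889
LOGICAL_CORES = 16  # from the "15.85 / 16 cores" annotation

CPU_CYCLES = 332144753197546
INSTRUCTIONS = 43479218645688
BRANCHES = 833760239357
BRANCH_MISSES = 11492503371
SLOTS = 664161383718216
BACKEND_SLOTS = 619986995494093
BACKEND_MEMORY_SLOTS = 574806428935445

# The floating-point counter group reports its own instruction count.
FLOAT_GROUP_INSTRUCTIONS = 14491614030746
FLOAT_128 = 3771247626425


def per_kilo_inst(count, instructions):
    """Events per 1000 instructions."""
    return 1000.0 * count / instructions


def percent(part, whole):
    """Part of whole, as a percentage."""
    return 100.0 * part / whole


# Assumed formula: CPU seconds consumed per second of wall-clock time.
cores_busy = (UTIME_S + STIME_S) / ELAPSED_S
on_cpu = cores_busy / LOGICAL_CORES

ipc = INSTRUCTIONS / CPU_CYCLES
branch_rate = per_kilo_inst(BRANCHES, INSTRUCTIONS)
branch_miss_pct = percent(BRANCH_MISSES, BRANCHES)
backend_pct = percent(BACKEND_SLOTS, SLOTS)
memory_pct = percent(BACKEND_MEMORY_SLOTS, SLOTS)
float_rate = per_kilo_inst(FLOAT_128, FLOAT_GROUP_INSTRUCTIONS)

print(f"cores busy       {cores_busy:.2f} / {LOGICAL_CORES}")
print(f"on_cpu           {on_cpu:.3f}")
print(f"IPC              {ipc:.2f}")
print(f"branches / kinst {branch_rate:.3f}")
print(f"branch miss      {branch_miss_pct:.2f}%")
print(f"backend bound    {backend_pct:.1f}%")
print(f"memory bound     {memory_pct:.1f}%")
print(f"float / kinst    {float_rate:.3f}")
```

Expressing each event as a rate per 1000 instructions, as the listing does, makes counter groups with different instruction counts comparable.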

The process overview shows bwaves_r as the primary process; the small remainder is the SPEC harness.

1016 processes
	144 bwaves_r_base.m      53541.39    89.25
	142 specperl                23.53     4.73
	  1 lsb_release              0.01     0.00
	 33 specinvoke               0.00     0.09
	144 bash                     0.00     0.03
	 10 ps                       0.00     0.02
	  1 flang                    0.00     0.02
	348 sh                       0.00     0.00
	 21 grep                     0.00     0.00
	 20 cat                      0.00     0.00
	 12 uniq                     0.00     0.00
	 11 sort                     0.00     0.00
	 10 expand                   0.00     0.00
	  6 pwd                      0.00     0.00
	  5 basename                 0.00     0.00
	  5 specmake                 0.00     0.00
	  5 specrxp                  0.00     0.00
	  5 systemctl                0.00     0.00
	  4 specpp                   0.00     0.00
	  4 uname                    0.00     0.00
	  3 dirname                  0.00     0.00
	  3 dmidecode                0.00     0.00
	  3 lscpu                    0.00     0.00
	  2 df                       0.00     0.00
	  2 dpkg                     0.00     0.00
	  2 rm                       0.00     0.00
	  2 runcpu                   0.00     0.00
	  2 specsha512sum            0.00     0.00
	  2 specxz                   0.00     0.00
	  2 who                      0.00     0.00
	  1 cpupower                 0.00     0.00
	  1 head                     0.00     0.00
	  1 logname                  0.00     0.00
	  1 ls                       0.00     0.00
	  1 numactl                  0.00     0.00
	  1 sysctl                   0.00     0.00
	  1 w                        0.00     0.00
	  1 wc                       0.00     0.00
	  1 which                    0.00     0.00
53 processes running
53 maximum processes

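Specinvoke starts a separate bwaves_r process for each logical core, each wrapped in its own sh and bash process. In the run shown below, the 16 copies start together and finish between 287.17 and 303.33 seconds.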

    356101) specinvoke       cpu=7 start=3.24  finish=1546.35
      356103) sh               cpu=0 start=3.24  finish=295.94
        356109) bash             cpu=0 start=3.24  finish=295.94
          356135) bwaves_r_base.m  cpu=0 start=3.24  finish=295.69
      356104) sh               cpu=1 start=3.24  finish=289.88
        356110) bash             cpu=1 start=3.24  finish=289.88
          356133) bwaves_r_base.m  cpu=1 start=3.24  finish=289.62
      356105) sh               cpu=2 start=3.24  finish=294.24
        356114) bash             cpu=2 start=3.24  finish=294.24
          356137) bwaves_r_base.m  cpu=2 start=3.24  finish=294.02
      356106) sh               cpu=3 start=3.24  finish=294.03
        356117) bash             cpu=3 start=3.24  finish=294.03
          356140) bwaves_r_base.m  cpu=3 start=3.24  finish=293.80
      356107) sh               cpu=4 start=3.24  finish=287.40
        356115) bash             cpu=4 start=3.24  finish=287.39
          356138) bwaves_r_base.m  cpu=4 start=3.24  finish=287.17
      356108) sh               cpu=5 start=3.24  finish=293.24
        356113) bash             cpu=5 start=3.24  finish=293.23
          356139) bwaves_r_base.m  cpu=5 start=3.24  finish=292.99
      356111) sh               cpu=6 start=3.24  finish=303.56
        356120) bash             cpu=6 start=3.24  finish=303.56
          356141) bwaves_r_base.m  cpu=6 start=3.24  finish=303.33
      356112) sh               cpu=7 start=3.24  finish=295.06
        356122) bash             cpu=7 start=3.24  finish=295.06
          356142) bwaves_r_base.m  cpu=7 start=3.24  finish=294.80
      356116) sh               cpu=8 start=3.24  finish=291.15
        356125) bash             cpu=8 start=3.24  finish=291.15
          356143) bwaves_r_base.m  cpu=8 start=3.25  finish=290.90
      356118) sh               cpu=9 start=3.24  finish=291.24
        356127) bash             cpu=9 start=3.24  finish=291.24
          356147) bwaves_r_base.m  cpu=9 start=3.25  finish=291.03
      356119) sh               cpu=10 start=3.24  finish=296.70
        356128) bash             cpu=10 start=3.24  finish=296.70
          356149) bwaves_r_base.m  cpu=10 start=3.25  finish=296.45
      356121) sh               cpu=11 start=3.24  finish=294.39
        356130) bash             cpu=11 start=3.24  finish=294.39
          356148) bwaves_r_base.m  cpu=11 start=3.25  finish=294.19
      356123) sh               cpu=12 start=3.24  finish=288.30
        356131) bash             cpu=12 start=3.24  finish=288.30
          356146) bwaves_r_base.m  cpu=12 start=3.25  finish=288.11
      356124) sh               cpu=13 start=3.24  finish=293.74
        356132) bash             cpu=13 start=3.24  finish=293.74
          356145) bwaves_r_base.m  cpu=13 start=3.25  finish=293.56
      356126) sh               cpu=14 start=3.24  finish=298.69
        356134) bash             cpu=14 start=3.24  finish=298.69
          356144) bwaves_r_base.m  cpu=14 start=3.25  finish=298.44
      356129) sh               cpu=15 start=3.24  finish=302.56
        356136) bash             cpu=15 start=3.24  finish=302.56
          356150) bwaves_r_base.m  cpu=15 start=3.25  finish=302.30
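
To check how evenly the copies finish, the tree can be parsed and the finish times compared. The following is a minimal sketch that assumes the line layout shown above (`pid) name cpu=N start=S finish=F`); the regular expression and the `parse_tree` helper are illustrative and not part of any SPEC or profiler tooling. Only a handful of lines from the listing are embedded to keep the example short.

```python
import re

# A few lines copied from the process tree above (cpu 0, 4, 6 and 15).
TREE = """\
      356103) sh               cpu=0 start=3.24  finish=295.94
          356135) bwaves_r_base.m  cpu=0 start=3.24  finish=295.69
          356138) bwaves_r_base.m  cpu=4 start=3.24  finish=287.17
          356141) bwaves_r_base.m  cpu=6 start=3.24  finish=303.33
          356150) bwaves_r_base.m  cpu=15 start=3.25  finish=302.30
"""

# Assumed layout: "<pid>) <name> cpu=<n> start=<sec> finish=<sec>".
LINE = re.compile(
    r"^\s*(?P<pid>\d+)\)\s+(?P<name>\S+)\s+cpu=(?P<cpu>\d+)"
    r"\s+start=(?P<start>[\d.]+)\s+finish=(?P<finish>[\d.]+)\s*$"
)


def parse_tree(text, name="bwaves_r_base.m"):
    """Return one dict per process whose command name matches `name`."""
    rows = []
    for line in text.splitlines():
        m = LINE.match(line)
        if m and m.group("name") == name:
            rows.append({
                "pid": int(m.group("pid")),
                "cpu": int(m.group("cpu")),
                "start": float(m.group("start")),
                "finish": float(m.group("finish")),
            })
    return rows


rows = parse_tree(TREE)
fastest = min(rows, key=lambda r: r["finish"])
slowest = max(rows, key=lambda r: r["finish"])
spread = slowest["finish"] - fastest["finish"]

print(f"copies parsed: {len(rows)}")
print(f"fastest: cpu {fastest['cpu']} at {fastest['finish']:.2f}s")
print(f"slowest: cpu {slowest['cpu']} at {slowest['finish']:.2f}s")
print(f"spread:  {spread:.2f}s")
```

Feeding the full tree into `parse_tree` instead of the excerpt would report the spread across all 16 copies; the sh and bash wrapper lines are filtered out by the name check.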