The helsing benchmark computes vampire numbers. It runs two workloads, one for 12-digit and one for 14-digit numbers, and almost all of the time goes to the latter. The 14-digit workload appears to keep all cores consistently busy.
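
For readers unfamiliar with the term, the sketch below shows what a vampire number is: a number with an even digit count that factors into two "fangs" of half that length which together use exactly the original digits, and which do not both end in zero. This is a brute-force illustration of the definition only. The function name `fang_pairs` is hypothetical and the code is not taken from helsing, which uses a much faster search.

```python
def fang_pairs(n: int) -> list[tuple[int, int]]:
    """Return every (a, b) with a <= b that makes n a vampire number."""
    digits = str(n)
    if len(digits) % 2:
        return []  # vampire numbers have an even number of digits
    half = len(digits) // 2
    lo, hi = 10 ** (half - 1), 10 ** half  # fangs have exactly `half` digits
    target = sorted(digits)
    pairs = []
    for a in range(lo, hi):
        if a * a > n:
            break  # only consider a <= b
        if n % a:
            continue
        b = n // a
        if not lo <= b < hi:
            continue
        if a % 10 == 0 and b % 10 == 0:
            continue  # both fangs may not end in zero
        if sorted(str(a) + str(b)) == target:
            pairs.append((a, b))
    return pairs


if __name__ == "__main__":
    for candidate in (1260, 1395, 1000, 125460):
        print(candidate, fang_pairs(candidate))
```

Trial division like this is far too slow for 12-digit or 14-digit ranges; it is here only to pin down the definition.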

The topdown profile on AMD shows frontend stalls dominating (33.3% of slots) with a reasonably high retirement rate (29.3% of slots).

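AMD metrics show little floating point (about 1.5 floating-point operations per 1000 instructions). The L2 access rate is high (about 200 accesses per 1000 instructions), but very few of those accesses miss (0.11%).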

elapsed              1544.288
on_cpu               0.977          # 15.64 / 16 cores
utime                24144.861
stime                5.900
nvcsw                2209           # 1.07%
nivcsw               203297         # 98.93%
inblock              8              # 0.01/sec
onblock              12760          # 8.26/sec
cpu-clock            24151759949628 # 24151.760 seconds
task-clock           24151872221780 # 24151.872 seconds
page faults          686457         # 28.423/sec
context switches     213041         # 8.821/sec
cpu migrations       309            # 0.013/sec
major page faults    2              # 0.000/sec
minor page faults    686455         # 28.422/sec
alignment faults     0              # 0.000/sec
emulation faults     0              # 0.000/sec
branches             37996374962268 # 213.248 branches per 1000 inst
branch misses        426355218797   # 1.12% branch miss
conditional          37900334306940 # 212.709 conditional branches per 1000 inst
indirect             1414195149     # 0.008 indirect branches per 1000 inst
cpu-cycles           93192585501105 # 3.76 GHz
instructions         178220525834252 # 1.91 IPC
slots                186374372514954 #
retiring             54534867291655 # 29.3% (39.7%)
-- ucode             602790653      #     0.0%
-- fastpath          54534264501002 #    29.3%
frontend             62137799188459 # 33.3% (45.2%) high
-- latency           33686540498436 #    18.1%
-- bandwidth         28451258690023 #    15.3%
backend              17152720336039 #  9.2% (12.5%) low
-- cpu               1227200586083  #     0.7%
-- memory            15925519749956 #     8.5%
speculation          3537961598501  #  1.9% ( 2.6%)
-- branch mispredict 3526283395770  #     1.9%
-- pipeline restart  11678202731    #     0.0%
smt-contention       49010830482301 # 26.3% ( 0.0%)
cpu-cycles           93281791687407 # 3.76 GHz
instructions         178199946791830 # 1.91 IPC
instructions         59402683308294 # 199.834 l2 access per 1000 inst
l2 hit from l1       6701172480876  # 0.11% l2 miss
l2 miss from l1      6682864116     #
l2 hit from l2 pf    5163093205308  #
l3 hit from l2 pf    3420115358     #
l3 miss from l2 pf   3018255728     #
instructions         59385720242127 # 1.468 float per 1000 inst
float 512            46             # 0.000 AVX-512 per 1000 inst
float 256            434            # 0.000 AVX-256 per 1000 inst
float 128            87181981843    # 1.468 AVX-128 per 1000 inst
float MMX            0              # 0.000 MMX per 1000 inst
float scalar         0              # 0.000 scalar per 1000 inst
instructions         2648516        #
opcache              975479         # 368.312 opcache per 1000 inst
opcache miss         522758         # 53.6% opcache miss rate
l1 dTLB miss         5395           # 2.037 L1 dTLB per 1000 inst
l2 dTLB miss         1159           # 0.438 L2 dTLB per 1000 inst
instructions         2722566        #
icache               1311083        # 481.562 icache per 1000 inst
icache miss          112185         #  8.6% icache miss rate
l1 iTLB miss         11             # 0.004 L1 iTLB per 1000 inst
l2 iTLB miss         0              # 0.000 L2 iTLB per 1000 inst
tlb flush            19             # 0.007 TLB flush per 1000 inst
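
The topdown percentages in the AMD output can be reproduced from the raw slot counts. The sketch below assumes that the first percentage is the share of all slots and that the value in parentheses is the share after SMT-contention slots are excluded. That interpretation is an inference that happens to match the printed numbers, not something the tool output states. The helper name `shares` is hypothetical.

```python
# Raw slot counts copied from the AMD topdown output above.
slots = 186374372514954
smt_contention = 49010830482301
categories = {
    "retiring": 54534867291655,
    "frontend": 62137799188459,
    "backend": 17152720336039,
    "speculation": 3537961598501,
}


def shares(counts: dict[str, int], total: int, excluded: int = 0) -> dict[str, float]:
    """Percentage of (total - excluded) slots spent in each category."""
    usable = total - excluded
    return {name: round(100 * count / usable, 1) for name, count in counts.items()}


of_all_slots = shares(categories, slots)
# Assumption: the parenthesized figures leave out SMT-contention slots.
without_smt = shares(categories, slots, excluded=smt_contention)

if __name__ == "__main__":
    for name in categories:
        print(f"{name:12s} {of_all_slots[name]:5.1f}% ({without_smt[name]:5.1f}%)")
```

Under that assumption, the frontend share rises from about a third of all slots to nearly half of the slots this thread could actually use.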

Intel metrics

elapsed              2195.089
on_cpu               0.983          # 15.73 / 16 cores
utime                34523.529
stime                3.532
nvcsw                5708           # 2.26%
nivcsw               247340         # 97.74%
inblock              736504         # 335.52/sec
onblock              1520           # 0.69/sec
cpu-clock            34527696384921 # 34527.696 seconds
task-clock           34527759388483 # 34527.759 seconds
page faults          694041         # 20.101/sec
context switches     263807         # 7.640/sec
cpu migrations       329            # 0.010/sec
major page faults    4046           # 0.117/sec
minor page faults    689995         # 19.984/sec
alignment faults     0              # 0.000/sec
emulation faults     0              # 0.000/sec
branches             37995215477051 # 213.238 branches per 1000 inst
branch misses        554926384677   # 1.46% branch miss
conditional          37995215494459 # 213.238 conditional branches per 1000 inst
indirect             7321841704743  # 41.092 indirect branches per 1000 inst
slots                157759917459068 #
retiring             93941253151823 # 59.5% (59.5%) high
-- ucode             587027031054   #     0.4%
-- fastpath          93354226120769 #    59.2%
frontend             27064981185513 # 17.2% (17.2%)
-- latency           7967772793571  #     5.1%
-- bandwidth         19097208391942 #    12.1%
backend              8761522150341  #  5.6% ( 5.6%) low
-- cpu               7775846235503  #     4.9%
-- memory            985675914838   #     0.6%
speculation          28702420410020 # 18.2% (18.2%) high
-- branch mispredict 28702222069505 #    18.2%
-- pipeline restart  198340515      #     0.0%
smt-contention       0              #  0.0% ( 0.0%)
cpu-cycles           86412475211325 # 2.47 GHz
instructions         192985063456344 # 2.23 IPC
l2 access            7513647170631  # 63.938 l2 access per 1000 inst
l2 miss              49651416890    # 0.66% l2 miss
cpu-cycles           52549088264933 # 13.4% memory latency
load stalls          7024140040104  #  0.0% l1 bound
l1 miss              31270895567874 # 57.1% l2 bound
l2 miss              1284758476766  #  0.7% l3 bound
l3 miss              929695149808   #  1.8% dram bound
store_stalls         782090312      #  0.0% store bound
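
As a cross-check on the two runs, the derived figures in the comment columns can be recomputed from the raw counters. The sketch below does this for core utilization, IPC, and branch miss rate. The 16-core count comes from the `on_cpu` lines; the dictionary layout and function names are illustrative and not part of the profiling tool.

```python
CORES = 16  # from the "/ 16 cores" annotation on the on_cpu lines

# Raw values copied from the AMD and Intel outputs above.
runs = {
    "AMD": dict(elapsed=1544.288, utime=24144.861, stime=5.900,
                cycles=93192585501105, instructions=178220525834252,
                branches=37996374962268, branch_misses=426355218797),
    "Intel": dict(elapsed=2195.089, utime=34523.529, stime=3.532,
                  cycles=86412475211325, instructions=192985063456344,
                  branches=37995215477051, branch_misses=554926384677),
}


def on_cpu(run: dict) -> float:
    """Fraction of the available core time spent running the workload."""
    return (run["utime"] + run["stime"]) / (run["elapsed"] * CORES)


def ipc(run: dict) -> float:
    """Instructions retired per CPU cycle."""
    return run["instructions"] / run["cycles"]


def branch_miss_pct(run: dict) -> float:
    """Mispredicted branches as a percentage of all branches."""
    return 100 * run["branch_misses"] / run["branches"]


if __name__ == "__main__":
    for name, run in runs.items():
        print(f"{name:6s} on_cpu={on_cpu(run):.3f} "
              f"ipc={ipc(run):.2f} branch_miss={branch_miss_pct(run):.2f}%")
```

Both runs keep the cores about 98% busy, so the difference in elapsed time comes from per-core throughput rather than from idle cores.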

The process overview shows the helsing processes taking almost all of the time.

458 processes
	108 helsing              411130.16    82.11
	 68 clinfo                  16.22     6.64
	 38 vulkaninfo               1.14     1.14
	  6 php                      0.17     0.10
	  6 glxinfo:gdrv0            0.13     0.07
	  4 vulkani:disk$0           0.12     0.12
	  6 glxinfo:gl0              0.12     0.06
	  6 clang                    0.07     0.05
	  2 glxinfo                  0.07     0.03
	  2 glxinfo:cs0              0.07     0.03
	  2 glxinfo:disk$0           0.07     0.03
	  2 glxinfo:sh0              0.07     0.03
	  2 glxinfo:shlo0            0.07     0.03
	  2 llvmpipe-0               0.06     0.06
	  2 llvmpipe-1               0.06     0.06
	  2 llvmpipe-10              0.06     0.06
	  2 llvmpipe-11              0.06     0.06
	  2 llvmpipe-12              0.06     0.06
	  2 llvmpipe-13              0.06     0.06
	  2 llvmpipe-14              0.06     0.06
	  2 llvmpipe-15              0.06     0.06
	  2 llvmpipe-2               0.06     0.06
	  2 llvmpipe-3               0.06     0.06
	  2 llvmpipe-4               0.06     0.06
	  2 llvmpipe-5               0.06     0.06
	  2 llvmpipe-6               0.06     0.06
	  2 llvmpipe-7               0.06     0.06
	  2 llvmpipe-8               0.06     0.06
	  2 llvmpipe-9               0.06     0.06
	  3 rocminfo                 0.03     0.00
	  1 lspci                    0.00     0.02
	 84 sh                       0.00     0.00
	 13 gcc                      0.00     0.00
	 12 gsettings                0.00     0.00
	  8 stat                     0.00     0.00
	  8 systemd-detect-          0.00     0.00
	  6 llvm-link                0.00     0.00
	  5 phoronix-test-s          0.00     0.00
	  3 gmain                    0.00     0.00
	  2 cc                       0.00     0.00
	  2 lscpu                    0.00     0.00
	  2 uname                    0.00     0.00
	  2 which                    0.00     0.00
	  2 xset                     0.00     0.00
	  1 date                     0.00     0.00
	  1 dconf worker             0.00     0.00
	  1 dirname                  0.00     0.00
	  1 dmesg                    0.00     0.00
	  1 dmidecode                0.00     0.00
	  1 grep                     0.00     0.00
	  1 ifconfig                 0.00     0.00
	  1 ip                       0.00     0.00
	  1 lsmod                    0.00     0.00
	  1 mktemp                   0.00     0.00
	  1 ps                       0.00     0.00
	  1 qdbus                    0.00     0.00
	  1 readlink                 0.00     0.00
	  1 realpath                 0.00     0.00
	  1 sed                      0.00     0.00
	  1 sort                     0.00     0.00
	  1 stty                     0.00     0.00
	  1 systemctl                0.00     0.00
	  1 template.sh              0.00     0.00
	  1 wc                       0.00     0.00
	  1 xrandr                   0.00     0.00
0 processes running
47 maximum processes

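The computation structure is straightforward: the parent helsing task starts one child, which in turn starts 16 workers, one per core. In the run shown here, all workers start at 14.93 and finish within 0.1 seconds of each other.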

      233808) helsing          cpu=4 start=14.92 finish=20.30
        233809) helsing          cpu=13 start=14.93 finish=20.30
          233810) helsing          cpu=6 start=14.93 finish=20.23
          233811) helsing          cpu=15 start=14.93 finish=20.30
          233812) helsing          cpu=2 start=14.93 finish=20.23
          233813) helsing          cpu=0 start=14.93 finish=20.24
          233814) helsing          cpu=1 start=14.93 finish=20.20
          233815) helsing          cpu=3 start=14.93 finish=20.24
          233816) helsing          cpu=4 start=14.93 finish=20.21
          233817) helsing          cpu=5 start=14.93 finish=20.26
          233818) helsing          cpu=8 start=14.93 finish=20.29
          233819) helsing          cpu=7 start=14.93 finish=20.26
          233820) helsing          cpu=10 start=14.93 finish=20.25
          233821) helsing          cpu=14 start=14.93 finish=20.27
          233822) helsing          cpu=11 start=14.93 finish=20.23
          233823) helsing          cpu=9 start=14.93 finish=20.21
          233824) helsing          cpu=12 start=14.93 finish=20.23
          233825) helsing          cpu=13 start=14.93 finish=20.21
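
The balance of the workers can be checked mechanically from the tree above. The sketch below parses the listing, treats the most deeply indented entries as workers, and reports how many there are, which CPUs they ran on, and how far apart they finished. The regular expression and the rule "deepest indent means worker" are assumptions made for this illustration, not a documented format.

```python
import re

TREE = """\
      233808) helsing          cpu=4 start=14.92 finish=20.30
        233809) helsing          cpu=13 start=14.93 finish=20.30
          233810) helsing          cpu=6 start=14.93 finish=20.23
          233811) helsing          cpu=15 start=14.93 finish=20.30
          233812) helsing          cpu=2 start=14.93 finish=20.23
          233813) helsing          cpu=0 start=14.93 finish=20.24
          233814) helsing          cpu=1 start=14.93 finish=20.20
          233815) helsing          cpu=3 start=14.93 finish=20.24
          233816) helsing          cpu=4 start=14.93 finish=20.21
          233817) helsing          cpu=5 start=14.93 finish=20.26
          233818) helsing          cpu=8 start=14.93 finish=20.29
          233819) helsing          cpu=7 start=14.93 finish=20.26
          233820) helsing          cpu=10 start=14.93 finish=20.25
          233821) helsing          cpu=14 start=14.93 finish=20.27
          233822) helsing          cpu=11 start=14.93 finish=20.23
          233823) helsing          cpu=9 start=14.93 finish=20.21
          233824) helsing          cpu=12 start=14.93 finish=20.23
          233825) helsing          cpu=13 start=14.93 finish=20.21
"""

LINE = re.compile(
    r"^(?P<indent>\s*)(?P<pid>\d+)\) (?P<name>\S+)\s+"
    r"cpu=(?P<cpu>\d+) start=(?P<start>[\d.]+) finish=(?P<finish>[\d.]+)$"
)


def parse(tree: str) -> list[dict]:
    """Turn each tree line into a dict; indent width stands in for depth."""
    rows = []
    for line in tree.splitlines():
        match = LINE.match(line)
        if match:
            rows.append({
                "depth": len(match["indent"]),
                "pid": int(match["pid"]),
                "cpu": int(match["cpu"]),
                "start": float(match["start"]),
                "finish": float(match["finish"]),
            })
    return rows


rows = parse(TREE)
deepest = max(row["depth"] for row in rows)
workers = [row for row in rows if row["depth"] == deepest]  # assumed workers
worker_cpus = {row["cpu"] for row in workers}
finish_spread = round(
    max(row["finish"] for row in workers) - min(row["finish"] for row in workers), 2
)

if __name__ == "__main__":
    print(len(workers), "workers on", len(worker_cpus), "distinct CPUs")
    print("finish spread:", finish_spread, "seconds")
```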