JSON parsing workload with five test cases. There is not much variation between the cases at this level, and all of them are single-threaded.

Topdown metrics show some variation between the first workload and the other three. Otherwise the retirement rate is high (close to half of all slots), and the main limiter is backend stalls.

AMD metrics show many branches (about 209 per 1000 instructions) but a low mispredict ratio (0.42%).

elapsed              1167.221
on_cpu               0.058          # 0.93 / 16 cores
utime                947.249
stime                132.789
nvcsw                2144           # 31.39%
nivcsw               4686           # 68.61%
inblock              0              # 0.00/sec
onblock              13520          # 11.58/sec
cpu-clock            1060764404934  # 1060.764 seconds
task-clock           1043565962599  # 1043.566 seconds
page faults          4074935        # 3904.818/sec
context switches     12459          # 11.939/sec
cpu migrations       384            # 0.368/sec
major page faults    2              # 0.002/sec
minor page faults    4074933        # 3904.816/sec
alignment faults     0              # 0.000/sec
emulation faults     0              # 0.000/sec
branches             1623251344069  # 208.543 branches per 1000 inst
branch misses        6830289396     # 0.42% branch miss
conditional          1555765693583  # 199.873 conditional branches per 1000 inst
indirect             1109680660     # 0.143 indirect branches per 1000 inst
cpu-cycles           4597218192476  # 0.25 GHz
instructions         14599846846048 # 3.18 IPC
slots                7265075693532  #
retiring             3534025371404  # 48.6% (48.6%)
-- ucode             19825160717    #     0.3%
-- fastpath          3514200210687  #    48.4%
frontend             885850299239   # 12.2% (12.2%)
-- latency           352808287308   #     4.9%
-- bandwidth         533042011931   #     7.3%
backend              2637646765616  # 36.3% (36.3%)
-- cpu               1449201042303  #    19.9%
-- memory            1188445723313  #    16.4%
speculation          207095188915   #  2.9% ( 2.9%)
-- branch mispredict 202244588303   #     2.8%
-- pipeline restart  4850600612     #     0.1%
smt-contention       457507570      #  0.0% ( 0.0%)
cpu-cycles           4580022863126  # 0.25 GHz
instructions         14573909729059 # 3.18 IPC
instructions         3860925654216  # 48.096 l2 access per 1000 inst
l2 hit from l1       100834500364   # 32.20% l2 miss
l2 miss from l1      7616180873     #
l2 hit from l2 pf    32681720010    #
l3 hit from l2 pf    31643262792    #
l3 miss from l2 pf   20537034976    #
instructions         3859524617727  # 61.757 float per 1000 inst
float 512            49             # 0.000 AVX-512 per 1000 inst
float 256            20825894754    # 5.396 AVX-256 per 1000 inst
float 128            217526082164   # 56.361 AVX-128 per 1000 inst
float MMX            0              # 0.000 MMX per 1000 inst
float scalar         7              # 0.000 scalar per 1000 inst
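
The derived values in the comment column can be reproduced from the raw counters. The sketch below is only a reading of the output, not the tool's actual code: the formulas (for example on_cpu as (utime + stime) / elapsed / cores) are assumptions that happen to match the printed numbers, and the names `amd`, `NUM_CORES` and `derive` are made up for illustration.

```python
# Raw counters copied from the AMD run above.
amd = {
    "elapsed": 1167.221,
    "utime": 947.249,
    "stime": 132.789,
    "cpu_cycles": 4597218192476,
    "instructions": 14599846846048,
    "branches": 1623251344069,
    "branch_misses": 6830289396,
    "slots": 7265075693532,
    "retiring": 3534025371404,
    "frontend": 885850299239,
    "backend": 2637646765616,
    "speculation": 207095188915,
}
NUM_CORES = 16  # taken from the "/ 16 cores" comment


def derive(c, cores=NUM_CORES):
    """Recompute ratios from raw counters (assumed formulas)."""
    busy = (c["utime"] + c["stime"]) / c["elapsed"]  # average busy cores
    out = {
        "busy_cores": busy,
        "on_cpu": busy / cores,
        "ipc": c["instructions"] / c["cpu_cycles"],
        "branch_miss_pct": 100.0 * c["branch_misses"] / c["branches"],
    }
    # Level-1 topdown categories as a share of all pipeline slots.
    for name in ("retiring", "frontend", "backend", "speculation"):
        out[name + "_pct"] = 100.0 * c[name] / c["slots"]
    return out


if __name__ == "__main__":
    for key, value in derive(amd).items():
        print(f"{key:18s} {value:.3f}")
```

The per-1000-instruction figures are left out on purpose: they appear to use the instruction count of their own counter group, which is not always the one printed next to them.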

Intel metrics

elapsed              1064.589
on_cpu               0.057          # 0.92 / 16 cores
utime                900.027
stime                76.835
nvcsw                2024           # 31.70%
nivcsw               4361           # 68.30%
inblock              5824           # 5.47/sec
onblock              2256           # 2.12/sec
cpu-clock            958565278780   # 958.565 seconds
task-clock           958030477330   # 958.030 seconds
page faults          3644797        # 3804.469/sec
context switches     11494          # 11.998/sec
cpu migrations       715            # 0.746/sec
major page faults    21             # 0.022/sec
minor page faults    3644776        # 3804.447/sec
alignment faults     0              # 0.000/sec
emulation faults     0              # 0.000/sec
branches             2034933733227  # 151.323 branches per 1000 inst
branch misses        6610203847     # 0.32% branch miss
conditional          2034933746187  # 151.323 conditional branches per 1000 inst
indirect             1353675464     # 0.101 indirect branches per 1000 inst
slots                20711401748834 #
retiring             9582714603191  # 46.3% (46.3%)
-- ucode             833592945195   #     4.0%
-- fastpath          8749121657996  #    42.2%
frontend             3817548477815  # 18.4% (18.4%)
-- latency           1765649812090  #     8.5%
-- bandwidth         2051898665725  #     9.9%
backend              6769475286303  # 32.7% (32.7%)
-- cpu               2797925769978  #    13.5%
-- memory            3971549516325  #    19.2%
speculation          1539595442119  #  7.4% ( 7.4%)
-- branch mispredict 1277383990632  #     6.2%
-- pipeline restart  262211451487   #     1.3%
smt-contention       0              #  0.0% ( 0.0%)
cpu-cycles           3457915967322  # 0.21 GHz
instructions         13453460153942 # 3.89 IPC
l2 access            386536318482   # 28.736 l2 access per 1000 inst
l2 miss              145523023280   # 37.65% l2 miss

Process overview

380 processes
	 15 bench_ondemand         947.56   144.01
	 68 clinfo                  16.20     6.66
	 38 vulkaninfo               1.31     0.95
	  4 vulkani:disk$0           0.14     0.10
	  6 glxinfo:gdrv0            0.14     0.09
	  6 php                      0.09     0.27
	  2 llvmpipe-0               0.07     0.05
	  2 llvmpipe-1               0.07     0.05
	  2 llvmpipe-10              0.07     0.05
	  2 llvmpipe-11              0.07     0.05
	  2 llvmpipe-12              0.07     0.05
	  2 llvmpipe-13              0.07     0.05
	  2 llvmpipe-14              0.07     0.05
	  2 llvmpipe-15              0.07     0.05
	  2 llvmpipe-2               0.07     0.05
	  2 llvmpipe-3               0.07     0.05
	  2 llvmpipe-4               0.07     0.05
	  2 llvmpipe-5               0.07     0.05
	  2 llvmpipe-6               0.07     0.05
	  2 llvmpipe-7               0.07     0.05
	  2 llvmpipe-8               0.07     0.05
	  2 llvmpipe-9               0.07     0.05
	  6 clang                    0.06     0.06
	  2 glxinfo                  0.06     0.04
	  2 glxinfo:cs0              0.06     0.03
	  2 glxinfo:disk$0           0.06     0.03
	  2 glxinfo:sh0              0.06     0.03
	  2 glxinfo:shlo0            0.06     0.03
	  3 rocminfo                 0.03     0.00
	  1 lspci                    0.00     0.02
	  1 ps                       0.00     0.01
	 90 sh                       0.00     0.00
	 15 simdjson                 0.00     0.00
	 13 gcc                      0.00     0.00
	 10 gsettings                0.00     0.00
	  8 stat                     0.00     0.00
	  8 systemd-detect-          0.00     0.00
	  6 llvm-link                0.00     0.00
	  5 phoronix-test-s          0.00     0.00
	  4 gmain                    0.00     0.00
	  2 cc                       0.00     0.00
	  2 dconf worker             0.00     0.00
	  2 lscpu                    0.00     0.00
	  2 uname                    0.00     0.00
	  2 which                    0.00     0.00
	  2 xset                     0.00     0.00
	  1 date                     0.00     0.00
	  1 dirname                  0.00     0.00
	  1 dmesg                    0.00     0.00
	  1 dmidecode                0.00     0.00
	  1 grep                     0.00     0.00
	  1 ifconfig                 0.00     0.00
	  1 ip                       0.00     0.00
	  1 lsmod                    0.00     0.00
	  1 mktemp                   0.00     0.00
	  1 qdbus                    0.00     0.00
	  1 readlink                 0.00     0.00
	  1 realpath                 0.00     0.00
	  1 sed                      0.00     0.00
	  1 sort                     0.00     0.00
	  1 stty                     0.00     0.00
	  1 systemctl                0.00     0.00
	  1 template.sh              0.00     0.00
	  1 wc                       0.00     0.00
	  1 xrandr                   0.00     0.00
0 processes running
47 maximum processes

Computation blocks are straightforward

      2823492) simdjson         cpu=15 start=5.84  finish=98.89
        2823493) bench_ondemand   cpu=14 start=5.84  finish=98.89
      2823497) simdjson         cpu=14 start=102.89 finish=196.91
        2823498) bench_ondemand   cpu=7 start=102.89 finish=196.91
      2823501) simdjson         cpu=6 start=200.92 finish=294.99
        2823502) bench_ondemand   cpu=7 start=200.92 finish=294.99
      2823504) sh               cpu=8 start=294.99 finish=294.99
        2823505) sh               cpu=1 start=294.99 finish=294.99
      2823506) simdjson         cpu=7 start=305.31 finish=378.69
        2823507) bench_ondemand   cpu=8 start=305.32 finish=378.69
      2823508) simdjson         cpu=14 start=382.69 finish=455.96
        2823509) bench_ondemand   cpu=15 start=382.69 finish=455.96
      2823544) simdjson         cpu=14 start=459.97 finish=533.54
        2823545) bench_ondemand   cpu=0 start=459.97 finish=533.54
      2823546) sh               cpu=7 start=533.54 finish=533.54
        2823547) sh               cpu=0 start=533.54 finish=533.54
      2823548) simdjson         cpu=6 start=543.72 finish=594.32
        2823549) bench_ondemand   cpu=7 start=543.73 finish=594.32
      2823550) simdjson         cpu=6 start=598.32 finish=649.21
        2823551) bench_ondemand   cpu=15 start=598.32 finish=649.21
      2823552) simdjson         cpu=6 start=653.21 finish=703.82
        2823553) bench_ondemand   cpu=15 start=653.21 finish=703.82
      2823555) sh               cpu=15 start=703.82 finish=703.82
        2823556) sh               cpu=0 start=703.82 finish=703.82
      2823558) simdjson         cpu=14 start=714.00 finish=786.58
        2823559) bench_ondemand   cpu=7 start=714.00 finish=786.58
      2823631) simdjson         cpu=6 start=790.58 finish=862.67
        2823632) bench_ondemand   cpu=7 start=790.59 finish=862.67
      2823634) simdjson         cpu=14 start=866.68 finish=938.66
        2823635) bench_ondemand   cpu=7 start=866.68 finish=938.66
      2823639) sh               cpu=0 start=938.66 finish=938.66
        2823640) sh               cpu=9 start=938.66 finish=938.66
      2823641) simdjson         cpu=6 start=949.02 finish=1022.77
        2823642) bench_ondemand   cpu=7 start=949.02 finish=1022.77
      2823644) simdjson         cpu=14 start=1026.78 finish=1101.20
        2823645) bench_ondemand   cpu=7 start=1026.78 finish=1101.20
      2823648) simdjson         cpu=14 start=1105.21 finish=1178.76
        2823649) bench_ondemand   cpu=7 start=1105.21 finish=1178.75
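
To compare run times across the test cases, the listing above can be parsed into durations. This is a minimal sketch: it assumes every three consecutive bench_ondemand runs belong to one test case (suggested by the `sh` entries between groups, not confirmed by the tool), and `LISTING`, `parse_runs` and `group_means` are illustrative names.

```python
import re

# bench_ondemand lines copied from the process tree above.
LISTING = """
2823493) bench_ondemand   cpu=14 start=5.84  finish=98.89
2823498) bench_ondemand   cpu=7 start=102.89 finish=196.91
2823502) bench_ondemand   cpu=7 start=200.92 finish=294.99
2823507) bench_ondemand   cpu=8 start=305.32 finish=378.69
2823509) bench_ondemand   cpu=15 start=382.69 finish=455.96
2823545) bench_ondemand   cpu=0 start=459.97 finish=533.54
2823549) bench_ondemand   cpu=7 start=543.73 finish=594.32
2823551) bench_ondemand   cpu=15 start=598.32 finish=649.21
2823553) bench_ondemand   cpu=15 start=653.21 finish=703.82
2823559) bench_ondemand   cpu=7 start=714.00 finish=786.58
2823632) bench_ondemand   cpu=7 start=790.59 finish=862.67
2823635) bench_ondemand   cpu=7 start=866.68 finish=938.66
2823642) bench_ondemand   cpu=7 start=949.02 finish=1022.77
2823645) bench_ondemand   cpu=7 start=1026.78 finish=1101.20
2823649) bench_ondemand   cpu=7 start=1105.21 finish=1178.75
"""

LINE_RE = re.compile(
    r"(\d+)\)\s+(\S+)\s+cpu=(\d+)\s+start=([\d.]+)\s+finish=([\d.]+)"
)


def parse_runs(text, name="bench_ondemand"):
    """Return (pid, duration in seconds) for each matching process line."""
    runs = []
    for m in LINE_RE.finditer(text):
        pid, comm, _cpu, start, finish = m.groups()
        if comm == name:
            runs.append((int(pid), float(finish) - float(start)))
    return runs


def group_means(runs, per_case=3):
    """Average duration of each block of per_case consecutive runs."""
    means = []
    for i in range(0, len(runs), per_case):
        chunk = [d for _, d in runs[i:i + per_case]]
        means.append(sum(chunk) / len(chunk))
    return means


if __name__ == "__main__":
    for case, mean in enumerate(group_means(parse_runs(LISTING)), start=1):
        print(f"test case {case}: mean run time {mean:.2f} s")
```

Grouping by position keeps the sketch short. A more robust version would split on the `sh` entries instead of assuming a fixed three runs per case.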