An H.266 video encoder with four test cases. It runs on all cores, and each workload has a slightly different CPU-busy profile.

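The topdown profile shows a moderate retirement rate. On Intel it is limited more by frontend stalls (20.8% of slots) than by the backend (14.7%); on AMD the backend share (27.5%) exceeds the frontend share (14.3%).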

AMD metrics

elapsed              910.226
on_cpu               0.739          # 11.83 / 16 cores
utime                10625.310
stime                143.634
nvcsw                1926068        # 82.44%
nivcsw               410360         # 17.56%
inblock              3896           # 4.28/sec
onblock              13280          # 14.59/sec
cpu-clock            10772646851924 # 10772.647 seconds
task-clock           10773516408795 # 10773.516 seconds
page faults          22719430       # 2108.822/sec
context switches     2340781        # 217.272/sec
cpu migrations       164307         # 15.251/sec
major page faults    13             # 0.001/sec
minor page faults    22719417       # 2108.821/sec
alignment faults     0              # 0.000/sec
emulation faults     0              # 0.000/sec
branches             6857273267357  # 95.819 branches per 1000 inst
branch misses        95079953506    # 1.39% branch miss
conditional          5898906790895  # 82.428 conditional branches per 1000 inst
indirect             141469966782   # 1.977 indirect branches per 1000 inst
cpu-cycles           41460858052026 # 2.81 GHz
instructions         71608739292263 # 1.73 IPC
slots                82933942651956 #
retiring             24177769564488 # 29.2% (40.2%)
-- ucode             319921379423   #     0.4%
-- fastpath          23857848185065 #    28.8%
frontend             11886006036051 # 14.3% (19.8%)
-- latency           7575762751728  #     9.1%
-- bandwidth         4310243284323  #     5.2%
backend              22821946777065 # 27.5% (38.0%)
-- cpu               6695369364745  #     8.1%
-- memory            16126577412320 #    19.4%
speculation          1202222269765  #  1.4% ( 2.0%)
-- branch mispredict 1154699472637  #     1.4%
-- pipeline restart  47522797128    #     0.1%
smt-contention       22845563644718 # 27.5% ( 0.0%)
cpu-cycles           41405432666677 # 2.77 GHz
instructions         71526738202667 # 1.73 IPC
instructions         23847203050777 # 55.784 l2 access per 1000 inst
l2 hit from l1       1000350502604  # 8.78% l2 miss
l2 miss from l1      59618749154    #
l2 hit from l2 pf    272733137303   #
l3 hit from l2 pf    40268494255    #
l3 miss from l2 pf   16941099985    #
instructions         23835385703310 # 178.948 float per 1000 inst
float 512            92             # 0.000 AVX-512 per 1000 inst
float 256            714            # 0.000 AVX-256 per 1000 inst
float 128            4265301205354  # 178.948 AVX-128 per 1000 inst
float MMX            0              # 0.000 MMX per 1000 inst
float scalar         0              # 0.000 scalar per 1000 inst
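
The two percentage columns in the AMD topdown rows can be reproduced from the raw slot counts. The sketch below assumes, based only on the numbers matching, that the first column is a share of all slots and the parenthesized column is a share of slots left after subtracting smt-contention. The function name `topdown_shares` is made up for illustration and is not part of the profiling tool.

```python
# Slot counts copied from the AMD topdown rows above.
SLOTS = 82933942651956
SMT_CONTENTION = 22845563644718
CATEGORIES = {
    "retiring": 24177769564488,
    "frontend": 11886006036051,
    "backend": 22821946777065,
    "speculation": 1202222269765,
}


def topdown_shares(slots, smt_contention, categories):
    """Return {name: (pct_of_all_slots, pct_of_slots_excluding_smt)}.

    Assumption: the parenthesized figure in the dump is computed against
    slots minus smt-contention. This is inferred from the data only.
    """
    usable = slots - smt_contention
    return {
        name: (100.0 * count / slots, 100.0 * count / usable)
        for name, count in categories.items()
    }


shares = topdown_shares(SLOTS, SMT_CONTENTION, CATEGORIES)
for name, (raw, adjusted) in shares.items():
    print(f"{name:12s} {raw:5.1f}% ({adjusted:5.1f}%)")
```

The Intel run reports zero smt-contention, so under the same assumption both columns coincide there.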

Intel metrics

elapsed              1142.666
on_cpu               0.751          # 12.02 / 16 cores
utime                13631.248
stime                104.240
nvcsw                1304868        # 75.71%
nivcsw               418550         # 24.29%
inblock              18795744       # 16449.03/sec
onblock              2040           # 1.79/sec
cpu-clock            13737027683049 # 13737.028 seconds
task-clock           13737572064843 # 13737.572 seconds
page faults          22921326       # 1668.514/sec
context switches     1728942        # 125.855/sec
cpu migrations       216303         # 15.745/sec
major page faults    111            # 0.008/sec
minor page faults    22921215       # 1668.506/sec
alignment faults     0              # 0.000/sec
emulation faults     0              # 0.000/sec
branches             6920566559508  # 96.451 branches per 1000 inst
branch misses        92751466267    # 1.34% branch miss
conditional          6920566578324  # 96.451 conditional branches per 1000 inst
indirect             1600611418708  # 22.307 indirect branches per 1000 inst
slots                70202024331338 #
retiring             39826875497438 # 56.7% (56.7%)
-- ucode             3276105188693  #     4.7%
-- fastpath          36550770308745 #    52.1%
frontend             14572791682931 # 20.8% (20.8%)
-- latency           7132152801138  #    10.2%
-- bandwidth         7440638881793  #    10.6%
backend              10333613723682 # 14.7% (14.7%)
-- cpu               5828042496726  #     8.3%
-- memory            4505571226956  #     6.4%
speculation          5686879274637  #  8.1% ( 8.1%)
-- branch mispredict 5429543298347  #     7.7%
-- pipeline restart  257335976290   #     0.4%
smt-contention       0              #  0.0% ( 0.0%)
cpu-cycles           42320701718846 # 2.31 GHz
instructions         83198283723517 # 1.97 IPC
l2 access            1709736624260  # 41.157 l2 access per 1000 inst
l2 miss              252358065728   # 14.76% l2 miss
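
The on_cpu and IPC figures in both dumps are consistent with simple ratios of the raw counters. This is a minimal sketch of that arithmetic, assuming on_cpu is (utime + stime) / elapsed divided by the 16 cores and IPC is instructions / cpu-cycles. The helper names are illustrative only.

```python
CORES = 16  # core count shown in the on_cpu comment of both dumps


def on_cpu(utime, stime, elapsed, cores=CORES):
    """Return (fraction of the machine busy, average busy cores)."""
    busy_cores = (utime + stime) / elapsed
    return busy_cores / cores, busy_cores


def ipc(instructions, cycles):
    """Instructions retired per CPU cycle."""
    return instructions / cycles


# Values copied from the AMD and Intel dumps above.
amd_frac, amd_cores = on_cpu(10625.310, 143.634, 910.226)
intel_frac, intel_cores = on_cpu(13631.248, 104.240, 1142.666)
amd_ipc = ipc(71608739292263, 41460858052026)
intel_ipc = ipc(83198283723517, 42320701718846)

print(f"AMD   on_cpu={amd_frac:.3f} ({amd_cores:.2f} cores) IPC={amd_ipc:.2f}")
print(f"Intel on_cpu={intel_frac:.3f} ({intel_cores:.2f} cores) IPC={intel_ipc:.2f}")
```

Both systems keep about 12 of 16 cores busy, so the elapsed-time gap between them is not explained by utilization alone.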

Process overview

564 processes
	204 vvencapp             180330.99  2203.52
	 68 clinfo                  17.86     6.32
	 38 vulkaninfo               0.93     1.33
	  6 glxinfo:gdrv0            0.13     0.10
	  6 php                      0.11     0.18
	  4 vulkani:disk$0           0.10     0.14
	  2 glxinfo                  0.07     0.05
	  2 glxinfo:cs0              0.07     0.05
	  2 glxinfo:disk$0           0.07     0.05
	  2 glxinfo:sh0              0.07     0.04
	  2 glxinfo:shlo0            0.07     0.04
	  6 clang                    0.05     0.07
	  2 llvmpipe-0               0.05     0.07
	  2 llvmpipe-1               0.05     0.07
	  2 llvmpipe-10              0.05     0.07
	  2 llvmpipe-11              0.05     0.07
	  2 llvmpipe-12              0.05     0.07
	  2 llvmpipe-13              0.05     0.07
	  2 llvmpipe-14              0.05     0.07
	  2 llvmpipe-15              0.05     0.07
	  2 llvmpipe-2               0.05     0.07
	  2 llvmpipe-3               0.05     0.07
	  2 llvmpipe-4               0.05     0.07
	  2 llvmpipe-5               0.05     0.07
	  2 llvmpipe-6               0.05     0.07
	  2 llvmpipe-7               0.05     0.07
	  2 llvmpipe-8               0.05     0.07
	  2 llvmpipe-9               0.05     0.07
	  3 rocminfo                 0.03     0.00
	  1 lspci                    0.01     0.01
	  1 ps                       0.00     0.01
	 88 sh                       0.00     0.00
	 13 gcc                      0.00     0.00
	 12 vvenc                    0.00     0.00
	 11 gsettings                0.00     0.00
	  8 stat                     0.00     0.00
	  8 systemd-detect-          0.00     0.00
	  6 llvm-link                0.00     0.00
	  5 phoronix-test-s          0.00     0.00
	  4 gmain                    0.00     0.00
	  2 cc                       0.00     0.00
	  2 lscpu                    0.00     0.00
	  2 uname                    0.00     0.00
	  2 which                    0.00     0.00
	  2 xset                     0.00     0.00
	  1 date                     0.00     0.00
	  1 dconf worker             0.00     0.00
	  1 dirname                  0.00     0.00
	  1 dmesg                    0.00     0.00
	  1 dmidecode                0.00     0.00
	  1 grep                     0.00     0.00
	  1 ifconfig                 0.00     0.00
	  1 ip                       0.00     0.00
	  1 lsmod                    0.00     0.00
	  1 mktemp                   0.00     0.00
	  1 qdbus                    0.00     0.00
	  1 readlink                 0.00     0.00
	  1 realpath                 0.00     0.00
	  1 sed                      0.00     0.00
	  1 sort                     0.00     0.00
	  1 stty                     0.00     0.00
	  1 systemctl                0.00     0.00
	  1 template.sh              0.00     0.00
	  1 wc                       0.00     0.00
	  1 xrandr                   0.00     0.00
0 processes running
47 maximum processes

The process structure is straightforward: each vvenc invocation starts one vvencapp, which fans out into 16 workers, roughly one per core.

      2906507) vvenc            cpu=11 start=6.19  finish=148.80
        2906508) vvencapp         cpu=5 start=6.19  finish=148.80
          2906509) vvencapp         cpu=0 start=6.20  finish=148.54
          2906510) vvencapp         cpu=9 start=6.20  finish=148.54
          2906511) vvencapp         cpu=14 start=6.20  finish=148.54
          2906512) vvencapp         cpu=12 start=6.20  finish=148.54
          2906513) vvencapp         cpu=3 start=6.20  finish=148.54
          2906514) vvencapp         cpu=8 start=6.20  finish=148.54
          2906515) vvencapp         cpu=8 start=6.20  finish=148.54
          2906516) vvencapp         cpu=2 start=6.20  finish=148.54
          2906517) vvencapp         cpu=10 start=6.20  finish=148.54
          2906518) vvencapp         cpu=1 start=6.20  finish=148.54
          2906519) vvencapp         cpu=13 start=6.20  finish=148.54
          2906520) vvencapp         cpu=4 start=6.20  finish=148.54
          2906521) vvencapp         cpu=11 start=6.20  finish=148.54
          2906522) vvencapp         cpu=7 start=6.20  finish=148.54
          2906523) vvencapp         cpu=15 start=6.20  finish=148.54
          2906524) vvencapp         cpu=12 start=6.20  finish=148.54
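
A tree listing like the one above can be summarized with a short script. The sketch below parses a few lines copied from the listing, computes each entry's lifetime, and counts how many workers report each `cpu=` value. It uses leading whitespace as a proxy for tree depth, and what `cpu=` means exactly (first or last CPU) is not stated in the source. The `Proc` type and `parse_tree` function are hypothetical names.

```python
import re
from collections import Counter
from typing import NamedTuple


class Proc(NamedTuple):
    depth: int    # leading-space count, used as a proxy for tree depth
    pid: int
    name: str
    cpu: int
    start: float
    finish: float

    @property
    def duration(self) -> float:
        return self.finish - self.start


LINE_RE = re.compile(
    r"^(\s*)(\d+)\)\s+(\S+)\s+cpu=(\d+)\s+start=([\d.]+)\s+finish=([\d.]+)"
)

# A subset of the listing above, copied verbatim.
SAMPLE = """\
      2906507) vvenc            cpu=11 start=6.19  finish=148.80
        2906508) vvencapp         cpu=5 start=6.19  finish=148.80
          2906509) vvencapp         cpu=0 start=6.20  finish=148.54
          2906514) vvencapp         cpu=8 start=6.20  finish=148.54
          2906515) vvencapp         cpu=8 start=6.20  finish=148.54
"""


def parse_tree(text):
    """Parse 'pid) name cpu=N start=S finish=F' lines into Proc records."""
    procs = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            indent, pid, name, cpu, start, finish = m.groups()
            procs.append(Proc(len(indent), int(pid), name, int(cpu),
                              float(start), float(finish)))
    return procs


procs = parse_tree(SAMPLE)
deepest = max(p.depth for p in procs)
workers = [p for p in procs if p.depth == deepest]
per_cpu = Counter(p.cpu for p in workers)

for p in procs:
    print(f"{p.pid} {p.name:10s} cpu={p.cpu:<2d} lifetime={p.duration:.2f}s")
print("workers per cpu:", dict(per_cpu))
```

Run over the full listing, the same count would show cpu=8 and cpu=12 each appearing twice among the 16 workers, which is why the placement is only roughly one per core.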