An H.266 video encoder with four test cases. It runs on all cores, and each workload has a slightly different CPU busy profile.

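The topdown profile shows a moderate retirement rate. On Intel, more slots are lost to frontend stalls (20.8%) than to backend stalls (14.7%); on AMD, the backend share (27.5%) is larger than the frontend share (14.3%).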

AMD metrics
elapsed 910.226
on_cpu 0.739 # 11.83 / 16 cores
utime 10625.310
stime 143.634
nvcsw 1926068 # 82.44%
nivcsw 410360 # 17.56%
inblock 3896 # 4.28/sec
onblock 13280 # 14.59/sec
cpu-clock 10772646851924 # 10772.647 seconds
task-clock 10773516408795 # 10773.516 seconds
page faults 22719430 # 2108.822/sec
context switches 2340781 # 217.272/sec
cpu migrations 164307 # 15.251/sec
major page faults 13 # 0.001/sec
minor page faults 22719417 # 2108.821/sec
alignment faults 0 # 0.000/sec
emulation faults 0 # 0.000/sec
branches 6857273267357 # 95.819 branches per 1000 inst
branch misses 95079953506 # 1.39% branch miss
conditional 5898906790895 # 82.428 conditional branches per 1000 inst
indirect 141469966782 # 1.977 indirect branches per 1000 inst
cpu-cycles 41460858052026 # 2.81 GHz
instructions 71608739292263 # 1.73 IPC
slots 82933942651956 #
retiring 24177769564488 # 29.2% (40.2%)
-- ucode 319921379423 # 0.4%
-- fastpath 23857848185065 # 28.8%
frontend 11886006036051 # 14.3% (19.8%)
-- latency 7575762751728 # 9.1%
-- bandwidth 4310243284323 # 5.2%
backend 22821946777065 # 27.5% (38.0%)
-- cpu 6695369364745 # 8.1%
-- memory 16126577412320 # 19.4%
speculation 1202222269765 # 1.4% ( 2.0%)
-- branch mispredict 1154699472637 # 1.4%
-- pipeline restart 47522797128 # 0.1%
smt-contention 22845563644718 # 27.5% ( 0.0%)
cpu-cycles 41405432666677 # 2.77 GHz
instructions 71526738202667 # 1.73 IPC
instructions 23847203050777 # 55.784 l2 access per 1000 inst
l2 hit from l1 1000350502604 # 8.78% l2 miss
l2 miss from l1 59618749154 #
l2 hit from l2 pf 272733137303 #
l3 hit from l2 pf 40268494255 #
l3 miss from l2 pf 16941099985 #
instructions 23835385703310 # 178.948 float per 1000 inst
float 512 92 # 0.000 AVX-512 per 1000 inst
float 256 714 # 0.000 AVX-256 per 1000 inst
float 128 4265301205354 # 178.948 AVX-128 per 1000 inst
float MMX 0 # 0.000 MMX per 1000 inst
float scalar 0 # 0.000 scalar per 1000 inst
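
The two percentage columns in the topdown rows above can be reproduced from the raw slot counts. The sketch below assumes that the first column is the share of all slots and that the parenthesized column is the share after removing smt-contention slots. That reading is an inference from the numbers, not something the tool output states, and the function name `topdown_shares` is made up for this example.

```python
# Slot counts copied from the AMD metrics above.
SLOTS = 82933942651956
CATEGORIES = {
    "retiring": 24177769564488,
    "frontend": 11886006036051,
    "backend": 22821946777065,
    "speculation": 1202222269765,
    "smt-contention": 22845563644718,
}


def topdown_shares(slots, categories, excluded="smt-contention"):
    """Return {name: (pct_of_all_slots, pct_of_slots_without_excluded)}.

    Assumption: the parenthesized column in the report is computed
    against the slots left after subtracting the excluded category.
    """
    usable = slots - categories.get(excluded, 0)
    shares = {}
    for name, count in categories.items():
        pct_all = 100.0 * count / slots
        # The excluded category has no share of the slots it was removed from.
        pct_usable = 0.0 if name == excluded else 100.0 * count / usable
        shares[name] = (pct_all, pct_usable)
    return shares


if __name__ == "__main__":
    for name, (pct_all, pct_usable) in topdown_shares(SLOTS, CATEGORIES).items():
        print(f"{name:15s} {pct_all:5.1f}% ({pct_usable:5.1f}%)")
```

Under this assumption the AMD retiring share rises from about 29% of all slots to about 40% of the non-contended slots, which is the figure to compare against the Intel run, where smt-contention is zero.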

Intel metrics
elapsed 1142.666
on_cpu 0.751 # 12.02 / 16 cores
utime 13631.248
stime 104.240
nvcsw 1304868 # 75.71%
nivcsw 418550 # 24.29%
inblock 18795744 # 16449.03/sec
onblock 2040 # 1.79/sec
cpu-clock 13737027683049 # 13737.028 seconds
task-clock 13737572064843 # 13737.572 seconds
page faults 22921326 # 1668.514/sec
context switches 1728942 # 125.855/sec
cpu migrations 216303 # 15.745/sec
major page faults 111 # 0.008/sec
minor page faults 22921215 # 1668.506/sec
alignment faults 0 # 0.000/sec
emulation faults 0 # 0.000/sec
branches 6920566559508 # 96.451 branches per 1000 inst
branch misses 92751466267 # 1.34% branch miss
conditional 6920566578324 # 96.451 conditional branches per 1000 inst
indirect 1600611418708 # 22.307 indirect branches per 1000 inst
slots 70202024331338 #
retiring 39826875497438 # 56.7% (56.7%)
-- ucode 3276105188693 # 4.7%
-- fastpath 36550770308745 # 52.1%
frontend 14572791682931 # 20.8% (20.8%)
-- latency 7132152801138 # 10.2%
-- bandwidth 7440638881793 # 10.6%
backend 10333613723682 # 14.7% (14.7%)
-- cpu 5828042496726 # 8.3%
-- memory 4505571226956 # 6.4%
speculation 5686879274637 # 8.1% ( 8.1%)
-- branch mispredict 5429543298347 # 7.7%
-- pipeline restart 257335976290 # 0.4%
smt-contention 0 # 0.0% ( 0.0%)
cpu-cycles 42320701718846 # 2.31 GHz
instructions 83198283723517 # 1.97 IPC
l2 access 1709736624260 # 41.157 l2 access per 1000 inst
l2 miss 252358065728 # 14.76% l2 miss
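
As a cross-check on the headline numbers, the on_cpu and IPC values for both runs can be recomputed from the raw fields. This is a minimal sketch that assumes on_cpu is (utime + stime) / elapsed divided by the 16 cores shown in the report, and that IPC is instructions divided by cpu-cycles from the same counter group. The helper names are illustrative only.

```python
CORES = 16  # from the "/ 16 cores" annotation in both reports


def on_cpu(utime, stime, elapsed, cores=CORES):
    """Return (average busy cores, fraction of the machine kept busy)."""
    busy_cores = (utime + stime) / elapsed
    return busy_cores, busy_cores / cores


def ipc(instructions, cycles):
    """Instructions retired per CPU cycle."""
    return instructions / cycles


# Values copied from the AMD and Intel metrics above.
RUNS = {
    "AMD": {
        "utime": 10625.310, "stime": 143.634, "elapsed": 910.226,
        "instructions": 71608739292263, "cycles": 41460858052026,
    },
    "Intel": {
        "utime": 13631.248, "stime": 104.240, "elapsed": 1142.666,
        "instructions": 83198283723517, "cycles": 42320701718846,
    },
}

if __name__ == "__main__":
    for name, run in RUNS.items():
        busy, fraction = on_cpu(run["utime"], run["stime"], run["elapsed"])
        rate = ipc(run["instructions"], run["cycles"])
        print(f"{name}: on_cpu {fraction:.3f} ({busy:.2f} / {CORES} cores), IPC {rate:.2f}")
```

Both runs keep about three quarters of the machine busy, so the difference in elapsed time comes from per-core throughput rather than from idle cores.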

Process overview
564 processes
204 vvencapp 180330.99 2203.52
68 clinfo 17.86 6.32
38 vulkaninfo 0.93 1.33
6 glxinfo:gdrv0 0.13 0.10
6 php 0.11 0.18
4 vulkani:disk$0 0.10 0.14
2 glxinfo 0.07 0.05
2 glxinfo:cs0 0.07 0.05
2 glxinfo:disk$0 0.07 0.05
2 glxinfo:sh0 0.07 0.04
2 glxinfo:shlo0 0.07 0.04
6 clang 0.05 0.07
2 llvmpipe-0 0.05 0.07
2 llvmpipe-1 0.05 0.07
2 llvmpipe-10 0.05 0.07
2 llvmpipe-11 0.05 0.07
2 llvmpipe-12 0.05 0.07
2 llvmpipe-13 0.05 0.07
2 llvmpipe-14 0.05 0.07
2 llvmpipe-15 0.05 0.07
2 llvmpipe-2 0.05 0.07
2 llvmpipe-3 0.05 0.07
2 llvmpipe-4 0.05 0.07
2 llvmpipe-5 0.05 0.07
2 llvmpipe-6 0.05 0.07
2 llvmpipe-7 0.05 0.07
2 llvmpipe-8 0.05 0.07
2 llvmpipe-9 0.05 0.07
3 rocminfo 0.03 0.00
1 lspci 0.01 0.01
1 ps 0.00 0.01
88 sh 0.00 0.00
13 gcc 0.00 0.00
12 vvenc 0.00 0.00
11 gsettings 0.00 0.00
8 stat 0.00 0.00
8 systemd-detect- 0.00 0.00
6 llvm-link 0.00 0.00
5 phoronix-test-s 0.00 0.00
4 gmain 0.00 0.00
2 cc 0.00 0.00
2 lscpu 0.00 0.00
2 uname 0.00 0.00
2 which 0.00 0.00
2 xset 0.00 0.00
1 date 0.00 0.00
1 dconf worker 0.00 0.00
1 dirname 0.00 0.00
1 dmesg 0.00 0.00
1 dmidecode 0.00 0.00
1 grep 0.00 0.00
1 ifconfig 0.00 0.00
1 ip 0.00 0.00
1 lsmod 0.00 0.00
1 mktemp 0.00 0.00
1 qdbus 0.00 0.00
1 readlink 0.00 0.00
1 realpath 0.00 0.00
1 sed 0.00 0.00
1 sort 0.00 0.00
1 stty 0.00 0.00
1 systemctl 0.00 0.00
1 template.sh 0.00 0.00
1 wc 0.00 0.00
1 xrandr 0.00 0.00
0 processes running
47 maximum processes
Process computation is straightforward, with roughly one vvencapp process on each core.
2906507) vvenc cpu=11 start=6.19 finish=148.80
2906508) vvencapp cpu=5 start=6.19 finish=148.80
2906509) vvencapp cpu=0 start=6.20 finish=148.54
2906510) vvencapp cpu=9 start=6.20 finish=148.54
2906511) vvencapp cpu=14 start=6.20 finish=148.54
2906512) vvencapp cpu=12 start=6.20 finish=148.54
2906513) vvencapp cpu=3 start=6.20 finish=148.54
2906514) vvencapp cpu=8 start=6.20 finish=148.54
2906515) vvencapp cpu=8 start=6.20 finish=148.54
2906516) vvencapp cpu=2 start=6.20 finish=148.54
2906517) vvencapp cpu=10 start=6.20 finish=148.54
2906518) vvencapp cpu=1 start=6.20 finish=148.54
2906519) vvencapp cpu=13 start=6.20 finish=148.54
2906520) vvencapp cpu=4 start=6.20 finish=148.54
2906521) vvencapp cpu=11 start=6.20 finish=148.54
2906522) vvencapp cpu=7 start=6.20 finish=148.54
2906523) vvencapp cpu=15 start=6.20 finish=148.54
2906524) vvencapp cpu=12 start=6.20 finish=148.54
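
The listing above is regular enough to parse mechanically. The following sketch extracts each entry, computes its lifetime, and reports any CPU number that appears more than once. It assumes that start and finish are in seconds and that cpu= is the CPU the entry was recorded on; neither is stated in the listing. The names `parse_timeline` and `shared_cpus` are hypothetical.

```python
import re
from collections import Counter

# Four lines copied from the listing above.
SAMPLE = """\
2906507) vvenc cpu=11 start=6.19 finish=148.80
2906508) vvencapp cpu=5 start=6.19 finish=148.80
2906514) vvencapp cpu=8 start=6.20 finish=148.54
2906515) vvencapp cpu=8 start=6.20 finish=148.54
"""

LINE_RE = re.compile(
    r"(?P<pid>\d+)\)\s+(?P<name>\S+)\s+cpu=(?P<cpu>\d+)"
    r"\s+start=(?P<start>[\d.]+)\s+finish=(?P<finish>[\d.]+)"
)


def parse_timeline(text):
    """Parse 'pid) name cpu=N start=S finish=F' lines into dicts."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.search(line)
        if match is None:
            continue  # skip anything that is not a timeline entry
        rows.append({
            "pid": int(match["pid"]),
            "name": match["name"],
            "cpu": int(match["cpu"]),
            "start": float(match["start"]),
            "finish": float(match["finish"]),
        })
    return rows


def shared_cpus(rows):
    """Return the sorted CPU numbers listed for more than one entry."""
    counts = Counter(row["cpu"] for row in rows)
    return sorted(cpu for cpu, n in counts.items() if n > 1)


if __name__ == "__main__":
    rows = parse_timeline(SAMPLE)
    for row in rows:
        print(row["pid"], row["name"], f"{row['finish'] - row['start']:.2f}")
    print("CPUs listed more than once:", shared_cpus(rows))
```

Run against the full listing, the same functions would show which CPU numbers repeat, which is a quick way to check the one-process-per-core reading.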
