Encode-opus is one of several quick-running encode benchmarks. These are high-IPC programs with just a few threads; in this case, there is a constant set of two runnable processes.

System overview shows a high retirement rate with a small number of backend stalls.

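AMD metrics show a lot of floating-point code (about 358 float events per 1000 instructions, essentially all 128-bit) and not many branches (about 95 per 1000 instructions). Backend stalls are more CPU-bound (25.9% of slots) than memory-bound (7.9%), but overall IPC is very high (3.02) and so is the retirement rate (48.6% of slots).
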
elapsed 139.297
on_cpu 0.053 # 0.84 / 16 cores
utime 115.226
stime 1.839
nvcsw 2815 # 81.74%
nivcsw 629 # 18.26%
inblock 0 # 0.00/sec
onblock 1732768 # 12439.35/sec
cpu-clock 117091880626 # 117.092 seconds
task-clock 117095710268 # 117.096 seconds
page faults 161156 # 1376.276/sec
context switches 3933 # 33.588/sec
cpu migrations 300 # 2.562/sec
major page faults 2 # 0.017/sec
minor page faults 161154 # 1376.259/sec
alignment faults 0 # 0.000/sec
emulation faults 0 # 0.000/sec
branches 153579074036 # 94.603 branches per 1000 inst
branch misses 2912235529 # 1.90% branch miss
conditional 134250372526 # 82.697 conditional branches per 1000 inst
indirect 1193893855 # 0.735 indirect branches per 1000 inst
cpu-cycles 537156399526 # 0.24 GHz
instructions 1619661764279 # 3.02 IPC
slots 1076831653824 #
retiring 523624378763 # 48.6% (48.6%)
-- ucode 27642325 # 0.0%
-- fastpath 523596736438 # 48.6%
frontend 100476135520 # 9.3% ( 9.3%)
-- latency 65112118866 # 6.0%
-- bandwidth 35364016654 # 3.3%
backend 364834219382 # 33.9% (33.9%)
-- cpu 279301543322 # 25.9%
-- memory 85532676060 # 7.9%
speculation 87781529545 # 8.2% ( 8.2%)
-- branch mispredict 86601453537 # 8.0%
-- pipeline restart 1180076008 # 0.1%
smt-contention 115109929 # 0.0% ( 0.0%)
cpu-cycles 537866827145 # 0.24 GHz
instructions 1622228251326 # 3.02 IPC
instructions 541136738630 # 17.839 l2 access per 1000 inst
l2 hit from l1 6466702852 # 2.16% l2 miss
l2 miss from l1 106528981 #
l2 hit from l2 pf 3085379042 #
l3 hit from l2 pf 91625958 #
l3 miss from l2 pf 9890270 #
instructions 540965737285 # 358.276 float per 1000 inst
float 512 112 # 0.000 AVX-512 per 1000 inst
float 256 602 # 0.000 AVX-256 per 1000 inst
float 128 193814859934 # 358.276 AVX-128 per 1000 inst
float MMX 0 # 0.000 MMX per 1000 inst
float scalar 0 # 0.000 scalar per 1000 inst
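
The annotations to the right of the `#` in the AMD listing can be reproduced from the raw counters. The sketch below is only an illustration of that arithmetic, not the tool's actual code: the formulas are assumptions inferred from the printed values (for example, that on_cpu is user plus system time divided by elapsed time and by the 16 cores), and names such as `derive` are hypothetical.

```python
# Raw AMD counter values copied from the listing above.
elapsed, utime, stime, cores = 139.297, 115.226, 1.839, 16
cycles = 537_156_399_526
instructions = 1_619_661_764_279
branches, branch_misses = 153_579_074_036, 2_912_235_529
slots = 1_076_831_653_824
topdown = {
    "retiring": 523_624_378_763,
    "frontend": 100_476_135_520,
    "backend": 364_834_219_382,
    "speculation": 87_781_529_545,
    "smt-contention": 115_109_929,
}


def derive():
    """Recompute the headline ratios (assumed formulas, see lead-in)."""
    busy_cores = (utime + stime) / elapsed  # average number of busy cores
    return {
        "busy_cores": busy_cores,
        "on_cpu": busy_cores / cores,  # fraction of the whole machine
        "ipc": instructions / cycles,
        "branch_miss_pct": 100.0 * branch_misses / branches,
        # Each top-down category as a percentage of all pipeline slots.
        "topdown_pct": {k: 100.0 * v / slots for k, v in topdown.items()},
    }


if __name__ == "__main__":
    d = derive()
    print(f"on_cpu {d['on_cpu']:.3f} ({d['busy_cores']:.2f} / {cores} cores)")
    print(f"IPC {d['ipc']:.2f}, branch miss {d['branch_miss_pct']:.2f}%")
    for name, pct in d["topdown_pct"].items():
        print(f"{name:15s} {pct:5.1f}%")
```

The top-down categories add up to the slot count to within rounding, which is why the percentages in the listing sum to 100%. Per-1000-instruction rates are left out on purpose: the listing prints several different instruction counts, which suggests each counter group was measured separately, so those rates depend on which instruction count is paired with which counter.
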

Intel metrics show a similarly low on-CPU fraction (0.86 of 16 cores), so the workload is essentially single-threaded there as well.

elapsed 158.215
on_cpu 0.053 # 0.86 / 16 cores
utime 133.985
stime 1.313
nvcsw 2331 # 77.65%
nivcsw 671 # 22.35%
inblock 81936 # 517.88/sec
onblock 1721520 # 10880.86/sec
cpu-clock 135300390813 # 135.300 seconds
task-clock 135304167762 # 135.304 seconds
page faults 145874 # 1078.119/sec
context switches 3580 # 26.459/sec
cpu migrations 366 # 2.705/sec
major page faults 18 # 0.133/sec
minor page faults 145856 # 1077.986/sec
alignment faults 0 # 0.000/sec
emulation faults 0 # 0.000/sec
branches 152826156114 # 94.293 branches per 1000 inst
branch misses 2865383218 # 1.87% branch miss
conditional 152826169330 # 94.293 conditional branches per 1000 inst
indirect 1214079796 # 0.749 indirect branches per 1000 inst
slots 3067321672538 #
retiring 1541367561479 # 50.3% (50.3%)
-- ucode 48525608582 # 1.6%
-- fastpath 1492841952897 # 48.7%
frontend 293193133379 # 9.6% ( 9.6%)
-- latency 110784373995 # 3.6%
-- bandwidth 182408759384 # 5.9%
backend 805991778923 # 26.3% (26.3%)
-- cpu 581617263000 # 19.0%
-- memory 224374515923 # 7.3%
speculation 431547827766 # 14.1% (14.1%)
-- branch mispredict 424707967622 # 13.8%
-- pipeline restart 6839860144 # 0.2%
smt-contention 0 # 0.0% ( 0.0%)
cpu-cycles 511371921430 # 0.20 GHz
instructions 1620717767784 # 3.17 IPC
l2 access 9846953252 # 6.076 l2 access per 1000 inst
l2 miss 321920749 # 3.27% l2 miss
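
The two listings can be compared with a little arithmetic. The sketch below rests on assumptions rather than documented tool behavior: it treats task-clock as the time during which the cycle counter was running, and it guesses that the printed "GHz" annotation is cycles divided by elapsed time across all 16 cores, which is consistent with the 0.24 and 0.20 values shown. The function names are hypothetical.

```python
# Values copied from the AMD and Intel listings above.
amd = {"cycles": 537_156_399_526, "task_clock_ns": 117_095_710_268,
       "elapsed_s": 139.297, "instructions": 1_619_661_764_279}
intel = {"cycles": 511_371_921_430, "task_clock_ns": 135_304_167_762,
         "elapsed_s": 158.215, "instructions": 1_620_717_767_784}


def effective_ghz(run):
    """Cycles per nanosecond of task-clock, i.e. clock rate while running."""
    return run["cycles"] / run["task_clock_ns"]


def annotated_ghz(run, cores=16):
    """Assumed formula behind the listing's GHz note: cycles averaged over
    elapsed wall time and every core, busy or idle."""
    return run["cycles"] / (run["elapsed_s"] * 1e9 * cores)


if __name__ == "__main__":
    for name, run in (("AMD", amd), ("Intel", intel)):
        print(f"{name}: {effective_ghz(run):.2f} GHz while running, "
              f"{annotated_ghz(run):.2f} GHz averaged over the machine")
    print(f"instruction ratio {intel['instructions'] / amd['instructions']:.4f}")
    print(f"elapsed ratio     {intel['elapsed_s'] / amd['elapsed_s']:.4f}")
```

Under these assumptions, both machines retire almost the same number of instructions, and the Intel run uses slightly fewer cycles thanks to its higher IPC. Its longer elapsed time would then come from a lower effective clock rate while running.
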
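
Process summary is straightforward: opusenc dominates, followed by system-information probes such as clinfo and vulkaninfo and a long tail of short-lived helpers.
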
387 processes
25 opusenc 114.12 0.86
68 clinfo 16.20 6.31
38 vulkaninfo 0.94 1.15
6 glxinfo:gdrv0 0.12 0.06
4 vulkani:disk$0 0.10 0.13
6 php 0.08 0.06
2 glxinfo 0.07 0.03
2 glxinfo:cs0 0.07 0.03
2 glxinfo:disk$0 0.07 0.03
2 glxinfo:shlo0 0.07 0.03
2 glxinfo:sh0 0.06 0.02
6 clang 0.05 0.07
2 llvmpipe-0 0.05 0.07
2 llvmpipe-1 0.05 0.07
2 llvmpipe-10 0.05 0.07
2 llvmpipe-11 0.05 0.07
2 llvmpipe-12 0.05 0.07
2 llvmpipe-13 0.05 0.07
2 llvmpipe-14 0.05 0.07
2 llvmpipe-2 0.05 0.07
2 llvmpipe-4 0.05 0.07
2 llvmpipe-5 0.05 0.07
2 llvmpipe-6 0.05 0.07
2 llvmpipe-7 0.05 0.07
2 llvmpipe-8 0.05 0.07
2 llvmpipe-9 0.05 0.07
2 llvmpipe-15 0.05 0.06
2 llvmpipe-3 0.05 0.06
3 rocminfo 0.03 0.00
1 lspci 0.01 0.02
1 ps 0.00 0.01
87 sh 0.00 0.00
13 gcc 0.00 0.00
12 gsettings 0.00 0.00
8 stat 0.00 0.00
8 systemd-detect- 0.00 0.00
6 llvm-link 0.00 0.00
5 bash 0.00 0.00
5 encode-opus 0.00 0.00
5 phoronix-test-s 0.00 0.00
5 rm 0.00 0.00
3 gmain 0.00 0.00
2 cc 0.00 0.00
2 lscpu 0.00 0.00
2 uname 0.00 0.00
2 which 0.00 0.00
2 xset 0.00 0.00
1 date 0.00 0.00
1 dconf worker 0.00 0.00
1 dirname 0.00 0.00
1 dmesg 0.00 0.00
1 dmidecode 0.00 0.00
1 grep 0.00 0.00
1 ifconfig 0.00 0.00
1 ip 0.00 0.00
1 lsmod 0.00 0.00
1 mktemp 0.00 0.00
1 qdbus 0.00 0.00
1 readlink 0.00 0.00
1 realpath 0.00 0.00
1 sed 0.00 0.00
1 sort 0.00 0.00
1 stty 0.00 0.00
1 systemctl 0.00 0.00
1 template.sh 0.00 0.00
1 wc 0.00 0.00
1 xrandr 0.00 0.00
0 processes running
47 maximum processes

The core computation blocks show a parent process followed by five children that run one after another, each starting at the moment the previous one finishes.

45221) encode-opus cpu=14 start=5.62 finish=28.68
45222) opusenc cpu=3 start=5.62 finish=10.44
45223) opusenc cpu=15 start=10.44 finish=15.01
45226) opusenc cpu=0 start=15.01 finish=19.57
45227) opusenc cpu=15 start=19.57 finish=24.13
45228) opusenc cpu=0 start=24.13 finish=28.68
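
The timeline above can be checked mechanically. The following sketch parses the six lines and checks that the children run back to back and together cover the parent's lifetime. The regular expression and the helper names are assumptions made for this illustration, not part of the tool that produced the listing.

```python
import re

# The timeline lines copied from the listing above.
TIMELINE = """\
45221) encode-opus cpu=14 start=5.62 finish=28.68
45222) opusenc cpu=3 start=5.62 finish=10.44
45223) opusenc cpu=15 start=10.44 finish=15.01
45226) opusenc cpu=0 start=15.01 finish=19.57
45227) opusenc cpu=15 start=19.57 finish=24.13
45228) opusenc cpu=0 start=24.13 finish=28.68
"""

LINE = re.compile(r"(\d+)\) (\S+) cpu=(\d+) start=([\d.]+) finish=([\d.]+)")


def parse(text):
    """Return one dict per timeline line, in listing order."""
    rows = []
    for m in LINE.finditer(text):
        pid, name, cpu, start, finish = m.groups()
        rows.append({"pid": int(pid), "name": name, "cpu": int(cpu),
                     "start": float(start), "finish": float(finish)})
    return rows


def durations(rows):
    """Run time of each process in seconds, rounded to the listing's precision."""
    return [round(r["finish"] - r["start"], 2) for r in rows]


def children_back_to_back(rows):
    """True if each child starts when the previous one finishes and the
    children together span the parent's start and finish."""
    parent, children = rows[0], rows[1:]
    chained = all(a["finish"] == b["start"]
                  for a, b in zip(children, children[1:]))
    spans = (children[0]["start"] == parent["start"]
             and children[-1]["finish"] == parent["finish"])
    return chained and spans


if __name__ == "__main__":
    rows = parse(TIMELINE)
    print(durations(rows))
    print(children_back_to_back(rows))
```

Each child runs for roughly 4.6 to 4.8 seconds, and the five children account for the parent's full 23.06 seconds.
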
