lzbench is a compression benchmark with seven workloads. Each workload reports both a compression and a decompression metric, for about 14 results in total. The workload is single-threaded.

The topdown profile shows frontend and backend stalls at roughly equal levels (24.3% and 21.5% of slots in the AMD run below), though the split varies by test case. Branch misprediction is surprisingly high at 18.0% of slots.

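The AMD metrics show a high rate of speculation misses (18.1% of slots, nearly all from branch mispredicts). The code has little floating point (about 20 AVX-128 operations per 1000 instructions) and a moderate level of branches (about 130 per 1000 instructions).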
elapsed 755.928
on_cpu 0.051 # 0.81 / 16 cores
utime 583.276
stime 32.447
nvcsw 2596 # 49.00%
nivcsw 2702 # 51.00%
inblock 0 # 0.00/sec
onblock 15424 # 20.40/sec
cpu-clock 615848223310 # 615.848 seconds
task-clock 615860546931 # 615.861 seconds
page faults 18659342 # 30297.999/sec
context switches 8845 # 14.362/sec
cpu migrations 326 # 0.529/sec
major page faults 2 # 0.003/sec
minor page faults 18659340 # 30297.995/sec
alignment faults 0 # 0.000/sec
emulation faults 0 # 0.000/sec
branches 783659267177 # 130.034 branches per 1000 inst
branch misses 47146996285 # 6.02% branch miss
conditional 685766650450 # 113.791 conditional branches per 1000 inst
indirect 6891243211 # 1.143 indirect branches per 1000 inst
cpu-cycles 2372502199230 # 0.23 GHz
instructions 5279491010779 # 2.23 IPC
slots 4748483394420 #
retiring 1718435481181 # 36.2% (36.2%)
-- ucode 497136752 # 0.0%
-- fastpath 1717938344429 # 36.2%
frontend 1151657223287 # 24.3% (24.3%)
-- latency 735670326708 # 15.5%
-- bandwidth 415986896579 # 8.8%
backend 1018700227325 # 21.5% (21.5%)
-- cpu 179530650466 # 3.8%
-- memory 839169576859 # 17.7%
speculation 859479089638 # 18.1% (18.1%) high
-- branch mispredict 855319116325 # 18.0%
-- pipeline restart 4159973313 # 0.1%
smt-contention 211006212 # 0.0% ( 0.0%)
cpu-cycles 2366480123372 # 0.23 GHz
instructions 5261054743756 # 2.22 IPC
instructions 1754626398723 # 24.734 l2 access per 1000 inst
l2 hit from l1 30101932020 # 19.34% l2 miss
l2 miss from l1 2239669010 #
l2 hit from l2 pf 7142938168 #
l3 hit from l2 pf 1738642924 #
l3 miss from l2 pf 4416281676 #
instructions 1754894600217 # 20.259 float per 1000 inst
float 512 67 # 0.000 AVX-512 per 1000 inst
float 256 496 # 0.000 AVX-256 per 1000 inst
float 128 35553242796 # 20.259 AVX-128 per 1000 inst
float MMX 0 # 0.000 MMX per 1000 inst
float scalar 0 # 0.000 scalar per 1000 inst
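The percentages in the AMD listing above can be reproduced from the raw counters. The following is a minimal sketch, assuming each topdown category is reported as a share of `slots`, and that IPC and branch miss rate are simple ratios of counters collected in the same group. The variable and function names are illustrative and are not part of the profiling tool.

```python
# Counter values copied from the AMD listing above.
slots = 4748483394420
topdown = {
    "retiring": 1718435481181,
    "frontend": 1151657223287,
    "backend": 1018700227325,
    "speculation": 859479089638,
    "smt-contention": 211006212,
}


def share(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


# Assumption: every topdown category is expressed as a fraction of all slots.
shares = {name: share(count, slots) for name, count in topdown.items()}
for name, pct in shares.items():
    print(f"{name:15s} {pct:5.1f}%")

# Ratios derived from counters that appear together in the listing.
cycles = 2372502199230
instructions = 5279491010779
branches = 783659267177
branch_misses = 47146996285

ipc = instructions / cycles                      # instructions per cycle
miss_pct = 100.0 * branch_misses / branches      # branch miss rate in percent
print(f"IPC {ipc:.2f}, branch miss {miss_pct:.2f}%")
```

The per-1000-instruction rates in the listing are not recomputed here because each counter group appears to carry its own instruction count, so mixing counters across groups would give misleading ratios.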
Intel metrics
elapsed 6333.440
on_cpu 0.742 # 11.88 / 16 cores
utime 74750.330
stime 473.297
nvcsw 1313743 # 89.74%
nivcsw 150279 # 10.26%
inblock 30890000 # 4877.29/sec
onblock 694880 # 109.72/sec
cpu-clock 75222931189368 # 75222.931 seconds
task-clock 75223322994820 # 75223.323 seconds
page faults 85985637 # 1143.072/sec
context switches 1495425 # 19.880/sec
cpu migrations 54829 # 0.729/sec
major page faults 1233961 # 16.404/sec
minor page faults 84751671 # 1126.667/sec
alignment faults 0 # 0.000/sec
emulation faults 0 # 0.000/sec
branches 45503197495161 # 71.793 branches per 1000 inst
branch misses 90982505997 # 0.20% branch miss
conditional 45503197511129 # 71.793 conditional branches per 1000 inst
indirect 11993405533405 # 18.923 indirect branches per 1000 inst
slots 433838595012596 #
retiring 283286737116435 # 65.3% (65.3%) high
-- ucode 13222986405835 # 3.0%
-- fastpath 270063750710600 # 62.2%
frontend 19439778785847 # 4.5% ( 4.5%) low
-- latency 9361508360355 # 2.2%
-- bandwidth 10078270425492 # 2.3%
backend 112743375337721 # 26.0% (26.0%)
-- cpu 63018362871765 # 14.5%
-- memory 49725012465956 # 11.5%
speculation 12881927224817 # 3.0% ( 3.0%)
-- branch mispredict 12368006069230 # 2.9%
-- pipeline restart 513921155587 # 0.1%
smt-contention 0 # 0.0% ( 0.0%)
cpu-cycles 216187026514697 # 2.16 GHz
instructions 883987677201512 # 4.09 IPC high
l2 access 3937606297777 # 13.285 l2 access per 1000 inst
l2 miss 358227811293 # 9.10% l2 miss
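The `on_cpu` lines in both listings appear to follow from the time counters next to them. The sketch below assumes that `on_cpu` is user plus system time divided by elapsed time, then normalized by the 16 cores named in the listing comments. This relationship is inferred from the numbers and is not confirmed by the tool's documentation. The function name `on_cpu` is hypothetical.

```python
def on_cpu(utime, stime, elapsed, cores=16):
    """Return (busy cores, fraction of the machine in use).

    Assumption: busy cores = (utime + stime) / elapsed, and the fraction
    is that value divided by the number of cores.
    """
    busy_cores = (utime + stime) / elapsed
    return busy_cores, busy_cores / cores


# Values copied from the AMD and Intel listings above.
amd_busy, amd_frac = on_cpu(583.276, 32.447, 755.928)
intel_busy, intel_frac = on_cpu(74750.330, 473.297, 6333.440)

print(f"AMD:   {amd_busy:.2f} cores busy, on_cpu {amd_frac:.3f}")
print(f"Intel: {intel_busy:.2f} cores busy, on_cpu {intel_frac:.3f}")
```

Under this assumption the AMD run keeps less than one core busy on average, which fits a single-threaded workload.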
The process overview is straightforward: the run is dominated by invocations of lzbench.
402 processes
42 lzbench 486.92 26.60
68 clinfo 19.50 7.66
38 vulkaninfo 1.10 1.15
6 glxinfo:gdrv0 0.15 0.06
6 glxinfo:gl0 0.15 0.06
4 vulkani:disk$0 0.11 0.13
6 php 0.08 0.25
2 glxinfo 0.07 0.02
2 glxinfo:cs0 0.07 0.02
2 glxinfo:disk$0 0.07 0.02
2 glxinfo:sh0 0.07 0.02
2 glxinfo:shlo0 0.07 0.02
2 llvmpipe-0 0.06 0.07
2 llvmpipe-1 0.06 0.07
2 llvmpipe-2 0.06 0.07
2 llvmpipe-3 0.06 0.07
2 llvmpipe-4 0.06 0.07
6 clang 0.06 0.06
2 llvmpipe-10 0.06 0.06
2 llvmpipe-11 0.06 0.06
2 llvmpipe-12 0.06 0.06
2 llvmpipe-13 0.06 0.06
2 llvmpipe-14 0.06 0.06
2 llvmpipe-15 0.06 0.06
2 llvmpipe-5 0.06 0.06
2 llvmpipe-6 0.06 0.06
2 llvmpipe-7 0.06 0.06
2 llvmpipe-8 0.06 0.06
2 llvmpipe-9 0.06 0.06
3 rocminfo 0.03 0.00
1 lspci 0.00 0.03
1 ps 0.00 0.01
94 sh 0.00 0.00
13 gcc 0.00 0.00
13 gsettings 0.00 0.00
8 stat 0.00 0.00
8 systemd-detect- 0.00 0.00
6 llvm-link 0.00 0.00
5 phoronix-test-s 0.00 0.00
2 cc 0.00 0.00
2 gmain 0.00 0.00
2 lscpu 0.00 0.00
2 uname 0.00 0.00
2 which 0.00 0.00
2 xset 0.00 0.00
1 date 0.00 0.00
1 dconf worker 0.00 0.00
1 dirname 0.00 0.00
1 dmesg 0.00 0.00
1 dmidecode 0.00 0.00
1 grep 0.00 0.00
1 ifconfig 0.00 0.00
1 ip 0.00 0.00
1 lsmod 0.00 0.00
1 mktemp 0.00 0.00
1 qdbus 0.00 0.00
1 readlink 0.00 0.00
1 realpath 0.00 0.00
1 sed 0.00 0.00
1 sort 0.00 0.00
1 stty 0.00 0.00
1 systemctl 0.00 0.00
1 template.sh 0.00 0.00
1 wc 0.00 0.00
1 xrandr 0.00 0.00
0 processes running
47 maximum processes
Example of computation blocks
39349) lzbench cpu=15 start=5.62 finish=33.77
39350) lzbench cpu=9 start=5.62 finish=33.77
39353) lzbench cpu=13 start=37.78 finish=66.05
39354) lzbench cpu=6 start=37.78 finish=66.04
39355) lzbench cpu=13 start=70.05 finish=98.24
39356) lzbench cpu=15 start=70.05 finish=98.24
39357) sh cpu=6 start=98.24 finish=98.24
39358) sh cpu=7 start=98.24 finish=98.24
39359) lzbench cpu=5 start=108.42 finish=131.47
39360) lzbench cpu=6 start=108.42 finish=131.46
39361) lzbench cpu=5 start=135.47 finish=158.54
39362) lzbench cpu=6 start=135.47 finish=158.54
39363) lzbench cpu=13 start=162.54 finish=185.54
39364) lzbench cpu=6 start=162.54 finish=185.54
39366) sh cpu=13 start=185.54 finish=185.54
39367) sh cpu=7 start=185.54 finish=185.54
39368) lzbench cpu=5 start=195.90 finish=222.22
39369) lzbench cpu=6 start=195.90 finish=222.22
39370) lzbench cpu=14 start=226.23 finish=252.51
39371) lzbench cpu=7 start=226.23 finish=252.51
39374) lzbench cpu=13 start=256.51 finish=282.70
39375) lzbench cpu=6 start=256.51 finish=282.70
39416) sh cpu=13 start=282.70 finish=282.70
39417) sh cpu=7 start=282.70 finish=282.70
39418) lzbench cpu=5 start=293.20 finish=318.00
39419) lzbench cpu=14 start=293.20 finish=318.00
39420) lzbench cpu=5 start=322.01 finish=347.21
39421) lzbench cpu=6 start=322.01 finish=347.20
39422) lzbench cpu=6 start=351.21 finish=376.21
39423) lzbench cpu=7 start=351.21 finish=376.21
39424) sh cpu=13 start=376.22 finish=376.22
39425) sh cpu=7 start=376.22 finish=376.22
39427) lzbench cpu=6 start=386.40 finish=410.11
39428) lzbench cpu=7 start=386.40 finish=410.10
39429) lzbench cpu=5 start=414.11 finish=436.48
39430) lzbench cpu=6 start=414.11 finish=436.48
39431) lzbench cpu=13 start=440.48 finish=462.81
39432) lzbench cpu=14 start=440.49 finish=462.81
39433) sh cpu=6 start=462.81 finish=462.81
39434) sh cpu=15 start=462.81 finish=462.81
39435) lzbench cpu=6 start=472.99 finish=495.27
39436) lzbench cpu=15 start=472.99 finish=495.27
39437) lzbench cpu=13 start=499.27 finish=522.41
39438) lzbench cpu=14 start=499.28 finish=522.40
39439) lzbench cpu=13 start=526.41 finish=548.59
39440) lzbench cpu=14 start=526.41 finish=548.59
39441) sh cpu=15 start=548.59 finish=548.59
39442) sh cpu=9 start=548.59 finish=548.59
39443) lzbench cpu=2 start=559.46 finish=583.09
39444) lzbench cpu=5 start=559.46 finish=583.08
39447) lzbench cpu=9 start=587.09 finish=610.72
39448) lzbench cpu=2 start=587.09 finish=610.71
39449) lzbench cpu=9 start=614.72 finish=638.31
39450) lzbench cpu=10 start=614.72 finish=638.31
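The trace above can be summarized by parsing each line and computing run durations. This is a minimal sketch, assuming the leading number is a process or thread id and that `start` and `finish` are seconds from the start of the run. The sample lines are copied from the trace. The regular expression and field names are illustrative.

```python
import re

# A few lines copied from the trace above.
SAMPLE = """\
39349) lzbench cpu=15 start=5.62 finish=33.77
39350) lzbench cpu=9 start=5.62 finish=33.77
39353) lzbench cpu=13 start=37.78 finish=66.05
39354) lzbench cpu=6 start=37.78 finish=66.04
39357) sh cpu=6 start=98.24 finish=98.24
"""

# Assumed line layout: "<id>) <name> cpu=<n> start=<sec> finish=<sec>"
LINE = re.compile(
    r"(\d+)\)\s+(\S+)\s+cpu=(\d+)\s+start=([\d.]+)\s+finish=([\d.]+)"
)


def parse(text):
    """Parse trace lines into dictionaries, skipping lines that do not match."""
    rows = []
    for line in text.splitlines():
        m = LINE.match(line.strip())
        if m:
            ident, name, cpu, start, finish = m.groups()
            rows.append({
                "id": int(ident),
                "name": name,
                "cpu": int(cpu),
                "start": float(start),
                "finish": float(finish),
            })
    return rows


rows = parse(SAMPLE)
lz = [r for r in rows if r["name"] == "lzbench"]

# Duration of each lzbench entry, in seconds.
durations = [round(r["finish"] - r["start"], 2) for r in lz]

# Idle gap between the end of the first run and the start of the next one.
gap = round(lz[2]["start"] - lz[0]["finish"], 2)

print(durations, gap)
```

In this sample, entries come in pairs with identical start times and near-identical finish times, and successive runs are separated by a gap of about four seconds.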
