This test compresses and decompresses a tar file. It runs seven different tests with slightly different profiles, so it would be useful to separate them out later. Interestingly, none of the tests for the different compression tools use the same metrics and workload, so the tools are not easy to compare.

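AMD metrics show multi-threaded code that keeps only 4.17 of the 16 cores busy and does little I/O. There is a small amount of floating point code (about 23 float instructions per 1000), and the program is otherwise memory-bound (49.7% of slots are backend memory stalls). The profile above shows that some of the workloads are more memory-bound than others.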
elapsed 2901.236
on_cpu 0.261 # 4.17 / 16 cores
utime 12051.991
stime 42.412
nvcsw 761019 # 85.50%
nivcsw 129106 # 14.50%
inblock 416416 # 143.53/sec
onblock 8296 # 2.86/sec
cpu-clock 12088006176826 # 12088.006 seconds
task-clock 12089423964530 # 12089.424 seconds
page faults 10563887 # 873.812/sec
context switches 904309 # 74.802/sec
cpu migrations 73731 # 6.099/sec
major page faults 19 # 0.002/sec
minor page faults 10563868 # 873.811/sec
alignment faults 0 # 0.000/sec
emulation faults 0 # 0.000/sec
branches 8500567397945 # 123.737 branches per 1000 inst
branch misses 235070446072 # 2.77% branch miss
conditional 7691845876170 # 111.965 conditional branches per 1000 inst
indirect 2728510194 # 0.040 indirect branches per 1000 inst
cpu-cycles 64399572067291 # 1.03 GHz
instructions 88389737628447 # 1.37 IPC
slots 128795715843420 #
retiring 28442626656254 # 22.1% (24.1%)
-- ucode 320626280 # 0.0%
-- fastpath 28442306029974 # 22.1%
frontend 11861452414971 # 9.2% (10.1%)
-- latency 7369823434362 # 5.7%
-- bandwidth 4491628980609 # 3.5%
backend 68935929188030 # 53.5% (58.5%)
-- cpu 4938287659500 # 3.8%
-- memory 63997641528530 # 49.7%
speculation 8580707704693 # 6.7% ( 7.3%)
-- branch mispredict 8456809584640 # 6.6%
-- pipeline restart 123898120053 # 0.1%
smt-contention 10974862393946 # 8.5% ( 0.0%)
cpu-cycles 55104213476379 # 1.18 GHz
instructions 71191798572066 # 1.29 IPC
instructions 23728504766934 # 34.177 l2 access per 1000 inst
l2 hit from l1 639125597952 # 39.52% l2 miss
l2 miss from l1 221137265226 #
l2 hit from l2 pf 72461497838 #
l3 hit from l2 pf 56428551426 #
l3 miss from l2 pf 42946931535 #
instructions 23721123454337 # 22.889 float per 1000 inst
float 512 135 # 0.000 AVX-512 per 1000 inst
float 256 274 # 0.000 AVX-256 per 1000 inst
float 128 542946205318 # 22.889 AVX-128 per 1000 inst
float MMX 0 # 0.000 MMX per 1000 inst
float scalar 0 # 0.000 scalar per 1000 inst
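The derived columns in the AMD output can be cross-checked against the raw counters. The following is a minimal Python sketch. The helper names are hypothetical, and the formulas are inferred from the printed values (for example, on_cpu appears to be (utime + stime) / elapsed / cores), not taken from the measurement tool itself.

```python
# Hypothetical helpers; formulas are inferred from the printed numbers,
# not taken from the measurement tool's source.

def on_cpu(utime, stime, elapsed, cores):
    """Fraction of the machine's total core capacity spent on CPU."""
    return (utime + stime) / elapsed / cores

def ipc(instructions, cycles):
    """Instructions retired per cycle."""
    return instructions / cycles

def share(part, whole):
    """Percentage of `whole` accounted for by `part`."""
    return 100.0 * part / whole

# Raw values copied from the AMD listing.
amd = {
    "elapsed": 2901.236, "utime": 12051.991, "stime": 42.412, "cores": 16,
    "cycles": 64399572067291, "instructions": 88389737628447,
    "slots": 128795715843420, "backend_memory": 63997641528530,
}

amd_on_cpu = on_cpu(amd["utime"], amd["stime"], amd["elapsed"], amd["cores"])
amd_ipc = ipc(amd["instructions"], amd["cycles"])
amd_mem = share(amd["backend_memory"], amd["slots"])

print(f"on_cpu {amd_on_cpu:.3f}  # {amd_on_cpu * amd['cores']:.2f} / {amd['cores']} cores")
print(f"IPC    {amd_ipc:.2f}")
print(f"backend memory {amd_mem:.1f}% of slots")
```

The same helpers apply to any run that reports the same counters; only the dictionary of raw values changes.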
Intel metrics
elapsed 1535.496
on_cpu 0.260 # 4.16 / 16 cores
utime 6370.957
stime 15.872
nvcsw 238930 # 79.50%
nivcsw 61627 # 20.50%
inblock 417560 # 271.94/sec
onblock 3752 # 2.44/sec
cpu-clock 6383952839045 # 6383.953 seconds
task-clock 6384328182880 # 6384.328 seconds
page faults 6088624 # 953.683/sec
context switches 307983 # 48.240/sec
cpu migrations 80353 # 12.586/sec
major page faults 22 # 0.003/sec
minor page faults 6088602 # 953.679/sec
alignment faults 0 # 0.000/sec
emulation faults 0 # 0.000/sec
branches 3627813375966 # 127.466 branches per 1000 inst
branch misses 100424353464 # 2.77% branch miss
conditional 3627813400830 # 127.466 conditional branches per 1000 inst
indirect 460528146902 # 16.181 indirect branches per 1000 inst
slots 69134049635162 #
retiring 21718349215816 # 31.4% (31.4%)
-- ucode 670291785478 # 1.0%
-- fastpath 21048057430338 # 30.4%
frontend 4655453627152 # 6.7% ( 6.7%)
-- latency 2225063048502 # 3.2%
-- bandwidth 2430390578650 # 3.5%
backend 31176250844073 # 45.1% (45.1%)
-- cpu 9246534763123 # 13.4%
-- memory 21929716080950 # 31.7%
speculation 11778658018445 # 17.0% (17.0%)
-- branch mispredict 11694875066244 # 16.9%
-- pipeline restart 83782952201 # 0.1%
smt-contention 0 # 0.0% ( 0.0%)
cpu-cycles 22497766397352 # 0.92 GHz
instructions 37035269756825 # 1.65 IPC
l2 access 554900697586 # 26.405 l2 access per 1000 inst
l2 miss 245652286024 # 44.27% l2 miss
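With both runs listed, the top-level topdown categories can be put side by side. The sketch below is an illustration in Python with a hypothetical helper name. It assumes that the parenthesized AMD percentages are the same shares renormalized after removing the smt-contention slots; this is inferred from the numbers, not from the tool's documentation.

```python
def topdown_shares(slots, counts, exclude=0):
    """Percent of slots per category, optionally renormalized after
    removing `exclude` slots (assumed here to be SMT contention)."""
    usable = slots - exclude
    return {name: 100.0 * n / usable for name, n in counts.items()}

# Raw slot counts copied from the two listings.
amd_slots = 128795715843420
amd_smt = 10974862393946
amd_counts = {
    "retiring": 28442626656254,
    "frontend": 11861452414971,
    "backend": 68935929188030,
    "speculation": 8580707704693,
}
intel_slots = 69134049635162
intel_counts = {
    "retiring": 21718349215816,
    "frontend": 4655453627152,
    "backend": 31176250844073,
    "speculation": 11778658018445,
}

amd_raw = topdown_shares(amd_slots, amd_counts)
amd_adj = topdown_shares(amd_slots, amd_counts, exclude=amd_smt)
intel_raw = topdown_shares(intel_slots, intel_counts)

for name in amd_counts:
    print(f"{name:12s} AMD {amd_raw[name]:5.1f}% ({amd_adj[name]:5.1f}%)"
          f"  Intel {intel_raw[name]:5.1f}%")
```

Under that assumption the renormalized AMD column reproduces the parenthesized values, which makes it the fairer column to compare with the Intel run, where smt-contention is reported as 0.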
Process tree structure. Overall, the runtime is longer than for the other compression tests.
989 processes
561 zstd 145543.77 430.84
64 clinfo 10.88 5.44
38 vulkaninfo 1.14 0.76
6 php 0.15 0.56
4 vulkani:disk$0 0.12 0.08
6 glxinfo:gdrv0 0.09 0.09
2 llvmpipe-0 0.06 0.04
2 llvmpipe-1 0.06 0.04
2 llvmpipe-10 0.06 0.04
2 llvmpipe-11 0.06 0.04
2 llvmpipe-12 0.06 0.04
2 llvmpipe-13 0.06 0.04
2 llvmpipe-14 0.06 0.04
2 llvmpipe-15 0.06 0.04
2 llvmpipe-2 0.06 0.04
2 llvmpipe-3 0.06 0.04
2 llvmpipe-4 0.06 0.04
2 llvmpipe-5 0.06 0.04
2 llvmpipe-6 0.06 0.04
2 llvmpipe-7 0.06 0.04
2 llvmpipe-8 0.06 0.04
2 llvmpipe-9 0.06 0.04
2 glxinfo 0.05 0.04
2 glxinfo:cs0 0.05 0.03
2 glxinfo:disk$0 0.05 0.03
2 glxinfo:sh0 0.05 0.03
2 glxinfo:shlo0 0.05 0.03
6 clang 0.04 0.03
1 lspci 0.01 0.03
101 sh 0.00 0.00
34 sed 0.00 0.00
33 compress-zstd 0.00 0.00
13 gcc 0.00 0.00
10 gsettings 0.00 0.00
9 stty 0.00 0.00
8 stat 0.00 0.00
8 systemd-detect- 0.00 0.00
6 llvm-link 0.00 0.00
5 phoronix-test-s 0.00 0.00
4 gmain 0.00 0.00
2 cc 0.00 0.00
2 dconf worker 0.00 0.00
2 lscpu 0.00 0.00
2 uname 0.00 0.00
2 which 0.00 0.00
2 xset 0.00 0.00
1 date 0.00 0.00
1 dirname 0.00 0.00
1 dmesg 0.00 0.00
1 dmidecode 0.00 0.00
1 grep 0.00 0.00
1 ifconfig 0.00 0.00
1 ip 0.00 0.00
1 lsmod 0.00 0.00
1 mktemp 0.00 0.00
1 ps 0.00 0.00
1 qdbus 0.00 0.00
1 readlink 0.00 0.00
1 realpath 0.00 0.00
1 sort 0.00 0.00
1 systemctl 0.00 0.00
1 template.sh 0.00 0.00
1 wc 0.00 0.00
1 xrandr 0.00 0.00
0 processes running
47 maximum processes
The core computation blocks look as follows, showing that one zstd process is started per core.
43700) compress-zstd start=71.84 finish=134.40
43701) zstd start=71.84 finish=134.40
43702) zstd start=72.47 finish=134.35
43703) zstd start=72.47 finish=134.35
43704) zstd start=72.47 finish=134.35
43705) zstd start=72.47 finish=134.35
43706) zstd start=72.47 finish=134.35
43707) zstd start=72.47 finish=134.35
43708) zstd start=72.47 finish=134.35
43709) zstd start=72.47 finish=134.34
43710) zstd start=72.47 finish=134.35
43711) zstd start=72.47 finish=134.35
43712) zstd start=72.47 finish=134.35
43713) zstd start=72.47 finish=134.35
43714) zstd start=72.47 finish=134.35
43715) zstd start=72.47 finish=134.34
43716) zstd start=72.47 finish=134.34
43717) zstd start=72.47 finish=134.34
43718) sed start=134.40 finish=134.40
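The start and finish times in a listing like this can be grouped to confirm how many zstd entries launch together. The following is a minimal Python sketch. The regular expression and function name are assumptions based on the line format shown here, and the sample is abbreviated to a few of the lines above.

```python
import re
from collections import Counter

# Assumed line format: "<id>) <command> start=<sec> finish=<sec>"
LINE = re.compile(r"(\d+)\)\s+(\S+)\s+start=([\d.]+)\s+finish=([\d.]+)")

def parse(text):
    """Return (id, command, start, finish) tuples for matching lines."""
    rows = []
    for line in text.splitlines():
        m = LINE.match(line.strip())
        if m:
            pid, name, start, finish = m.groups()
            rows.append((int(pid), name, float(start), float(finish)))
    return rows

# Abbreviated sample copied from the listing.
sample = """\
43700) compress-zstd start=71.84 finish=134.40
43701) zstd start=71.84 finish=134.40
43702) zstd start=72.47 finish=134.35
43717) zstd start=72.47 finish=134.34
43718) sed start=134.40 finish=134.40
"""

rows = parse(sample)
# Number of zstd entries per start time.
starts = Counter(start for _, name, start, _ in rows if name == "zstd")
# Wall-clock duration of each entry, keyed by id.
durations = {pid: finish - start for pid, _, start, finish in rows}

for start, count in sorted(starts.items()):
    print(f"start={start:.2f}: {count} zstd")
```

Run on the full listing, the same grouping would show one zstd entry starting at 71.84 and sixteen starting together at 72.47, matching the 16 cores.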
