Testing libgcrypt with its integrated benchmark. Looks to be single-threaded: on_cpu is about 0.97 of one core on a 16-core system.

Topdown profile shows some blurring, probably from different crypt subtests.

AMD metrics show little floating point, very few frontend stalls and very little L2 access.
elapsed 531.902
on_cpu 0.061 # 0.97 / 16 cores
utime 516.828
stime 0.740
nvcsw 2043 # 43.96%
nivcsw 2604 # 56.04%
inblock 64 # 0.12/sec
onblock 13832 # 26.00/sec
cpu-clock 517639332237 # 517.639 seconds
task-clock 517645091350 # 517.645 seconds
page faults 147616 # 285.168/sec
context switches 7137 # 13.787/sec
cpu migrations 331 # 0.639/sec
major page faults 3 # 0.006/sec
minor page faults 147613 # 285.163/sec
alignment faults 0 # 0.000/sec
emulation faults 0 # 0.000/sec
branches 380131956406 # 60.983 branches per 1000 inst
branch misses 431943792 # 0.11% branch miss
conditional 262048446662 # 42.039 conditional branches per 1000 inst
indirect 26285539920 # 4.217 indirect branches per 1000 inst
cpu-cycles 2417266512530 # 0.28 GHz
instructions 6219334247412 # 2.57 IPC
slots 4842841465530 #
retiring 2252950350168 # 46.5% (46.5%)
-- ucode 3132389809 # 0.1%
-- fastpath 2249817960359 # 46.5%
frontend 135360904140 # 2.8% ( 2.8%) low
-- latency 41369303016 # 0.9%
-- bandwidth 93991601124 # 1.9%
backend 2433230177150 # 50.2% (50.2%)
-- cpu 472885962398 # 9.8%
-- memory 1960344214752 # 40.5%
speculation 20862241606 # 0.4% ( 0.4%) low
-- branch mispredict 16213481345 # 0.3%
-- pipeline restart 4648760261 # 0.1%
smt-contention 437317253 # 0.0% ( 0.0%)
cpu-cycles 2413505785886 # 0.28 GHz
instructions 6212281735247 # 2.57 IPC
instructions 2071438044827 # 0.094 l2 access per 1000 inst
l2 hit from l1 178758560 # 12.16% l2 miss
l2 miss from l1 14677271 #
l2 hit from l2 pf 7254392 #
l3 hit from l2 pf 4271893 #
l3 miss from l2 pf 4767842 #
instructions 2070106154657 # 22.429 float per 1000 inst
float 512 60 # 0.000 AVX-512 per 1000 inst
float 256 620 # 0.000 AVX-256 per 1000 inst
float 128 46429470920 # 22.429 AVX-128 per 1000 inst
float MMX 0 # 0.000 MMX per 1000 inst
float scalar 0 # 0.000 scalar per 1000 inst
instructions 6226129956548 #
opcache 601881526493 # 96.670 opcache per 1000 inst
opcache miss 7859374446 # 1.3% opcache miss rate
l1 dTLB miss 28467427 # 0.005 L1 dTLB per 1000 inst
l2 dTLB miss 4893619 # 0.001 L2 dTLB per 1000 inst
instructions 6234336458913 #
icache 13440378870 # 2.156 icache per 1000 inst
icache miss 654545777 # 4.9% icache miss rate
l1 iTLB miss 7928289 # 0.001 L1 iTLB per 1000 inst
l2 iTLB miss 0 # 0.000 L2 iTLB per 1000 inst
tlb flush 16577 # 0.000 TLB flush per 1000 inst
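The derived figures in the AMD dump can be cross-checked from the raw counters. The sketch below is a minimal recomputation, assuming that on_cpu is (utime + stime) / elapsed divided by the core count, that IPC is instructions / cpu-cycles, and that each topdown percentage is the category count divided by slots. The variable names are illustrative and not taken from the profiling tool.

```python
# Raw values copied from the AMD dump above.
elapsed, utime, stime, ncores = 531.902, 516.828, 0.740, 16
cycles = 2417266512530
instructions = 6219334247412
branches = 380131956406
branch_misses = 431943792
slots = 4842841465530
topdown = {
    "retiring": 2252950350168,
    "frontend": 135360904140,
    "backend": 2433230177150,
    "speculation": 20862241606,
    "smt-contention": 437317253,
}


def pct(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Assumed definition: CPU time used, in units of cores, then per core.
cores_used = (utime + stime) / elapsed
on_cpu = cores_used / ncores

ipc = instructions / cycles
branch_miss_pct = pct(branch_misses, branches)
topdown_pct = {name: pct(count, slots) for name, count in topdown.items()}

print(f"cores used   {cores_used:.2f} (on_cpu {on_cpu:.3f})")
print(f"IPC          {ipc:.2f}")
print(f"branch miss  {branch_miss_pct:.2f}%")
for name, value in topdown_pct.items():
    print(f"{name:<15}{value:5.1f}%")
print(f"topdown sum  {sum(topdown_pct.values()):.2f}%")
```

Under these assumptions the results agree with the annotations in the dump (0.97 cores, 2.57 IPC, 0.11% branch miss), and the five topdown categories sum to almost exactly 100% of slots.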
Intel metrics show that load stalls are essentially all L1-bound. Interesting to see the backend split flipped between the two: AMD attributes 40.5% of slots to memory and 9.8% to cpu, while Intel attributes 6.2% to memory and 35.7% to cpu.
elapsed 635.257
on_cpu 0.061 # 0.98 / 16 cores
utime 619.955
stime 0.494
nvcsw 2947 # 49.68%
nivcsw 2985 # 50.32%
inblock 225048 # 354.26/sec
onblock 2688 # 4.23/sec
cpu-clock 620494490675 # 620.494 seconds
task-clock 620499746686 # 620.500 seconds
page faults 138394 # 223.036/sec
context switches 8940 # 14.408/sec
cpu migrations 472 # 0.761/sec
major page faults 1148 # 1.850/sec
minor page faults 137246 # 221.186/sec
alignment faults 0 # 0.000/sec
emulation faults 0 # 0.000/sec
branches 370758952846 # 60.936 branches per 1000 inst
branch misses 576569854 # 0.16% branch miss
conditional 370758964686 # 60.936 conditional branches per 1000 inst
indirect 26280180628 # 4.319 indirect branches per 1000 inst
slots 14166997571918 #
retiring 7649074520400 # 54.0% (54.0%)
-- ucode 500008685427 # 3.5%
-- fastpath 7149065834973 # 50.5%
frontend 1409343082210 # 9.9% ( 9.9%)
-- latency 280678924673 # 2.0%
-- bandwidth 1128664157537 # 8.0%
backend 5938083799852 # 41.9% (41.9%)
-- cpu 5064130744520 # 35.7%
-- memory 873953055332 # 6.2%
speculation 113614731650 # 0.8% ( 0.8%) low
-- branch mispredict 113465068431 # 0.8%
-- pipeline restart 149663219 # 0.0%
smt-contention 0 # 0.0% ( 0.0%)
cpu-cycles 2352185195712 # 0.23 GHz
instructions 6088304938733 # 2.59 IPC
l2 access 490630406 # 0.081 l2 access per 1000 inst
l2 miss 98092730 # 19.99% l2 miss
cpu-cycles 2352817541230 # 9.6% memory latency
load stalls 225825885761 # 9.6% l1 bound
l1 miss 938789868 # 0.0% l2 bound
l2 miss 410168589 # 0.0% l3 bound
l3 miss 261950789 # 0.0% dram bound
store_stalls 101113868 # 0.0% store bound
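To make the AMD versus Intel backend comparison concrete, the following sketch computes what share of backend-bound slots each topdown profile attributes to memory. The counts are copied from the two dumps. The two vendors' topdown models are not necessarily defined the same way, so this is a rough comparison rather than a like-for-like measurement.

```python
# Backend-bound slot counts copied from the AMD and Intel topdown dumps.
backend = {
    "AMD": {"backend": 2433230177150, "cpu": 472885962398, "memory": 1960344214752},
    "Intel": {"backend": 5938083799852, "cpu": 5064130744520, "memory": 873953055332},
}


def memory_share(counts):
    """Percentage of backend-bound slots attributed to memory."""
    return 100.0 * counts["memory"] / counts["backend"]


shares = {vendor: memory_share(counts) for vendor, counts in backend.items()}

for vendor, counts in backend.items():
    # In both dumps the cpu and memory sub-categories add up to backend.
    consistent = counts["cpu"] + counts["memory"] == counts["backend"]
    print(f"{vendor:<6} memory share of backend: {shares[vendor]:.1f}% "
          f"(cpu + memory == backend: {consistent})")
```

With these counts, memory accounts for roughly 80.6% of backend-bound slots on AMD and roughly 14.7% on Intel.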
Process time shows almost all of the time spent in the benchmark application.
354 processes
3 benchmark 514.36 0.00
68 clinfo 16.22 6.31
38 vulkaninfo 1.52 0.94
4 vulkani:disk$0 0.16 0.10
6 glxinfo:gdrv0 0.11 0.05
6 glxinfo:gl0 0.11 0.05
6 php 0.08 0.09
2 llvmpipe-0 0.08 0.05
2 llvmpipe-1 0.08 0.05
2 llvmpipe-10 0.08 0.05
2 llvmpipe-11 0.08 0.05
2 llvmpipe-12 0.08 0.05
2 llvmpipe-13 0.08 0.05
2 llvmpipe-14 0.08 0.05
2 llvmpipe-15 0.08 0.05
2 llvmpipe-2 0.08 0.05
2 llvmpipe-3 0.08 0.05
2 llvmpipe-4 0.08 0.05
2 llvmpipe-5 0.08 0.05
2 llvmpipe-6 0.08 0.05
2 llvmpipe-7 0.08 0.05
2 llvmpipe-8 0.08 0.05
2 llvmpipe-9 0.08 0.05
2 glxinfo 0.06 0.02
2 glxinfo:cs0 0.06 0.02
2 glxinfo:disk$0 0.06 0.02
2 glxinfo:sh0 0.06 0.02
2 glxinfo:shlo0 0.06 0.02
6 clang 0.04 0.08
3 rocminfo 0.00 0.03
1 lspci 0.00 0.02
82 sh 0.00 0.00
13 gcc 0.00 0.00
11 gsettings 0.00 0.00
8 stat 0.00 0.00
8 systemd-detect- 0.00 0.00
6 llvm-link 0.00 0.00
5 phoronix-test-s 0.00 0.00
3 gcrypt 0.00 0.00
3 gmain 0.00 0.00
2 cc 0.00 0.00
2 dconf worker 0.00 0.00
2 lscpu 0.00 0.00
2 uname 0.00 0.00
2 which 0.00 0.00
2 xset 0.00 0.00
1 date 0.00 0.00
1 dirname 0.00 0.00
1 dmesg 0.00 0.00
1 dmidecode 0.00 0.00
1 grep 0.00 0.00
1 ifconfig 0.00 0.00
1 ip 0.00 0.00
1 lsmod 0.00 0.00
1 mktemp 0.00 0.00
1 ps 0.00 0.00
1 qdbus 0.00 0.00
1 readlink 0.00 0.00
1 realpath 0.00 0.00
1 sed 0.00 0.00
1 sort 0.00 0.00
1 stty 0.00 0.00
1 systemctl 0.00 0.00
1 template.sh 0.00 0.00
1 wc 0.00 0.00
1 xrandr 0.00 0.00
0 processes running
47 maximum processes
Core computation pieces
823837) gcrypt cpu=1 start=5.64 finish=177.28
823838) benchmark cpu=11 start=5.64 finish=177.28
823842) gcrypt cpu=1 start=181.28 finish=352.58
823843) benchmark cpu=10 start=181.28 finish=352.58
823967) gcrypt cpu=1 start=356.58 finish=528.03
823968) benchmark cpu=10 start=356.59 finish=528.03
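The start and finish times above give the length of each run. The sketch below parses the lines with a regular expression and sums the benchmark durations. The pattern and the helper name `parse_runs` are illustrative assumptions based on the line layout shown here, not part of the tool that produced the listing.

```python
import re

# Lines copied from the "Core computation pieces" listing above.
LOG = """\
823837) gcrypt cpu=1 start=5.64 finish=177.28
823838) benchmark cpu=11 start=5.64 finish=177.28
823842) gcrypt cpu=1 start=181.28 finish=352.58
823843) benchmark cpu=10 start=181.28 finish=352.58
823967) gcrypt cpu=1 start=356.58 finish=528.03
823968) benchmark cpu=10 start=356.59 finish=528.03
"""

# Assumed layout: "<pid>) <name> cpu=<n> start=<sec> finish=<sec>"
PATTERN = re.compile(r"(\d+)\) (\S+) cpu=(\d+) start=([\d.]+) finish=([\d.]+)")


def parse_runs(text):
    """Return one dict per matching line, with a computed duration in seconds."""
    runs = []
    for line in text.splitlines():
        match = PATTERN.match(line.strip())
        if not match:
            continue  # skip anything that does not follow the assumed layout
        pid, name, cpu, start, finish = match.groups()
        runs.append({
            "pid": int(pid),
            "name": name,
            "cpu": int(cpu),
            "start": float(start),
            "finish": float(finish),
            "duration": float(finish) - float(start),
        })
    return runs


runs = parse_runs(LOG)
bench = [run for run in runs if run["name"] == "benchmark"]
total = sum(run["duration"] for run in bench)

for run in bench:
    print(f"pid {run['pid']}: {run['duration']:.2f} s")
print(f"total benchmark time: {total:.2f} s")
```

Each of the three benchmark runs lasts a little over 171 seconds, and the total of about 514.4 seconds is close to the 514.36 reported for benchmark in the process time listing.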
