Botan is a cryptography library. The benchmark has six workloads covering different algorithms, and all of them appear to be single-threaded.

The topdown profile varies across the workloads: two workloads retire in the 70% range, while backend stalls limit the others. Frontend stalls are low except in the second workload and for a brief period at the start of each run.

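AMD metrics confirm this is single-threaded (on_cpu is 0.84 of one core out of 16) with almost no L2 access (0.116 per 1000 instructions). Backend stalls (39.6%) are split between cpu (16.2%) and memory (23.4%). There is a light amount of floating point, almost all of it 128-bit (about 98 operations per 1000 instructions).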
elapsed 669.399
on_cpu 0.053 # 0.84 / 16 cores
utime 562.839
stime 0.988
nvcsw 2105 # 44.72%
nivcsw 2602 # 55.28%
inblock 0 # 0.00/sec
onblock 16152 # 24.13/sec
cpu-clock 563924791974 # 563.925 seconds
task-clock 563932739166 # 563.933 seconds
page faults 166765 # 295.718/sec
context switches 7838 # 13.899/sec
cpu migrations 367 # 0.651/sec
major page faults 2 # 0.004/sec
minor page faults 166763 # 295.714/sec
alignment faults 0 # 0.000/sec
emulation faults 0 # 0.000/sec
branches 139814477227 # 18.870 branches per 1000 inst
branch misses 203665490 # 0.15% branch miss
conditional 97115995872 # 13.107 conditional branches per 1000 inst
indirect 13663343420 # 1.844 indirect branches per 1000 inst
cpu-cycles 2610654455476 # 0.24 GHz
instructions 7394582566327 # 2.83 IPC
slots 5229705578412 #
retiring 2826810653088 # 54.1% (54.1%) high
-- ucode 8993481439 # 0.2%
-- fastpath 2817817171649 # 53.9%
frontend 325117530676 # 6.2% ( 6.2%)
-- latency 280235511756 # 5.4%
-- bandwidth 44882018920 # 0.9%
backend 2072076508020 # 39.6% (39.6%)
-- cpu 847158851484 # 16.2%
-- memory 1224917656536 # 23.4%
speculation 5536343464 # 0.1% ( 0.1%) low
-- branch mispredict 4117725817 # 0.1%
-- pipeline restart 1418617647 # 0.0%
smt-contention 164221719 # 0.0% ( 0.0%)
cpu-cycles 3878464228704 # 0.25 GHz
instructions 10719706960041 # 2.76 IPC
instructions 3578967705036 # 0.116 l2 access per 1000 inst
l2 hit from l1 394836268 # 6.20% l2 miss
l2 miss from l1 15707758 #
l2 hit from l2 pf 11116615 #
l3 hit from l2 pf 4545213 #
l3 miss from l2 pf 5561337 #
instructions 3572012090278 # 98.084 float per 1000 inst
float 512 86 # 0.000 AVX-512 per 1000 inst
float 256 664 # 0.000 AVX-256 per 1000 inst
float 128 350357518379 # 98.084 AVX-128 per 1000 inst
float MMX 0 # 0.000 MMX per 1000 inst
float scalar 0 # 0.000 scalar per 1000 inst
instructions 2688067 #
opcache 984278 # 366.166 opcache per 1000 inst
opcache miss 525590 # 53.4% opcache miss rate
l1 dTLB miss 6562 # 2.441 L1 dTLB per 1000 inst
l2 dTLB miss 1164 # 0.433 L2 dTLB per 1000 inst
instructions 2703815 #
icache 1321789 # 488.861 icache per 1000 inst
icache miss 112783 # 8.5% icache miss rate
l1 iTLB miss 10 # 0.004 L1 iTLB per 1000 inst
l2 iTLB miss 0 # 0.000 L2 iTLB per 1000 inst
tlb flush 19 # 0.007 TLB flush per 1000 inst
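
The derived figures quoted for the AMD run can be re-computed from the raw counters. The sketch below is a minimal illustration, not the profiler's own code: the formulas (on_cpu as busy time divided by elapsed time and core count, IPC as instructions over cycles, topdown shares as a fraction of slots) are assumptions inferred from the numbers above, and the function names are hypothetical.

```python
# Values copied from the AMD run above.
ELAPSED, UTIME, STIME = 669.399, 562.839, 0.988
CORES = 16
CYCLES = 2610654455476
INSTRUCTIONS = 7394582566327
SLOTS = 5229705578412
TOPDOWN = {
    "retiring": 2826810653088,
    "frontend": 325117530676,
    "backend": 2072076508020,
    "speculation": 5536343464,
}


def cores_busy(utime, stime, elapsed):
    """Average number of cores kept busy (assumed formula)."""
    return (utime + stime) / elapsed


def slot_shares(counts, slots):
    """Express each topdown category as a percentage of all slots."""
    return {name: 100.0 * value / slots for name, value in counts.items()}


busy = cores_busy(UTIME, STIME, ELAPSED)
on_cpu = busy / CORES          # fraction of the whole 16-core machine
ipc = INSTRUCTIONS / CYCLES    # instructions retired per cycle
shares = slot_shares(TOPDOWN, SLOTS)

print(f"busy cores {busy:.2f}, on_cpu {on_cpu:.3f}, IPC {ipc:.2f}")
for name, pct in shares.items():
    print(f"{name:12s} {pct:5.1f}%")
```

A busy-core value just under 1.0 is what a single-threaded run looks like on a 16-core machine, which is why on_cpu itself is so small.
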
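Intel metrics confirm memory access is minimal: 0.054 L2 accesses per 1000 instructions, with only 3.7% of slots memory-bound. Backend stalls (46.1%) are almost entirely cpu-bound (42.4%).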
elapsed 660.876
on_cpu 0.053 # 0.84 / 16 cores
utime 556.986
stime 0.629
nvcsw 2047 # 42.81%
nivcsw 2735 # 57.19%
inblock 10496 # 15.88/sec
onblock 4800 # 7.26/sec
cpu-clock 557684948455 # 557.685 seconds
task-clock 557691700152 # 557.692 seconds
page faults 156021 # 279.762/sec
context switches 7880 # 14.130/sec
cpu migrations 392 # 0.703/sec
major page faults 52 # 0.093/sec
minor page faults 155969 # 279.669/sec
alignment faults 0 # 0.000/sec
emulation faults 0 # 0.000/sec
branches 122073618847 # 19.314 branches per 1000 inst
branch misses 175355007 # 0.14% branch miss
conditional 122073632095 # 19.314 conditional branches per 1000 inst
indirect 12056510177 # 1.908 indirect branches per 1000 inst
slots 12677475659618 #
retiring 6680431127271 # 52.7% (52.7%)
-- ucode 450464298592 # 3.6%
-- fastpath 6229966828679 # 49.1%
frontend 121467626633 # 1.0% ( 1.0%) low
-- latency 37175507861 # 0.3%
-- bandwidth 84292118772 # 0.7%
backend 5843301211811 # 46.1% (46.1%)
-- cpu 5372679687169 # 42.4%
-- memory 470621524642 # 3.7%
speculation 27230529716 # 0.2% ( 0.2%) low
-- branch mispredict 27046269202 # 0.2%
-- pipeline restart 184260514 # 0.0%
smt-contention 0 # 0.0% ( 0.0%)
cpu-cycles 2113556449501 # 0.20 GHz
instructions 6324122251604 # 2.99 IPC
l2 access 340667126 # 0.054 l2 access per 1000 inst
l2 miss 103730719 # 30.45% l2 miss
cpu-cycles 2113192432479 # 4.9% memory latency
load stalls 102748763011 # 4.8% l1 bound
l1 miss 1018223889 # 0.0% l2 bound
l2 miss 480434465 # 0.0% l3 bound
l3 miss 314612452 # 0.0% dram bound
store_stalls 122368374 # 0.0% store bound
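
The cache figures in the Intel run are normalized per 1000 instructions, which keeps a high miss rate in perspective. The following sketch re-derives them from the raw counters; the helper names are hypothetical and the formulas are the usual definitions, assumed rather than taken from the profiler.

```python
def per_kilo_inst(events, instructions):
    """Events per 1000 retired instructions."""
    return 1000.0 * events / instructions


def miss_rate_pct(misses, accesses):
    """Misses as a percentage of accesses."""
    return 100.0 * misses / accesses


# Values copied from the Intel run above.
INSTRUCTIONS = 6324122251604
L2_ACCESS = 340667126
L2_MISS = 103730719

l2_access_pki = per_kilo_inst(L2_ACCESS, INSTRUCTIONS)
l2_miss_pct = miss_rate_pct(L2_MISS, L2_ACCESS)
l2_miss_pki = per_kilo_inst(L2_MISS, INSTRUCTIONS)

print(f"L2 access per 1000 inst: {l2_access_pki:.3f}")
print(f"L2 miss rate:            {l2_miss_pct:.2f}%")
print(f"L2 miss per 1000 inst:   {l2_miss_pki:.3f}")
```

The roughly 30% L2 miss rate applies to a very small number of accesses, so the misses per 1000 instructions stay negligible.
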
The process summary shows botan is the primary process:
394 processes
36 botan 561.84 0.00
68 clinfo 20.17 3.25
38 vulkaninfo 0.76 1.52
6 glxinfo:gdrv0 0.15 0.04
6 glxinfo:gl0 0.15 0.04
6 php 0.08 0.21
4 vulkani:disk$0 0.08 0.16
2 glxinfo 0.07 0.02
2 glxinfo:cs0 0.07 0.02
2 glxinfo:disk$0 0.07 0.02
2 glxinfo:sh0 0.07 0.02
2 glxinfo:shlo0 0.07 0.02
6 clang 0.04 0.08
2 llvmpipe-0 0.04 0.08
2 llvmpipe-1 0.04 0.08
2 llvmpipe-10 0.04 0.08
2 llvmpipe-11 0.04 0.08
2 llvmpipe-12 0.04 0.08
2 llvmpipe-13 0.04 0.08
2 llvmpipe-14 0.04 0.08
2 llvmpipe-15 0.04 0.08
2 llvmpipe-2 0.04 0.08
2 llvmpipe-3 0.04 0.08
2 llvmpipe-4 0.04 0.08
2 llvmpipe-5 0.04 0.08
2 llvmpipe-6 0.04 0.08
2 llvmpipe-7 0.04 0.08
2 llvmpipe-8 0.04 0.08
2 llvmpipe-9 0.04 0.08
3 rocminfo 0.03 0.00
1 lspci 0.00 0.02
1 ps 0.00 0.01
92 sh 0.00 0.00
13 gcc 0.00 0.00
12 gsettings 0.00 0.00
8 stat 0.00 0.00
8 systemd-detect- 0.00 0.00
6 llvm-link 0.00 0.00
5 phoronix-test-s 0.00 0.00
3 gmain 0.00 0.00
2 cc 0.00 0.00
2 lscpu 0.00 0.00
2 uname 0.00 0.00
2 which 0.00 0.00
2 xset 0.00 0.00
1 date 0.00 0.00
1 dconf worker 0.00 0.00
1 dirname 0.00 0.00
1 dmesg 0.00 0.00
1 dmidecode 0.00 0.00
1 grep 0.00 0.00
1 ifconfig 0.00 0.00
1 ip 0.00 0.00
1 lsmod 0.00 0.00
1 mktemp 0.00 0.00
1 qdbus 0.00 0.00
1 readlink 0.00 0.00
1 realpath 0.00 0.00
1 sed 0.00 0.00
1 sort 0.00 0.00
1 stty 0.00 0.00
1 systemctl 0.00 0.00
1 template.sh 0.00 0.00
1 wc 0.00 0.00
1 xrandr 0.00 0.00
0 processes running
47 maximum processes
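
To quantify how dominant botan is, the summary rows can be parsed and the first time column totalled. This is only a sketch: the columns are unlabeled in the listing, so treating them as count, name, and two time values is an assumption, `parse_row` is a hypothetical helper, and only a few rows are included here, so the share covers those rows only.

```python
def parse_row(line):
    """Split 'count name time_a time_b'; the name may contain spaces."""
    fields = line.split()
    count = int(fields[0])
    time_a, time_b = float(fields[-2]), float(fields[-1])
    name = " ".join(fields[1:-2])
    return name, count, time_a, time_b


# A few rows copied from the process summary above.
SAMPLE = """\
36 botan 561.84 0.00
68 clinfo 20.17 3.25
38 vulkaninfo 0.76 1.52
1 dconf worker 0.00 0.00
"""

rows = [parse_row(line) for line in SAMPLE.splitlines() if line.strip()]
total = sum(row[2] for row in rows)
share = {row[0]: row[2] / total for row in rows}

for name, fraction in share.items():
    print(f"{name:14s} {100 * fraction:6.2f}%")
```

Taking the numeric fields from the right keeps names such as "dconf worker" intact.
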
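The computation structure is simple: the listed botan runs execute one after another, each lasting about 30 seconds, with about 4 seconds between runs.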
723566) botan cpu=1 start=5.75 finish=35.85
723567) botan cpu=6 start=5.75 finish=35.85
723569) botan cpu=6 start=39.85 finish=69.95
723570) botan cpu=7 start=39.86 finish=69.95
723571) botan cpu=13 start=73.96 finish=104.06
723572) botan cpu=14 start=73.96 finish=104.06
723573) sh cpu=14 start=104.06 finish=104.06
723574) sh cpu=7 start=104.06 finish=104.06
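
The run lengths can be read straight off the listing above. The sketch below parses those lines and computes each botan entry's duration; the regular expression and the assumption that start and finish are seconds since the beginning of the trace are mine, not documented behaviour of the tool.

```python
import re

# Lines copied from the listing above.
TRACE = """\
723566) botan cpu=1 start=5.75 finish=35.85
723567) botan cpu=6 start=5.75 finish=35.85
723569) botan cpu=6 start=39.85 finish=69.95
723570) botan cpu=7 start=39.86 finish=69.95
723571) botan cpu=13 start=73.96 finish=104.06
723572) botan cpu=14 start=73.96 finish=104.06
723573) sh cpu=14 start=104.06 finish=104.06
723574) sh cpu=7 start=104.06 finish=104.06
"""

LINE = re.compile(
    r"(\d+)\)\s+(\S+)\s+cpu=(\d+)\s+start=([\d.]+)\s+finish=([\d.]+)"
)


def parse(trace):
    """Return one dict per process line in the trace."""
    entries = []
    for match in LINE.finditer(trace):
        pid, name, cpu, start, finish = match.groups()
        entries.append({
            "pid": int(pid),
            "name": name,
            "cpu": int(cpu),
            "start": float(start),
            "finish": float(finish),
        })
    return entries


entries = parse(TRACE)
durations = [
    round(e["finish"] - e["start"], 2) for e in entries if e["name"] == "botan"
]
print(durations)
```

Every botan entry runs for about 30.1 seconds, which suggests each workload is run for a fixed time rather than a fixed amount of work.
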
