Open source library for quantitative finance written in C++. The benchmark has two workloads: the first runs on all 16 cores and the second uses only two cores (although it is listed as single-threaded).

This code has a high retirement rate and a low number of frontend stalls.

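AMD metrics show that on average we run on about half the cores (8.66 of 16), with a moderate amount of floating point (about 132 float per 1000 instructions, almost all of it 128-bit), few L2 misses (6.74%) and a high retirement rate (39.3% of all slots, 63.0% of the slots not lost to SMT contention).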
elapsed 239.451
on_cpu 0.541 # 8.66 / 16 cores
utime 2065.423
stime 7.805
nvcsw 4633 # 14.38%
nivcsw 27581 # 85.62%
inblock 0 # 0.00/sec
onblock 13160 # 54.96/sec
cpu-clock 2073306190419 # 2073.306 seconds
task-clock 2073321140694 # 2073.321 seconds
page faults 2947742 # 1421.749/sec
context switches 33177 # 16.002/sec
cpu migrations 1346 # 0.649/sec
major page faults 82 # 0.040/sec
minor page faults 2947660 # 1421.709/sec
alignment faults 0 # 0.000/sec
emulation faults 0 # 0.000/sec
branches 2559158486088 # 137.454 branches per 1000 inst
branch misses 4546709879 # 0.18% branch miss
conditional 1792036393402 # 96.251 conditional branches per 1000 inst
indirect 296331568425 # 15.916 indirect branches per 1000 inst
cpu-cycles 8147071650405 # 2.12 GHz
instructions 18623586468548 # 2.29 IPC
slots 16294264396806 #
retiring 6411101162579 # 39.3% (63.0%)
-- ucode 51614158594 # 0.3%
-- fastpath 6359487003985 # 39.0%
frontend 973719191859 # 6.0% ( 9.6%)
-- latency 661631081958 # 4.1%
-- bandwidth 312088109901 # 1.9%
backend 2670978253953 # 16.4% (26.3%)
-- cpu 1852113077242 # 11.4%
-- memory 818865176711 # 5.0%
speculation 112642574434 # 0.7% ( 1.1%)
-- branch mispredict 89165824746 # 0.5%
-- pipeline restart 23476749688 # 0.1%
smt-contention 6125810500089 # 37.6% ( 0.0%)
cpu-cycles 8153806283473 # 2.12 GHz
instructions 18614267485562 # 2.28 IPC
instructions 6204973623074 # 13.313 l2 access per 1000 inst
l2 hit from l1 65552599571 # 6.74% l2 miss
l2 miss from l1 1533617239 #
l2 hit from l2 pf 13019558189 #
l3 hit from l2 pf 3987750228 #
l3 miss from l2 pf 45252945 #
instructions 6207650287054 # 131.654 float per 1000 inst
float 512 107 # 0.000 AVX-512 per 1000 inst
float 256 77880 # 0.000 AVX-256 per 1000 inst
float 128 817261944049 # 131.654 AVX-128 per 1000 inst
float MMX 0 # 0.000 MMX per 1000 inst
float scalar 32 # 0.000 scalar per 1000 inst
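The two percentage columns in the AMD top-down block can be reproduced from the raw slot counts. The following is a minimal sketch in Python, assuming the first column is each category divided by total slots and the parenthesised column divides by the slots that remain after subtracting smt-contention. That second reading is inferred from the arithmetic rather than stated by the tool, and the names `topdown_percentages`, `SLOTS`, and `CATEGORIES` are illustrative only.

```python
# Raw slot counts copied from the AMD top-down block.
SLOTS = 16294264396806
SMT_CONTENTION = 6125810500089
CATEGORIES = {
    "retiring": 6411101162579,
    "frontend": 973719191859,
    "backend": 2670978253953,
    "speculation": 112642574434,
}


def topdown_percentages(categories, slots, smt_contention):
    """Return {name: (pct_of_all_slots, pct_of_non_smt_slots)}."""
    # Assumption: the parenthesised column excludes slots lost to SMT contention.
    usable = slots - smt_contention
    return {
        name: (round(100.0 * count / slots, 1), round(100.0 * count / usable, 1))
        for name, count in categories.items()
    }


if __name__ == "__main__":
    table = topdown_percentages(CATEGORIES, SLOTS, SMT_CONTENTION)
    for name, (raw, adjusted) in table.items():
        print(f"{name:12s} {raw:5.1f}% ({adjusted:5.1f}%)")
```

With these inputs the sketch gives 39.3% and 63.0% for retiring, matching both columns of the dump, which is why the retirement rate is described as high even though only about 39% of all slots retire.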
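Intel metrics show that on average we run on about 12 of the 16 cores (12.44), with an 80.2% retirement rate, 14.9% frontend stalls and a 7.37% L2 miss rate.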
elapsed 976.488
on_cpu 0.778 # 12.44 / 16 cores
utime 12129.067
stime 19.617
nvcsw 12945 # 9.55%
nivcsw 122641 # 90.45%
inblock 52656 # 53.92/sec
onblock 2240 # 2.29/sec
cpu-clock 12148886067864 # 12148.886 seconds
task-clock 12148922790298 # 12148.923 seconds
page faults 11001944 # 905.590/sec
context switches 140037 # 11.527/sec
cpu migrations 3387 # 0.279/sec
major page faults 594 # 0.049/sec
minor page faults 11001350 # 905.541/sec
alignment faults 0 # 0.000/sec
emulation faults 0 # 0.000/sec
branches 12200124995178 # 137.798 branches per 1000 inst
branch misses 25542654236 # 0.21% branch miss
conditional 12200125023466 # 137.798 conditional branches per 1000 inst
indirect 3350072945374 # 37.838 indirect branches per 1000 inst
slots 60608212198784 #
retiring 48586807424801 # 80.2% (80.2%)
-- ucode 3956469197097 # 6.5%
-- fastpath 44630338227704 # 73.6%
frontend 9060412040521 # 14.9% (14.9%)
-- latency 3984241307781 # 6.6%
-- bandwidth 5076170732740 # 8.4%
backend 2034236705457 # 3.4% ( 3.4%)
-- cpu 1227430698305 # 2.0%
-- memory 806806007152 # 1.3%
speculation 1657739725109 # 2.7% ( 2.7%)
-- branch mispredict 1471736909095 # 2.4%
-- pipeline restart 186002816014 # 0.3%
smt-contention 0 # 0.0% ( 0.0%)
cpu-cycles 16929758726185 # 1.70 GHz
instructions 44769453310654 # 2.64 IPC
l2 access 241331604127 # 8.271 l2 access per 1000 inst
l2 miss 17781280151 # 7.37% l2 miss
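The on_cpu and IPC figures in both dumps follow from a handful of raw values. The sketch below is a hedged illustration in Python, assuming on_cpu is (utime + stime) / elapsed divided by the 16 cores and IPC is instructions divided by cpu-cycles from the same counter group. These formulas are inferred because they reproduce the printed numbers, and the function and variable names are hypothetical.

```python
def busy_cores(utime, stime, elapsed):
    """Average number of cores kept busy over the run."""
    return (utime + stime) / elapsed


def on_cpu(utime, stime, elapsed, cores=16):
    """Fraction of the machine's cores in use on average."""
    return busy_cores(utime, stime, elapsed) / cores


def ipc(instructions, cycles):
    """Instructions retired per CPU cycle."""
    return instructions / cycles


# Values copied from the AMD and Intel dumps (first cycles/instructions pair on AMD).
AMD = dict(elapsed=239.451, utime=2065.423, stime=7.805,
           cycles=8147071650405, instructions=18623586468548)
INTEL = dict(elapsed=976.488, utime=12129.067, stime=19.617,
             cycles=16929758726185, instructions=44769453310654)

if __name__ == "__main__":
    for name, run in (("AMD", AMD), ("Intel", INTEL)):
        cores = busy_cores(run["utime"], run["stime"], run["elapsed"])
        frac = on_cpu(run["utime"], run["stime"], run["elapsed"])
        rate = ipc(run["instructions"], run["cycles"])
        print(f"{name}: {cores:.2f} / 16 cores, on_cpu {frac:.3f}, IPC {rate:.2f}")
```

Computing both systems side by side makes the utilisation gap visible: about 8.66 busy cores on AMD versus 12.44 on Intel.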
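The process structure is straightforward: the quantlib-benchm processes dominate, and the rest are short-lived system-information and setup utilities.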
452 processes
102 quantlib-benchm 2058.50 6.04
68 clinfo 16.59 5.98
38 vulkaninfo 0.57 1.68
6 glxinfo:gdrv0 0.14 0.07
6 php 0.07 0.09
6 clang 0.07 0.05
2 glxinfo 0.07 0.03
2 glxinfo:cs0 0.07 0.03
2 glxinfo:disk$0 0.07 0.03
2 glxinfo:sh0 0.07 0.03
2 glxinfo:shlo0 0.07 0.03
4 vulkani:disk$0 0.06 0.17
2 llvmpipe-0 0.03 0.09
2 llvmpipe-1 0.03 0.09
2 llvmpipe-10 0.03 0.09
2 llvmpipe-11 0.03 0.09
2 llvmpipe-12 0.03 0.09
2 llvmpipe-13 0.03 0.09
2 llvmpipe-14 0.03 0.09
2 llvmpipe-15 0.03 0.09
2 llvmpipe-2 0.03 0.09
2 llvmpipe-3 0.03 0.09
2 llvmpipe-4 0.03 0.09
2 llvmpipe-5 0.03 0.09
2 llvmpipe-6 0.03 0.09
2 llvmpipe-7 0.03 0.09
2 llvmpipe-8 0.03 0.09
2 llvmpipe-9 0.03 0.09
3 rocminfo 0.03 0.00
1 lspci 0.00 0.02
1 ps 0.00 0.01
84 sh 0.00 0.00
13 gcc 0.00 0.00
11 gsettings 0.00 0.00
8 stat 0.00 0.00
8 systemd-detect- 0.00 0.00
6 llvm-link 0.00 0.00
6 quantlib 0.00 0.00
5 phoronix-test-s 0.00 0.00
4 gmain 0.00 0.00
2 cc 0.00 0.00
2 lscpu 0.00 0.00
2 uname 0.00 0.00
2 which 0.00 0.00
2 xset 0.00 0.00
1 date 0.00 0.00
1 dconf worker 0.00 0.00
1 dirname 0.00 0.00
1 dmesg 0.00 0.00
1 dmidecode 0.00 0.00
1 grep 0.00 0.00
1 ifconfig 0.00 0.00
1 ip 0.00 0.00
1 lsmod 0.00 0.00
1 mktemp 0.00 0.00
1 qdbus 0.00 0.00
1 readlink 0.00 0.00
1 realpath 0.00 0.00
1 sed 0.00 0.00
1 sort 0.00 0.00
1 stty 0.00 0.00
1 systemctl 0.00 0.00
1 template.sh 0.00 0.00
1 wc 0.00 0.00
1 xrandr 0.00 0.00
0 processes running
47 maximum processes
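The core parallel code sections. The threads listed here all start together (start about 5.88) and finish together (finish about 52.15):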
12611) quantlib cpu=9 start=5.87 finish=52.15
12612) quantlib-benchm cpu=12 start=5.87 finish=52.15
12613) quantlib-benchm cpu=4 start=5.88 finish=52.14
12615) quantlib-benchm cpu=0 start=5.88 finish=52.14
12614) quantlib-benchm cpu=0 start=5.88 finish=52.14
12617) quantlib-benchm cpu=0 start=5.88 finish=52.14
12616) quantlib-benchm cpu=10 start=5.88 finish=52.15
12619) quantlib-benchm cpu=9 start=5.88 finish=52.14
12618) quantlib-benchm cpu=2 start=5.88 finish=52.14
12621) quantlib-benchm cpu=2 start=5.88 finish=52.14
12620) quantlib-benchm cpu=5 start=5.88 finish=52.15
12623) quantlib-benchm cpu=5 start=5.88 finish=52.14
12622) quantlib-benchm cpu=14 start=5.88 finish=52.15
12625) quantlib-benchm cpu=15 start=5.88 finish=52.14
12624) quantlib-benchm cpu=4 start=5.88 finish=52.15
12627) quantlib-benchm cpu=11 start=5.88 finish=52.14
12626) quantlib-benchm cpu=13 start=5.88 finish=52.15
12629) quantlib-benchm cpu=3 start=5.88 finish=52.14
12628) quantlib-benchm cpu=6 start=5.88 finish=52.14
12631) quantlib-benchm cpu=6 start=5.88 finish=52.14
12630) quantlib-benchm cpu=4 start=5.88 finish=52.14
12632) quantlib-benchm cpu=13 start=5.88 finish=52.14
12633) quantlib-benchm cpu=7 start=5.88 finish=52.15
12635) quantlib-benchm cpu=7 start=5.88 finish=52.14
12634) quantlib-benchm cpu=8 start=5.88 finish=52.14
12637) quantlib-benchm cpu=4 start=5.88 finish=52.14
12636) quantlib-benchm cpu=10 start=5.88 finish=52.14
12639) quantlib-benchm cpu=10 start=5.88 finish=52.14
12638) quantlib-benchm cpu=8 start=5.88 finish=52.14
12641) quantlib-benchm cpu=8 start=5.88 finish=52.14
12640) quantlib-benchm cpu=4 start=5.88 finish=52.15
12643) quantlib-benchm cpu=14 start=5.89 finish=52.14
12642) quantlib-benchm cpu=10 start=5.88 finish=52.15
12644) quantlib-benchm cpu=1 start=5.89 finish=52.14
