A test of Bayesian analysis (MrBayes) with very high IPC and retirement rate. It is also a case where my AMD chip is more than 2x faster than my Intel chip (about 270 seconds elapsed versus 588). Overall, about half of the 16 cores are in use.
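
As a quick check of those two claims, here is a minimal sketch that recomputes the speedup and the core utilization from the elapsed, utime and stime lines of the two runs. It assumes that on_cpu is (utime + stime) / elapsed / cores, which is my reading of the output and not something the tool documents here; the names `RUNS`, `busy_cores` and `on_cpu` are made up for the example.

```python
# Values copied from the elapsed/utime/stime lines of the two runs.
RUNS = {
    "AMD": {"elapsed": 269.927, "utime": 1977.824, "stime": 21.480},
    "Intel": {"elapsed": 588.389, "utime": 5356.216, "stime": 19.068},
}
NUM_CORES = 16  # from the "/ 16 cores" note on the on_cpu lines


def busy_cores(run):
    """Average number of busy cores: total CPU time divided by wall time."""
    return (run["utime"] + run["stime"]) / run["elapsed"]


def on_cpu(run, cores=NUM_CORES):
    """Fraction of the machine in use (assumed definition of on_cpu)."""
    return busy_cores(run) / cores


# Ratio of wall-clock times: how much faster the AMD run finished.
speedup = RUNS["Intel"]["elapsed"] / RUNS["AMD"]["elapsed"]

for name, run in RUNS.items():
    print(f"{name}: {busy_cores(run):.2f} cores busy, on_cpu {on_cpu(run):.3f}")
print(f"AMD speedup over Intel: {speedup:.2f}x")
```

Under that assumption the sketch reproduces the 7.41 and 9.14 busy cores shown in the on_cpu lines.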

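Topdown metrics highlight a high retirement rate: more than 56% of slots retire on both chips. The backend stalls come mostly from the CPU core rather than from memory (about 18% versus 3% of slots on AMD, and 19% versus 2% on Intel).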
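
The backend split can be recomputed from the raw slot counts. This is a small sketch, assuming each topdown percentage is simply the category's slot count divided by the total slots line; the dictionary layout and the `share` helper are hypothetical.

```python
# Slot counts copied from the topdown sections of the two runs.
TOPDOWN = {
    "AMD": {
        "slots": 16157107428102,
        "backend": 3449463140862,
        "backend_cpu": 2951291320114,
        "backend_memory": 498171820748,
    },
    "Intel": {
        "slots": 43147684016834,
        "backend": 8926440911379,
        "backend_cpu": 8234468738720,
        "backend_memory": 691972172659,
    },
}


def share(counts, key):
    """Percentage of all issue slots attributed to one category."""
    return 100.0 * counts[key] / counts["slots"]


for chip, counts in TOPDOWN.items():
    cpu = share(counts, "backend_cpu")
    mem = share(counts, "backend_memory")
    # Fraction of the backend stalls that are core-bound, not memory-bound.
    core_bound = counts["backend_cpu"] / counts["backend"]
    print(f"{chip}: backend cpu {cpu:.1f}%, memory {mem:.1f}%, "
          f"core-bound share of backend {core_bound:.0%}")
```

On both chips well over 80% of the backend-stalled slots are attributed to the core.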

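The AMD metrics show floating-point-heavy code (about 201 floating-point operations per 1000 instructions, all of them 128-bit) and low L2 access (about 19 accesses per 1000 instructions). I expect this code runs mostly inside the smaller caches.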
AMD metrics
elapsed 269.927
on_cpu 0.463 # 7.41 / 16 cores
utime 1977.824
stime 21.480
nvcsw 85907 # 93.51%
nivcsw 5958 # 6.49%
inblock 0 # 0.00/sec
onblock 304448 # 1127.89/sec
cpu-clock 1999166570466 # 1999.167 seconds
task-clock 1999209718504 # 1999.210 seconds
page faults 204079 # 102.080/sec
context switches 93020 # 46.528/sec
cpu migrations 3757 # 1.879/sec
major page faults 21 # 0.011/sec
minor page faults 204058 # 102.069/sec
alignment faults 0 # 0.000/sec
emulation faults 0 # 0.000/sec
branches 3162931731939 # 118.634 branches per 1000 inst
branch misses 25621198898 # 0.81% branch miss
conditional 1856981848725 # 69.651 conditional branches per 1000 inst
indirect 417142083066 # 15.646 indirect branches per 1000 inst
cpu-cycles 8074389261324 # 1.87 GHz
instructions 26661505226380 # 3.30 IPC
slots 16157107428102 #
retiring 9088398115318 # 56.3% (56.3%)
-- ucode 1179133273 # 0.0%
-- fastpath 9087218982045 # 56.2%
frontend 3084602572199 # 19.1% (19.1%)
-- latency 1923637763226 # 11.9%
-- bandwidth 1160964808973 # 7.2%
backend 3449463140862 # 21.3% (21.4%)
-- cpu 2951291320114 # 18.3%
-- memory 498171820748 # 3.1%
speculation 512392460847 # 3.2% ( 3.2%)
-- branch mispredict 506025917807 # 3.1%
-- pipeline restart 6366543040 # 0.0%
smt-contention 22220399884 # 0.1% ( 0.0%)
cpu-cycles 8098416904667 # 1.86 GHz
instructions 26659435250290 # 3.29 IPC
instructions 8890128447153 # 18.809 l2 access per 1000 inst
l2 hit from l1 144952426458 # 1.71% l2 miss
l2 miss from l1 1287365557 #
l2 hit from l2 pf 20689080276 #
l3 hit from l2 pf 1567049370 #
l3 miss from l2 pf 7639525 #
instructions 8886841340745 # 201.198 float per 1000 inst
float 512 66 # 0.000 AVX-512 per 1000 inst
float 256 2078 # 0.000 AVX-256 per 1000 inst
float 128 1788012564286 # 201.198 AVX-128 per 1000 inst
float MMX 0 # 0.000 MMX per 1000 inst
float scalar 0 # 0.000 scalar per 1000 inst
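
The floating-point rate in the AMD output above can be reproduced from the raw counts. A minimal sketch, assuming the "per 1000 inst" figure is each float counter divided by the instructions count printed just above the float lines; `FLOAT_COUNTS` and `per_1000` are names invented for the example.

```python
# Counts copied from the float section of the AMD output.
INSTRUCTIONS = 8886841340745
FLOAT_COUNTS = {
    "512": 66,
    "256": 2078,
    "128": 1788012564286,
    "MMX": 0,
    "scalar": 0,
}


def per_1000(count, instructions=INSTRUCTIONS):
    """Events per 1000 retired instructions."""
    return 1000.0 * count / instructions


total_rate = per_1000(sum(FLOAT_COUNTS.values()))
# Share of all counted float operations that are 128-bit wide.
share_128 = FLOAT_COUNTS["128"] / sum(FLOAT_COUNTS.values())

for width, count in FLOAT_COUNTS.items():
    print(f"float {width}: {per_1000(count):.3f} per 1000 inst")
print(f"total: {total_rate:.3f} per 1000 inst, 128-bit share {share_128:.6f}")
```

Practically all of the counted floating-point work is 128-bit; the 256-bit and 512-bit counters are too small to change the total.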
Intel metrics
elapsed 588.389
on_cpu 0.571 # 9.14 / 16 cores
utime 5356.216
stime 19.068
nvcsw 125957 # 90.00%
nivcsw 14002 # 10.00%
inblock 6480 # 11.01/sec
onblock 439592 # 747.11/sec
cpu-clock 5375028742520 # 5375.029 seconds
task-clock 5375084266375 # 5375.084 seconds
page faults 188727 # 35.111/sec
context switches 142691 # 26.547/sec
cpu migrations 4349 # 0.809/sec
major page faults 87 # 0.016/sec
minor page faults 188640 # 35.095/sec
alignment faults 0 # 0.000/sec
emulation faults 0 # 0.000/sec
branches 4764011541426 # 118.508 branches per 1000 inst
branch misses 56769803842 # 1.19% branch miss
conditional 4764011554770 # 118.508 conditional branches per 1000 inst
indirect 1535012956749 # 38.184 indirect branches per 1000 inst
slots 43147684016834 #
retiring 25314831849978 # 58.7% (58.7%)
-- ucode 1933281821437 # 4.5%
-- fastpath 23381550028541 # 54.2%
frontend 5572332160739 # 12.9% (12.9%)
-- latency 1951907585256 # 4.5%
-- bandwidth 3620424575483 # 8.4%
backend 8926440911379 # 20.7% (20.7%)
-- cpu 8234468738720 # 19.1%
-- memory 691972172659 # 1.6%
speculation 3396035279367 # 7.9% ( 7.9%)
-- branch mispredict 3310807856172 # 7.7%
-- pipeline restart 85227423195 # 0.2%
smt-contention 0 # 0.0% ( 0.0%)
cpu-cycles 17676562915344 # 1.89 GHz
instructions 61132090382453 # 3.46 IPC
l2 access 346458995583 # 13.864 l2 access per 1000 inst
l2 miss 2051152576 # 0.59% l2 miss
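
The IPC values in both outputs above are instructions divided by cpu-cycles. The sketch below recomputes them and also prints the ratio of retired instruction counts between the two runs. It uses the first cpu-cycles/instructions pair from the AMD output (the second pair gives 3.29), and the `COUNTERS` layout and `ipc` helper are hypothetical.

```python
# Counter values copied from the cpu-cycles and instructions lines.
COUNTERS = {
    "AMD": {"instructions": 26661505226380, "cycles": 8074389261324},
    "Intel": {"instructions": 61132090382453, "cycles": 17676562915344},
}


def ipc(counts):
    """Instructions retired per CPU cycle."""
    return counts["instructions"] / counts["cycles"]


# How many more instructions the Intel run retired for the same benchmark.
instruction_ratio = (
    COUNTERS["Intel"]["instructions"] / COUNTERS["AMD"]["instructions"]
)

for chip, counts in COUNTERS.items():
    print(f"{chip}: {ipc(counts):.2f} IPC")
print(f"Intel/AMD instruction count ratio: {instruction_ratio:.2f}")
```

The two IPC values are close, while the Intel run retires a bit more than twice as many instructions as the AMD run.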
Process summary (columns: process count, command, user time, system time)
387 processes
24 mb 1971.54 19.39
68 clinfo 16.38 5.99
18 mpiexec 1.73 6.87
38 vulkaninfo 1.31 1.13
6 php 0.15 0.31
6 glxinfo:gdrv0 0.14 0.10
4 vulkani:disk$0 0.13 0.12
2 llvmpipe-0 0.07 0.06
2 llvmpipe-1 0.07 0.06
2 llvmpipe-10 0.07 0.06
2 llvmpipe-11 0.07 0.06
2 llvmpipe-12 0.07 0.06
2 llvmpipe-13 0.07 0.06
2 llvmpipe-14 0.07 0.06
2 llvmpipe-15 0.07 0.06
2 llvmpipe-2 0.07 0.06
2 llvmpipe-3 0.07 0.06
2 llvmpipe-4 0.07 0.06
2 llvmpipe-5 0.07 0.06
2 llvmpipe-6 0.07 0.06
2 llvmpipe-7 0.07 0.06
2 llvmpipe-8 0.07 0.06
2 llvmpipe-9 0.07 0.06
6 clang 0.06 0.06
2 glxinfo 0.06 0.04
2 glxinfo:cs0 0.06 0.04
2 glxinfo:disk$0 0.06 0.04
2 glxinfo:sh0 0.06 0.04
2 glxinfo:shlo0 0.06 0.04
3 rocminfo 0.03 0.00
1 lspci 0.00 0.02
82 sh 0.00 0.00
14 gsettings 0.00 0.00
13 gcc 0.00 0.00
8 stat 0.00 0.00
8 systemd-detect- 0.00 0.00
6 llvm-link 0.00 0.00
5 phoronix-test-s 0.00 0.00
3 mrbayes 0.00 0.00
2 cc 0.00 0.00
2 lscpu 0.00 0.00
2 uname 0.00 0.00
2 which 0.00 0.00
2 xset 0.00 0.00
1 date 0.00 0.00
1 dconf worker 0.00 0.00
1 dirname 0.00 0.00
1 dmesg 0.00 0.00
1 dmidecode 0.00 0.00
1 gmain 0.00 0.00
1 grep 0.00 0.00
1 ifconfig 0.00 0.00
1 ip 0.00 0.00
1 lsmod 0.00 0.00
1 mktemp 0.00 0.00
1 ps 0.00 0.00
1 qdbus 0.00 0.00
1 readlink 0.00 0.00
1 realpath 0.00 0.00
1 sed 0.00 0.00
1 sort 0.00 0.00
1 stty 0.00 0.00
1 systemctl 0.00 0.00
1 template.sh 0.00 0.00
1 wc 0.00 0.00
1 xrandr 0.00 0.00
0 processes running
47 maximum processes
The program runs via MPI: mrbayes starts mpiexec, which launches eight mb ranks
259388) mrbayes cpu=2 start=5.74 finish=89.86
259389) mpiexec cpu=0 start=5.74 finish=89.83
259393) mpiexec cpu=12 start=6.32 finish=89.83
259394) mpiexec cpu=14 start=6.32 finish=6.32
259395) mpiexec cpu=11 start=6.34 finish=89.83
259397) mpiexec cpu=15 start=6.83 finish=89.83
259398) mpiexec cpu=9 start=6.83 finish=89.83
259399) mb cpu=8 start=6.86 finish=89.69
259400) mb cpu=3 start=6.86 finish=89.55
259401) mb cpu=12 start=6.86 finish=89.49
259402) mb cpu=13 start=6.87 finish=89.80
259403) mb cpu=14 start=6.87 finish=89.66
259404) mb cpu=7 start=6.88 finish=89.21
259405) mb cpu=2 start=6.88 finish=89.73
259406) mb cpu=1 start=6.88 finish=89.83
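
The process listing above can be summarized with a small parser. This is a sketch that assumes every line has the form `pid) command cpu=N start=S finish=F`; the regular expression and the `parse` helper are my own and not part of the tool that produced the listing.

```python
import re
from collections import Counter

# Lines copied from the process listing.
PROCESS_LINES = """\
259388) mrbayes cpu=2 start=5.74 finish=89.86
259389) mpiexec cpu=0 start=5.74 finish=89.83
259393) mpiexec cpu=12 start=6.32 finish=89.83
259394) mpiexec cpu=14 start=6.32 finish=6.32
259395) mpiexec cpu=11 start=6.34 finish=89.83
259397) mpiexec cpu=15 start=6.83 finish=89.83
259398) mpiexec cpu=9 start=6.83 finish=89.83
259399) mb cpu=8 start=6.86 finish=89.69
259400) mb cpu=3 start=6.86 finish=89.55
259401) mb cpu=12 start=6.86 finish=89.49
259402) mb cpu=13 start=6.87 finish=89.80
259403) mb cpu=14 start=6.87 finish=89.66
259404) mb cpu=7 start=6.88 finish=89.21
259405) mb cpu=2 start=6.88 finish=89.73
259406) mb cpu=1 start=6.88 finish=89.83
"""

LINE_RE = re.compile(
    r"(?P<pid>\d+)\)\s+(?P<cmd>\S+)\s+cpu=(?P<cpu>\d+)"
    r"\s+start=(?P<start>[\d.]+)\s+finish=(?P<finish>[\d.]+)"
)


def parse(text):
    """Return one dict per process line that matches the assumed format."""
    rows = []
    for match in LINE_RE.finditer(text):
        rows.append({
            "pid": int(match["pid"]),
            "cmd": match["cmd"],
            "cpu": int(match["cpu"]),
            "start": float(match["start"]),
            "finish": float(match["finish"]),
        })
    return rows


rows = parse(PROCESS_LINES)
counts = Counter(row["cmd"] for row in rows)
mb_rows = [row for row in rows if row["cmd"] == "mb"]
mb_cpus = {row["cpu"] for row in mb_rows}
longest_mb = max(row["finish"] - row["start"] for row in mb_rows)

print(dict(counts))
print(f"{len(mb_rows)} mb ranks on {len(mb_cpus)} distinct cpus, "
      f"longest rank ran {longest_mb:.2f} s")
```

The eight mb ranks are listed on eight different cpus, which fits roughly half of the 16 cores being busy.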
