Kripke is an example of a particle transport code. This test fails on my Intel processor because it reports 12 cores and that number does not evenly divide the 192 subdomains. It runs on AMD, with one test reporting throughput.

The topdown profile shows high backend stalls.

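AMD metrics confirm a high level of backend stalls (75.2% of slots, of which 67.6% is memory) and a low retirement rate (8.8% of slots, 0.58 IPC). This is floating-point code with moderate L2 access (about 75 accesses per 1000 instructions). Frontend stalls are low (5.3% of slots), with a 3.0% opcache miss rate and a 5.2% icache miss rate.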
elapsed 338.975
on_cpu 0.904 # 14.46 / 16 cores
utime 4868.298
stime 33.381
nvcsw 38377 # 47.37%
nivcsw 42643 # 52.63%
inblock 8 # 0.02/sec
onblock 64688 # 190.83/sec
cpu-clock 4903440773300 # 4903.441 seconds
task-clock 4903610508341 # 4903.611 seconds
page faults 12944947 # 2639.881/sec
context switches 82519 # 16.828/sec
cpu migrations 1575 # 0.321/sec
major page faults 247 # 0.050/sec
minor page faults 12944700 # 2639.830/sec
alignment faults 0 # 0.000/sec
emulation faults 0 # 0.000/sec
branches 1570374683161 # 127.305 branches per 1000 inst
branch misses 3334562600 # 0.21% branch miss
conditional 1489339157977 # 120.735 conditional branches per 1000 inst
indirect 17684445861 # 1.434 indirect branches per 1000 inst
cpu-cycles 21425296984683 # 3.96 GHz
instructions 12365832850477 # 0.58 IPC low
slots 42847795462728 #
retiring 3768422249309 # 8.8% ( 9.8%) low
-- ucode 2331708814 # 0.0%
-- fastpath 3766090540495 # 8.8%
frontend 2271128871110 # 5.3% ( 5.9%)
-- latency 1030938149040 # 2.4%
-- bandwidth 1240190722070 # 2.9%
backend 32215012757623 # 75.2% (84.1%) high
-- cpu 3230412385624 # 7.5%
-- memory 28984600371999 # 67.6%
speculation 53466143550 # 0.1% ( 0.1%) low
-- branch mispredict 42049729003 # 0.1%
-- pipeline restart 11416414547 # 0.0%
smt-contention 4539745528168 # 10.6% ( 0.0%)
cpu-cycles 21287261506351 # 3.95 GHz
instructions 12490617528632 # 0.59 IPC low
instructions 4162705943333 # 75.198 l2 access per 1000 inst
l2 hit from l1 159347998760 # 29.04% l2 miss
l2 miss from l1 4011291175 #
l2 hit from l2 pf 66785684621 #
l3 hit from l2 pf 45675249390 #
l3 miss from l2 pf 41218623098 #
instructions 4160899813361 # 320.435 float per 1000 inst
float 512 85 # 0.000 AVX-512 per 1000 inst
float 256 518 # 0.000 AVX-256 per 1000 inst
float 128 1333298242721 # 320.435 AVX-128 per 1000 inst
float MMX 0 # 0.000 MMX per 1000 inst
float scalar 0 # 0.000 scalar per 1000 inst
instructions 12441394927557 #
opcache 624303591988 # 50.180 opcache per 1000 inst
opcache miss 18711598680 # 3.0% opcache miss rate
l1 dTLB miss 6646693200 # 0.534 L1 dTLB per 1000 inst
l2 dTLB miss 2584799514 # 0.208 L2 dTLB per 1000 inst
instructions 12383667677284 #
icache 37128517492 # 2.998 icache per 1000 inst
icache miss 1944446624 # 5.2% icache miss rate
l1 iTLB miss 10303586 # 0.001 L1 iTLB per 1000 inst
l2 iTLB miss 0 # 0.000 L2 iTLB per 1000 inst
tlb flush 85556 # 0.000 TLB flush per 1000 inst
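The derived values in the listing above can be reproduced from the raw counters. The following is a minimal sketch in Python, not the profiling tool's own code: the variable names are my own, and the reading of the parenthesized topdown percentages as shares taken after excluding smt-contention slots is an assumption that happens to match the printed numbers.

```python
# Raw counters copied from the listing above.
elapsed = 338.975                  # wall-clock seconds
utime, stime = 4868.298, 33.381    # user and system CPU seconds
ncores = 16

cycles = 21425296984683
instructions = 12365832850477

slots = 42847795462728
smt_contention = 4539745528168
topdown = {
    "retiring": 3768422249309,
    "frontend": 2271128871110,
    "backend": 32215012757623,
    "speculation": 53466143550,
}

# Average number of busy cores and the fraction of the machine in use.
busy_cores = (utime + stime) / elapsed
on_cpu = busy_cores / ncores

# Instructions retired per cycle.
ipc = instructions / cycles

# Share of all pipeline slots per topdown category, in percent.
share = {name: 100.0 * count / slots for name, count in topdown.items()}

# Assumption: the parenthesized column excludes smt-contention slots.
usable_slots = slots - smt_contention
share_no_smt = {name: 100.0 * count / usable_slots
                for name, count in topdown.items()}

print(f"busy cores {busy_cores:.2f}, on_cpu {on_cpu:.3f}, IPC {ipc:.2f}")
for name in topdown:
    print(f"{name:12s} {share[name]:5.1f}% ({share_no_smt[name]:5.1f}%)")
```

With these inputs the backend share comes out at 75.2% of all slots and 84.1% once smt-contention slots are set aside, which agrees with the two figures printed on the backend line.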
The process overview shows that the kripke.exe processes are the primary workload.
933 processes
480 kripke.exe 92076.97 608.11
68 clinfo 14.55 4.67
90 mpirun 4.40 11.17
38 vulkaninfo 0.96 0.96
6 php 0.30 0.83
4 vulkani:disk$0 0.11 0.11
6 glxinfo:gdrv0 0.08 0.07
6 glxinfo:gl0 0.08 0.07
2 llvmpipe-0 0.06 0.06
2 llvmpipe-10 0.06 0.06
2 llvmpipe-11 0.06 0.06
2 llvmpipe-12 0.06 0.06
2 llvmpipe-13 0.06 0.06
2 llvmpipe-14 0.06 0.06
2 llvmpipe-15 0.06 0.06
2 llvmpipe-2 0.06 0.06
2 llvmpipe-3 0.06 0.06
2 llvmpipe-4 0.06 0.06
2 llvmpipe-5 0.06 0.06
2 llvmpipe-6 0.06 0.06
2 llvmpipe-7 0.06 0.06
2 llvmpipe-8 0.06 0.06
2 llvmpipe-9 0.06 0.06
2 llvmpipe-1 0.05 0.06
6 clang 0.04 0.05
2 glxinfo 0.04 0.03
2 glxinfo:cs0 0.04 0.03
2 glxinfo:disk$0 0.04 0.03
2 glxinfo:sh0 0.04 0.03
2 glxinfo:shlo0 0.04 0.03
1 lspci 0.00 0.02
82 sh 0.00 0.00
15 kripke 0.00 0.00
13 gcc 0.00 0.00
8 gsettings 0.00 0.00
8 stat 0.00 0.00
8 systemd-detect- 0.00 0.00
6 llvm-link 0.00 0.00
5 gmain 0.00 0.00
5 phoronix-test-s 0.00 0.00
3 dconf worker 0.00 0.00
3 rocminfo 0.00 0.00
2 cc 0.00 0.00
2 lscpu 0.00 0.00
2 uname 0.00 0.00
2 which 0.00 0.00
2 xset 0.00 0.00
1 date 0.00 0.00
1 dirname 0.00 0.00
1 dmesg 0.00 0.00
1 dmidecode 0.00 0.00
1 grep 0.00 0.00
1 ifconfig 0.00 0.00
1 ip 0.00 0.00
1 lsmod 0.00 0.00
1 mktemp 0.00 0.00
1 ps 0.00 0.00
1 qdbus 0.00 0.00
1 readlink 0.00 0.00
1 realpath 0.00 0.00
1 sed 0.00 0.00
1 sort 0.00 0.00
1 stty 0.00 0.00
1 systemctl 0.00 0.00
1 template.sh 0.00 0.00
1 wc 0.00 0.00
1 xrandr 0.00 0.00
0 processes running
47 maximum processes
Computation blocks
1068858) kripke cpu=1 start=5.33 finish=146.13
1068859) mpirun cpu=9 start=5.33 finish=146.10
1068942) mpirun cpu=8 start=5.89 finish=146.10
1068943) mpirun cpu=1 start=5.89 finish=5.89
1068944) mpirun cpu=9 start=5.90 finish=146.09
1069134) mpirun cpu=9 start=6.38 finish=146.09
1069138) mpirun cpu=9 start=6.39 finish=146.09
1069204) kripke.exe cpu=0 start=6.46 finish=146.02
1069207) kripke.exe cpu=13 start=6.47 finish=145.69
1069211) kripke.exe cpu=15 start=6.47 finish=145.68
1069402) kripke.exe cpu=0 start=6.81 finish=146.02
1069206) kripke.exe cpu=14 start=6.46 finish=146.04
1069212) kripke.exe cpu=3 start=6.48 finish=145.69
1069226) kripke.exe cpu=14 start=6.52 finish=145.68
1069696) kripke.exe cpu=6 start=7.62 finish=146.05
1069209) kripke.exe cpu=11 start=6.47 finish=146.02
1069216) kripke.exe cpu=11 start=6.49 finish=145.68
1069221) kripke.exe cpu=15 start=6.50 finish=145.68
1069674) kripke.exe cpu=13 start=7.57 finish=146.02
1069215) kripke.exe cpu=5 start=6.48 finish=146.06
1069222) kripke.exe cpu=5 start=6.50 finish=145.69
1069233) kripke.exe cpu=15 start=6.52 finish=145.68
1069676) kripke.exe cpu=10 start=7.58 finish=146.06
1069220) kripke.exe cpu=15 start=6.50 finish=146.08
1069224) kripke.exe cpu=9 start=6.51 finish=145.69
1069231) kripke.exe cpu=1 start=6.52 finish=145.68
1069681) kripke.exe cpu=15 start=7.59 finish=146.08
1069223) kripke.exe cpu=13 start=6.51 finish=146.02
1069241) kripke.exe cpu=13 start=6.53 finish=145.68
1069246) kripke.exe cpu=13 start=6.54 finish=145.68
1069686) kripke.exe cpu=13 start=7.60 finish=146.02
1069228) kripke.exe cpu=2 start=6.52 finish=146.03
1069240) kripke.exe cpu=6 start=6.53 finish=145.68
1069249) kripke.exe cpu=15 start=6.54 finish=145.68
1069683) kripke.exe cpu=4 start=7.59 finish=146.04
1069238) kripke.exe cpu=9 start=6.52 finish=146.03
1069248) kripke.exe cpu=13 start=6.54 finish=145.69
1069253) kripke.exe cpu=1 start=6.55 finish=145.68
1069707) kripke.exe cpu=8 start=7.65 finish=146.03
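To compare how long each process in the computation blocks lived, the lines above can be parsed and the start time subtracted from the finish time. This is a minimal sketch: the regular expression and the `parse_block` helper are hypothetical names of my own, and it assumes that `start` and `finish` are in seconds, which the listing does not state.

```python
import re

# One line looks like: "1069204) kripke.exe cpu=0 start=6.46 finish=146.02"
LINE = re.compile(
    r"(?P<pid>\d+)\)\s+(?P<name>\S+)\s+cpu=(?P<cpu>\d+)"
    r"\s+start=(?P<start>[\d.]+)\s+finish=(?P<finish>[\d.]+)"
)

def parse_block(line):
    """Return a dict for one computation-block line, or None if it does not match."""
    m = LINE.match(line.strip())
    if m is None:
        return None
    start = float(m.group("start"))
    finish = float(m.group("finish"))
    return {
        "pid": int(m.group("pid")),
        "name": m.group("name"),
        "cpu": int(m.group("cpu")),
        "start": start,
        "finish": finish,
        "lifetime": finish - start,  # assumed to be seconds
    }

# Three lines copied from the listing above.
sample = """\
1068943) mpirun cpu=1 start=5.89 finish=5.89
1069204) kripke.exe cpu=0 start=6.46 finish=146.02
1069207) kripke.exe cpu=13 start=6.47 finish=145.69
"""

records = [r for r in map(parse_block, sample.splitlines()) if r is not None]
for r in records:
    print(f'{r["pid"]} {r["name"]:12s} cpu={r["cpu"]:2d} lifetime={r["lifetime"]:.2f}')
```

On the sample lines the two kripke.exe ranks live for roughly 139 time units each, while the short-lived mpirun helper starts and finishes at the same timestamp.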
