Dolfyn is a very quick-running CFD (computational fluid dynamics) benchmark. The entire test, with three iterations, completes in less than a minute.

The topdown profile shows a relatively high retirement rate (60.9% of slots) with some CPU-bound backend stalls (24.0%).

AMD metrics show floating-point code, almost entirely 128-bit operations, and a somewhat low L2 access rate. The on_cpu figure (0.73 of 16 cores) shows that this is a single-threaded program.
elapsed 52.855
on_cpu 0.046 # 0.73 / 16 cores
utime 37.267
stime 1.516
nvcsw 8067 # 95.26%
nivcsw 401 # 4.74%
inblock 8 # 0.15/sec
onblock 135856 # 2570.36/sec
cpu-clock 38749381704 # 38.749 seconds
task-clock 38759105490 # 38.759 seconds
page faults 204106 # 5266.014/sec
context switches 8431 # 217.523/sec
cpu migrations 280 # 7.224/sec
major page faults 2 # 0.052/sec
minor page faults 204104 # 5265.963/sec
alignment faults 0 # 0.000/sec
emulation faults 0 # 0.000/sec
branches 74085607480 # 112.056 branches per 1000 inst
branch misses 297221047 # 0.40% branch miss
conditional 67706793685 # 102.408 conditional branches per 1000 inst
indirect 332626041 # 0.503 indirect branches per 1000 inst
cpu-cycles 171812239762 # 0.20 GHz
instructions 657055209430 # 3.82 IPC
slots 347057238312 #
retiring 211479166926 # 60.9% (60.9%)
-- ucode 36275577 # 0.0%
-- fastpath 211442891349 # 60.9%
frontend 21493803160 # 6.2% ( 6.2%)
-- latency 13293731340 # 3.8%
-- bandwidth 8200071820 # 2.4%
backend 104985951609 # 30.3% (30.3%)
-- cpu 83128694937 # 24.0%
-- memory 21857256672 # 6.3%
speculation 9051941671 # 2.6% ( 2.6%)
-- branch mispredict 8406933807 # 2.4%
-- pipeline restart 645007864 # 0.2%
smt-contention 46103022 # 0.0% ( 0.0%)
cpu-cycles 171757808937 # 0.20 GHz
instructions 655371739384 # 3.82 IPC
instructions 220121829248 # 45.411 l2 access per 1000 inst
l2 hit from l1 6181572101 # 23.15% l2 miss
l2 miss from l1 233242806 #
l2 hit from l2 pf 1733150975 #
l3 hit from l2 pf 2074871799 #
l3 miss from l2 pf 6331954 #
instructions 220424184932 # 197.147 float per 1000 inst
float 512 124 # 0.000 AVX-512 per 1000 inst
float 256 648 # 0.000 AVX-256 per 1000 inst
float 128 43455996941 # 197.147 AVX-128 per 1000 inst
float MMX 0 # 0.000 MMX per 1000 inst
float scalar 0 # 0.000 scalar per 1000 inst
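The derived figures in the listing above can be re-computed from the raw counters. The following is a minimal sketch, not the profiling tool's own code: the variable names are hypothetical, and the formula used for on_cpu (user plus system time, divided by elapsed time and by the core count) is an assumption that happens to match the reported 0.046.

```python
# Counter values copied from the listing above.
utime, stime, elapsed = 37.267, 1.516, 52.855
ncores = 16

cycles = 171_812_239_762
instructions = 657_055_209_430
branches = 74_085_607_480
branch_misses = 297_221_047

slots = 347_057_238_312
topdown = {
    "retiring": 211_479_166_926,
    "frontend": 21_493_803_160,
    "backend": 104_985_951_609,
    "speculation": 9_051_941_671,
    "smt-contention": 46_103_022,
}


def derived_metrics():
    """Recompute the ratios shown as comments in the counter listing."""
    # Assumption: on_cpu is CPU time over wall time, normalised by core count.
    busy_cores = (utime + stime) / elapsed
    return {
        "busy_cores": busy_cores,
        "on_cpu": busy_cores / ncores,
        "ipc": instructions / cycles,
        "branch_miss_pct": 100.0 * branch_misses / branches,
        "topdown_pct": {k: 100.0 * v / slots for k, v in topdown.items()},
    }


if __name__ == "__main__":
    m = derived_metrics()
    print(f"busy cores {m['busy_cores']:.2f}  on_cpu {m['on_cpu']:.3f}")
    print(f"IPC {m['ipc']:.2f}  branch miss {m['branch_miss_pct']:.2f}%")
    for name, pct in m["topdown_pct"].items():
        print(f"{name:15s} {pct:5.1f}%")
```

A busy-core value below 1.0 is what one would expect from a single-threaded program that also spends part of the wall time in short helper processes.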
Process summary
480 processes
21 dolfyn 35.58 0.41
67 clinfo 16.23 6.22
38 vulkaninfo 1.24 1.06
18 dolgeo 0.20 0.03
4 vulkani:disk$0 0.13 0.11
6 glxinfo:gdrv0 0.12 0.07
6 php 0.08 0.18
2 llvmpipe-0 0.07 0.06
2 llvmpipe-1 0.07 0.06
2 llvmpipe-10 0.07 0.06
2 llvmpipe-11 0.07 0.06
2 llvmpipe-12 0.07 0.06
2 llvmpipe-13 0.07 0.06
2 llvmpipe-14 0.07 0.06
2 llvmpipe-15 0.07 0.06
2 llvmpipe-2 0.07 0.06
2 llvmpipe-3 0.07 0.06
2 llvmpipe-4 0.07 0.06
2 llvmpipe-5 0.07 0.06
2 llvmpipe-6 0.07 0.06
2 llvmpipe-7 0.07 0.06
2 llvmpipe-8 0.07 0.06
2 llvmpipe-9 0.07 0.06
2 glxinfo 0.06 0.03
2 glxinfo:cs0 0.06 0.03
2 glxinfo:disk$0 0.06 0.03
2 glxinfo:sh0 0.06 0.03
2 glxinfo:shlo0 0.06 0.03
6 clang 0.05 0.07
3 rocminfo 0.03 0.00
3 doit.sh 0.00 0.03
1 lspci 0.00 0.03
1 ps 0.00 0.01
90 rm 0.00 0.00
82 sh 0.00 0.00
13 gcc 0.00 0.00
11 gsettings 0.00 0.00
8 stat 0.00 0.00
8 systemd-detect- 0.00 0.00
8 which 0.00 0.00
6 llvm-link 0.00 0.00
5 phoronix-test-s 0.00 0.00
3 gmain 0.00 0.00
2 cc 0.00 0.00
2 dconf worker 0.00 0.00
2 lscpu 0.00 0.00
2 uname 0.00 0.00
2 xset 0.00 0.00
1 date 0.00 0.00
1 dirname 0.00 0.00
1 dmesg 0.00 0.00
1 dmidecode 0.00 0.00
1 grep 0.00 0.00
1 ifconfig 0.00 0.00
1 ip 0.00 0.00
1 lsmod 0.00 0.00
1 mktemp 0.00 0.00
1 qdbus 0.00 0.00
1 readlink 0.00 0.00
1 realpath 0.00 0.00
1 sed 0.00 0.00
1 sort 0.00 0.00
1 stty 0.00 0.00
1 systemctl 0.00 0.00
1 template.sh 0.00 0.00
1 wc 0.00 0.00
1 xrandr 0.00 0.00
0 processes running
47 maximum processes
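To see which commands account for most of the process count, the summary rows can be parsed and ranked. This is a rough sketch under stated assumptions: `parse_summary` is a hypothetical helper, the sample is a handful of rows copied from the summary above, and the two numeric columns are kept as unnamed values because the listing does not label them.

```python
def parse_summary(text):
    """Parse rows of the form '<count> <command> <number> <number>'.

    The command may contain spaces (e.g. 'dconf worker'), so the two
    trailing numbers are taken from the right. Lines that do not fit
    this shape (headers, totals) are skipped.
    """
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 4:
            continue
        try:
            count = int(parts[0])
            col_a, col_b = float(parts[-2]), float(parts[-1])
        except ValueError:
            continue
        rows.append((count, " ".join(parts[1:-2]), col_a, col_b))
    return rows


# A few rows copied from the process summary above.
SAMPLE = """\
Process summary
480 processes
21 dolfyn 35.58 0.41
67 clinfo 16.23 6.22
90 rm 0.00 0.00
82 sh 0.00 0.00
2 dconf worker 0.00 0.00
"""

if __name__ == "__main__":
    ranked = sorted(parse_summary(SAMPLE), key=lambda r: r[0], reverse=True)
    for count, name, col_a, col_b in ranked:
        print(f"{count:4d} {name:15s} {col_a:6.2f} {col_b:6.2f}")
```

Ranked by count, short-lived helpers such as `rm` and `sh` sit at the top, well ahead of the benchmark binary itself.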
The computation spawns more processes than I would expect, though they all run quickly.
1968383) dolfyn cpu=12 start=5.59 finish=17.85
1968384) doit.sh cpu=3 start=5.59 finish=17.85
1968385) which cpu=14 start=5.60 finish=5.60
1968386) which cpu=13 start=5.60 finish=5.60
1968387) dolgeo cpu=1 start=5.60 finish=5.61
1968388) rm cpu=14 start=5.61 finish=5.62
1968389) rm cpu=13 start=5.62 finish=5.62
1968390) rm cpu=1 start=5.62 finish=5.62
1968391) rm cpu=14 start=5.62 finish=5.62
1968392) rm cpu=2 start=5.62 finish=5.62
1968393) dolfyn cpu=1 start=5.62 finish=5.89
1968394) dolgeo cpu=5 start=5.89 finish=5.90
1968395) rm cpu=14 start=5.91 finish=5.91
1968396) rm cpu=10 start=5.91 finish=5.91
1968397) rm cpu=14 start=5.91 finish=5.91
1968398) rm cpu=12 start=5.91 finish=5.91
1968399) rm cpu=10 start=5.91 finish=5.91
1968400) dolfyn cpu=5 start=5.91 finish=6.05
1968401) dolgeo cpu=4 start=6.06 finish=6.07
1968402) rm cpu=5 start=6.07 finish=6.07
1968403) rm cpu=6 start=6.07 finish=6.07
1968404) rm cpu=9 start=6.07 finish=6.08
1968405) rm cpu=5 start=6.08 finish=6.08
1968406) rm cpu=10 start=6.08 finish=6.08
1968407) dolfyn cpu=9 start=6.08 finish=6.24
1968408) dolgeo cpu=12 start=6.24 finish=6.27
1968409) rm cpu=5 start=6.28 finish=6.28
1968410) rm cpu=6 start=6.28 finish=6.28
1968411) rm cpu=10 start=6.28 finish=6.28
1968412) rm cpu=5 start=6.28 finish=6.28
1968413) rm cpu=6 start=6.29 finish=6.29
1968414) dolfyn cpu=10 start=6.29 finish=11.36
1968415) dolgeo cpu=4 start=11.36 finish=11.41
1968416) rm cpu=5 start=11.41 finish=11.41
1968417) rm cpu=10 start=11.41 finish=11.41
1968418) rm cpu=4 start=11.41 finish=11.41
1968419) rm cpu=5 start=11.41 finish=11.41
1968420) rm cpu=10 start=11.41 finish=11.42
1968421) dolfyn cpu=6 start=11.42 finish=17.82
1968423) dolgeo cpu=12 start=17.82 finish=17.83
1968424) rm cpu=5 start=17.83 finish=17.83
1968425) rm cpu=6 start=17.83 finish=17.83
1968426) rm cpu=5 start=17.83 finish=17.83
1968427) rm cpu=9 start=17.83 finish=17.83
1968428) rm cpu=12 start=17.83 finish=17.84
1968429) dolfyn cpu=6 start=17.84 finish=17.85
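The per-process lifetimes in the timeline above can be extracted with a small parser. This is a sketch only: `parse_timeline` and `longest` are hypothetical helpers, the regular expression assumes single-word command names as they appear in this timeline, and the sample is a few lines copied from it.

```python
import re

# Matches lines such as: 1968414) dolfyn cpu=10 start=6.29 finish=11.36
LINE_RE = re.compile(
    r"^(?P<pid>\d+)\)\s+(?P<comm>\S+)\s+cpu=(?P<cpu>\d+)"
    r"\s+start=(?P<start>[\d.]+)\s+finish=(?P<finish>[\d.]+)$"
)


def parse_timeline(text):
    """Return one dict per timeline line, with a computed duration in seconds."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # skip anything that is not a timeline entry
        start, finish = float(m["start"]), float(m["finish"])
        rows.append({
            "pid": int(m["pid"]),
            "comm": m["comm"],
            "cpu": int(m["cpu"]),
            "start": start,
            "finish": finish,
            "duration": round(finish - start, 2),
        })
    return rows


def longest(rows, comm):
    """Return the longest-running entry for the given command name."""
    return max((r for r in rows if r["comm"] == comm), key=lambda r: r["duration"])


# A few lines copied from the timeline above.
SAMPLE = """\
1968407) dolfyn cpu=9 start=6.08 finish=6.24
1968408) dolgeo cpu=12 start=6.24 finish=6.27
1968414) dolfyn cpu=10 start=6.29 finish=11.36
1968421) dolfyn cpu=6 start=11.42 finish=17.82
1968428) rm cpu=12 start=17.83 finish=17.84
"""

if __name__ == "__main__":
    rows = parse_timeline(SAMPLE)
    for r in rows:
        print(f"{r['pid']} {r['comm']:8s} cpu={r['cpu']:2d} {r['duration']:.2f}s")
    print("longest dolfyn run:", longest(rows, "dolfyn")["pid"])
```

On these sample lines, the two long `dolfyn` invocations (about 5 and 6.4 seconds) dominate, while `dolgeo` and `rm` finish in hundredths of a second.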
