A set of Fortran benchmarks. There are 16 benchmarks, and they appear to be single-threaded. As shown below, run times vary from benchmark to benchmark, and some take a while to meet the tolerance.

The topdown profile shows that some of the benchmarks are heavily backend bound.

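AMD metrics confirm single-threaded floating-point code: about 0.97 of 16 cores is in use, and there are roughly 534 float operations per 1000 instructions, nearly all of them 128-bit. Backend stalls are high (64.8% of slots, with 45.4% attributed to memory) and frontend stalls are low (5.0%).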
elapsed 4737.128
on_cpu 0.060 # 0.97 / 16 cores
utime 4556.868
stime 18.251
nvcsw 4849 # 20.85%
nivcsw 18404 # 79.15%
inblock 264 # 0.06/sec
onblock 3573632 # 754.39/sec
cpu-clock 4575749642184 # 4575.750 seconds
task-clock 4575803984388 # 4575.804 seconds
page faults 10022222 # 2190.265/sec
context switches 45700 # 9.987/sec
cpu migrations 1372 # 0.300/sec
major page faults 2 # 0.000/sec
minor page faults 10021242 # 2190.051/sec
alignment faults 0 # 0.000/sec
emulation faults 0 # 0.000/sec
branches 2245832727155 # 67.249 branches per 1000 inst
branch misses 17173602597 # 0.76% branch miss
conditional 1840810629065 # 55.121 conditional branches per 1000 inst
indirect 58846789866 # 1.762 indirect branches per 1000 inst
cpu-cycles 22123942736621 # 0.28 GHz
instructions 34070440192734 # 1.54 IPC
slots 44291344077096 #
retiring 11820303560337 # 26.7% (26.7%)
-- ucode 16865516586 # 0.0%
-- fastpath 11803438043751 # 26.6%
frontend 2205416817160 # 5.0% ( 5.0%) low
-- latency 1274058683130 # 2.9%
-- bandwidth 931358134030 # 2.1%
backend 28704712440615 # 64.8% (64.8%)
-- cpu 8592742892623 # 19.4%
-- memory 20111969547992 # 45.4%
speculation 1558879121001 # 3.5% ( 3.5%)
-- branch mispredict 936831923609 # 2.1%
-- pipeline restart 622047197392 # 1.4%
smt-contention 2030399575 # 0.0% ( 0.0%)
cpu-cycles 21934959167797 # 0.28 GHz
instructions 33302060161144 # 1.52 IPC
instructions 11110726378722 # 71.100 l2 access per 1000 inst
l2 hit from l1 459381538859 # 30.57% l2 miss
l2 miss from l1 47924498024 #
l2 hit from l2 pf 137056719572 #
l3 hit from l2 pf 82624785013 #
l3 miss from l2 pf 110904485231 #
instructions 11098484048000 # 533.624 float per 1000 inst
float 512 368 # 0.000 AVX-512 per 1000 inst
float 256 680 # 0.000 AVX-256 per 1000 inst
float 128 5922414961441 # 533.624 AVX-128 per 1000 inst
float MMX 0 # 0.000 MMX per 1000 inst
float scalar 267 # 0.000 scalar per 1000 inst
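The topdown percentages above are each category's slot count divided by the total slot count, and IPC is instructions divided by cycles. A minimal Python sketch that recomputes them from the raw AMD counters; the dictionary and function names are illustrative and are not part of the profiling tool.

```python
# Raw counter values copied from the AMD output above.
amd = {
    "slots": 44291344077096,
    "retiring": 11820303560337,
    "frontend": 2205416817160,
    "backend": 28704712440615,
    "speculation": 1558879121001,
    "cpu-cycles": 22123942736621,
    "instructions": 34070440192734,
}

def topdown_percentages(counters):
    """Return each top-level topdown category as a percentage of total slots."""
    slots = counters["slots"]
    return {
        name: 100.0 * counters[name] / slots
        for name in ("retiring", "frontend", "backend", "speculation")
    }

def ipc(counters):
    """Instructions retired per CPU cycle."""
    return counters["instructions"] / counters["cpu-cycles"]

pct = topdown_percentages(amd)
for name, value in pct.items():
    print(f"{name:12s} {value:5.1f}%")
print(f"IPC          {ipc(amd):5.2f}")
```

The same two functions can be applied to any counter block that reports slots, the four top-level categories, cycles, and instructions.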
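Intel metrics show a similar profile: about 0.98 of 16 cores in use, 53.3% of slots backend bound, and 7.9% frontend bound.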
elapsed 6678.941
on_cpu 0.061 # 0.98 / 16 cores
utime 6501.062
stime 17.364
nvcsw 5597 # 15.99%
nivcsw 29404 # 84.01%
inblock 71384 # 10.69/sec
onblock 11211128 # 1678.58/sec
cpu-clock 6519156639830 # 6519.157 seconds
task-clock 6519224954320 # 6519.225 seconds
page faults 12112559 # 1857.975/sec
context switches 67064 # 10.287/sec
cpu migrations 2142 # 0.329/sec
major page faults 400 # 0.061/sec
minor page faults 12111079 # 1857.748/sec
alignment faults 0 # 0.000/sec
emulation faults 0 # 0.000/sec
branches 2472460569494 # 60.235 branches per 1000 inst
branch misses 14595245167 # 0.59% branch miss
conditional 2472460618294 # 60.235 conditional branches per 1000 inst
indirect 67286701778 # 1.639 indirect branches per 1000 inst
slots 154241019088376 #
retiring 59456611512691 # 38.5% (38.5%)
-- ucode 4120344351436 # 2.7%
-- fastpath 55336267161255 # 35.9%
frontend 12168581990389 # 7.9% ( 7.9%)
-- latency 7353937926030 # 4.8%
-- bandwidth 4814644064359 # 3.1%
backend 82167689791228 # 53.3% (53.3%)
-- cpu 41742511040976 # 27.1%
-- memory 40425178750252 # 26.2%
speculation 3387869739534 # 2.2% ( 2.2%)
-- branch mispredict 2885268249368 # 1.9%
-- pipeline restart 502601490166 # 0.3%
smt-contention 0 # 0.0% ( 0.0%)
cpu-cycles 17004366686201 # 0.22 GHz
instructions 36491717172578 # 2.15 IPC
l2 access 2002446821273 # 54.885 l2 access per 1000 inst
l2 miss 1073448003882 # 53.61% l2 miss
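The process overview lists each benchmark under its own executable name. tfft2 dominates the time: its 52 processes account for 3221.60 in the first time column, against 67.65 for the next largest, ac.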
665 processes
52 tfft2 3221.60 3.17
12 ac 67.65 0.00
19 mdbx 66.41 0.00
17 linpk 51.55 0.16
10 doduc 49.52 0.00
10 air 11.34 0.00
34 clinfo 9.26 3.92
6 f951 6.87 0.26
19 vulkaninfo 0.57 0.76
6 clang 0.08 0.04
3 glxinfo:gdrv0 0.07 0.04
3 glxinfo:gl0 0.07 0.04
2 vulkani:disk$0 0.06 0.08
6 ld 0.06 0.02
1 llvmpipe-0 0.03 0.04
1 llvmpipe-1 0.03 0.04
1 llvmpipe-10 0.03 0.04
1 llvmpipe-11 0.03 0.04
1 llvmpipe-12 0.03 0.04
1 llvmpipe-13 0.03 0.04
1 llvmpipe-14 0.03 0.04
1 llvmpipe-15 0.03 0.04
1 llvmpipe-2 0.03 0.04
1 llvmpipe-3 0.03 0.04
1 llvmpipe-4 0.03 0.04
1 llvmpipe-5 0.03 0.04
1 llvmpipe-6 0.03 0.04
1 llvmpipe-7 0.03 0.04
1 llvmpipe-8 0.03 0.04
1 llvmpipe-9 0.03 0.04
1 glxinfo 0.03 0.02
1 glxinfo:cs0 0.03 0.02
1 glxinfo:disk$0 0.03 0.02
1 glxinfo:sh0 0.03 0.02
1 glxinfo:shlo0 0.03 0.02
6 as 0.01 0.00
1 ps 0.00 0.01
253 sh 0.00 0.00
35 rm 0.00 0.00
15 cat 0.00 0.00
15 mv 0.00 0.00
13 gcc 0.00 0.00
11 gsettings 0.00 0.00
8 systemd-detect- 0.00 0.00
8 which 0.00 0.00
7 stat 0.00 0.00
6 collect2 0.00 0.00
6 gfortran 0.00 0.00
6 llvm-link 0.00 0.00
5 pbharness 0.00 0.00
5 polyhedron 0.00 0.00
5 zip 0.00 0.00
4 phoronix-test-s 0.00 0.00
3 gmain 0.00 0.00
1 cc 0.00 0.00
1 date 0.00 0.00
1 dconf worker 0.00 0.00
1 dirname 0.00 0.00
1 grep 0.00 0.00
1 ifconfig 0.00 0.00
1 ip 0.00 0.00
1 lscpu 0.00 0.00
1 mktemp 0.00 0.00
1 qdbus 0.00 0.00
1 readlink 0.00 0.00
1 realpath 0.00 0.00
1 sed 0.00 0.00
1 sort 0.00 0.00
1 stty 0.00 0.00
1 systemctl 0.00 0.00
1 template.sh 0.00 0.00
1 wc 0.00 0.00
1 xrandr 0.00 0.00
1 xset 0.00 0.00
13 processes running
47 maximum processes
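To put a number on how much tfft2 dominates, the sketch below sums the first time column for the six benchmark executables in the listing and reports each one's share. Treating that column as CPU time in seconds is an assumption, since the listing does not label its columns, and the variable names are illustrative.

```python
# (executable -> first time column) for the benchmark binaries listed above.
# Assumption: the column is CPU time in seconds; the listing does not label it.
bench_times = {
    "tfft2": 3221.60,
    "ac": 67.65,
    "mdbx": 66.41,
    "linpk": 51.55,
    "doduc": 49.52,
    "air": 11.34,
}

total = sum(bench_times.values())
shares = {name: 100.0 * t / total for name, t in bench_times.items()}

# Print the executables from largest to smallest share.
for name, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:6s} {bench_times[name]:8.2f} {share:5.1f}%")
```

Under that assumption, tfft2 accounts for roughly 93% of the time spent in these six executables.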
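An example of the computation block for tfft2. It looks like it was invoked many times, perhaps because it did not reach the tolerance easily. It also looks like it crashed partway through: one run (PID 203004) finished after about 44 s, where the others took roughly 58 to 67 s, which may be where that happened.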
202655) ?? cpu=0 start=311.22 finish=0.00
202656) rm cpu=5 start=311.22 finish=311.22
202657) which cpu=15 start=311.22 finish=311.22
202658) ?? cpu=0 start=311.22 finish=0.00
202659) sh cpu=10 start=311.23 finish=311.23
202660) rm cpu=14 start=311.23 finish=311.23
202661) sh cpu=5 start=311.23 finish=311.23
202662) rm cpu=7 start=311.23 finish=311.23
202663) sh cpu=7 start=311.23 finish=311.23
202664) rm cpu=10 start=311.23 finish=311.23
202665) sh cpu=5 start=311.23 finish=311.24
202666) rm cpu=14 start=311.24 finish=311.24
202667) sh cpu=7 start=311.24 finish=311.59
202668) gfortran cpu=3 start=311.24 finish=311.59
202669) f951 cpu=5 start=311.24 finish=311.54
202670) as cpu=13 start=311.55 finish=311.55
202671) collect2 cpu=4 start=311.56 finish=311.59
202672) ld cpu=13 start=311.56 finish=311.59
202673) sh cpu=10 start=311.59 finish=378.36
202674) tfft2 cpu=14 start=311.59 finish=378.35
202675) sh cpu=10 start=378.36 finish=441.60
202676) tfft2 cpu=3 start=378.36 finish=441.59
202677) sh cpu=10 start=441.61 finish=508.63
202678) tfft2 cpu=4 start=441.61 finish=508.62
202680) sh cpu=3 start=508.63 finish=574.73
202681) tfft2 cpu=12 start=508.63 finish=574.72
202682) sh cpu=2 start=574.74 finish=639.72
202683) tfft2 cpu=12 start=574.74 finish=639.71
202684) sh cpu=2 start=639.72 finish=702.28
202685) tfft2 cpu=11 start=639.72 finish=702.27
202686) sh cpu=2 start=702.28 finish=766.45
202687) tfft2 cpu=12 start=702.28 finish=766.44
202689) sh cpu=2 start=766.45 finish=828.73
202690) tfft2 cpu=11 start=766.45 finish=828.72
202693) sh cpu=2 start=828.74 finish=894.45
202694) tfft2 cpu=12 start=828.74 finish=894.44
202695) sh cpu=3 start=894.45 finish=960.28
202696) tfft2 cpu=4 start=894.45 finish=960.27
202698) sh cpu=2 start=960.28 finish=1021.84
202699) tfft2 cpu=11 start=960.28 finish=1021.83
202700) sh cpu=2 start=1021.84 finish=1085.58
202701) tfft2 cpu=12 start=1021.84 finish=1085.56
202702) sh cpu=2 start=1085.58 finish=1148.30
202703) tfft2 cpu=3 start=1085.58 finish=1148.29
202705) sh cpu=2 start=1148.30 finish=1214.35
202706) tfft2 cpu=4 start=1148.30 finish=1214.34
202707) sh cpu=2 start=1214.35 finish=1277.37
202708) tfft2 cpu=11 start=1214.35 finish=1277.36
202709) sh cpu=11 start=1277.37 finish=1341.02
202710) tfft2 cpu=4 start=1277.37 finish=1341.01
202744) sh cpu=13 start=1341.02 finish=1405.75
202745) tfft2 cpu=14 start=1341.02 finish=1405.74
202746) sh cpu=2 start=1405.75 finish=1467.64
202747) tfft2 cpu=3 start=1405.75 finish=1467.63
202750) sh cpu=3 start=1467.64 finish=1528.10
202751) tfft2 cpu=4 start=1467.64 finish=1528.08
202752) sh cpu=2 start=1528.10 finish=1589.74
202753) tfft2 cpu=11 start=1528.10 finish=1589.73
202754) sh cpu=2 start=1589.74 finish=1650.78
202755) tfft2 cpu=14 start=1589.74 finish=1650.77
202757) sh cpu=2 start=1650.78 finish=1711.63
202758) tfft2 cpu=11 start=1650.78 finish=1711.62
202759) sh cpu=2 start=1711.63 finish=1777.84
202760) tfft2 cpu=12 start=1711.63 finish=1777.83
202776) sh cpu=3 start=1777.85 finish=1837.28
202777) tfft2 cpu=5 start=1777.85 finish=1837.27
202786) sh cpu=10 start=1837.28 finish=1901.39
202787) tfft2 cpu=4 start=1837.28 finish=1901.38
202788) sh cpu=3 start=1901.39 finish=1966.72
202789) tfft2 cpu=4 start=1901.39 finish=1966.71
202790) sh cpu=10 start=1966.72 finish=2024.52
202791) tfft2 cpu=11 start=1966.73 finish=2024.51
202830) sh cpu=11 start=2024.53 finish=2087.31
202831) tfft2 cpu=12 start=2024.53 finish=2087.30
202835) sh cpu=10 start=2087.31 finish=2153.38
202836) tfft2 cpu=4 start=2087.31 finish=2153.36
202839) sh cpu=10 start=2153.38 finish=2213.02
202840) tfft2 cpu=11 start=2153.38 finish=2213.01
202841) sh cpu=11 start=2213.03 finish=2275.35
202842) tfft2 cpu=4 start=2213.03 finish=2275.34
202843) sh cpu=2 start=2275.35 finish=2336.29
202844) tfft2 cpu=11 start=2275.36 finish=2336.28
202845) sh cpu=11 start=2336.29 finish=2399.39
202846) tfft2 cpu=4 start=2336.29 finish=2399.38
202847) sh cpu=10 start=2399.39 finish=2459.42
202848) tfft2 cpu=11 start=2399.39 finish=2459.41
202850) sh cpu=4 start=2459.42 finish=2523.13
202851) tfft2 cpu=5 start=2459.42 finish=2523.12
202853) sh cpu=10 start=2523.14 finish=2583.64
202854) tfft2 cpu=11 start=2523.14 finish=2583.63
202987) sh cpu=12 start=2583.64 finish=2644.09
202988) tfft2 cpu=13 start=2583.65 finish=2644.07
202994) sh cpu=3 start=2644.09 finish=2706.17
202995) tfft2 cpu=4 start=2644.09 finish=2706.16
202996) sh cpu=10 start=2706.18 finish=2764.61
202997) tfft2 cpu=4 start=2706.18 finish=2764.60
203001) sh cpu=10 start=2764.61 finish=2825.18
203002) tfft2 cpu=11 start=2764.62 finish=2825.17
203003) sh cpu=9 start=2825.18 finish=2869.45
203004) tfft2 cpu=2 start=2825.18 finish=2869.44
203012) sh cpu=11 start=2869.45 finish=2933.77
203013) tfft2 cpu=4 start=2869.45 finish=2933.76
203017) sh cpu=9 start=2933.77 finish=2995.04
203018) tfft2 cpu=2 start=2933.78 finish=2995.03
203019) sh cpu=9 start=2995.04 finish=3054.13
203020) tfft2 cpu=11 start=2995.04 finish=3054.12
203021) sh cpu=9 start=3054.13 finish=3116.91
203022) tfft2 cpu=10 start=3054.13 finish=3116.89
203023) sh cpu=11 start=3116.91 finish=3174.87
203024) tfft2 cpu=12 start=3116.91 finish=3174.86
203058) sh cpu=1 start=3174.87 finish=3233.53
203059) tfft2 cpu=10 start=3174.87 finish=3233.52
203063) sh cpu=1 start=3233.53 finish=3292.30
203064) tfft2 cpu=11 start=3233.53 finish=3292.29
203066) sh cpu=1 start=3292.30 finish=3353.34
203067) tfft2 cpu=2 start=3292.30 finish=3353.33
203068) sh cpu=1 start=3353.34 finish=3414.59
203069) tfft2 cpu=2 start=3353.34 finish=3414.58
203070) sh cpu=11 start=3414.59 finish=3475.60
203071) tfft2 cpu=4 start=3414.59 finish=3475.59
203072) sh cpu=1 start=3475.60 finish=3537.73
203073) tfft2 cpu=10 start=3475.61 finish=3537.72
203076) ?? cpu=0 start=3537.73 finish=0.00
203077) ?? cpu=0 start=3537.73 finish=0.00
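The per-invocation run time of tfft2 can be read off the trace as finish minus start. A minimal Python sketch that parses lines in the format shown above and lists the duration of each tfft2 run; the regular expression and function name are illustrative, only a few trace lines are embedded here, and treating finish=0.00 as "no recorded exit" is an assumption about the tool's output.

```python
import re

# A few lines copied from the trace above; the full listing could be
# read from a file instead.
trace = """\
202673) sh cpu=10 start=311.59 finish=378.36
202674) tfft2 cpu=14 start=311.59 finish=378.35
203004) tfft2 cpu=2 start=2825.18 finish=2869.44
203073) tfft2 cpu=10 start=3475.61 finish=3537.72
203076) ?? cpu=0 start=3537.73 finish=0.00
"""

LINE_RE = re.compile(
    r"^(?P<pid>\d+)\)\s+(?P<name>\S+)\s+cpu=(?P<cpu>\d+)\s+"
    r"start=(?P<start>[\d.]+)\s+finish=(?P<finish>[\d.]+)$"
)

def run_durations(text, name="tfft2"):
    """Return (pid, duration in seconds) for each completed run of `name`."""
    runs = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m or m["name"] != name:
            continue
        start, finish = float(m["start"]), float(m["finish"])
        # Assumption: finish=0.00 means the exit was never recorded; skip it.
        if finish < start:
            continue
        runs.append((int(m["pid"]), round(finish - start, 2)))
    return runs

runs = run_durations(trace)
for pid, seconds in runs:
    print(pid, seconds)
```

Sorting the result by duration makes unusually short runs, such as PID 203004, easy to spot.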
