This workload is a GPU texture codec that converts sRGB PNGs into Basis format. There are four test cases: one uses the ETC1S format and the other three use different UASTC settings.

The topdown profile shows that the three UASTC cases have a high retirement rate, low backend stalls, and even lower frontend stalls.

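The AMD metrics show little L2 access (about 1.9 accesses per 1000 instructions). Floating-point and branch instructions are both relatively infrequent (about 52 and 62 per 1000 instructions, respectively). The retirement rate is high.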
elapsed 393.438
on_cpu 0.568 # 9.09 / 16 cores
utime 3556.506
stime 19.117
nvcsw 104051 # 73.09%
nivcsw 38311 # 26.91%
inblock 0 # 0.00/sec
onblock 1034568 # 2629.56/sec
cpu-clock 3575685439676 # 3575.685 seconds
task-clock 3575739871237 # 3575.740 seconds
page faults 10123639 # 2831.201/sec
context switches 144127 # 40.307/sec
cpu migrations 546 # 0.153/sec
major page faults 2 # 0.001/sec
minor page faults 10123637 # 2831.201/sec
alignment faults 0 # 0.000/sec
emulation faults 0 # 0.000/sec
branches 1860603453653 # 62.324 branches per 1000 inst
branch misses 46545288283 # 2.50% branch miss
conditional 1574153948382 # 52.729 conditional branches per 1000 inst
indirect 18464690805 # 0.619 indirect branches per 1000 inst
cpu-cycles 14719060282545 # 2.34 GHz
instructions 29857793829195 # 2.03 IPC
slots 29440547407560 #
retiring 10251143761091 # 34.8% (59.5%) high
-- ucode 2526045358 # 0.0%
-- fastpath 10248617715733 # 34.8%
frontend 1591349042820 # 5.4% ( 9.2%)
-- latency 1177272934056 # 4.0%
-- bandwidth 414076108764 # 1.4%
backend 4652362002095 # 15.8% (27.0%)
-- cpu 2873256958424 # 9.8%
-- memory 1779105043671 # 6.0%
speculation 721889065401 # 2.5% ( 4.2%)
-- branch mispredict 712138305407 # 2.4%
-- pipeline restart 9750759994 # 0.0%
smt-contention 12223752882981 # 41.5% ( 0.0%)
cpu-cycles 14724190174595 # 2.34 GHz
instructions 29848045087509 # 2.03 IPC
instructions 9954066979774 # 1.926 l2 access per 1000 inst
l2 hit from l1 17922234932 # 5.54% l2 miss
l2 miss from l1 364414574 #
l2 hit from l2 pf 547494977 #
l3 hit from l2 pf 130041269 #
l3 miss from l2 pf 567968539 #
instructions 9947200217083 # 52.036 float per 1000 inst
float 512 76 # 0.000 AVX-512 per 1000 inst
float 256 426 # 0.000 AVX-256 per 1000 inst
float 128 517613926919 # 52.036 AVX-128 per 1000 inst
float MMX 0 # 0.000 MMX per 1000 inst
float scalar 0 # 0.000 scalar per 1000 inst
instructions 2648711 #
opcache 982432 # 370.909 opcache per 1000 inst
opcache miss 530583 # 54.0% opcache miss rate
l1 dTLB miss 5128 # 1.936 L1 dTLB per 1000 inst
l2 dTLB miss 1081 # 0.408 L2 dTLB per 1000 inst
instructions 2702117 #
icache 1303643 # 482.452 icache per 1000 inst
icache miss 107870 # 8.3% icache miss rate
l1 iTLB miss 8 # 0.003 L1 iTLB per 1000 inst
l2 iTLB miss 0 # 0.000 L2 iTLB per 1000 inst
tlb flush 20 # 0.007 TLB flush per 1000 inst
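The AMD topdown rows carry two percentages each. A minimal sketch of how they appear to be derived: the first looks like a share of all slots, and the parenthesized one looks like a share of the slots left after subtracting smt-contention. That second interpretation is an inference from the numbers, not something the tool output states. The raw values are copied from the listing above; the function names are illustrative.

```python
# Raw slot counts copied from the AMD topdown listing.
SLOTS = 29440547407560
SMT_CONTENTION = 12223752882981
CATEGORIES = {
    "retiring": 10251143761091,
    "frontend": 1591349042820,
    "backend": 4652362002095,
    "speculation": 721889065401,
}


def share_of_all_slots(count, slots=SLOTS):
    """Percentage of all issue slots attributed to one category."""
    return 100.0 * count / slots


def share_excluding_smt(count, slots=SLOTS, smt=SMT_CONTENTION):
    """Percentage of the slots that remain once SMT contention is removed.

    Assumption: this is how the parenthesized column is normalized.
    """
    return 100.0 * count / (slots - smt)


for name, count in CATEGORIES.items():
    raw = share_of_all_slots(count)
    adjusted = share_excluding_smt(count)
    print(f"{name:12s} {raw:5.1f}% ({adjusted:5.1f}%)")
```

Under that assumption the sketch reproduces both columns of the listing to one decimal place, which suggests the 34.8% retiring figure is depressed mainly by the 41.5% of slots charged to SMT contention.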
Intel metrics
elapsed 564.028
on_cpu 0.621 # 9.94 / 16 cores
utime 5587.346
stime 19.458
nvcsw 350602 # 88.43%
nivcsw 45891 # 11.57%
inblock 119136 # 211.22/sec
onblock 1462992 # 2593.83/sec
cpu-clock 5606668790586 # 5606.669 seconds
task-clock 5606750505852 # 5606.751 seconds
page faults 12957851 # 2311.116/sec
context switches 399115 # 71.185/sec
cpu migrations 977 # 0.174/sec
major page faults 26 # 0.005/sec
minor page faults 12957825 # 2311.111/sec
alignment faults 0 # 0.000/sec
emulation faults 0 # 0.000/sec
branches 2581393381877 # 61.878 branches per 1000 inst
branch misses 64084539150 # 2.48% branch miss
conditional 2581393402549 # 61.878 conditional branches per 1000 inst
indirect 485684172431 # 11.642 indirect branches per 1000 inst
slots 31009559996612 #
retiring 20683755340987 # 66.7% (66.7%) high
-- ucode 900670609933 # 2.9%
-- fastpath 19783084731054 # 63.8%
frontend 4086258401910 # 13.2% (13.2%)
-- latency 2151755377775 # 6.9%
-- bandwidth 1934503024135 # 6.2%
backend 2810948298561 # 9.1% ( 9.1%) low
-- cpu 2115192794049 # 6.8%
-- memory 695755504512 # 2.2%
speculation 3566097449765 # 11.5% (11.5%) high
-- branch mispredict 3466175330406 # 11.2%
-- pipeline restart 99922119359 # 0.3%
smt-contention 0 # 0.0% ( 0.0%)
cpu-cycles 12506418678251 # 1.39 GHz
instructions 26601249276020 # 2.13 IPC
l2 access 42456611815 # 2.043 l2 access per 1000 inst
l2 miss 6363280059 # 14.99% l2 miss
cpu-cycles 9757175649420 # 10.0% memory latency
load stalls 965497573205 # 9.4% l1 bound
l1 miss 49236423562 # 0.2% l2 bound
l2 miss 27779716539 # 0.1% l3 bound
l3 miss 19026555974 # 0.2% dram bound
store_stalls 13608703956 # 0.1% store bound
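Several of the annotations in the Intel listing are simple ratios of the raw counters. The sketch below recomputes three of them (cores kept busy, IPC, and L2 miss rate) from values copied out of the listing. The helper names are illustrative, and the 16-core count is taken from the on_cpu line.

```python
# Raw values copied from the Intel listing.
ELAPSED = 564.028            # wall-clock seconds
UTIME = 5587.346             # user CPU seconds
STIME = 19.458               # system CPU seconds
CORES = 16
CPU_CYCLES = 12506418678251
INSTRUCTIONS = 26601249276020
L2_ACCESS = 42456611815
L2_MISS = 6363280059


def busy_cores(utime, stime, elapsed):
    """Average number of cores kept busy over the run."""
    return (utime + stime) / elapsed


def on_cpu_fraction(utime, stime, elapsed, cores):
    """Fraction of the total available core time that was used."""
    return busy_cores(utime, stime, elapsed) / cores


def ipc(instructions, cycles):
    """Instructions retired per CPU cycle."""
    return instructions / cycles


def miss_rate_pct(misses, accesses):
    """Misses as a percentage of accesses."""
    return 100.0 * misses / accesses


print(f"busy cores  {busy_cores(UTIME, STIME, ELAPSED):.2f} / {CORES}")
print(f"on_cpu      {on_cpu_fraction(UTIME, STIME, ELAPSED, CORES):.3f}")
print(f"IPC         {ipc(INSTRUCTIONS, CPU_CYCLES):.2f}")
print(f"l2 miss     {miss_rate_pct(L2_MISS, L2_ACCESS):.2f}%")
```

The same helpers apply to the AMD counters; only the constants change.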
The process overview shows that basisu is the primary process.
558 processes
192 basisu 56746.72 281.28
68 clinfo 16.53 6.00
38 vulkaninfo 1.14 1.14
6 php 0.13 0.08
4 vulkani:disk$0 0.12 0.12
6 glxinfo:gdrv0 0.12 0.06
6 glxinfo:gl0 0.12 0.06
2 llvmpipe-0 0.06 0.06
2 llvmpipe-1 0.06 0.06
2 llvmpipe-10 0.06 0.06
2 llvmpipe-11 0.06 0.06
2 llvmpipe-12 0.06 0.06
2 llvmpipe-13 0.06 0.06
2 llvmpipe-14 0.06 0.06
2 llvmpipe-15 0.06 0.06
2 llvmpipe-2 0.06 0.06
2 llvmpipe-3 0.06 0.06
2 llvmpipe-4 0.06 0.06
2 llvmpipe-5 0.06 0.06
2 llvmpipe-6 0.06 0.06
2 llvmpipe-7 0.06 0.06
2 llvmpipe-8 0.06 0.06
2 llvmpipe-9 0.06 0.06
2 glxinfo 0.06 0.02
2 glxinfo:cs0 0.06 0.02
2 glxinfo:disk$0 0.06 0.02
2 glxinfo:sh0 0.06 0.02
2 glxinfo:shlo0 0.06 0.02
6 clang 0.05 0.07
1 lspci 0.00 0.02
88 sh 0.00 0.00
13 gcc 0.00 0.00
13 gsettings 0.00 0.00
12 basis 0.00 0.00
8 stat 0.00 0.00
8 systemd-detect- 0.00 0.00
6 llvm-link 0.00 0.00
5 phoronix-test-s 0.00 0.00
3 rocminfo 0.00 0.00
2 cc 0.00 0.00
2 gmain 0.00 0.00
2 lscpu 0.00 0.00
2 uname 0.00 0.00
2 which 0.00 0.00
2 xset 0.00 0.00
1 date 0.00 0.00
1 dconf worker 0.00 0.00
1 dirname 0.00 0.00
1 dmesg 0.00 0.00
1 dmidecode 0.00 0.00
1 grep 0.00 0.00
1 ifconfig 0.00 0.00
1 ip 0.00 0.00
1 lsmod 0.00 0.00
1 mktemp 0.00 0.00
1 ps 0.00 0.00
1 qdbus 0.00 0.00
1 readlink 0.00 0.00
1 realpath 0.00 0.00
1 sed 0.00 0.00
1 sort 0.00 0.00
1 stty 0.00 0.00
1 systemctl 0.00 0.00
1 template.sh 0.00 0.00
1 wc 0.00 0.00
1 xrandr 0.00 0.00
0 processes running
47 maximum processes
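The process structure is simple. In the run listed here, one basis entry and 16 basisu entries start and finish at nearly the same time.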
667833) basis cpu=13 start=5.53 finish=29.94
667834) basisu cpu=7 start=5.53 finish=29.90
667835) basisu cpu=6 start=5.56 finish=29.90
667836) basisu cpu=13 start=5.56 finish=29.90
667837) basisu cpu=9 start=5.56 finish=29.90
667838) basisu cpu=8 start=5.56 finish=29.90
667839) basisu cpu=10 start=5.57 finish=29.90
667840) basisu cpu=11 start=5.57 finish=29.90
667841) basisu cpu=5 start=5.57 finish=29.90
667842) basisu cpu=14 start=5.57 finish=29.90
667843) basisu cpu=2 start=5.57 finish=29.90
667844) basisu cpu=0 start=5.57 finish=29.90
667845) basisu cpu=4 start=5.57 finish=29.90
667846) basisu cpu=1 start=5.57 finish=29.90
667847) basisu cpu=15 start=5.57 finish=29.90
667848) basisu cpu=7 start=5.57 finish=29.90
667849) basisu cpu=12 start=5.57 finish=29.90
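To check that the entries in a listing like this really do share one lifetime, the lines can be parsed and their durations compared. This is a minimal sketch that assumes the line layout shown above (`id) name cpu=N start=S finish=F`); the regular expression and function name are illustrative and not part of any tool.

```python
import re

# Matches lines such as: "667834) basisu cpu=7 start=5.53 finish=29.90"
LINE_RE = re.compile(
    r"(?P<id>\d+)\)\s+(?P<name>\S+)\s+cpu=(?P<cpu>\d+)"
    r"\s+start=(?P<start>[\d.]+)\s+finish=(?P<finish>[\d.]+)"
)


def parse_process_line(line):
    """Return a dict for one listing line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    start = float(match["start"])
    finish = float(match["finish"])
    return {
        "id": int(match["id"]),
        "name": match["name"],
        "cpu": int(match["cpu"]),
        "start": start,
        "finish": finish,
        # Rounded to the two decimals the listing itself uses.
        "duration": round(finish - start, 2),
    }


# Three lines copied from the listing above.
sample = [
    "667833) basis cpu=13 start=5.53 finish=29.94",
    "667834) basisu cpu=7 start=5.53 finish=29.90",
    "667849) basisu cpu=12 start=5.57 finish=29.90",
]

for entry in map(parse_process_line, sample):
    print(f"{entry['id']} {entry['name']:7s} cpu={entry['cpu']:2d} "
          f"ran {entry['duration']:.2f}s")
```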
