A real-time vector ray-tracing engine. It has eight subtests, each of which runs on all cores.

The topdown profile shows these are mostly backend bound. Frontend stalls are few on AMD (2.7% of slots) but larger on Intel (19.8%).

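AMD metrics show the backend stalls split roughly evenly between CPU (29.1%) and memory (25.8%). There is little floating-point code: about 32 float operations per 1000 instructions, nearly all 128-bit.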

elapsed              625.743
on_cpu               0.714          # 11.42 / 16 cores
utime                7119.296
stime                28.262
nvcsw                543310         # 80.96%
nivcsw               127794         # 19.04%
inblock              0              # 0.00/sec
onblock              14424          # 23.05/sec
cpu-clock            7147322701511  # 7147.323 seconds
task-clock           7147938611459  # 7147.939 seconds
page faults          9605265        # 1343.781/sec
context switches     674000         # 94.293/sec
cpu migrations       2367           # 0.331/sec
major page faults    77             # 0.011/sec
minor page faults    9605188        # 1343.770/sec
alignment faults     0              # 0.000/sec
emulation faults     0              # 0.000/sec
branches             2213138560582  # 83.291 branches per 1000 inst
branch misses        17501902038    # 0.79% branch miss
conditional          1933794071567  # 72.778 conditional branches per 1000 inst
indirect             163873148      # 0.006 indirect branches per 1000 inst
cpu-cycles           26735053193850 # 2.68 GHz
instructions         26540275610856 # 0.99 IPC
slots                53478182239044 #
retiring             9404982091633  # 17.6% (23.2%)
-- ucode             23083359398    #     0.0%
-- fastpath          9381898732235  #    17.5%
frontend             1432749254238  #  2.7% ( 3.5%) low
-- latency           913772041278   #     1.7%
-- bandwidth         518977212960   #     1.0%
backend              29314007637038 # 54.8% (72.2%) high
-- cpu               15535825778777 #    29.1%
-- memory            13778181858261 #    25.8%
speculation          433053049363   #  0.8% ( 1.1%)
-- branch mispredict 301142558895   #     0.6%
-- pipeline restart  131910490468   #     0.2%
smt-contention       12893301100630 # 24.1% ( 0.0%)
cpu-cycles           26684097340491 # 2.63 GHz
instructions         26508111828655 # 0.99 IPC
instructions         8838043367721  # 217.120 l2 access per 1000 inst
l2 hit from l1       1452674025909  # 0.93% l2 miss
l2 miss from l1      11743811973    #
l2 hit from l2 pf    460145048721   #
l3 hit from l2 pf    4891501169     #
l3 miss from l2 pf   1201823009     #
instructions         8829963878485  # 31.739 float per 1000 inst
float 512            85             # 0.000 AVX-512 per 1000 inst
float 256            468            # 0.000 AVX-256 per 1000 inst
float 128            280256805592   # 31.739 AVX-128 per 1000 inst
float MMX            0              # 0.000 MMX per 1000 inst
float scalar         0              # 0.000 scalar per 1000 inst
instructions         26550065692299 #
opcache              4885638845761  # 184.016 opcache per 1000 inst
opcache miss         188790824089   #  3.9% opcache miss rate
l1 dTLB miss         218391826788   # 8.226 L1 dTLB per 1000 inst
l2 dTLB miss         2754869850     # 0.104 L2 dTLB per 1000 inst
instructions         26547412969503 #
icache               244980878191   # 9.228 icache per 1000 inst
icache miss          21417678897    #  8.7% icache miss rate
l1 iTLB miss         9824289        # 0.000 L1 iTLB per 1000 inst
l2 iTLB miss         0              # 0.000 L2 iTLB per 1000 inst
tlb flush            42902          # 0.000 TLB flush per 1000 inst
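
The derived columns in the AMD listing above can be reproduced from the raw counters. The sketch below is a reconstruction, not the profiling tool's own code: the variable names are illustrative, and the reading of the parenthesised topdown column (shares rescaled after removing SMT-contention slots) is an assumption that happens to match the printed values.

```python
# Raw counters copied from the AMD run above.
elapsed = 625.743                 # wall-clock seconds
utime, stime = 7119.296, 28.262   # user and system CPU seconds
cores = 16

cycles = 26735053193850
instructions = 26540275610856

slots = 53478182239044
topdown = {
    "retiring": 9404982091633,
    "frontend": 1432749254238,
    "backend": 29314007637038,
    "speculation": 433053049363,
    "smt-contention": 12893301100630,
}

# Average number of busy cores, and the fraction of the machine in use.
busy_cores = (utime + stime) / elapsed
on_cpu = busy_cores / cores

# Instructions retired per cycle.
ipc = instructions / cycles

# Share of all issue slots spent in each topdown category.
share = {name: count / slots for name, count in topdown.items()}

# ASSUMPTION: the parenthesised column is the same share with the
# SMT-contention slots excluded from the denominator.
non_smt = 1.0 - share["smt-contention"]
rescaled = {
    name: value / non_smt
    for name, value in share.items()
    if name != "smt-contention"
}

print(f"on_cpu {on_cpu:.3f}  # {busy_cores:.2f} / {cores} cores")
print(f"IPC    {ipc:.2f}")
for name, value in rescaled.items():
    print(f"{name:12s} {share[name]:6.1%} ({value:.1%})")
```

Under that assumption, backend stalls account for roughly 72% of the slots not lost to SMT contention, which is what the "mostly backend bound" summary refers to.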

Intel metrics show most of the backend memory stalls are L1 bound (16.7% of cycles) and L2 bound (19.8%). L3 and DRAM account for about 0.1% each.

elapsed              628.053
on_cpu               0.558          # 8.93 / 16 cores
utime                5593.746
stime                15.852
nvcsw                225970         # 79.57%
nivcsw               58003          # 20.43%
inblock              3232           # 5.15/sec
onblock              2664           # 4.24/sec
cpu-clock            5608322435946  # 5608.322 seconds
task-clock           5608584641226  # 5608.585 seconds
page faults          9542595        # 1701.427/sec
context switches     286886         # 51.151/sec
cpu migrations       3881           # 0.692/sec
major page faults    104            # 0.019/sec
minor page faults    9542491        # 1701.408/sec
alignment faults     0              # 0.000/sec
emulation faults     0              # 0.000/sec
branches             1680443054208  # 85.552 branches per 1000 inst
branch misses        10375087163    # 0.62% branch miss
conditional          1680443078208  # 85.552 conditional branches per 1000 inst
indirect             515691696613   # 26.254 indirect branches per 1000 inst
slots                21897839692094 #
retiring             9731326497537  # 44.4% (44.4%)
-- ucode             413075777081   #     1.9%
-- fastpath          9318250720456  #    42.6%
frontend             4331544860416  # 19.8% (19.8%)
-- latency           3653140251914  #    16.7%
-- bandwidth         678404608502   #     3.1%
backend              7277613673818  # 33.2% (33.2%)
-- cpu               2606213574374  #    11.9%
-- memory            4671400099444  #    21.3%
speculation          604514458652   #  2.8% ( 2.8%)
-- branch mispredict 537968863823   #     2.5%
-- pipeline restart  66545594829    #     0.3%
smt-contention       0              #  0.0% ( 0.0%)
cpu-cycles           13966210798435 # 1.34 GHz
instructions         19034644351405 # 1.36 IPC
l2 access            516811085855   # 52.304 l2 access per 1000 inst
l2 miss              5176108418     # 1.00% l2 miss
cpu-cycles           7239863197863  # 37.8% memory latency
load stalls          2649725159323  # 16.7% l1 bound
l1 miss              1442347735969  # 19.8% l2 bound
l2 miss              9228044552     #  0.1% l3 bound
l3 miss              4311222790     #  0.1% dram bound
store_stalls         85504284960    #  1.2% store bound
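
The per-level percentages at the end of the Intel listing appear to come from subtracting successive stall counters, so that each stalled cycle is charged to the deepest cache level it was waiting on. The sketch below works under that assumption. The function name and the interpretation of each counter are illustrative guesses, not taken from the tool, although the results agree with the printed figures.

```python
# Raw counters copied from the Intel memory-latency group above.
cycles = 7239863197863
load_stalls = 2649725159323   # assumed: cycles stalled with a load outstanding
l1_miss = 1442347735969       # assumed: ... with an L1 miss outstanding
l2_miss = 9228044552          # assumed: ... with an L2 miss outstanding
l3_miss = 4311222790          # assumed: ... with an L3 miss outstanding
store_stalls = 85504284960


def memory_breakdown(cycles, load_stalls, l1_miss, l2_miss, l3_miss, store_stalls):
    """Charge stall cycles to the deepest level by differencing the counters."""
    stalled = {
        "l1": load_stalls - l1_miss,   # stalled, but the data was in L1
        "l2": l1_miss - l2_miss,       # missed L1, served by L2
        "l3": l2_miss - l3_miss,       # missed L2, served by L3
        "dram": l3_miss,               # missed every cache level
        "store": store_stalls,
    }
    return {level: count / cycles for level, count in stalled.items()}


breakdown = memory_breakdown(
    cycles, load_stalls, l1_miss, l2_miss, l3_miss, store_stalls
)
total = sum(breakdown.values())

for level, fraction in breakdown.items():
    print(f"{level:6s} {fraction:6.1%}")
print(f"memory latency {total:.1%}")
```

The L1 and L2 terms together make up almost all of the 37.8% memory-latency figure, so the working set largely fits in L2 and the stalls are cache-hit latency rather than DRAM traffic.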

The process overview shows RooT.x64f64 as the primary process.

796 processes
	408 RooT.x64f64          121183.97   394.41
	 68 clinfo                  15.21     7.64
	 38 vulkaninfo               1.33     1.15
	  6 php                      0.14     0.15
	  4 vulkani:disk$0           0.14     0.12
	  6 glxinfo:gdrv0            0.09     0.09
	  6 glxinfo:gl0              0.09     0.09
	  6 clang                    0.08     0.04
	  2 llvmpipe-0               0.07     0.06
	  2 llvmpipe-1               0.07     0.06
	  2 llvmpipe-10              0.07     0.06
	  2 llvmpipe-11              0.07     0.06
	  2 llvmpipe-12              0.07     0.06
	  2 llvmpipe-13              0.07     0.06
	  2 llvmpipe-14              0.07     0.06
	  2 llvmpipe-15              0.07     0.06
	  2 llvmpipe-2               0.07     0.06
	  2 llvmpipe-3               0.07     0.06
	  2 llvmpipe-4               0.07     0.06
	  2 llvmpipe-5               0.07     0.06
	  2 llvmpipe-6               0.07     0.06
	  2 llvmpipe-7               0.07     0.06
	  2 llvmpipe-8               0.07     0.06
	  2 llvmpipe-9               0.07     0.06
	  2 glxinfo                  0.05     0.03
	  2 glxinfo:cs0              0.05     0.03
	  2 glxinfo:disk$0           0.05     0.03
	  2 glxinfo:sh0              0.05     0.03
	  2 glxinfo:shlo0            0.05     0.03
	  3 rocminfo                 0.03     0.00
	  1 lspci                    0.01     0.01
	  1 ps                       0.00     0.01
	 97 sh                       0.00     0.00
	 24 quadray                  0.00     0.00
	 13 gcc                      0.00     0.00
	 13 gsettings                0.00     0.00
	  9 systemd-detect-          0.00     0.00
	  8 stat                     0.00     0.00
	  6 llvm-link                0.00     0.00
	  5 phoronix-test-s          0.00     0.00
	  3 gmain                    0.00     0.00
	  2 cc                       0.00     0.00
	  2 lscpu                    0.00     0.00
	  2 uname                    0.00     0.00
	  2 which                    0.00     0.00
	  2 xset                     0.00     0.00
	  1 date                     0.00     0.00
	  1 dirname                  0.00     0.00
	  1 dmesg                    0.00     0.00
	  1 dmidecode                0.00     0.00
	  1 grep                     0.00     0.00
	  1 ifconfig                 0.00     0.00
	  1 ip                       0.00     0.00
	  1 lsmod                    0.00     0.00
	  1 mktemp                   0.00     0.00
	  1 qdbus                    0.00     0.00
	  1 readlink                 0.00     0.00
	  1 realpath                 0.00     0.00
	  1 sed                      0.00     0.00
	  1 sort                     0.00     0.00
	  1 stty                     0.00     0.00
	  1 systemctl                0.00     0.00
	  1 template.sh              0.00     0.00
	  1 wc                       0.00     0.00
	  1 xrandr                   0.00     0.00
0 processes running
47 maximum processes

Computation blocks

      22741) quadray          cpu=13 start=5.64  finish=25.78
        22742) RooT.x64f64      cpu=9 start=5.64  finish=25.78
          22743) RooT.x64f64      cpu=0 start=5.64  finish=25.77
          22744) RooT.x64f64      cpu=1 start=5.64  finish=25.77
          22745) RooT.x64f64      cpu=2 start=5.64  finish=25.77
          22746) RooT.x64f64      cpu=3 start=5.64  finish=25.77
          22747) RooT.x64f64      cpu=4 start=5.64  finish=25.77
          22748) RooT.x64f64      cpu=5 start=5.64  finish=25.77
          22749) RooT.x64f64      cpu=6 start=5.64  finish=25.77
          22750) RooT.x64f64      cpu=7 start=5.64  finish=25.77
          22751) RooT.x64f64      cpu=8 start=5.64  finish=25.77
          22752) RooT.x64f64      cpu=9 start=5.64  finish=25.77
          22753) RooT.x64f64      cpu=10 start=5.64  finish=25.77
          22754) RooT.x64f64      cpu=11 start=5.64  finish=25.77
          22755) RooT.x64f64      cpu=12 start=5.64  finish=25.77
          22756) RooT.x64f64      cpu=13 start=5.64  finish=25.77
          22757) RooT.x64f64      cpu=14 start=5.64  finish=25.77
          22758) RooT.x64f64      cpu=15 start=5.64  finish=25.77
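
A listing like the one above can be turned into per-thread run times with a small parser. This is a minimal sketch that assumes the line layout seen here (`pid) name cpu=N start=S finish=F`, with indentation marking the parent/child depth). The regular expression and the field names are hypothetical and not part of the tool that produced the listing.

```python
import re

# Three lines copied from the listing above, indentation preserved.
sample = """\
      22741) quadray          cpu=13 start=5.64  finish=25.78
        22742) RooT.x64f64      cpu=9 start=5.64  finish=25.78
          22743) RooT.x64f64      cpu=0 start=5.64  finish=25.77
"""

LINE = re.compile(
    r"^(?P<indent>\s*)(?P<pid>\d+)\)\s+(?P<name>\S+)\s+"
    r"cpu=(?P<cpu>\d+)\s+start=(?P<start>[\d.]+)\s+finish=(?P<finish>[\d.]+)"
)


def parse_blocks(text):
    """Return one dict per process line, skipping lines that do not match."""
    rows = []
    for line in text.splitlines():
        m = LINE.match(line)
        if m is None:
            continue
        start, finish = float(m["start"]), float(m["finish"])
        rows.append({
            "depth": len(m["indent"]),   # deeper indent = child of the line above
            "pid": int(m["pid"]),
            "name": m["name"],
            "cpu": int(m["cpu"]),
            "seconds": round(finish - start, 2),
        })
    return rows


rows = parse_blocks(sample)
for row in rows:
    print(f'{row["pid"]} {row["name"]:12s} cpu={row["cpu"]:<2d} {row["seconds"]:.2f}s')
```

In this block the 16 worker threads each start at 5.64 and finish at 25.77, one per CPU from 0 to 15, so every worker runs for about 20 seconds on its own core.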