A Vulkan compute benchmark. It probably exercises the GPU more than the CPU, but it is still interesting to see as a workload. The AMD scores are ~4x the Intel scores and the "on_cpu" figure for AMD is extremely low (it rounds to 0.000), so on AMD this is most likely a GPU benchmark rather than a CPU benchmark.
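
To make the "extremely low" on_cpu figure concrete, the sketch below recomputes it from the AMD elapsed, utime and stime values in the metrics that follow. The formula (CPU seconds used divided by CPU seconds available across 16 cores) is an assumption about how the tool derives on_cpu, and the function name is hypothetical.

```python
# Values copied from the AMD metrics in this report.
ELAPSED_S = 553.514  # wall-clock seconds
UTIME_S = 1.526      # user CPU seconds
STIME_S = 1.212      # system CPU seconds
NUM_CORES = 16


def on_cpu_fraction(utime, stime, elapsed, cores):
    """Assumed definition: CPU seconds used / CPU seconds available."""
    return (utime + stime) / (elapsed * cores)


amd_on_cpu = on_cpu_fraction(UTIME_S, STIME_S, ELAPSED_S, NUM_CORES)
print(f"AMD on_cpu = {amd_on_cpu:.3f} ({amd_on_cpu * 100:.3f}% of {NUM_CORES} cores)")
```

Under this assumption the CPUs are busy for well under 0.1% of the available time, which is consistent with the 0.000 shown in the metrics.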

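The topdown profile is less interesting for a GPU workload. Of the limited CPU time, the largest share on AMD is frontend stalls (48.7% of slots, mostly latency); on Intel, frontend (27.5%) and backend (27.4%) stalls are roughly equal.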
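
The topdown percentages can be re-derived from the raw counters as a sanity check. The sketch below assumes each top-level category is reported as a share of the total slots count; the dictionary layout and function name are illustrative, and the counts are copied from the AMD and Intel metrics in this report.

```python
# Top-level topdown counters copied from the metrics in this report.
AMD = {
    "slots": 14128227432,
    "retiring": 4129006916,
    "frontend": 6880000619,
    "backend": 2397483487,
    "speculation": 712684044,
}
INTEL = {
    "slots": 31725273092,
    "retiring": 10600531446,
    "frontend": 8715553434,
    "backend": 8702700014,
    "speculation": 3449049077,
}


def topdown_shares(counts):
    """Return each category as a percentage of total slots (assumed definition)."""
    slots = counts["slots"]
    return {name: 100.0 * value / slots
            for name, value in counts.items() if name != "slots"}


for label, counts in (("AMD", AMD), ("Intel", INTEL)):
    shares = topdown_shares(counts)
    summary = ", ".join(f"{name} {pct:.1f}%" for name, pct in shares.items())
    print(f"{label}: {summary}")
```

The recomputed shares match the percentages printed by the profiler to one decimal place, so the frontend-heavy AMD profile is not a rounding artifact.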

AMD metrics

elapsed              553.514
on_cpu               0.000          # 0.00 / 16 cores
utime                1.526
stime                1.212
nvcsw                2733           # 86.98%
nivcsw               409            # 13.02%
inblock              0              # 0.00/sec
onblock              12856          # 23.23/sec
cpu-clock            2855338425     # 2.855 seconds
task-clock           2868514388     # 2.869 seconds
page faults          188333         # 65655.240/sec
context switches     5740           # 2001.036/sec
cpu migrations       365            # 127.244/sec
major page faults    2              # 0.697/sec
minor page faults    188331         # 65654.543/sec
alignment faults     0              # 0.000/sec
emulation faults     0              # 0.000/sec
branches             2368093368     # 193.615 branches per 1000 inst
branch misses        95611140       # 4.04% branch miss
conditional          1562371350     # 127.739 conditional branches per 1000 inst
indirect             71829798       # 5.873 indirect branches per 1000 inst
cpu-cycles           6729727183     # 0.00 GHz
instructions         11277840703    # 1.68 IPC
slots                14128227432    #
retiring             4129006916     # 29.2% (29.2%)
-- ucode             14670688       #     0.1%
-- fastpath          4114336228     #    29.1%
frontend             6880000619     # 48.7% (48.7%) high
-- latency           5825721654     #    41.2%
-- bandwidth         1054278965     #     7.5%
backend              2397483487     # 17.0% (17.0%) low
-- cpu               367027234      #     2.6%
-- memory            2030456253     #    14.4%
speculation          712684044      #  5.0% ( 5.0%)
-- branch mispredict 705290499      #     5.0%
-- pipeline restart  7393545        #     0.1%
smt-contention       8850370        #  0.1% ( 0.0%)
cpu-cycles           6758209267     # 0.00 GHz
instructions         11770449191    # 1.74 IPC
instructions         4184068846     # 37.128 l2 access per 1000 inst
l2 hit from l1       133916262      # 20.49% l2 miss
l2 miss from l1      20575948       #
l2 hit from l2 pf    10174800       #
l3 hit from l2 pf    5159661        #
l3 miss from l2 pf   6096840        #
instructions         4024389819     # 16.687 float per 1000 inst
float 512            60             # 0.000 AVX-512 per 1000 inst
float 256            620            # 0.000 AVX-256 per 1000 inst
float 128            67152429       # 16.686 AVX-128 per 1000 inst
float MMX            0              # 0.000 MMX per 1000 inst
float scalar         0              # 0.000 scalar per 1000 inst

Intel metrics

elapsed              487.511
on_cpu               0.000          # 0.00 / 16 cores
utime                1.679
stime                0.655
nvcsw                2699           # 93.98%
nivcsw               173            # 6.02%
inblock              15608          # 32.02/sec
onblock              1944           # 3.99/sec
cpu-clock            2416426268     # 2.416 seconds
task-clock           2426124568     # 2.426 seconds
page faults          173841         # 71653.782/sec
context switches     5143           # 2119.842/sec
cpu migrations       261            # 107.579/sec
major page faults    77             # 31.738/sec
minor page faults    173764         # 71622.044/sec
alignment faults     0              # 0.000/sec
emulation faults     0              # 0.000/sec
branches             2151066026     # 188.267 branches per 1000 inst
branch misses        28106902       # 1.31% branch miss
conditional          2151078154     # 188.268 conditional branches per 1000 inst
indirect             74254188       # 6.499 indirect branches per 1000 inst
slots                31725273092    #
retiring             10600531446    # 33.4% (33.4%)
-- ucode             1170997398     #     3.7%
-- fastpath          9429534048     #    29.7%
frontend             8715553434     # 27.5% (27.5%)
-- latency           4390001681     #    13.8%
-- bandwidth         4325551753     #    13.6%
backend              8702700014     # 27.4% (27.4%)
-- cpu               2757085515     #     8.7%
-- memory            5945614499     #    18.7%
speculation          3449049077     # 10.9% (10.9%) high
-- branch mispredict 3212998733     #    10.1%
-- pipeline restart  236050344      #     0.7%
smt-contention       0              #  0.0% ( 0.0%)
cpu-cycles           5320185366     # 0.00 GHz
instructions         10362909825    # 1.95 IPC
l2 access            370759599      # 36.047 l2 access per 1000 inst
l2 miss              143647561      # 38.74% l2 miss

The process overview shows that the CPU time is spent in the test scaffold (clinfo, vulkaninfo) rather than in vkpeak itself.

356 processes
	 68 clinfo                  19.85     6.32
	 38 vulkaninfo               1.14     1.50
	  6 vkpeak                   0.43     0.36
	  3 vkpeak:disk$0            0.43     0.36
	  6 glxinfo:gdrv0            0.13     0.04
	  6 glxinfo:gl0              0.13     0.04
	  4 vulkani:disk$0           0.12     0.16
	  2 glxinfo                  0.08     0.02
	  2 glxinfo:cs0              0.08     0.02
	  2 glxinfo:disk$0           0.08     0.02
	  2 glxinfo:sh0              0.08     0.02
	  2 glxinfo:shlo0            0.08     0.02
	  6 php                      0.06     0.23
	  2 llvmpipe-0               0.06     0.08
	  2 llvmpipe-1               0.06     0.08
	  2 llvmpipe-10              0.06     0.08
	  2 llvmpipe-11              0.06     0.08
	  2 llvmpipe-12              0.06     0.08
	  2 llvmpipe-13              0.06     0.08
	  2 llvmpipe-14              0.06     0.08
	  2 llvmpipe-15              0.06     0.08
	  2 llvmpipe-2               0.06     0.08
	  2 llvmpipe-3               0.06     0.08
	  2 llvmpipe-4               0.06     0.08
	  2 llvmpipe-5               0.06     0.08
	  2 llvmpipe-6               0.06     0.08
	  2 llvmpipe-7               0.06     0.08
	  2 llvmpipe-8               0.06     0.08
	  2 llvmpipe-9               0.06     0.08
	  6 clang                    0.05     0.07
	  3 rocminfo                 0.03     0.00
	  1 lspci                    0.01     0.02
	  1 ps                       0.00     0.01
	 82 sh                       0.00     0.00
	 15 gsettings                0.00     0.00
	 12 gcc                      0.00     0.00
	  9 systemd-detect-          0.00     0.00
	  8 stat                     0.00     0.00
	  6 llvm-link                0.00     0.00
	  5 phoronix-test-s          0.00     0.00
	  2 lscpu                    0.00     0.00
	  2 uname                    0.00     0.00
	  2 which                    0.00     0.00
	  2 xset                     0.00     0.00
	  1 cc                       0.00     0.00
	  1 date                     0.00     0.00
	  1 dirname                  0.00     0.00
	  1 dmesg                    0.00     0.00
	  1 dmidecode                0.00     0.00
	  1 gmain                    0.00     0.00
	  1 grep                     0.00     0.00
	  1 ifconfig                 0.00     0.00
	  1 ip                       0.00     0.00
	  1 lsmod                    0.00     0.00
	  1 mktemp                   0.00     0.00
	  1 qdbus                    0.00     0.00
	  1 readlink                 0.00     0.00
	  1 realpath                 0.00     0.00
	  1 sed                      0.00     0.00
	  1 sort                     0.00     0.00
	  1 stty                     0.00     0.00
	  1 systemctl                0.00     0.00
	  1 template.sh              0.00     0.00
	  1 wc                       0.00     0.00
	  1 xrandr                   0.00     0.00
0 processes running
47 maximum processes

Computation blocks

      2603431) vkpeak           cpu=3 start=6.66  finish=185.81
        2603432) vkpeak           cpu=15 start=6.67  finish=185.81
          2603433) vkpeak:disk$0    cpu=13 start=6.70  finish=185.81
      2603439) vkpeak           cpu=4 start=189.82 finish=369.02
        2603440) vkpeak           cpu=15 start=189.82 finish=369.01
          2603441) vkpeak:disk$0    cpu=9 start=189.85 finish=369.01
      2603443) vkpeak           cpu=8 start=373.02 finish=552.26
        2603444) vkpeak           cpu=11 start=373.03 finish=552.25
          2603445) vkpeak:disk$0    cpu=15 start=373.06 finish=552.25
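
The computation blocks show where the wall-clock time goes even though almost no CPU time is used. As a rough sketch, the snippet below takes the start and finish times of the three top-level vkpeak processes listed above and compares their combined duration with the AMD elapsed time of 553.514 seconds. Pairing these blocks with the AMD run is an assumption based on the final finish time (552.26) sitting just under that elapsed figure.

```python
# (start, finish) in seconds for the three top-level vkpeak processes above.
RUNS = [(6.66, 185.81), (189.82, 369.02), (373.02, 552.26)]
ELAPSED_S = 553.514  # AMD elapsed time; assumed to be the matching run

# Wall-clock duration of each vkpeak invocation.
durations = [finish - start for start, finish in RUNS]
# Idle gap between the end of one invocation and the start of the next.
gaps = [RUNS[i + 1][0] - RUNS[i][1] for i in range(len(RUNS) - 1)]
vkpeak_share = sum(durations) / ELAPSED_S

for i, seconds in enumerate(durations, start=1):
    print(f"run {i}: {seconds:.2f} s")
print("gaps between runs:", ", ".join(f"{g:.2f} s" for g in gaps))
print(f"vkpeak wall-clock share of elapsed: {vkpeak_share:.1%}")
```

Each of the three runs lasts about 179 seconds with roughly 4 seconds between them, so nearly all of the elapsed time is vkpeak waiting on the GPU, while the CPU time in the process overview is dominated by the scaffold.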