A Python library that uses TensorFlow to measure the performance of various AI models. It is a single test that reports both training and inference scores. The number of running processes appears to bounce between the number of cores, a single thread, and values in between. It also seems to vary with the sub-workload.

The topdown metrics resemble those of the “tensorflow” test in being heavily backend bound. That test shows more regular patterns, while this one moves through different sub-tests.

The AMD metrics show little floating-point code (about 32 floating-point operations per 1000 instructions), high backend stalls (70.4% of slots), and low retiring (10.2%) and frontend stalls (3.5%). This report is one of the first to include opcache, TLB, and icache statistics.

elapsed              1051.310
on_cpu               0.735          # 11.76 / 16 cores
utime                12130.094
stime                232.507
nvcsw                15457321       # 97.25%
nivcsw               436597         # 2.75%
inblock              0              # 0.00/sec
onblock              12584          # 11.97/sec
cpu-clock            12368006983271 # 12368.007 seconds
task-clock           12373062836640 # 12373.063 seconds
page faults          26680062       # 2156.302/sec
context switches     15898993       # 1284.968/sec
cpu migrations       3022862        # 244.310/sec
major page faults    7              # 0.001/sec
minor page faults    26680055       # 2156.302/sec
alignment faults     0              # 0.000/sec
emulation faults     0              # 0.000/sec
branches             957490729882   # 31.389 branches per 1000 inst
branch misses        13287236219    # 1.39% branch miss
conditional          801168210108   # 26.264 conditional branches per 1000 inst
indirect             23166096714    # 0.759 indirect branches per 1000 inst
cpu-cycles           49761243136903 # 2.95 GHz
instructions         30637757626442 # 0.62 IPC low
slots                99453847169220 #
retiring             10179875202861 # 10.2% (12.1%) low
-- ucode             2580640907     #     0.0%
-- fastpath          10177294561954 #    10.2%
frontend             3462384584201  #  3.5% ( 4.1%) low
-- latency           2663176605084  #     2.7%
-- bandwidth         799207979117   #     0.8%
backend              70000221724395 # 70.4% (83.5%) high
-- cpu               36281460856251 #    36.5%
-- memory            33718760868144 #    33.9%
speculation          141920805302   #  0.1% ( 0.2%) low
-- branch mispredict 124024703083   #     0.1%
-- pipeline restart  17896102219    #     0.0%
smt-contention       15668164812391 # 15.8% ( 0.0%)
cpu-cycles           49849520997432 # 2.96 GHz
instructions         30646476066015 # 0.61 IPC low
instructions         10204482310837 # 131.096 l2 access per 1000 inst
l2 hit from l1       992671755236   # 12.84% l2 miss
l2 miss from l1      56974474740    #
l2 hit from l2 pf    230317005078   #
l3 hit from l2 pf    74512435238    #
l3 miss from l2 pf   40261901613    #
instructions         10210314170078 # 31.654 float per 1000 inst
float 512            68             # 0.000 AVX-512 per 1000 inst
float 256            15380490758    # 1.506 AVX-256 per 1000 inst
float 128            307817367452   # 30.148 AVX-128 per 1000 inst
float MMX            0              # 0.000 MMX per 1000 inst
float scalar         6              # 0.000 scalar per 1000 inst
instructions         2668409        #
opcache              996007         # 373.259 opcache per 1000 inst
opcache miss         537203         # 53.9% opcache miss rate
l1 dTLB miss         4865           # 1.823 L1 dTLB per 1000 inst
l2 dTLB miss         990            # 0.371 L2 dTLB per 1000 inst
instructions         2715789        #
icache               1316573        # 484.785 icache per 1000 inst
icache miss          112561         #  8.5% icache miss rate
l1 iTLB miss         13             # 0.005 L1 iTLB per 1000 inst
l2 iTLB miss         0              # 0.000 L2 iTLB per 1000 inst
tlb flush            18             # 0.007 TLB flush per 1000 inst
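
The derived figures in the AMD listing can be reproduced from the raw counters. The sketch below is an illustration, not the reporting tool's actual code. It assumes the plain percentage is a share of all slots and that the parenthesized percentage excludes the smt-contention slots. That second assumption is inferred from the numbers rather than documented here. The helper name `pct` is hypothetical.

```python
# Raw AMD counters copied from the listing above.
slots = 99453847169220
smt_contention = 15668164812391
topdown = {
    "retiring": 10179875202861,
    "frontend": 3462384584201,
    "backend": 70000221724395,
    "speculation": 141920805302,
}
cycles = 49761243136903
instructions = 30637757626442

# Floating-point counters (512, 256, 128, MMX, scalar) and the
# instruction count sampled alongside them.
float_ops = [68, 15380490758, 307817367452, 0, 6]
float_instructions = 10210314170078


def pct(part, whole):
    """Percentage of `whole` taken by `part`, rounded to one decimal."""
    return round(100.0 * part / whole, 1)


# Share of all issue slots (the first percentage on each line).
share_of_slots = {k: pct(v, slots) for k, v in topdown.items()}

# Assumption: the parenthesized percentage uses slots minus smt-contention.
share_excl_smt = {k: pct(v, slots - smt_contention) for k, v in topdown.items()}

# Instructions per cycle and float operations per 1000 instructions.
ipc = round(instructions / cycles, 2)
float_per_1000 = round(1000.0 * sum(float_ops) / float_instructions, 3)

print(share_of_slots)
print(share_excl_smt)
print(ipc, float_per_1000)
```

With these inputs the backend share comes out at 70.4% of all slots and 83.5% once smt-contention is excluded, which matches the listing.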

Intel metrics

elapsed              1337.941
on_cpu               0.494          # 7.91 / 16 cores
utime                10366.642
stime                215.507
nvcsw                8160500        # 93.34%
nivcsw               582164         # 6.66%
inblock              148904         # 111.29/sec
onblock              1344           # 1.00/sec
cpu-clock            10557710888684 # 10557.711 seconds
task-clock           10563300378839 # 10563.300 seconds
page faults          22965931       # 2174.125/sec
context switches     8749217        # 828.265/sec
cpu migrations       2136601        # 202.266/sec
major page faults    724            # 0.069/sec
minor page faults    22965207       # 2174.056/sec
alignment faults     0              # 0.000/sec
emulation faults     0              # 0.000/sec
branches             1798252958796  # 41.581 branches per 1000 inst
branch misses        8481393140     # 0.47% branch miss
conditional          1798252972524  # 41.581 conditional branches per 1000 inst
indirect             548815196792   # 12.690 indirect branches per 1000 inst
slots                58533044377064 #
retiring             23888560940405 # 40.8% (40.8%)
-- ucode             1622686986715  #     2.8%
-- fastpath          22265873953690 #    38.0%
frontend             7616630872137  # 13.0% (13.0%)
-- latency           5497768921797  #     9.4%
-- bandwidth         2118861950340  #     3.6%
backend              25098776551734 # 42.9% (42.9%)
-- cpu               11916166002514 #    20.4%
-- memory            13182610549220 #    22.5%
speculation          2054328069108  #  3.5% ( 3.5%)
-- branch mispredict 1869991804975  #     3.2%
-- pipeline restart  184336264133   #     0.3%
smt-contention       0              #  0.0% ( 0.0%)
cpu-cycles           31128010635019 # 1.46 GHz
instructions         57746411381473 # 1.86 IPC
l2 access            1262468579018  # 48.640 l2 access per 1000 inst
l2 miss              297475919726   # 23.56% l2 miss
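
The on_cpu line in both listings appears to be user plus system time divided by elapsed time, then normalized by the core count. The sketch below checks that reading against the reported values. The function name `on_cpu` and the default of 16 cores (taken from the "/ 16 cores" comment) are illustrative assumptions.

```python
def on_cpu(utime, stime, elapsed, cores=16):
    """Return (average busy cores, fraction of all cores kept busy)."""
    busy_cores = (utime + stime) / elapsed
    return busy_cores, busy_cores / cores


# utime, stime and elapsed copied from the AMD and Intel listings.
amd_busy, amd_frac = on_cpu(12130.094, 232.507, 1051.310)
intel_busy, intel_frac = on_cpu(10366.642, 215.507, 1337.941)

print(f"AMD:   {amd_busy:.2f} / 16 cores, on_cpu {amd_frac:.3f}")
print(f"Intel: {intel_busy:.2f} / 16 cores, on_cpu {intel_frac:.3f}")
```

Under this reading the AMD run keeps about 11.76 cores busy on average and the Intel run about 7.91, consistent with the on_cpu values of 0.735 and 0.494.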

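The process overview is dominated by python3; the remaining entries are mostly brief system-information utilities such as clinfo, vulkaninfo, and glxinfo.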

410 processes
	 54 python3              600264.00  9015.82
	 68 clinfo                  17.19     5.00
	 38 vulkaninfo               1.15     0.96
	  4 vulkani:disk$0           0.13     0.11
	  6 php                      0.11     0.23
	  6 glxinfo:gdrv0            0.08     0.09
	  6 glxinfo:gl0              0.08     0.09
	  2 llvmpipe-0               0.07     0.06
	  2 llvmpipe-1               0.07     0.06
	  2 llvmpipe-10              0.07     0.06
	  2 llvmpipe-11              0.07     0.06
	  2 llvmpipe-12              0.07     0.06
	  2 llvmpipe-13              0.07     0.06
	  2 llvmpipe-14              0.07     0.06
	  2 llvmpipe-15              0.07     0.06
	  2 llvmpipe-2               0.07     0.06
	  2 llvmpipe-3               0.07     0.06
	  2 llvmpipe-4               0.07     0.06
	  2 llvmpipe-5               0.07     0.06
	  2 llvmpipe-6               0.07     0.06
	  2 llvmpipe-7               0.07     0.06
	  2 llvmpipe-8               0.07     0.06
	  2 llvmpipe-9               0.07     0.06
	  2 glxinfo                  0.05     0.04
	  6 clang                    0.04     0.08
	  2 glxinfo:cs0              0.04     0.04
	  2 glxinfo:disk$0           0.04     0.04
	  2 glxinfo:sh0              0.04     0.04
	  2 glxinfo:shlo0            0.04     0.04
	  3 rocminfo                 0.03     0.00
	  1 lspci                    0.00     0.02
	 81 sh                       0.00     0.00
	 12 gcc                      0.00     0.00
	 11 gsettings                0.00     0.00
	  8 stat                     0.00     0.00
	  8 systemd-detect-          0.00     0.00
	  6 llvm-link                0.00     0.00
	  5 phoronix-test-s          0.00     0.00
	  4 uname                    0.00     0.00
	  3 gmain                    0.00     0.00
	  3 lscpu                    0.00     0.00
	  2 dconf worker             0.00     0.00
	  2 dmesg                    0.00     0.00
	  2 file                     0.00     0.00
	  2 which                    0.00     0.00
	  2 xset                     0.00     0.00
	  1 ai-benchmark             0.00     0.00
	  1 cat                      0.00     0.00
	  1 cc                       0.00     0.00
	  1 date                     0.00     0.00
	  1 dirname                  0.00     0.00
	  1 dmidecode                0.00     0.00
	  1 grep                     0.00     0.00
	  1 ifconfig                 0.00     0.00
	  1 ip                       0.00     0.00
	  1 lsmod                    0.00     0.00
	  1 mktemp                   0.00     0.00
	  1 ps                       0.00     0.00
	  1 python                   0.00     0.00
	  1 qdbus                    0.00     0.00
	  1 readlink                 0.00     0.00
	  1 realpath                 0.00     0.00
	  1 sed                      0.00     0.00
	  1 sort                     0.00     0.00
	  1 stty                     0.00     0.00
	  1 sysctl                   0.00     0.00
	  1 systemctl                0.00     0.00
	  1 template.sh              0.00     0.00
	  1 wc                       0.00     0.00
	  1 xrandr                   0.00     0.00
1 processes running
61 maximum processes

An example computation block

      206953) ai-benchmark     cpu=12 start=5.67  finish=1046.64
        206954) python3          cpu=2 start=5.68  finish=1046.37
          206955) python3          cpu=15 start=5.86  finish=1046.37
          206956) python3          cpu=0 start=5.86  finish=1046.37
          206957) python3          cpu=1 start=5.86  finish=1046.37
          206958) python3          cpu=10 start=5.86  finish=1046.37
          206959) python3          cpu=11 start=5.86  finish=1046.37
          206960) python3          cpu=13 start=5.86  finish=1046.37
          206961) python3          cpu=12 start=5.86  finish=1046.37
          206962) python3          cpu=14 start=5.86  finish=1046.37
          206963) python3          cpu=7 start=5.86  finish=1046.37
          206964) python3          cpu=8 start=5.86  finish=1046.37
          206965) python3          cpu=9 start=5.86  finish=1046.37
          206966) python3          cpu=6 start=5.86  finish=1046.37
          206967) python3          cpu=3 start=5.86  finish=1046.37
          206968) python3          cpu=5 start=5.86  finish=1046.37
          206969) python3          cpu=4 start=5.86  finish=1046.37
          206970) file             cpu=7 start=6.72  finish=6.73 
          206971) uname            cpu=1 start=6.73  finish=6.73 
          206972) python3          cpu=15 start=6.73  finish=7.80 
            206973) file             cpu=1 start=6.75  finish=6.76 
            206974) uname            cpu=10 start=6.76  finish=6.76 
            206975) cat              cpu=4 start=6.76  finish=6.76 
            206976) lscpu            cpu=1 start=6.76  finish=6.77 
            206977) sysctl           cpu=10 start=6.77  finish=6.77 
            206978) dmesg            cpu=4 start=6.77  finish=6.78 
            206979) python3          cpu=1 start=6.78  finish=7.79 
              206981) ?? cpu=0 start=7.79  finish=0.00 
          206982) python3          cpu=13 start=7.80  finish=1046.37
          206983) python3          cpu=0 start=7.80  finish=1046.37
          206984) python3          cpu=11 start=7.80  finish=1046.37
          206985) python3          cpu=9 start=7.80  finish=1046.37
          206986) python3          cpu=9 start=7.80  finish=1046.37
          206987) python3          cpu=8 start=7.80  finish=1046.37
          206988) python3          cpu=15 start=7.80  finish=1046.37
          206989) python3          cpu=6 start=7.80  finish=1046.37
          206990) python3          cpu=13 start=7.80  finish=1046.37
          206991) python3          cpu=5 start=7.80  finish=1046.37
          206992) python3          cpu=10 start=7.80  finish=1046.37
          206993) python3          cpu=14 start=7.80  finish=1046.37
          206994) python3          cpu=12 start=7.80  finish=1046.37
          206995) python3          cpu=3 start=7.80  finish=1046.37
          206996) python3          cpu=4 start=7.80  finish=1046.37
          206997) python3          cpu=12 start=7.81  finish=1046.37
          206998) python3          cpu=11 start=7.81  finish=7.81 
          207000) python3          cpu=7 start=14.34 finish=1046.37
          207001) python3          cpu=7 start=14.34 finish=1046.37
          207002) python3          cpu=5 start=14.34 finish=1046.37
          207003) python3          cpu=1 start=14.34 finish=1046.37
          207004) python3          cpu=1 start=14.34 finish=1046.37
          207005) python3          cpu=1 start=14.34 finish=1046.37
          207006) python3          cpu=2 start=14.34 finish=1046.37
          207007) python3          cpu=4 start=14.34 finish=1046.37
          207008) python3          cpu=6 start=14.34 finish=1046.37
          207009) python3          cpu=8 start=14.34 finish=1046.37
          207010) python3          cpu=3 start=14.34 finish=1046.37
          207011) python3          cpu=10 start=14.35 finish=1046.37
          207012) python3          cpu=14 start=14.35 finish=1046.37
          207013) python3          cpu=14 start=14.35 finish=1046.37
          207014) python3          cpu=13 start=14.35 finish=1046.37
          207015) python3          cpu=12 start=14.35 finish=1046.37
          207016) python3          cpu=2 start=14.35 finish=1046.37
          207017) python3          cpu=7 start=14.70 finish=14.70
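
To see how the number of live entries changes over time, a listing like the one above can be parsed and sampled at a given timestamp. This is a minimal sketch, assuming each line has the form `pid) name cpu=N start=S finish=F` and that `finish=0.00` marks an entry whose end was not recorded. The names `parse` and `alive_at` are hypothetical, and the sample is a small excerpt of the listing.

```python
import re

# One line of the listing: "pid) name cpu=N start=S finish=F".
LINE = re.compile(
    r"^\s*(\d+)\)\s+(.+?)\s+cpu=(\d+)\s+start=([\d.]+)\s+finish=([\d.]+)\s*$"
)


def parse(lines):
    """Turn listing lines into dicts, skipping lines that do not match."""
    procs = []
    for line in lines:
        m = LINE.match(line)
        if m:
            pid, name, cpu, start, finish = m.groups()
            procs.append({
                "pid": int(pid),
                "name": name,
                "cpu": int(cpu),
                "start": float(start),
                "finish": float(finish),
            })
    return procs


def alive_at(procs, t):
    """Count entries with start <= t < finish.

    Entries with finish=0.00 never count as alive (assumed unknown end).
    """
    return sum(1 for p in procs if p["start"] <= t < p["finish"])


sample = """\
      206953) ai-benchmark     cpu=12 start=5.67  finish=1046.64
        206954) python3          cpu=2 start=5.68  finish=1046.37
          206970) file             cpu=7 start=6.72  finish=6.73
          206972) python3          cpu=15 start=6.73  finish=7.80
              206981) ?? cpu=0 start=7.79  finish=0.00
          207000) python3          cpu=7 start=14.34 finish=1046.37
""".splitlines()

procs = parse(sample)
for t in (6.0, 7.0, 20.0):
    print(t, alive_at(procs, t))
```

Running `alive_at` over the full listing at regular intervals would give a rough concurrency profile, though entries that are alive are not necessarily running on a CPU at that moment.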