Benchmark for the ffmpeg multimedia framework, covering various video and image workloads. There are eight workloads with slightly different characteristics, all with a moderate retirement rate.

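AMD metrics show we're keeping only ~3.62 of the 16 cores busy. This is a floating-point application (about 162 float operations per 1000 instructions, almost all 128-bit), and some workloads have more branch prediction issues than others.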

elapsed              3516.551
on_cpu               0.226          # 3.62 / 16 cores
utime                12390.943
stime                340.884
nvcsw                15578181       # 97.24%
nivcsw               442087         # 2.76%
inblock              544            # 0.15/sec
onblock              2240680        # 637.18/sec
cpu-clock            12698964696026 # 12698.965 seconds
task-clock           12707598404585 # 12707.598 seconds
page faults          79503821       # 6256.400/sec
context switches     16034984       # 1261.842/sec
cpu migrations       1151669        # 90.628/sec
major page faults    209            # 0.016/sec
minor page faults    79503612       # 6256.384/sec
alignment faults     0              # 0.000/sec
emulation faults     0              # 0.000/sec
branches             7203248030886  # 75.632 branches per 1000 inst
branch misses        240049896958   # 3.33% branch miss
conditional          4533988502930  # 47.606 conditional branches per 1000 inst
indirect             523019511730   # 5.492 indirect branches per 1000 inst
cpu-cycles           50141302219917 # 0.90 GHz
instructions         95031760041237 # 1.90 IPC
slots                101058172691034 #
retiring             33343174547595 # 33.0% (38.6%)
-- ucode             369097077701   #     0.4%
-- fastpath          32974077469894 #    32.6%
frontend             18316326101792 # 18.1% (21.2%)
-- latency           11796656848290 #    11.7%
-- bandwidth         6519669253502  #     6.5%
backend              28902503738660 # 28.6% (33.5%)
-- cpu               9238344511430  #     9.1%
-- memory            19664159227230 #    19.5%
speculation          5741462680659  #  5.7% ( 6.7%)
-- branch mispredict 5493902043181  #     5.4%
-- pipeline restart  247560637478   #     0.2%
smt-contention       14753369236746 # 14.6% ( 0.0%)
cpu-cycles           50150039568465 # 0.90 GHz
instructions         95013999407440 # 1.89 IPC
instructions         31732577834943 # 38.418 l2 access per 1000 inst
l2 hit from l1       1041329163186  # 10.60% l2 miss
l2 miss from l1      77034115956    #
l2 hit from l2 pf    125542444785   #
l3 hit from l2 pf    32137285724    #
l3 miss from l2 pf   20098536952    #
instructions         31712371266847 # 162.036 float per 1000 inst
float 512            1031           # 0.000 AVX-512 per 1000 inst
float 256            8708556624     # 0.275 AVX-256 per 1000 inst
float 128            5129851498771  # 161.762 AVX-128 per 1000 inst
float MMX            0              # 0.000 MMX per 1000 inst
float scalar         431            # 0.000 scalar per 1000 inst
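
The on_cpu line in the AMD dump above can be reproduced from the elapsed, utime and stime rows. The sketch below assumes on_cpu is (utime + stime) / elapsed, divided by the core count. That formula is inferred from the numbers, not taken from the tool's source, and the function names are hypothetical.

```python
# Values copied from the AMD dump; the formula itself is an assumption
# inferred from the numbers, not taken from the measuring tool.
ELAPSED = 3516.551   # wall-clock seconds
UTIME = 12390.943    # user CPU seconds
STIME = 340.884      # system CPU seconds
CORES = 16


def busy_cores(utime, stime, elapsed):
    """Average number of cores kept busy over the run."""
    return (utime + stime) / elapsed


def on_cpu(utime, stime, elapsed, cores):
    """Fraction of the total core capacity that was used."""
    return busy_cores(utime, stime, elapsed) / cores


amd_busy = busy_cores(UTIME, STIME, ELAPSED)
amd_on_cpu = on_cpu(UTIME, STIME, ELAPSED, CORES)
print(f"{amd_busy:.2f} of {CORES} cores busy, on_cpu = {amd_on_cpu:.3f}")
```

With the AMD values this gives 3.62 busy cores and an on_cpu of 0.226, matching the dump.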

Intel metrics show similar utilization, ~3.59 of 16 cores.

elapsed              4537.973
on_cpu               0.224          # 3.59 / 16 cores
utime                15956.113
stime                313.231
nvcsw                17077668       # 93.39%
nivcsw               1209684        # 6.61%
inblock              872            # 0.19/sec
onblock              2240792        # 493.79/sec
cpu-clock            16173096823697 # 16173.097 seconds
task-clock           16184087266515 # 16184.087 seconds
page faults          79247994       # 4896.661/sec
context switches     18307277       # 1131.190/sec
cpu migrations       2784557        # 172.055/sec
major page faults    105            # 0.006/sec
minor page faults    79247889       # 4896.655/sec
alignment faults     0              # 0.000/sec
emulation faults     0              # 0.000/sec
branches             7230167358835  # 74.859 branches per 1000 inst
branch misses        237818000973   # 3.29% branch miss
conditional          7230168530419  # 74.859 conditional branches per 1000 inst
indirect             2079543575777  # 21.531 indirect branches per 1000 inst
slots                152043588937604 #
retiring             67100901539543 # 44.1% (44.1%)
-- ucode             4640783240152  #     3.1%
-- fastpath          62460118299391 #    41.1%
frontend             27983850905441 # 18.4% (18.4%)
-- latency           12653607703945 #     8.3%
-- bandwidth         15330243201496 #    10.1%
backend              33676837503150 # 22.1% (22.1%)
-- cpu               21959791561468 #    14.4%
-- memory            11717045941682 #     7.7%
speculation          24216123543815 # 15.9% (15.9%)
-- branch mispredict 23560593854272 #    15.5%
-- pipeline restart  655529689543   #     0.4%
smt-contention       0              #  0.0% ( 0.0%)
cpu-cycles           57065942417550 # 0.79 GHz
instructions         123132261252145 # 2.16 IPC
l2 access            1879230062109  # 29.384 l2 access per 1000 inst
l2 miss              324166645172   # 17.25% l2 miss
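
The topdown rows in both dumps carry two percentages. A minimal sketch of how they appear to be derived: the first looks like the category's share of all slots, the second its share of slots once smt-contention is excluded. This reading is an assumption that happens to reproduce the retiring rows; `topdown_share` is a hypothetical helper.

```python
def topdown_share(category_slots, total_slots, smt_slots=0):
    """Return (share of all slots, share excluding SMT contention) in percent."""
    all_pct = 100.0 * category_slots / total_slots
    no_smt_pct = 100.0 * category_slots / (total_slots - smt_slots)
    return all_pct, no_smt_pct


# Retiring rows copied from the AMD and Intel dumps.
amd_retiring = topdown_share(33343174547595, 101058172691034, 14753369236746)
intel_retiring = topdown_share(67100901539543, 152043588937604, 0)

print("AMD   retiring: %.1f%% (%.1f%%)" % amd_retiring)
print("Intel retiring: %.1f%% (%.1f%%)" % intel_retiring)
```

On the Intel run smt-contention is 0, so both columns are identical. On the AMD run the second column is the one to compare against Intel.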

The process tree shows different av:* codec threads (h264 and hevc) being used for different tests, along with a lot of short-lived processes.

36594 processes
	10914 ffmpeg               185063.52  3556.32
	1080 av:h264:df15         12462.15   276.74
	1080 av:h264:df14         12462.13   276.67
	1080 av:h264:df13         12462.08   276.64
	1080 av:h264:df12         12462.03   276.58
	1080 av:h264:df11         12461.98   276.51
	1080 av:h264:df10         12461.94   276.41
	1080 av:h264:df9          12461.87   276.33
	1080 av:h264:df8          12461.82   276.29
	1080 av:h264:df7          12461.71   276.27
	1080 av:h264:df6          12461.62   276.22
	1080 av:h264:df5          12461.59   276.11
	1080 av:h264:df4          12461.48   276.06
	1080 av:h264:df3          12461.42   275.99
	1080 av:h264:df2          12461.36   275.95
	1080 av:h264:df1          12461.30   275.91
	1080 av:h264:df0          12461.21   275.80
	900 dec0:0:h264           9524.23   211.29
	900 dmx0:matroska,w       9309.61   207.19
	360 mux0:matroska         7341.88   105.99
	540 mux0:null             4586.64   125.17
	360 dmx1:matroska,w       1036.13    65.92
	180 av:hevc:df15           610.90    38.27
	180 av:hevc:df14           610.90    38.23
	180 av:hevc:df12           610.89    38.23
	180 av:hevc:df13           610.89    38.23
	180 av:hevc:df11           610.88    38.22
	180 av:hevc:df10           610.87    38.22
	180 av:hevc:df9            610.87    38.21
	180 av:hevc:df7            610.87    38.19
	180 av:hevc:df8            610.87    38.19
	180 av:hevc:df4            610.87    38.17
	180 av:hevc:df5            610.87    38.17
	180 av:hevc:df6            610.87    38.17
	180 av:hevc:df2            610.86    38.16
	180 av:hevc:df3            610.86    38.16
	180 av:hevc:df1            610.85    38.16
	180 av:hevc:df0            610.84    38.15
	180 dec1:0:hevc            607.72    36.37
	180 dec1:0:h264            522.94    34.05
	1350 ffprobe                427.14     4.24
	 64 clinfo                  10.88     3.52
	 25 python3                  1.38     0.75
	 38 vulkaninfo               0.95     0.77
	  6 php                      0.16     0.68
	  4 vulkani:disk$0           0.10     0.09
	  6 glxinfo:gdrv0            0.05     0.10
	  2 llvmpipe-0               0.05     0.05
	  2 llvmpipe-1               0.05     0.05
	  2 llvmpipe-10              0.05     0.05
	  2 llvmpipe-11              0.05     0.05
	  2 llvmpipe-12              0.05     0.05
	  2 llvmpipe-13              0.05     0.05
	  2 llvmpipe-14              0.05     0.05
	  2 llvmpipe-15              0.05     0.05
	  2 llvmpipe-2               0.05     0.05
	  2 llvmpipe-3               0.05     0.05
	  2 llvmpipe-4               0.05     0.05
	  2 llvmpipe-5               0.05     0.05
	  2 llvmpipe-6               0.05     0.05
	  2 llvmpipe-7               0.05     0.05
	  2 llvmpipe-8               0.05     0.05
	  2 llvmpipe-9               0.05     0.05
	  2 glxinfo                  0.04     0.04
	  2 glxinfo:cs0              0.04     0.04
	  2 glxinfo:disk$0           0.04     0.04
	  2 glxinfo:shlo0            0.04     0.04
	  2 glxinfo:sh0              0.03     0.04
	  6 clang                    0.02     0.04
	  1 lspci                    0.00     0.02
	464 sh                       0.00     0.00
	 13 gcc                      0.00     0.00
	  9 gsettings                0.00     0.00
	  9 stty                     0.00     0.00
	  8 stat                     0.00     0.00
	  8 systemd-detect-          0.00     0.00
	  6 llvm-link                0.00     0.00
	  5 phoronix-test-s          0.00     0.00
	  4 gmain                    0.00     0.00
	  3 dconf worker             0.00     0.00
	  2 cc                       0.00     0.00
	  2 lscpu                    0.00     0.00
	  2 uname                    0.00     0.00
	  2 which                    0.00     0.00
	  2 xset                     0.00     0.00
	  1 date                     0.00     0.00
	  1 dirname                  0.00     0.00
	  1 dmesg                    0.00     0.00
	  1 dmidecode                0.00     0.00
	  1 grep                     0.00     0.00
	  1 ifconfig                 0.00     0.00
	  1 ip                       0.00     0.00
	  1 lsmod                    0.00     0.00
	  1 mktemp                   0.00     0.00
	  1 ps                       0.00     0.00
	  1 qdbus                    0.00     0.00
	  1 readlink                 0.00     0.00
	  1 realpath                 0.00     0.00
	  1 sed                      0.00     0.00
	  1 sort                     0.00     0.00
	  1 systemctl                0.00     0.00
	  1 template.sh              0.00     0.00
	  1 wc                       0.00     0.00
	  1 xrandr                   0.00     0.00
0 processes running
65 maximum processes
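
To see how much of the process count comes from the per-codec worker entries, the listing above can be grouped by name. The sketch below parses a small excerpt of it. The meaning of the two trailing numeric columns is not stated here, so they are matched but ignored, and `count_by_family` is a hypothetical helper.

```python
import re
from collections import Counter

# A few rows copied from the listing; the full listing parses the same way.
SAMPLE = """\
10914 ffmpeg               185063.52  3556.32
1080 av:h264:df15         12462.15   276.74
1080 av:h264:df14         12462.13   276.67
180 av:hevc:df15           610.90    38.27
900 dec0:0:h264           9524.23   211.29
3 dconf worker             0.00     0.00
"""

# count, name (may contain spaces), then two numeric columns
LINE = re.compile(r"^\s*(\d+)\s+(.+?)\s+(-?\d+\.\d+)\s+(-?\d+\.\d+)\s*$")


def family(name):
    """Collapse av:<codec>:df0..dfN into a single av:<codec>:df key."""
    return re.sub(r"(:df)\d+$", r"\1", name)


def count_by_family(text):
    """Sum the leading count column per name family."""
    totals = Counter()
    for line in text.splitlines():
        match = LINE.match(line)
        if match:
            totals[family(match.group(2))] += int(match.group(1))
    return totals


totals = count_by_family(SAMPLE)
for name, count in totals.most_common():
    print(f"{count:6d} {name}")
```

Applied to the full listing, the 16 av:h264:df rows at 1080 each and the 16 av:hevc:df rows at 180 each add up to 20160 of the 36594 entries.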

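An example test shows that parallelism might be limited more by the processes than by the available cores: each ffmpeg run starts 16 av:h264 threads, but the runs are short and execute one after another.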

     238602) ffmpeg           cpu=7 start=5.64  finish=41.62
        238603) python3          cpu=6 start=5.64  finish=41.62
          238604) ffprobe          cpu=1 start=5.65  finish=5.67 
          238605) ffprobe          cpu=2 start=5.67  finish=5.97 
          238606) ffprobe          cpu=6 start=5.97  finish=5.99 
          238607) ffmpeg           cpu=6 start=5.99  finish=7.02 
            238608) av:h264:df0      cpu=9 start=6.00  finish=7.02 
            238609) av:h264:df1      cpu=10 start=6.00  finish=7.02 
            238610) av:h264:df2      cpu=11 start=6.00  finish=7.02 
            238611) av:h264:df3      cpu=12 start=6.00  finish=7.02 
            238612) av:h264:df4      cpu=14 start=6.00  finish=7.02 
            238613) av:h264:df5      cpu=7 start=6.00  finish=7.02 
            238614) av:h264:df6      cpu=5 start=6.00  finish=7.02 
            238615) av:h264:df7      cpu=9 start=6.00  finish=7.02 
            238616) av:h264:df8      cpu=10 start=6.00  finish=7.02 
            238617) av:h264:df9      cpu=11 start=6.00  finish=7.02 
            238618) av:h264:df10     cpu=12 start=6.00  finish=7.02 
            238619) av:h264:df11     cpu=8 start=6.00  finish=7.02 
            238620) av:h264:df12     cpu=9 start=6.00  finish=7.02 
            238621) av:h264:df13     cpu=5 start=6.00  finish=7.02 
            238622) av:h264:df14     cpu=9 start=6.00  finish=7.02 
            238623) av:h264:df15     cpu=10 start=6.01  finish=7.02 
            238624) dec0:0:h264      cpu=7 start=6.01  finish=6.98 
            238625) dmx0:matroska,w  cpu=1 start=6.01  finish=6.97 
            238626) mux0:matroska    cpu=5 start=6.04  finish=7.02 
          238627) sh               cpu=9 start=7.03  finish=7.23 
            238628) ffmpeg           cpu=13 start=7.03  finish=7.22 
              238629) av:h264:df0      cpu=0 start=7.05  finish=7.22 
              238630) av:h264:df1      cpu=15 start=7.05  finish=7.22 
              238631) av:h264:df2      cpu=12 start=7.05  finish=7.22 
              238632) av:h264:df3      cpu=2 start=7.05  finish=7.22 
              238633) av:h264:df4      cpu=14 start=7.05  finish=7.22 
              238634) av:h264:df5      cpu=5 start=7.06  finish=7.22 
              238635) av:h264:df6      cpu=11 start=7.06  finish=7.22 
              238636) av:h264:df7      cpu=15 start=7.06  finish=7.22 
              238637) av:h264:df8      cpu=0 start=7.06  finish=7.22 
              238638) av:h264:df9      cpu=11 start=7.06  finish=7.22 
              238639) av:h264:df10     cpu=7 start=7.06  finish=7.22 
              238640) av:h264:df11     cpu=5 start=7.06  finish=7.22 
              238641) av:h264:df12     cpu=3 start=7.06  finish=7.22 
              238642) av:h264:df13     cpu=1 start=7.06  finish=7.22 
              238643) av:h264:df14     cpu=12 start=7.06  finish=7.22 
              238644) av:h264:df15     cpu=12 start=7.06  finish=7.22 
              238645) dec0:0:h264      cpu=6 start=7.06  finish=7.21 
              238646) av:h264:df0      cpu=3 start=7.06  finish=7.22 
              238647) av:h264:df1      cpu=10 start=7.06  finish=7.22 
              238648) av:h264:df2      cpu=4 start=7.06  finish=7.22 
              238649) av:h264:df3      cpu=9 start=7.06  finish=7.22 
              238650) av:h264:df4      cpu=3 start=7.06  finish=7.22 
              238651) av:h264:df5      cpu=7 start=7.06  finish=7.22 
              238652) av:h264:df6      cpu=2 start=7.06  finish=7.22 
              238653) av:h264:df7      cpu=12 start=7.06  finish=7.22 
              238654) av:h264:df8      cpu=10 start=7.06  finish=7.22 
              238655) av:h264:df9      cpu=0 start=7.06  finish=7.22 
              238656) av:h264:df10     cpu=14 start=7.06  finish=7.22 
              238657) av:h264:df11     cpu=2 start=7.06  finish=7.22 
              238658) av:h264:df12     cpu=10 start=7.06  finish=7.22 
              238659) av:h264:df13     cpu=7 start=7.06  finish=7.22 
              238660) av:h264:df14     cpu=11 start=7.06  finish=7.22 
              238661) av:h264:df15     cpu=14 start=7.06  finish=7.22 
              238662) dec1:0:h264      cpu=5 start=7.06  finish=7.21 
              238663) dmx0:matroska,w  cpu=6 start=7.07  finish=7.20 
              238664) dmx1:matroska,w  cpu=5 start=7.10  finish=7.20 
              238665) ffmpeg           cpu=6 start=7.13  finish=7.22 
              238666) ffmpeg           cpu=2 start=7.13  finish=7.22 
              238667) ffmpeg           cpu=8 start=7.13  finish=7.22 
              238668) ffmpeg           cpu=11 start=7.13  finish=7.22 
              238669) ffmpeg           cpu=9 start=7.13  finish=7.22 
              238670) ffmpeg           cpu=12 start=7.13  finish=7.22 
              238671) ffmpeg           cpu=5 start=7.13  finish=7.22 
              238672) ffmpeg           cpu=0 start=7.13  finish=7.22 
              238673) ffmpeg           cpu=1 start=7.13  finish=7.22 
              238674) ffmpeg           cpu=14 start=7.13  finish=7.22 
              238675) ffmpeg           cpu=10 start=7.13  finish=7.22 
              238676) ffmpeg           cpu=15 start=7.13  finish=7.22 
              238677) ffmpeg           cpu=4 start=7.13  finish=7.22 
              238678) ffmpeg           cpu=3 start=7.13  finish=7.22 
              238679) ffmpeg           cpu=7 start=7.13  finish=7.22 
              238680) mux0:null        cpu=11 start=7.13  finish=7.21 
          238681) ffprobe          cpu=12 start=7.23  finish=7.25 
          238682) ffprobe          cpu=11 start=7.25  finish=7.25 
          238683) ffprobe          cpu=12 start=7.25  finish=7.45 
          238684) ffprobe          cpu=6 start=7.45  finish=7.46 
          238685) ffmpeg           cpu=5 start=7.46  finish=7.98