{"id":525,"date":"2024-01-14T00:43:05","date_gmt":"2024-01-14T00:43:05","guid":{"rendered":"https:\/\/mvermeulen.org\/perf\/?page_id=525"},"modified":"2024-01-14T15:22:57","modified_gmt":"2024-01-14T15:22:57","slug":"uvg266","status":"publish","type":"page","link":"https:\/\/mvermeulen.org\/perf\/workloads\/phoronix\/uvg266\/","title":{"rendered":"uvg266"},"content":{"rendered":"\n<p>uvg266 is a video encoder for VVC\/H.266 based on Kvazaar. There are 10 workloads. Overall, AMD shows slightly lower IPC than Intel (1.89 vs. 1.98), but most metrics are similar.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/systemtime-15.png\" alt=\"\" class=\"wp-image-550\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/systemtime-15.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/systemtime-15-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/systemtime-15-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>Topdown overview<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-53.png\" alt=\"\" class=\"wp-image-552\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-53.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-53-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-53-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>AMD metrics<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed              1226.459\non_cpu               0.789          # 12.63 \/ 16 cores\nutime 
               15210.235\nstime                276.940\nnvcsw                20082024       # 61.17%\nnivcsw               12746891       # 38.83%\ninblock              0              # 0.00\/sec\nonblock              22496          # 18.34\/sec\ncpu-clock            15487261555793 # 15487.262 seconds\ntask-clock           15490033937969 # 15490.034 seconds\npage faults          14118358       # 911.448\/sec\ncontext switches     32834801       # 2119.737\/sec\ncpu migrations       777121         # 50.169\/sec\nmajor page faults    2              # 0.000\/sec\nminor page faults    14118356       # 911.448\/sec\nalignment faults     0              # 0.000\/sec\nemulation faults     0              # 0.000\/sec\nbranches             9488374461340  # 84.697 branches per 1000 inst\nbranch misses        176580290813   # 1.86% branch miss\nconditional          7372985440727  # 65.814 conditional branches per 1000 inst\nindirect             476354889281   # 4.252 indirect branches per 1000 inst\ncpu-cycles           59222760227158 # 3.02 GHz\ninstructions         112113224173244 # 1.89 IPC\nslots                118341567323460 #\nretiring             39571594319069 # 33.4% (50.0%)\n-- ucode             99400535036    #     0.1%\n-- fastpath          39472193784033 #    33.4%\nfrontend             18606899854651 # 15.7% (23.5%)\n-- latency           11404823371602 #     9.6%\n-- bandwidth         7202076483049  #     6.1%\nbackend              18508891525338 # 15.6% (23.4%)\n-- cpu               6387284955523  #     5.4%\n-- memory            12121606569815 #    10.2%\nspeculation          2443403174228  #  2.1% ( 3.1%)\n-- branch mispredict 2351990600409  #     2.0%\n-- pipeline restart  91412573819    #     0.1%\nsmt-contention       39209393842876 # 33.1% ( 0.0%)\ncpu-cycles           59233020683832 # 3.02 GHz\ninstructions         112104812727986 # 1.89 IPC\ninstructions         37336699054165 # 39.285 l2 access per 1000 inst\nl2 hit from l1       1124260604785  # 5.14% 
l2 miss\nl2 miss from l1      41199565892    #\nl2 hit from l2 pf    308374602830   #\nl3 hit from l2 pf    25687086559    #\nl3 miss from l2 pf   8446252065     #\ninstructions         37327978214966 # 123.040 float per 1000 inst\nfloat 512            106            # 0.000 AVX-512 per 1000 inst\nfloat 256            356            # 0.000 AVX-256 per 1000 inst\nfloat 128            4592836790618  # 123.040 AVX-128 per 1000 inst\nfloat MMX            0              # 0.000 MMX per 1000 inst\nfloat scalar         0              # 0.000 scalar per 1000 inst\n<\/code><\/pre>\n\n\n\n<p>Intel metrics<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed              2006.258\non_cpu               0.773          # 12.36 \/ 16 cores\nutime                24527.891\nstime                278.416\nnvcsw                26498601       # 77.59%\nnivcsw               7653750        # 22.41%\ninblock              18235080       # 9089.10\/sec\nonblock              17968          # 8.96\/sec\ncpu-clock            24807513144133 # 24807.513 seconds\ntask-clock           24812450667548 # 24812.451 seconds\npage faults          19015950       # 766.387\/sec\ncontext switches     34162087       # 1376.812\/sec\ncpu migrations       1480974        # 59.687\/sec\nmajor page faults    73             # 0.003\/sec\nminor page faults    19015877       # 766.384\/sec\nalignment faults     0              # 0.000\/sec\nemulation faults     0              # 0.000\/sec\nbranches             10988043510016 # 83.506 branches per 1000 inst\nbranch misses        202479273078   # 1.84% branch miss\nconditional          10988043555712 # 83.506 conditional branches per 1000 inst\nindirect             3409696603840  # 25.913 indirect branches per 1000 inst\nslots                119227881800384 #\nretiring             72354003596444 # 60.7% (60.7%)\n-- ucode             6177833054651  #     5.2%\n-- fastpath          66176170541793 #    55.5%\nfrontend             28456843609843 # 23.9% (23.9%)\n-- 
latency           12151341821410 #    10.2%\n-- bandwidth         16305501788433 #    13.7%\nbackend              8926358411337  #  7.5% ( 7.5%)\n-- cpu               5078008768688  #     4.3%\n-- memory            3848349642649  #     3.2%\nspeculation          9809590154881  #  8.2% ( 8.2%)\n-- branch mispredict 9387525599800  #     7.9%\n-- pipeline restart  422064555081   #     0.4%\nsmt-contention       0              #  0.0% ( 0.0%)\ncpu-cycles           77735101284795 # 2.39 GHz\ninstructions         153645864074988 # 1.98 IPC\nl2 access            1975746574306  # 26.935 l2 access per 1000 inst\nl2 miss              257360554948   # 13.03% l2 miss\n<\/code><\/pre>\n\n\n\n<p>Process overview<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>930 processes\n\t570 uvg266               273549.88  4020.09\n\t 68 clinfo                  16.54     6.32\n\t 38 vulkaninfo               0.76     1.50\n\t  6 php                      0.16     0.23\n\t  6 glxinfo:gdrv0            0.16     0.04\n\t  4 vulkani:disk$0           0.08     0.16\n\t  2 glxinfo                  0.08     0.02\n\t  2 glxinfo:cs0              0.08     0.02\n\t  2 glxinfo:disk$0           0.08     0.02\n\t  2 glxinfo:sh0              0.08     0.02\n\t  2 glxinfo:shlo0            0.08     0.02\n\t  6 clang                    0.06     0.06\n\t  2 llvmpipe-0               0.04     0.08\n\t  2 llvmpipe-1               0.04     0.08\n\t  2 llvmpipe-10              0.04     0.08\n\t  2 llvmpipe-11              0.04     0.08\n\t  2 llvmpipe-12              0.04     0.08\n\t  2 llvmpipe-13              0.04     0.08\n\t  2 llvmpipe-14              0.04     0.08\n\t  2 llvmpipe-15              0.04     0.08\n\t  2 llvmpipe-2               0.04     0.08\n\t  2 llvmpipe-3               0.04     0.08\n\t  2 llvmpipe-4               0.04     0.08\n\t  2 llvmpipe-5               0.04     0.08\n\t  2 llvmpipe-6               0.04     0.08\n\t  2 llvmpipe-7               0.04     0.08\n\t  2 llvmpipe-8               
0.04     0.08\n\t  2 llvmpipe-9               0.04     0.08\n\t  3 rocminfo                 0.03     0.00\n\t  1 lspci                    0.01     0.01\n\t  1 ps                       0.00     0.01\n\t100 sh                       0.00     0.00\n\t 13 gcc                      0.00     0.00\n\t 12 gsettings                0.00     0.00\n\t  8 stat                     0.00     0.00\n\t  8 systemd-detect-          0.00     0.00\n\t  6 llvm-link                0.00     0.00\n\t  5 phoronix-test-s          0.00     0.00\n\t  3 gmain                    0.00     0.00\n\t  2 cc                       0.00     0.00\n\t  2 lscpu                    0.00     0.00\n\t  2 uname                    0.00     0.00\n\t  2 which                    0.00     0.00\n\t  2 xset                     0.00     0.00\n\t  1 date                     0.00     0.00\n\t  1 dconf worker             0.00     0.00\n\t  1 dirname                  0.00     0.00\n\t  1 dmesg                    0.00     0.00\n\t  1 dmidecode                0.00     0.00\n\t  1 grep                     0.00     0.00\n\t  1 ifconfig                 0.00     0.00\n\t  1 ip                       0.00     0.00\n\t  1 lsmod                    0.00     0.00\n\t  1 mktemp                   0.00     0.00\n\t  1 qdbus                    0.00     0.00\n\t  1 readlink                 0.00     0.00\n\t  1 realpath                 0.00     0.00\n\t  1 sed                      0.00     0.00\n\t  1 sort                     0.00     0.00\n\t  1 stty                     0.00     0.00\n\t  1 systemctl                0.00     0.00\n\t  1 template.sh              0.00     0.00\n\t  1 wc                       0.00     0.00\n\t  1 xrandr                   0.00     0.00\n0 processes running\n47 maximum processes\n<\/code><\/pre>\n\n\n\n<p>Core computation structure with one thread per logical CPU.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>      238433) uvg266           cpu=2 start=5.83  finish=111.07\n        238434) uvg266           cpu=8 
start=5.83  finish=111.05\n          238435) uvg266           cpu=5 start=5.83  finish=111.05\n          238436) uvg266           cpu=2 start=5.83  finish=111.05\n          238437) uvg266           cpu=9 start=5.83  finish=111.05\n          238438) uvg266           cpu=1 start=5.83  finish=111.05\n          238439) uvg266           cpu=6 start=5.83  finish=111.05\n          238440) uvg266           cpu=14 start=5.83  finish=111.05\n          238441) uvg266           cpu=13 start=5.83  finish=111.05\n          238442) uvg266           cpu=10 start=5.83  finish=111.05\n          238443) uvg266           cpu=4 start=5.83  finish=111.05\n          238444) uvg266           cpu=8 start=5.83  finish=111.05\n          238445) uvg266           cpu=15 start=5.83  finish=111.05\n          238446) uvg266           cpu=3 start=5.83  finish=111.05\n          238447) uvg266           cpu=11 start=5.83  finish=111.05\n          238448) uvg266           cpu=7 start=5.83  finish=111.05\n          238449) uvg266           cpu=0 start=5.83  finish=111.05\n          238450) uvg266           cpu=12 start=5.83  finish=111.05\n          238451) uvg266           cpu=0 start=5.86  finish=108.68\n<\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>uvg266 is a video encoder for VVC\/H.266 based on Kvazaar. There are 10 workloads. Overall, AMD shows slightly lower IPC than Intel (1.89 vs. 1.98), but most metrics are similar. 
Topdown overview AMD metrics Intel metrics Process overview Core computation structure with one thread per logical CPU.<\/p>\n <a class=\"more-link\" href=\"https:\/\/mvermeulen.org\/perf\/workloads\/phoronix\/uvg266\/\"><span class=\"more-msg\">Continue reading &rarr;<\/span><\/a>","protected":false},"author":1,"featured_media":0,"parent":58,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-525","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/525","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/comments?post=525"}],"version-history":[{"count":2,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/525\/revisions"}],"predecessor-version":[{"id":553,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/525\/revisions\/553"}],"up":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/58"}],"wp:attachment":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/media?parent=525"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}