{"id":308,"date":"2024-01-07T01:58:18","date_gmt":"2024-01-07T01:58:18","guid":{"rendered":"https:\/\/mvermeulen.org\/perf\/?page_id=308"},"modified":"2024-01-07T14:41:51","modified_gmt":"2024-01-07T14:41:51","slug":"compress-rar","status":"publish","type":"page","link":"https:\/\/mvermeulen.org\/perf\/workloads\/phoronix\/compress-rar\/","title":{"rendered":"compress-rar"},"content":{"rendered":"\n<p>Test of compressing and decompressing the Linux kernel (though a different one than the compress-gzip benchmark uses). Interestingly, none of the tests of different compression tools use the same metrics and workload, so it is not easy to compare between tools.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-25.png\" alt=\"\" class=\"wp-image-324\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-25.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-25-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-25-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>AMD metrics show a lower on_cpu (3.63 of 16 cores), but the workload is not single-threaded, and there is also a lot of output. 
Not as much branch misprediction as some other codes.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed              159.883\non_cpu               0.227          # 3.63 \/ 16 cores\nutime                373.561\nstime                206.226\nnvcsw                24109219       # 99.98%\nnivcsw               4882           # 0.02%\ninblock              344            # 2.15\/sec\nonblock              9913648        # 62005.75\/sec\ncpu-clock            573029063163   # 573.029 seconds\ntask-clock           579433116869   # 579.433 seconds\npage faults          471162         # 813.143\/sec\ncontext switches     24114697       # 41617.740\/sec\ncpu migrations       2346204        # 4049.137\/sec\nmajor page faults    4              # 0.007\/sec\nminor page faults    471158         # 813.136\/sec\nalignment faults     0              # 0.000\/sec\nemulation faults     0              # 0.000\/sec\nbranches             381485532457   # 148.151 branches per 1000 inst\nbranch misses        21732499954    # 5.70% branch miss\nconditional          331455035535   # 128.721 conditional branches per 1000 inst\nindirect             2131195176     # 0.828 indirect branches per 1000 inst\ncpu-cycles           2247254350506  # 0.92 GHz\ninstructions         2657644056246  # 1.18 IPC\nslots                4337089786164  #\nretiring             856287818753   # 19.7% (22.6%)\n-- ucode             1063178575     #     0.0%\n-- fastpath          855224640178   #    19.7%\nfrontend             1416917998202  # 32.7% (37.4%)\n-- latency           990546038496   #    22.8%\n-- bandwidth         426371959706   #     9.8%\nbackend              1317194261591  # 30.4% (34.7%)\n-- cpu               380456352651   #     8.8%\n-- memory            936737908940   #    21.6%\nspeculation          201494052691   #  4.6% ( 5.3%)\n-- branch mispredict 200227001429   #     4.6%\n-- pipeline restart  1267051262     #     0.0%\nsmt-contention       543796763011   # 12.5% ( 0.0%)\ncpu-cycles         
  2245077480593  # 0.92 GHz\ninstructions         2667029914647  # 1.19 IPC\ninstructions         872960594992   # 41.360 l2 access per 1000 inst\nl2 hit from l1       27079123490    # 31.89% l2 miss\nl2 miss from l1      5990309331     #\nl2 hit from l2 pf    3502066051     #\nl3 hit from l2 pf    4885619672     #\nl3 miss from l2 pf   638696475      #\ninstructions         870165842389   # 20.284 float per 1000 inst\nfloat 512            77             # 0.000 AVX-512 per 1000 inst\nfloat 256            1157816        # 0.001 AVX-256 per 1000 inst\nfloat 128            17649607359    # 20.283 AVX-128 per 1000 inst\nfloat MMX            0              # 0.000 MMX per 1000 inst\nfloat scalar         0              # 0.000 scalar per 1000 inst<\/code><\/pre>\n\n\n\n<p>Intel metrics<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed              239.855\non_cpu               0.218          # 3.49 \/ 16 cores\nutime                628.377\nstime                209.002\nnvcsw                25437007       # 92.02%\nnivcsw               2205558        # 7.98%\ninblock              237344         # 989.53\/sec\nonblock              9913984        # 41333.30\/sec\ncpu-clock            833996774060   # 833.997 seconds\ntask-clock           840839881199   # 840.840 seconds\npage faults          454067         # 540.016\/sec\ncontext switches     27643630       # 32876.212\/sec\ncpu migrations       7769626        # 9240.316\/sec\nmajor page faults    16             # 0.019\/sec\nminor page faults    454051         # 539.997\/sec\nalignment faults     0              # 0.000\/sec\nemulation faults     0              # 0.000\/sec\nbranches             378883981060   # 144.229 branches per 1000 inst\nbranch misses        14411618351    # 3.80% branch miss\nconditional          378883995012   # 144.229 conditional branches per 1000 inst\nindirect             107206296207   # 40.810 indirect branches per 1000 inst\nslots                4884567876338  #\nretiring             
1388281256735  # 28.4% (28.4%)\n-- ucode             80372149297    #     1.6%\n-- fastpath          1307909107438  #    26.8%\nfrontend             864325651934   # 17.7% (17.7%)\n-- latency           475762760331   #     9.7%\n-- bandwidth         388562891603   #     8.0%\nbackend              1861932065696  # 38.1% (38.1%)\n-- cpu               477788296605   #     9.8%\n-- memory            1384143769091  #    28.3%\nspeculation          880364205059   # 18.0% (18.0%)\n-- branch mispredict 866195517000   #    17.7%\n-- pipeline restart  14168688059    #     0.3%\nsmt-contention       0              #  0.0% ( 0.0%)\ncpu-cycles           2069940879352  # 0.58 GHz\ninstructions         3033820737561  # 1.47 IPC\nl2 access            58335330439    # 37.463 l2 access per 1000 inst\nl2 miss              24403745862    # 41.83% l2 miss<\/code><\/pre>\n\n\n\n<p>Process overview is straightforward<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>420 processes\n\t 51 rar                   6839.31  1699.16\n\t 64 clinfo                  12.16     2.56\n\t  1 xz                       4.45     0.27\n\t 38 vulkaninfo               0.93     0.95\n\t  6 php                      0.51     4.54\n\t  2 cp                       0.32     5.24\n\t  1 tar                      0.12     2.32\n\t  6 glxinfo:gdrv0            0.11     0.06\n\t  4 vulkani:disk$0           0.09     0.10\n\t  7 rm                       0.07     3.77\n\t  2 glxinfo                  0.06     0.02\n\t  2 glxinfo:cs0              0.06     0.02\n\t  2 glxinfo:disk$0           0.06     0.02\n\t  2 glxinfo:sh0              0.06     0.02\n\t  2 glxinfo:shlo0            0.06     0.02\n\t  2 llvmpipe-0               0.05     0.05\n\t  2 llvmpipe-1               0.05     0.05\n\t  2 llvmpipe-10              0.05     0.05\n\t  2 llvmpipe-11              0.05     0.05\n\t  2 llvmpipe-12              0.05     0.05\n\t  2 llvmpipe-13              0.05     0.05\n\t  2 llvmpipe-14              0.05     0.05\n\t  2 llvmpipe-15 
             0.05     0.05\n\t  2 llvmpipe-2               0.05     0.05\n\t  2 llvmpipe-3               0.05     0.05\n\t  2 llvmpipe-4               0.05     0.05\n\t  2 llvmpipe-5               0.05     0.05\n\t  2 llvmpipe-6               0.05     0.05\n\t  2 llvmpipe-7               0.05     0.05\n\t  2 llvmpipe-8               0.05     0.05\n\t  2 llvmpipe-9               0.05     0.05\n\t  6 clang                    0.03     0.04\n\t  1 lspci                    0.00     0.03\n\t 92 sh                       0.00     0.00\n\t 12 gcc                      0.00     0.00\n\t  9 stty                     0.00     0.00\n\t  8 gsettings                0.00     0.00\n\t  8 stat                     0.00     0.00\n\t  8 systemd-detect-          0.00     0.00\n\t  6 llvm-link                0.00     0.00\n\t  5 gmain                    0.00     0.00\n\t  5 phoronix-test-s          0.00     0.00\n\t  4 bash                     0.00     0.00\n\t  3 compress-rar             0.00     0.00\n\t  3 dconf worker             0.00     0.00\n\t  2 lscpu                    0.00     0.00\n\t  2 uname                    0.00     0.00\n\t  2 which                    0.00     0.00\n\t  2 xset                     0.00     0.00\n\t  1 cc                       0.00     0.00\n\t  1 date                     0.00     0.00\n\t  1 dirname                  0.00     0.00\n\t  1 dmesg                    0.00     0.00\n\t  1 dmidecode                0.00     0.00\n\t  1 grep                     0.00     0.00\n\t  1 ifconfig                 0.00     0.00\n\t  1 ip                       0.00     0.00\n\t  1 lsmod                    0.00     0.00\n\t  1 mktemp                   0.00     0.00\n\t  1 ps                       0.00     0.00\n\t  1 qdbus                    0.00     0.00\n\t  1 readlink                 0.00     0.00\n\t  1 realpath                 0.00     0.00\n\t  1 sed                      0.00     0.00\n\t  1 sort                     0.00     0.00\n\t  1 systemctl                0.00     
0.00\n\t  1 template.sh              0.00     0.00\n\t  1 wc                       0.00     0.00\n\t  1 xrandr                   0.00     0.00\n0 processes running\n47 maximum processes\n<\/code><\/pre>\n\n\n\n<p>The core computation blocks show that one thread is started per core, so the relatively low on_cpu likely indicates waiting, e.g. for disk.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>      34369) compress-rar start=17.02 finish=58.27\n        34370) rar start=17.02 finish=58.27\n          34371) rar start=18.90 finish=58.19\n          34372) rar start=18.90 finish=58.19\n          34373) rar start=18.90 finish=58.19\n          34374) rar start=18.90 finish=58.19\n          34375) rar start=18.90 finish=58.19\n          34376) rar start=18.90 finish=58.19\n          34377) rar start=18.90 finish=58.19\n          34378) rar start=18.90 finish=58.19\n          34379) rar start=18.90 finish=58.19\n          34380) rar start=18.90 finish=58.19\n          34381) rar start=18.90 finish=58.19\n          34382) rar start=18.90 finish=58.19\n          34383) rar start=18.90 finish=58.19\n          34384) rar start=18.90 finish=58.19\n          34385) rar start=18.90 finish=58.19\n          34386) rar start=18.90 finish=58.19<\/code><\/pre>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Test of compressing and decompressing the Linux kernel (though a different one than the compress-gzip benchmark uses). Interestingly, none of the tests of different compression tools use the same metrics and workload, so it is not easy to compare between tools. 
AMD metrics show <span class=\"excerpt-dots\">&hellip;<\/span> <a class=\"more-link\" href=\"https:\/\/mvermeulen.org\/perf\/workloads\/phoronix\/compress-rar\/\"><span class=\"more-msg\">Continue reading &rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"parent":58,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-308","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/308","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/comments?post=308"}],"version-history":[{"count":2,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/308\/revisions"}],"predecessor-version":[{"id":337,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/308\/revisions\/337"}],"up":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/58"}],"wp:attachment":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/media?parent=308"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}