{"id":1021,"date":"2024-01-28T21:24:10","date_gmt":"2024-01-28T21:24:10","guid":{"rendered":"https:\/\/mvermeulen.org\/perf\/?page_id=1021"},"modified":"2024-01-29T10:29:18","modified_gmt":"2024-01-29T10:29:18","slug":"gnupg","status":"publish","type":"page","link":"https:\/\/mvermeulen.org\/perf\/workloads\/phoronix\/gnupg\/","title":{"rendered":"gnupg"},"content":{"rendered":"\n<p>Test to encrypt a 2.7GB file with GnuPG. Looks like a single-threaded program that runs in about a minute.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/systemtime-79.png\" alt=\"\" class=\"wp-image-1054\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/systemtime-79.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/systemtime-79-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/systemtime-79-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>Topdown profile shows speculation stalls as particularly high and backend stalls as low.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-116.png\" alt=\"\" class=\"wp-image-1056\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-116.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-116-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-116-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>AMD metrics confirm a single-threaded program. Little floating point and little L2 access. 
Not many indirect branches, but branch misprediction is still very high. Frontend stalls are more latency than bandwidth.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed              169.011\non_cpu               0.057          # 0.92 \/ 16 cores\nutime                151.248\nstime                3.799\nnvcsw                2031           # 72.85%\nnivcsw               757            # 27.15%\ninblock              0              # 0.00\/sec\nonblock              4207744        # 24896.30\/sec\ncpu-clock            155083495851   # 155.083 seconds\ntask-clock           155087202998   # 155.087 seconds\npage faults          148931         # 960.305\/sec\ncontext switches     3455           # 22.278\/sec\ncpu migrations       256            # 1.651\/sec\nmajor page faults    2              # 0.013\/sec\nminor page faults    148929         # 960.292\/sec\nalignment faults     0              # 0.000\/sec\nemulation faults     0              # 0.000\/sec\nbranches             204801389499   # 146.104 branches per 1000 inst\nbranch misses        14376986247    # 7.02% branch miss\nconditional          191213671302   # 136.411 conditional branches per 1000 inst\nindirect             142014949      # 0.101 indirect branches per 1000 inst\ncpu-cycles           715665699465   # 0.27 GHz\ninstructions         1400096299132  # 1.96 IPC\nslots                1434267036432  #\nretiring             445645996834   # 31.1% (31.1%)\n-- ucode             47579507       #     0.0%\n-- fastpath          445598417327   #    31.1%\nfrontend             473772218359   # 33.0% (33.0%)\n-- latency           248750339964   #    17.3%\n-- bandwidth         225021878395   #    15.7%\nbackend              214028036606   # 14.9% (14.9%) low\n-- cpu               63033651425    #     4.4%\n-- memory            150994385181   #    10.5%\nspeculation          300713825870   # 21.0% (21.0%) high\n-- branch mispredict 300208618021   #    20.9%\n-- pipeline restart  505207849      #     
0.0%\nsmt-contention       106654498      #  0.0% ( 0.0%)\ncpu-cycles           718195890990   # 0.27 GHz\ninstructions         1400152976411  # 1.95 IPC\ninstructions         467215406957   # 17.124 l2 access per 1000 inst\nl2 hit from l1       4745977338     # 1.02% l2 miss\nl2 miss from l1      31050704       #\nl2 hit from l2 pf    3203773986     #\nl3 hit from l2 pf    17341357       #\nl3 miss from l2 pf   33443160       #\ninstructions         467198274463   # 23.010 float per 1000 inst\nfloat 512            43             # 0.000 AVX-512 per 1000 inst\nfloat 256            654            # 0.000 AVX-256 per 1000 inst\nfloat 128            10750115468    # 23.010 AVX-128 per 1000 inst\nfloat MMX            0              # 0.000 MMX per 1000 inst\nfloat scalar         0              # 0.000 scalar per 1000 inst\n<\/code><\/pre>\n\n\n\n<p>Intel metrics<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed              205.884\non_cpu               0.058          # 0.93 \/ 16 cores\nutime                189.745\nstime                2.491\nnvcsw                1952           # 70.17%\nnivcsw               830            # 29.83%\ninblock              0              # 0.00\/sec\nonblock              4196488        # 20382.77\/sec\ncpu-clock            192257964300   # 192.258 seconds\ntask-clock           192261256286   # 192.261 seconds\npage faults          138112         # 718.356\/sec\ncontext switches     3632           # 18.891\/sec\ncpu migrations       281            # 1.462\/sec\nmajor page faults    0              # 0.000\/sec\nminor page faults    138112         # 718.356\/sec\nalignment faults     0              # 0.000\/sec\nemulation faults     0              # 0.000\/sec\nbranches             203656765790   # 145.850 branches per 1000 inst\nbranch misses        14459480379    # 7.10% branch miss\nconditional          203656778046   # 145.850 conditional branches per 1000 inst\nindirect             146325163      # 0.105 indirect branches per 
1000 inst\nslots                4363710725378  #\nretiring             1314816821795  # 30.1% (30.1%)\n-- ucode             86447683819    #     2.0%\n-- fastpath          1228369137976  #    28.1%\nfrontend             819185548728   # 18.8% (18.8%)\n-- latency           377630877144   #     8.7%\n-- bandwidth         441554671584   #    10.1%\nbackend              510111406692   # 11.7% (11.7%) low\n-- cpu               416279869681   #     9.5%\n-- memory            93831537011    #     2.2%\nspeculation          1725289697089  # 39.5% (39.5%) high\n-- branch mispredict 1713750484505  #    39.3%\n-- pipeline restart  11539212584    #     0.3%\nsmt-contention       0              #  0.0% ( 0.0%)\ncpu-cycles           727399626683   # 0.22 GHz\ninstructions         1396330348949  # 1.92 IPC\nl2 access            11452117161    # 8.202 l2 access per 1000 inst\nl2 miss              432509657      # 3.78% l2 miss\n<\/code><\/pre>\n\n\n\n<p>Process overview shows the gpg process and rest is test system overhead.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>363 processes\n\t  3 gpg                    149.76     1.32\n\t 68 clinfo                  17.20     5.72\n\t 38 vulkaninfo               0.95     1.33\n\t  6 glxinfo:gdrv0            0.15     0.04\n\t  6 glxinfo:gl0              0.15     0.04\n\t  4 vulkani:disk$0           0.10     0.14\n\t  6 clang                    0.08     0.04\n\t  2 glxinfo                  0.07     0.02\n\t  2 glxinfo:cs0              0.07     0.02\n\t  2 glxinfo:disk$0           0.07     0.02\n\t  2 glxinfo:sh0              0.07     0.02\n\t  2 glxinfo:shlo0            0.07     0.02\n\t  6 php                      0.05     0.08\n\t  2 llvmpipe-0               0.05     0.07\n\t  2 llvmpipe-1               0.05     0.07\n\t  2 llvmpipe-10              0.05     0.07\n\t  2 llvmpipe-11              0.05     0.07\n\t  2 llvmpipe-12              0.05     0.07\n\t  2 llvmpipe-13              0.05     0.07\n\t  2 llvmpipe-14              0.05    
 0.07\n\t  2 llvmpipe-15              0.05     0.07\n\t  2 llvmpipe-2               0.05     0.07\n\t  2 llvmpipe-3               0.05     0.07\n\t  2 llvmpipe-4               0.05     0.07\n\t  2 llvmpipe-5               0.05     0.07\n\t  2 llvmpipe-6               0.05     0.07\n\t  2 llvmpipe-7               0.05     0.07\n\t  2 llvmpipe-8               0.05     0.07\n\t  2 llvmpipe-9               0.05     0.07\n\t  3 rocminfo                 0.03     0.00\n\t  1 dd                       0.00     1.72\n\t  1 rm                       0.00     0.25\n\t  1 lspci                    0.00     0.02\n\t 84 sh                       0.00     0.00\n\t 13 gcc                      0.00     0.00\n\t  9 gsettings                0.00     0.00\n\t  8 stat                     0.00     0.00\n\t  8 systemd-detect-          0.00     0.00\n\t  6 gnupg                    0.00     0.00\n\t  6 llvm-link                0.00     0.00\n\t  5 phoronix-test-s          0.00     0.00\n\t  4 gmain                    0.00     0.00\n\t  3 dconf worker             0.00     0.00\n\t  2 bash                     0.00     0.00\n\t  2 cc                       0.00     0.00\n\t  2 lscpu                    0.00     0.00\n\t  2 uname                    0.00     0.00\n\t  2 which                    0.00     0.00\n\t  2 xset                     0.00     0.00\n\t  1 date                     0.00     0.00\n\t  1 dirname                  0.00     0.00\n\t  1 dmesg                    0.00     0.00\n\t  1 dmidecode                0.00     0.00\n\t  1 grep                     0.00     0.00\n\t  1 ifconfig                 0.00     0.00\n\t  1 ip                       0.00     0.00\n\t  1 lsmod                    0.00     0.00\n\t  1 mktemp                   0.00     0.00\n\t  1 ps                       0.00     0.00\n\t  1 qdbus                    0.00     0.00\n\t  1 readlink                 0.00     0.00\n\t  1 realpath                 0.00     0.00\n\t  1 sed                      0.00     0.00\n\t  1 sort     
                0.00     0.00\n\t  1 stty                     0.00     0.00\n\t  1 systemctl                0.00     0.00\n\t  1 template.sh              0.00     0.00\n\t  1 wc                       0.00     0.00\n\t  1 xrandr                   0.00     0.00\n0 processes running\n47 maximum processes\n<\/code><\/pre>\n\n\n\n<p>The core computation sections<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>      1173365) gnupg            cpu=4 start=7.23  finish=57.56\n        1173366) gnupg            cpu=5 start=7.24  finish=7.24 \n        1173367) gpg              cpu=14 start=7.24  finish=57.56\n      1173372) gnupg            cpu=13 start=61.57 finish=111.85\n        1173373) gnupg            cpu=14 start=61.57 finish=61.57\n        1173374) gpg              cpu=8 start=61.57 finish=111.85\n      1173376) gnupg            cpu=4 start=115.86 finish=166.37\n        1173377) gnupg            cpu=5 start=115.86 finish=115.86\n        1173378) gpg              cpu=14 start=115.86 finish=166.37\n<\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>Test to encrypt a 2.7GB file with GnuPG. Looks like a single-threaded program that runs in about a minute. Topdown profile shows speculation stalls as particularly high and backend stalls as low. AMD metrics confirm a single-threaded program. 
Little floating <span class=\"excerpt-dots\">&hellip;<\/span> <a class=\"more-link\" href=\"https:\/\/mvermeulen.org\/perf\/workloads\/phoronix\/gnupg\/\"><span class=\"more-msg\">Continue reading &rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"parent":58,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-1021","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/1021","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/comments?post=1021"}],"version-history":[{"count":2,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/1021\/revisions"}],"predecessor-version":[{"id":1057,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/1021\/revisions\/1057"}],"up":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/58"}],"wp:attachment":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/media?parent=1021"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}