{"id":690,"date":"2024-01-19T10:55:47","date_gmt":"2024-01-19T10:55:47","guid":{"rendered":"https:\/\/mvermeulen.org\/perf\/?page_id=690"},"modified":"2024-01-19T10:55:48","modified_gmt":"2024-01-19T10:55:48","slug":"openssl","status":"publish","type":"page","link":"https:\/\/mvermeulen.org\/perf\/workloads\/phoronix\/openssl\/","title":{"rendered":"openssl"},"content":{"rendered":"\n<p>Parallel testing of the openssl library with seven different encryption methods. These tests fully use the processor.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/systemtime-36.png\" alt=\"\" class=\"wp-image-691\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/systemtime-36.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/systemtime-36-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/systemtime-36-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>Topdown shows variation depending on the test, with some having high retirement rates. Not much in the way of backend stalls.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-74.png\" alt=\"\" class=\"wp-image-693\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-74.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-74-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-74-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>AMD metrics emphasize running on all cores. 
Some floating point code. Not much L2 access at all and little speculation penalty.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed              3546.361\non_cpu               0.964          # 15.43 \/ 16 cores\nutime                54705.828\nstime                6.286\nnvcsw                2786           # 0.61%\nnivcsw               454724         # 99.39%\ninblock              0              # 0.00\/sec\nonblock              38344          # 10.81\/sec\ncpu-clock            54712772033085 # 54712.772 seconds\ntask-clock           54712944501664 # 54712.945 seconds\npage faults          237163         # 4.335\/sec\ncontext switches     474670         # 8.676\/sec\ncpu migrations       288            # 0.005\/sec\nmajor page faults    6              # 0.000\/sec\nminor page faults    237157         # 4.335\/sec\nalignment faults     0              # 0.000\/sec\nemulation faults     0              # 0.000\/sec\nbranches             28141573963092 # 84.214 branches per 1000 inst\nbranch misses        4040824108     # 0.01% branch miss\nconditional          20015608782402 # 59.897 conditional branches per 1000 inst\nindirect             1374390913138  # 4.113 indirect branches per 1000 inst\ncpu-cycles           224836154819168 # 3.96 GHz\ninstructions         333918782253330 # 1.49 IPC\nslots                449645952363360 #\nretiring             120885862595350 # 26.9% (42.2%)\n-- ucode             2836478130363  #     0.6%\n-- fastpath          118049384464987 #    26.3%\nfrontend             17843197259553 #  4.0% ( 6.2%)\n-- latency           3325064484588  #     0.7%\n-- bandwidth         14518132774965 #     3.2%\nbackend              147725599817768 # 32.9% (51.6%)\n-- cpu               130249734748986 #    29.0%\n-- memory            17475865068782 #     3.9%\nspeculation          15053448577    #  0.0% ( 0.0%)\n-- branch mispredict 14694571098    #     0.0%\n-- pipeline restart  358877479      #     0.0%\nsmt-contention       163175458826959 # 36.3% 
( 0.0%)\ncpu-cycles           224755130568245 # 3.96 GHz\ninstructions         333767051670546 # 1.49 IPC\ninstructions         111255059194864 # 1.475 l2 access per 1000 inst\nl2 hit from l1       144816833386   # 0.08% l2 miss\nl2 miss from l1      93791436       #\nl2 hit from l2 pf    19193166382    #\nl3 hit from l2 pf    37068899       #\nl3 miss from l2 pf   6992874        #\ninstructions         111228238560745 # 148.491 float per 1000 inst\nfloat 512            85             # 0.000 AVX-512 per 1000 inst\nfloat 256            229775066001   # 2.066 AVX-256 per 1000 inst\nfloat 128            16286635504331 # 146.425 AVX-128 per 1000 inst\nfloat MMX            0              # 0.000 MMX per 1000 inst\nfloat scalar         5              # 0.000 scalar per 1000 inst\n<\/code><\/pre>\n\n\n\n<p>Intel metrics<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed              3546.473\non_cpu               0.964          # 15.42 \/ 16 cores\nutime                54698.894\nstime                1.537\nnvcsw                2754           # 0.71%\nnivcsw               386803         # 99.29%\ninblock              1624           # 0.46\/sec\nonblock              26832          # 7.57\/sec\ncpu-clock            54700573721291 # 54700.574 seconds\ntask-clock           54700650886402 # 54700.651 seconds\npage faults          221464         # 4.049\/sec\ncontext switches     406713         # 7.435\/sec\ncpu migrations       370            # 0.007\/sec\nmajor page faults    11             # 0.000\/sec\nminor page faults    221453         # 4.048\/sec\nalignment faults     0              # 0.000\/sec\nemulation faults     0              # 0.000\/sec\nbranches             27345414812497 # 75.259 branches per 1000 inst\nbranch misses        9738137632     # 0.04% branch miss\nconditional          27345414836753 # 75.259 conditional branches per 1000 inst\nindirect             9376516889196  # 25.806 indirect branches per 1000 inst\nslots                252047617073012 
#\nretiring             193522371846759 # 76.8% (76.8%)\n-- ucode             10580939452566 #     4.2%\n-- fastpath          182941432394193 #    72.6%\nfrontend             44035380227094 # 17.5% (17.5%)\n-- latency           25740142883266 #    10.2%\n-- bandwidth         18295237343828 #     7.3%\nbackend              12144723001961 #  4.8% ( 4.8%)\n-- cpu               11181100551912 #     4.4%\n-- memory            963622450049   #     0.4%\nspeculation          141736916921   #  0.1% ( 0.1%)\n-- branch mispredict 122323887918   #     0.0%\n-- pipeline restart  19413029003    #     0.0%\nsmt-contention       0              #  0.0% ( 0.0%)\ncpu-cycles           85364189856430 # 1.50 GHz\ninstructions         175744735277161 # 2.06 IPC\nl2 access            9473367665     # 0.055 l2 access per 1000 inst\nl2 miss              251739645      # 2.66% l2 miss\n<\/code><\/pre>\n\n\n\n<p>Process overview shows time all spent in openssl using the internal benchmark mechanism<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>732 processes\n\t378 openssl              54697.53     1.02\n\t 68 clinfo                  15.88     6.98\n\t 38 vulkaninfo               0.95     1.33\n\t  6 php                      0.14     0.24\n\t  6 glxinfo:gdrv0            0.12     0.09\n\t  4 vulkani:disk$0           0.10     0.14\n\t  2 glxinfo                  0.06     0.04\n\t  2 glxinfo:cs0              0.06     0.04\n\t  2 glxinfo:disk$0           0.06     0.04\n\t  2 glxinfo:sh0              0.06     0.04\n\t  2 glxinfo:shlo0            0.06     0.04\n\t  2 llvmpipe-0               0.05     0.07\n\t  2 llvmpipe-1               0.05     0.07\n\t  2 llvmpipe-10              0.05     0.07\n\t  2 llvmpipe-11              0.05     0.07\n\t  2 llvmpipe-12              0.05     0.07\n\t  2 llvmpipe-13              0.05     0.07\n\t  2 llvmpipe-14              0.05     0.07\n\t  2 llvmpipe-15              0.05     0.07\n\t  2 llvmpipe-2               0.05     0.07\n\t  2 llvmpipe-3               
0.05     0.07\n\t  2 llvmpipe-4               0.05     0.07\n\t  2 llvmpipe-5               0.05     0.07\n\t  2 llvmpipe-6               0.05     0.07\n\t  2 llvmpipe-7               0.05     0.07\n\t  2 llvmpipe-8               0.05     0.07\n\t  2 llvmpipe-9               0.05     0.07\n\t  6 clang                    0.04     0.08\n\t  1 lspci                    0.00     0.02\n\t 94 sh                       0.00     0.00\n\t 13 gcc                      0.00     0.00\n\t 11 gsettings                0.00     0.00\n\t  8 stat                     0.00     0.00\n\t  8 systemd-detect-          0.00     0.00\n\t  6 llvm-link                0.00     0.00\n\t  5 phoronix-test-s          0.00     0.00\n\t  4 gmain                    0.00     0.00\n\t  3 rocminfo                 0.00     0.00\n\t  2 cc                       0.00     0.00\n\t  2 lscpu                    0.00     0.00\n\t  2 uname                    0.00     0.00\n\t  2 which                    0.00     0.00\n\t  2 xset                     0.00     0.00\n\t  1 date                     0.00     0.00\n\t  1 dconf worker             0.00     0.00\n\t  1 dirname                  0.00     0.00\n\t  1 dmesg                    0.00     0.00\n\t  1 dmidecode                0.00     0.00\n\t  1 grep                     0.00     0.00\n\t  1 ifconfig                 0.00     0.00\n\t  1 ip                       0.00     0.00\n\t  1 lsmod                    0.00     0.00\n\t  1 mktemp                   0.00     0.00\n\t  1 ps                       0.00     0.00\n\t  1 qdbus                    0.00     0.00\n\t  1 readlink                 0.00     0.00\n\t  1 realpath                 0.00     0.00\n\t  1 sed                      0.00     0.00\n\t  1 sort                     0.00     0.00\n\t  1 stty                     0.00     0.00\n\t  1 systemctl                0.00     0.00\n\t  1 template.sh              0.00     0.00\n\t  1 wc                       0.00     0.00\n\t  1 xrandr                   0.00     0.00\n0 
processes running\n47 maximum processes\n<\/code><\/pre>\n\n\n\n<p>Straightforward computation structures<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>      2555597) openssl          cpu=7 start=5.94  finish=185.95\n        2555598) openssl          cpu=10 start=5.94  finish=185.95\n          2555599) openssl          cpu=10 start=5.95  finish=185.95\n          2555600) openssl          cpu=4 start=5.95  finish=185.95\n          2555601) openssl          cpu=6 start=5.95  finish=185.95\n          2555602) openssl          cpu=3 start=5.95  finish=185.95\n          2555603) openssl          cpu=1 start=5.95  finish=185.95\n          2555604) openssl          cpu=0 start=5.95  finish=185.95\n          2555605) openssl          cpu=7 start=5.95  finish=185.95\n          2555606) openssl          cpu=13 start=5.95  finish=185.95\n          2555607) openssl          cpu=9 start=5.95  finish=185.95\n          2555608) openssl          cpu=12 start=5.95  finish=185.95\n          2555609) openssl          cpu=14 start=5.95  finish=185.95\n          2555610) openssl          cpu=11 start=5.95  finish=185.95\n          2555611) openssl          cpu=8 start=5.95  finish=185.95\n          2555612) openssl          cpu=2 start=5.95  finish=185.95\n          2555613) openssl          cpu=5 start=5.95  finish=185.95\n          2555614) openssl          cpu=15 start=5.95  finish=185.95\n<\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>Parallel testing of the openssl library with seven different encryption methods. These are parallel tests that fully use the processor. Topdown shows a variation depending on the test with some having high retirement rates. 
Not much in the way of backend <span class=\"excerpt-dots\">&hellip;<\/span> <a class=\"more-link\" href=\"https:\/\/mvermeulen.org\/perf\/workloads\/phoronix\/openssl\/\"><span class=\"more-msg\">Continue reading &rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"parent":58,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-690","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/690","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/comments?post=690"}],"version-history":[{"count":1,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/690\/revisions"}],"predecessor-version":[{"id":694,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/690\/revisions\/694"}],"up":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/58"}],"wp:attachment":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/media?parent=690"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}