{"id":1887,"date":"2024-03-01T12:37:54","date_gmt":"2024-03-01T12:37:54","guid":{"rendered":"https:\/\/mvermeulen.org\/perf\/?page_id=1887"},"modified":"2024-03-02T01:09:37","modified_gmt":"2024-03-02T01:09:37","slug":"gcrypt","status":"publish","type":"page","link":"https:\/\/mvermeulen.org\/perf\/workloads\/phoronix\/gcrypt\/","title":{"rendered":"gcrypt"},"content":{"rendered":"\n<p>Testing libgcrypt with the integrated benchmark. Looks to be single-threaded.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/03\/systemtime-6.png\" alt=\"\" class=\"wp-image-1894\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/03\/systemtime-6.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/03\/systemtime-6-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/03\/systemtime-6-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>Topdown profile shows some blurring, probably from different crypt subtests.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/03\/amdtopdown-6.png\" alt=\"\" class=\"wp-image-1896\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/03\/amdtopdown-6.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/03\/amdtopdown-6-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/03\/amdtopdown-6-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>AMD metrics show little floating point, very few frontend stalls and very little L2 access.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed  
            531.902\non_cpu               0.061          # 0.97 \/ 16 cores\nutime                516.828\nstime                0.740\nnvcsw                2043           # 43.96%\nnivcsw               2604           # 56.04%\ninblock              64             # 0.12\/sec\nonblock              13832          # 26.00\/sec\ncpu-clock            517639332237   # 517.639 seconds\ntask-clock           517645091350   # 517.645 seconds\npage faults          147616         # 285.168\/sec\ncontext switches     7137           # 13.787\/sec\ncpu migrations       331            # 0.639\/sec\nmajor page faults    3              # 0.006\/sec\nminor page faults    147613         # 285.163\/sec\nalignment faults     0              # 0.000\/sec\nemulation faults     0              # 0.000\/sec\nbranches             380131956406   # 60.983 branches per 1000 inst\nbranch misses        431943792      # 0.11% branch miss\nconditional          262048446662   # 42.039 conditional branches per 1000 inst\nindirect             26285539920    # 4.217 indirect branches per 1000 inst\ncpu-cycles           2417266512530  # 0.28 GHz\ninstructions         6219334247412  # 2.57 IPC\nslots                4842841465530  #\nretiring             2252950350168  # 46.5% (46.5%)\n-- ucode             3132389809     #     0.1%\n-- fastpath          2249817960359  #    46.5%\nfrontend             135360904140   #  2.8% ( 2.8%) low\n-- latency           41369303016    #     0.9%\n-- bandwidth         93991601124    #     1.9%\nbackend              2433230177150  # 50.2% (50.2%)\n-- cpu               472885962398   #     9.8%\n-- memory            1960344214752  #    40.5%\nspeculation          20862241606    #  0.4% ( 0.4%) low\n-- branch mispredict 16213481345    #     0.3%\n-- pipeline restart  4648760261     #     0.1%\nsmt-contention       437317253      #  0.0% ( 0.0%)\ncpu-cycles           2413505785886  # 0.28 GHz\ninstructions         6212281735247  # 2.57 IPC\ninstructions         2071438044827  
# 0.094 l2 access per 1000 inst\nl2 hit from l1       178758560      # 12.16% l2 miss\nl2 miss from l1      14677271       #\nl2 hit from l2 pf    7254392        #\nl3 hit from l2 pf    4271893        #\nl3 miss from l2 pf   4767842        #\ninstructions         2070106154657  # 22.429 float per 1000 inst\nfloat 512            60             # 0.000 AVX-512 per 1000 inst\nfloat 256            620            # 0.000 AVX-256 per 1000 inst\nfloat 128            46429470920    # 22.429 AVX-128 per 1000 inst\nfloat MMX            0              # 0.000 MMX per 1000 inst\nfloat scalar         0              # 0.000 scalar per 1000 inst\ninstructions         6226129956548  #\nopcache              601881526493   # 96.670 opcache per 1000 inst\nopcache miss         7859374446     #  1.3% opcache miss rate\nl1 dTLB miss         28467427       # 0.005 L1 dTLB per 1000 inst\nl2 dTLB miss         4893619        # 0.001 L2 dTLB per 1000 inst\ninstructions         6234336458913  #\nicache               13440378870    # 2.156 icache per 1000 inst\nicache miss          654545777      #  4.9% icache miss rate\nl1 iTLB miss         7928289        # 0.001 L1 iTLB per 1000 inst\nl2 iTLB miss         0              # 0.000 L2 iTLB per 1000 inst\ntlb flush            16577          # 0.000 TLB flush per 1000 inst\n<\/code><\/pre>\n\n\n\n<p>Intel metrics show memory accesses are all L1. 
Interesting to see relative amounts of memory-bound vs cpu-bound flipped between AMD and Intel.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed              635.257\non_cpu               0.061          # 0.98 \/ 16 cores\nutime                619.955\nstime                0.494\nnvcsw                2947           # 49.68%\nnivcsw               2985           # 50.32%\ninblock              225048         # 354.26\/sec\nonblock              2688           # 4.23\/sec\ncpu-clock            620494490675   # 620.494 seconds\ntask-clock           620499746686   # 620.500 seconds\npage faults          138394         # 223.036\/sec\ncontext switches     8940           # 14.408\/sec\ncpu migrations       472            # 0.761\/sec\nmajor page faults    1148           # 1.850\/sec\nminor page faults    137246         # 221.186\/sec\nalignment faults     0              # 0.000\/sec\nemulation faults     0              # 0.000\/sec\nbranches             370758952846   # 60.936 branches per 1000 inst\nbranch misses        576569854      # 0.16% branch miss\nconditional          370758964686   # 60.936 conditional branches per 1000 inst\nindirect             26280180628    # 4.319 indirect branches per 1000 inst\nslots                14166997571918 #\nretiring             7649074520400  # 54.0% (54.0%)\n-- ucode             500008685427   #     3.5%\n-- fastpath          7149065834973  #    50.5%\nfrontend             1409343082210  #  9.9% ( 9.9%)\n-- latency           280678924673   #     2.0%\n-- bandwidth         1128664157537  #     8.0%\nbackend              5938083799852  # 41.9% (41.9%)\n-- cpu               5064130744520  #    35.7%\n-- memory            873953055332   #     6.2%\nspeculation          113614731650   #  0.8% ( 0.8%) low\n-- branch mispredict 113465068431   #     0.8%\n-- pipeline restart  149663219      #     0.0%\nsmt-contention       0              #  0.0% ( 0.0%)\ncpu-cycles           2352185195712  # 0.23 GHz\ninstructions         
6088304938733  # 2.59 IPC\nl2 access            490630406      # 0.081 l2 access per 1000 inst\nl2 miss              98092730       # 19.99% l2 miss\ncpu-cycles           2352817541230  #  9.6% memory latency\nload stalls          225825885761   #  9.6% l1 bound\nl1 miss              938789868      #  0.0% l2 bound\nl2 miss              410168589      #  0.0% l3 bound\nl3 miss              261950789      #  0.0% dram bound\nstore_stalls         101113868      #  0.0% store bound\n<\/code><\/pre>\n\n\n\n<p>Process time shows almost all the time spent in the benchmark application.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>354 processes\n\t  3 benchmark              514.36     0.00\n\t 68 clinfo                  16.22     6.31\n\t 38 vulkaninfo               1.52     0.94\n\t  4 vulkani:disk$0           0.16     0.10\n\t  6 glxinfo:gdrv0            0.11     0.05\n\t  6 glxinfo:gl0              0.11     0.05\n\t  6 php                      0.08     0.09\n\t  2 llvmpipe-0               0.08     0.05\n\t  2 llvmpipe-1               0.08     0.05\n\t  2 llvmpipe-10              0.08     0.05\n\t  2 llvmpipe-11              0.08     0.05\n\t  2 llvmpipe-12              0.08     0.05\n\t  2 llvmpipe-13              0.08     0.05\n\t  2 llvmpipe-14              0.08     0.05\n\t  2 llvmpipe-15              0.08     0.05\n\t  2 llvmpipe-2               0.08     0.05\n\t  2 llvmpipe-3               0.08     0.05\n\t  2 llvmpipe-4               0.08     0.05\n\t  2 llvmpipe-5               0.08     0.05\n\t  2 llvmpipe-6               0.08     0.05\n\t  2 llvmpipe-7               0.08     0.05\n\t  2 llvmpipe-8               0.08     0.05\n\t  2 llvmpipe-9               0.08     0.05\n\t  2 glxinfo                  0.06     0.02\n\t  2 glxinfo:cs0              0.06     0.02\n\t  2 glxinfo:disk$0           0.06     0.02\n\t  2 glxinfo:sh0              0.06     0.02\n\t  2 glxinfo:shlo0            0.06     0.02\n\t  6 clang                    0.04     0.08\n\t  3 rocminfo        
         0.00     0.03\n\t  1 lspci                    0.00     0.02\n\t 82 sh                       0.00     0.00\n\t 13 gcc                      0.00     0.00\n\t 11 gsettings                0.00     0.00\n\t  8 stat                     0.00     0.00\n\t  8 systemd-detect-          0.00     0.00\n\t  6 llvm-link                0.00     0.00\n\t  5 phoronix-test-s          0.00     0.00\n\t  3 gcrypt                   0.00     0.00\n\t  3 gmain                    0.00     0.00\n\t  2 cc                       0.00     0.00\n\t  2 dconf worker             0.00     0.00\n\t  2 lscpu                    0.00     0.00\n\t  2 uname                    0.00     0.00\n\t  2 which                    0.00     0.00\n\t  2 xset                     0.00     0.00\n\t  1 date                     0.00     0.00\n\t  1 dirname                  0.00     0.00\n\t  1 dmesg                    0.00     0.00\n\t  1 dmidecode                0.00     0.00\n\t  1 grep                     0.00     0.00\n\t  1 ifconfig                 0.00     0.00\n\t  1 ip                       0.00     0.00\n\t  1 lsmod                    0.00     0.00\n\t  1 mktemp                   0.00     0.00\n\t  1 ps                       0.00     0.00\n\t  1 qdbus                    0.00     0.00\n\t  1 readlink                 0.00     0.00\n\t  1 realpath                 0.00     0.00\n\t  1 sed                      0.00     0.00\n\t  1 sort                     0.00     0.00\n\t  1 stty                     0.00     0.00\n\t  1 systemctl                0.00     0.00\n\t  1 template.sh              0.00     0.00\n\t  1 wc                       0.00     0.00\n\t  1 xrandr                   0.00     0.00\n0 processes running\n47 maximum processes\n<\/code><\/pre>\n\n\n\n<p>Core computation pieces<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>      823837) gcrypt           cpu=1 start=5.64  finish=177.28\n        823838) benchmark        cpu=11 start=5.64  finish=177.28\n      823842) gcrypt           cpu=1 
start=181.28 finish=352.58\n        823843) benchmark        cpu=10 start=181.28 finish=352.58\n      823967) gcrypt           cpu=1 start=356.58 finish=528.03\n        823968) benchmark        cpu=10 start=356.59 finish=528.03\n<\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>Testing libgcrypt with the integrated benchmark. Looks to be single-threaded. Topdown profile shows some blurring, probably from different crypt subtests. AMD metrics show little floating point, very few frontend stalls and very little L2 access. Intel metrics <span class=\"excerpt-dots\">&hellip;<\/span> <a class=\"more-link\" href=\"https:\/\/mvermeulen.org\/perf\/workloads\/phoronix\/gcrypt\/\"><span class=\"more-msg\">Continue reading &rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"parent":58,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-1887","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/1887","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/comments?post=1887"}],"version-history":[{"count":2,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/1887\/revisions"}],"predecessor-version":[{"id":1897,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/1887\/revisions\/1897"}],"up":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/58"}],"wp:attachment":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/media?parent=1887"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated
":true}]}}