Parallel benchmark of the OpenSSL library covering seven different encryption methods. The tests run on all cores and keep the processor fully occupied.

The topdown breakdown varies from test to test, and some tests show high retirement rates. There is not much in the way of backend stalls.

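The AMD metrics confirm that the workload runs on all cores (about 15.43 of 16 busy). There is some floating-point code, about 148 float operations per 1000 instructions, almost all of it AVX-128. L2 access is very low at about 1.5 accesses per 1000 instructions, and the speculation penalty is negligible.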

elapsed              3546.361
on_cpu               0.964          # 15.43 / 16 cores
utime                54705.828
stime                6.286
nvcsw                2786           # 0.61%
nivcsw               454724         # 99.39%
inblock              0              # 0.00/sec
onblock              38344          # 10.81/sec
cpu-clock            54712772033085 # 54712.772 seconds
task-clock           54712944501664 # 54712.945 seconds
page faults          237163         # 4.335/sec
context switches     474670         # 8.676/sec
cpu migrations       288            # 0.005/sec
major page faults    6              # 0.000/sec
minor page faults    237157         # 4.335/sec
alignment faults     0              # 0.000/sec
emulation faults     0              # 0.000/sec
branches             28141573963092 # 84.214 branches per 1000 inst
branch misses        4040824108     # 0.01% branch miss
conditional          20015608782402 # 59.897 conditional branches per 1000 inst
indirect             1374390913138  # 4.113 indirect branches per 1000 inst
cpu-cycles           224836154819168 # 3.96 GHz
instructions         333918782253330 # 1.49 IPC
slots                449645952363360 #
retiring             120885862595350 # 26.9% (42.2%)
-- ucode             2836478130363  #     0.6%
-- fastpath          118049384464987 #    26.3%
frontend             17843197259553 #  4.0% ( 6.2%)
-- latency           3325064484588  #     0.7%
-- bandwidth         14518132774965 #     3.2%
backend              147725599817768 # 32.9% (51.6%)
-- cpu               130249734748986 #    29.0%
-- memory            17475865068782 #     3.9%
speculation          15053448577    #  0.0% ( 0.0%)
-- branch mispredict 14694571098    #     0.0%
-- pipeline restart  358877479      #     0.0%
smt-contention       163175458826959 # 36.3% ( 0.0%)
cpu-cycles           224755130568245 # 3.96 GHz
instructions         333767051670546 # 1.49 IPC
instructions         111255059194864 # 1.475 l2 access per 1000 inst
l2 hit from l1       144816833386   # 0.08% l2 miss
l2 miss from l1      93791436       #
l2 hit from l2 pf    19193166382    #
l3 hit from l2 pf    37068899       #
l3 miss from l2 pf   6992874        #
instructions         111228238560745 # 148.491 float per 1000 inst
float 512            85             # 0.000 AVX-512 per 1000 inst
float 256            229775066001   # 2.066 AVX-256 per 1000 inst
float 128            16286635504331 # 146.425 AVX-128 per 1000 inst
float MMX            0              # 0.000 MMX per 1000 inst
float scalar         5              # 0.000 scalar per 1000 inst
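
The topdown percentages in the AMD block can be reproduced from the raw counters. The sketch below is a reading of the report rather than the tool's actual code: it assumes the first percentage is the counter divided by total slots, and that the value in parentheses divides by slots with smt-contention removed. The function name `topdown_shares` is hypothetical.

```python
# Topdown level-1 counters copied from the AMD block above.
slots = 449_645_952_363_360
counters = {
    "retiring": 120_885_862_595_350,
    "frontend": 17_843_197_259_553,
    "backend": 147_725_599_817_768,
    "speculation": 15_053_448_577,
    "smt-contention": 163_175_458_826_959,
}


def topdown_shares(slots, counters, excluded="smt-contention"):
    """Return {name: (percent of all slots, percent of slots without `excluded`)}.

    Assumption: the parenthesised column in the report excludes the
    smt-contention slots from the denominator.
    """
    usable = slots - counters.get(excluded, 0)
    shares = {}
    for name, count in counters.items():
        raw = 100.0 * count / slots
        adjusted = 0.0 if name == excluded else 100.0 * count / usable
        shares[name] = (raw, adjusted)
    return shares


shares = topdown_shares(slots, counters)
for name, (raw, adjusted) in shares.items():
    print(f"{name:<16} {raw:5.1f}% ({adjusted:5.1f}%)")
```

Under these assumptions the five categories account for essentially all slots, and over a third of them are attributed to smt-contention rather than to the workload itself.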

Intel metrics

elapsed              3546.473
on_cpu               0.964          # 15.42 / 16 cores
utime                54698.894
stime                1.537
nvcsw                2754           # 0.71%
nivcsw               386803         # 99.29%
inblock              1624           # 0.46/sec
onblock              26832          # 7.57/sec
cpu-clock            54700573721291 # 54700.574 seconds
task-clock           54700650886402 # 54700.651 seconds
page faults          221464         # 4.049/sec
context switches     406713         # 7.435/sec
cpu migrations       370            # 0.007/sec
major page faults    11             # 0.000/sec
minor page faults    221453         # 4.048/sec
alignment faults     0              # 0.000/sec
emulation faults     0              # 0.000/sec
branches             27345414812497 # 75.259 branches per 1000 inst
branch misses        9738137632     # 0.04% branch miss
conditional          27345414836753 # 75.259 conditional branches per 1000 inst
indirect             9376516889196  # 25.806 indirect branches per 1000 inst
slots                252047617073012 #
retiring             193522371846759 # 76.8% (76.8%)
-- ucode             10580939452566 #     4.2%
-- fastpath          182941432394193 #    72.6%
frontend             44035380227094 # 17.5% (17.5%)
-- latency           25740142883266 #    10.2%
-- bandwidth         18295237343828 #     7.3%
backend              12144723001961 #  4.8% ( 4.8%)
-- cpu               11181100551912 #     4.4%
-- memory            963622450049   #     0.4%
speculation          141736916921   #  0.1% ( 0.1%)
-- branch mispredict 122323887918   #     0.0%
-- pipeline restart  19413029003    #     0.0%
smt-contention       0              #  0.0% ( 0.0%)
cpu-cycles           85364189856430 # 1.50 GHz
instructions         175744735277161 # 2.06 IPC
l2 access            9473367665     # 0.055 l2 access per 1000 inst
l2 miss              251739645      # 2.66% l2 miss
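
The summary figures at the top of each block (on_cpu, IPC, GHz) follow from the raw counters. A minimal sketch, assuming on_cpu is (utime + stime) / elapsed divided by the 16 cores, and that the GHz figure is cycles divided by elapsed time times the core count. The second formula is inferred from the numbers, not taken from the tool. The helper name `derived` is hypothetical.

```python
NCORES = 16  # from the "/ 16 cores" annotation in the report


def derived(elapsed, utime, stime, cycles, instructions, ncores=NCORES):
    """Recompute the summary ratios from raw counters (assumed formulas)."""
    busy_cores = (utime + stime) / elapsed
    return {
        "busy_cores": busy_cores,               # e.g. "15.42 / 16 cores"
        "on_cpu": busy_cores / ncores,          # fraction of the machine in use
        "ipc": instructions / cycles,           # instructions per cycle
        "ghz": cycles / (elapsed * ncores) / 1e9,  # average per-core clock (assumed)
    }


# Values copied from the AMD and Intel blocks above.
amd = derived(3546.361, 54705.828, 6.286,
              224_836_154_819_168, 333_918_782_253_330)
intel = derived(3546.473, 54698.894, 1.537,
                85_364_189_856_430, 175_744_735_277_161)

for label, run in (("amd", amd), ("intel", intel)):
    print(f"{label:<6} on_cpu={run['on_cpu']:.3f} "
          f"cores={run['busy_cores']:.2f} "
          f"ipc={run['ipc']:.2f} ghz={run['ghz']:.2f}")
```

With these formulas both runs should come out at an on_cpu of 0.964, matching the reported values, which suggests the two passes measured the same workload with different counter sets.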

The process overview shows essentially all of the time spent in openssl, which runs its internal benchmark mechanism.

732 processes
	378 openssl              54697.53     1.02
	 68 clinfo                  15.88     6.98
	 38 vulkaninfo               0.95     1.33
	  6 php                      0.14     0.24
	  6 glxinfo:gdrv0            0.12     0.09
	  4 vulkani:disk$0           0.10     0.14
	  2 glxinfo                  0.06     0.04
	  2 glxinfo:cs0              0.06     0.04
	  2 glxinfo:disk$0           0.06     0.04
	  2 glxinfo:sh0              0.06     0.04
	  2 glxinfo:shlo0            0.06     0.04
	  2 llvmpipe-0               0.05     0.07
	  2 llvmpipe-1               0.05     0.07
	  2 llvmpipe-10              0.05     0.07
	  2 llvmpipe-11              0.05     0.07
	  2 llvmpipe-12              0.05     0.07
	  2 llvmpipe-13              0.05     0.07
	  2 llvmpipe-14              0.05     0.07
	  2 llvmpipe-15              0.05     0.07
	  2 llvmpipe-2               0.05     0.07
	  2 llvmpipe-3               0.05     0.07
	  2 llvmpipe-4               0.05     0.07
	  2 llvmpipe-5               0.05     0.07
	  2 llvmpipe-6               0.05     0.07
	  2 llvmpipe-7               0.05     0.07
	  2 llvmpipe-8               0.05     0.07
	  2 llvmpipe-9               0.05     0.07
	  6 clang                    0.04     0.08
	  1 lspci                    0.00     0.02
	 94 sh                       0.00     0.00
	 13 gcc                      0.00     0.00
	 11 gsettings                0.00     0.00
	  8 stat                     0.00     0.00
	  8 systemd-detect-          0.00     0.00
	  6 llvm-link                0.00     0.00
	  5 phoronix-test-s          0.00     0.00
	  4 gmain                    0.00     0.00
	  3 rocminfo                 0.00     0.00
	  2 cc                       0.00     0.00
	  2 lscpu                    0.00     0.00
	  2 uname                    0.00     0.00
	  2 which                    0.00     0.00
	  2 xset                     0.00     0.00
	  1 date                     0.00     0.00
	  1 dconf worker             0.00     0.00
	  1 dirname                  0.00     0.00
	  1 dmesg                    0.00     0.00
	  1 dmidecode                0.00     0.00
	  1 grep                     0.00     0.00
	  1 ifconfig                 0.00     0.00
	  1 ip                       0.00     0.00
	  1 lsmod                    0.00     0.00
	  1 mktemp                   0.00     0.00
	  1 ps                       0.00     0.00
	  1 qdbus                    0.00     0.00
	  1 readlink                 0.00     0.00
	  1 realpath                 0.00     0.00
	  1 sed                      0.00     0.00
	  1 sort                     0.00     0.00
	  1 stty                     0.00     0.00
	  1 systemctl                0.00     0.00
	  1 template.sh              0.00     0.00
	  1 wc                       0.00     0.00
	  1 xrandr                   0.00     0.00
0 processes running
47 maximum processes
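
The dominance of openssl can be quantified by parsing the table. The columns are not labeled in the report, so this sketch assumes the layout is count, command name, then two numeric columns, and that the first numeric column is CPU seconds (it is close to the utime figure reported earlier). Only a few rows are copied here, so the share is computed over the sample, not the full table. The names `ROW` and `parse_rows` are hypothetical.

```python
import re

# count, command name (may contain spaces), then two numeric columns.
ROW = re.compile(r"^\s*(\d+)\s+(.+?)\s+(\d+\.\d+)\s+(\d+\.\d+)\s*$")

# A few rows copied from the process overview above (not the full table).
sample = """\
378 openssl              54697.53     1.02
 68 clinfo                  15.88     6.98
 38 vulkaninfo               0.95     1.33
  1 dconf worker             0.00     0.00
"""


def parse_rows(text):
    """Return a list of (count, name, first_column, second_column) tuples."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            count, name, col_a, col_b = match.groups()
            rows.append((int(count), name, float(col_a), float(col_b)))
    return rows


rows = parse_rows(sample)
# Assumption: the first numeric column is CPU time in seconds.
total = sum(row[2] for row in rows)
share = {name: 100.0 * secs / total for _, name, secs, _ in rows}

for _, name, secs, _ in rows:
    print(f"{name:<16} {secs:10.2f}s {share[name]:7.3f}%")
```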

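The computation structure is straightforward: in each run the tree shows a top-level openssl process with one child that forks 16 workers, one per CPU, which all start and finish together.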

      2555597) openssl          cpu=7 start=5.94  finish=185.95
        2555598) openssl          cpu=10 start=5.94  finish=185.95
          2555599) openssl          cpu=10 start=5.95  finish=185.95
          2555600) openssl          cpu=4 start=5.95  finish=185.95
          2555601) openssl          cpu=6 start=5.95  finish=185.95
          2555602) openssl          cpu=3 start=5.95  finish=185.95
          2555603) openssl          cpu=1 start=5.95  finish=185.95
          2555604) openssl          cpu=0 start=5.95  finish=185.95
          2555605) openssl          cpu=7 start=5.95  finish=185.95
          2555606) openssl          cpu=13 start=5.95  finish=185.95
          2555607) openssl          cpu=9 start=5.95  finish=185.95
          2555608) openssl          cpu=12 start=5.95  finish=185.95
          2555609) openssl          cpu=14 start=5.95  finish=185.95
          2555610) openssl          cpu=11 start=5.95  finish=185.95
          2555611) openssl          cpu=8 start=5.95  finish=185.95
          2555612) openssl          cpu=2 start=5.95  finish=185.95
          2555613) openssl          cpu=5 start=5.95  finish=185.95
          2555614) openssl          cpu=15 start=5.95  finish=185.95
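
The tree above can be checked mechanically. This is a minimal sketch, assuming each line has the form `pid) name cpu=N start=S finish=F` and that two extra spaces of indentation mean one more level of nesting. Both are assumptions about the tool's output format, and the names `ENTRY` and `parse_tree` are hypothetical.

```python
import re

ENTRY = re.compile(
    r"^(\s*)(\d+)\)\s+(\S+)\s+cpu=(\d+)\s+start=([\d.]+)\s+finish=([\d.]+)"
)

# Process tree copied from the report above.
tree = """\
      2555597) openssl          cpu=7 start=5.94  finish=185.95
        2555598) openssl          cpu=10 start=5.94  finish=185.95
          2555599) openssl          cpu=10 start=5.95  finish=185.95
          2555600) openssl          cpu=4 start=5.95  finish=185.95
          2555601) openssl          cpu=6 start=5.95  finish=185.95
          2555602) openssl          cpu=3 start=5.95  finish=185.95
          2555603) openssl          cpu=1 start=5.95  finish=185.95
          2555604) openssl          cpu=0 start=5.95  finish=185.95
          2555605) openssl          cpu=7 start=5.95  finish=185.95
          2555606) openssl          cpu=13 start=5.95  finish=185.95
          2555607) openssl          cpu=9 start=5.95  finish=185.95
          2555608) openssl          cpu=12 start=5.95  finish=185.95
          2555609) openssl          cpu=14 start=5.95  finish=185.95
          2555610) openssl          cpu=11 start=5.95  finish=185.95
          2555611) openssl          cpu=8 start=5.95  finish=185.95
          2555612) openssl          cpu=2 start=5.95  finish=185.95
          2555613) openssl          cpu=5 start=5.95  finish=185.95
          2555614) openssl          cpu=15 start=5.95  finish=185.95
"""


def parse_tree(text):
    """Parse tree lines into dicts; depth is relative to the shallowest line."""
    entries = []
    for line in text.splitlines():
        match = ENTRY.match(line)
        if not match:
            continue
        indent, pid, name, cpu, start, finish = match.groups()
        entries.append({
            "indent": len(indent),
            "pid": int(pid),
            "name": name,
            "cpu": int(cpu),
            "duration": float(finish) - float(start),
        })
    base = min(entry["indent"] for entry in entries)
    for entry in entries:
        entry["depth"] = (entry["indent"] - base) // 2  # assumed 2 spaces per level
    return entries


entries = parse_tree(tree)
deepest = max(entry["depth"] for entry in entries)
workers = [entry for entry in entries if entry["depth"] == deepest]
worker_cpus = sorted(entry["cpu"] for entry in workers)

print(f"{len(workers)} workers on {len(set(worker_cpus))} distinct CPUs")
print(f"worker run time: {workers[0]['duration']:.2f} seconds")
```

For this run the leaf processes cover CPUs 0 through 15 exactly once, which is consistent with one worker per core.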