WireGuard is a test of the network stack and hence not the best fit for CPU-based metrics. The test creates separate network devices and sends traffic through them. I've added things to --system to also check for network traffic. However, because these network devices are created after the measurement has started, I don't record their traffic. The profile below shows short bursts of CPU activity amid bursts of IRQ processing activity and a variable number of processes started.

Note that this is the one test where my Intel box seems to run faster than the AMD box (151 seconds vs. 193 seconds).

The overall graph of topdown metrics shows that the short bursts of CPU activity are mostly frontend-latency related.

The AMD metrics show, on average, only a little over one core's worth of CPU in use, even though the first graph shows the run queue reaching as many as 25 processes. There are a lot of branches in this short amount of code (about 204 per 1000 instructions, with an 11.27% miss rate).

elapsed              648.250
on_cpu               0.070          # 1.12 / 16 cores
utime                49.347
stime                675.785
nvcsw                9918151        # 27.06%
nivcsw               26735996       # 72.94%
inblock              0              # 0.00/sec
onblock              13480          # 20.79/sec
cpu-clock            751000202974   # 751.000 seconds
task-clock           759804247090   # 759.804 seconds
page faults          251649         # 331.202/sec
context switches     36657366       # 48245.803/sec
cpu migrations       3114963        # 4099.691/sec
major page faults    146            # 0.192/sec
minor page faults    251503         # 331.010/sec
alignment faults     0              # 0.000/sec
emulation faults     0              # 0.000/sec
branches             447260141555   # 204.464 branches per 1000 inst
branch misses        50398393703    # 11.27% branch miss
conditional          226128351861   # 103.374 conditional branches per 1000 inst
indirect             5434863427     # 2.485 indirect branches per 1000 inst
cpu-cycles           2937207696633  # 0.29 GHz
instructions         2243131537504  # 0.76 IPC
slots                5718144840738  #
retiring             877099306526   # 15.3% (15.8%)
-- ucode             5812595978     #     0.1%
-- fastpath          871286710548   #    15.2%
frontend             3942628292756  # 68.9% (71.0%)
-- latency           3333746448912  #    58.3%
-- bandwidth         608881843844   #    10.6%
backend              704100712113   # 12.3% (12.7%)
-- cpu               150382607062   #     2.6%
-- memory            553718105051   #     9.7%
speculation          28504097599    #  0.5% ( 0.5%)
-- branch mispredict 28392540859    #     0.5%
-- pipeline restart  111556740      #     0.0%
smt-contention       164806460573   #  2.9% ( 0.0%)
cpu-cycles           2965792327103  # 0.29 GHz
instructions         2257623287988  # 0.76 IPC
instructions         731520942056   # 108.288 l2 access per 1000 inst
l2 hit from l1       68301882340    # 12.49% l2 miss
l2 miss from l1      5649904572     #
l2 hit from l2 pf    6671947470     #
l3 hit from l2 pf    4075915184     #
l3 miss from l2 pf   165354559      #
instructions         734388269635   # 16.385 float per 1000 inst
float 512            263            # 0.000 AVX-512 per 1000 inst
float 256            148351         # 0.000 AVX-256 per 1000 inst
float 128            12032632291    # 16.385 AVX-128 per 1000 inst
float MMX            0              # 0.000 MMX per 1000 inst
float scalar         0              # 0.000 scalar per 1000 inst
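
As a rough cross-check of the derived columns in the AMD dump, here is a minimal Python sketch that recomputes a few of them from the raw counters. The formulas are my reading of what the tool reports (for example, that on_cpu is CPU seconds divided by elapsed time and core count), not something taken from its source, and the variable names are made up for illustration.

```python
# Raw counters copied from the AMD dump above.
elapsed = 648.250                  # wall-clock seconds
utime, stime = 49.347, 675.785     # user / system CPU seconds
ncores = 16
nvcsw, nivcsw = 9918151, 26735996  # voluntary / involuntary context switches
cpu_cycles = 2937207696633
instructions = 2243131537504
branches, branch_misses = 447260141555, 50398393703


def busy_cores(utime, stime, elapsed):
    """Average number of cores kept busy: CPU seconds per wall-clock second."""
    return (utime + stime) / elapsed


cores = busy_cores(utime, stime, elapsed)
on_cpu = cores / ncores                      # assumed definition of on_cpu
voluntary_share = nvcsw / (nvcsw + nivcsw)   # share of voluntary switches
ipc = instructions / cpu_cycles              # instructions per cycle
miss_rate = branch_misses / branches         # branch misprediction rate
kernel_share = stime / (utime + stime)       # fraction of CPU time in the kernel

print(f"busy cores      {cores:.2f} / {ncores}")
print(f"on_cpu          {on_cpu:.3f}")
print(f"voluntary csw   {voluntary_share:.2%}")
print(f"IPC             {ipc:.2f}")
print(f"branch miss     {miss_rate:.2%}")
print(f"kernel share    {kernel_share:.1%}")
```

Nearly all of the CPU time is system time, which fits a workload that mostly exercises the kernel network path.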

Intel metrics

elapsed              475.534
on_cpu               0.065          # 1.05 / 16 cores
utime                37.293
stime                460.655
nvcsw                8453989        # 26.08%
nivcsw               23964755       # 73.92%
inblock              424            # 0.89/sec
onblock              2008           # 4.22/sec
cpu-clock            511848591517   # 511.849 seconds
task-clock           516693238705   # 516.693 seconds
page faults          235690         # 456.151/sec
context switches     32420944       # 62746.987/sec
cpu migrations       6844367        # 13246.481/sec
major page faults    150            # 0.290/sec
minor page faults    235540         # 455.860/sec
alignment faults     0              # 0.000/sec
emulation faults     0              # 0.000/sec
branches             303243666167   # 170.904 branches per 1000 inst
branch misses        969416905      # 0.32% branch miss
conditional          303243689143   # 170.904 conditional branches per 1000 inst
indirect             37894263124    # 21.357 indirect branches per 1000 inst
slots                18241517781626 #
retiring             5729375951369  # 31.4% (31.4%)
-- ucode             1100533124254  #     6.0%
-- fastpath          4628842827115  #    25.4%
frontend             5027917599006  # 27.6% (27.6%)
-- latency           2850238760757  #    15.6%
-- bandwidth         2177678838249  #    11.9%
backend              6724543857759  # 36.9% (36.9%)
-- cpu               4076434284858  #    22.3%
-- memory            2648109572901  #    14.5%
speculation          921400842474   #  5.1% ( 5.1%)
-- branch mispredict 841198396395   #     4.6%
-- pipeline restart  80202446079    #     0.4%
smt-contention       0              #  0.0% ( 0.0%)
cpu-cycles           1756463286214  # 0.20 GHz
instructions         2212636517840  # 1.26 IPC
l2 access            153827325812   # 91.161 l2 access per 1000 inst
l2 miss              32255204923    # 20.97% l2 miss
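
To compare the two topdown breakdowns side by side, the sketch below recomputes the level-1 percentages from the slot counts in both dumps. The parenthesized figures are not explained in the output itself. The numbers are consistent with dividing by slots minus smt-contention, so that is what the second value assumes; treat it as an inference rather than the tool's documented behavior.

```python
# Level-1 topdown counters copied from the AMD and Intel dumps above.
amd = {
    "slots": 5718144840738,
    "retiring": 877099306526,
    "frontend": 3942628292756,
    "backend": 704100712113,
    "speculation": 28504097599,
    "smt-contention": 164806460573,
}
intel = {
    "slots": 18241517781626,
    "retiring": 5729375951369,
    "frontend": 5027917599006,
    "backend": 6724543857759,
    "speculation": 921400842474,
    "smt-contention": 0,
}

CATEGORIES = ("retiring", "frontend", "backend", "speculation")


def topdown_shares(counters):
    """Return {category: (pct of all slots, pct of slots minus smt-contention)}."""
    slots = counters["slots"]
    usable = slots - counters["smt-contention"]  # assumed normalization
    return {
        name: (100 * counters[name] / slots, 100 * counters[name] / usable)
        for name in CATEGORIES
    }


for label, counters in (("AMD", amd), ("Intel", intel)):
    print(label)
    for name, (raw, normalized) in topdown_shares(counters).items():
        print(f"  {name:12s} {raw:5.1f}% ({normalized:5.1f}%)")
```

With these inputs the frontend share is by far the largest category on AMD, while on Intel the slots are spread more evenly across retiring, frontend, and backend.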

Process summary

693 processes
	 96 iperf3                  50.88   626.23
	 68 clinfo                  18.84     6.66
	 38 vulkaninfo               0.95     1.31
	  6 glxinfo:gdrv0            0.19     0.03
	  4 vulkani:disk$0           0.10     0.14
	  2 glxinfo                  0.10     0.01
	  2 glxinfo:cs0              0.10     0.01
	  2 glxinfo:disk$0           0.10     0.01
	  2 glxinfo:sh0              0.10     0.01
	  2 glxinfo:shlo0            0.10     0.01
	  6 php                      0.07     0.06
	  6 clang                    0.07     0.05
	  2 llvmpipe-0               0.05     0.07
	  2 llvmpipe-1               0.05     0.07
	  2 llvmpipe-10              0.05     0.07
	  2 llvmpipe-11              0.05     0.07
	  2 llvmpipe-12              0.05     0.07
	  2 llvmpipe-13              0.05     0.07
	  2 llvmpipe-14              0.05     0.07
	  2 llvmpipe-15              0.05     0.07
	  2 llvmpipe-2               0.05     0.07
	  2 llvmpipe-3               0.05     0.07
	  2 llvmpipe-4               0.05     0.07
	  2 llvmpipe-5               0.05     0.07
	  2 llvmpipe-6               0.05     0.07
	  2 llvmpipe-7               0.05     0.07
	  2 llvmpipe-8               0.05     0.07
	  2 llvmpipe-9               0.05     0.07
	 24 bash                     0.03     0.12
	  3 rocminfo                 0.03     0.00
	 48 ss                       0.00     0.48
	  1 lspci                    0.00     0.03
	  1 ps                       0.00     0.01
	100 ip                       0.00     0.00
	 81 sh                       0.00     0.00
	 30 wg                       0.00     0.00
	 24 ping                     0.00     0.00
	 24 ping6                    0.00     0.00
	 12 gcc                      0.00     0.00
	 11 gsettings                0.00     0.00
	  8 stat                     0.00     0.00
	  8 systemd-detect-          0.00     0.00
	  6 llvm-link                0.00     0.00
	  5 phoronix-test-s          0.00     0.00
	  4 gmain                    0.00     0.00
	  4 readlink                 0.00     0.00
	  3 mount                    0.00     0.00
	  3 wireguard                0.00     0.00
	  2 lscpu                    0.00     0.00
	  2 uname                    0.00     0.00
	  2 which                    0.00     0.00
	  2 xset                     0.00     0.00
	  1 cc                       0.00     0.00
	  1 date                     0.00     0.00
	  1 dconf worker             0.00     0.00
	  1 dirname                  0.00     0.00
	  1 dmesg                    0.00     0.00
	  1 dmidecode                0.00     0.00
	  1 grep                     0.00     0.00
	  1 ifconfig                 0.00     0.00
	  1 lsmod                    0.00     0.00
	  1 mktemp                   0.00     0.00
	  1 qdbus                    0.00     0.00
	  1 realpath                 0.00     0.00
	  1 sed                      0.00     0.00
	  1 sort                     0.00     0.00
	  1 stty                     0.00     0.00
	  1 systemctl                0.00     0.00
	  1 template.sh              0.00     0.00
	  1 wc                       0.00     0.00
	  1 xrandr                   0.00     0.00
0 processes running
47 maximum processes
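
The summary above is easy to post-process. Here is a minimal Python sketch that parses rows of that shape and reports how much of the second numeric column belongs to one command. The columns are unlabeled in the listing, so the names `col_a` and `col_b` are placeholders (they look like user and system CPU seconds, but that is an assumption), and only a handful of rows are included as sample input.

```python
import re

# count, command name (may contain spaces), then two decimal columns
ROW = re.compile(r"^\s*(\d+)\s+(.+?)\s+(\d+\.\d+)\s+(\d+\.\d+)\s*$")

# A few rows copied from the summary above.
sample = """\
   96 iperf3                  50.88   626.23
   68 clinfo                  18.84     6.66
   38 vulkaninfo               0.95     1.31
   48 ss                       0.00     0.48
    1 dconf worker             0.00     0.00
"""


def parse_summary(text):
    """Return a list of (count, name, col_a, col_b) tuples, skipping other lines."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            count, name, col_a, col_b = m.groups()
            rows.append((int(count), name, float(col_a), float(col_b)))
    return rows


def share_of(rows, command):
    """Fraction of the second numeric column attributed to `command`."""
    total = sum(r[3] for r in rows)
    mine = sum(r[3] for r in rows if r[1] == command)
    return mine / total if total else 0.0


rows = parse_summary(sample)
print(f"iperf3 share of second column: {share_of(rows, 'iperf3'):.1%}")
```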

The process structure looks as follows:

      36321) wireguard        cpu=6 start=5.80  finish=204.72
        36322) bash             cpu=6 start=5.80  finish=204.72
          36323) readlink         cpu=2 start=5.80  finish=5.80 
          36324) mount            cpu=3 start=5.81  finish=5.81 
          36325) ip               cpu=2 start=5.81  finish=5.81 
          36326) ip               cpu=13 start=5.81  finish=5.82 
          36327) ip               cpu=15 start=5.82  finish=5.82 
          36328) ip               cpu=2 start=5.82  finish=5.82 
          36329) ip               cpu=13 start=5.82  finish=5.83 
          36330) ip               cpu=15 start=5.83  finish=5.83 
          36331) ip               cpu=3 start=5.83  finish=5.83 
          36332) ip               cpu=2 start=5.83  finish=5.83 
          36334) ip               cpu=15 start=5.83  finish=5.89 
          36335) ip               cpu=3 start=5.89  finish=5.90 
          36337) ip               cpu=4 start=5.90  finish=5.96 
          36338) wg               cpu=3 start=5.96  finish=5.96 
          36339) wg               cpu=3 start=5.96  finish=5.96 
          36340) bash             cpu=15 start=5.96  finish=5.96 
            36341) wg               cpu=4 start=5.96  finish=5.96 
          36342) bash             cpu=3 start=5.96  finish=5.97 
            36343) wg               cpu=8 start=5.97  finish=5.97 
          36344) ip               cpu=4 start=5.97  finish=5.97 
          36345) ip               cpu=3 start=5.97  finish=5.97 
          36346) ip               cpu=4 start=5.97  finish=5.98 
          36347) ip               cpu=8 start=5.98  finish=5.98 
          36348) bash             cpu=15 start=5.98  finish=5.98 
          36349) wg               cpu=11 start=5.98  finish=5.98 
          36350) bash             cpu=15 start=5.99  finish=5.99 
          36351) wg               cpu=8 start=5.99  finish=5.99 
          36352) ip               cpu=15 start=5.99  finish=5.99 
          36353) ip               cpu=11 start=5.99  finish=5.99 
          36354) ip               cpu=15 start=6.00  finish=6.00 
          36355) wg               cpu=4 start=6.00  finish=6.00 
          36356) wg               cpu=15 start=6.00  finish=6.00 
          36357) ping             cpu=4 start=6.00  finish=6.01 
          36358) ping             cpu=9 start=6.01  finish=6.02 
          36359) ping6            cpu=12 start=6.02  finish=6.02 
          36360) ping6            cpu=11 start=6.02  finish=6.03 
          36361) iperf3           cpu=1 start=6.03  finish=25.94
          36362) ss               cpu=4 start=6.03  finish=6.05 
          36363) iperf3           cpu=2 start=6.05  finish=25.94
          36365) iperf3           cpu=15 start=25.95 finish=46.30
          36366) ss               cpu=12 start=25.95 finish=25.96
          36367) iperf3           cpu=12 start=25.96 finish=46.30
          36369) iperf3           cpu=5 start=46.30 finish=68.88
          36370) ss               cpu=12 start=46.30 finish=46.31
          36371) iperf3           cpu=4 start=46.31 finish=68.88
          36372) iperf3           cpu=2 start=68.89 finish=93.43
          36373) ss               cpu=10 start=68.89 finish=68.90
          36374) iperf3           cpu=0 start=68.90 finish=93.43
          36375) ip               cpu=11 start=93.43 finish=93.44
          36376) ip               cpu=10 start=93.44 finish=93.44
          36377) ping             cpu=7 start=93.44 finish=93.44
          36378) ping             cpu=9 start=93.44 finish=93.45
          36379) ping6            cpu=6 start=93.45 finish=93.45
          36380) ping6            cpu=6 start=93.45 finish=93.45
          36381) iperf3           cpu=10 start=93.45 finish=96.05
          36382) ss               cpu=11 start=93.45 finish=93.47
          36383) iperf3           cpu=8 start=93.47 finish=96.05
          36386) iperf3           cpu=7 start=96.05 finish=98.67
          36387) ss               cpu=15 start=96.05 finish=96.07
          36388) iperf3           cpu=8 start=96.07 finish=98.67
          36389) iperf3           cpu=9 start=98.67 finish=102.12
          36390) ss               cpu=7 start=98.67 finish=98.69
          36391) iperf3           cpu=4 start=98.69 finish=102.12
          36392) iperf3           cpu=15 start=102.12 finish=105.81
          36393) ss               cpu=3 start=102.12 finish=102.14
          36394) iperf3           cpu=12 start=102.14 finish=105.81
          36395) ip               cpu=3 start=105.81 finish=105.81
          36396) ip               cpu=3 start=105.82 finish=105.82
          36397) wg               cpu=3 start=105.82 finish=105.82
          36398) wg               cpu=3 start=105.82 finish=105.82
          36399) ping             cpu=3 start=105.82 finish=105.83
          36400) ping             cpu=12 start=105.83 finish=105.83
          36401) ping6            cpu=6 start=105.83 finish=105.83
          36402) ping6            cpu=12 start=105.83 finish=105.84
          36403) iperf3           cpu=15 start=105.84 finish=125.84
          36404) ss               cpu=10 start=105.84 finish=105.85
          36405) iperf3           cpu=14 start=105.86 finish=125.84
          36406) iperf3           cpu=5 start=125.84 finish=146.22
          36407) ss               cpu=2 start=125.84 finish=125.86
          36408) iperf3           cpu=0 start=125.86 finish=146.22
          36410) iperf3           cpu=4 start=146.22 finish=168.64
          36411) ss               cpu=1 start=146.23 finish=146.24
          36412) iperf3           cpu=11 start=146.24 finish=168.64
          36413) iperf3           cpu=8 start=168.64 finish=192.19
          36414) ss               cpu=2 start=168.64 finish=168.66
          36415) iperf3           cpu=2 start=168.66 finish=192.19
          36416) ip               cpu=1 start=192.19 finish=192.19
          36417) ip               cpu=1 start=192.19 finish=192.19
          36418) ping             cpu=1 start=192.19 finish=192.20
          36419) ping             cpu=7 start=192.20 finish=192.20
          36420) ping6            cpu=1 start=192.20 finish=192.20
          36421) ping6            cpu=6 start=192.21 finish=192.21
          36422) iperf3           cpu=4 start=192.21 finish=194.74
          36423) ss               cpu=1 start=192.21 finish=192.22
          36424) iperf3           cpu=2 start=192.22 finish=194.74
          36425) iperf3           cpu=3 start=194.74 finish=197.24
          36426) ss               cpu=7 start=194.74 finish=194.75
          36427) iperf3           cpu=12 start=194.75 finish=197.24
          36428) iperf3           cpu=6 start=197.24 finish=200.68
          36429) ss               cpu=10 start=197.24 finish=197.25
          36430) iperf3           cpu=11 start=197.25 finish=200.68
          36431) iperf3           cpu=3 start=200.69 finish=204.30
          36432) ss               cpu=8 start=200.69 finish=200.70
          36433) iperf3           cpu=12 start=200.70 finish=204.30
          36434) ip               cpu=9 start=204.30 finish=204.30
          36435) ip               cpu=6 start=204.30 finish=204.47
          36436) ip               cpu=2 start=204.47 finish=204.70
          36437) bash             cpu=7 start=204.70 finish=204.70
            36438) ip               cpu=1 start=204.70 finish=204.70
          36439) bash             cpu=3 start=204.70 finish=204.71
            36440) ip               cpu=12 start=204.70 finish=204.70
          36441) bash             cpu=13 start=204.71 finish=204.71
            36442) ip               cpu=7 start=204.71 finish=204.71
          36443) ip               cpu=1 start=204.71 finish=204.71
          36444) ip               cpu=12 start=204.71 finish=204.71
          36445) ip               cpu=1 start=204.71 finish=204.72
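
The listing shows iperf3 running in pairs that finish together, alternating between long runs of roughly 20 seconds and short runs of a few seconds. A minimal Python sketch for pulling per-command durations out of lines in this format follows. The regular expression is my own guess at the layout, it assumes command names without spaces, and the sample is only a few lines copied from the tree above.

```python
import re
from collections import defaultdict

# pid) name cpu=N start=S finish=F
LINE = re.compile(
    r"(\d+)\)\s+(\S+)\s+cpu=(\d+)\s+start=([\d.]+)\s+finish=([\d.]+)"
)

# A few lines copied from the process tree above.
sample = """\
          36361) iperf3           cpu=1 start=6.03  finish=25.94
          36362) ss               cpu=4 start=6.03  finish=6.05
          36363) iperf3           cpu=2 start=6.05  finish=25.94
          36381) iperf3           cpu=10 start=93.45 finish=96.05
          36382) ss               cpu=11 start=93.45 finish=93.47
          36383) iperf3           cpu=8 start=93.47 finish=96.05
"""


def durations(text):
    """Map each command name to its list of run times in seconds."""
    out = defaultdict(list)
    for m in LINE.finditer(text):
        _pid, name, _cpu, start, finish = m.groups()
        out[name].append(round(float(finish) - float(start), 2))
    return dict(out)


for name, times in durations(sample).items():
    print(f"{name:8s} runs={len(times)} durations={times}")
```

Feeding the whole tree through `durations` would give the full list of iperf3 phases, which makes the long/short pattern easy to see without reading the start and finish columns by eye.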