A benchmark of the Apache Cassandra NoSQL database. It runs three workloads: writes only, then mixed reads and writes at a 1:1 ratio, then at a 3:1 ratio. The number of runnable processes appears to vary over the run.

The topdown profile is weighted towards frontend stalls, followed by backend memory stalls. The overall retirement rate is lower than average.

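AMD metrics show little floating point (about 8 operations per 1000 instructions), a high rate of L2 accesses (about 104 per 1000 instructions, with roughly 40% missing), and frontend latency taking almost half of all slots (49.1%). Also interesting: only about 1/4 of the cores are busy (on_cpu 0.258).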
elapsed 4332.928
on_cpu 0.258 # 4.13 / 16 cores
utime 8905.936
stime 8995.891
nvcsw 601776779 # 94.13%
nivcsw 37510851 # 5.87%
inblock 9152 # 2.11/sec
onblock 294672 # 68.01/sec
cpu-clock 46605354321724 # 46605.354 seconds
task-clock 47003106395359 # 47003.106 seconds
page faults 35135335 # 747.511/sec
context switches 1432379568 # 30474.147/sec
cpu migrations 836923710 # 17805.711/sec
major page faults 5111418 # 108.746/sec
minor page faults 29976133 # 637.748/sec
alignment faults 0 # 0.000/sec
emulation faults 0 # 0.000/sec
branches 19677696605382 # 169.462 branches per 1000 inst
branch misses 1174720045057 # 5.97% branch miss
conditional 14157392473496 # 121.922 conditional branches per 1000 inst
indirect 398569475290 # 3.432 indirect branches per 1000 inst
cpu-cycles 110019767419271 # 2.46 GHz
instructions 78514350926593 # 0.71 IPC
slots 215057315796582 #
retiring 26727298203613 # 12.4% (14.0%)
-- ucode 107310119389 # 0.0%
-- fastpath 26619988084224 # 12.4%
frontend 121877473591996 # 56.7% (64.0%)
-- latency 105576764308332 # 49.1%
-- bandwidth 16300709283664 # 7.6%
backend 37980810293573 # 17.7% (19.9%)
-- cpu 3183460600102 # 1.5%
-- memory 34797349693471 # 16.2%
speculation 3728650165352 # 1.7% ( 2.0%)
-- branch mispredict 3700122721449 # 1.7%
-- pipeline restart 28527443903 # 0.0%
smt-contention 24641343322125 # 11.5% ( 0.0%)
cpu-cycles 118359556086527 # 2.48 GHz
instructions 83709827177511 # 0.71 IPC
instructions 27330974978530 # 104.378 l2 access per 1000 inst
l2 hit from l1 2521875844280 # 40.19% l2 miss
l2 miss from l1 913841730923 #
l2 hit from l2 pf 98215015831 #
l3 hit from l2 pf 193261277895 #
l3 miss from l2 pf 39386943790 #
instructions 27344544465946 # 8.199 float per 1000 inst
float 512 308 # 0.000 AVX-512 per 1000 inst
float 256 962 # 0.000 AVX-256 per 1000 inst
float 128 224187156471 # 8.199 AVX-128 per 1000 inst
float MMX 0 # 0.000 MMX per 1000 inst
float scalar 33788 # 0.000 scalar per 1000 inst
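The two percentage columns in the AMD topdown lines above can be reproduced from the slot counts. A minimal sketch, assuming the first column is each category's share of all slots and the parenthesised column is its share after smt-contention slots are removed; that reading is inferred from the numbers, not taken from the tool's source, and `topdown_shares` is a hypothetical helper name.

```python
# Slot counts copied from the AMD listing above.
slots = 215057315796582
smt_contention = 24641343322125
categories = {
    "retiring": 26727298203613,
    "frontend": 121877473591996,
    "backend": 37980810293573,
    "speculation": 3728650165352,
}


def topdown_shares(slots, smt_contention, categories):
    """Return {name: (pct_of_all_slots, pct_excluding_smt_contention)}."""
    # Assumption: the parenthesised column uses slots minus SMT contention
    # as its denominator.
    usable = slots - smt_contention
    return {
        name: (100.0 * count / slots, 100.0 * count / usable)
        for name, count in categories.items()
    }


shares = topdown_shares(slots, smt_contention, categories)
for name, (raw, adjusted) in shares.items():
    print(f"{name:12s} {raw:5.1f}% ({adjusted:5.1f}%)")
```

Under that assumption frontend comes out at roughly 56.7% of all slots and 64.0% once SMT contention is excluded, matching the listing. In the Intel listing smt-contention is 0, so both columns are identical there.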
Intel metrics
elapsed 4097.853
on_cpu 0.265 # 4.24 / 16 cores
utime 10573.212
stime 6785.551
nvcsw 549845910 # 94.17%
nivcsw 34034739 # 5.83%
inblock 1175312 # 286.81/sec
onblock 255152 # 62.26/sec
cpu-clock 46773863724106 # 46773.864 seconds
task-clock 47010972033464 # 47010.972 seconds
page faults 28928282 # 615.352/sec
context switches 1341725858 # 28540.696/sec
cpu migrations 871346665 # 18534.964/sec
major page faults 3548986 # 75.493/sec
minor page faults 25312394 # 538.436/sec
alignment faults 0 # 0.000/sec
emulation faults 0 # 0.000/sec
branches 17578871409480 # 162.943 branches per 1000 inst
branch misses 285816420521 # 1.63% branch miss
conditional 17578871753096 # 162.943 conditional branches per 1000 inst
indirect 3475864234757 # 32.219 indirect branches per 1000 inst
slots 260335397251850 #
retiring 80256736492957 # 30.8% (30.8%)
-- ucode 6584405418875 # 2.5%
-- fastpath 73672331074082 # 28.3%
frontend 90294915158826 # 34.7% (34.7%)
-- latency 59016419380680 # 22.7%
-- bandwidth 31278495778146 # 12.0%
backend 61537137199156 # 23.6% (23.6%)
-- cpu 23158639207958 # 8.9%
-- memory 38378497991198 # 14.7%
speculation 30134034973779 # 11.6% (11.6%)
-- branch mispredict 28831919130365 # 11.1%
-- pipeline restart 1302115843414 # 0.5%
smt-contention 0 # 0.0% ( 0.0%)
cpu-cycles 166264750189585 # 2.23 GHz
instructions 157451768214939 # 0.95 IPC
l2 access 7724180485399 # 103.319 l2 access per 1000 inst
l2 miss 3266744587191 # 42.29% l2 miss
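Both runs report an on_cpu fraction near 0.26 and an IPC below 1. A small sketch of how those derived figures relate to the raw counters, assuming on_cpu is (utime + stime) / elapsed divided by the 16 cores and IPC is instructions / cpu-cycles. The formulas are inferred from the listed values, and the `Run` class and helper names are illustrative only.

```python
from dataclasses import dataclass

CORES = 16  # from the "/ 16 cores" annotation on the on_cpu lines


@dataclass
class Run:
    name: str
    elapsed: float      # seconds
    utime: float        # seconds
    stime: float        # seconds
    cycles: int         # cpu-cycles
    instructions: int


def busy_cores(run: Run) -> float:
    """Average number of cores kept busy over the run."""
    return (run.utime + run.stime) / run.elapsed


def on_cpu(run: Run) -> float:
    """Fraction of total core time spent on CPU (assumed definition)."""
    return busy_cores(run) / CORES


def ipc(run: Run) -> float:
    """Instructions retired per cycle."""
    return run.instructions / run.cycles


# Values copied from the AMD and Intel listings above.
amd = Run("AMD", 4332.928, 8905.936, 8995.891,
          110019767419271, 78514350926593)
intel = Run("Intel", 4097.853, 10573.212, 6785.551,
            166264750189585, 157451768214939)

for run in (amd, intel):
    print(f"{run.name}: {busy_cores(run):.2f} cores busy, "
          f"on_cpu {on_cpu(run):.3f}, IPC {ipc(run):.2f}")
```

The AMD listing contains two cpu-cycles/instructions pairs, and the sketch uses the first. Both pairs give the same 0.71 IPC.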
The set of process names is interesting. This is a JDK program, so most entries are JVM thread names (truncated to 15 characters), and many of them appear to run for the whole benchmark. Some events were lost at the end of the trace.
9103 processes
85 JMX server conn 1369594.41 0.00
25 RMI TCP Connect 489389.80 0.00
64 epollEventLoopG 313049.92 140784.96
1184 cluster1-nio-wo 260370.83 215330.45
38 Native-Transpor 188366.70 85160.72
4 RMI TCP Accept- 133587.60 0.00
23 MemtableFlushWr 93993.37 36670.56
23 PerDiskMemtable 93988.87 36668.69
87 VM Periodic Tas 66793.80 0.00
2 CMS Main Thread 66793.80 0.00
8 MutationStage-2 39131.24 17598.12
174 java 35845.62 22259.47
5 MutationStage-1 23210.37 10214.00
7 CompactionExecu 22095.39 9798.23
96 globalEventExec 21541.15 17823.12
4 LocalPool-Clean 19565.62 8799.06
87 Finalizer 17922.81 11129.74
42 Common-Cleaner 17921.81 11129.00
2 AsyncAppender-W 9782.81 4399.53
2 BatchlogTasks:1 9782.81 4399.53
2 COMMIT-LOG-ALLO 9782.81 4399.53
2 Callback-Map-Re 9782.81 4399.53
2 ForkJoinPool.co 9782.81 4399.53
2 GossipTasks:1 9782.81 4399.53
2 HintsWriteExecu 9782.81 4399.53
2 IndexSummaryMan 9782.81 4399.53
2 MemtablePostFlu 9782.81 4399.53
2 MemtableReclaim 9782.81 4399.53
2 Messaging-Accep 9782.81 4399.53
2 MigrationStage: 9782.81 4399.53
2 NonPeriodicTask 9782.81 4399.53
2 OptionalTasks:1 9782.81 4399.53
2 PERIODIC-COMMIT 9782.81 4399.53
2 PendingRangeCal 9782.81 4399.53
2 Reference-Reape 9782.81 4399.53
2 ScheduledFastTa 9782.81 4399.53
2 ScheduledTasks: 9782.81 4399.53
2 SecondaryIndexM 9782.81 4399.53
2 SlabPoolCleaner 9782.81 4399.53
2 SnapshotCleanup 9782.81 4399.53
2 logback-1 9782.81 4399.53
2 logback-2 9782.81 4399.53
2 logback-3 9782.81 4399.53
2 logback-4 9782.81 4399.53
2 logback-5 9782.81 4399.53
2 logback-6 9782.81 4399.53
2 logback-7 9782.81 4399.53
2 logback-8 9782.81 4399.53
2 read-hotness-tr 9782.81 4399.53
89 StressMetrics 8901.83 6859.15
37 ObjectCleanerTh 8136.82 6729.24
37 JmxCollector:1 8136.80 6729.23
37 JmxCollector:10 8136.80 6729.23
37 JmxCollector:11 8136.80 6729.23
37 JmxCollector:12 8136.80 6729.23
37 JmxCollector:13 8136.80 6729.23
37 JmxCollector:14 8136.80 6729.23
37 JmxCollector:15 8136.80 6729.23
37 JmxCollector:16 8136.80 6729.23
37 JmxCollector:2 8136.80 6729.23
37 JmxCollector:3 8136.80 6729.23
37 JmxCollector:4 8136.80 6729.23
37 JmxCollector:5 8136.80 6729.23
37 JmxCollector:6 8136.80 6729.23
37 JmxCollector:7 8136.80 6729.23
37 JmxCollector:8 8136.80 6729.23
37 JmxCollector:9 8136.80 6729.23
37 Logging-Cleaner 8136.70 6729.17
37 Thread-0 8136.67 6729.17
37 Shutdown-checke 8136.66 6729.14
37 cluster1-connec 8136.66 6729.14
37 cluster1-timeou 8136.65 6729.14
37 cluster1-schedu 8132.43 6728.24
1 MutationStage-4 6138.06 2984.59
70 cluster1-worker 6065.67 4522.42
38 Thread-21 5089.49 4092.63
38 Thread-22 5089.49 4092.62
38 Thread-25 5089.47 4092.63
38 Thread-28 5089.47 4092.63
38 Thread-32 5089.47 4092.63
38 Thread-34 5089.47 4092.63
38 Thread-20 5089.47 4092.62
38 Thread-35 5089.46 4092.63
38 Thread-23 5089.46 4092.62
38 Thread-24 5089.46 4092.62
38 Thread-26 5089.45 4092.62
38 Thread-27 5089.45 4092.62
38 Thread-29 5089.45 4092.62
38 Thread-31 5089.45 4092.62
38 Thread-33 5089.45 4092.62
38 Thread-30 5089.44 4092.62
1 ReadStage-10 3644.75 1414.94
1 ReadStage-18 3644.75 1414.94
1 ReadStage-19 3644.75 1414.94
1 ReadStage-20 3644.75 1414.94
1 ReadStage-25 3644.75 1414.94
1 ReadStage-31 3644.75 1414.94
1 ReadStage-4 3644.75 1414.94
13 Thread-37 3287.03 2684.24
13 Thread-39 3287.03 2684.24
13 Thread-40 3287.03 2684.24
13 Thread-41 3287.03 2684.24
13 Thread-42 3287.03 2684.24
13 Thread-44 3287.03 2684.24
13 Thread-46 3287.03 2684.24
13 Thread-47 3287.03 2684.24
13 Thread-48 3287.03 2684.24
13 Thread-51 3287.03 2684.24
13 Thread-52 3287.03 2684.24
13 Thread-43 3287.03 2684.23
13 Thread-45 3287.03 2684.23
13 Thread-49 3287.03 2684.23
13 Thread-38 3287.02 2684.24
13 Thread-50 3287.02 2684.24
38 Thread-4 523.50 82.11
38 Thread-3 523.49 82.12
38 Thread-6 523.49 82.11
38 Thread-10 523.48 82.12
38 Thread-5 523.48 82.11
38 Thread-7 523.47 82.12
38 Thread-9 523.46 82.11
38 Thread-12 523.44 82.12
38 Thread-8 523.44 82.11
38 Thread-11 523.43 82.11
38 Thread-14 523.43 82.10
38 Thread-16 523.43 82.10
38 Thread-17 523.43 82.10
38 Thread-13 523.42 82.11
38 Thread-18 523.42 82.10
38 Thread-15 523.41 82.11
9 loadSavedCache: 51.25 25.02
31 clinfo 8.41 3.63
19 vulkaninfo 0.73 0.76
3 find 0.36 0.48
3 glxinfo:gdrv0 0.08 0.03
3 glxinfo:gl0 0.08 0.03
6 ldconfig.real 0.07 0.15
2 vulkani:disk$0 0.07 0.08
6 clang 0.06 0.06
1 llvmpipe-0 0.04 0.04
1 llvmpipe-1 0.04 0.04
1 llvmpipe-10 0.04 0.04
1 llvmpipe-11 0.04 0.04
1 llvmpipe-12 0.04 0.04
1 llvmpipe-13 0.04 0.04
1 llvmpipe-14 0.04 0.04
1 llvmpipe-15 0.04 0.04
1 llvmpipe-2 0.04 0.04
1 llvmpipe-3 0.04 0.04
1 llvmpipe-4 0.04 0.04
1 llvmpipe-5 0.04 0.04
1 llvmpipe-6 0.04 0.04
1 llvmpipe-7 0.04 0.04
1 llvmpipe-8 0.04 0.04
1 llvmpipe-9 0.04 0.04
1 glxinfo 0.04 0.01
1 glxinfo:cs0 0.04 0.01
1 glxinfo:disk$0 0.04 0.01
1 glxinfo:sh0 0.04 0.01
1 glxinfo:shlo0 0.04 0.01
518 C1 CompilerThre 0.00 307397.68
890 C2 CompilerThre 0.00 218273.37
87 GC Thread#0 0.00 17922.81
87 Reference Handl 0.00 17922.81
87 Service Thread 0.00 17922.81
87 Signal Dispatch 0.00 17922.81
87 VM Thread 0.00 17922.81
87 Sweeper thread 0.00 17922.80
42 GC Thread#1 0.00 17921.81
42 GC Thread#2 0.00 17921.81
42 GC Thread#3 0.00 17921.81
42 GC Thread#4 0.00 17921.81
42 GC Thread#5 0.00 17921.81
42 GC Thread#6 0.00 17921.81
42 GC Thread#7 0.00 17921.81
42 GC Thread#8 0.00 17921.81
42 GC Thread#9 0.00 17921.81
39 RMI Scheduler(0 0.00 17919.62
39 GC Thread#10 0.00 17919.61
39 GC Thread#11 0.00 17919.61
39 GC Thread#12 0.00 17919.61
2 CMS Thread#0 0.00 9782.81
2 CMS Thread#1 0.00 9782.81
2 CMS Thread#2 0.00 9782.81
2 CMS Thread#3 0.00 9782.81
85 G1 Conc#0 0.00 8140.00
85 G1 Refine#0 0.00 8139.88
37 G1 Conc#1 0.00 8136.82
37 G1 Conc#2 0.00 8136.82
37 RMI RenewClean- 0.00 8136.80
33 G1 Refine#1 0.00 7264.24
27 G1 Refine#2 0.00 5765.48
3 process reaper 0.00 373.36
1 G1 Refine#3 0.00 317.12
5 rm 0.00 1.25
113 cassandra 0.00 0.02
327 cassandra-stres 0.00 0.00
228 grep 0.00 0.00
135 awk 0.00 0.00
89 dirname 0.00 0.00
87 JMX client hear 0.00 0.00
85 G1 Main Marker 0.00 0.00
85 G1 Young RemSet 0.00 0.00
67 sh 0.00 0.00
44 cut 0.00 0.00
37 RMI GC Daemon 0.00 0.00
29 cat 0.00 0.00
19 sleep 0.00 0.00
12 expr 0.00 0.00
12 gcc 0.00 0.00
9 gsettings 0.00 0.00
8 systemd-detect- 0.00 0.00
8 which 0.00 0.00
7 stat 0.00 0.00
6 llvm-link 0.00 0.00
6 ls 0.00 0.00
6 tr 0.00 0.00
6 uname 0.00 0.00
5 bash 0.00 0.00
4 gmain 0.00 0.00
4 phoronix-test-s 0.00 0.00
4 sed 0.00 0.00
4 sort 0.00 0.00
3 free 0.00 0.00
3 getopt 0.00 0.00
3 head 0.00 0.00
3 mkdir 0.00 0.00
2 dconf worker 0.00 0.00
1 date 0.00 0.00
1 ifconfig 0.00 0.00
1 ip 0.00 0.00
1 lscpu 0.00 0.00
1 mktemp 0.00 0.00
1 ps 0.00 0.00
1 qdbus 0.00 0.00
1 readlink 0.00 0.00
1 realpath 0.00 0.00
1 stty 0.00 0.00
1 systemctl 0.00 0.00
1 template.sh 0.00 0.00
1 wc 0.00 0.00
1 xrandr 0.00 0.00
1 xset 0.00 0.00
238 processes running
248 maximum processes
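Many rows in the listing above differ only by a numeric suffix (Thread-21, GC Thread#0, JmxCollector:1), so it can be easier to read after grouping by name family. A rough sketch, assuming each row is a count, a name that may contain spaces, and two numeric columns. The meaning of those two columns is not stated here, so they are kept as opaque values. `parse_row`, `family` and `counts_by_family` are hypothetical helpers, and the summary lines ("9103 processes" and similar) are not in this row format and are not handled.

```python
import re
from collections import Counter

# A few rows copied from the listing above.
SAMPLE = """\
85 JMX server conn 1369594.41 0.00
1184 cluster1-nio-wo 260370.83 215330.45
8 MutationStage-2 39131.24 17598.12
5 MutationStage-1 23210.37 10214.00
87 GC Thread#0 0.00 17922.81
42 GC Thread#1 0.00 17921.81
"""


def parse_row(line):
    """Split a row into (count, name, col_a, col_b)."""
    count, rest = line.split(None, 1)
    # The name may contain spaces, so peel the two numbers off the right.
    name, col_a, col_b = rest.rsplit(None, 2)
    return int(count), name, float(col_a), float(col_b)


def family(name):
    """Drop a trailing numeric suffix such as -2, #0 or :11."""
    return re.sub(r"[-#:]?\d+$", "", name)


def counts_by_family(text):
    """Sum the first column per name family."""
    totals = Counter()
    for line in text.splitlines():
        if not line.strip():
            continue
        count, name, _, _ = parse_row(line)
        totals[family(name)] += count
    return totals


totals = counts_by_family(SAMPLE)
for name, count in totals.most_common():
    print(f"{count:6d} {name}")
```

Because names are truncated, two different threads can collapse into the same family, so the grouped counts are only a reading aid.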
