A test of building the Node.js JavaScript runtime. This build runs longer than several of the “build-*” workloads, though not quite as long as build-gcc or build-llvm. As with the other build workloads, there is a high number of processes and a high number of frontend stalls. In contrast to some of the others, backend stalls are somewhat higher. The workload has “cleanup” phases before each run that also appear to be reflected in the compilation profile. Compilation looks mostly parallel, with one serializing step (a link?) about halfway through and greater serialization towards the end of the workload.

The topdown profile shows the same general periods of higher frontend stalls. The “cleanup” phase before each run also appears to have a different profile.

AMD metrics show that roughly 1 in 5 instructions is a branch, with little floating-point code.
elapsed 2144.566
on_cpu 0.941 # 15.05 / 16 cores
utime 30188.340
stime 2091.837
nvcsw 807175 # 44.88%
nivcsw 991151 # 55.12%
inblock 0 # 0.00/sec
onblock 12258648 # 5716.14/sec
cpu-clock 32279798809815 # 32279.799 seconds
task-clock 32280195435292 # 32280.195 seconds
page faults 642525533 # 19904.636/sec
context switches 1568781 # 48.599/sec
cpu migrations 73009 # 2.262/sec
major page faults 1676 # 0.052/sec
minor page faults 642523857 # 19904.584/sec
alignment faults 0 # 0.000/sec
emulation faults 0 # 0.000/sec
branches 21649570910307 # 211.978 branches per 1000 inst
branch misses 542079184612 # 2.50% branch miss
conditional 16952172427646 # 165.984 conditional branches per 1000 inst
indirect 430406042887 # 4.214 indirect branches per 1000 inst
cpu-cycles 133548357863790 # 3.89 GHz
instructions 101428319812801 # 0.76 IPC
slots 268650189068820 #
retiring 32676831786963 # 12.2% (16.3%)
-- ucode 21498857071 # 0.0%
-- fastpath 32655332929892 # 12.2%
frontend 69755359369146 # 26.0% (34.8%)
-- latency 52488811107576 # 19.5%
-- bandwidth 17266548261570 # 6.4%
backend 92552250002478 # 34.5% (46.1%)
-- cpu 6556257702259 # 2.4%
-- memory 85995992300219 # 32.0%
speculation 5716339764160 # 2.1% ( 2.8%)
-- branch mispredict 5651531115706 # 2.1%
-- pipeline restart 64808648454 # 0.0%
smt-contention 67949097952459 # 25.3% ( 0.0%)
cpu-cycles 133519263543635 # 3.88 GHz
instructions 101430105523798 # 0.76 IPC
instructions 33965954644499 # 69.306 l2 access per 1000 inst
l2 hit from l1 1830068894706 # 22.58% l2 miss
l2 miss from l1 323908834364 #
l2 hit from l2 pf 316444081333 #
l3 hit from l2 pf 101163396524 #
l3 miss from l2 pf 106365283407 #
instructions 33946845258090 # 15.153 float per 1000 inst
float 512 81194 # 0.000 AVX-512 per 1000 inst
float 256 2345789 # 0.000 AVX-256 per 1000 inst
float 128 514401099938 # 15.153 AVX-128 per 1000 inst
float MMX 0 # 0.000 MMX per 1000 inst
float scalar 8 # 0.000 scalar per 1000 inst
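The two percentage columns in the AMD topdown rows can be reproduced from the raw slot counts. The sketch below assumes that the first percentage is the share of all slots and that the parenthesized percentage is the share of slots left after removing smt-contention. That reading is inferred from the numbers, not taken from the profiling tool's documentation, and the function name `topdown_shares` is hypothetical.

```python
# Slot counts copied from the AMD topdown rows above.
SLOTS = 268650189068820
CATEGORIES = {
    "retiring": 32676831786963,
    "frontend": 69755359369146,
    "backend": 92552250002478,
    "speculation": 5716339764160,
    "smt-contention": 67949097952459,
}


def topdown_shares(slots, categories, exclude="smt-contention"):
    """Return {name: (percent of all slots, percent of slots outside `exclude`)}.

    Assumption: the parenthesized column renormalizes each category over
    the slots that remain once the `exclude` category is removed.
    """
    usable = slots - categories.get(exclude, 0)
    shares = {}
    for name, count in categories.items():
        raw = 100.0 * count / slots
        adjusted = 0.0 if name == exclude else 100.0 * count / usable
        shares[name] = (round(raw, 1), round(adjusted, 1))
    return shares


shares = topdown_shares(SLOTS, CATEGORIES)
for name, (raw, adjusted) in shares.items():
    print(f"{name:15s} {raw:5.1f}% ({adjusted:5.1f}%)")
```

Under that assumption the output matches the listing: backend comes out at 34.5% of all slots and 46.1% of the slots not lost to SMT contention, so backend (mostly memory) stalls are the largest category on the AMD system either way.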
Intel metrics
elapsed 2512.218
on_cpu 0.945 # 15.12 / 16 cores
utime 36117.910
stime 1854.539
nvcsw 878304 # 45.90%
nivcsw 1035202 # 54.10%
inblock 102096 # 40.64/sec
onblock 12247440 # 4875.15/sec
cpu-clock 37972118857909 # 37972.119 seconds
task-clock 37972545348380 # 37972.545 seconds
page faults 642330574 # 16915.658/sec
context switches 1691066 # 44.534/sec
cpu migrations 84049 # 2.213/sec
major page faults 1076 # 0.028/sec
minor page faults 642329498 # 16915.629/sec
alignment faults 0 # 0.000/sec
emulation faults 0 # 0.000/sec
branches 21432898419363 # 210.396 branches per 1000 inst
branch misses 411236524227 # 1.92% branch miss
conditional 21432906380163 # 210.396 conditional branches per 1000 inst
indirect 3716689790845 # 36.485 indirect branches per 1000 inst
slots 182247267068792 #
retiring 54162825882165 # 29.7% (29.7%)
-- ucode 4366997381232 # 2.4%
-- fastpath 49795828500933 # 27.3%
frontend 55027280352127 # 30.2% (30.2%)
-- latency 33598971511576 # 18.4%
-- bandwidth 21428308840551 # 11.8%
backend 56316877227342 # 30.9% (30.9%)
-- cpu 12654035624989 # 6.9%
-- memory 43662841602353 # 24.0%
speculation 17013502249974 # 9.3% ( 9.3%)
-- branch mispredict 16365820969817 # 9.0%
-- pipeline restart 647681280157 # 0.4%
smt-contention 0 # 0.0% ( 0.0%)
cpu-cycles 81430819711470 # 2.03 GHz
instructions 79248849727315 # 0.97 IPC
l2 access 3923871673155 # 68.114 l2 access per 1000 inst
l2 miss 1327323041163 # 33.83% l2 miss
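The on_cpu rows in both listings are consistent with dividing total CPU time (utime + stime) by elapsed time and then by the 16 cores. The sketch below checks that reading against the reported values. The formula is an inference from the numbers rather than something the tool states, and the helper name `on_cpu` is hypothetical.

```python
def on_cpu(utime, stime, elapsed, cores=16):
    """Return (average busy cores, fraction of all cores busy).

    Assumption: on_cpu = (utime + stime) / elapsed / cores.
    """
    busy_cores = (utime + stime) / elapsed
    return busy_cores, busy_cores / cores


# utime, stime and elapsed copied from the AMD and Intel listings above.
amd = on_cpu(30188.340, 2091.837, 2144.566)
intel = on_cpu(36117.910, 1854.539, 2512.218)

for label, (busy, fraction) in (("AMD", amd), ("Intel", intel)):
    print(f"{label:5s} {fraction:.3f} # {busy:.2f} / 16 cores")
```

Both results round to the reported figures (0.941 with 15.05 cores for AMD, 0.945 with 15.12 cores for Intel), which fits the description of a mostly parallel build with only short serialized stretches.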
The process overview suggests that ~2/3 of the compiled files are C++ and ~1/3 are C, but that far more time is spent in the C++ compilation. Overall, the workload creates about a quarter million processes.
248492 processes
7231 cc1plus 29340.31 1605.19
3882 cc1 464.31 56.24
36 ld 23.74 9.48
11158 as 23.09 1.92
65 clinfo 18.13 6.67
6 make 11.97 18.56
3 torque 10.78 0.23
3 xz 4.32 0.43
37 python 4.17 0.46
36 python3.10 3.14 0.24
18 node_mksnapshot 3.12 0.54
3 mksnapshot 2.86 0.23
117 ar 1.70 1.71
38 vulkaninfo 1.13 1.52
3 genccode 0.49 0.29
6 php 0.21 0.53
6 glxinfo:gdrv0 0.19 0.07
3 tar 0.12 2.75
4 vulkani:disk$0 0.12 0.16
2 glxinfo 0.09 0.03
2 glxinfo:cs0 0.09 0.03
2 glxinfo:disk$0 0.09 0.03
2 glxinfo:sh0 0.09 0.03
2 glxinfo:shlo0 0.09 0.03
2 llvmpipe-0 0.06 0.08
2 llvmpipe-1 0.06 0.08
2 llvmpipe-10 0.06 0.08
2 llvmpipe-11 0.06 0.08
2 llvmpipe-12 0.06 0.08
2 llvmpipe-13 0.06 0.08
2 llvmpipe-14 0.06 0.08
2 llvmpipe-15 0.06 0.08
2 llvmpipe-2 0.06 0.08
2 llvmpipe-3 0.06 0.08
2 llvmpipe-4 0.06 0.08
2 llvmpipe-5 0.06 0.08
2 llvmpipe-6 0.06 0.08
2 llvmpipe-7 0.06 0.08
2 llvmpipe-8 0.06 0.08
2 llvmpipe-9 0.06 0.08
11325 rm 0.05 2.43
3 icupkg 0.05 0.19
6 clang 0.05 0.07
45 V8 DefaultWorke 0.00 42.90
3 cp 0.00 0.11
3942 cc 0.00 0.03
3 find 0.00 0.03
3 rocminfo 0.00 0.03
1 lspci 0.00 0.03
1 ps 0.00 0.01
123565 sh 0.00 0.00
33622 sed 0.00 0.00
11553 mkdir 0.00 0.00
11376 printf 0.00 0.00
11296 touch 0.00 0.00
11208 grep 0.00 0.00
7263 g++ 0.00 0.00
117 dirname 0.00 0.00
36 collect2 0.00 0.00
22 gcc 0.00 0.00
17 uname 0.00 0.00
15 tr 0.00 0.00
10 gsettings 0.00 0.00
8 stat 0.00 0.00
8 systemd-detect- 0.00 0.00
6 llvm-link 0.00 0.00
5 phoronix-test-s 0.00 0.00
4 bash 0.00 0.00
3 build-nodejs 0.00 0.00
3 bytecode_builti 0.00 0.00
3 dconf worker 0.00 0.00
3 gen-regexp-spec 0.00 0.00
3 gmain 0.00 0.00
3 ld.gold 0.00 0.00
3 ln 0.00 0.00
2 lscpu 0.00 0.00
2 which 0.00 0.00
2 xset 0.00 0.00
1 date 0.00 0.00
1 dmesg 0.00 0.00
1 dmidecode 0.00 0.00
1 ifconfig 0.00 0.00
1 ip 0.00 0.00
1 lsmod 0.00 0.00
1 mktemp 0.00 0.00
1 python3 0.00 0.00
1 qdbus 0.00 0.00
1 readlink 0.00 0.00
1 realpath 0.00 0.00
1 sort 0.00 0.00
1 stty 0.00 0.00
1 systemctl 0.00 0.00
1 template.sh 0.00 0.00
1 wc 0.00 0.00
1 xrandr 0.00 0.00
189 processes running
236 maximum processes
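The C++ versus C split can be checked directly from the process table. The sketch below parses a few rows copied from the listing and compares cc1plus with cc1 by process count and by time. It assumes the two numeric columns after the name are user and system CPU seconds, which the listing does not label, and the helper name `parse_row` is hypothetical.

```python
# A few rows copied from the process table above: count, name, two time columns.
ROWS = """\
7231 cc1plus 29340.31 1605.19
3882 cc1 464.31 56.24
36 ld 23.74 9.48
11158 as 23.09 1.92
45 V8 DefaultWorke 0.00 42.90
"""


def parse_row(line):
    """Split one row into (name, count, time_a, time_b).

    Names may contain spaces, so the count is taken from the left and the
    two time columns from the right. Assumption: the time columns are
    user and system CPU seconds.
    """
    count, rest = line.split(None, 1)
    name, time_a, time_b = rest.rsplit(None, 2)
    return name, int(count), float(time_a), float(time_b)


table = {name: (count, a, b) for name, count, a, b in map(parse_row, ROWS.splitlines())}

cxx_count, cxx_user, cxx_sys = table["cc1plus"]
c_count, c_user, c_sys = table["cc1"]

# Share of compiler invocations that are C++.
file_share = 100.0 * cxx_count / (cxx_count + c_count)
# Share of compiler CPU time (both time columns summed) that is C++.
cxx_time = cxx_user + cxx_sys
time_share = 100.0 * cxx_time / (cxx_time + c_user + c_sys)

print(f"C++ share of compiler processes: {file_share:.1f}%")
print(f"C++ share of compiler time:      {time_share:.1f}%")
```

With these rows, cc1plus accounts for about 65% of the compiler processes but about 98% of the compiler time, which is the imbalance the overview describes.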
