Although the binding is in Java, the underlying routines are still in C, so the profile is not much different from Scimark. There are six workloads, all of them single-threaded.

The topdown profile shows differences among the tests. The CPU version has one test with a high branch misprediction rate, but this appears to be dampened in the Java version.

The AMD metrics confirm the single-threaded nature: on_cpu is 0.048, which corresponds to about 0.77 of one core on a 16-core machine.

elapsed              101.740
on_cpu               0.048          # 0.77 / 16 cores
utime                77.442
stime                1.123
nvcsw                6061           # 91.45%
nivcsw               567            # 8.55%
inblock              0              # 0.00/sec
onblock              12952          # 127.31/sec
cpu-clock            78566629032    # 78.567 seconds
task-clock           78574938696    # 78.575 seconds
page faults          243844         # 3103.330/sec
context switches     6966           # 88.654/sec
cpu migrations       323            # 4.111/sec
major page faults    2              # 0.025/sec
minor page faults    243835         # 3103.216/sec
alignment faults     0              # 0.000/sec
emulation faults     0              # 0.000/sec
branches             143175014666   # 150.673 branches per 1000 inst
branch misses        369747747      # 0.26% branch miss
conditional          137701502280   # 144.912 conditional branches per 1000 inst
indirect             290449316      # 0.306 indirect branches per 1000 inst
cpu-cycles           357048247322   # 0.22 GHz
instructions         947412030051   # 2.65 IPC
slots                716557155222   #
retiring             294626599862   # 41.1% (41.1%)
-- ucode             22043579       #     0.0%
-- fastpath          294604556283   #    41.1%
frontend             21243854500    #  3.0% ( 3.0%) low
-- latency           12986602212    #     1.8%
-- bandwidth         8257252288     #     1.2%
backend              391269290765   # 54.6% (54.6%)
-- cpu               208947620712   #    29.2%
-- memory            182321670053   #    25.4%
speculation          9364462889     #  1.3% ( 1.3%)
-- branch mispredict 8821180342     #     1.2%
-- pipeline restart  543282547      #     0.1%
smt-contention       52652075       #  0.0% ( 0.0%)
cpu-cycles           355730677372   # 0.22 GHz
instructions         948230791406   # 2.67 IPC
instructions         316779639994   # 59.416 l2 access per 1000 inst
l2 hit from l1       10828793760    # 38.61% l2 miss
l2 miss from l1      2792307207     #
l2 hit from l2 pf    3517655294     #
l3 hit from l2 pf    4354421742     #
l3 miss from l2 pf   120933830      #
instructions         316604203686   # 99.215 float per 1000 inst
float 512            50             # 0.000 AVX-512 per 1000 inst
float 256            648            # 0.000 AVX-256 per 1000 inst
float 128            31412029930    # 99.215 AVX-128 per 1000 inst
float MMX            0              # 0.000 MMX per 1000 inst
float scalar         28             # 0.000 scalar per 1000 inst
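
The derived values in the comment column can be reproduced from the raw counters. The following is a minimal sketch, assuming that on_cpu is (utime + stime) / elapsed / cores and that IPC is instructions / cpu-cycles. These formulas are inferred from the numbers above rather than taken from the profiling tool's source, and the function names are hypothetical.

```python
def on_cpu(utime, stime, elapsed, cores):
    """Fraction of total machine capacity in use (assumed formula)."""
    return (utime + stime) / elapsed / cores


def ipc(instructions, cycles):
    """Instructions retired per CPU cycle."""
    return instructions / cycles


# Raw values copied from the AMD run above.
amd_busy = on_cpu(utime=77.442, stime=1.123, elapsed=101.740, cores=16)
amd_ipc = ipc(instructions=947412030051, cycles=357048247322)

# amd_busy * 16 is the average number of busy cores; a value below 1.0
# is what a single-threaded workload is expected to produce.
print(f"on_cpu {amd_busy:.3f}  busy cores {amd_busy * 16:.2f}  IPC {amd_ipc:.2f}")
```

A busy-core figure below 1.0 is consistent with one compute thread plus a small amount of JVM housekeeping.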

The Intel metrics show a similar single-threaded profile (on_cpu is 0.054, or 0.86 of one core):

elapsed              402.843
on_cpu               0.054          # 0.86 / 16 cores
utime                344.403
stime                1.434
nvcsw                18472          # 91.84%
nivcsw               1642           # 8.16%
inblock              60192          # 149.42/sec
onblock              4368           # 10.84/sec
cpu-clock            345661731035   # 345.662 seconds
task-clock           345681034487   # 345.681 seconds
page faults          515263         # 1490.574/sec
context switches     21940          # 63.469/sec
cpu migrations       869            # 2.514/sec
major page faults    403            # 1.166/sec
minor page faults    514835         # 1489.335/sec
alignment faults     0              # 0.000/sec
emulation faults     0              # 0.000/sec
branches             455720412495   # 148.354 branches per 1000 inst
branch misses        707526994      # 0.16% branch miss
conditional          455720433455   # 148.354 conditional branches per 1000 inst
indirect             916935931      # 0.298 indirect branches per 1000 inst
slots                1887837681326  #
retiring             733344951086   # 38.8% (38.8%)
-- ucode             44537182261    #     2.4%
-- fastpath          688807768825   #    36.5%
frontend             57289147523    #  3.0% ( 3.0%) low
-- latency           20966437018    #     1.1%
-- bandwidth         36322710505    #     1.9%
backend              1067027486646  # 56.5% (56.5%)
-- cpu               622814566378   #    33.0%
-- memory            444212920268   #    23.5%
speculation          18286255717    #  1.0% ( 1.0%) low
-- branch mispredict 18047855058    #     1.0%
-- pipeline restart  238400659      #     0.0%
smt-contention       0              #  0.0% ( 0.0%)
cpu-cycles           313647299514   # 0.17 GHz
instructions         767158407693   # 2.45 IPC
l2 access            37411980220    # 48.805 l2 access per 1000 inst
l2 miss              22747730612    # 60.80% l2 miss
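
The topdown percentages and the L2 miss rate are ratios of the raw counts. The sketch below recomputes them for the Intel run, assuming each topdown category is divided by the total slot count and that the L2 miss rate is l2 miss / l2 access. The helper name `share` is hypothetical and the formulas are inferred from the printed values.

```python
def share(count, total):
    """Return count as a percentage of total."""
    return 100.0 * count / total


# Raw values copied from the Intel run above.
slots = 1887837681326
topdown = {
    "retiring": 733344951086,
    "frontend": 57289147523,
    "backend": 1067027486646,
    "speculation": 18286255717,
}

# Percentage of pipeline slots attributed to each topdown category.
topdown_pct = {name: share(count, slots) for name, count in topdown.items()}
for name, pct in topdown_pct.items():
    print(f"{name:12s} {pct:5.1f}%")

# L2 miss rate as a percentage of L2 accesses.
l2_miss_pct = share(22747730612, 37411980220)
print(f"l2 miss      {l2_miss_pct:5.2f}%")
```

The four categories do not sum to exactly 100% of the slots in this run, so the sketch does not normalize them.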

The process profile shows some of the JVM threads, such as the garbage collection and JIT compiler threads:

434 processes
	  8 java                   153.24     0.36
	  4 Finalizer               76.62     0.18
	  3 Common-Cleaner          76.59     0.17
	 68 clinfo                  20.52     5.98
	 38 vulkaninfo               1.14     1.52
	  4 vulkani:disk$0           0.12     0.16
	  6 php                      0.08     0.06
	  2 llvmpipe-0               0.06     0.08
	  2 llvmpipe-1               0.06     0.08
	  2 llvmpipe-10              0.06     0.08
	  2 llvmpipe-11              0.06     0.08
	  2 llvmpipe-12              0.06     0.08
	  2 llvmpipe-13              0.06     0.08
	  2 llvmpipe-14              0.06     0.08
	  2 llvmpipe-15              0.06     0.08
	  2 llvmpipe-2               0.06     0.08
	  2 llvmpipe-3               0.06     0.08
	  2 llvmpipe-4               0.06     0.08
	  2 llvmpipe-5               0.06     0.08
	  2 llvmpipe-6               0.06     0.08
	  2 llvmpipe-7               0.06     0.08
	  2 llvmpipe-8               0.06     0.08
	  2 llvmpipe-9               0.06     0.08
	  6 clang                    0.05     0.07
	  3 rocminfo                 0.03     0.00
	 13 C1 CompilerThre          0.00   186.19
	  4 C2 CompilerThre          0.00    76.62
	  4 G1 Conc#0                0.00    76.62
	  4 GC Thread#0              0.00    76.62
	  4 Reference Handl          0.00    76.62
	  4 Service Thread           0.00    76.62
	  4 Signal Dispatch          0.00    76.62
	  4 Sweeper thread           0.00    76.62
	  4 G1 Refine#0              0.00    76.61
	  4 VM Thread                0.00    76.61
	  3 GC Thread#1              0.00    76.59
	  3 GC Thread#2              0.00    76.59
	  3 GC Thread#3              0.00    76.59
	  3 GC Thread#4              0.00    76.59
	  3 GC Thread#5              0.00    76.59
	  3 GC Thread#6              0.00    76.59
	  3 GC Thread#7              0.00    76.59
	  3 GC Thread#8              0.00    76.59
	  3 GC Thread#9              0.00    76.59
	  1 lspci                    0.00     0.03
	  1 ps                       0.00     0.01
	 83 sh                       0.00     0.00
	 13 gsettings                0.00     0.00
	 12 gcc                      0.00     0.00
	  8 stat                     0.00     0.00
	  8 systemd-detect-          0.00     0.00
	  6 llvm-link                0.00     0.00
	  5 phoronix-test-s          0.00     0.00
	  4 G1 Main Marker           0.00     0.00
	  4 G1 Young RemSet          0.00     0.00
	  4 VM Periodic Tas          0.00     0.00
	  4 glxinfo                  0.00     0.00
	  3 java-scimark2            0.00     0.00
	  2 gmain                    0.00     0.00
	  2 lscpu                    0.00     0.00
	  2 setterm                  0.00     0.00
	  2 uname                    0.00     0.00
	  2 which                    0.00     0.00
	  1 cc                       0.00     0.00
	  1 date                     0.00     0.00
	  1 dconf worker             0.00     0.00
	  1 dirname                  0.00     0.00
	  1 dmesg                    0.00     0.00
	  1 dmidecode                0.00     0.00
	  1 grep                     0.00     0.00
	  1 ifconfig                 0.00     0.00
	  1 ip                       0.00     0.00
	  1 lsmod                    0.00     0.00
	  1 mktemp                   0.00     0.00
	  1 qdbus                    0.00     0.00
	  1 readlink                 0.00     0.00
	  1 realpath                 0.00     0.00
	  1 sed                      0.00     0.00
	  1 sort                     0.00     0.00
	  1 stty                     0.00     0.00
	  1 systemctl                0.00     0.00
	  1 template.sh              0.00     0.00
	  1 wc                       0.00     0.00
0 processes running
47 maximum processes

The process tree for one computation block:

      1804315) java-scimark2    cpu=10 start=34.81 finish=60.23
        1804316) java             cpu=4 start=34.81 finish=60.22
          1804317) java             cpu=5 start=34.81 finish=60.22
            1804318) GC Thread#0      cpu=-1 start=34.81 finish=60.22
            1804319) G1 Main Marker   cpu=0 start=34.82 finish=60.20
            1804320) G1 Conc#0        cpu=-1 start=34.82 finish=60.22
            1804321) G1 Refine#0      cpu=-1 start=34.82 finish=60.20
            1804322) G1 Young RemSet  cpu=0 start=34.82 finish=60.20
            1804323) VM Thread        cpu=-1 start=34.83 finish=60.22
              1804335) GC Thread#1      cpu=-1 start=54.74 finish=60.22
              1804336) GC Thread#2      cpu=-1 start=54.74 finish=60.22
              1804337) GC Thread#3      cpu=-1 start=54.74 finish=60.22
              1804338) GC Thread#4      cpu=-1 start=54.74 finish=60.22
              1804339) GC Thread#5      cpu=-1 start=54.74 finish=60.22
              1804340) GC Thread#6      cpu=-1 start=54.74 finish=60.22
              1804341) GC Thread#7      cpu=-1 start=54.74 finish=60.22
              1804342) GC Thread#8      cpu=-1 start=54.74 finish=60.22
              1804343) GC Thread#9      cpu=-1 start=54.74 finish=60.22
            1804324) Reference Handl  cpu=-1 start=34.83 finish=60.22
            1804325) Finalizer        cpu=15 start=34.83 finish=60.22
            1804326) Signal Dispatch  cpu=-1 start=34.83 finish=60.22
            1804327) Service Thread   cpu=-1 start=34.83 finish=60.22
            1804328) C2 CompilerThre  cpu=-1 start=34.83 finish=60.22
              1804331) C1 CompilerThre  cpu=-1 start=34.84 finish=45.13
                1804334) C1 CompilerThre  cpu=-1 start=34.86 finish=35.79
            1804329) C1 CompilerThre  cpu=-1 start=34.83 finish=60.22
              1804344) C1 CompilerThre  cpu=-1 start=60.20 finish=60.22
            1804330) Sweeper thread   cpu=-1 start=34.83 finish=60.22
            1804332) VM Periodic Tas  cpu=0 start=34.84 finish=60.20
            1804333) Common-Cleaner   cpu=13 start=34.84 finish=60.22
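
Thread lifetimes can be read off the tree by subtracting start from finish. The sketch below parses a few lines copied from the tree above; the regular expression and the function name `parse_tree` are assumptions about the line layout, not part of the profiling tool.

```python
import re

# Assumed layout: "<pid>) <name> cpu=<n> start=<sec> finish=<sec>"
LINE = re.compile(
    r"(\d+)\)\s+(.+?)\s+cpu=(-?\d+)\s+start=([\d.]+)\s+finish=([\d.]+)"
)


def parse_tree(text):
    """Return (pid, name, start, finish) for every matching line."""
    rows = []
    for line in text.splitlines():
        m = LINE.search(line)
        if m:
            rows.append((int(m.group(1)), m.group(2),
                         float(m.group(4)), float(m.group(5))))
    return rows


sample = """
    1804318) GC Thread#0      cpu=-1 start=34.81 finish=60.22
      1804335) GC Thread#1      cpu=-1 start=54.74 finish=60.22
      1804331) C1 CompilerThre  cpu=-1 start=34.84 finish=45.13
        1804334) C1 CompilerThre  cpu=-1 start=34.86 finish=35.79
"""

# Lifetime in seconds, rounded to the precision of the input.
lifetimes = [(name, round(finish - start, 2))
             for _pid, name, start, finish in parse_tree(sample)]
for name, seconds in lifetimes:
    print(f"{name:16s} {seconds:6.2f} s")
```

In this block the additional GC threads (GC Thread#1 through GC Thread#9) only start at 54.74, roughly 20 seconds after the JVM itself starts at 34.81.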