3D CAD modeler program with five different workloads. From the profile, the workloads look to be single-threaded, with something extra running during the first one. The relative times also stand out: the first workload takes almost as much time as the other four combined.

The topdown profile shows a high retirement rate (about 57% of slots on both systems) and a mix of frontend and backend stalls.

The AMD metrics reflect a low on-CPU percentage (about 0.8 of one core busy on average, out of 16), roughly 1 in 5 instructions being a branch, and very little floating point.

elapsed              422.749
on_cpu               0.050          # 0.80 / 16 cores
utime                332.058
stime                4.627
nvcsw                23009          # 92.58%
nivcsw               1844           # 7.42%
inblock              264            # 0.62/sec
onblock              14608          # 34.55/sec
cpu-clock            336740720119   # 336.741 seconds
task-clock           336754926269   # 336.755 seconds
page faults          1910499        # 5673.262/sec
context switches     26752          # 79.441/sec
cpu migrations       638            # 1.895/sec
major page faults    4              # 0.012/sec
minor page faults    1910495        # 5673.250/sec
alignment faults     0              # 0.000/sec
emulation faults     0              # 0.000/sec
branches             1031901503702  # 195.758 branches per 1000 inst
branch misses        4885209147     # 0.47% branch miss
conditional          681508801840   # 129.287 conditional branches per 1000 inst
indirect             68784371657    # 13.049 indirect branches per 1000 inst
cpu-cycles           1547029539906  # 0.23 GHz
instructions         5265953090888  # 3.40 IPC
slots                3098409161730  #
retiring             1761428833226  # 56.8% (56.9%)
-- ucode             3746874236     #     0.1%
-- fastpath          1757681958990  #    56.7%
frontend             676634274863   # 21.8% (21.8%)
-- latency           350628159138   #    11.3%
-- bandwidth         326006115725   #    10.5%
backend              535081034943   # 17.3% (17.3%)
-- cpu               77646362712    #     2.5%
-- memory            457434672231   #    14.8%
speculation          124848917485   #  4.0% ( 4.0%)
-- branch mispredict 119125690073   #     3.8%
-- pipeline restart  5723227412     #     0.2%
smt-contention       415746112      #  0.0% ( 0.0%)
cpu-cycles           1837831312838  # 0.21 GHz
instructions         6327727125019  # 3.44 IPC
instructions         2111272455778  # 5.506 l2 access per 1000 inst
l2 hit from l1       10450947730    # 12.78% l2 miss
l2 miss from l1      878626140      #
l2 hit from l2 pf    565877018      #
l3 hit from l2 pf    424006556      #
l3 miss from l2 pf   183262736      #
instructions         2111120557854  # 4.827 float per 1000 inst
float 512            89             # 0.000 AVX-512 per 1000 inst
float 256            590            # 0.000 AVX-256 per 1000 inst
float 128            10190850857    # 4.827 AVX-128 per 1000 inst
float MMX            0              # 0.000 MMX per 1000 inst
float scalar         1              # 0.000 scalar per 1000 inst
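
The derived figures in the comment column can be reproduced from the raw counters. The following is a minimal sketch, assuming on_cpu is (utime + stime) / elapsed divided by the core count and that each topdown share is the category count divided by slots. The helper names `cores_busy` and `share` are made up for illustration and are not part of the profiling tool.

```python
# Raw counters copied from the AMD listing above.
elapsed = 422.749   # wall-clock seconds
utime = 332.058     # user CPU seconds
stime = 4.627       # system CPU seconds
ncores = 16

nvcsw, nivcsw = 23009, 1844   # voluntary / involuntary context switches

slots = 3098409161730
topdown = {
    "retiring": 1761428833226,
    "frontend": 676634274863,
    "backend": 535081034943,
    "speculation": 124848917485,
}


def cores_busy(utime, stime, elapsed):
    """Average number of cores kept busy (assumed formula)."""
    return (utime + stime) / elapsed


def share(part, whole):
    """Percentage of `whole` represented by `part`."""
    return 100.0 * part / whole


busy = cores_busy(utime, stime, elapsed)
print(f"cores busy      {busy:.2f}")
print(f"on_cpu          {busy / ncores:.3f}")
print(f"voluntary csw   {share(nvcsw, nvcsw + nivcsw):.2f}%")
for name, count in topdown.items():
    print(f"{name:<15} {share(count, slots):.1f}%")
```

Under these assumptions the results agree with the listing: about 0.80 cores busy, an on_cpu of 0.050, and 92.58% voluntary context switches.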

Intel metrics

elapsed              484.067
on_cpu               0.051          # 0.82 / 16 cores
utime                393.998
stime                3.122
nvcsw                8520           # 78.47%
nivcsw               2337           # 21.53%
inblock              66600          # 137.58/sec
onblock              3424           # 7.07/sec
cpu-clock            397153907639   # 397.154 seconds
task-clock           397165603053   # 397.166 seconds
page faults          1847986        # 4652.936/sec
context switches     13066          # 32.898/sec
cpu migrations       526            # 1.324/sec
major page faults    453            # 1.141/sec
minor page faults    1847533        # 4651.795/sec
alignment faults     0              # 0.000/sec
emulation faults     0              # 0.000/sec
branches             1031007465039  # 195.753 branches per 1000 inst
branch misses        4322959209     # 0.42% branch miss
conditional          1031007485583  # 195.753 conditional branches per 1000 inst
indirect             68768741049    # 13.057 indirect branches per 1000 inst
slots                8979723509168  #
retiring             5073544138778  # 56.5% (56.5%)
-- ucode             293115212521   #     3.3%
-- fastpath          4780428926257  #    53.2%
frontend             2583017982853  # 28.8% (28.8%)
-- latency           1148193498713  #    12.8%
-- bandwidth         1434824484140  #    16.0%
backend              738150744894   #  8.2% ( 8.2%)
-- cpu               365619503836   #     4.1%
-- memory            372531241058   #     4.1%
speculation          747395099127   #  8.3% ( 8.3%)
-- branch mispredict 669939473186   #     7.5%
-- pipeline restart  77455625941    #     0.9%
smt-contention       0              #  0.0% ( 0.0%)
cpu-cycles           1496122737534  # 0.19 GHz
instructions         5266572245280  # 3.52 IPC
l2 access            31350249974    # 5.953 l2 access per 1000 inst
l2 miss              7792586637     # 24.86% l2 miss
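
The ratio metrics in the Intel listing follow from simple divisions of the raw counts. This is a small sketch, assuming IPC is instructions / cycles, that rates are expressed per 1000 instructions, and that the L2 miss rate is misses / accesses. The function names are hypothetical.

```python
# Raw counters copied from the Intel listing above.
cpu_cycles = 1496122737534
instructions = 5266572245280
l2_access = 31350249974
l2_miss = 7792586637


def ipc(instructions, cycles):
    """Instructions retired per cycle."""
    return instructions / cycles


def per_kilo_inst(count, instructions):
    """Event count per 1000 instructions."""
    return 1000.0 * count / instructions


def miss_rate(misses, accesses):
    """Misses as a percentage of accesses."""
    return 100.0 * misses / accesses


print(f"IPC                 {ipc(instructions, cpu_cycles):.2f}")
print(f"l2 access / 1k inst {per_kilo_inst(l2_access, instructions):.3f}")
print(f"l2 miss rate        {miss_rate(l2_miss, l2_access):.2f}%")
```

These three values match the comment column of the Intel listing (3.52 IPC, 5.953 L2 accesses per 1000 instructions, 24.86% L2 miss).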

The process overview shows that processes are given logical names such as disk (for example, opensca:disk$0).

490 processes
	 45 openscad               330.67     3.16
	 15 openscad:cs0           330.63     3.16
	 15 opensca:disk$0         330.62     3.16
	 15 openscad:sh0           330.62     3.16
	 15 openscad:gdrv0         330.61     3.16
	 15 openscad:gl0           330.61     3.16
	 15 openscad:shlo0         330.61     3.16
	 68 clinfo                  16.56     6.40
	  2 openscad:sh1            16.36     0.24
	 38 vulkaninfo               1.33     0.92
	  6 glxinfo:gdrv0            0.17     0.02
	  6 glxinfo:gl0              0.17     0.02
	  4 vulkani:disk$0           0.14     0.09
	  6 php                      0.09     0.17
	  2 llvmpipe-0               0.07     0.05
	  2 llvmpipe-1               0.07     0.05
	  2 llvmpipe-10              0.07     0.05
	  2 llvmpipe-11              0.07     0.05
	  2 llvmpipe-12              0.07     0.05
	  2 llvmpipe-13              0.07     0.05
	  2 llvmpipe-14              0.07     0.05
	  2 llvmpipe-15              0.07     0.05
	  2 llvmpipe-2               0.07     0.05
	  2 llvmpipe-3               0.07     0.05
	  2 llvmpipe-4               0.07     0.05
	  2 llvmpipe-5               0.07     0.05
	  2 llvmpipe-6               0.07     0.05
	  2 llvmpipe-7               0.07     0.05
	  2 llvmpipe-8               0.07     0.05
	  2 llvmpipe-9               0.07     0.05
	  2 glxinfo                  0.07     0.02
	  2 glxinfo:cs0              0.07     0.02
	  2 glxinfo:disk$0           0.07     0.02
	  2 glxinfo:sh0              0.07     0.02
	  2 glxinfo:shlo0            0.07     0.02
	  6 clang                    0.06     0.06
	  3 rocminfo                 0.03     0.00
	  1 lspci                    0.00     0.02
	  1 ps                       0.00     0.01
	 89 sh                       0.00     0.00
	 12 gcc                      0.00     0.00
	  8 gsettings                0.00     0.00
	  8 stat                     0.00     0.00
	  8 systemd-detect-          0.00     0.00
	  6 llvm-link                0.00     0.00
	  5 gmain                    0.00     0.00
	  5 phoronix-test-s          0.00     0.00
	  3 dconf worker             0.00     0.00
	  2 lscpu                    0.00     0.00
	  2 uname                    0.00     0.00
	  2 which                    0.00     0.00
	  2 xset                     0.00     0.00
	  1 cc                       0.00     0.00
	  1 date                     0.00     0.00
	  1 dirname                  0.00     0.00
	  1 dmesg                    0.00     0.00
	  1 dmidecode                0.00     0.00
	  1 grep                     0.00     0.00
	  1 ifconfig                 0.00     0.00
	  1 ip                       0.00     0.00
	  1 lsmod                    0.00     0.00
	  1 mktemp                   0.00     0.00
	  1 qdbus                    0.00     0.00
	  1 readlink                 0.00     0.00
	  1 realpath                 0.00     0.00
	  1 sed                      0.00     0.00
	  1 sort                     0.00     0.00
	  1 stty                     0.00     0.00
	  1 systemctl                0.00     0.00
	  1 template.sh              0.00     0.00
	  1 wc                       0.00     0.00
	  1 xrandr                   0.00     0.00
0 processes running
47 maximum processes


The compute structure looks like a parent process that spawns various functional processes on different cores.

      632441) openscad         cpu=7 start=5.45  finish=69.37
        632442) openscad         cpu=11 start=5.46  finish=69.28
          632444) openscad:cs0     cpu=13 start=68.59 finish=69.28
          632445) opensca:disk$0   cpu=7 start=68.59 finish=69.28
          632446) openscad:sh0     cpu=15 start=68.59 finish=69.28
          632447) openscad:shlo0   cpu=9 start=68.59 finish=69.28
          632448) openscad:gdrv0   cpu=10 start=68.60 finish=69.27
          632449) openscad:gl0     cpu=2 start=68.60 finish=69.27
        632453) openscad         cpu=13 start=69.36 finish=69.37
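
To compare lifetimes across the tree, the listing can be parsed with a regular expression. This is a rough sketch, assuming each line has the form `pid) name cpu=N start=S finish=F`, that names contain no spaces, and that each nesting level adds two spaces of indentation. `LINE` and `parse_tree` are illustrative names, not part of the tool that produced the listing.

```python
import re

# Process tree lines copied from the listing above.
TREE = """\
      632441) openscad         cpu=7 start=5.45  finish=69.37
        632442) openscad         cpu=11 start=5.46  finish=69.28
          632444) openscad:cs0     cpu=13 start=68.59 finish=69.28
          632445) opensca:disk$0   cpu=7 start=68.59 finish=69.28
          632446) openscad:sh0     cpu=15 start=68.59 finish=69.28
          632447) openscad:shlo0   cpu=9 start=68.59 finish=69.28
          632448) openscad:gdrv0   cpu=10 start=68.60 finish=69.27
          632449) openscad:gl0     cpu=2 start=68.60 finish=69.27
        632453) openscad         cpu=13 start=69.36 finish=69.37
"""

LINE = re.compile(
    r"^(?P<indent>\s*)(?P<pid>\d+)\)\s+(?P<name>\S+)\s+"
    r"cpu=(?P<cpu>\d+)\s+start=(?P<start>[\d.]+)\s+finish=(?P<finish>[\d.]+)"
)


def parse_tree(text):
    """Return one dict per process line with its depth and lifetime in seconds."""
    matches = [m for m in map(LINE.match, text.splitlines()) if m]
    if not matches:
        return []
    # Depth is measured relative to the least-indented line (assumed 2 spaces per level).
    base = min(len(m["indent"]) for m in matches)
    rows = []
    for m in matches:
        rows.append({
            "pid": int(m["pid"]),
            "name": m["name"],
            "cpu": int(m["cpu"]),
            "depth": (len(m["indent"]) - base) // 2,
            "lifetime": round(float(m["finish"]) - float(m["start"]), 2),
        })
    return rows


for row in parse_tree(TREE):
    print(f"{'  ' * row['depth']}{row['name']:<16} cpu={row['cpu']:<2} "
          f"lifetime={row['lifetime']:.2f}s")
```

On this listing the top-level openscad process lives for 63.92 seconds, while the named helpers (cs0, disk$0, sh0, shlo0, gdrv0, gl0) each exist for well under a second near the end of the run.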