{"id":1139,"date":"2024-01-31T01:13:07","date_gmt":"2024-01-31T01:13:07","guid":{"rendered":"https:\/\/mvermeulen.org\/perf\/?page_id=1139"},"modified":"2024-01-31T14:34:36","modified_gmt":"2024-01-31T14:34:36","slug":"incompact3d","status":"publish","type":"page","link":"https:\/\/mvermeulen.org\/perf\/workloads\/phoronix\/incompact3d\/","title":{"rendered":"incompact3d"},"content":{"rendered":"\n<p>Fortran MPI finite-difference code for solving the incompressible Navier-Stokes equations. There are three workloads that seem to use one thread per core (without hyperthreading).<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/systemtime-94.png\" alt=\"\" class=\"wp-image-1193\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/systemtime-94.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/systemtime-94-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/systemtime-94-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>The topdown profile shows that the second and third workloads in particular are memory bound, with a low retirement rate.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-131.png\" alt=\"\" class=\"wp-image-1195\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-131.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-131-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-131-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>AMD 
metrics show floating-point code with a lot of page faults. Backend stalls account for more than 3\/4 of the pipeline slots (78.7%), and most of that is memory stalls (67.5%).<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed              477.356\non_cpu               0.407          # 6.51 \/ 16 cores\nutime                2840.287\nstime                267.242\nnvcsw                85439          # 89.95%\nnivcsw               9548           # 10.05%\ninblock              2163240        # 4531.71\/sec\nonblock              163304         # 342.10\/sec\ncpu-clock            3274890463329  # 3274.890 seconds\ntask-clock           3274990300351  # 3274.990 seconds\npage faults          43197238       # 13190.035\/sec\ncontext switches     266807         # 81.468\/sec\ncpu migrations       6953           # 2.123\/sec\nmajor page faults    151164         # 46.157\/sec\nminor page faults    43046020       # 13143.862\/sec\nalignment faults     0              # 0.000\/sec\nemulation faults     0              # 0.000\/sec\nbranches             1161817492247  # 101.468 branches per 1000 inst\nbranch misses        11533210495    # 0.99% branch miss\nconditional          956582064412   # 83.543 conditional branches per 1000 inst\nindirect             36198213091    # 3.161 indirect branches per 1000 inst\ncpu-cycles           14691239845636 # 1.92 GHz\ninstructions         11244984618546 # 0.77 IPC\nslots                29381588652498 #\nretiring             3613955511700  # 12.3% (12.3%) low\n-- ucode             2929574137     #     0.0%\n-- fastpath          3611025937563  #    12.3%\nfrontend             2597882187281  #  8.8% ( 8.8%)\n-- latency           1508040790554  #     5.1%\n-- bandwidth         1089841396727  #     3.7%\nbackend              23113832773537 # 78.7% (78.7%) high\n-- cpu               3269849548655  #    11.1%\n-- memory            19843983224882 #    67.5%\nspeculation          35881622447    #  0.1% ( 0.1%) low\n-- branch mispredict 35056712417    #     0.1%\n-- pipeline restart  824910030    
  #     0.0%\nsmt-contention       20024814768    #  0.1% ( 0.0%)\ncpu-cycles           14706761333307 # 1.92 GHz\ninstructions         11352044498526 # 0.77 IPC\ninstructions         3796655539228  # 46.371 l2 access per 1000 inst\nl2 hit from l1       107891244433   # 33.98% l2 miss\nl2 miss from l1      16333761843    #\nl2 hit from l2 pf    24667546155    #\nl3 hit from l2 pf    3086153260     #\nl3 miss from l2 pf   40408030721    #\ninstructions         3787985458233  # 232.547 float per 1000 inst\nfloat 512            116            # 0.000 AVX-512 per 1000 inst\nfloat 256            756            # 0.000 AVX-256 per 1000 inst\nfloat 128            880886358251   # 232.547 AVX-128 per 1000 inst\nfloat MMX            0              # 0.000 MMX per 1000 inst\nfloat scalar         0              # 0.000 scalar per 1000 inst\n<\/code><\/pre>\n\n\n\n<p>The Intel run fails with out of memory (OOM), as shown in the dmesg output below. Perhaps one thing to try is using taskset to limit the number of copies and reduce memory usage.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>&#91;316432.378217] lowmem_reserve&#91;]: 0 0 0 0 0\n&#91;316432.378219] Node 0 DMA: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 13312kB\n&#91;316432.378227] Node 0 DMA32: 94*4kB (UME) 66*8kB (UME) 51*16kB (UME) 82*32kB (UME) 73*64kB (UME) 45*128kB (UME) 34*256kB (UE) 17*512kB (UME) 12*1024kB (UME) 5*2048kB (UM) 2*4096kB (U) = 62904kB\n&#91;316432.378237] Node 0 Normal: 1009*4kB (UME) 439*8kB (UME) 233*16kB (UME) 47*32kB (UME) 86*64kB (UME) 18*128kB (UME) 8*256kB (E) 4*512kB (ME) 2*1024kB (E) 0*2048kB 9*4096kB (UME) = 63596kB\n&#91;316432.378246] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB\n&#91;316432.378248] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB\n&#91;316432.378248] 90945 total pagecache pages\n&#91;316432.378249] 6384 pages in swap cache\n&#91;316432.378250] Free swap  = 
0kB\n&#91;316432.378250] Total swap = 2097148kB\n&#91;316432.378251] 4135862 pages RAM\n&#91;316432.378251] 0 pages HighMem\/MovableOnly\n&#91;316432.378252] 103762 pages reserved\n&#91;316432.378252] 0 pages hwpoisoned\n&#91;316432.378253] Tasks state (memory values in pages):\n&#91;316432.378254] &#91;  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name\n&#91;316432.378269] &#91;    412]     0   412    38735      352   335872      352          -250 systemd-journal\n&#91;316432.378271] &#91;    459]     0   459     6774      224    77824      576         -1000 systemd-udevd\n&#91;316432.378275] &#91;    704]   108   704     3741      128    69632      160          -900 systemd-oomd\n&#91;316432.378277] &#91;    705]   101   705     6417      248    90112      960             0 systemd-resolve\n&#91;316432.378279] &#91;    707]   103   707    22346      192    73728      192             0 systemd-timesyn\n&#91;316432.378281] &#91;    871]     0   871    60139      149   106496      288             0 accounts-daemon\n&#91;316432.378282] &#91;    872]     0   872      704      224    45056       32             0 acpid\n&#91;316432.378284] &#91;    875]   114   875     1928      224    53248       32             0 avahi-daemon\n&#91;316432.378285] &#91;    876]     0   876     2654      160    61440      160             0 bluetoothd\n&#91;316432.378286] &#91;    877]     0   877     2374      256    57344       32             0 cron\n&#91;316432.378288] &#91;    879]   102   879     2758      576    57344      192          -900 dbus-daemon\n&#91;316432.378289] &#91;    881]     0   881   120634      656   167936      384             0 NetworkManager\n&#91;316432.378291] &#91;    888]     0   888    20712      288    57344       32             0 irqbalance\n&#91;316432.378292] &#91;    891]     0   891    10263      192   122880     2336             0 networkd-dispat\n&#91;316432.378294] &#91;    893]     0   893    60778      935   110592  
    128             0 polkitd\n&#91;316432.378295] &#91;    896]     0   896    60024      271   102400      192             0 power-profiles-\n&#91;316432.378297] &#91;    899]   104   899    55601      256    73728      480             0 rsyslogd\n&#91;316432.378298] &#91;    907]     0   907   607263     2412   385024     1231          -900 snapd\n&#91;316432.378300] &#91;    911]     0   911    59094      256    90112      128             0 switcheroo-cont\n&#91;316432.378301] &#91;    912]     0   912    12088      233    81920      256             0 systemd-logind\n&#91;316432.378303] &#91;    915]     0   915    69022      192   126976      224             0 thermald\n&#91;316432.378304] &#91;    916]     0   916   116737      693   139264      448             0 udisksd\n&#91;316432.378306] &#91;    917]     0   917     4462      192    77824      448             0 wpa_supplicant\n&#91;316432.378307] &#91;    930]   114   930     1861      162    53248       64             0 avahi-daemon\n&#91;316432.378309] &#91;   1042]     0  1042    79491      142   118784      416             0 ModemManager\n&#91;316432.378310] &#91;   1049]     0  1049    60061      242   102400      224             0 boltd\n&#91;316432.378311] &#91;   1062]     0  1062      722        0    45056        0             0 run-cups-browse\n&#91;316432.378313] &#91;   1064]     0  1064      722        0    45056       32             0 run-cupsd\n&#91;316432.378315] &#91;   1077]     0  1077    29539      192   131072     2176             0 unattended-upgr\n&#91;316432.378316] &#91;   1079]     0  1079   542926     2162   405504     1696          -999 containerd\n&#91;316432.378318] &#91;   1130]     0  1130     3858      288    69632      352         -1000 sshd\n&#91;316432.378319] &#91;   1211]     0  1211    60301      288   102400      288             0 gdm3\n&#91;316432.378321] &#91;   1382]     0  1382    42614      288    98304      352             0 
gdm-session-wor\n&#91;316432.378322] &#91;   1483]  1000  1483     4506      256    81920      544             0 systemd\n&#91;316432.378323] &#91;   1484]  1000  1484    42481      304    98304      928             0 (sd-pam)\n&#91;316432.378325] &#91;   1524]  1000  1524     9892      192    69632      256             0 pipewire\n&#91;316432.378326] &#91;   1525]  1000  1525     5864      160    77824      288             0 pipewire-media-\n&#91;316432.378328] &#91;   1527]  1000  1527   421301      552   225280     1120             0 pulseaudio\n&#91;316432.378329] &#91;   1529]  1000  1529    19063        0   118784      416             0 snapd-desktop-i\n&#91;316432.378330] &#91;   1533]  1000  1533   288366      576   159744      288             0 ubuntu-report\n&#91;316432.378332] &#91;   1568]  1000  1568    60191      326    90112      160             0 gnome-keyring-d\n&#91;316432.378334] &#91;   1579]   116  1579    38501      192    65536       64             0 rtkit-daemon\n&#91;316432.378335] &#91;   1584]  1000  1584    40597      160    81920      128             0 gdm-wayland-ses\n&#91;316432.378336] &#91;   1602]  1000  1602     2465      352    57344      224             0 dbus-daemon\n&#91;316432.378338] &#91;   1631]  1000  1631    55761      224   139264      448             0 gnome-session-b\n&#91;316432.378340] &#91;   1636]  1000  1636    60161      288    98304      128             0 gvfsd\n&#91;316432.378341] &#91;   1657]  1000  1657    95223      160   106496      192             0 gvfsd-fuse\n&#91;316432.378342] &#91;   1847]  1000  1847   161954     1191   368640     1888             0 tracker-miner-f\n&#91;316432.378344] &#91;   1909]     0  1909    60554      192   106496      256             0 upowerd\n&#91;316432.378345] &#91;   1921]  1000  1921    22977      224    81920       96             0 gnome-session-c\n&#91;316432.378347] &#91;   1939]  1000  1939   129823      384   180224      544             0 
gnome-session-b\n&#91;316432.378348] &#91;   2043]  1000  2043    77405      160    98304      160             0 at-spi-bus-laun\n&#91;316432.378350] &#91;   2045]  1000  2045  1604253    10540  2379776    27648             0 gnome-shell\n&#91;316432.378351] &#91;   2061]  1000  2061     2141      224    53248       96             0 dbus-daemon\n&#91;316432.378353] &#91;   2082]  1000  2082    97500      512   126976      128             0 gvfs-udisks2-vo\n&#91;316432.378355] &#91;   2103]  1000  2103    78802      160   114688      224             0 gvfs-afc-volume\n&#91;316432.378356] &#91;   2108]  1000  2108    59114      320    94208       64             0 gvfs-mtp-volume\n&#91;316432.378358] &#91;   2112]  1000  2112    59158      256    98304       96             0 gvfs-goa-volume\n&#91;316432.378359] &#91;   2116]  1000  2116   144585      320   274432     1728             0 goa-daemon\n&#91;316432.378360] &#91;   2124]  1000  2124    84600      181   147456     1376             0 goa-identity-se\n&#91;316432.378362] &#91;   2126]  1000  2126    59386      288    98304      128             0 gvfs-gphoto2-vo\n&#91;316432.378363] &#91;   2187]  1000  2187   134226      160   139264      192             0 xdg-document-po\n&#91;316432.378365] &#91;   2197]  1000  2197    59037      224    98304      128             0 xdg-permission-\n&#91;316432.378366] &#91;   2205]  1000  2205      699      256    40960        0             0 fusermount3\n&#91;316432.378368] &#91;   2322]  1000  2322   145742      192   204800      768             0 gnome-shell-cal\n&#91;316432.378370] &#91;   2330]  1000  2330   268042      448   262144      736             0 evolution-sourc\n&#91;316432.378371] &#91;   2334]  1000  2334    39237      224    73728       64             0 dconf-service\n&#91;316432.378373] &#91;   2341]     0  2341    74558      362   172032      576             0 packagekitd\n&#91;316432.378374] &#91;   2350]  1000  2350   279749      448   294912      896    
         0 evolution-calen\n&#91;316432.378375] &#91;   2365]  1000  2365   168059      288   241664      800             0 evolution-addre\n&#91;316432.378377] &#91;   2366]  1000  2366    97149      256   118784      224             0 gvfsd-trash\n&#91;316432.378379] &#91;   2376]  1000  2376   716437      571   258048      896             0 gjs\n&#91;316432.378380] &#91;   2378]  1000  2378    40671      192    77824      160             0 at-spi2-registr\n&#91;316432.378382] &#91;   2395]  1000  2395      723      192    45056        0             0 sh\n&#91;316432.378383] &#91;   2396]  1000  2396    77605      160   102400      160             0 gsd-a11y-settin\n&#91;316432.378384] &#91;   2398]  1000  2398   115885      673   208896     1216             0 gsd-color\n&#91;316432.378386] &#91;   2399]  1000  2399    78782      480   114688      992             0 ibus-daemon\n&#91;316432.378387] &#91;   2401]  1000  2401    93866      224   176128      512             0 gsd-datetime\n&#91;316432.378389] &#91;   2403]  1000  2403    78013      319   106496       96             0 gsd-housekeepin\n&#91;316432.378391] &#91;   2404]  1000  2404    85364      961   167936      768             0 gsd-keyboard\n&#91;316432.378392] &#91;   2407]  1000  2407   216252      352   221184     1472             0 gsd-media-keys\n&#91;316432.378394] &#91;   2409]  1000  2409   112854      832   188416      864             0 gsd-power\n&#91;316432.378395] &#91;   2415]  1000  2415    62463      224   118784      320             0 gsd-print-notif\n&#91;316432.378396] &#91;   2416]  1000  2416   114464      288   131072      128             0 gsd-rfkill\n&#91;316432.378398] &#91;   2417]  1000  2417    59072      192    90112      160             0 gsd-screensaver\n&#91;316432.378399] &#91;   2420]  1000  2420   116484      384   135168      224             0 gsd-sharing\n&#91;316432.378400] &#91;   2422]  1000  2422    78075      192   110592      256             0 
gsd-smartcard\n&#91;316432.378402] &#91;   2424]  1000  2424    79835      192   126976      320             0 gsd-sound\n&#91;316432.378403] &#91;   2435]  1000  2435    85507      640   167936     1024             0 gsd-wacom\n&#91;316432.378405] &#91;   2464]  1000  2464    58067      448    77824       64             0 gsd-disk-utilit\n&#91;316432.378406] &#91;   2472]  1000  2472   196938      924   376832     1696             0 wallch\n&#91;316432.378407] &#91;   2484]  1000  2484   204748     1308   385024     2816             0 evolution-alarm\n&#91;316432.378409] &#91;   2495]  1000  2495    40870      256    81920      160             0 ibus-memconf\n&#91;316432.378410] &#91;   2496]  1000  2496    87172      678   180224     2432             0 ibus-extension-\n&#91;316432.378411] &#91;   2501]  1000  2501    59315      288    98304       96             0 ibus-portal\n&#91;316432.378413] &#91;   2603]  1000  2603    40870      256    90112       96             0 ibus-engine-sim\n&#91;316432.378415] &#91;   2608]  1000  2608   155951      448   163840      352             0 xdg-desktop-por\n&#91;316432.378416] &#91;   2627]  1000  2627   130965     1078   204800     1408             0 xdg-desktop-por\n&#91;316432.378417] &#91;   2656]     0  2656    16655        0   110592      384             0 cupsd\n&#91;316432.378419] &#91;   2669]  1000  2669    85589      224   155648      480             0 gsd-printer\n&#91;316432.378420] &#91;   2699]   123  2699    61350      167   114688      896             0 colord\n&#91;316432.378422] &#91;   2717]  1000  2717    95178      607   180224     1504             0 snapd-desktop-i\n&#91;316432.378424] &#91;   2727]  1000  2727   734880      813   266240      864             0 gjs\n&#91;316432.378425] &#91;   2750]  1000  2750   104588     1441   180224      736             0 xdg-desktop-por\n&#91;316432.378426] &#91;   2785]  1000  2785   871640      456   462848     4928             0 gjs\n&#91;316432.378428] &#91; 
  2805]     0  2805      722       96    45056        0             0 run-cups-browse\n&#91;316432.378429] &#91;   2809]  1000  2809    40735      320    90112       64             0 gvfsd-metadata\n&#91;316432.378431] &#91;   2964]  1000  2964   345294      732   634880    10720             0 Xwayland\n&#91;316432.378432] &#91;   2997]  1000  2997   390931      332   569344     4928             0 gsd-xsettings\n&#91;316432.378434] &#91;   3054]  1000  3054    67202      763   159744     1472             0 ibus-x11\n&#91;316432.378436] &#91;   3067]     0  3067   551671     3129   487424     3406          -500 dockerd\n&#91;316432.378437] &#91;   3077]   113  3077     3272      262    61440       64             0 kerneloops\n&#91;316432.378439] &#91;   3079]   113  3079     3272      229    65536       96             0 kerneloops\n&#91;316432.378440] &#91;   4014]  1000  4014   123407     1558   188416      608             0 update-notifier\n&#91;316432.378441] &#91;   4083]  1000  4083   140115     2754   286720     2880             0 gnome-terminal-\n&#91;316432.378443] &#91;   4101]  1000  4101     2884      256    61440      320             0 bash\n&#91;316432.378444] &#91;   4413]  1000  4413   531284     1066   311296      768             0 snap\n&#91;316432.378446] &#91; 183930]     0 183930   115718     4089   344064    14016             0 fwupd\n&#91;316432.378448] &#91; 238055]  1000 238055    20997      224    69632        0             0 gpg-agent\n&#91;316432.378450] &#91;1728196]     0 1728196    18327      576   122880      224             0 cupsd\n&#91;316432.378451] &#91;1728197]     0 1728197    43157      416   102400      224             0 cups-browsed\n&#91;316432.378453] &#91;3529332]  1000 3529332     2526      192    61440       96             0 run_all.sh\n&#91;316432.378455] &#91;3651325]     0 3651325      802      128    40960        0             0 sleep\n&#91;316432.378457] &#91;3687752]  1000 3687752     2526      256    69632       
64             0 run_test.sh\n&#91;316432.378459] &#91;3688662]  1000 3688662      711      192    45056        0             0 wspy\n&#91;316432.378460] &#91;3688663]  1000 3688663      723      192    45056       32             0 phoronix-test-s\n&#91;316432.378462] &#91;3688676]  1000 3688676      723      160    45056        0             0 sh\n&#91;316432.378463] &#91;3688677]  1000 3688677    20493      320   139264     2976             0 php\n&#91;316432.378464] &#91;3688708]  1000 3688708      723      160    49152        0             0 sh\n&#91;316432.378466] &#91;3688709]  1000 3688709    51149      352   139264      992             0 php\n&#91;316432.378467] &#91;3688710]  1000 3688710    51149      285    98304      992             0 php\n&#91;316432.378468] &#91;3688711]  1000 3688711    51149      253    98304      960             0 php\n&#91;316432.378470] &#91;3688712]  1000 3688712    51149      221    98304      992             0 php\n&#91;316432.378471] &#91;3688713]  1000 3688713    51149      221    98304      992             0 php\n&#91;316432.378472] &#91;3688906]  1000 3688906      724      160    45056        0             0 incompact3d\n&#91;316432.378473] &#91;3688907]  1000 3688907   154711     2336   659456     6208             0 mpirun\n&#91;316432.378475] &#91;3688911]  1000 3688911  2959299   295379  3092480      832             0 xcompact3d\n&#91;316432.378476] &#91;3688912]  1000 3688912  2959268   300777  3129344      768             0 xcompact3d\n&#91;316432.378477] &#91;3688913]  1000 3688913  2959268   265555  2846720      832             0 xcompact3d\n&#91;316432.378478] &#91;3688915]  1000 3688915  3010174   381583  4104192    44928             0 xcompact3d\n&#91;316432.378480] &#91;3688917]  1000 3688917  3017348   228656  2904064    44832             0 xcompact3d\n&#91;316432.378481] &#91;3688920]  1000 3688920  3017348   218800  2809856    44832             0 xcompact3d\n&#91;316432.378482] &#91;3688925]  1000 3688925  
3017348   237700  2981888    44928             0 xcompact3d\n&#91;316432.378483] &#91;3688927]  1000 3688927  3035014   267565  2871296      800             0 xcompact3d\n&#91;316432.378484] &#91;3688930]  1000 3688930  3034756   398092  4243456    44896             0 xcompact3d\n&#91;316432.378485] &#91;3688933]  1000 3688933  3035015   409069  4214784    32320             0 xcompact3d\n&#91;316432.378486] &#91;3688936]  1000 3688936  3035014   414706  4288512    32416             0 xcompact3d\n&#91;316432.378488] &#91;3688938]  1000 3688938  3035014   245193  2789376    16832             0 xcompact3d\n&#91;316432.378489] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=user.slice,mems_allowed=0,global_oom,task_memcg=\/user.slice\/user-1000.slice\/user@1000.service\/app.slice\/app-org.gnome.Te\\\nrminal.slice\/vte-spawn-a3fbe82c-41c8-4dfc-a423-e062697180e2.scope,task=xcompact3d,pid=3688936,uid=1000\n&#91;316432.378506] Out of memory: Killed process 3688936 (xcompact3d) total-vm:12140056kB, anon-rss:1657344kB, file-rss:1096kB, shmem-rss:384kB, UID:1000 pgtables:4188kB oom_score_adj:0\n&#91;316432.405967] Purging GPU memory, 512 pages freed, 16574 pages still pinned, 0 pages left available.\n&#91;316432.828656] containerd invoked oom-killer: gfp_mask=0x140cca(GFP_HIGHUSER_MOVABLE|__GFP_COMP), order=0, oom_score_adj=-999\n&#91;316432.828666] CPU: 12 PID: 1162 Comm: containerd Tainted: G        W          6.5.0-15-generic #15~22.04.1-Ubuntu\n&#91;316432.828669] Hardware name: GEEKOM Mini IT13\/Default string, BIOS 1.09 11\/10\/2023\n&#91;316432.828670] Call Trace:\n&#91;316432.828672]  &lt;TASK&gt;\n&#91;316432.828675]  dump_stack_lvl+0x48\/0x70\n&#91;316432.828681]  dump_stack+0x10\/0x20\n&#91;316432.828683]  dump_header+0x50\/0x290\n&#91;316432.828688]  oom_kill_process+0x10d\/0x1c0\n&#91;316432.828692]  out_of_memory+0x103\/0x350\n&#91;316432.828693]  __alloc_pages_may_oom+0x112\/0x1e0\n&#91;316432.828698]  
__alloc_pages_slowpath.constprop.0+0x46f\/0x9a0\n&#91;316432.828702]  __alloc_pages+0x31d\/0x350\n&#91;316432.828705]  alloc_pages+0x90\/0x1a0\n&#91;316432.828708]  folio_alloc+0x1d\/0x60\n&#91;316432.828710]  filemap_alloc_folio+0x31\/0x40\n&#91;316432.828714]  __filemap_get_folio+0xd8\/0x230\n&#91;316432.828716]  filemap_fault+0x454\/0x750\n&#91;316432.828718]  __do_fault+0x36\/0x150\n&#91;316432.828722]  do_read_fault+0x11d\/0x170\n&#91;316432.828725]  do_fault+0xf3\/0x170\n&#91;316432.828727]  handle_pte_fault+0x74\/0x170\n&#91;316432.828730]  __handle_mm_fault+0x65c\/0x720\n&#91;316432.828733]  handle_mm_fault+0x164\/0x360\n&#91;316432.828735]  do_user_addr_fault+0x212\/0x6b0\n&#91;316432.828738]  exc_page_fault+0x83\/0x1b0\n&#91;316432.828742]  asm_exc_page_fault+0x27\/0x30\n&#91;316432.828746] RIP: 0033:0x563715388315\n&#91;316432.828785] Code: Unable to access opcode bytes at 0x5637153882eb.\n&#91;316432.828786] RSP: 002b:00007f37e26a9d38 EFLAGS: 00010216\n&#91;316432.828788] RAX: 00011fcb4474a494 RBX: 000000c000094400 RCX: 0000000000000000\n&#91;316432.828790] RDX: 0000000033800494 RSI: 0000000000000000 RDI: 0000000000000000\n&#91;316432.828791] RBP: 00007f37e26a9d98 R08: 0000000000000000 R09: 0000000000000000\n&#91;316432.828792] R10: 0000000000000000 R11: 0000000000000000 R12: 00007f37e26a9d18\n&#91;316432.828793] R13: 0000000000000016 R14: 000000c0000069c0 R15: 00007ffe701a5340\n&#91;316432.828796]  &lt;\/TASK&gt;\n&#91;316432.828797] Mem-Info:\n&#91;316432.828799] active_anon:859541 inactive_anon:2903046 isolated_anon:0\n                 active_file:278 inactive_file:532 isolated_file:0\n                 unevictable:35353 dirty:132 writeback:0\n                 slab_reclaimable:83650 slab_unreclaimable:54527\n                 mapped:1549 shmem:64940 pagetables:14656\n                 sec_pagetables:0 bounce:0\n                 kernel_misc_reclaimable:0\n                 free:34898 free_pcp:164 free_cma:0\n<\/code><\/pre>\n\n\n\n<p>The program is too 
large to run on my 16 GB Intel system, and the dmesg output above shows the messages from the out-of-memory (OOM) killer.<\/p>\n\n\n\n<p>Process overview<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>647 processes\n\t216 xcompact3d            8699.06  1127.94\n\t 68 clinfo                  17.10     7.32\n\t 54 mpirun                   3.02    17.26\n\t 38 vulkaninfo               0.74     1.90\n\t  6 glxinfo:gdrv0            0.15     0.08\n\t  6 glxinfo:gl0              0.15     0.08\n\t  6 clang                    0.08     0.09\n\t  4 vulkani:disk$0           0.07     0.20\n\t  2 glxinfo                  0.07     0.03\n\t  2 glxinfo:cs0              0.07     0.03\n\t  2 glxinfo:disk$0           0.07     0.03\n\t  2 glxinfo:sh0              0.07     0.03\n\t  2 glxinfo:shlo0            0.07     0.03\n\t  6 php                      0.06     8.78\n\t  2 llvmpipe-0               0.04     0.10\n\t  2 llvmpipe-1               0.04     0.10\n\t  2 llvmpipe-10              0.04     0.10\n\t  2 llvmpipe-11              0.04     0.10\n\t  2 llvmpipe-12              0.04     0.10\n\t  2 llvmpipe-13              0.04     0.10\n\t  2 llvmpipe-14              0.04     0.10\n\t  2 llvmpipe-15              0.04     0.10\n\t  2 llvmpipe-2               0.04     0.10\n\t  2 llvmpipe-3               0.04     0.10\n\t  2 llvmpipe-4               0.04     0.10\n\t  2 llvmpipe-5               0.04     0.10\n\t  2 llvmpipe-6               0.04     0.10\n\t  2 llvmpipe-7               0.04     0.10\n\t  2 llvmpipe-8               0.04     0.10\n\t  2 llvmpipe-9               0.04     0.10\n\t  3 rocminfo                 0.00     0.03\n\t  1 lspci                    0.00     0.02\n\t  1 ps                       0.00     0.01\n\t 93 sh                       0.00     0.00\n\t 13 gcc                      0.00     0.00\n\t  9 gsettings                0.00     0.00\n\t  9 incompact3d              0.00     0.00\n\t  9 mkdir                    0.00     0.00\n\t  8 stat                     0.00     0.00\n\t  8 
systemd-detect-          0.00     0.00\n\t  6 llvm-link                0.00     0.00\n\t  5 gmain                    0.00     0.00\n\t  5 phoronix-test-s          0.00     0.00\n\t  2 cc                       0.00     0.00\n\t  2 dconf worker             0.00     0.00\n\t  2 lscpu                    0.00     0.00\n\t  2 uname                    0.00     0.00\n\t  2 which                    0.00     0.00\n\t  2 xset                     0.00     0.00\n\t  1 date                     0.00     0.00\n\t  1 dirname                  0.00     0.00\n\t  1 dmesg                    0.00     0.00\n\t  1 dmidecode                0.00     0.00\n\t  1 grep                     0.00     0.00\n\t  1 ifconfig                 0.00     0.00\n\t  1 ip                       0.00     0.00\n\t  1 lsmod                    0.00     0.00\n\t  1 mktemp                   0.00     0.00\n\t  1 qdbus                    0.00     0.00\n\t  1 readlink                 0.00     0.00\n\t  1 realpath                 0.00     0.00\n\t  1 sed                      0.00     0.00\n\t  1 sort                     0.00     0.00\n\t  1 stty                     0.00     0.00\n\t  1 systemctl                0.00     0.00\n\t  1 template.sh              0.00     0.00\n\t  1 wc                       0.00     0.00\n\t  1 xrandr                   0.00     0.00\n0 processes running\n47 maximum processes\n<\/code><\/pre>\n\n\n\n<p>Computation blocks<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>      2577007) incompact3d      cpu=0 start=124.44 finish=153.05\n        2577008) mpirun           cpu=8 start=124.45 finish=153.03\n          2577011) mpirun           cpu=0 start=125.06 finish=153.03\n          2577012) mpirun           cpu=1 start=125.06 finish=125.06\n          2577013) mpirun           cpu=15 start=125.08 finish=153.02\n          2577014) mpirun           cpu=5 start=125.58 finish=153.02\n          2577015) mpirun           cpu=15 start=125.58 finish=153.02\n          2577016) xcompact3d       cpu=12 
start=125.62 finish=153.00\n            2577018) xcompact3d       cpu=13 start=125.63 finish=153.00\n            2577021) xcompact3d       cpu=0 start=125.63 finish=153.00\n            2577040) sh               cpu=14 start=125.88 finish=125.88\n              2577041) mkdir            cpu=11 start=125.88 finish=125.88\n          2577017) xcompact3d       cpu=10 start=125.62 finish=153.00\n            2577020) xcompact3d       cpu=3 start=125.63 finish=153.00\n            2577024) xcompact3d       cpu=15 start=125.64 finish=153.00\n          2577019) xcompact3d       cpu=7 start=125.63 finish=153.00\n            2577023) xcompact3d       cpu=5 start=125.64 finish=153.00\n            2577026) xcompact3d       cpu=3 start=125.64 finish=153.00\n          2577022) xcompact3d       cpu=14 start=125.64 finish=153.00\n            2577027) xcompact3d       cpu=6 start=125.64 finish=153.00\n            2577030) xcompact3d       cpu=13 start=125.65 finish=153.00\n          2577025) xcompact3d       cpu=5 start=125.64 finish=153.00\n            2577029) xcompact3d       cpu=9 start=125.65 finish=153.00\n            2577033) xcompact3d       cpu=8 start=125.65 finish=153.00\n          2577028) xcompact3d       cpu=11 start=125.64 finish=153.00\n            2577032) xcompact3d       cpu=6 start=125.65 finish=153.00\n            2577035) xcompact3d       cpu=14 start=125.66 finish=153.00\n          2577031) xcompact3d       cpu=9 start=125.65 finish=153.00\n            2577036) xcompact3d       cpu=2 start=125.66 finish=153.00\n            2577038) xcompact3d       cpu=4 start=125.66 finish=153.00\n          2577034) xcompact3d       cpu=0 start=125.66 finish=153.00\n            2577037) xcompact3d       cpu=1 start=125.66 finish=153.00\n            2577039) xcompact3d       cpu=6 start=125.67 finish=153.00\n<\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>Fortran MPI finite-difference code for solving the incompressible Navier-Stokes equations. 
There are three workloads that seem to use one thread per core (w\/o hyperthreading) Topdown profile shows second and third workloads in particular are memory bound with a low <span class=\"excerpt-dots\">&hellip;<\/span> <a class=\"more-link\" href=\"https:\/\/mvermeulen.org\/perf\/workloads\/phoronix\/incompact3d\/\"><span class=\"more-msg\">Continue reading &rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"parent":58,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-1139","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/1139","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/comments?post=1139"}],"version-history":[{"count":3,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/1139\/revisions"}],"predecessor-version":[{"id":1196,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/1139\/revisions\/1196"}],"up":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/58"}],"wp:attachment":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/media?parent=1139"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}