{"id":1780,"date":"2024-02-23T01:42:42","date_gmt":"2024-02-23T01:42:42","guid":{"rendered":"https:\/\/mvermeulen.org\/perf\/?page_id=1780"},"modified":"2024-02-27T00:46:21","modified_gmt":"2024-02-27T00:46:21","slug":"spark","status":"publish","type":"page","link":"https:\/\/mvermeulen.org\/perf\/workloads\/phoronix\/spark\/","title":{"rendered":"spark"},"content":{"rendered":"\n<p>A benchmark of Apache Spark using the PySpark interface. Apache Spark is an open-source unified analytics engine. There are four tests, each with different sub-scenarios.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/02\/systemtime-61.png\" alt=\"\" class=\"wp-image-1804\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/02\/systemtime-61.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/02\/systemtime-61-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/02\/systemtime-61-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>The topdown profile shows a high retirement rate with some backend stalls.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/02\/amdtopdown-63.png\" alt=\"\" class=\"wp-image-1806\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/02\/amdtopdown-63.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/02\/amdtopdown-63-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/02\/amdtopdown-63-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>AMD metrics show overall 1.6 cores active, little floating 
point and high retirement.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed              2951.391\non_cpu               0.101          # 1.61 \/ 16 cores\nutime                4425.534\nstime                328.609\nnvcsw                4856288        # 86.78%\nnivcsw               739739         # 13.22%\ninblock              8              # 0.00\/sec\nonblock              13977128       # 4735.78\/sec\ncpu-clock            42663814969283 # 42663.815 seconds\ntask-clock           42665967867809 # 42665.968 seconds\npage faults          44053031       # 1032.510\/sec\ncontext switches     6717843        # 157.452\/sec\ncpu migrations       931799         # 21.839\/sec\nmajor page faults    1186           # 0.028\/sec\nminor page faults    44006031       # 1031.408\/sec\nalignment faults     0              # 0.000\/sec\nemulation faults     0              # 0.000\/sec\nbranches             64285744518706 # 177.115 branches per 1000 inst\nbranch misses        116204596704   # 0.18% branch miss\nconditional          50330874718377 # 138.668 conditional branches per 1000 inst\nindirect             5391625116869  # 14.855 indirect branches per 1000 inst\ncpu-cycles           219995811005385 # 3.55 GHz\ninstructions         474805279426321 # 2.16 IPC\nslots                441582327330810 #\nretiring             157423348696856 # 35.6% (58.5%) high\n-- ucode             567875726893   #     0.1%\n-- fastpath          156855472969963 #    35.5%\nfrontend             19549579490013 #  4.4% ( 7.3%)\n-- latency           13520739667860 #     3.1%\n-- bandwidth         6028839822153  #     1.4%\nbackend              88382195394937 # 20.0% (32.8%)\n-- cpu               11254725399387 #     2.5%\n-- memory            77127469995550 #    17.5%\nspeculation          3802926667430  #  0.9% ( 1.4%)\n-- branch mispredict 2933296078179  #     0.7%\n-- pipeline restart  869630589251   #     0.2%\nsmt-contention       172423207228753 # 39.0% ( 0.0%)\ncpu-cycles           
218890360802921 # 3.54 GHz\ninstructions         474914641892602 # 2.17 IPC\ninstructions         158472945209461 # 13.127 l2 access per 1000 inst\nl2 hit from l1       2033359144055  # 3.50% l2 miss\nl2 miss from l1      46451580342    #\nl2 hit from l2 pf    20499319858    #\nl3 hit from l2 pf    12301929165    #\nl3 miss from l2 pf   14068223396    #\ninstructions         158458706917338 # 19.971 float per 1000 inst\nfloat 512            3950           # 0.000 AVX-512 per 1000 inst\nfloat 256            503524         # 0.000 AVX-256 per 1000 inst\nfloat 128            3164589063781  # 19.971 AVX-128 per 1000 inst\nfloat MMX            0              # 0.000 MMX per 1000 inst\nfloat scalar         37917          # 0.000 scalar per 1000 inst\ninstructions         2709961        #\nopcache              1001966        # 369.734 opcache per 1000 inst\nopcache miss         534211         # 53.3% opcache miss rate\nl1 dTLB miss         7266           # 2.681 L1 dTLB per 1000 inst\nl2 dTLB miss         1171           # 0.432 L2 dTLB per 1000 inst\ninstructions         2703521        #\nicache               1299232        # 480.570 icache per 1000 inst\nicache miss          111019         #  8.5% icache miss rate\nl1 iTLB miss         13             # 0.005 L1 iTLB per 1000 inst\nl2 iTLB miss         0              # 0.000 L2 iTLB per 1000 inst\ntlb flush            19             # 0.007 TLB flush per 1000 inst\n<\/code><\/pre>\n\n\n\n<p>Intel metrics<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed              16920.126\non_cpu               0.261          # 4.17 \/ 16 cores\nutime                68536.052\nstime                2033.421\nnvcsw                20512849       # 60.38%\nnivcsw               13458961       # 39.62%\ninblock              348193440      # 20578.66\/sec\nonblock              533099000      # 31506.80\/sec\ncpu-clock            242347906374008 # 242347.906 seconds\ntask-clock           242358478400143 # 242358.478 seconds\npage faults   
       351178559      # 1449.005\/sec\ncontext switches     47074935       # 194.237\/sec\ncpu migrations       5096975        # 21.031\/sec\nmajor page faults    29159          # 0.120\/sec\nminor page faults    350541678      # 1446.377\/sec\nalignment faults     0              # 0.000\/sec\nemulation faults     0              # 0.000\/sec\nbranches             278766653515579 # 177.353 branches per 1000 inst\nbranch misses        798092266132   # 0.29% branch miss\nconditional          278766666338011 # 177.353 conditional branches per 1000 inst\nindirect             62205170112713 # 39.575 indirect branches per 1000 inst\nslots                1007749798569482 #\nretiring             554744105785745 # 55.0% (55.0%) high\n-- ucode             31026419537826 #     3.1%\n-- fastpath          523717686247919 #    52.0%\nfrontend             291111717445577 # 28.9% (28.9%)\n-- latency           100773439754211 #    10.0%\n-- bandwidth         190338277691366 #    18.9%\nbackend              123976744622939 # 12.3% (12.3%) low\n-- cpu               39235397453279 #     3.9%\n-- memory            84741347169660 #     8.4%\nspeculation          38960763394989 #  3.9% ( 3.9%)\n-- branch mispredict 35041534228150 #     3.5%\n-- pipeline restart  3919229166839  #     0.4%\nsmt-contention       0              #  0.0% ( 0.0%)\ncpu-cycles           481594355727426 # 1.78 GHz\ninstructions         915628982456618 # 1.90 IPC\nl2 access            6943696813220  # 10.398 l2 access per 1000 inst\nl2 miss              1685572410210  # 24.27% l2 miss\ncpu-cycles           364846275080470 # 25.2% memory latency\nload stalls          90138769389868 # 13.4% l1 bound\nl1 miss              41084570173389 #  3.7% l2 bound\nl2 miss              27487531859349 #  1.6% l3 bound\nl3 miss              21589200763974 #  5.9% dram bound\nstore_stalls         1939852578337  #  0.5% store bound\n<\/code><\/pre>\n\n\n\n<p>The process overview includes many processes and a set of Java threads, listed 
by thread name; only the first part is shown here.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>114616 processes\n\t640 dispatcher-even      76964.32  3834.19\n\t741 block-manager-s      66938.04  6289.06\n\t494 block-manager-a      44631.10  4193.59\n\t1994 python3              42767.48  1275.63\n\t320 shuffle-client-      38489.52  1920.40\n\t320 shuffle-server-      38489.52  1920.40\n\t320 map-output-disp      38464.08  1912.89\n\t160 task-result-get      19229.00   955.73\n\t162 QueryStageCreat      15328.79  1250.05\n\t120 spark-listener-      14421.84   716.89\n\t162 java                  9638.24   483.08\n\t 80 dispatcher-Bloc       9619.42   479.07\n\t 81 Finalizer             4819.24   241.59\n\t 80 Common-Cleaner        4819.23   241.56\n\t 40 org.apache.hado       4811.23   240.09\n\t 40 rpc-boss-3-1          4811.23   240.09\n\t 40 shuffle-boss-6-       4811.23   240.09\n\t 40 rpc-server-4-1        4811.19   240.05\n\t 40 rpc-server-4-2        4811.18   240.05\n\t 40 rpc-server-4-3        4811.18   240.05\n\t 40 rpc-server-4-4        4811.18   240.05\n\t 40 rpc-server-4-5        4811.18   240.05\n\t 40 rpc-server-4-6        4811.18   240.05\n\t 40 rpc-server-4-7        4811.18   240.05\n\t 40 rpc-server-4-8        4811.18   240.05\n\t 40 rpc-client-1-1        4811.17   240.04\n\t 40 rpc-client-1-2        4811.17   240.04\n\t 40 rpc-client-1-3        4811.16   240.04\n\t 40 rpc-client-1-4        4811.16   240.04\n\t 40 rpc-client-1-5        4811.16   240.04\n\t 40 rpc-client-1-6        4811.16   240.04\n\t 40 rpc-client-1-7        4811.16   240.04\n\t 40 rpc-client-1-8        4811.16   240.04\n\t 40 Thread-0              4810.92   239.96\n\t 40 shutdown-hook-0       4810.73   239.88\n\t 40 element-trackin       4810.44   239.76\n\t 40 netty-rpc-env-t       4810.13   239.58\n\t 40 RemoteBlock-tem       4808.39   239.21\n\t 40 heartbeat-recei       4807.42   238.98\n\t 40 driver-heartbea       4807.34   238.97\n\t 40 task-starvation       4807.25   238.92\n\t 
40 task-abort-time       4807.20   238.93\n\t 40 dag-scheduler-e       4807.03   238.90\n\t 40 context-cleaner       4807.01   238.90\n\t 40 SparkUI-57            4806.92   238.86\n\t 40 SparkUI-58            4806.91   238.86\n\t 40 SparkUI-59            4806.90   238.86\n\t 40 SparkUI-60            4806.89   238.85\n\t 40 SparkUI-61            4806.89   238.85\n\t 40 SparkUI-62            4806.88   238.85\n\t 40 SparkUI-63            4806.86   238.85\n\t 40 SparkUI-64            4806.85   238.85\n\t 40 SparkUI-65            4806.85   238.85\n\t 40 SparkUI-66            4806.85   238.85\n\t 40 executor-kill-m       4806.31   238.73\n\t 40 executor-heartb       4806.04   238.64\n\t 40 Logging-Cleaner       4805.67   238.61\n\t 40 Thread-2              4805.59   238.60\n\t 40 Thread-4              4804.58   238.53\n\t 36 Thread-20             4638.91   215.45\n\t 18 serve-DataFrame       3116.02    68.72\n\t 36 broadcast-excha       3052.28   295.16\n\t 72 checkPathsExist        391.91    32.46\n\t  4 Thread-19              164.64    22.95\n\t144 process reaper         138.68 12318.76\n\t  3 Thread-1526            104.07    11.51\n\t  3 Thread-1525            104.03    11.49\n\t  3 Thread-1523            103.75    11.43\n\t  3 Thread-1524            103.59    11.42\n\t  3 Thread-1519            103.58    11.41\n\t  3 Thread-1522            103.53    11.41\n\t  3 Thread-1521            103.51    11.41\n<\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>A benchmark of Apache Spark using the PySpark interface. Apache Spark is an open-source unified analytics engine. There are four tests, each with different sub-scenarios. The topdown profile shows a high retirement rate with some backend stalls. 
AMD metrics show <span class=\"excerpt-dots\">&hellip;<\/span> <a class=\"more-link\" href=\"https:\/\/mvermeulen.org\/perf\/workloads\/phoronix\/spark\/\"><span class=\"more-msg\">Continue reading &rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"parent":58,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-1780","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/1780","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/comments?post=1780"}],"version-history":[{"count":3,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/1780\/revisions"}],"predecessor-version":[{"id":1833,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/1780\/revisions\/1833"}],"up":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/58"}],"wp:attachment":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/media?parent=1780"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}