{"id":1263,"date":"2024-02-02T02:45:40","date_gmt":"2024-02-02T02:45:40","guid":{"rendered":"https:\/\/mvermeulen.org\/perf\/?page_id=1263"},"modified":"2024-02-02T03:03:11","modified_gmt":"2024-02-02T03:03:11","slug":"ngspice","status":"publish","type":"page","link":"https:\/\/mvermeulen.org\/perf\/workloads\/phoronix\/ngspice\/","title":{"rendered":"ngspice"},"content":{"rendered":"\n<p>SPICE circuit simulator run with two different test cases. The program looks mostly single-threaded, occasionally using three threads.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/02\/systemtime-6.png\" alt=\"\" class=\"wp-image-1270\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/02\/systemtime-6.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/02\/systemtime-6-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/02\/systemtime-6-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>The topdown profile shows a large share of backend stalls (about 82% of slots) and a low retirement rate (about 13%).<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/02\/amdtopdown-6.png\" alt=\"\" class=\"wp-image-1272\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/02\/amdtopdown-6.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/02\/amdtopdown-6-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/02\/amdtopdown-6-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>AMD metrics show floating-point code with a moderate L2 access and miss rate. There are not many branches. 
On-CPU utilization is barely more than one core (1.12 of 16).<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed              654.274\non_cpu               0.070          # 1.12 \/ 16 cores\nutime                731.620\nstime                1.458\nnvcsw                13729          # 81.39%\nnivcsw               3140           # 18.61%\ninblock              0              # 0.00\/sec\nonblock              56064          # 85.69\/sec\ncpu-clock            733047101244   # 733.047 seconds\ntask-clock           733083455448   # 733.083 seconds\npage faults          375005         # 511.545\/sec\ncontext switches     19961          # 27.229\/sec\ncpu migrations       315            # 0.430\/sec\nmajor page faults    2              # 0.003\/sec\nminor page faults    375003         # 511.542\/sec\nalignment faults     0              # 0.000\/sec\nemulation faults     0              # 0.000\/sec\nbranches             227636710334   # 88.502 branches per 1000 inst\nbranch misses        1487507273     # 0.65% branch miss\nconditional          165221763672   # 64.236 conditional branches per 1000 inst\nindirect             10647435856    # 4.140 indirect branches per 1000 inst\ncpu-cycles           3379846461857  # 0.32 GHz\ninstructions         2573410274886  # 0.76 IPC\nslots                6764786299656  #\nretiring             884507274916   # 13.1% (13.1%) low\n-- ucode             1115721355     #     0.0%\n-- fastpath          883391553561   #    13.1%\nfrontend             239339399574   #  3.5% ( 3.5%) low\n-- latency           118070306598   #     1.7%\n-- bandwidth         121269092976   #     1.8%\nbackend              5535942344017  # 81.8% (81.9%) high\n-- cpu               761174160581   #    11.3%\n-- memory            4774768183436  #    70.6%\nspeculation          103593987406   #  1.5% ( 1.5%)\n-- branch mispredict 74106925238    #     1.1%\n-- pipeline restart  29487062168    #     0.4%\nsmt-contention       1402399494     #  0.0% ( 0.0%)\ncpu-cycles           
3389350813164  # 0.32 GHz\ninstructions         2570015738058  # 0.76 IPC\ninstructions         858514957618   # 75.488 l2 access per 1000 inst\nl2 hit from l1       48217698572    # 39.31% l2 miss\nl2 miss from l1      14973906198    #\nl2 hit from l2 pf    6090783709     #\nl3 hit from l2 pf    7164595567     #\nl3 miss from l2 pf   3334366653     #\ninstructions         856700215185   # 222.357 float per 1000 inst\nfloat 512            46             # 0.000 AVX-512 per 1000 inst\nfloat 256            910            # 0.000 AVX-256 per 1000 inst\nfloat 128            190493618209   # 222.357 AVX-128 per 1000 inst\nfloat MMX            0              # 0.000 MMX per 1000 inst\nfloat scalar         0              # 0.000 scalar per 1000 inst\n\n\n<\/code><\/pre>\n\n\n\n<p>Intel metrics<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed              650.632\non_cpu               0.083          # 1.32 \/ 16 cores\nutime                857.839\nstime                1.072\nnvcsw                13716          # 82.68%\nnivcsw               2874           # 17.32%\ninblock              1840           # 2.83\/sec\nonblock              44824          # 68.89\/sec\ncpu-clock            858768871242   # 858.769 seconds\ntask-clock           858793135960   # 858.793 seconds\npage faults          364203         # 424.087\/sec\ncontext switches     19661          # 22.894\/sec\ncpu migrations       607            # 0.707\/sec\nmajor page faults    15             # 0.017\/sec\nminor page faults    364188         # 424.070\/sec\nalignment faults     0              # 0.000\/sec\nemulation faults     0              # 0.000\/sec\nbranches             226966574551   # 88.287 branches per 1000 inst\nbranch misses        1541480071     # 0.68% branch miss\nconditional          226966586935   # 88.287 conditional branches per 1000 inst\nindirect             10685420053    # 4.156 indirect branches per 1000 inst\nslots                17759394818246 #\nretiring             
2473017644334  # 13.9% (13.9%) low\n-- ucode             322404419382   #     1.8%\n-- fastpath          2150613224952  #    12.1%\nfrontend             821427631570   #  4.6% ( 4.6%) low\n-- latency           358912915741   #     2.0%\n-- bandwidth         462514715829   #     2.6%\nbackend              14581736945460 # 82.1% (82.1%) high\n-- cpu               6516868648695  #    36.7%\n-- memory            8064868296765  #    45.4%\nspeculation          273647358689   #  1.5% ( 1.5%)\n-- branch mispredict 216235726103   #     1.2%\n-- pipeline restart  57411632586    #     0.3%\nsmt-contention       0              #  0.0% ( 0.0%)\ncpu-cycles           2955502961512  # 0.28 GHz\ninstructions         2570750857075  # 0.87 IPC\nl2 access            306554724965   # 119.256 l2 access per 1000 inst\nl2 miss              158681184532   # 51.76% l2 miss\n<\/code><\/pre>\n\n\n\n<p>Process overview is simple invocations of ngspice<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>368 processes\n\t 18 ngspice               1452.74     1.26\n\t 68 clinfo                  15.88     6.30\n\t 38 vulkaninfo               1.49     0.76\n\t  4 vulkani:disk$0           0.15     0.08\n\t  6 glxinfo:gdrv0            0.12     0.06\n\t  6 glxinfo:gl0              0.12     0.06\n\t  2 llvmpipe-0               0.08     0.04\n\t  2 llvmpipe-1               0.08     0.04\n\t  2 llvmpipe-10              0.08     0.04\n\t  2 llvmpipe-11              0.08     0.04\n\t  2 llvmpipe-12              0.08     0.04\n\t  2 llvmpipe-13              0.08     0.04\n\t  2 llvmpipe-14              0.08     0.04\n\t  2 llvmpipe-15              0.08     0.04\n\t  2 llvmpipe-2               0.08     0.04\n\t  2 llvmpipe-3               0.08     0.04\n\t  2 llvmpipe-4               0.08     0.04\n\t  2 llvmpipe-5               0.08     0.04\n\t  2 llvmpipe-6               0.08     0.04\n\t  2 llvmpipe-7               0.08     0.04\n\t  2 llvmpipe-8               0.08     0.04\n\t  2 llvmpipe-9               
0.08     0.04\n\t  6 clang                    0.07     0.04\n\t  6 php                      0.06     0.27\n\t  2 glxinfo                  0.06     0.02\n\t  2 glxinfo:cs0              0.06     0.02\n\t  2 glxinfo:disk$0           0.06     0.02\n\t  2 glxinfo:sh0              0.06     0.02\n\t  2 glxinfo:shlo0            0.06     0.02\n\t  3 rocminfo                 0.03     0.00\n\t  1 lspci                    0.00     0.03\n\t 84 sh                       0.00     0.00\n\t 13 gcc                      0.00     0.00\n\t 12 gsettings                0.00     0.00\n\t  8 stat                     0.00     0.00\n\t  8 systemd-detect-          0.00     0.00\n\t  6 llvm-link                0.00     0.00\n\t  5 phoronix-test-s          0.00     0.00\n\t  3 gmain                    0.00     0.00\n\t  2 cc                       0.00     0.00\n\t  2 lscpu                    0.00     0.00\n\t  2 uname                    0.00     0.00\n\t  2 which                    0.00     0.00\n\t  2 xset                     0.00     0.00\n\t  1 date                     0.00     0.00\n\t  1 dconf worker             0.00     0.00\n\t  1 dirname                  0.00     0.00\n\t  1 dmesg                    0.00     0.00\n\t  1 dmidecode                0.00     0.00\n\t  1 grep                     0.00     0.00\n\t  1 ifconfig                 0.00     0.00\n\t  1 ip                       0.00     0.00\n\t  1 lsmod                    0.00     0.00\n\t  1 mktemp                   0.00     0.00\n\t  1 ps                       0.00     0.00\n\t  1 qdbus                    0.00     0.00\n\t  1 readlink                 0.00     0.00\n\t  1 realpath                 0.00     0.00\n\t  1 sed                      0.00     0.00\n\t  1 sort                     0.00     0.00\n\t  1 stty                     0.00     0.00\n\t  1 systemctl                0.00     0.00\n\t  1 template.sh              0.00     0.00\n\t  1 wc                       0.00     0.00\n\t  1 xrandr                   0.00     0.00\n0 
processes running\n47 maximum processes\n<\/code><\/pre>\n\n\n\n<p>Computation blocks<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>      158451) ngspice          cpu=7 start=6.70  finish=118.97\n        158452) ngspice          cpu=5 start=6.70  finish=118.96\n          158453) ngspice          cpu=6 start=8.54  finish=118.96\n      158458) ngspice          cpu=7 start=122.98 finish=232.45\n        158459) ngspice          cpu=0 start=122.99 finish=232.45\n          158460) ngspice          cpu=1 start=124.84 finish=232.45\n      158467) ngspice          cpu=7 start=236.47 finish=347.93\n        158468) ngspice          cpu=0 start=236.47 finish=347.93\n          158469) ngspice          cpu=2 start=238.31 finish=347.93\n      158473) sh               cpu=9 start=347.94 finish=347.94\n        158474) sh               cpu=2 start=347.94 finish=347.94\n      158475) ngspice          cpu=15 start=358.43 finish=452.29\n        158476) ngspice          cpu=8 start=358.44 finish=452.27\n          158477) ngspice          cpu=10 start=371.73 finish=452.27\n      158478) ngspice          cpu=15 start=456.29 finish=551.97\n        158479) ngspice          cpu=0 start=456.29 finish=551.95\n          158480) ngspice          cpu=9 start=469.75 finish=551.95\n      158513) ngspice          cpu=15 start=555.97 finish=649.15\n        158514) ngspice          cpu=0 start=555.97 finish=649.14\n          158515) ngspice          cpu=1 start=569.14 finish=649.14\n<\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>SPICE circuit simulator with two different test cases. The program looks single-threaded with occasional three threads. 
The topdown profile shows a large share of backend stalls and a low retirement rate. AMD metrics show floating-point code with a moderate L2 <span class=\"excerpt-dots\">&hellip;<\/span> <a class=\"more-link\" href=\"https:\/\/mvermeulen.org\/perf\/workloads\/phoronix\/ngspice\/\"><span class=\"more-msg\">Continue reading &rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"parent":58,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-1263","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/1263","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/comments?post=1263"}],"version-history":[{"count":2,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/1263\/revisions"}],"predecessor-version":[{"id":1273,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/1263\/revisions\/1273"}],"up":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/58"}],"wp:attachment":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/media?parent=1263"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}