{"id":1229,"date":"2024-02-01T11:28:50","date_gmt":"2024-02-01T11:28:50","guid":{"rendered":"https:\/\/mvermeulen.org\/perf\/?page_id=1229"},"modified":"2024-02-01T11:28:51","modified_gmt":"2024-02-01T11:28:51","slug":"sqlite","status":"publish","type":"page","link":"https:\/\/mvermeulen.org\/perf\/workloads\/phoronix\/sqlite\/","title":{"rendered":"sqlite"},"content":{"rendered":"\n<p>Benchmarking the sqlite database with five workloads that vary the number of threads operating from 1 to 16 in powers of 2. The number of runnable processes only gets to five below.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/02\/systemtime-3.png\" alt=\"\" class=\"wp-image-1230\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/02\/systemtime-3.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/02\/systemtime-3-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/02\/systemtime-3-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>Topdown profile shows a workload dominated by frontend stalls and with a low retirement rate.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/02\/amdtopdown-3.png\" alt=\"\" class=\"wp-image-1231\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/02\/amdtopdown-3.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/02\/amdtopdown-3-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/02\/amdtopdown-3-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>AMD metrics show less than one core of 
on-cpu on average. There is a moderately high L2 access and L2 miss rate but a low set of memory stalls. There is little floating point code.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed              374.638\non_cpu               0.040          # 0.64 \/ 16 cores\nutime                27.590\nstime                212.440\nnvcsw                6203372        # 83.77%\nnivcsw               1202158        # 16.23%\ninblock              0              # 0.00\/sec\nonblock              33573736       # 89616.45\/sec\ncpu-clock            234773842345   # 234.774 seconds\ntask-clock           237874612156   # 237.875 seconds\npage faults          322667         # 1356.458\/sec\ncontext switches     7406386        # 31135.672\/sec\ncpu migrations       373657         # 1570.815\/sec\nmajor page faults    15             # 0.063\/sec\nminor page faults    322652         # 1356.395\/sec\nalignment faults     0              # 0.000\/sec\nemulation faults     0              # 0.000\/sec\nbranches             73382349376    # 204.985 branches per 1000 inst\nbranch misses        9183182583     # 12.51% branch miss\nconditional          39856833656    # 111.335 conditional branches per 1000 inst\nindirect             735923949      # 2.056 indirect branches per 1000 inst\ncpu-cycles           503897140569   # 0.08 GHz\ninstructions         362772973215   # 0.72 IPC\nslots                985859494440   #\nretiring             132024456157   # 13.4% (13.6%) low\n-- ucode             675731578      #     0.1%\n-- fastpath          131348724579   #    13.3%\nfrontend             743024582218   # 75.4% (76.3%) high\n-- latency           632192389680   #    64.1%\n-- bandwidth         110832192538   #    11.2%\nbackend              85070860456    #  8.6% ( 8.7%) low\n-- cpu               24113210052    #     2.4%\n-- memory            60957650404    #     6.2%\nspeculation          13412065688    #  1.4% ( 1.4%)\n-- branch mispredict 13392597513    #     1.4%\n-- pipeline 
restart  19468175       #     0.0%\nsmt-contention       12224270726    #  1.2% ( 0.0%)\ncpu-cycles           503154312781   # 0.08 GHz\ninstructions         362440359916   # 0.72 IPC\ninstructions         117775660458   # 112.538 l2 access per 1000 inst\nl2 hit from l1       12244170605    # 32.61% l2 miss\nl2 miss from l1      3748255784     #\nl2 hit from l2 pf    436472980      #\nl3 hit from l2 pf    549331690      #\nl3 miss from l2 pf   24246945       #\ninstructions         117838963317   # 11.287 float per 1000 inst\nfloat 512            339            # 0.000 AVX-512 per 1000 inst\nfloat 256            572            # 0.000 AVX-256 per 1000 inst\nfloat 128            1330103949     # 11.287 AVX-128 per 1000 inst\nfloat MMX            0              # 0.000 MMX per 1000 inst\nfloat scalar         5              # 0.000 scalar per 1000 inst\n<\/code><\/pre>\n\n\n\n<p>Intel metrics<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed              948.364\non_cpu               0.038          # 0.61 \/ 16 cores\nutime                84.072\nstime                496.014\nnvcsw                6863881        # 89.83%\nnivcsw               776946         # 10.17%\ninblock              0              # 0.00\/sec\nonblock              33562496       # 35389.88\/sec\ncpu-clock            565310791203   # 565.311 seconds\ntask-clock           572699669326   # 572.700 seconds\npage faults          312550         # 545.749\/sec\ncontext switches     7644694        # 13348.522\/sec\ncpu migrations       1472792        # 2571.666\/sec\nmajor page faults    14             # 0.024\/sec\nminor page faults    312536         # 545.724\/sec\nalignment faults     0              # 0.000\/sec\nemulation faults     0              # 0.000\/sec\nbranches             61483148417    # 177.932 branches per 1000 inst\nbranch misses        1319964232     # 2.15% branch miss\nconditional          61483187617    # 177.933 conditional branches per 1000 inst\nindirect             
11339675159    # 32.817 indirect branches per 1000 inst\nslots                1072578969920  #\nretiring             226365385224   # 21.1% (21.1%)\n-- ucode             37992131912    #     3.5%\n-- fastpath          188373253312   #    17.6%\nfrontend             505370446250   # 47.1% (47.1%) high\n-- latency           364473530709   #    34.0%\n-- bandwidth         140896915541   #    13.1%\nbackend              267722977027   # 25.0% (25.0%)\n-- cpu               129424001415   #    12.1%\n-- memory            138298975612   #    12.9%\nspeculation          101927658915   #  9.5% ( 9.5%)\n-- branch mispredict 96582518582    #     9.0%\n-- pipeline restart  5345140333     #     0.5%\nsmt-contention       0              #  0.0% ( 0.0%)\ncpu-cycles           364913872444   # 0.02 GHz\ninstructions         390403751528   # 1.07 IPC\nl2 access            20396866754    # 92.720 l2 access per 1000 inst\nl2 miss              5816501509     # 28.52% l2 miss\n<\/code><\/pre>\n\n\n\n<p>Process overview shows the test overhead is almost as much user time as the workload, though there is a much higher amount of system time. 
Interesting to drill deeper to see where that system time goes.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>1205 processes\n\t372 sqlite3                 25.15   187.90\n\t 68 clinfo                  17.18     8.65\n\t 38 vulkaninfo               1.71     0.95\n\t  4 vulkani:disk$0           0.18     0.10\n\t  6 glxinfo:gdrv0            0.16     0.03\n\t  6 glxinfo:gl0              0.16     0.03\n\t  6 php                      0.13     0.12\n\t  2 llvmpipe-0               0.09     0.05\n\t  2 llvmpipe-1               0.09     0.05\n\t  2 llvmpipe-10              0.09     0.05\n\t  2 llvmpipe-11              0.09     0.05\n\t  2 llvmpipe-12              0.09     0.05\n\t  2 llvmpipe-13              0.09     0.05\n\t  2 llvmpipe-14              0.09     0.05\n\t  2 llvmpipe-15              0.09     0.05\n\t  2 llvmpipe-2               0.09     0.05\n\t  2 llvmpipe-3               0.09     0.05\n\t  2 llvmpipe-4               0.09     0.05\n\t  2 llvmpipe-5               0.09     0.05\n\t  2 llvmpipe-6               0.09     0.05\n\t  2 llvmpipe-7               0.09     0.05\n\t  2 llvmpipe-8               0.09     0.05\n\t  2 llvmpipe-9               0.09     0.05\n\t  2 glxinfo                  0.09     0.01\n\t  2 glxinfo:cs0              0.08     0.01\n\t  2 glxinfo:disk$0           0.08     0.01\n\t  2 glxinfo:sh0              0.08     0.01\n\t  2 glxinfo:shlo0            0.08     0.01\n\t  6 clang                    0.07     0.05\n\t  3 rocminfo                 0.03     0.00\n\t  1 lspci                    0.01     0.02\n\t  1 ps                       0.00     0.01\n\t292 cat                      0.00     0.00\n\t111 sh                       0.00     0.00\n\t108 sqlite-benchmar          0.00     0.00\n\t 20 bash                     0.00     0.00\n\t 20 rm                       0.00     0.00\n\t 15 seq                      0.00     0.00\n\t 13 gcc                      0.00     0.00\n\t  9 gsettings                0.00     0.00\n\t  9 stat                     
0.00     0.00\n\t  8 systemd-detect-          0.00     0.00\n\t  6 llvm-link                0.00     0.00\n\t  5 phoronix-test-s          0.00     0.00\n\t  4 gmain                    0.00     0.00\n\t  3 dconf worker             0.00     0.00\n\t  2 cc                       0.00     0.00\n\t  2 lscpu                    0.00     0.00\n\t  2 uname                    0.00     0.00\n\t  2 which                    0.00     0.00\n\t  2 xset                     0.00     0.00\n\t  1 date                     0.00     0.00\n\t  1 dirname                  0.00     0.00\n\t  1 dmesg                    0.00     0.00\n\t  1 dmidecode                0.00     0.00\n\t  1 grep                     0.00     0.00\n\t  1 ifconfig                 0.00     0.00\n\t  1 ip                       0.00     0.00\n\t  1 lsmod                    0.00     0.00\n\t  1 mktemp                   0.00     0.00\n\t  1 qdbus                    0.00     0.00\n\t  1 readlink                 0.00     0.00\n\t  1 realpath                 0.00     0.00\n\t  1 sed                      0.00     0.00\n\t  1 sort                     0.00     0.00\n\t  1 stty                     0.00     0.00\n\t  1 systemctl                0.00     0.00\n\t  1 template.sh              0.00     0.00\n\t  1 wc                       0.00     0.00\n\t  1 xrandr                   0.00     0.00\n0 processes running\n58 maximum processes\n<\/code><\/pre>\n\n\n\n<p>Computation blocks are as follows<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>      20404) sqlite-benchmar  cpu=8 start=5.57  finish=13.19\n        20405) cat              cpu=1 start=5.58  finish=5.58 \n        20406) seq              cpu=2 start=5.58  finish=5.58 \n        20407) sqlite-benchmar  cpu=4 start=5.58  finish=13.19\n          20408) sqlite3          cpu=13 start=5.58  finish=5.59 \n          20409) cat              cpu=6 start=5.59  finish=7.45 \n          20410) sqlite3          cpu=10 start=5.59  finish=8.07 \n          20411) cat              cpu=13 
start=8.07  finish=10.01\n          20412) sqlite3          cpu=12 start=8.07  finish=10.63\n          20413) cat              cpu=13 start=10.63 finish=12.57\n          20414) sqlite3          cpu=14 start=10.63 finish=13.19\n<\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>Benchmarking the sqlite database with five workloads that vary the number of threads operating from 1 to 16 in powers of 2. The number of runnable processes only gets to five below. Topdown profile shows a workload dominated by frontend <span class=\"excerpt-dots\">&hellip;<\/span> <a class=\"more-link\" href=\"https:\/\/mvermeulen.org\/perf\/workloads\/phoronix\/sqlite\/\"><span class=\"more-msg\">Continue reading &rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"parent":58,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-1229","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/1229","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/comments?post=1229"}],"version-history":[{"count":1,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/1229\/revisions"}],"predecessor-version":[{"id":1232,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/1229\/revisions\/1232"}],"up":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/58"}],"wp:attachment":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/media?parent=1229"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}