{"id":562,"date":"2024-01-14T20:09:57","date_gmt":"2024-01-14T20:09:57","guid":{"rendered":"https:\/\/mvermeulen.org\/perf\/?page_id=562"},"modified":"2024-01-14T20:09:58","modified_gmt":"2024-01-14T20:09:58","slug":"liquid-dsp","status":"publish","type":"page","link":"https:\/\/mvermeulen.org\/perf\/workloads\/phoronix\/liquid-dsp\/","title":{"rendered":"liquid-dsp"},"content":{"rendered":"\n<p>liquid-dsp is a software-defined radio signal processing library. The benchmark has 15 workloads that vary parameters, including the number of threads started, which is apparent in the progression below.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/systemtime-17.png\" alt=\"\" class=\"wp-image-563\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/systemtime-17.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/systemtime-17-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/systemtime-17-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>Topdown metrics show a low overall level of frontend stalls, though they appear to spike to higher levels periodically. 
Otherwise, backend stalls tend to be the largest contributor.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-55.png\" alt=\"\" class=\"wp-image-564\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-55.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-55-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/01\/amdtopdown-55-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>AMD metrics show a moderate number of indirect branches as a percentage of overall branches (about 20%) and a relatively low L2 access rate (about 2.3 accesses per 1000 instructions).<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed              1697.119\non_cpu               0.326          # 5.22 \/ 16 cores\nutime                8856.685\nstime                1.784\nnvcsw                16842          # 25.40%\nnivcsw               49461          # 74.60%\ninblock              8              # 0.00\/sec\nonblock              17368          # 10.23\/sec\ncpu-clock            8859043109277  # 8859.043 seconds\ntask-clock           8859086569027  # 8859.087 seconds\npage faults          189820         # 21.427\/sec\ncontext switches     74498          # 8.409\/sec\ncpu migrations       2162           # 0.244\/sec\nmajor page faults    2              # 0.000\/sec\nminor page faults    189818         # 21.426\/sec\nalignment faults     0              # 0.000\/sec\nemulation faults     0              # 0.000\/sec\nbranches             8939620743102  # 146.505 branches per 1000 inst\nbranch misses        4916139852     # 0.05% branch miss\nconditional          4087186678285  # 66.982 conditional branches per 1000 inst\nindirect             1816700409580  # 29.773 indirect branches per 1000 inst\ncpu-cycles           43906190865863 # 1.12 
GHz\ninstructions         77386138698388 # 1.76 IPC\nslots                87820253971404 #\nretiring             26966840935029 # 30.7% (34.3%)\n-- ucode             60406107       #     0.0%\n-- fastpath          26966780528922 #    30.7%\nfrontend             655509839567   #  0.7% ( 0.8%)\n-- latency           442054892988   #     0.5%\n-- bandwidth         213454946579   #     0.2%\nbackend              50285143305180 # 57.3% (64.0%)\n-- cpu               24928897941717 #    28.4%\n-- memory            25356245363463 #    28.9%\nspeculation          720398682627   #  0.8% ( 0.9%)\n-- branch mispredict 395658212360   #     0.5%\n-- pipeline restart  324740470267   #     0.4%\nsmt-contention       9192280268536  # 10.5% ( 0.0%)\ncpu-cycles           35561747977725 # 1.37 GHz\ninstructions         56362778341513 # 1.58 IPC\ninstructions         18791619171629 # 2.312 l2 access per 1000 inst\nl2 hit from l1       32872153754    # 6.47% l2 miss\nl2 miss from l1      2333406782     #\nl2 hit from l2 pf    10093658356    #\nl3 hit from l2 pf    471559932      #\nl3 miss from l2 pf   5473831        #\ninstructions         18779521772018 # 176.946 float per 1000 inst\nfloat 512            95             # 0.000 AVX-512 per 1000 inst\nfloat 256            184530991347   # 9.826 AVX-256 per 1000 inst\nfloat 128            3138425566159  # 167.120 AVX-128 per 1000 inst\nfloat MMX            0              # 0.000 MMX per 1000 inst\nfloat scalar         0              # 0.000 scalar per 1000 inst\n<\/code><\/pre>\n\n\n\n<p>Intel metrics<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed              2101.806\non_cpu               0.449          # 7.18 \/ 16 cores\nutime                15099.025\nstime                1.385\nnvcsw                20978          # 15.89%\nnivcsw               111064         # 84.11%\ninblock              19432          # 9.25\/sec\nonblock              6256           # 2.98\/sec\ncpu-clock            15101217180305 # 15101.217 
seconds\ntask-clock           15101261875302 # 15101.262 seconds\npage faults          180657         # 11.963\/sec\ncontext switches     142233         # 9.419\/sec\ncpu migrations       3874           # 0.257\/sec\nmajor page faults    116            # 0.008\/sec\nminor page faults    180541         # 11.955\/sec\nalignment faults     0              # 0.000\/sec\nemulation faults     0              # 0.000\/sec\nbranches             14706391828425 # 142.843 branches per 1000 inst\nbranch misses        14208269354    # 0.10% branch miss\nconditional          14706391860937 # 142.843 conditional branches per 1000 inst\nindirect             6299830739447  # 61.190 indirect branches per 1000 inst\nslots                115697040400052 #\nretiring             69201998292968 # 59.8% (59.8%)\n-- ucode             8257132014881  #     7.1%\n-- fastpath          60944866278087 #    52.7%\nfrontend             18277382966992 # 15.8% (15.8%)\n-- latency           9902669823664  #     8.6%\n-- bandwidth         8374713143328  #     7.2%\nbackend              27497778361275 # 23.8% (23.8%)\n-- cpu               19048599236733 #    16.5%\n-- memory            8449179124542  #     7.3%\nspeculation          749205159374   #  0.6% ( 0.6%)\n-- branch mispredict 571390197369   #     0.5%\n-- pipeline restart  177814962005   #     0.2%\nsmt-contention       0              #  0.0% ( 0.0%)\ncpu-cycles           27865924741031 # 0.84 GHz\ninstructions         65644794602005 # 2.36 IPC\nl2 access            4235697429     # 0.065 l2 access per 1000 inst\nl2 miss              3330682985     # 78.63% l2 miss\n<\/code><\/pre>\n\n\n\n<p>Process overview shows benchmark_threa is the primary resource user.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>796 processes\n\t366 benchmark_threa      102837.37     3.75\n\t 68 clinfo                  16.20     6.66\n\t 38 vulkaninfo               1.14     1.14\n\t  6 php                      0.15     0.48\n\t  6 glxinfo:gdrv0            0.15     
0.07\n\t  4 vulkani:disk$0           0.12     0.12\n\t  2 glxinfo                  0.07     0.03\n\t  2 glxinfo:cs0              0.07     0.03\n\t  2 glxinfo:disk$0           0.07     0.03\n\t  2 glxinfo:sh0              0.07     0.03\n\t  2 glxinfo:shlo0            0.07     0.03\n\t  2 llvmpipe-0               0.06     0.06\n\t  2 llvmpipe-1               0.06     0.06\n\t  2 llvmpipe-10              0.06     0.06\n\t  2 llvmpipe-11              0.06     0.06\n\t  2 llvmpipe-12              0.06     0.06\n\t  2 llvmpipe-13              0.06     0.06\n\t  2 llvmpipe-14              0.06     0.06\n\t  2 llvmpipe-15              0.06     0.06\n\t  2 llvmpipe-2               0.06     0.06\n\t  2 llvmpipe-3               0.06     0.06\n\t  2 llvmpipe-4               0.06     0.06\n\t  2 llvmpipe-5               0.06     0.06\n\t  2 llvmpipe-6               0.06     0.06\n\t  2 llvmpipe-7               0.06     0.06\n\t  2 llvmpipe-8               0.06     0.06\n\t  2 llvmpipe-9               0.06     0.06\n\t  6 clang                    0.03     0.09\n\t  3 rocminfo                 0.00     0.03\n\t  1 lspci                    0.00     0.02\n\t110 sh                       0.00     0.00\n\t 60 liquid-dsp               0.00     0.00\n\t 13 gcc                      0.00     0.00\n\t  8 gsettings                0.00     0.00\n\t  8 stat                     0.00     0.00\n\t  8 systemd-detect-          0.00     0.00\n\t  6 llvm-link                0.00     0.00\n\t  5 gmain                    0.00     0.00\n\t  5 phoronix-test-s          0.00     0.00\n\t  3 dconf worker             0.00     0.00\n\t  2 cc                       0.00     0.00\n\t  2 lscpu                    0.00     0.00\n\t  2 uname                    0.00     0.00\n\t  2 which                    0.00     0.00\n\t  2 xset                     0.00     0.00\n\t  1 date                     0.00     0.00\n\t  1 dirname                  0.00     0.00\n\t  1 dmesg                    0.00     0.00\n\t  1 dmidecode 
               0.00     0.00\n\t  1 grep                     0.00     0.00\n\t  1 ifconfig                 0.00     0.00\n\t  1 ip                       0.00     0.00\n\t  1 lsmod                    0.00     0.00\n\t  1 mktemp                   0.00     0.00\n\t  1 ps                       0.00     0.00\n\t  1 qdbus                    0.00     0.00\n\t  1 readlink                 0.00     0.00\n\t  1 realpath                 0.00     0.00\n\t  1 sed                      0.00     0.00\n\t  1 sort                     0.00     0.00\n\t  1 stty                     0.00     0.00\n\t  1 systemctl                0.00     0.00\n\t  1 template.sh              0.00     0.00\n\t  1 wc                       0.00     0.00\n\t  1 xrandr                   0.00     0.00\n0 processes running\n47 maximum processes\n<\/code><\/pre>\n\n\n\n<p>Computation blocks<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>      255011) liquid-dsp       cpu=5 start=5.85  finish=35.89\n        255012) benchmark_threa  cpu=5 start=5.86  finish=35.89\n          255013) benchmark_threa  cpu=6 start=5.86  finish=35.89\n      255018) liquid-dsp       cpu=12 start=39.89 finish=69.93\n        255019) benchmark_threa  cpu=5 start=39.90 finish=69.93\n          255020) benchmark_threa  cpu=14 start=39.90 finish=69.93\n      255022) liquid-dsp       cpu=4 start=73.94 finish=103.97\n        255023) benchmark_threa  cpu=5 start=73.94 finish=103.97\n          255024) benchmark_threa  cpu=14 start=73.94 finish=103.97\n      255025) sh               cpu=11 start=103.97 finish=103.97\n        255026) sh               cpu=5 start=103.97 finish=103.97\n      255027) liquid-dsp       cpu=0 start=114.16 finish=144.20\n        255028) benchmark_threa  cpu=1 start=114.16 finish=144.20\n          255029) benchmark_threa  cpu=11 start=114.17 finish=144.20\n      255030) liquid-dsp       cpu=8 start=148.20 finish=178.24\n        255031) benchmark_threa  cpu=8 start=148.20 finish=178.24\n          255032) benchmark_threa  cpu=3 
start=148.21 finish=178.24\n      255033) liquid-dsp       cpu=8 start=182.24 finish=212.28\n        255034) benchmark_threa  cpu=1 start=182.25 finish=212.28\n          255035) benchmark_threa  cpu=3 start=182.25 finish=212.28\n      255036) liquid-dsp       cpu=8 start=216.28 finish=246.32\n        255037) benchmark_threa  cpu=1 start=216.28 finish=246.32\n          255038) benchmark_threa  cpu=3 start=216.29 finish=246.32\n      255042) liquid-dsp       cpu=8 start=250.32 finish=280.36\n        255043) benchmark_threa  cpu=1 start=250.32 finish=280.36\n          255044) benchmark_threa  cpu=11 start=250.33 finish=280.36\n      255075) liquid-dsp       cpu=8 start=284.36 finish=314.40\n        255076) benchmark_threa  cpu=1 start=284.37 finish=314.40\n          255077) benchmark_threa  cpu=3 start=284.37 finish=314.40\n      255078) sh               cpu=5 start=314.40 finish=314.40\n        255079) sh               cpu=14 start=314.40 finish=314.40\n      255080) liquid-dsp       cpu=5 start=324.68 finish=354.79\n        255081) benchmark_threa  cpu=14 start=324.68 finish=354.79\n          255082) benchmark_threa  cpu=0 start=324.69 finish=354.79\n          255083) benchmark_threa  cpu=1 start=324.69 finish=354.79\n      255084) liquid-dsp       cpu=11 start=358.79 finish=388.86\n        255085) benchmark_threa  cpu=5 start=358.80 finish=388.86\n          255086) benchmark_threa  cpu=14 start=358.80 finish=388.86\n          255087) benchmark_threa  cpu=15 start=358.80 finish=388.86\n      255088) liquid-dsp       cpu=0 start=392.86 finish=422.93\n        255089) benchmark_threa  cpu=9 start=392.86 finish=422.93\n          255090) benchmark_threa  cpu=11 start=392.87 finish=422.93\n          255091) benchmark_threa  cpu=4 start=392.87 finish=422.93\n      255092) sh               cpu=0 start=422.93 finish=422.93\n        255093) sh               cpu=9 start=422.93 finish=422.93\n<\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>This is a 
software-defined radio signal processing library. There are 15 workloads trying variations including how many threads are started, apparent in the progression below. Topdown metrics show a low overall amount of frontend stalls, though it also looks like <span class=\"excerpt-dots\">&hellip;<\/span> <a class=\"more-link\" href=\"https:\/\/mvermeulen.org\/perf\/workloads\/phoronix\/liquid-dsp\/\"><span class=\"more-msg\">Continue reading &rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"parent":58,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-562","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/562","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/comments?post=562"}],"version-history":[{"count":1,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/562\/revisions"}],"predecessor-version":[{"id":565,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/562\/revisions\/565"}],"up":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/58"}],"wp:attachment":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/media?parent=562"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}