{"id":1853,"date":"2024-02-29T19:00:22","date_gmt":"2024-02-29T19:00:22","guid":{"rendered":"https:\/\/mvermeulen.org\/perf\/?page_id=1853"},"modified":"2024-03-01T02:13:23","modified_gmt":"2024-03-01T02:13:23","slug":"gnuradio","status":"publish","type":"page","link":"https:\/\/mvermeulen.org\/perf\/workloads\/phoronix\/gnuradio\/","title":{"rendered":"gnuradio"},"content":{"rendered":"\n<p>Signal processing blocks for software-defined radio. There is one result but it reports on six subtests. The later ones look single-threaded, while the earlier ones have more multi-threaded aspects.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/03\/systemtime-3.png\" alt=\"\" class=\"wp-image-1872\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/03\/systemtime-3.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/03\/systemtime-3-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/03\/systemtime-3-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>Topdown profile shows variation among the subtests, with a few backend bound and a few with high retirement rates.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/03\/amdtopdown-3.png\" alt=\"\" class=\"wp-image-1874\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/03\/amdtopdown-3.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/03\/amdtopdown-3-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/03\/amdtopdown-3-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" 
\/><\/figure>\n\n\n\n<p>AMD metrics are a composite. This is floating point code with moderate retirement rate.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed              1097.637\non_cpu               0.121          # 1.93 \/ 16 cores\nutime                1562.452\nstime                558.792\nnvcsw                89602927       # 99.99%\nnivcsw               4545           # 0.01%\ninblock              0              # 0.00\/sec\nonblock              13920          # 12.68\/sec\ncpu-clock            2095272795070  # 2095.273 seconds\ntask-clock           2114354833498  # 2114.355 seconds\npage faults          170110         # 80.455\/sec\ncontext switches     89612779       # 42383.037\/sec\ncpu migrations       2779           # 1.314\/sec\nmajor page faults    21             # 0.010\/sec\nminor page faults    170089         # 80.445\/sec\nalignment faults     0              # 0.000\/sec\nemulation faults     0              # 0.000\/sec\nbranches             1845636929664  # 82.601 branches per 1000 inst\nbranch misses        21723197141    # 1.18% branch miss\nconditional          1537264132256  # 68.800 conditional branches per 1000 inst\nindirect             57598335588    # 2.578 indirect branches per 1000 inst\ncpu-cycles           8302100851453  # 0.47 GHz\ninstructions         23094745269927 # 2.78 IPC\nslots                16383275123232 #\nretiring             7492118350882  # 45.7% (45.9%)\n-- ucode             15744683433    #     0.1%\n-- fastpath          7476373667449  #    45.6%\nfrontend             3012954823010  # 18.4% (18.5%)\n-- latency           1575453553086  #     9.6%\n-- bandwidth         1437501269924  #     8.8%\nbackend              5747613614802  # 35.1% (35.2%)\n-- cpu               3038521545725  #    18.5%\n-- memory            2709092069077  #    16.5%\nspeculation          74801905916    #  0.5% ( 0.5%) low\n-- branch mispredict 72868715607    #     0.4%\n-- pipeline restart  1933190309     #     0.0%\nsmt-contention  
     55386719908    #  0.3% ( 0.0%)\ncpu-cycles           8319796035258  # 0.47 GHz\ninstructions         23115058974537 # 2.78 IPC\ninstructions         7616531900660  # 31.420 l2 access per 1000 inst\nl2 hit from l1       177670026785   # 7.21% l2 miss\nl2 miss from l1      11066886113    #\nl2 hit from l2 pf    55458519661    #\nl3 hit from l2 pf    6175744674     #\nl3 miss from l2 pf   6063535        #\ninstructions         7592977884179  # 211.400 float per 1000 inst\nfloat 512            67             # 0.000 AVX-512 per 1000 inst\nfloat 256            388            # 0.000 AVX-256 per 1000 inst\nfloat 128            1605156885414  # 211.400 AVX-128 per 1000 inst\nfloat MMX            0              # 0.000 MMX per 1000 inst\nfloat scalar         37             # 0.000 scalar per 1000 inst\ninstructions         22861285368694 #\nopcache              2139250028244  # 93.575 opcache per 1000 inst\nopcache miss         624186820575   # 29.2% opcache miss rate\nl1 dTLB miss         7418045926     # 0.324 L1 dTLB per 1000 inst\nl2 dTLB miss         241724207      # 0.011 L2 dTLB per 1000 inst\ninstructions         22559155812921 #\nicache               878817198627   # 38.956 icache per 1000 inst\nicache miss          131715842092   # 15.0% icache miss rate\nl1 iTLB miss         1830144713     # 0.081 L1 iTLB per 1000 inst\nl2 iTLB miss         0              # 0.000 L2 iTLB per 1000 inst\ntlb flush            28298          # 0.000 TLB flush per 1000 inst\n<\/code><\/pre>\n\n\n\n<p>Intel metrics show this fits in L3.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed              1096.987\non_cpu               0.101          # 1.62 \/ 16 cores\nutime                1506.299\nstime                270.634\nnvcsw                84433775       # 99.95%\nnivcsw               41809          # 0.05%\ninblock              36576          # 33.34\/sec\nonblock              2664           # 2.43\/sec\ncpu-clock            1739532062505  # 1739.532 seconds\ntask-clock 
          1748408579524  # 1748.409 seconds\npage faults          165661         # 94.750\/sec\ncontext switches     84480901       # 48318.741\/sec\ncpu migrations       98036          # 56.072\/sec\nmajor page faults    228            # 0.130\/sec\nminor page faults    165433         # 94.619\/sec\nalignment faults     0              # 0.000\/sec\nemulation faults     0              # 0.000\/sec\nbranches             1578650529620  # 83.347 branches per 1000 inst\nbranch misses        1664132624     # 0.11% branch miss\nconditional          1578650549844  # 83.347 conditional branches per 1000 inst\nindirect             112747402928   # 5.953 indirect branches per 1000 inst\nslots                40609276691846 #\nretiring             22874172887876 # 56.3% (56.3%) high\n-- ucode             1613706177400  #     4.0%\n-- fastpath          21260466710476 #    52.4%\nfrontend             4677518049216  # 11.5% (11.5%)\n-- latency           2149216887370  #     5.3%\n-- bandwidth         2528301161846  #     6.2%\nbackend              12344914421614 # 30.4% (30.4%)\n-- cpu               7984307851491  #    19.7%\n-- memory            4360606570123  #    10.7%\nspeculation          698457345576   #  1.7% ( 1.7%)\n-- branch mispredict 617169797578   #     1.5%\n-- pipeline restart  81287547998    #     0.2%\nsmt-contention       0              #  0.0% ( 0.0%)\ncpu-cycles           5871574534741  # 0.33 GHz\ninstructions         20031105148449 # 3.41 IPC high\nl2 access            179866949012   # 10.334 l2 access per 1000 inst\nl2 miss              20298180768    # 11.29% l2 miss\ncpu-cycles           5168426575314  # 17.0% memory latency\nload stalls          842343722814   #  7.7% l1 bound\nl1 miss              444045056065   #  4.7% l2 bound\nl2 miss              199180657463   #  3.8% l3 bound\nl3 miss              291145344      #  0.0% dram bound\nstore_stalls         37279648653    #  0.7% store bound\n<\/code><\/pre>\n\n\n\n<p>Process overview shows python code 
with subprocesses for different tests.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>634 processes\n\t 78 python3              45334.92  8184.34\n\t 21 fft_filter_ccf1       4164.37   662.14\n\t  6 iir_filter_ffd6       2949.47   557.66\n\t  6 iir_filter_ccf5       2688.09   518.63\n\t 12 fft_filter_ccf2       2602.84   411.60\n\t  6 fft_filter_fff3       2165.59   449.15\n\t  3 hilbert_fc82          1766.19   318.90\n\t  3 null_source81         1766.19   318.90\n\t  3 probe_rate83          1766.19   318.90\n\t  3 hilbert_fc78          1702.13   310.71\n\t  3 null_source77         1702.13   310.71\n\t  3 probe_rate79          1702.13   310.71\n\t  3 hilbert_fc74          1637.99   302.50\n\t  3 null_source73         1637.99   302.50\n\t  3 probe_rate75          1637.99   302.50\n\t  3 iir_filter_ffd7       1573.31   294.79\n\t  3 null_source68         1573.31   294.79\n\t  3 probe_rate71          1573.31   294.79\n\t  3 null_source63         1507.53   284.26\n\t  3 probe_rate66          1507.53   284.26\n\t  3 null_source58         1441.94   273.40\n\t  3 probe_rate61          1441.94   273.40\n\t  3 null_source54         1376.21   262.97\n\t  3 probe_rate56          1376.21   262.97\n\t  3 null_source50         1311.88   255.66\n\t  3 probe_rate52          1311.88   255.66\n\t  3 iir_filter_ccf4       1247.54   248.43\n\t  3 null_source46         1247.54   248.43\n\t  3 probe_rate48          1247.54   248.43\n\t  3 fft_filter_fff4       1183.22   241.11\n\t  3 null_source42         1183.22   241.11\n\t  3 probe_rate44          1183.22   241.11\n\t  3 null_source38         1116.25   230.21\n\t  3 probe_rate40          1116.25   230.21\n\t  3 null_source34         1049.34   218.94\n\t  3 probe_rate36          1049.34   218.94\n\t  3 probe_rate32           982.48   207.45\n\t  3 sig_source31           982.48   207.45\n\t  3 probe_rate29           915.38   179.32\n\t  3 sig_source28           915.38   179.32\n\t  3 probe_rate26           848.06   150.99\n\t  3 
sig_source25           848.06   150.99\n\t  3 null_source17          780.91   122.77\n\t  3 probe_rate23           780.91   122.77\n\t  3 null_source9           520.51    83.32\n\t  3 probe_rate15           520.51    83.32\n\t  3 fft_filter_ccf3        260.11    43.29\n\t  3 fft_filter_ccf4        260.11    43.29\n\t  3 fft_filter_ccf5        260.11    43.29\n\t  3 fft_filter_ccf6        260.11    43.29\n\t  3 null_source1           260.11    43.29\n\t  3 probe_rate7            260.11    43.29\n\t 68 clinfo                  16.86     5.33\n\t 38 vulkaninfo               0.96     1.33\n\t  4 vulkani:disk$0           0.10     0.14\n\t  6 clang                    0.09     0.02\n\t  6 php                      0.06     0.14\n\t  2 llvmpipe-0               0.05     0.07\n\t  2 llvmpipe-1               0.05     0.07\n\t  2 llvmpipe-10              0.05     0.07\n\t  2 llvmpipe-11              0.05     0.07\n\t  2 llvmpipe-12              0.05     0.07\n\t  2 llvmpipe-13              0.05     0.07\n\t  2 llvmpipe-14              0.05     0.07\n\t  2 llvmpipe-15              0.05     0.07\n\t  2 llvmpipe-2               0.05     0.07\n\t  2 llvmpipe-3               0.05     0.07\n\t  2 llvmpipe-4               0.05     0.07\n\t  2 llvmpipe-5               0.05     0.07\n\t  2 llvmpipe-6               0.05     0.07\n\t  2 llvmpipe-7               0.05     0.07\n\t  2 llvmpipe-8               0.05     0.07\n\t  2 llvmpipe-9               0.05     0.07\n\t  1 lspci                    0.00     0.01\n\t 84 sh                       0.00     0.00\n\t 15 gnuradio-config          0.00     0.00\n\t 12 gcc                      0.00     0.00\n\t  9 gsettings                0.00     0.00\n\t  8 stat                     0.00     0.00\n\t  8 systemd-detect-          0.00     0.00\n\t  6 llvm-link                0.00     0.00\n\t  5 phoronix-test-s          0.00     0.00\n\t  4 glxinfo                  0.00     0.00\n\t  4 gmain                    0.00     0.00\n\t  3 dconf worker          
   0.00     0.00\n\t  3 gnuradio                 0.00     0.00\n\t  3 rocminfo                 0.00     0.00\n\t  2 lscpu                    0.00     0.00\n\t  2 setterm                  0.00     0.00\n\t  2 uname                    0.00     0.00\n\t  2 which                    0.00     0.00\n\t  1 cc                       0.00     0.00\n\t  1 date                     0.00     0.00\n\t  1 dirname                  0.00     0.00\n\t  1 dmesg                    0.00     0.00\n\t  1 dmidecode                0.00     0.00\n\t  1 grep                     0.00     0.00\n\t  1 ifconfig                 0.00     0.00\n\t  1 ip                       0.00     0.00\n\t  1 lsmod                    0.00     0.00\n\t  1 mktemp                   0.00     0.00\n\t  1 ps                       0.00     0.00\n\t  1 python                   0.00     0.00\n\t  1 qdbus                    0.00     0.00\n\t  1 readlink                 0.00     0.00\n\t  1 realpath                 0.00     0.00\n\t  1 sed                      0.00     0.00\n\t  1 sort                     0.00     0.00\n\t  1 stty                     0.00     0.00\n\t  1 systemctl                0.00     0.00\n\t  1 template.sh              0.00     0.00\n\t  1 wc                       0.00     0.00\n19 processes running\n66 maximum processes\n<\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>Signal processing blocks for software-defined radio. There is one result but it reports on six subtests. The later ones look single-threaded, while the earlier ones have more multi-threaded aspects. 
Topdown profile shows variation among the subtests, with a few backend bound <span class=\"excerpt-dots\">&hellip;<\/span> <a class=\"more-link\" href=\"https:\/\/mvermeulen.org\/perf\/workloads\/phoronix\/gnuradio\/\"><span class=\"more-msg\">Continue reading &rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"parent":58,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-1853","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/1853","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/comments?post=1853"}],"version-history":[{"count":2,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/1853\/revisions"}],"predecessor-version":[{"id":1875,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/1853\/revisions\/1875"}],"up":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/58"}],"wp:attachment":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/media?parent=1853"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}