{"id":2192,"date":"2024-03-24T20:23:08","date_gmt":"2024-03-24T20:23:08","guid":{"rendered":"https:\/\/mvermeulen.org\/perf\/?page_id=2192"},"modified":"2024-03-27T12:35:50","modified_gmt":"2024-03-27T12:35:50","slug":"pjsip","status":"publish","type":"page","link":"https:\/\/mvermeulen.org\/perf\/workloads\/phoronix\/pjsip\/","title":{"rendered":"pjsip"},"content":{"rendered":"\n<p>A multimedia communication library. There are three tests. Overall, not all cores are used.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/03\/systemtime-46.png\" alt=\"\" class=\"wp-image-2224\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/03\/systemtime-46.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/03\/systemtime-46-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/03\/systemtime-46-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>The topdown profile shows patterns of increasing backend stalls for the first two workloads, while the third test is instead more front-end dominated.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/03\/amdtopdown-48.png\" alt=\"\" class=\"wp-image-2222\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/03\/amdtopdown-48.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/03\/amdtopdown-48-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/03\/amdtopdown-48-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>AMD metrics confirm ~3 cores used. 
This has a low amount of floating point and some L2 access\/miss. Retirement rate is low.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed              872.680\non_cpu               0.181          # 2.90 \/ 16 cores\nutime                1471.931\nstime                1054.997\nnvcsw                104739970      # 99.99%\nnivcsw               13391          # 0.01%\ninblock              0              # 0.00\/sec\nonblock              19360          # 22.18\/sec\ncpu-clock            2447341449035  # 2447.341 seconds\ntask-clock           2480102291249  # 2480.102 seconds\npage faults          14806187       # 5969.990\/sec\ncontext switches     104757512      # 42239.190\/sec\ncpu migrations       205635         # 82.914\/sec\nmajor page faults    2              # 0.001\/sec\nminor page faults    14806185       # 5969.990\/sec\nalignment faults     0              # 0.000\/sec\nemulation faults     0              # 0.000\/sec\nbranches             1128582198076  # 213.850 branches per 1000 inst\nbranch misses        47870033453    # 4.24% branch miss\nconditional          693597720831   # 131.427 conditional branches per 1000 inst\nindirect             48446812498    # 9.180 indirect branches per 1000 inst\ncpu-cycles           4979308073975  # 0.60 GHz\ninstructions         3332596284211  # 0.67 IPC low\nslots                9590503738338  #\nretiring             1157720464051  # 12.1% (13.4%) low\n-- ucode             5521946715     #     0.1%\n-- fastpath          1152198517336  #    12.0%\nfrontend             3682466302037  # 38.4% (42.5%)\n-- latency           2964712224432  #    30.9%\n-- bandwidth         717754077605   #     7.5%\nbackend              3732946857573  # 38.9% (43.1%)\n-- cpu               287450185120   #     3.0%\n-- memory            3445496672453  #    35.9%\nspeculation          90555733372    #  0.9% ( 1.0%)\n-- branch mispredict 89531846934    #     0.9%\n-- pipeline restart  1023886438     #     0.0%\nsmt-contention       
920983823302   #  9.6% ( 0.0%)\ncpu-cycles           4974037708645  # 0.60 GHz\ninstructions         3337759127969  # 0.67 IPC low\ninstructions         1067429416158  # 77.346 l2 access per 1000 inst\nl2 hit from l1       73991456271    # 23.13% l2 miss\nl2 miss from l1      13819645625    #\nl2 hit from l2 pf    3289759922     #\nl3 hit from l2 pf    3690946118     #\nl3 miss from l2 pf   1589510836     #\ninstructions         1062162717684  # 37.920 float per 1000 inst\nfloat 512            70             # 0.000 AVX-512 per 1000 inst\nfloat 256            376            # 0.000 AVX-256 per 1000 inst\nfloat 128            40277396596    # 37.920 AVX-128 per 1000 inst\nfloat MMX            0              # 0.000 MMX per 1000 inst\nfloat scalar         0              # 0.000 scalar per 1000 inst\ninstructions         3271752497547  #\nopcache              807851793032   # 246.917 opcache per 1000 inst\nopcache miss         364522750050   # 45.1% opcache miss rate\nl1 dTLB miss         18076511385    # 5.525 L1 dTLB per 1000 inst\nl2 dTLB miss         6912031728     # 2.113 L2 dTLB per 1000 inst\ninstructions         3241047487847  #\nicache               651603170877   # 201.047 icache per 1000 inst\nicache miss          151656213181   # 23.3% icache miss rate\nl1 iTLB miss         4500504772     # 1.389 L1 iTLB per 1000 inst\nl2 iTLB miss         0              # 0.000 L2 iTLB per 1000 inst\ntlb flush            27633          # 0.000 TLB flush per 1000 inst\n<\/code><\/pre>\n\n\n\n<p>Intel metrics show L2 misses (the L3-bound category) contributing most to memory stalls.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed              628.100\non_cpu               0.271          # 4.33 \/ 16 cores\nutime                1091.855\nstime                1629.278\nnvcsw                151030991      # 99.93%\nnivcsw               109554         # 0.07%\ninblock              11128          # 17.72\/sec\nonblock              6552           # 10.43\/sec\ncpu-clock            2583472462864  # 
2583.472 seconds\ntask-clock           2613444209838  # 2613.444 seconds\npage faults          5468293        # 2092.370\/sec\ncontext switches     151143522      # 57833.078\/sec\ncpu migrations       1129211        # 432.078\/sec\nmajor page faults    75             # 0.029\/sec\nminor page faults    5468218        # 2092.342\/sec\nalignment faults     0              # 0.000\/sec\nemulation faults     0              # 0.000\/sec\nbranches             648571518382   # 193.231 branches per 1000 inst\nbranch misses        6479755096     # 1.00% branch miss\nconditional          648571535662   # 193.231 conditional branches per 1000 inst\nindirect             184234923106   # 54.890 indirect branches per 1000 inst\nslots                9992331641624  #\nretiring             2316526686439  # 23.2% (23.2%)\n-- ucode             271324146085   #     2.7%\n-- fastpath          2045202540354  #    20.5%\nfrontend             3415663041163  # 34.2% (34.2%)\n-- latency           2120574401746  #    21.2%\n-- bandwidth         1295088639417  #    13.0%\nbackend              3818905787350  # 38.2% (38.2%)\n-- cpu               1985792510508  #    19.9%\n-- memory            1833113276842  #    18.3%\nspeculation          680472023318   #  6.8% ( 6.8%)\n-- branch mispredict 627083874564   #     6.3%\n-- pipeline restart  53388148754    #     0.5%\nsmt-contention       0              #  0.0% ( 0.0%)\ncpu-cycles           6930631634594  # 0.68 GHz\ninstructions         3663941525969  # 0.53 IPC low\nl2 access            120844116452   # 67.688 l2 access per 1000 inst\nl2 miss              32589491360    # 26.97% l2 miss\ncpu-cycles           3286115076954  # 40.3% memory latency\nload stalls          1313147926015  #  7.7% l1 bound\nl1 miss              1060648082367  #  6.0% l2 bound\nl2 miss              862603162359   # 23.5% l3 bound\nl3 miss              91443384427    #  2.8% dram bound\nstore_stalls         12488939402    #  0.4% store 
bound\n<\/code><\/pre>\n\n\n\n<p>Process overview shows pjsip-perf as the primary test process<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>768 processes\n\t304 pjsip-perf           31540.46 15983.96\n\t 68 clinfo                  19.84     5.99\n\t 38 vulkaninfo               1.71     1.33\n\t 80 threaded-ml              0.38     0.76\n\t  4 vulkani:disk$0           0.18     0.14\n\t  6 glxinfo:gdrv0            0.18     0.04\n\t  6 glxinfo:gl0              0.18     0.04\n\t  6 php                      0.09     0.17\n\t  2 llvmpipe-0               0.09     0.07\n\t  2 llvmpipe-1               0.09     0.07\n\t  2 llvmpipe-10              0.09     0.07\n\t  2 llvmpipe-11              0.09     0.07\n\t  2 llvmpipe-12              0.09     0.07\n\t  2 llvmpipe-13              0.09     0.07\n\t  2 llvmpipe-14              0.09     0.07\n\t  2 llvmpipe-15              0.09     0.07\n\t  2 llvmpipe-2               0.09     0.07\n\t  2 llvmpipe-3               0.09     0.07\n\t  2 llvmpipe-4               0.09     0.07\n\t  2 llvmpipe-5               0.09     0.07\n\t  2 llvmpipe-6               0.09     0.07\n\t  2 llvmpipe-7               0.09     0.07\n\t  2 llvmpipe-8               0.09     0.07\n\t  2 llvmpipe-9               0.09     0.07\n\t  2 glxinfo                  0.08     0.02\n\t  2 glxinfo:cs0              0.08     0.02\n\t  2 glxinfo:disk$0           0.08     0.02\n\t  2 glxinfo:sh0              0.08     0.02\n\t  2 glxinfo:shlo0            0.08     0.02\n\t  6 clang                    0.06     0.04\n\t  3 rocminfo                 0.06     0.00\n\t  1 lspci                    0.01     0.02\n\t 86 sh                       0.00     0.00\n\t 17 sed                      0.00     0.00\n\t 16 pjsip                    0.00     0.00\n\t 13 gcc                      0.00     0.00\n\t  9 gsettings                0.00     0.00\n\t  8 stat                     0.00     0.00\n\t  8 systemd-detect-          0.00     0.00\n\t  6 llvm-link                0.00     0.00\n\t 
 5 phoronix-test-s          0.00     0.00\n\t  4 gmain                    0.00     0.00\n\t  3 dconf worker             0.00     0.00\n\t  2 cc                       0.00     0.00\n\t  2 lscpu                    0.00     0.00\n\t  2 uname                    0.00     0.00\n\t  2 which                    0.00     0.00\n\t  2 xset                     0.00     0.00\n\t  1 date                     0.00     0.00\n\t  1 dirname                  0.00     0.00\n\t  1 dmesg                    0.00     0.00\n\t  1 dmidecode                0.00     0.00\n\t  1 grep                     0.00     0.00\n\t  1 ifconfig                 0.00     0.00\n\t  1 ip                       0.00     0.00\n\t  1 lsmod                    0.00     0.00\n\t  1 mktemp                   0.00     0.00\n\t  1 ps                       0.00     0.00\n\t  1 qdbus                    0.00     0.00\n\t  1 readlink                 0.00     0.00\n\t  1 realpath                 0.00     0.00\n\t  1 sort                     0.00     0.00\n\t  1 stty                     0.00     0.00\n\t  1 systemctl                0.00     0.00\n\t  1 template.sh              0.00     0.00\n\t  1 wc                       0.00     0.00\n\t  1 xrandr                   0.00     0.00\n0 processes running\n47 maximum processes\n<\/code><\/pre>\n\n\n\n<p>Computation blocks<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>      15666) pjsip            cpu=15 start=5.72  finish=71.82\n        15667) pjsip-perf       cpu=15 start=5.72  finish=71.76\n          15668) threaded-ml      cpu=9 start=5.74  finish=5.74 \n          15669) threaded-ml      cpu=13 start=5.74  finish=5.74 \n          15670) pjsip-perf       cpu=9 start=5.74  finish=5.74 \n          15671) pjsip-perf       cpu=12 start=5.74  finish=5.74 \n          15672) threaded-ml      cpu=2 start=5.74  finish=5.74 \n          15673) threaded-ml      cpu=1 start=5.74  finish=5.74 \n          15674) threaded-ml      cpu=9 start=6.05  finish=6.05 \n          15675) pjsip-perf       
cpu=9 start=6.05  finish=69.52\n          15676) pjsip-perf       cpu=8 start=6.05  finish=68.50\n          15677) pjsip-perf       cpu=5 start=6.05  finish=68.56\n          15678) pjsip-perf       cpu=3 start=6.05  finish=68.91\n          15679) pjsip-perf       cpu=4 start=6.05  finish=68.48\n          15680) pjsip-perf       cpu=12 start=6.05  finish=68.34\n          15681) pjsip-perf       cpu=14 start=6.05  finish=68.76\n          15682) pjsip-perf       cpu=15 start=6.05  finish=68.78\n          15683) pjsip-perf       cpu=0 start=6.05  finish=68.48\n          15684) pjsip-perf       cpu=6 start=6.05  finish=68.19\n          15685) pjsip-perf       cpu=8 start=6.05  finish=68.72\n          15686) pjsip-perf       cpu=3 start=6.05  finish=68.47\n          15687) pjsip-perf       cpu=10 start=6.05  finish=68.52\n          15688) pjsip-perf       cpu=1 start=6.05  finish=68.40\n          15689) pjsip-perf       cpu=11 start=6.05  finish=68.87\n          15690) pjsip-perf       cpu=7 start=6.05  finish=68.68\n        15692) sed              cpu=8 start=71.82 finish=71.82\n<\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>A multimedia communication library. There are three tests. Overall, not all cores are used. The topdown profile shows patterns of increasing backend stalls for the first two workloads, while the third test is instead more front-end dominated. AMD metrics confirm ~3 cores used. 
<span class=\"excerpt-dots\">&hellip;<\/span> <a class=\"more-link\" href=\"https:\/\/mvermeulen.org\/perf\/workloads\/phoronix\/pjsip\/\"><span class=\"more-msg\">Continue reading &rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"parent":58,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-2192","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/2192","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/comments?post=2192"}],"version-history":[{"count":2,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/2192\/revisions"}],"predecessor-version":[{"id":2225,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/2192\/revisions\/2225"}],"up":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/58"}],"wp:attachment":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/media?parent=2192"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}