Tests the GNU Multiple Precision Arithmetic Library (GMP). A single-threaded program that reports a GMPbench score.

The topdown profile suggests multiple sub-tests. Overall there is a high retirement rate and few frontend stalls, with backend stalls that vary with the test case.

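AMD metrics show little floating point (about 1.1 instructions per 1000) and little L2 access (about 2.9 accesses per 1000 instructions). The backend stalls are mostly core-bound (32.1% of slots) rather than memory-bound (6.3%).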

elapsed              427.784
on_cpu               0.062          # 0.99 / 16 cores
utime                417.815
stime                4.074
nvcsw                3074           # 64.89%
nivcsw               1663           # 35.11%
inblock              0              # 0.00/sec
onblock              14296          # 33.42/sec
cpu-clock            421916906517   # 421.917 seconds
task-clock           421922454058   # 421.922 seconds
page faults          2756183        # 6532.440/sec
context switches     6298           # 14.927/sec
cpu migrations       444            # 1.052/sec
major page faults    3              # 0.007/sec
minor page faults    2756180        # 6532.433/sec
alignment faults     0              # 0.000/sec
emulation faults     0              # 0.000/sec
branches             501523528799   # 82.260 branches per 1000 inst
branch misses        3061059185     # 0.61% branch miss
conditional          382644488334   # 62.761 conditional branches per 1000 inst
indirect             14876152172    # 2.440 indirect branches per 1000 inst
cpu-cycles           1965040411588  # 0.29 GHz
instructions         6072094172754  # 3.09 IPC high
slots                3937816545756  #
retiring             2247636929437  # 57.1% (57.1%) high
-- ucode             776956035      #     0.0%
-- fastpath          2246859973402  #    57.1%
frontend             116276658154   #  3.0% ( 3.0%) low
-- latency           80906720016    #     2.1%
-- bandwidth         35369938138    #     0.9%
backend              1509881987155  # 38.3% (38.3%)
-- cpu               1262260815603  #    32.1%
-- memory            247621171552   #     6.3%
speculation          63519286739    #  1.6% ( 1.6%)
-- branch mispredict 60232902373    #     1.5%
-- pipeline restart  3286384366     #     0.1%
smt-contention       501252396      #  0.0% ( 0.0%)
cpu-cycles           1965474699019  # 0.29 GHz
instructions         6059932908034  # 3.08 IPC high
instructions         2023359162871  # 2.900 l2 access per 1000 inst
l2 hit from l1       3554993713     # 10.74% l2 miss
l2 miss from l1      138068096      #
l2 hit from l2 pf    1820413091     #
l3 hit from l2 pf    483630493      #
l3 miss from l2 pf   8709715        #
instructions         2021651448545  # 1.140 float per 1000 inst
float 512            213            # 0.000 AVX-512 per 1000 inst
float 256            638            # 0.000 AVX-256 per 1000 inst
float 128            2303804812     # 1.140 AVX-128 per 1000 inst
float MMX            0              # 0.000 MMX per 1000 inst
float scalar         0              # 0.000 scalar per 1000 inst
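
The summary figures in the AMD listing can be re-derived from the raw counters. The following is a minimal sketch, not the profiling tool's own code: the variable names are made up for illustration, and the clock-rate formula (cycles divided by elapsed seconds and by 16 cores) is an inference from the printed numbers rather than something the listing states.

```python
# Raw counters copied from the AMD listing above.
elapsed_seconds = 427.784
cores = 16
cycles = 1965040411588
instructions = 6072094172754
slots = 3937816545756
topdown = {
    "retiring": 2247636929437,
    "frontend": 116276658154,
    "backend": 1509881987155,
    "speculation": 63519286739,
    "smt-contention": 501252396,
}

# Instructions per cycle.
ipc = instructions / cycles

# Each topdown category as a percentage of all pipeline slots.
shares = {name: 100.0 * count / slots for name, count in topdown.items()}

# Assumption: the "GHz" column is averaged over elapsed time and all cores.
ghz_per_core = cycles / elapsed_seconds / cores / 1e9

print(f"IPC {ipc:.2f}")
for name, pct in shares.items():
    print(f"{name:15s} {pct:5.1f}%")
print(f"clock {ghz_per_core:.2f} GHz (averaged over {cores} cores)")
```

Under that assumption the low 0.29 GHz figure reflects a single busy core averaged over 16, not a slow clock.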

Intel metrics

elapsed              429.153
on_cpu               0.062          # 0.99 / 16 cores
utime                421.186
stime                2.284
nvcsw                2764           # 62.44%
nivcsw               1663           # 37.56%
inblock              248            # 0.58/sec
onblock              2720           # 6.34/sec
cpu-clock            423457398211   # 423.457 seconds
task-clock           423463418970   # 423.463 seconds
page faults          2434289        # 5748.523/sec
context switches     5991           # 14.148/sec
cpu migrations       466            # 1.100/sec
major page faults    1              # 0.002/sec
minor page faults    2434288        # 5748.520/sec
alignment faults     0              # 0.000/sec
emulation faults     0              # 0.000/sec
branches             444883853274   # 81.002 branches per 1000 inst
branch misses        2834417364     # 0.64% branch miss
conditional          444883878586   # 81.002 conditional branches per 1000 inst
indirect             14177570230    # 2.581 indirect branches per 1000 inst
slots                9621835222010  #
retiring             6303423323147  # 65.5% (65.5%) high
-- ucode             987932791689   #    10.3%
-- fastpath          5315490531458  #    55.2%
frontend             487924713905   #  5.1% ( 5.1%)
-- latency           108479845700   #     1.1%
-- bandwidth         379444868205   #     3.9%
backend              2464878156935  # 25.6% (25.6%)
-- cpu               2307296971385  #    24.0%
-- memory            157581185550   #     1.6%
speculation          366841291386   #  3.8% ( 3.8%)
-- branch mispredict 355601817820   #     3.7%
-- pipeline restart  11239473566    #     0.1%
smt-contention       0              #  0.0% ( 0.0%)
cpu-cycles           1602571643660  # 0.23 GHz
instructions         5493598362825  # 3.43 IPC high
l2 access            9495709247     # 1.729 l2 access per 1000 inst
l2 miss              1735896974     # 18.28% l2 miss
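
To check the core-bound versus memory-bound observation on both machines, the backend rows of the two listings can be compared directly. This is a small sketch using only numbers printed above; the helper name `core_bound_share` is hypothetical.

```python
# Backend stall slots copied from the "-- cpu" and "-- memory" rows above.
backend = {
    "AMD": {"cpu": 1262260815603, "memory": 247621171552},
    "Intel": {"cpu": 2307296971385, "memory": 157581185550},
}

def core_bound_share(parts):
    """Percentage of backend stall slots attributed to the core."""
    return 100.0 * parts["cpu"] / (parts["cpu"] + parts["memory"])

core_share = {name: core_bound_share(parts) for name, parts in backend.items()}

# L2 miss rate from the Intel listing: misses divided by accesses.
intel_l2_miss_pct = 100.0 * 1735896974 / 9495709247

for name, pct in core_share.items():
    print(f"{name:5s} core-bound share of backend stalls: {pct:.1f}%")
print(f"Intel L2 miss rate: {intel_l2_miss_pct:.2f}%")
```

On both platforms the core accounts for well over four fifths of the backend stall slots, which agrees with the low L2 traffic.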

The process overview shows that each operation (multiply, divide, gcd, gcdext, rsa, pi) is tested in its own set of processes.

772 processes
	 15 multiply               157.50     1.91
	  8 divide                  86.34     0.79
	  5 gcd                     53.60     0.01
	  5 gcdext                  53.39     0.00
	  3 rsa                     33.69     0.00
	  3 pi                      31.87     0.25
	 67 clinfo                  15.59     7.19
	 38 vulkaninfo               0.58     1.71
	  6 glxinfo:gdrv0            0.14     0.03
	  6 glxinfo:gl0              0.14     0.03
	  4 vulkani:disk$0           0.07     0.18
	  6 php                      0.07     0.10
	  6 clang                    0.07     0.05
	  2 glxinfo                  0.07     0.02
	  2 glxinfo:cs0              0.06     0.01
	  2 glxinfo:disk$0           0.06     0.01
	  2 glxinfo:sh0              0.06     0.01
	  2 glxinfo:shlo0            0.06     0.01
	  2 llvmpipe-0               0.04     0.09
	  2 llvmpipe-1               0.04     0.09
	  2 llvmpipe-10              0.04     0.09
	  2 llvmpipe-11              0.04     0.09
	  2 llvmpipe-12              0.04     0.09
	  2 llvmpipe-13              0.04     0.09
	  2 llvmpipe-14              0.04     0.09
	  2 llvmpipe-15              0.04     0.09
	  2 llvmpipe-2               0.04     0.09
	  2 llvmpipe-3               0.04     0.09
	  2 llvmpipe-4               0.04     0.09
	  2 llvmpipe-5               0.04     0.09
	  2 llvmpipe-6               0.04     0.09
	  2 llvmpipe-7               0.04     0.09
	  2 llvmpipe-8               0.04     0.09
	  2 llvmpipe-9               0.04     0.09
	  3 rocminfo                 0.03     0.00
	118 runbench                 0.01     0.04
	  1 lspci                    0.00     0.03
	  1 ps                       0.00     0.01
	150 gexpr                    0.00     0.00
	 80 sh                       0.00     0.00
	 79 sed                      0.00     0.00
	 40 grep                     0.00     0.00
	 13 gcc                      0.00     0.00
	  8 stat                     0.00     0.00
	  8 systemd-detect-          0.00     0.00
	  7 gsettings                0.00     0.00
	  6 llvm-link                0.00     0.00
	  5 gmain                    0.00     0.00
	  5 phoronix-test-s          0.00     0.00
	  4 dconf worker             0.00     0.00
	  2 cc                       0.00     0.00
	  2 lscpu                    0.00     0.00
	  2 uname                    0.00     0.00
	  2 which                    0.00     0.00
	  2 xset                     0.00     0.00
	  1 cat                      0.00     0.00
	  1 date                     0.00     0.00
	  1 dirname                  0.00     0.00
	  1 dmesg                    0.00     0.00
	  1 dmidecode                0.00     0.00
	  1 gmpbench                 0.00     0.00
	  1 ifconfig                 0.00     0.00
	  1 ip                       0.00     0.00
	  1 lsmod                    0.00     0.00
	  1 mktemp                   0.00     0.00
	  1 qdbus                    0.00     0.00
	  1 readlink                 0.00     0.00
	  1 realpath                 0.00     0.00
	  1 sort                     0.00     0.00
	  1 stty                     0.00     0.00
	  1 systemctl                0.00     0.00
	  1 template.sh              0.00     0.00
	  1 wc                       0.00     0.00
	  1 xrandr                   0.00     0.00
0 processes running
47 maximum processes
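
The overview can be summarized with a short script. The sketch below parses a few rows copied from the listing and totals the first numeric column for the six arithmetic tests. The listing does not label its columns, so treating them as process count, command name, and two CPU-time figures in seconds is an assumption, and `parse_row` is a hypothetical helper.

```python
# A few rows copied from the process overview above.
listing = """\
 15 multiply               157.50     1.91
  8 divide                  86.34     0.79
  5 gcd                     53.60     0.01
  5 gcdext                  53.39     0.00
  3 rsa                     33.69     0.00
  3 pi                      31.87     0.25
 67 clinfo                  15.59     7.19
 38 vulkaninfo               0.58     1.71
  4 dconf worker             0.00     0.00
"""

def parse_row(line):
    # Split from the right first, because command names may contain spaces.
    head, first, second = line.rsplit(None, 2)
    count, name = head.split(None, 1)
    return name, int(count), float(first), float(second)

rows = [parse_row(line) for line in listing.splitlines() if line.strip()]

tests = {"multiply", "divide", "gcd", "gcdext", "rsa", "pi"}
test_processes = sum(count for name, count, _, _ in rows if name in tests)
test_time = sum(first for name, _, first, _ in rows if name in tests)

print(f"{test_processes} test processes, {test_time:.2f} in the first numeric column")
```

If the first numeric column is user CPU time, the six arithmetic tests account for nearly all of the run, and the remaining processes are system-information probes and shell glue.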

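An example from the computation section. Each multiply run is a separate process lasting roughly 8 to 12 seconds, and successive runs land on different cores: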

      1170498) gmpbench         cpu=0 start=5.47  finish=425.53
        1170499) runbench         cpu=0 start=5.47  finish=425.53
          1170500) cat              cpu=15 start=5.47  finish=5.47 
          1170501) runbench         cpu=14 start=5.48  finish=5.48 
            1170502) runbench         cpu=3 start=5.48  finish=5.48 
            1170503) sed              cpu=2 start=5.48  finish=5.48 
          1170504) multiply         cpu=1 start=5.48  finish=13.29
          1170507) runbench         cpu=14 start=13.29 finish=13.29
            1170508) grep             cpu=15 start=13.29 finish=13.29
            1170509) sed              cpu=3 start=13.29 finish=13.29
          1170510) gexpr            cpu=15 start=13.29 finish=13.29
          1170511) gexpr            cpu=3 start=13.29 finish=13.29
          1170512) gexpr            cpu=4 start=13.29 finish=13.29
          1170513) runbench         cpu=1 start=13.29 finish=13.29
            1170514) runbench         cpu=4 start=13.29 finish=13.29
            1170515) sed              cpu=15 start=13.29 finish=13.29
          1170516) multiply         cpu=14 start=13.29 finish=24.10
          1170518) runbench         cpu=7 start=24.10 finish=24.10
            1170519) grep             cpu=1 start=24.10 finish=24.10
            1170520) sed              cpu=3 start=24.10 finish=24.10
          1170521) gexpr            cpu=4 start=24.10 finish=24.10
          1170522) gexpr            cpu=1 start=24.10 finish=24.10
          1170523) gexpr            cpu=3 start=24.10 finish=24.10
          1170524) runbench         cpu=14 start=24.10 finish=24.11
            1170525) runbench         cpu=7 start=24.10 finish=24.10
            1170526) sed              cpu=2 start=24.10 finish=24.11
          1170527) multiply         cpu=4 start=24.11 finish=34.75
          1170528) runbench         cpu=6 start=34.75 finish=34.75
            1170529) grep             cpu=7 start=34.75 finish=34.75
            1170530) sed              cpu=1 start=34.75 finish=34.75
          1170531) gexpr            cpu=10 start=34.76 finish=34.76
          1170532) gexpr            cpu=6 start=34.76 finish=34.76
          1170533) gexpr            cpu=7 start=34.76 finish=34.76
          1170534) runbench         cpu=11 start=34.76 finish=34.76
            1170535) runbench         cpu=12 start=34.76 finish=34.76
            1170536) sed              cpu=10 start=34.76 finish=34.76
          1170537) multiply         cpu=6 start=34.76 finish=45.30
          1170538) runbench         cpu=6 start=45.30 finish=45.30
            1170539) grep             cpu=15 start=45.30 finish=45.30
            1170540) sed              cpu=1 start=45.30 finish=45.30
          1170541) gexpr            cpu=10 start=45.30 finish=45.30
          1170542) gexpr            cpu=15 start=45.30 finish=45.30
          1170543) gexpr            cpu=1 start=45.30 finish=45.30
          1170544) runbench         cpu=3 start=45.30 finish=45.30
            1170545) runbench         cpu=12 start=45.30 finish=45.30
            1170546) sed              cpu=6 start=45.30 finish=45.30
          1170547) multiply         cpu=1 start=45.31 finish=57.07
          1170548) runbench         cpu=6 start=57.07 finish=57.07
            1170549) grep             cpu=7 start=57.07 finish=57.07
            1170550) sed              cpu=10 start=57.07 finish=57.07
          1170551) gexpr            cpu=3 start=57.07 finish=57.08
          1170552) gexpr            cpu=4 start=57.08 finish=57.08
          1170553) gexpr            cpu=6 start=57.08 finish=57.08
          1170554) runbench         cpu=1 start=57.08 finish=57.08
            1170555) runbench         cpu=10 start=57.08 finish=57.08
            1170556) sed              cpu=3 start=57.08 finish=57.08
          1170557) multiply         cpu=6 start=57.08 finish=67.69
          1170558) runbench         cpu=7 start=67.69 finish=67.69
            1170559) grep             cpu=9 start=67.69 finish=67.69
            1170560) sed              cpu=10 start=67.69 finish=67.69
          1170561) gexpr            cpu=3 start=67.69 finish=67.69
          1170562) gexpr            cpu=6 start=67.69 finish=67.69
          1170563) gexpr            cpu=7 start=67.69 finish=67.69
          1170564) runbench         cpu=9 start=67.70 finish=67.70
            1170565) runbench         cpu=10 start=67.70 finish=67.70
            1170566) sed              cpu=11 start=67.70 finish=67.70
          1170567) multiply         cpu=4 start=67.70 finish=78.30
          1170568) runbench         cpu=14 start=78.30 finish=78.31
            1170569) grep             cpu=7 start=78.30 finish=78.31
            1170570) sed              cpu=9 start=78.30 finish=78.31
          1170571) gexpr            cpu=10 start=78.31 finish=78.31
          1170572) gexpr            cpu=3 start=78.31 finish=78.31
          1170573) gexpr            cpu=14 start=78.31 finish=78.31
          1170574) runbench         cpu=7 start=78.31 finish=78.31
            1170575) runbench         cpu=9 start=78.31 finish=78.31
            1170576) sed              cpu=10 start=78.31 finish=78.31
          1170577) multiply         cpu=1 start=78.31 finish=88.83
          1170578) runbench         cpu=14 start=88.83 finish=88.83
            1170579) grep             cpu=15 start=88.83 finish=88.83
            1170580) sed              cpu=10 start=88.83 finish=88.83
          1170581) gexpr            cpu=1 start=88.83 finish=88.83
          1170582) gexpr            cpu=3 start=88.83 finish=88.84
          1170583) gexpr            cpu=15 start=88.84 finish=88.84
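
Run times per test process can be pulled out of the tree with a regular expression. This is a sketch over three lines copied from the listing above; the pattern and field names are illustrative and assume the `pid) name cpu=N start=S finish=F` layout shown there.

```python
import re

# Three multiply lines copied from the process tree above.
tree = """\
          1170504) multiply         cpu=1 start=5.48  finish=13.29
          1170516) multiply         cpu=14 start=13.29 finish=24.10
          1170527) multiply         cpu=4 start=24.11 finish=34.75
"""

LINE = re.compile(
    r"(?P<pid>\d+)\)\s+(?P<name>\S+)\s+cpu=(?P<cpu>\d+)"
    r"\s+start=(?P<start>[\d.]+)\s+finish=(?P<finish>[\d.]+)"
)

runs = []
for line in tree.splitlines():
    match = LINE.search(line)
    if match:
        duration = round(float(match["finish"]) - float(match["start"]), 2)
        runs.append((match["name"], int(match["cpu"]), duration))

for name, cpu, duration in runs:
    print(f"{name} on cpu {cpu}: {duration:.2f} s")
```

Applied to the full tree, the same loop gives the duration of every multiply run and the core each one was scheduled on.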