{"id":2343,"date":"2024-06-03T10:29:03","date_gmt":"2024-06-03T10:29:03","guid":{"rendered":"https:\/\/mvermeulen.org\/perf\/?page_id=2343"},"modified":"2024-06-05T00:31:34","modified_gmt":"2024-06-05T00:31:34","slug":"538-imagick_r","status":"publish","type":"page","link":"https:\/\/mvermeulen.org\/perf\/workloads\/cpu2017\/538-imagick_r\/","title":{"rendered":"538.imagick_r"},"content":{"rendered":"\n<p>imagick is a SPEC CPU(R) benchmark described <a href=\"https:\/\/spec.org\/cpu2017\/Docs\/benchmarks\/538.imagick_r.html\">here<\/a> and written in C. The workload runs on all logical cores.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/06\/systemtime-14.png\" alt=\"\" class=\"wp-image-2414\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/06\/systemtime-14.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/06\/systemtime-14-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/06\/systemtime-14-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>Topdown profile shows a high retirement rate with short periods of backend bound. 
It also shows higher than normal branch misprediction.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"960\" src=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/06\/amdtopdown-15.png\" alt=\"\" class=\"wp-image-2416\" srcset=\"https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/06\/amdtopdown-15.png 1280w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/06\/amdtopdown-15-1024x768.png 1024w, https:\/\/mvermeulen.org\/perf\/wp-content\/uploads\/sites\/7\/2024\/06\/amdtopdown-15-768x576.png 768w\" sizes=\"auto, (max-width: 1280px) 100vw, 1280px\" \/><\/figure>\n\n\n\n<p>AMD metrics show a large number of branches and little L2 access.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>elapsed              371.223\non_cpu               0.896          # 14.33 \/ 16 cores\nutime                5301.865\nstime                18.001\nnvcsw                10370          # 17.91%\nnivcsw               47532          # 82.09%\ninblock              0              # 0.00\/sec\nonblock              493496         # 1329.38\/sec\ncpu-clock            5320047906701  # 5320.048 seconds\ntask-clock           5320074769301  # 5320.075 seconds\npage faults          5459466        # 1026.201\/sec\ncontext switches     57229          # 10.757\/sec\ncpu migrations       185            # 0.035\/sec\nmajor page faults    1439           # 0.270\/sec\nminor page faults    5458027        # 1025.931\/sec\nalignment faults     0              # 0.000\/sec\nemulation faults     0              # 0.000\/sec\nbranches             8454570179881  # 182.276 branches per 1000 inst\nbranch misses        75411585767    # 0.89% branch miss\nconditional          8124558626962  # 175.161 conditional branches per 1000 inst\nindirect             8694767325     # 0.187 indirect branches per 1000 inst\ncpu-cycles           21192834540990 # 3.63 GHz\ninstructions         46388803894448 # 
2.19 IPC\nslots                42385231849734 #\nretiring             14329148354545 # 33.8% (56.0%) high\n-- ucode             251053747      #     0.0%\n-- fastpath          14328897300798 #    33.8%\nfrontend             2841163137830  #  6.7% (11.1%)\n-- latency           1593122069814  #     3.8%\n-- bandwidth         1248041068016  #     2.9%\nbackend              6452022137302  # 15.2% (25.2%)\n-- cpu               3524539608501  #     8.3%\n-- memory            2927482528801  #     6.9%\nspeculation          1961676278585  #  4.6% ( 7.7%)\n-- branch mispredict 1946164368946  #     4.6%\n-- pipeline restart  15511909639    #     0.0%\nsmt-contention       16801162764112 # 39.6% ( 0.0%)\ncpu-cycles           20996638297565 # 3.85 GHz\ninstructions         46389036286138 # 2.21 IPC\ninstructions         15458610985360 # 11.978 l2 access per 1000 inst\nl2 hit from l1       83102477730    # 6.13% l2 miss\nl2 miss from l1      2867022134     #\nl2 hit from l2 pf    93572754615    #\nl3 hit from l2 pf    4342178823     #\nl3 miss from l2 pf   4142145101     #\ninstructions         15459233449809 # 149.855 float per 1000 inst\nfloat 512            234            # 0.000 AVX-512 per 1000 inst\nfloat 256            16966          # 0.000 AVX-256 per 1000 inst\nfloat 128            2316635779894  # 149.855 AVX-128 per 1000 inst\nfloat MMX            0              # 0.000 MMX per 1000 inst\nfloat scalar         1              # 0.000 scalar per 1000 inst\ninstructions         46381069755520 #\nopcache              6422257727641  # 138.467 opcache per 1000 inst\nopcache miss         9631990405     #  0.1% opcache miss rate\nl1 dTLB miss         91943963321    # 1.982 L1 dTLB per 1000 inst\nl2 dTLB miss         2682643698     # 0.058 L2 dTLB per 1000 inst\ninstructions         46381020206518 #\nicache               17841044731    # 0.385 icache per 1000 inst\nicache miss          2571776549     # 14.4% icache miss rate\nl1 iTLB miss         161510630      # 0.003 L1 
iTLB per 1000 inst\nl2 iTLB miss         0              # 0.000 L2 iTLB per 1000 inst\ntlb flush            62101          # 0.000 TLB flush per 1000 inst\n<\/code><\/pre>\n\n\n\n<p>Process overview shows time predominantly spent in imagick_r_base.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>689 processes\n\t 48 imagick_r_base.       5259.95    13.00\n\t 71 specperl                 8.46     1.40\n\t 48 imagevalidate_5          2.04     0.46\n\t  2 clang                    0.00     0.02\n\t 11 ps                       0.00     0.01\n\t  1 lsb_release              0.00     0.01\n\t224 sh                       0.00     0.00\n\t 54 specrxp                  0.00     0.00\n\t 48 bash                     0.00     0.00\n\t 41 specinvoke               0.00     0.00\n\t 22 cat                      0.00     0.00\n\t 21 grep                     0.00     0.00\n\t 12 uniq                     0.00     0.00\n\t 11 sort                     0.00     0.00\n\t 10 expand                   0.00     0.00\n\t  7 specmake                 0.00     0.00\n\t  6 pwd                      0.00     0.00\n\t  5 basename                 0.00     0.00\n\t  5 systemctl                0.00     0.00\n\t  4 rm                       0.00     0.00\n\t  4 specpp                   0.00     0.00\n\t  4 uname                    0.00     0.00\n\t  3 dirname                  0.00     0.00\n\t  3 dmidecode                0.00     0.00\n\t  3 lscpu                    0.00     0.00\n\t  2 df                       0.00     0.00\n\t  2 dpkg                     0.00     0.00\n\t  2 runcpu                   0.00     0.00\n\t  2 specsha512sum            0.00     0.00\n\t  2 specxz                   0.00     0.00\n\t  2 who                      0.00     0.00\n\t  1 cpupower                 0.00     0.00\n\t  1 head                     0.00     0.00\n\t  1 logname                  0.00     0.00\n\t  1 ls                       0.00     0.00\n\t  1 numactl                  0.00     0.00\n\t  1 sysctl                   
0.00     0.00\n\t  1 w                        0.00     0.00\n\t  1 wc                       0.00     0.00\n\t  1 which                    0.00     0.00\n0 processes running\n53 maximum processes\n<\/code><\/pre>\n\n\n\n<p>specinvoke launches a separate copy on each logical core.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>    446774) specinvoke       cpu=13 start=3.67  finish=114.84\n      446776) sh               cpu=11 start=3.67  finish=113.30\n        446783) bash             cpu=0 start=3.67  finish=113.30\n          446806) imagick_r_base.  cpu=0 start=3.67  finish=113.27\n      446777) sh               cpu=11 start=3.67  finish=113.58\n        446784) bash             cpu=1 start=3.67  finish=113.58\n          446808) imagick_r_base.  cpu=1 start=3.67  finish=113.54\n      446778) sh               cpu=10 start=3.67  finish=113.03\n        446786) bash             cpu=2 start=3.67  finish=113.03\n          446807) imagick_r_base.  cpu=2 start=3.67  finish=113.00\n      446779) sh               cpu=0 start=3.67  finish=113.98\n        446787) bash             cpu=3 start=3.67  finish=113.98\n          446812) imagick_r_base.  cpu=3 start=3.67  finish=113.95\n      446780) sh               cpu=4 start=3.67  finish=111.56\n        446793) bash             cpu=4 start=3.67  finish=111.56\n          446813) imagick_r_base.  cpu=4 start=3.67  finish=111.52\n      446781) sh               cpu=5 start=3.67  finish=113.57\n        446791) bash             cpu=5 start=3.67  finish=113.57\n          446815) imagick_r_base.  cpu=5 start=3.67  finish=113.54\n      446782) sh               cpu=5 start=3.67  finish=114.78\n        446789) bash             cpu=6 start=3.67  finish=114.78\n          446811) imagick_r_base.  cpu=6 start=3.67  finish=114.74\n      446785) sh               cpu=12 start=3.67  finish=113.72\n        446794) bash             cpu=7 start=3.67  finish=113.72\n          446814) imagick_r_base.  
cpu=7 start=3.67  finish=113.69\n      446788) sh               cpu=8 start=3.67  finish=113.11\n        446797) bash             cpu=8 start=3.67  finish=113.11\n          446817) imagick_r_base.  cpu=8 start=3.67  finish=113.06\n      446790) sh               cpu=8 start=3.67  finish=114.36\n        446799) bash             cpu=9 start=3.67  finish=114.36\n          446816) imagick_r_base.  cpu=9 start=3.67  finish=114.34\n      446792) sh               cpu=12 start=3.67  finish=113.03\n        446802) bash             cpu=10 start=3.67  finish=113.03\n          446818) imagick_r_base.  cpu=10 start=3.67  finish=112.99\n      446795) sh               cpu=2 start=3.67  finish=113.16\n        446803) bash             cpu=11 start=3.67  finish=113.16\n          446823) imagick_r_base.  cpu=11 start=3.67  finish=113.11\n      446796) sh               cpu=4 start=3.67  finish=112.97\n        446804) bash             cpu=12 start=3.67  finish=112.97\n          446819) imagick_r_base.  cpu=12 start=3.67  finish=112.94\n      446798) sh               cpu=12 start=3.67  finish=113.33\n        446805) bash             cpu=13 start=3.67  finish=113.33\n          446822) imagick_r_base.  cpu=13 start=3.67  finish=113.30\n      446800) sh               cpu=15 start=3.67  finish=114.84\n        446809) bash             cpu=14 start=3.67  finish=114.84\n          446821) imagick_r_base.  cpu=14 start=3.67  finish=114.81\n      446801) sh               cpu=15 start=3.67  finish=113.91\n        446810) bash             cpu=15 start=3.67  finish=113.91\n          446820) imagick_r_base.  cpu=15 start=3.67  finish=113.89\n<\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>imagick is a SPEC CPU(R) benchmark described here and written in C. The workload runs on all logical cores. Topdown profile shows a high retirement rate with short periods of backend bound. It also shows higher than normal branch misprediction. 
<span class=\"excerpt-dots\">&hellip;<\/span> <a class=\"more-link\" href=\"https:\/\/mvermeulen.org\/perf\/workloads\/cpu2017\/538-imagick_r\/\"><span class=\"more-msg\">Continue reading &rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"parent":2297,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-2343","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/2343","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/comments?post=2343"}],"version-history":[{"count":2,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/2343\/revisions"}],"predecessor-version":[{"id":2417,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/2343\/revisions\/2417"}],"up":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/2297"}],"wp:attachment":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/media?parent=2343"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}