{"id":44,"date":"2023-02-23T02:40:37","date_gmt":"2023-02-23T02:40:37","guid":{"rendered":"https:\/\/mvermeulen.org\/perf\/?p=44"},"modified":"2023-02-23T02:55:48","modified_gmt":"2023-02-23T02:55:48","slug":"performance-counters-required-to-compute-topdown-metrics","status":"publish","type":"post","link":"https:\/\/mvermeulen.org\/perf\/2023\/02\/23\/performance-counters-required-to-compute-topdown-metrics\/","title":{"rendered":"Performance Counters required to compute topdown metrics"},"content":{"rendered":"<p>From past work, we know the five counters required to compute the first-level topdown metrics on Intel processors:<\/p>\n<pre>CLK_UNHALTED_CORE           = 0x00\nIDQ_UOPS_NOT_DELIVERED_CORE = 0x9C, umask=1\nUOPS_RETIRED_RETIRE_SLOTS   = 0xC2, umask=2\nUOPS_ISSUED_ANY             = 0x0E, umask=1\nINT_MISC_RECOVERY_CYCLES    = 0x0d, umask=3, cmask=1\n<\/pre>\n<p>These can be combined as follows:<\/p>\n<pre>width = 4                                       ; issue width = 4 for many processors\ntotal-slots = CLK_UNHALTED_CORE * width\nfrontend = IDQ_UOPS_NOT_DELIVERED_CORE \/ total-slots\nretiring = UOPS_RETIRED_RETIRE_SLOTS \/ total-slots\nspeculation = (UOPS_ISSUED_ANY - UOPS_RETIRED_RETIRE_SLOTS + width*INT_MISC_RECOVERY_CYCLES) \/ total-slots\nbackend = 1 - (frontend + retiring + speculation)\n<\/pre>\n<p>Working from the definitions reported by &#8220;perf-list&#8221;, we can also find the set of performance counters used to compute topdown metrics on AMD.  
First, here are the definitions that perf gives us:<\/p>\n<pre>PipelineL1:\n  backend_bound\n       [d_ratio(de_no_dispatch_per_slot.backend_stalls, total_dispatch_slots)]\n  bad_speculation\n       [d_ratio(de_src_op_disp.all - ex_ret_ops, total_dispatch_slots)]\n  frontend_bound\n       [d_ratio(de_no_dispatch_per_slot.no_ops_from_frontend,\n        total_dispatch_slots)]\n  retiring\n       [d_ratio(ex_ret_ops, total_dispatch_slots)]\n  smt_contention\n       [d_ratio(de_no_dispatch_per_slot.smt_contention, total_dispatch_slots)]\n<\/pre>\n<p>These expand to the following raw events:<\/p>\n<pre>de_no_dispatch_per_slot.backend_stalls       = 0x1A0, umask=0x1E\ntotal_dispatch_slots                         = 6 * ls_not_halted_cyc\nls_not_halted_cyc                            = 0x76\nde_src_op_disp.all                           = 0xAA, umask=0x7\nex_ret_ops                                   = 0xC1\nde_no_dispatch_per_slot.no_ops_from_frontend = 0x1A0, umask=1\nde_no_dispatch_per_slot.smt_contention       = 0x1A0, umask=0x60\n<\/pre>\n<p>So I believe we can compute the topdown metrics using the six counters listed above, which also give an additional category that accounts for pipeline slots not dispatched because of SMT contention.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>From past work, we know the five counters required to compute the first level topdown metrics on Intel processors: CLK_UNHALTED_CORE = 0x00 IDQ_UOPS_NOT_DELIVERED_CORE = 0x9C, umask=1 UOPS_RETIRED_RETIRE_SLOTS = 0xC2, umask=2 UOPS_ISSUED_ANY = 0x0E, umask=1 INT_MISC_RECOVERY_CYCLES = 0x0d, umask=3, cmask=1 These <span class=\"excerpt-dots\">&hellip;<\/span> <a class=\"more-link\" href=\"https:\/\/mvermeulen.org\/perf\/2023\/02\/23\/performance-counters-required-to-compute-topdown-metrics\/\"><span class=\"more-msg\">Continue reading 
&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[9],"tags":[6,7,8],"class_list":["post-44","post","type-post","status-publish","format-standard","hentry","category-hardware","tag-perf","tag-performance-counters","tag-topdown"],"_links":{"self":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/posts\/44","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/comments?post=44"}],"version-history":[{"count":1,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/posts\/44\/revisions"}],"predecessor-version":[{"id":45,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/posts\/44\/revisions\/45"}],"wp:attachment":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/media?parent=44"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/categories?post=44"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/tags?post=44"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}