{"id":128,"date":"2023-12-29T20:53:46","date_gmt":"2023-12-29T20:53:46","guid":{"rendered":"https:\/\/mvermeulen.org\/perf\/?p=128"},"modified":"2023-12-29T22:23:59","modified_gmt":"2023-12-29T22:23:59","slug":"turning-on-counters-for-l3-and-data-fabric-measurements-on-amd","status":"publish","type":"post","link":"https:\/\/mvermeulen.org\/perf\/2023\/12\/29\/turning-on-counters-for-l3-and-data-fabric-measurements-on-amd\/","title":{"rendered":"Turning on counters for l3 and data fabric measurements on AMD"},"content":{"rendered":"\n<p>By default, the counters for l3 and data fabric (df) measurements were not available on my AMD system.  With some help from the <a href=\"https:\/\/github.com\/RRZE-HPC\/likwid\/wiki\/TutorialLikwidPerf\">likwid documentation<\/a> I figured out what was going on and how to enable them.<\/p>\n\n\n\n<p>The first thing to do is check whether the perf subsystem knows about the l3 and df devices.  This can be done by listing the format directories under \/sys\/devices:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>prompt% ls \/sys\/devices\/*\/format\n\/sys\/devices\/amd_iommu_0\/format:\ncsource  devid  devid_mask  domid  domid_mask  pasid  pasid_mask\n\n\/sys\/devices\/cpu\/format:\ncmask  edge  event  inv  umask\n\n\/sys\/devices\/ibs_fetch\/format:\nl3missonly  rand_en\n\n\/sys\/devices\/ibs_op\/format:\ncnt_ctl  l3missonly\n\n\/sys\/devices\/kprobe\/format:\nretprobe\n\n\/sys\/devices\/msr\/format:\nevent\n\n\/sys\/devices\/power\/format:\nevent\n\n\/sys\/devices\/uprobe\/format:\nref_ctr_offset  retprobe<\/code><\/pre>\n\n\n\n<p>Only devices that are available show up here. The amd_l3 and amd_df entries are missing from my listing, so the next step is to see what is compiled into the running kernel.  
This can be done with:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>prompt% \/home\/mev\/source\/wspy# grep -i perf_events \/boot\/config-$(uname -r)\nCONFIG_HAVE_PERF_EVENTS=y\nCONFIG_GUEST_PERF_EVENTS=y\nCONFIG_PERF_EVENTS=y\nCONFIG_PERF_EVENTS_INTEL_UNCORE=y\nCONFIG_PERF_EVENTS_INTEL_RAPL=m\nCONFIG_PERF_EVENTS_INTEL_CSTATE=m\n# CONFIG_PERF_EVENTS_AMD_POWER is not set\nCONFIG_PERF_EVENTS_AMD_UNCORE=m\nCONFIG_PERF_EVENTS_AMD_BRS=y\nCONFIG_HAVE_PERF_EVENTS_NMI=y\nCONFIG_SECURITY_PERF_EVENTS_RESTRICT=y<\/code><\/pre>\n\n\n\n<p>The l3 and df counters are uncore counters, and CONFIG_PERF_EVENTS_AMD_UNCORE=m shows that support for them is built as a loadable module rather than into the kernel.  We load the module (as root) with the following command:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>prompt% \/home\/mev\/source\/wspy# insmod \/lib\/modules\/$(uname -r)\/kernel\/arch\/x86\/events\/amd\/amd-uncore.ko<\/code><\/pre>\n\n\n\n<p>With the module loaded, the earlier ls command now also shows \/sys\/devices\/amd_l3\/format and \/sys\/devices\/amd_df\/format.  Once these are available, the perf list command reports the relevant counters.  
The command and useful parts of the output are included below:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>prompt% perf list -v --detail\nl3_cache:\n  l3_cache_accesses\n       &#91;l3_lookup_state.all_coherent_accesses_to_l3]\n  l3_misses\n       &#91;l3_lookup_state.l3_miss]\n  l3_read_miss_latency\n       &#91;l3_xi_sampled_latency.all * 10 \/ l3_xi_sampled_latency_requests.all]<\/code><\/pre>\n\n\n\n<p>Now we can use &#8220;perf stat&#8221; to try the l3 counters and make sure they work.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>prompt% perf stat -e l3_lookup_state.all_coherent_accesses_to_l3,l3_lookup_state.l3_hit \/bin\/ls\ncpumask  format  perf_event_mux_interval_ms  power  subsystem  type  uevent\n\n Performance counter stats for 'system wide':\n\n            80,264      l3_lookup_state.all_coherent_accesses_to_l3                                      \n            70,798      l3_lookup_state.l3_hit                                                \n\n       0.001688959 seconds time elapsed<\/code><\/pre>\n\n\n\n<p>What remains is figuring out the right &#8220;config&#8221; values to make the equivalent call to perf_event_open.  We can look these up with strace.  On my system this shows that the type field of struct perf_event_attr is 0xe (14), which is also the value shown in the \/sys\/devices\/amd_l3\/type file.  I can work out the config for l3 accesses this way, but I am not yet sure which data fabric event to use to measure memory.<\/p>\n\n\n\n<p>Success!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>By default, counters were not available to measure l3 and df counters on AMD. With some help from likwid documentation I figured out what is going on and how to get it enabled. 
The first thing to do is see <span class=\"excerpt-dots\">&hellip;<\/span> <a class=\"more-link\" href=\"https:\/\/mvermeulen.org\/perf\/2023\/12\/29\/turning-on-counters-for-l3-and-data-fabric-measurements-on-amd\/\"><span class=\"more-msg\">Continue reading &rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[11],"tags":[17,18,16,6,7],"class_list":["post-128","post","type-post","status-publish","format-standard","hentry","category-experiment","tag-data-fabric","tag-kernel","tag-l3","tag-perf","tag-performance-counters"],"_links":{"self":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/posts\/128","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/comments?post=128"}],"version-history":[{"count":2,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/posts\/128\/revisions"}],"predecessor-version":[{"id":130,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/posts\/128\/revisions\/130"}],"wp:attachment":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/media?parent=128"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/categories?post=128"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/tags?post=128"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}