{"id":62,"date":"2023-12-15T21:40:16","date_gmt":"2023-12-15T21:40:16","guid":{"rendered":"https:\/\/mvermeulen.org\/perf\/?page_id=62"},"modified":"2023-12-16T22:13:01","modified_gmt":"2023-12-16T22:13:01","slug":"lmbench","status":"publish","type":"page","link":"https:\/\/mvermeulen.org\/perf\/workloads\/lmbench\/","title":{"rendered":"lmbench"},"content":{"rendered":"\n<p>The lmbench benchmark can be installed using:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>prompt% apt install lmbench<\/code><\/pre>\n\n\n\n<p>This includes the lat_mem_rd program that I have used to measure cache latencies.  An example run and its output for a memory size of 20000 MB (about 20 GB) and a stride of 1024 bytes are shown below.  I have also annotated the output with known architectural information on the cache hierarchy:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>L1 = 32KB and 4 clocks<\/li>\n\n\n\n<li>L2 = 1MB and 14 clocks<\/li>\n\n\n\n<li>L3 = 96MB and 47 clocks<\/li>\n<\/ul>\n\n\n\n<p>The cycle counts come from Agner Fog&#8217;s <a href=\"https:\/\/www.agner.org\/optimize\/microarchitecture.pdf\">microarchitecture manual<\/a>.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>mev@montpelier:~$ \/usr\/lib\/lmbench\/bin\/x86_64-linux-gnu\/lat_mem_rd 20000 1024\n\"stride=1024\n0.00098 0.819\n0.00195 0.796\n0.00293 0.821\n0.00391 0.814\n0.00586 0.819\n0.00781 0.809\n0.01172 0.820\n0.01562 0.792\n0.02344 0.821\n0.03125 0.792 # L1 = 32KB\n0.04688 1.552\n0.06250 2.886\n0.09375 1.303\n0.12500 2.877\n0.18750 2.882\n0.25000 2.888\n0.37500 3.195\n0.50000 3.198\n0.75000 3.251\n1.00000 3.311 # L2 = 1MB\n1.50000 4.004\n2.00000 4.693\n3.00000 6.296\n4.00000 6.743\n6.00000 7.496\n8.00000 7.411\n12.00000 7.674\n16.00000 8.166\n24.00000 8.054\n32.00000 7.817\n48.00000 8.194\n64.00000 7.956\n96.00000 8.212 # L3 = 96MB\n128.00000 8.857\n192.00000 10.671\n256.00000 15.057\n384.00000 21.861\n512.00000 22.165\n768.00000 22.636\n1024.00000 22.624\n1536.00000 23.139\n2048.00000 23.651\n3072.00000 24.744\n4096.00000 
25.005\n6144.00000 25.829\n8192.00000 25.676\n12288.00000 26.301\n16384.00000 26.037\n<\/code><\/pre>\n\n\n\n<p>The first column is the array size in MB and the second is the load latency in nanoseconds. The numbers suggest a few things to me:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>CPUs have become more dynamic, with separate base and boost frequencies, so converting nanoseconds to cycles is not as simple as it was a decade ago. Still, we&#8217;re in the almost 5 GHz range, or about 0.2 ns per cycle, and the measurements roughly correspond: 4, 14 and 47 clocks work out to about 0.8, 2.8 and 9.4 ns, against measured plateaus of about 0.8, 2.9 to 3.3 and 8 ns.<\/li>\n\n\n\n<li>A stride of 1024 bytes was chosen. Varying the stride also changes the measured values, most likely because prefetching comes into play.<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>The lmbench benchmark can be installed using: This includes the lat_mem_rd program that I have used to measure cache latencies. An example run and output for a memory size of 20000mb (20GB) and stride of 1024 is shown below. I <span class=\"excerpt-dots\">&hellip;<\/span> <a class=\"more-link\" href=\"https:\/\/mvermeulen.org\/perf\/workloads\/lmbench\/\"><span class=\"more-msg\">Continue reading 
&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"parent":48,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-62","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/62","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/comments?post=62"}],"version-history":[{"count":2,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/62\/revisions"}],"predecessor-version":[{"id":89,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/62\/revisions\/89"}],"up":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/pages\/48"}],"wp:attachment":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/media?parent=62"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}