{"id":2518,"date":"2024-07-15T02:00:17","date_gmt":"2024-07-15T02:00:17","guid":{"rendered":"https:\/\/mvermeulen.org\/perf\/?p=2518"},"modified":"2024-07-15T12:04:50","modified_gmt":"2024-07-15T12:04:50","slug":"cachyos-optimized-packages","status":"publish","type":"post","link":"https:\/\/mvermeulen.org\/perf\/2024\/07\/15\/cachyos-optimized-packages\/","title":{"rendered":"CachyOS optimized packages"},"content":{"rendered":"\n<p>I have a Zen4 7940HS system where I have installed <a href=\"https:\/\/cachyos.org\/\">CachyOS<\/a>.  This is an Arch-based Linux OS with a focus on performance.  In particular, the <a href=\"https:\/\/wiki.cachyos.org\/cachyos_basic\/why_cachyos\/\">Why CachyOS?<\/a> page cites an optimized scheduler as well as packages compiled for the particular architecture.  For example, rather than packages compiled to run on the lowest-common-denominator architecture, CachyOS has optimized packages for different &#8220;levels&#8221;.  The &#8220;-v3&#8221; level targets Intel Haswell or AMD Excavator and newer architectures, and the &#8220;-v4&#8221; level enables use of AVX-512.<\/p>\n\n\n\n<p>The <a href=\"https:\/\/cachyos.org\/blog\/2407-july-release\/\">July 2024 release notes<\/a> highlight the addition of a Zen4 optimized repository:<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>This is our 8th release this year, and we are very proud to announce a new optimized repository. Starting with this release, we are providing a&nbsp;<strong>Zen4<\/strong>&nbsp;optimized repository. This repository will be automatically used at new installation for Zen4 and Zen5 CPUs, to provide the best performance.<\/p>\n\n\n\n<p>The znver4 target provides a bunch of&nbsp;<strong>extra avx512<\/strong>&nbsp;extensions and also other instructions. 
Here you can find a list of the additional used instructions by the compiler compared to the x86-64-v4 target:&nbsp;<code>abm, adx, aes, avx512bf16, avx512bitalg, avx512ifma, avx512vbmi, avx512vbmi2, avx512vnni, avx512vpopcntdq, clflushopt, clwb, clzero, fsgsbase, gfni, mwaitx, pclmul, pku, prfchw, rdpid, rdrnd, rdseed, sha, sse4a, vaes, vpclmulqdq, wbnoinvd, xsavec, xsaveopt, xsaves<\/code><\/p>\n<\/blockquote>\n\n\n\n<p>This seemed intriguing, so I decided to try a somewhat random collection of Phoronix tests using this repository.  I compare the performance of CachyOS with the Zen4 repository against Ubuntu 22.04.  A summary table follows:<\/p>\n\n\n\n<table id=\"tablepress-11\" class=\"tablepress tablepress-id-11\">\n<thead>\n<tr class=\"row-1\">\n\t<th class=\"column-1\">Metric<\/th><th class=\"column-2\">Direction<\/th><th class=\"column-3\">CachyOS Zen4<\/th><th class=\"column-4\">Ubuntu 22.04<\/th><th class=\"column-5\">Ratio<\/th>\n<\/tr>\n<\/thead>\n<tbody class=\"row-striping row-hover\">\n<tr class=\"row-2\">\n\t<td class=\"column-1\">coremark<\/td><td class=\"column-2\">higher<\/td><td class=\"column-3\">410390 iterations\/sec<\/td><td class=\"column-4\">438579 iterations\/sec<\/td><td class=\"column-5\">0.936<\/td>\n<\/tr>\n<tr class=\"row-3\">\n\t<td class=\"column-1\">build-linux-kernel<\/td><td class=\"column-2\">lower<\/td><td class=\"column-3\">114.035 seconds<\/td><td class=\"column-4\">116.25 seconds<\/td><td class=\"column-5\">1.019<\/td>\n<\/tr>\n<tr class=\"row-4\">\n\t<td class=\"column-1\">openssl: SHA256<\/td><td class=\"column-2\">higher<\/td><td class=\"column-3\">13254334897 \/ second<\/td><td class=\"column-4\">13420827833 \/ second<\/td><td class=\"column-5\">0.988<\/td>\n<\/tr>\n<tr class=\"row-5\">\n\t<td class=\"column-1\">openssl: SHA512<\/td><td class=\"column-2\">higher<\/td><td class=\"column-3\">4539223377 \/ second<\/td><td class=\"column-4\">4413779170 \/ second<\/td><td class=\"column-5\">1.028<\/td>\n<\/tr>\n<tr class=\"row-6\">\n\t<td 
class=\"column-1\">openssl: RSA4096<\/td><td class=\"column-2\">higher<\/td><td class=\"column-3\">5949.0 sign\/s<\/td><td class=\"column-4\">5713.1 sign\/s<\/td><td class=\"column-5\">1.041<\/td>\n<\/tr>\n<tr class=\"row-7\">\n\t<td class=\"column-1\">openssl: ChaCha20<\/td><td class=\"column-2\">higher<\/td><td class=\"column-3\">56376095723 byte\/second<\/td><td class=\"column-4\">55209170900 byte\/second<\/td><td class=\"column-5\">1.021<\/td>\n<\/tr>\n<tr class=\"row-8\">\n\t<td class=\"column-1\">openssl: AES-128-GCM<\/td><td class=\"column-2\">higher<\/td><td class=\"column-3\">108245069000 byte\/second<\/td><td class=\"column-4\">106316322953 byte\/second<\/td><td class=\"column-5\">1.018<\/td>\n<\/tr>\n<tr class=\"row-9\">\n\t<td class=\"column-1\">openssl: AES-256-GCM<\/td><td class=\"column-2\">higher<\/td><td class=\"column-3\">93611141897 byte\/second<\/td><td class=\"column-4\">91688082637 byte\/second<\/td><td class=\"column-5\">1.021<\/td>\n<\/tr>\n<tr class=\"row-10\">\n\t<td class=\"column-1\">openssl: ChaCha20-Poly1305<\/td><td class=\"column-2\">higher<\/td><td class=\"column-3\">40096535620 byte\/second<\/td><td class=\"column-4\">39278150087 byte\/second<\/td><td class=\"column-5\">1.021<\/td>\n<\/tr>\n<tr class=\"row-11\">\n\t<td class=\"column-1\">phpbench<\/td><td class=\"column-2\">higher<\/td><td class=\"column-3\">2243593 score<\/td><td class=\"column-4\">1055625 score<\/td><td class=\"column-5\">2.125<\/td>\n<\/tr>\n<tr class=\"row-12\">\n\t<td class=\"column-1\">ospray: particle_volume\/ao<\/td><td class=\"column-2\">higher<\/td><td class=\"column-3\">3.73062 \/ second<\/td><td class=\"column-4\">3.64172 \/ second<\/td><td class=\"column-5\">1.024<\/td>\n<\/tr>\n<tr class=\"row-13\">\n\t<td class=\"column-1\">ospray: particle_volume\/scivis<\/td><td class=\"column-2\">higher<\/td><td class=\"column-3\">3.685 \/ second<\/td><td class=\"column-4\">3.62516 \/ second<\/td><td class=\"column-5\">1.017<\/td>\n<\/tr>\n<tr 
class=\"row-14\">\n\t<td class=\"column-1\">ospray: particle_volume\/pathtracer<\/td><td class=\"column-2\">higher<\/td><td class=\"column-3\">120.966 \/ second<\/td><td class=\"column-4\">121.287 \/ second<\/td><td class=\"column-5\">0.997<\/td>\n<\/tr>\n<tr class=\"row-15\">\n\t<td class=\"column-1\">ospray: gravity\/ao<\/td><td class=\"column-2\">higher<\/td><td class=\"column-3\">2.62208 \/ second<\/td><td class=\"column-4\">2.79344 \/ second<\/td><td class=\"column-5\">0.939<\/td>\n<\/tr>\n<tr class=\"row-16\">\n\t<td class=\"column-1\">ospray: gravity\/scivis<\/td><td class=\"column-2\">higher<\/td><td class=\"column-3\">2.66757 \/ second<\/td><td class=\"column-4\">2.76412 \/ second<\/td><td class=\"column-5\">0.965<\/td>\n<\/tr>\n<tr class=\"row-17\">\n\t<td class=\"column-1\">ospray: gravity\/pathtracer<\/td><td class=\"column-2\">higher<\/td><td class=\"column-3\">3.58901 \/ second<\/td><td class=\"column-4\">3.45087 \/ second<\/td><td class=\"column-5\">1.040<\/td>\n<\/tr>\n<tr class=\"row-18\">\n\t<td class=\"column-1\">rawtherapee<\/td><td class=\"column-2\">lower<\/td><td class=\"column-3\">51.331 seconds<\/td><td class=\"column-4\">51.189 seconds<\/td><td class=\"column-5\">0.997<\/td>\n<\/tr>\n<tr class=\"row-19\">\n\t<td class=\"column-1\">namd: ATPase<\/td><td class=\"column-2\">higher<\/td><td class=\"column-3\">1.30961 \/ day<\/td><td class=\"column-4\">1.26995 \/ day<\/td><td class=\"column-5\">1.031<\/td>\n<\/tr>\n<tr class=\"row-20\">\n\t<td class=\"column-1\">namd: STMV<\/td><td class=\"column-2\">higher<\/td><td class=\"column-3\">0.39164 \/ day<\/td><td class=\"column-4\">0.37621 \/ day<\/td><td class=\"column-5\">1.048<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<!-- #tablepress-11 from cache -->\n\n\n<p>The benchmarks selected were a subset of those from <a href=\"https:\/\/www.phoronix.com\/review\/cachyos-x86-64-v3-v4\">this Phoronix article<\/a>.  
That article compares the various CachyOS repositories against each other while I am doing a comparison vs. Ubuntu.  Overall there was a smaller increase than I expected\/hoped.  Some particular items I will note from the table above:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>One additional difference is that CachyOS has a newer gcc, version 14.1, while Ubuntu 22.04 has gcc version 11.4.<\/li>\n\n\n\n<li>The coremark benchmark compiles the coremark code with gcc -O2.  I am not sure why it became slower, but it may be related to the compiler version.<\/li>\n\n\n\n<li>The build-linux-kernel benchmark measures the time to compile the kernel itself.  I was pleasantly surprised to see this faster, as my guess would have been that compile speed could have slowed.<\/li>\n\n\n\n<li>The various OpenSSL benchmarks might be specific to the underlying instructions, and again it is nice to see most of them slightly faster (SHA256 is the exception).<\/li>\n\n\n\n<li>Phpbench is the particular outlier with a 2x performance improvement.<\/li>\n\n\n\n<li>Ospray is mixed with a few benchmarks faster and a few slower.<\/li>\n\n\n\n<li>rawtherapee is just a slight bit slower.<\/li>\n\n\n\n<li>namd also shows a small improvement.<\/li>\n<\/ul>\n\n\n\n<p>Overall, it is nice to have one system running CachyOS as a dynamically updated system. Occasionally it is slightly more difficult to get benchmarks to run than on Ubuntu. Presumably this is because Ubuntu tends to be the default choice. So I don&#8217;t expect to shift everything over to CachyOS. <\/p>\n","protected":false},"excerpt":{"rendered":"<p>I have a Zen4 7940HS system where I have installed CachyOS. This is an Arch-based Linux OS with a focus on performance. In particular, the Why CachyOS? 
page cites an optimized scheduler as well as packages compiled for the particular <span class=\"excerpt-dots\">&hellip;<\/span> <a class=\"more-link\" href=\"https:\/\/mvermeulen.org\/perf\/2024\/07\/15\/cachyos-optimized-packages\/\"><span class=\"more-msg\">Continue reading &rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[11],"tags":[23,30,22],"class_list":["post-2518","post","type-post","status-publish","format-standard","hentry","category-experiment","tag-benchmarks","tag-cachyos","tag-phoronix"],"_links":{"self":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/posts\/2518","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/comments?post=2518"}],"version-history":[{"count":2,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/posts\/2518\/revisions"}],"predecessor-version":[{"id":2523,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/posts\/2518\/revisions\/2523"}],"wp:attachment":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/media?parent=2518"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/categories?post=2518"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/tags?post=2518"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}