{"id":111,"date":"2023-12-19T02:14:35","date_gmt":"2023-12-19T02:14:35","guid":{"rendered":"https:\/\/mvermeulen.org\/perf\/?p=111"},"modified":"2023-12-19T02:18:10","modified_gmt":"2023-12-19T02:18:10","slug":"new-i5-13500h-machine","status":"publish","type":"post","link":"https:\/\/mvermeulen.org\/perf\/2023\/12\/19\/new-i5-13500h-machine\/","title":{"rendered":"New i5-13500H machine"},"content":{"rendered":"\n<p>I have set up a new Intel performance machine for experiments. The processor is an <a href=\"https:\/\/www.intel.com\/content\/www\/us\/en\/products\/sku\/232147\/intel-core-i513500h-processor-18m-cache-up-to-4-70-ghz\/specifications.html\">i5-13500H<\/a> in a <a href=\"https:\/\/www.geekompc.com\/geekom-mini-it13-mini-pc\/\">Geekom MiniIT13<\/a> mini-PC.<\/p>\n\n\n\n<p>Following are some of the major parameters. The comparison is with the Ryzen 7840HS, which will be my AMD comparison microprocessor.<\/p>\n\n\n\n\n<table id=\"tablepress-1\" class=\"tablepress tablepress-id-1\">\n<thead>\n<tr class=\"row-1\">\n\t<th class=\"column-1\">Item<\/th><th class=\"column-2\">Ryzen 7840HS<\/th><th class=\"column-3\">i5-13500H<\/th><th class=\"column-4\">Notes<\/th>\n<\/tr>\n<\/thead>\n<tbody class=\"row-striping row-hover\">\n<tr class=\"row-2\">\n\t<td class=\"column-1\">Architecture<\/td><td class=\"column-2\">Zen4<\/td><td class=\"column-3\">Raptor Lake<\/td><td class=\"column-4\"><\/td>\n<\/tr>\n<tr class=\"row-3\">\n\t<td class=\"column-1\">Cores<\/td><td class=\"column-2\">8<\/td><td class=\"column-3\">12<br \/>\n4 performance (raptor cove)<br \/>\n8 efficiency (gracemont)<\/td><td class=\"column-4\"><\/td>\n<\/tr>\n<tr class=\"row-4\">\n\t<td class=\"column-1\">Threads<\/td><td class=\"column-2\">16<\/td><td class=\"column-3\">16<\/td><td class=\"column-4\"><\/td>\n<\/tr>\n<tr class=\"row-5\">\n\t<td class=\"column-1\">Base Clock<\/td><td class=\"column-2\">3.8 GHz<\/td><td class=\"column-3\">2.6 GHz, 1.9 GHz<\/td><td class=\"column-4\"><\/td>\n<\/tr>\n<tr 
class=\"row-6\">\n\t<td class=\"column-1\">Boost Clock<\/td><td class=\"column-2\">5.1 GHz<\/td><td class=\"column-3\">4.7 GHz, 3.5 GHz<\/td><td class=\"column-4\"><\/td>\n<\/tr>\n<tr class=\"row-7\">\n\t<td class=\"column-1\">TDP<\/td><td class=\"column-2\">35-45W<\/td><td class=\"column-3\">45W-95W<\/td><td class=\"column-4\">Set by vendor<\/td>\n<\/tr>\n<tr class=\"row-8\">\n\t<td class=\"column-1\">Memory<\/td><td class=\"column-2\">32 GB (2 x 16 GiB)<br \/>\n<br \/>\nDDR5 - 5600<br \/>\n<br \/>\n2 Memory Channels<\/td><td class=\"column-3\">16 GB<br \/>\n<br \/>\nDDR4 - 3200<br \/>\n<br \/>\n2 Memory Channels<\/td><td class=\"column-4\">Check BIOS for actual speed<\/td>\n<\/tr>\n<tr class=\"row-9\">\n\t<td class=\"column-1\">Stream<\/td><td class=\"column-2\">Copy: 71400 MB\/s<br \/>\nScale: 70300 MB\/s<br \/>\nAdd: 73600 MB\/s<br \/>\nTriad: 73000 MB\/s<\/td><td class=\"column-3\">Copy: 39200 MB\/s<br \/>\nScale: 39100 MB\/s<br \/>\nAdd: 40100 MB\/s<br \/>\nTriad: 40000 MB\/s<\/td><td class=\"column-4\">Measured<\/td>\n<\/tr>\n<tr class=\"row-10\">\n\t<td class=\"column-1\">Cache<\/td><td class=\"column-2\">L1 - 32kB, 8 way, 4 clocks<br \/>\n<br \/>\nL2 - 1 MB, 8-way, 14 clocks<br \/>\n<br \/>\nL3 - 16MB, 24 way, 47 clocks<\/td><td class=\"column-3\">L1 - 48 kB, 12-way\/8-way, 3\/5 clocks<br \/>\n<br \/>\nL2 - 1 MB, 10-way\/16-way, 15-20 clocks<br \/>\n<br \/>\nL3 - 18 MB, 10-way, 65-20 clocks<\/td><td class=\"column-4\">Agner Fog architecture document and likwid-topology<\/td>\n<\/tr>\n<tr class=\"row-11\">\n\t<td class=\"column-1\">lmbench<\/td><td class=\"column-2\">L1 - 0.8 ns<br \/>\nL2 - 3 ns<br \/>\nL3 - 8 ns<\/td><td class=\"column-3\">L1 - 1.3 ns, 1.0 ns<br \/>\nL2 - 4.4 ns, 8ns<br \/>\nL3 - 12 ns, 19ns<\/td><td class=\"column-4\">Measured in Nanoseconds<\/td>\n<\/tr>\n<tr class=\"row-12\">\n\t<td class=\"column-1\">Graphics<\/td><td class=\"column-2\">Radeon 780M<br \/>\n<br \/>\n12 cores<br \/>\n<br \/>\n2700 MHz<\/td><td class=\"column-3\">Intel 
Iris Xe<\/td><td class=\"column-4\"><\/td>\n<\/tr>\n<tr class=\"row-13\">\n\t<td class=\"column-1\">Phoronix stream<\/td><td class=\"column-2\">Average: 40604 MB\/s<\/td><td class=\"column-3\">Average: 35422 MB\/s<\/td><td class=\"column-4\">1.15x ratio, smaller than the optimized compiler results above<\/td>\n<\/tr>\n<tr class=\"row-14\">\n\t<td class=\"column-1\">Phoronix coremark<\/td><td class=\"column-2\">Average 464076 Iterations\/second<\/td><td class=\"column-3\">Average 388569 Iterations\/second<\/td><td class=\"column-4\">1.19x ratio<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<!-- #tablepress-1 from cache -->\n\n\n\n<p>Following is the topology shown by likwid-topology.  From the thread topology and description of the hardware we have:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Four <a href=\"https:\/\/wccftech.com\/intel-raptor-cove-cores-for-13th-gen-raptor-lake-cpus-feature-the-same-architecture-as-alder-lakes-golden-cove-cores\/\">raptor-cove<\/a> performance cores (cores 0,1,2,3 and threads 0-7)<\/li>\n\n\n\n<li>Eight <a href=\"https:\/\/en.wikipedia.org\/wiki\/Gracemont_(microarchitecture)\">gracemont<\/a> efficiency cores (cores 4-11 and threads 8-15)<\/li>\n<\/ul>\n\n\n\n<p>Depending on the thread binding we use, we can experiment with either type of core or take pot luck to see what happens.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>--------------------------------------------------------------------------------\nCPU name:\t13th Gen Intel(R) Core(TM) i5-13500H\nCPU type:\tUnknown Intel Processor\nCPU stepping:\t2\n********************************************************************************\nHardware Thread Topology\n********************************************************************************\nSockets:\t\t1\nCores per socket:\t12\nThreads per 
core:\t2\n--------------------------------------------------------------------------------\nHWThread\tThread\t\tCore\t\tSocket\t\tAvailable\n0\t\t0\t\t0\t\t0\t\t*\n1\t\t1\t\t0\t\t0\t\t*\n2\t\t0\t\t1\t\t0\t\t*\n3\t\t1\t\t1\t\t0\t\t*\n4\t\t0\t\t2\t\t0\t\t*\n5\t\t1\t\t2\t\t0\t\t*\n6\t\t0\t\t3\t\t0\t\t*\n7\t\t1\t\t3\t\t0\t\t*\n8\t\t0\t\t4\t\t0\t\t*\n9\t\t0\t\t5\t\t0\t\t*\n10\t\t0\t\t6\t\t0\t\t*\n11\t\t0\t\t7\t\t0\t\t*\n12\t\t0\t\t8\t\t0\t\t*\n13\t\t0\t\t9\t\t0\t\t*\n14\t\t0\t\t10\t\t0\t\t*\n15\t\t0\t\t11\t\t0\t\t*\n--------------------------------------------------------------------------------\nSocket 0:\t\t( 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 )\n--------------------------------------------------------------------------------\n********************************************************************************\nCache Topology\n********************************************************************************\nLevel:\t\t\t1\nSize:\t\t\t48 kB\nCache groups:\t\t( 0 1 ) ( 2 3 ) ( 4 5 ) ( 6 7 ) ( 8 9 ) ( 10 11 ) ( 12 13 ) ( 14 15 )\n--------------------------------------------------------------------------------\nLevel:\t\t\t2\nSize:\t\t\t1 MB\nCache groups:\t\t( 0 1 ) ( 2 3 ) ( 4 5 ) ( 6 7 ) ( 8 9 ) ( 10 11 ) ( 12 13 ) ( 14 15 )\n--------------------------------------------------------------------------------\nLevel:\t\t\t3\nSize:\t\t\t18 MB\nCache groups:\t\t( 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 )\n--------------------------------------------------------------------------------\n********************************************************************************\nNUMA Topology\n********************************************************************************\nNUMA domains:\t\t1\n--------------------------------------------------------------------------------\nDomain:\t\t\t0\nProcessors:\t\t( 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 )\nDistances:\t\t10\nFree memory:\t\t9261.24 MB\nTotal memory:\t\t15750.9 
MB\n--------------------------------------------------------------------------------\n<\/code><\/pre>\n\n\n\n<p>Following are the outputs from stream using the Intel compiler with -qopt-streaming-stores and running on cores 0 and 2.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>-------------------------------------------------------------\nSTREAM version $Revision: 5.10 $\n-------------------------------------------------------------\nThis system uses 8 bytes per array element.\n-------------------------------------------------------------\nArray size = 100000000 (elements), Offset = 0 (elements)\nMemory per array = 762.9 MiB (= 0.7 GiB).\nTotal memory required = 2288.8 MiB (= 2.2 GiB).\nEach kernel will be executed 100 times.\n The *best* time for each kernel (excluding the first iteration)\n will be used to compute the reported bandwidth.\n-------------------------------------------------------------\nNumber of Threads requested = 2\nNumber of Threads counted = 2\n-------------------------------------------------------------\nYour clock granularity\/precision appears to be 1 microseconds.\nEach test below will take on the order of 44926 microseconds.\n   (= 44926 clock ticks)\nIncrease the size of the arrays if this shows that\nyou are not getting at least 20 clock ticks per test.\n-------------------------------------------------------------\nWARNING -- The above is only a rough guideline.\nFor best results, please be sure you know the\nprecision of your system timer.\n-------------------------------------------------------------\nFunction    Best Rate MB\/s  Avg time     Min time     Max time\nCopy:           38956.0     0.041797     0.041072     0.042758\nScale:          35684.8     0.046104     0.044837     0.048052\nAdd:            37673.7     0.064326     0.063705     0.068297\nTriad:          37558.7     0.064531     0.063900     0.065920\n-------------------------------------------------------------\nSolution Validates: avg error less than 1.000000e-13 
on all three arrays\n-------------------------------------------------------------<\/code><\/pre>\n\n\n\n<p>This is slightly slower than the run built with the aocc (AMD) compiler shown below, so perhaps I am not picking optimal Intel compiler settings.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>-------------------------------------------------------------\nSTREAM version $Revision: 5.10 $\n-------------------------------------------------------------\nThis system uses 8 bytes per array element.\n-------------------------------------------------------------\nArray size = 100000000 (elements), Offset = 0 (elements)\nMemory per array = 762.9 MiB (= 0.7 GiB).\nTotal memory required = 2288.8 MiB (= 2.2 GiB).\nEach kernel will be executed 100 times.\n The *best* time for each kernel (excluding the first iteration)\n will be used to compute the reported bandwidth.\n-------------------------------------------------------------\nNumber of Threads requested = 2\nNumber of Threads counted = 2\n-------------------------------------------------------------\nYour clock granularity\/precision appears to be 1 microseconds.\nEach test below will take on the order of 43205 microseconds.\n   (= 43205 clock ticks)\nIncrease the size of the arrays if this shows that\nyou are not getting at least 20 clock ticks per test.\n-------------------------------------------------------------\nWARNING -- The above is only a rough guideline.\nFor best results, please be sure you know the\nprecision of your system timer.\n-------------------------------------------------------------\nFunction    Best Rate MB\/s  Avg time     Min time     Max time\nCopy:           39250.2     0.041471     0.040764     0.041949\nScale:          39164.8     0.041535     0.040853     0.042097\nAdd:            40115.0     0.060216     0.059828     0.060848\nTriad:          40038.7     0.060927     0.059942     0.061662\n-------------------------------------------------------------\nSolution Validates: avg error less than 1.000000e-13 
on all three arrays\n-------------------------------------------------------------\n<\/code><\/pre>\n\n\n\n<p>Following are outputs from lmbench using the performance cores<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>\"stride=1024\n0.00098 1.319\n0.00195 1.319\n0.00293 1.319\n0.00391 1.319\n0.00586 1.319\n0.00781 1.319\n0.01172 1.319\n0.01562 1.319\n0.02344 1.319\n0.03125 1.319\n0.04688 1.319\n0.06250 3.003\n0.09375 3.957\n0.12500 3.957\n0.18750 3.957\n0.25000 3.957\n0.37500 3.957\n0.50000 4.418\n0.75000 4.419\n1.00000 5.134\n1.50000 7.942\n2.00000 9.243\n3.00000 9.315\n4.00000 9.855\n6.00000 9.361\n8.00000 10.209\n12.00000 12.617\n16.00000 17.291\n24.00000 27.485\n32.00000 32.916\n48.00000 39.200\n64.00000 40.679\n96.00000 43.083\n128.00000 43.493\n192.00000 44.343\n256.00000 45.184\n384.00000 44.791\n512.00000 45.646\n768.00000 44.740\n1024.00000 45.498\n1536.00000 46.423\n2048.00000 45.713\n3072.00000 46.404\n4096.00000 46.764\n6144.00000 45.805\n8192.00000 46.870\n<\/code><\/pre>\n\n\n\n<p>Following are outputs from lmbench using the efficiency cores.  
In this measurement, L1 access is slightly faster and L2\/L3 access is slower than on the performance cores.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>\"stride=1024\n0.00098 1.074\n0.00195 1.074\n0.00293 1.074\n0.00391 1.074\n0.00586 1.074\n0.00781 1.074\n0.01172 1.079\n0.01562 1.079\n0.02344 1.079\n0.03125 1.078\n0.04688 6.395\n0.06250 7.162\n0.09375 6.746\n0.12500 7.162\n0.18750 7.163\n0.25000 7.968\n0.37500 7.975\n0.50000 7.980\n0.75000 7.976\n1.00000 7.980\n1.50000 8.823\n2.00000 11.131\n3.00000 14.390\n4.00000 16.334\n6.00000 15.335\n8.00000 15.241\n12.00000 19.302\n16.00000 27.135\n24.00000 46.186\n32.00000 51.094\n48.00000 51.997\n64.00000 51.977\n96.00000 52.156\n128.00000 52.119\n192.00000 52.228\n256.00000 52.197\n384.00000 52.085\n512.00000 51.583\n768.00000 51.018\n1024.00000 51.002\n1536.00000 50.598\n2048.00000 50.765\n3072.00000 50.612\n4096.00000 50.774\n6144.00000 50.952\n8192.00000 49.313<\/code><\/pre>\n\n\n\n<p>Following are selected entries from output from lshw.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>Motherboard: Default string\nBIOS: American Megatrends - 1.09 11\/10\/2023\nMemory: LD4S08G32C22ST\nNVMe - KINGSTON OM8SEP4512Q-A01 - 512 GB\nSCSI - Samsung SSD 870 - 4 TB<\/code><\/pre>\n\n\n\n<p>There are no benchmarks specific to this processor on <a href=\"https:\/\/www.phoronix.com\">phoronix.com<\/a>.  
However, following are two benchmark articles for more powerful desktop processors from the same Raptor Lake generation:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>The i5-13600K is a 14-core\/20-thread version &#8211; <a href=\"https:\/\/www.phoronix.com\/review\/intel-core-i5-13600k\">https:\/\/www.phoronix.com\/review\/intel-core-i5-13600k<\/a><\/li>\n\n\n\n<li>The i9-13900K is a 24-core\/32-thread version &#8211; <a href=\"https:\/\/www.phoronix.com\/review\/intel-core-i9-13900k\">https:\/\/www.phoronix.com\/review\/intel-core-i9-13900k<\/a><\/li>\n<\/ul>\n\n\n\n<p>These articles provide some additional areas for deeper analysis to understand how workloads exercise my processor.<\/p>\n\n\n\n<p>Overall, I now have a somewhat recent version of both Intel and AMD microprocessors to compare. The AMD processor has the stronger specifications, but the two are close enough to compare.  Both run 16 threads, though the core configurations are different.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I have set up a new Intel performance machine for experiments. The processor is an i5-13500H in a Geekom MiniIT13 mini-PC. Following are some of the major parameters. 
This comparison is with Ryzen 7840 which will be my AMD comparison <span class=\"excerpt-dots\">&hellip;<\/span> <a class=\"more-link\" href=\"https:\/\/mvermeulen.org\/perf\/2023\/12\/19\/new-i5-13500h-machine\/\"><span class=\"more-msg\">Continue reading &rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[9],"tags":[13],"class_list":["post-111","post","type-post","status-publish","format-standard","hentry","category-hardware","tag-i5-13500h"],"_links":{"self":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/posts\/111","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/comments?post=111"}],"version-history":[{"count":4,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/posts\/111\/revisions"}],"predecessor-version":[{"id":118,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/posts\/111\/revisions\/118"}],"wp:attachment":[{"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/media?parent=111"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/categories?post=111"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mvermeulen.org\/perf\/wp-json\/wp\/v2\/tags?post=111"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}