Adding summary statistics for all benchmarks
After adding general parsing of measurement statistics, I can now also create a statistical summary across all ~170 benchmarks, as shown below. This lets me see, for example, the minimum, maximum, and mean IPC along with its standard deviation. That in turn indicates whether a particular workload is “low” or “high” in a metric, and by how much.
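As a rough illustration of this kind of summary, here is a minimal Python sketch. The data layout (one dict of metric values per workload) and the function names `summarize` and `summarize_all` are assumptions for illustration, not the actual report code, and the sample values are made up. It also assumes the sample standard deviation; the reports may use the population form instead.

```python
import statistics

def summarize(values):
    """Return count/min/max/median/mean/stddev for one metric."""
    # Skip workloads that did not report this metric (assumed to be
    # why some rows in a summary can have a lower count than others).
    vals = [v for v in values if v is not None]
    return {
        "count": len(vals),
        "min": min(vals),
        "max": max(vals),
        "median": statistics.median(vals),
        "mean": statistics.fmean(vals),
        # Sample standard deviation: an assumption, see the note above.
        "stddev": statistics.stdev(vals) if len(vals) > 1 else 0.0,
    }

def summarize_all(workloads):
    """workloads: list of dicts mapping metric name -> value (hypothetical layout)."""
    metrics = []
    for w in workloads:
        for name in w:
            if name not in metrics:
                metrics.append(name)  # keep first-seen order for the table
    return {m: summarize([w.get(m) for w in workloads]) for m in metrics}

# Made-up values for illustration only, not taken from the real reports.
workloads = [
    {"IPC": 1.0, "retire-rate": 20.0},
    {"IPC": 2.0, "retire-rate": 30.0},
    {"IPC": 3.0},  # this workload has no retire-rate measurement
]
summary = summarize_all(workloads)
for metric, row in summary.items():
    print(metric, row)
```

Computing each metric over only the workloads that report it keeps a missing counter from dragging the mean toward zero.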
The table below lists the AMD metrics first, followed by the Intel metrics. For example, the AMD rows give the following ranges for the topdown metrics:
- Retirement goes from 0.8% to 76.2% with a mean of 32.3% and a standard deviation of 15.6%. A retirement rate over 63.5% would be two standard deviations above the mean.
- Frontend stalls go from 0.1% to 73% with a mean of 22.5% and a standard deviation of 17.5%
- Backend stalls go from 4.1% to 97.1% with a mean of 41.6% and a standard deviation of 21.3%
- Speculative stalls go from 0% to 21.2% with a mean of 3.56% and a standard deviation of 3.95%
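
The "standard deviations above the mean" reasoning can be checked with a few lines of Python. This is only a sketch: the helper names `sigma_threshold` and `z_score` are made up for illustration, and the inputs are the AMD `retire-rate` mean, standard deviation, and maximum from the table.

```python
def sigma_threshold(mean, stddev, k=2.0):
    """Value that lies k standard deviations above the mean."""
    return mean + k * stddev

def z_score(value, mean, stddev):
    """How many standard deviations a value is from the mean."""
    return (value - mean) / stddev

# AMD retire-rate row: mean 32.3, stddev 15.6, max 76.2
threshold = sigma_threshold(32.3, 15.6)
z_max = z_score(76.2, 32.3, 15.6)

print(f"two-sigma threshold: {threshold:.1f}%")
print(f"z-score of the maximum: {z_max:.2f}")
```

With these inputs the two-sigma threshold works out to 32.3 + 2 × 15.6 = 63.5%, and the highest retirement rate (76.2%) sits about 2.8 standard deviations above the mean.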
These numbers are recalculated whenever the reports are regenerated, but with most of the roughly 170 workloads included, they give a good first overview of how the workloads behave on my AMD 7840.
Next steps include flagging the outliers in each metric and seeing how I can create histograms for the different fields below.
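
One possible shape for those next steps, sketched in Python. Everything here is an assumption for illustration: the function names, the two-standard-deviation cutoff, the equal-width binning, and the made-up IPC values and workload names. None of it is the existing report code.

```python
import statistics

def flag_outliers(values_by_workload, k=2.0):
    """Return {workload: z-score} for values more than k stddevs from the mean."""
    vals = list(values_by_workload.values())
    mean = statistics.fmean(vals)
    stddev = statistics.stdev(vals)  # sample stddev, an assumption
    if stddev == 0:
        return {}
    return {
        name: (v - mean) / stddev
        for name, v in values_by_workload.items()
        if abs(v - mean) > k * stddev
    }

def histogram(values, bins=5):
    """Count values into equal-width bins between min and max."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # avoid zero width when all values are equal
    counts = [0] * bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)  # the max lands in the last bin
        counts[idx] += 1
    return [(lo + i * width, counts[i]) for i in range(bins)]

# Made-up IPC values with hypothetical workload names.
ipc = {"w1": 1.0, "w2": 1.2, "w3": 1.1, "w4": 0.9, "w5": 1.3, "w6": 1.0, "w7": 4.5}

outliers = flag_outliers(ipc)
print("outliers:", outliers)

# Plain-text histogram: bin start followed by one '#' per workload.
for start, count in histogram(list(ipc.values())):
    print(f"{start:5.2f} {'#' * count}")
```

A fixed cutoff in standard deviations is simple but sensitive to heavy-tailed metrics such as `inblock`, where the standard deviation is far larger than the median; a median-based rule might be a better fit for those fields.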
| metric | count | min | max | median | mean | stddev |
|---|---|---|---|---|---|---|
| elapsed | 174 | 2.5 | 8.8e+03 | 554 | 1.25e+03 | 1.62e+03 |
| on_cpu | 174 | 0 | 16 | 7.26 | 7.31 | 5.53 |
| inblock | 172 | 0 | 5.47e+06 | 0 | 3.19e+04 | 4.16e+05 |
| onblock | 172 | 0.46 | 4e+05 | 133 | 9.97e+03 | 3.54e+04 |
| page-fault | 174 | 4.33 | 1.24e+05 | 2.36e+03 | 1.14e+04 | 2.04e+04 |
| context-switch | 174 | 1.51 | 5.12e+04 | 75 | 2.18e+03 | 7.73e+03 |
| IPC | 174 | 0.03 | 4.63 | 1.44 | 1.64 | 0.88 |
| GHz | 174 | 0 | 4.62 | 1.98 | 1.91 | 1.39 |
| retire-rate | 174 | 0.8 | 76.2 | 29.2 | 32.3 | 15.6 |
| frontend-stall | 174 | 0.1 | 73 | 19.1 | 22.5 | 17.5 |
| backend-stall | 174 | 4.1 | 97.1 | 36.8 | 41.6 | 21.3 |
| spec-stall | 174 | 0 | 21.2 | 2.7 | 3.56 | 3.95 |
| retire-ucode | 174 | 0 | 1 | 0 | 0.0736 | 0.131 |
| retire-fastpath | 174 | 0.7 | 76.2 | 24.7 | 27.8 | 14.5 |
| float-density | 174 | 0.013 | 676 | 67.5 | 133 | 154 |
| frontend-latency | 174 | 0.1 | 58.3 | 9.1 | 13.6 | 12.6 |
| frontend-bandwidth | 174 | 0 | 28.7 | 5.5 | 6.06 | 4.86 |
| opcache-miss | 87 | 52.4 | 54.7 | 53.8 | 53.8 | 0.401 |
| icache-miss | 87 | 8.2 | 9.6 | 8.5 | 8.57 | 0.294 |
| backend-cpu | 174 | 0.7 | 64 | 9.3 | 12.3 | 10.9 |
| backend-memory | 174 | 0.4 | 95.5 | 19.9 | 23.5 | 17.4 |
| amd-l2-miss | 174 | 0.08 | 59.8 | 16.3 | 17.3 | 11.7 |
| amd-l2-density | 174 | 0.036 | 470 | 38.2 | 49.7 | 58.9 |
| spec-branch | 174 | 0 | 21.1 | 2.1 | 3.07 | 3.8 |
| spec-pipeline | 174 | 0 | 1.4 | 0 | 0.11 | 0.205 |
| branch-miss | 174 | 0.01 | 14.8 | 1.96 | 2.79 | 2.99 |
| branch-density | 174 | 8.68 | 276 | 125 | 130 | 61.9 |
| branch-cond | 174 | 5.66 | 271 | 92.7 | 98.4 | 48.9 |
| branch-ind | 174 | 0.003 | 29.8 | 3.07 | 4.53 | 5.4 |
| smt-contention | 174 | 0 | 45.4 | 12.5 | 13.3 | 13.1 |

Intel metrics:

| metric | count | min | max | median | mean | stddev |
|---|---|---|---|---|---|---|
| elapsed | 169 | 1.49 | 1.44e+04 | 750 | 1.71e+03 | 2.45e+03 |
| on_cpu | 169 | 0 | 15.8 | 9.05 | 7.76 | 5.6 |
| inblock | 166 | 0 | 8.53e+04 | 53.9 | 2.33e+03 | 9.24e+03 |
| onblock | 166 | 0.37 | 4.25e+05 | 38.1 | 9.06e+03 | 3.59e+04 |
| page-fault | 169 | 4.05 | 1.17e+05 | 1.73e+03 | 1.02e+04 | 1.91e+04 |
| context-switch | 169 | 1.81 | 9.27e+04 | 72.6 | 2.44e+03 | 1e+04 |
| IPC | 169 | 0.05 | 5.54 | 1.82 | 2 | 0.975 |
| GHz | 169 | 0 | 3.06 | 1.46 | 1.3 | 0.885 |
| retire-rate | 169 | 3.5 | 87.6 | 42.8 | 44.3 | 14.7 |
| frontend-stall | 169 | 1 | 50.4 | 18.3 | 19.4 | 11.1 |
| backend-stall | 169 | 1.3 | 93.5 | 23.8 | 27.6 | 18.1 |
| spec-stall | 169 | 0 | 46.7 | 6.8 | 9.16 | 8.55 |
| retire-ucode | 169 | 0 | 16.7 | 3 | 3.37 | 2.4 |
| retire-fastpath | 169 | 2.3 | 83.4 | 39.5 | 40.9 | 14.1 |
| frontend-latency | 169 | 0.3 | 35.1 | 9.5 | 10.3 | 6.63 |
| frontend-bandwidth | 169 | 0.4 | 25.8 | 8.4 | 9.09 | 6 |
| backend-cpu | 169 | 0.6 | 67.7 | 10.4 | 13.8 | 11 |
| backend-memory | 169 | 0 | 90.4 | 10.1 | 13.8 | 14.1 |
| l1-stall | 77 | 0 | 24.4 | 5.2 | 5.45 | 4.88 |
| l2-stall | 77 | 0 | 57.1 | 8.6 | 9.01 | 9.09 |
| l3-stall | 77 | 0 | 35 | 2.6 | 4.07 | 5.51 |
| dram-stall | 77 | 0 | 49 | 6 | 8.86 | 10.6 |
| store-stall | 77 | 0 | 28.3 | 0.9 | 1.56 | 3.53 |
| intel-l2-miss | 169 | 0.59 | 92.9 | 26.8 | 28.2 | 17.1 |
| intel-l2-density | 169 | 0.029 | 370 | 26.9 | 35.7 | 45.2 |
| spec-branch | 169 | 0 | 46.7 | 6.2 | 8.71 | 8.58 |
| spec-pipeline | 169 | 0 | 6.2 | 0.3 | 0.456 | 0.701 |
| branch-miss | 169 | 0 | 20.4 | 1.03 | 1.81 | 2.58 |
| branch-density | 169 | 6.24 | 275 | 126 | 127 | 60.8 |
| branch-cond | 169 | 6.24 | 275 | 126 | 127 | 60.8 |
| branch-ind | 169 | 0.035 | 83 | 21.1 | 22.6 | 17 |
