Commit bf41694afb1b02585ece2f87942338199a6d959b
1 parent 42bf1102
edits
Showing 2 changed files with 5 additions and 5 deletions
optimal_performance.tex
| ... | ... | @@ -97,7 +97,7 @@ algorithm presented by CoScale~\cite{deng2012coscale} takes 5us to find optimal |
| 97 | 97 | frequency settings, time taken by PLLs to change voltage and frequency in commercial processors is in the |
| 98 | 98 | order of 10s of microseconds. |
| 99 | 99 | Reducing the frequency at which tuning algorithms need to re-tune is critical to |
| 100 | -reduce the cost of tuning overhead on application performance. | |
| 100 | +reduce the impact of tuning overhead on application performance. | |
| 101 | 101 | |
| 102 | 102 | %\item |
| 103 | 103 | \noindent \textit{Limited energy performance trade-off options.} Choosing the | ... | ... |
performance_clusters.tex
| ... | ... | @@ -26,7 +26,7 @@ the system. |
| 26 | 26 | We search for the performance clusters using an algorithm that is similar to the approach we used to find the optimal settings. We |
| 27 | 27 | first filter the settings that fall within a given inefficiency budget and |
| 28 | 28 | then search for the optimal settings in the first pass. In the second pass, we find all of the |
| 29 | -settings that have a speedup within the specified \textit{cluster threshold} of the optimal performance. | |
| 29 | +settings that have a speedup within the specified cluster threshold of the optimal performance. | |
| 30 | 30 | \begin{figure}[t] |
| 31 | 31 | \centering |
| 32 | 32 | \includegraphics[width=\columnwidth]{./figures/plots/496/stable_line_plots/lbm_stable_lineplot_annotated_5.pdf} |
| ... | ... | @@ -96,13 +96,13 @@ Figures~\ref{clusters-gobmk}(c),~\ref{clusters-gobmk}(d) plot the |
| 96 | 96 | performance clusters for \textit{gobmk} for inefficiency budget of 1.3 and |
| 97 | 97 | cluster thresholds of 1\% and 5\% respectively. As we observed in |
| 98 | 98 | Figure~\ref{gobmk-optimal}, the optimal settings for \textit{gobmk} change |
| 99 | -every sample (of length 10 million instructions) and follows | |
| 99 | +every sample (of length 10 million instructions) at inefficiency of 1.3 and follow | |
| 100 | 100 | application phases (CPI). Figure~\ref{clusters-gobmk}(c) shows that by |
| 101 | 101 | allowing just 1\% performance degradation, the number of settings |
| 102 | 102 | available to choose from increases. For example, for sample 11, the |
| 103 | -optimal settings were at 1000MHz CPU and 500MHz memory. With 1\% | |
| 103 | +optimal settings were at 920MHz CPU and 580MHz memory. With 1\% | |
| 104 | 104 | cluster threshold, the range of available frequencies increases to |
| 105 | -970MHz-1000MHz for CPU and 420MHz-580MHz for memory. With a 5\% | |
| 105 | +900MHz-920MHz for CPU and 420MHz-580MHz for memory. With a 5\% | |
| 106 | 106 | cluster threshold, the range of available frequencies increases |
| 107 | 107 | further as shown in Figure~\ref{clusters-gobmk}(d). With an increase in number of available settings, the |
| 108 | 108 | probability of finding common settings in two consecutive samples | ... | ... |
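
The two-pass search described in the performance_clusters.tex hunk above (filter by inefficiency budget, find the optimal setting, then collect every setting within the cluster threshold) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Setting` record, the `performance_cluster` function, and the reading of "within the cluster threshold" as a relative speedup loss are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Setting:
    """Hypothetical record for one CPU/memory frequency pair in one sample."""
    cpu_mhz: int
    mem_mhz: int
    speedup: float       # performance relative to some baseline setting
    inefficiency: float  # energy relative to the minimum energy possible


def performance_cluster(
    settings: List[Setting], budget: float, threshold: float
) -> Tuple[Optional[Setting], List[Setting]]:
    """Return (optimal setting, performance cluster) for one sample.

    Assumption: a setting belongs to the cluster when its speedup is at
    least (1 - threshold) times the optimal speedup.
    """
    # Pass 1: keep settings within the inefficiency budget, pick the fastest.
    feasible = [s for s in settings if s.inefficiency <= budget]
    if not feasible:
        return None, []
    optimal = max(feasible, key=lambda s: s.speedup)

    # Pass 2: collect every feasible setting close enough to the optimum.
    cutoff = optimal.speedup * (1.0 - threshold)
    cluster = [s for s in feasible if s.speedup >= cutoff]
    return optimal, cluster
```

Called with a budget of 1.3 and a threshold of 0.01 or 0.05, the function mirrors the 1\% and 5\% cases discussed in the text: a larger threshold can only keep or enlarge the cluster, which is why common settings across consecutive samples become more likely.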