Commit bf41694afb1b02585ece2f87942338199a6d959b
1 parent 42bf1102
edits
Showing 2 changed files with 5 additions and 5 deletions
optimal_performance.tex
| @@ -97,7 +97,7 @@ algorithm presented by CoScale~\cite{deng2012coscale} takes 5us to find optimal | @@ -97,7 +97,7 @@ algorithm presented by CoScale~\cite{deng2012coscale} takes 5us to find optimal | ||
| 97 | frequency settings, time taken by PLLs to change voltage and frequency in commercial processors is in the | 97 | frequency settings, time taken by PLLs to change voltage and frequency in commercial processors is in the |
| 98 | order of 10s of microseconds. | 98 | order of 10s of microseconds. |
| 99 | Reducing the frequency at which tuning algorithms need to re-tune is critical to | 99 | Reducing the frequency at which tuning algorithms need to re-tune is critical to |
| 100 | -reduce the cost of tuning overhead on application performance. | 100 | +reduce the impact of tuning overhead on application performance. |
| 101 | 101 | ||
| 102 | %\item | 102 | %\item |
| 103 | \noindent \textit{Limited energy performance trade-off options.} Choosing the | 103 | \noindent \textit{Limited energy performance trade-off options.} Choosing the |
performance_clusters.tex
| @@ -26,7 +26,7 @@ the system. | @@ -26,7 +26,7 @@ the system. | ||
| 26 | We search for the performance clusters using an algorithm that is similar to the approach we used to find the optimal settings. We | 26 | We search for the performance clusters using an algorithm that is similar to the approach we used to find the optimal settings. We |
| 27 | first filter the settings that fall within a given inefficiency budget and | 27 | first filter the settings that fall within a given inefficiency budget and |
| 28 | then search for the optimal settings in the first pass. In the second pass, we find all of the | 28 | then search for the optimal settings in the first pass. In the second pass, we find all of the |
| 29 | -settings that have a speedup within the specified \textit{cluster threshold} of the optimal performance. | 29 | +settings that have a speedup within the specified cluster threshold of the optimal performance. |
| 30 | \begin{figure}[t] | 30 | \begin{figure}[t] |
| 31 | \centering | 31 | \centering |
| 32 | \includegraphics[width=\columnwidth]{./figures/plots/496/stable_line_plots/lbm_stable_lineplot_annotated_5.pdf} | 32 | \includegraphics[width=\columnwidth]{./figures/plots/496/stable_line_plots/lbm_stable_lineplot_annotated_5.pdf} |
| @@ -96,13 +96,13 @@ Figures~\ref{clusters-gobmk}(c),~\ref{clusters-gobmk}(d) plot the | @@ -96,13 +96,13 @@ Figures~\ref{clusters-gobmk}(c),~\ref{clusters-gobmk}(d) plot the | ||
| 96 | performance clusters for \textit{gobmk} for inefficiency budget of 1.3 and | 96 | performance clusters for \textit{gobmk} for inefficiency budget of 1.3 and |
| 97 | cluster thresholds of 1\% and 5\% respectively. As we observed in | 97 | cluster thresholds of 1\% and 5\% respectively. As we observed in |
| 98 | Figure~\ref{gobmk-optimal}, the optimal settings for \textit{gobmk} change | 98 | Figure~\ref{gobmk-optimal}, the optimal settings for \textit{gobmk} change |
| 99 | -every sample (of length 10 million instructions) and follows | 99 | +every sample (of length 10 million instructions) at inefficiency of 1.3 and follow |
| 100 | application phases (CPI). Figure~\ref{clusters-gobmk}(c) shows that by | 100 | application phases (CPI). Figure~\ref{clusters-gobmk}(c) shows that by |
| 101 | allowing just 1\% performance degradation, the number of settings | 101 | allowing just 1\% performance degradation, the number of settings |
| 102 | available to choose from increase. For example, for sample 11, the | 102 | available to choose from increase. For example, for sample 11, the |
| 103 | -optimal settings were at 1000MHz CPU and 500MHz memory. With 1\% | 103 | +optimal settings were at 920MHz CPU and 580MHz memory. With 1\% |
| 104 | cluster threshold, the range of available frequencies increases to | 104 | cluster threshold, the range of available frequencies increases to |
| 105 | -970MHz-1000MHz for CPU and 420MHz-580MHz for memory. With a 5\% | 105 | +900MHz-9200MHz for CPU and 420MHz-580MHz for memory. With a 5\% |
| 106 | cluster threshold, the range of available frequencies increases | 106 | cluster threshold, the range of available frequencies increases |
| 107 | further as shown in Figure~\ref{clusters-gobmk}(d). With an increase in number of available settings, the | 107 | further as shown in Figure~\ref{clusters-gobmk}(d). With an increase in number of available settings, the |
| 108 | probability of finding common settings in two consecutive samples | 108 | probability of finding common settings in two consecutive samples |