Commit 175cc70418a55f0e91b66121d6cbfd76afc6f804

Authored by Rizwana Begum
1 parent 2758a2d6

final submission

algorithm_implications.tex
... ... @@ -29,12 +29,12 @@ detect stable regions of clusters containing both memory and CPU settings.
29 29  
30 30 \item \textit{Offline Analysis:} Another approach that can be taken to reduce
31 31 the number of tuning events is offline analysis of the applications. An
32   -application can be profiled once offline to identify regions in which the
33   -performance cluster is stable. The profiled information of the stable region
  32 +application can be profiled offline to identify regions in which the
  33 +performance cluster is stable. The profile information of the stable region
34 34 lengths, positions, and available settings can then be used at run time to enable
35 35 the system to predict how long it can go without tuning. Algorithms can also
36 36 extend the usage of the profiled information to new applications that may have
37   -phases that match with already profiled data. Previous work has already proposed
  37 +phases that match with existing profiled data. Previous work has already proposed
38 38 using offline analysis methods to detect application
39 39 phases~\cite{Lau:2006:CGO:phase}, which would be directly applicable here in our
40 40 system.
... ...
performance_clusters.tex
... ... @@ -213,7 +213,7 @@ available settings increase with inefficiency increasing the average length of
213 213 stable regions. At an inefficiency budget of 1.6, the average length of a stable region
214 214 increases drastically as shown in Figure~\ref{box-lengths}(b), which requires far fewer
215 215 transitions with a 1\% cluster threshold and no transitions with higher cluster thresholds of 3\%
216   -and 5\%. Note that there is only one point on the box plot for 3\% and 5\%
  216 +and 5\%. Note that there is only one point on the box plot of bzip2 for 3\% and 5\%
217 217 cluster thresholds at inefficiency of 1.6, because the benchmark is covered entirely by only one region.
218 218 %and therefore no distribution is available.
219 219 However, \textit{gobmk} has rapidly changing phases and
... ... @@ -339,8 +339,9 @@ available to make better energy-performance trade-offs. Therefore average number
339 339 of samples for which one setting can be chosen decreases. For example, with 70
340 340 frequency settings sample 7 through sample 10 can always run at CPU frequency of 900MHz
341 341 and memory frequency of 300MHz. With 496 frequency settings, sample 7
342   -runs at one setting, sample 8-9 runs at another setting and sample 10 runs at a
343   -different setting due to the availability of more (and better) choices.
  342 +runs at a CPU frequency of 900MHz, samples 8-9 run at 950MHz, and sample 10
  343 +runs at 980MHz. Finer frequency steps make more (and better) choices
  344 +available, resulting in shorter stable regions.
344 345 %\XXXnote{sounds wordy -Dave}.
345 346 In our system, we observed only a small improvement in performance (\textless
346 347 1\%) with an increased number of frequency steps when
... ... @@ -359,6 +360,7 @@ critical in deciding the correct size of the search space.
359 360 Figure~(a) plots performance clusters collected using 100MHz of frequency step for both CPU and
360 361 memory. Figure~(b) plots
361 362 performance clusters collected using frequency steps of 30MHz for CPU and 40MHz
  363 +for memory. We simulate a frequency range of 100MHz-1000MHz for CPU and 200MHz-800MHz
362 364 for memory.}
363 365 \label{sensitivity}
364 366 \end{figure}
... ...
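The setting counts quoted in performance_clusters.tex (70 and 496) appear to follow from the simulated ranges and step sizes given in the figure caption. The sketch below checks that arithmetic. It assumes both endpoints of each range are included and that settings are evenly spaced; the function names are hypothetical and not taken from the paper's code.

```python
# Sketch: count the joint CPU/memory frequency settings for a given step size.
# Assumption: ranges are inclusive at both ends and evenly spaced.

CPU_RANGE_MHZ = (100, 1000)  # simulated CPU range from the figure caption
MEM_RANGE_MHZ = (200, 800)   # simulated memory range from the figure caption


def frequency_settings(lo_mhz, hi_mhz, step_mhz):
    """Return the list of frequencies from lo_mhz to hi_mhz in step_mhz steps."""
    return list(range(lo_mhz, hi_mhz + 1, step_mhz))


def search_space_size(cpu_step_mhz, mem_step_mhz):
    """Number of (CPU, memory) frequency pairs for the given step sizes."""
    cpu = frequency_settings(*CPU_RANGE_MHZ, cpu_step_mhz)
    mem = frequency_settings(*MEM_RANGE_MHZ, mem_step_mhz)
    return len(cpu) * len(mem)


coarse = search_space_size(100, 100)  # 10 CPU settings * 7 memory settings
fine = search_space_size(30, 40)      # 31 CPU settings * 16 memory settings
print(coarse, fine)
```

Under these assumptions the 100MHz steps give 70 settings and the 30MHz/40MHz steps give 496, matching the counts used in the text.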