\section{Inefficiency vs. Speedup}
\label{sec-ineff-speedup}

Scaling individual components---CPU and memory---using DVFS has been studied in
the past to make power-performance trade-offs. To the best of our knowledge,
prior work has not studied the system-level energy-performance trade-offs of
combined CPU and memory DVFS. We take a first step toward exploring these
trade-offs and show that incorrect frequency settings may burn extra energy
without improving performance.

We performed offline analysis of the data collected from our simulations to
study the inefficiency-performance trends for various benchmarks. With a
brute-force search, we found $E_{min}$ and computed inefficiency at all
frequency settings. We express performance in terms of \textit{speedup}, the
ratio of the longest execution time to the execution time for a given
configuration.
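The analysis above can be sketched in a few lines. This is a minimal
illustration, not our analysis code: the function name \texttt{analyze} and the
data layout (a dictionary mapping frequency pairs to measured energy and time)
are our own assumptions for exposition.

```python
# Hypothetical sketch of the offline analysis: given per-configuration
# measurements, compute inefficiency (energy relative to E_min) and
# speedup (time relative to the slowest run). Names and data layout
# are illustrative assumptions, not the paper's actual tooling.

def analyze(measurements):
    """measurements: dict mapping (cpu_mhz, mem_mhz) -> (energy_j, time_s)."""
    # Brute-force search for the minimum energy across all settings.
    e_min = min(e for e, _ in measurements.values())
    # The slowest run defines the speedup baseline.
    t_max = max(t for _, t in measurements.values())
    results = {}
    for cfg, (energy, time) in measurements.items():
        inefficiency = energy / e_min   # 1.0 at the most energy-efficient point
        speedup = t_max / time          # 1.0 at the slowest configuration
        results[cfg] = (inefficiency, speedup)
    return results
```

Under these definitions an inefficiency of 1.55 means the configuration burns
55\% more energy than the most efficient setting.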

Figure~\ref{heatmaps} plots the speedup and inefficiency for three workloads
operating at various CPU and memory frequencies. As the figure shows, the
ability of a workload to trade off energy and performance using CPU and memory
frequencies depends on its mix of CPU and memory instructions. For
CPU-intensive workloads like \textit{bzip2}, speedup varies only with CPU
frequency; memory frequency has no impact. For workloads with balanced CPU-
and memory-intensive phases like \textit{gobmk}, speedup varies with both CPU
and memory frequency. The \textit{milc} benchmark has some memory-intensive
phases, but it is predominantly CPU intensive, so its performance depends more
on CPU frequency than on memory frequency.
We make three major observations:

\noindent \textit{Running slower doesn't mean the system is running
efficiently.} At the lowest frequencies, 100MHz for CPU and 200MHz for memory,
\textit{gobmk} takes the longest to execute. These settings slow down the
application so much that its overall energy consumption increases, resulting
in an inefficiency of 1.55. Algorithms that choose these frequency settings
spend 55\% more energy without any performance improvement.

\noindent \textit{Higher inefficiency doesn't always result in higher
performance.} \textit{gobmk} is fastest at a CPU frequency of 1000MHz and a
memory frequency of 800MHz, where it runs at an inefficiency of 1.65. Allowing
\textit{gobmk} to run at a higher inefficiency of, say, 2.2 doesn't improve
performance. In fact, any algorithm that forces the application to consume its
entire energy budget may degrade application performance. For example,
\textit{gobmk} runs 1.5x slower if it is forced to run at an inefficiency
budget of 2.2 with CPU and memory frequencies of 1000MHz and 200MHz,
respectively.

\noindent \textit{Smart algorithms should search for optimal points \textbf{under}
the inefficiency constraint and \textbf{not} just \textbf{at} the inefficiency
constraint.} Algorithms forcing the system to run exactly at the given budget
might end up wasting energy or, even worse, degrading performance. A smart
algorithm should (a) stay under the given inefficiency budget, (b) use only as
much of that budget as needed, and (c) deliver the best performance.
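The selection rule in the third observation can be sketched as follows. This
is a hypothetical illustration, not an algorithm from the paper: the function
name \texttt{pick\_configuration} and the input layout are our own assumptions.

```python
# Hypothetical sketch of the "smart" selection rule: among configurations
# whose inefficiency stays under the budget, pick the best-performing one,
# rather than a configuration that merely exhausts the budget.

def pick_configuration(results, budget):
    """results: dict mapping (cpu_mhz, mem_mhz) -> (inefficiency, speedup)."""
    feasible = {cfg: (ineff, spd) for cfg, (ineff, spd) in results.items()
                if ineff <= budget}
    if not feasible:
        return None  # no setting satisfies the inefficiency constraint
    # Best speedup first; break ties toward lower inefficiency, so no more
    # of the budget is used than needed.
    return max(feasible, key=lambda cfg: (feasible[cfg][1], -feasible[cfg][0]))
```

For the \textit{gobmk} numbers above, this rule would select the 1000MHz/800MHz
setting (inefficiency 1.65) under a budget of 2.2, rather than a slower setting
that consumes the entire budget.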

Consequently, like other constraints used by energy management algorithms,
such as performance, power, and absolute energy, inefficiency still allows
algorithms to waste system energy. We suggest that, even though inefficiency
doesn't completely eliminate the problem of wasting energy, it mitigates it.
For example, rate-limiting approaches waste energy because the energy budget
is specified for a given time interval and doesn't require a specific amount
of work to be done within that budget. Inefficiency, in contrast, mandates
that the underlying algorithms complete the given amount of work under the
constraint. For example, an inefficiency of 1.5 indicates that 50\% of
additional energy can be burnt to \textit{complete} the given amount of work,
thereby reducing the energy waste.