inefficiency.tex 8.99 KB
\section{Inefficiency}

\label{sec-inefficiency}

Extending battery life is critical for mobile systems, and therefore energy
management algorithms for mobile systems should optimize performance under
\textit{energy constraints}.
%
While several researchers have proposed algorithms that work under energy
constraints, these approaches require that the constraints are expressed in
terms of absolute energy~\cite{mobiheld09-cinder,ecosystem}.
%
For example, rate-limiting approaches take the maximum energy that can be
consumed in a given time period as an input~\cite{mobiheld09-cinder}.
%
Once the application consumes its limit, it is paused until the next time
period begins.

Unfortunately, in practice, it is difficult to choose absolute energy
constraints appropriately for a diverse group of applications without
understanding their inherent energy needs.
%
Energy consumption varies across applications, devices, and operating
conditions, making it impractical to choose an absolute energy budget.
%
Also, applying absolute energy constraints may slow down applications to the
point that total energy consumption \textit{increases} and
performance is degraded.

Other metrics that incorporate energy take the form of $Energy * Delay^n$.
%
We argue that while the energy-delay product can be used as a
\textit{measure} to gauge energy-performance trade-offs, it is not a suitable
\textit{constraint} to specify how much energy can be used to improve
performance.
%
A effective constraint should be (1) relative to the applications inherent
energy needs and (2) independent of applications and devices.
%
Because it uses absolute energy, the energy-delay product meets neither of
these criteria.

We propose a new metric called \textit{inefficiency}, which constrains how
much extra energy an application can use to improve performance.
%
Energy efficiency is defined as the work done per unit energy.
%
Therefore, the application is said to be most efficient when it consumes the
minimum energy possible on the given device.
%
The application becomes inefficient as it starts consuming more than the
minimum energy it requires.
%
We define the ratio of application's energy consumption ($E$) and the minimum
energy the application could have consumed ($E_{min}$) on the same device as
inefficiency: $I = \frac{E}{E_{min}}$. 
%
An \textit{inefficiency} of $1$ represents an application's most efficient
execution, while $1.5$ indicates that the application consumed $50\%$ more
energy that its most efficient execution.
%
Inefficiency is independent of workloads and devices and avoids the problems
inherent to absolute energy constraints.

Four questions arise when using inefficiency to establish energy
constraints for real systems:
%
\begin{enumerate}
    %
  \item What are the bounds of inefficiency?
    %
  \item How is the inefficiency budget set for a given application?
    %
  \item How is inefficiency computed?
    %
  \item How should systems stay within the inefficiency budget?
    %
\end{enumerate}

We continue by addressing these questions.
\subsection{Inefficiency Bounds and Inefficiency Budget}
% 27 Apr 2015 : GWA : I didn't understand this.

% We argue that absolute value of $I_{max}$ is irrelevant because, even when
% energy is unconstrained, algorithms should focus on delivering the best
% performance.
Devices will operate between an inefficiency of 1 and $I_{max}$ which
represents the unbounded energy constraint allowing the application to
consume as much energy as necessary to deliver the best performance.
%
$I_{max}$ depends upon applications and devices.
%
We argue that absolute value of $I_{max}$ is irrelevant
because, when energy is unconstrained, algorithms can burn unbounded energy and
only focus on
delivering the best performance.
%
The inefficiency budget matters the most when application has bounded energy
constraints and it can be set by the user or the applications. 
%For example, an inefficiency budget of 1.2 means that the user is willing to lose 20\% more energy to execute the application.
%
The OS can also set the inefficiency budget based on application's priority
allowing the higher priority applications to burn more energy than lower
priority applications. 
%
While higher inefficiency values represent looser energy constraints, this
does not guarantee higher performance.
%
It is the responsibility of the energy management algorithms to provide best
performance under a given inefficiency budget. 

\subsection{Computing Inefficiency}

Once the system specifies an inefficiency budget, the energy management
algorithms must tune the system to stay within the inefficiency
budget while delivering the best performance. 
%
To compute inefficiency, we need
both the energy ($E$) consumed by the application and the minimum energy
($E_{min}$) that application could have consumed.
%
Computing $E$ is straight
forward; Intel Sandy bridge architecture ~\cite{sandy-bridge-sw-manual} already
provides counters capable of measuring energy consumption at
runtime and the research community has 
tools and models to estimate the absolute energy of applications~\cite{brooks2000wattch,drampower-tool,li2009mcpat,micronpowercalc-lpddr3-url,wilton1996cacti}.

Computing $E_{min}$ is challenging due to inter-component
dependencies.
%
We propose two methods for computing $E_{min}$:
%
\begin{itemize}
    %
  \item \textbf{Brute force search:} $E_{min}$ can be estimated using the
    power models (or tools) for a given workload at all possible system
    settings.
    %
    The minimum of all these estimations is $E_{min}$.
    %
    While the overhead of this approach is high, it could be improved with a
    lookup table. 

  \item \textbf{Predicting and learning:} The overhead of the $E_{min}$ computation
    can be further reduced by predicting $E_{min}$ based on previous observations
    and learning continuously. 
    %
    A variety of learning based approaches~\cite{li2009machine} have
    been proposed in the past to estimate various metrics and application phases
    which can be applied to $E_{min}$ estimation as well. 

\end{itemize}

We are working towards designing efficient energy prediction models for CPU and 
memory.
%
Our models consider cross-component interactions on performance and energy
consumption.
%
%%%%%%%% MODEL %%%%%%%%%%
%We designed efficient models to predict performance and energy consumption of
%CPU and memory at various voltage and frequency settings for a given
%application. We plan on using these models to estimate $E_{min}$ of a given set
%of instructions. 
%%We envision a system capable of scaling voltage and frequency of CPU and only
%%frequency of DRAM. 
%Our models consider cross-component interactions on performance and energy.
%The performance model uses hardware performance counters to measure amount of time
%each component is $Busy$ completing the work, $Idle$ stalled on the other
%component and $Waiting$ for more work. We designed systematic methodology to
%scale these states to estimate execution time of a given workload at different
%voltage and frequency settings. In our model, the $Idle$ time of one component
%depends on the settings of the second component. The $Busy$ time of each
%component scales with it's own frequency. However, part of the $Busy$ time that
%overlaps with the other component is constrained by the slowest component.
%
%We combine predicted performance with the power models of CPU and memory
%described in Section~\ref{subsec-energy-models} to estimate energy consumption
%of CPU and memory. Our model has average prediction error of 4\% across SPEC
%CPU2006 benchmarks with highest error of 10\% except for $gobmk (18\%)$ and $lbm
%(24\%)$. 
%%%%% END OF MODEL %%%%%%
In this work we demonstrate how to use inefficiency, deferring predicting and 
optimizing $E_{min}$ to future work.

\subsection{Managing Inefficiency}
%
Future energy management algorithms need to tune system settings to keep the
system within specified inefficiency budget and deliver the best performance. 
%
Techniques that use predictors such as instructions-per-cycle (IPC) to decide
when to use DVFS or migrate threads can be extended to operate under given
inefficiency
budget~\cite{isca09-abhishek,ieeemicro05-isci,micro06-isci,isca09-thread-motion}.
%
Efforts that have tried to optimize memory energy consumption can be adapted
to use inefficiency as a constraint to their
system~\cite{david2011memory,deng2012multiscale,deng2011memscale,diniz2007limiting,lebeck2000power,malladi2012towards,sudan2010micro,zheng2009decoupled}.
%
While most of the existing multi-component energy management approaches work
under performance constraints, some have potential to be modified to work
under energy constraints and thus could operate under 
inefficiency budget~\cite{bitirgen2008coordinated,deng2012coscale,chen2011coordinating,fan2005synergy,felter2005performance,li2007cross,raghavendra2008no}.
%
We leave building some of these algorithms into a system as future work.
%
In this paper, we characterize the optimal performance point under different
inefficiency constraints and illustrate that the stability of these points
has implications for future algorithms.