inefficiency.tex
9.01 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
\section{Inefficiency}
\label{sec-inefficiency}
Extending battery life is critical for mobile systems, and therefore energy
management algorithms for mobile systems should optimize performance under
\textit{energy constraints}.
%
While several researchers have proposed algorithms that work under energy
constraints, these approaches require that the constraints are expressed in
terms of absolute energy~\cite{mobiheld09-cinder,ecosystem}.
%
For example, rate-limiting approaches take the maximum energy that can be
consumed in a given time period as an input~\cite{mobiheld09-cinder}.
%
Once the application consumes its limit, it is paused until the next time
period begins.
Unfortunately, in practice, it is difficult to choose absolute energy
constraints appropriately for a diverse group of applications without
understanding their inherent energy needs.
%
Energy consumption varies across applications, devices, and operating
conditions, making it impractical to choose an absolute energy budget.
%
Also, applying absolute energy constraints may slow down applications to the
point that total energy consumption \textit{increases} and
performance is degraded.
Other metrics that incorporate energy take the form of $Energy * Delay^n$.
%
We argue that while the energy-delay product can be used as a
\textit{measure} to gauge energy-performance trade-offs, it is not a suitable
\textit{constraint} to specify how much energy can be used to improve
performance.
%
A effective constraint should be (1) relative to the applications inherent
energy needs and (2) independent of applications and devices.
%
Because it uses absolute energy, the energy-delay product meets neither of
these criteria.
We propose a new metric called \textit{inefficiency}, which constrains how
much extra energy an application can use to improve performance.
%
Energy efficiency is defined as the work done per unit energy.
%
Therefore, the application is said to be most efficient when it consumes the
minimum energy possible on the given device.
%
The application becomes inefficient as it starts consuming more than the
minimum energy it requires.
%
We define the ratio of application's energy consumption ($E$) and the minimum
energy the application could have consumed ($E_{min}$) on the same device as
inefficiency: $I = \frac{E}{E_{min}}$.
%
An \textit{inefficiency} of $1$ represents an application's most efficient
execution, while $1.5$ indicates that the application consumed $50\%$ more
energy that its most efficient execution.
%
Inefficiency is independent of workloads and devices and avoids the problems
inherent to absolute energy constraints.
Four questions arise when using inefficiency to establish energy
constraints for real systems:
%
\begin{enumerate}
%
\item What are the bounds of inefficiency?
%
\item How is the inefficiency budget set for a given application?
%
\item How is inefficiency computed?
%
\item How should systems stay within the inefficiency budget?
%
\end{enumerate}
We continue by addressing these questions.
\subsection{Inefficiency Bounds and Inefficiency Budget}
% 27 Apr 2015 : GWA : I didn't understand this.
% We argue that absolute value of $I_{max}$ is irrelevant because, even when
% energy is unconstrained, algorithms should focus on delivering the best
% performance.
Devices will operate between an inefficiency of 1 and $I_{max}$ which
represents the unbounded energy constraint allowing the application to
consume as much energy as necessary to deliver the best performance.
%
$I_{max}$ depends upon applications and devices.
%
We argue that absolute value of $I_{max}$ is irrelevant
because, when energy is unconstrained, algorithms can burn unbounded energy and
only focus on
delivering the best performance.
%
The inefficiency budget matters the most when application has bounded energy
constraints and it can be set by the user or the applications.
%For example, an inefficiency budget of 1.2 means that the user is willing to lose 20\% more energy to execute the application.
%
The OS can also set the inefficiency budget based on application's priority
allowing the higher priority applications to burn more energy than lower
priority applications.
%
While higher inefficiency values represent looser energy constraints, this
does not guarantee higher performance.
%
It is the responsibility of the energy management algorithms to provide best
performance under a given inefficiency budget.
\subsection{Computing Inefficiency}
Once the system specifies an inefficiency budget, the energy management
algorithms must tune the system to stay within the inefficiency
budget while delivering the best performance.
%
To compute inefficiency, we need
both the energy ($E$) consumed by the application and the minimum energy
($E_{min}$) that application could have consumed.
%
Computing $E$ is straight
forward; Intel Sandy bridge architecture ~\cite{sandy-bridge-sw-manual} already
provides counters capable of measuring energy consumption at
runtime and the research community has
tools and models to estimate the absolute energy of applications~\cite{brooks2000wattch,drampower-tool,li2009mcpat,micronpowercalc-lpddr3-url,wilton1996cacti}.
Computing $E_{min}$ is challenging due to inter-component
dependencies.
%
We propose two methods for computing $E_{min}$:
%
\begin{itemize}
%
\item \textbf{Brute force search:} $E_{min}$ can be estimated using the
power models (or tools) for a given workload at all possible system
settings.
%
The minimum of all these estimations is $E_{min}$.
%
While the overhead of this approach is high, it could be improved with a
lookup table.
\item \textbf{Predicting and learning:} The overhead of the $E_{min}$ computation
can be further reduced by predicting $E_{min}$ based on previous observations
and learning continuously.
%
A variety of learning based approaches~\cite{li2009machine} have
been proposed in the past to estimate various metrics and application phases
which can be applied to $E_{min}$ estimation as well.
\end{itemize}
We are working towards designing efficient energy prediction models for CPU and
memory.
%
Our models consider cross-component interactions on performance and energy
consumption.
%
%%%%%%%% MODEL %%%%%%%%%%
%We designed efficient models to predict performance and energy consumption of
%CPU and memory at various voltage and frequency settings for a given
%application. We plan on using these models to estimate $E_{min}$ of a given set
%of instructions.
%%We envision a system capable of scaling voltage and frequency of CPU and only
%%frequency of DRAM.
%Our models consider cross-component interactions on performance and energy.
%The performance model uses hardware performance counters to measure amount of time
%each component is $Busy$ completing the work, $Idle$ stalled on the other
%component and $Waiting$ for more work. We designed systematic methodology to
%scale these states to estimate execution time of a given workload at different
%voltage and frequency settings. In our model, the $Idle$ time of one component
%depends on the settings of the second component. The $Busy$ time of each
%component scales with it's own frequency. However, part of the $Busy$ time that
%overlaps with the other component is constrained by the slowest component.
%
%We combine predicted performance with the power models of CPU and memory
%described in Section~\ref{subsec-energy-models} to estimate energy consumption
%of CPU and memory. Our model has average prediction error of 4\% across SPEC
%CPU2006 benchmarks with highest error of 10\% except for $gobmk (18\%)$ and $lbm
%(24\%)$.
%%%%% END OF MODEL %%%%%%
In this work we demonstrate how to use inefficiency and defer both predicting and
optimizing $E_{min}$ to future work.
\subsection{Managing Inefficiency}
%
Future energy management algorithms need to tune system settings to keep the
system within the specified inefficiency budget and deliver the best performance.
%
Techniques that use predictors such as instructions-per-cycle (IPC) to decide
when to use DVFS or migrate threads can be extended to operate under given
inefficiency
budget~\cite{isca09-abhishek,ieeemicro05-isci,micro06-isci,isca09-thread-motion}.
%
Efforts that have tried to optimize memory energy consumption can be adapted
to use inefficiency as a constraint to their
system~\cite{david2011memory,deng2012multiscale,deng2011memscale,diniz2007limiting,lebeck2000power,malladi2012towards,sudan2010micro,zheng2009decoupled}.
%
While most of the existing multi-component energy management approaches work
under performance constraints, some have the potential to be modified to work
under energy constraints and thus could operate under
inefficiency budget~\cite{bitirgen2008coordinated,deng2012coscale,chen2011coordinating,fan2005synergy,felter2005performance,li2007cross,raghavendra2008no}.
%
We leave incorporating some of these algorithms into a system as future work.
%
In this paper, we characterize the optimal performance point under different
inefficiency constraints and illustrate that the stability of these points
has implications for future algorithms.