Commit 5ae0e5d5998b8b27833e594d0a83615a3a99552f

Authored by Rizwana Begum
1 parent 0e5dc2a4

more edits

abstract.tex
@@ -2,13 +2,13 @@
 
 Battery lifetime continues to be a top complaint about smartphones. Dynamic
 voltage and frequency scaling (DVFS) has existed for mobile device CPUs for some
-time, and provides a tradeoff between energy and performance. DVFS is beginning
-to be applied to memory as well to make more energy-performance tradeoffs
-possible.
+time, and provides a tradeoff between energy and performance. Dynamic frequency
+scaling is beginning to be applied to memory as well to make more
+energy-performance tradeoffs possible.
 
-We present the first characterization of the behavior and optimal frequency
-settings of workloads running both under \textit{energy constraints} and on
-systems with \textit{both} CPU and memory DVFS, an environment representative
+We present the first characterization of the behavior of the optimal frequency
+settings of workloads running both under \textit{energy constraints} and on
+systems capable of CPU DVFS and memory DFS, an environment representative
 of next-generation mobile devices. Our results show that continuously using
 the optimal frequency settings results in a large number of frequency
 transitions which end up hurting performance. However, by permitting a small
inefficiency.tex
@@ -7,7 +7,7 @@ management algorithms for mobile systems should optimize performance under
 \textit{energy constraints}.
 %
 While several researchers have proposed algorithms that work under energy
-constraints, these approaches require that the constraints are expressed in
+constraints, these approaches require that the constraints be expressed in
 terms of absolute energy~\cite{mobiheld09-cinder,ecosystem}.
 %
 For example, rate-limiting approaches take the maximum energy that can be
@@ -24,7 +24,7 @@ Energy consumption varies across applications, devices, and operating
 conditions, making it impractical to choose an absolute energy budget.
 %
 Also, applying absolute energy constraints may slow down applications to the
-point that total energy consumption \textit{increases} and
+point where total energy consumption \textit{increases} and
 performance is degraded.
 
 Other metrics that incorporate energy take the form of $Energy * Delay^n$.
@@ -34,7 +34,7 @@ We argue that while the energy-delay product can be used as a
 \textit{constraint} to specify how much energy can be used to improve
 performance.
 %
-A effective constraint should be (1) relative to the applications inherent
+An effective constraint should be (1) relative to the application's inherent
 energy needs and (2) independent of applications and devices.
 %
 Because it uses absolute energy, the energy-delay product meets neither of
@@ -57,7 +57,7 @@ inefficiency: $I = \frac{E}{E_{min}}$.
 %
 An \textit{inefficiency} of $1$ represents an application's most efficient
 execution, while $1.5$ indicates that the application consumed $50\%$ more
-energy that its most efficient execution.
+energy than its most efficient execution.
 %
 Inefficiency is independent of workloads and devices and avoids the problems
 inherent to absolute energy constraints.
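The inefficiency metric defined in this hunk, $I = \frac{E}{E_{min}}$, can be sketched with a few lines of code; the energy values below are invented for illustration only, since only the ratio matters:

```python
# Hedged sketch of the paper's inefficiency metric I = E / E_min.
# The joule figures are made-up examples; the metric itself is a
# ratio, which is what makes it application- and device-independent.

def inefficiency(energy: float, e_min: float) -> float:
    """Ratio of consumed energy to the most efficient execution's energy."""
    return energy / e_min

def within_budget(energy: float, e_min: float, budget: float) -> bool:
    """Check an execution against an inefficiency budget, e.g. 1.3."""
    return inefficiency(energy, e_min) <= budget

e_min = 2.0                               # energy of the most efficient setting
assert inefficiency(3.0, e_min) == 1.5    # 50% more energy than optimal
assert within_budget(2.4, e_min, 1.3)     # I = 1.2, inside the budget
assert not within_budget(3.0, e_min, 1.3)
```

An inefficiency budget of 1.3 thus reads as "spend at most 30% extra energy over the most efficient execution to buy performance".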
@@ -143,7 +143,7 @@ We propose two methods for computing $E_{min}$:
 
 \item \textbf{Predicting and learning:} The overhead of the $E_{min}$ computation
 can be further reduced by predicting $E_{min}$ based on previous observations
-and learning continuously.
+and by continuous learning.
 %
 A variety of learning based approaches~\cite{li2009machine} have
 been proposed in the past to estimate various metrics and application phases
inefficiency_speedup.tex
@@ -7,7 +7,7 @@ the past
 %researchers have used it
 to make power performance trade-offs. To the best of our knowledge, prior
 work has not studied the system level energy-performance trade-offs of combined
-CPU and memory DVFS.
+CPU and memory frequency scaling.
 %considering the interaction between CPU and memory
 %frequency scaling.
 We take a first step and explore these trade-offs and show that incorrect
@@ -71,8 +71,8 @@ inefficiency budget as needed c) and deliver the best performance.
 %\end{enumerate}
 
 Consequently, like other constraints used by algorithms such as performance, power and absolute energy, inefficiency
-also allows energy management algorithms to waste system energy. We suggest
-that, even though inefficiency doesn't completely eliminate the problem of
+also allows energy management algorithms to waste system energy. We argue
+that, although inefficiency doesn't completely eliminate the problem of
 wasting energy, it mitigates the problem. For example, rate limiting approaches
 waste energy as energy budget is specified for a given amount of time interval
 and doesn't require a specific amount of work to be done within that budget.
introduction.tex
@@ -30,14 +30,14 @@ To better understand these systems, we characterize how the most performant
 CPU and memory frequency settings change for multiple workloads under various
 energy constraints.
 
-Our work represents two advances over previous efforts.
+Our work presents two advances over previous efforts.
 %
 First, while previous works have explored energy minimizations using DVFS
 under performance constraints focusing on reducing slack, we are the first to
 study the potential DVFS settings under an energy constraint.
 %
 Specifying performance constraints for servers is appropriate, since they are
-both wall-powered and have terms of service that must be met.
+both wall-powered and have quality of service constraints that must be met.
 %
 Therefore, they do not have to and cannot afford to sacrifice too much
 performance.
@@ -53,7 +53,7 @@ energy constraints and it is both application and device independent---unlike
 existing metrics.
 
 Second, we are the first to characterize optimal frequency settings for
-systems providing both CPU and memory DVFS.
+systems providing CPU DVFS and memory DFS.
 %
 We find that closely tracking the optimal settings during execution produces
 many transitions and large frequency transition overhead.
@@ -65,7 +65,7 @@ We characterize the relationship between the amount of performance loss and
 the rate of tuning for several benchmarks, and introduce the concepts of
 \textit{performance clusters} and \textit{stable regions} to aid the process.
 
-We make following four contributions:
+We make the following contributions:
 %
 \begin{enumerate}
 %
@@ -74,7 +74,7 @@ system to express the amount of extra energy that can be used to improve
 performance.
 %
 \item We study the energy-performance trade-offs of systems that are capable
-of both CPU and memory DVFS for multiple applications. We show that poor
+of CPU DVFS and memory DFS for multiple applications. We show that poor
 frequency selection can hurt both performance and energy consumption.
 %
 \item We characterize the optimal frequency settings for multiple
@@ -87,7 +87,7 @@ management algorithms.
 %
 \end{enumerate}
 
-We use the \texttt{gem5} simulator, the Android smartphone platform and Linux
+We use the \texttt{Gem5} simulator, the Android smartphone platform and Linux
 kernel, and an empirical power model to (1) measure the inefficiency of
 several applications for a wide range of frequency settings, (2) compute
 performance clusters, and (3) study how performance clusters evolve.
@@ -112,4 +112,4 @@ studies their characteristics.
 %
 Section~\ref{sec-algo-implications} presents implications of
 using performance clusters on energy-management algorithms, and
-Section~\ref{sec-conclusions} concludes.
+Section~\ref{sec-conclusions} summarizes and concludes the paper.
optimal_performance.tex
@@ -5,7 +5,7 @@
 \centering
 \includegraphics[width=\columnwidth]{figures/plots/496/2d_best_point_variation_mulineff/gobmk_2d_stable_point_mulineff_cpi_mpki.pdf}
 \vspace{-0.5em}
-\caption{\textbf{Optimal Performance Point for \text{Gobmk} Across Inefficiencies:} At
+\caption{\textbf{Optimal Performance Point for \textit{gobmk} Across Inefficiencies:} At
 low inefficiency budgets, the optimal frequency settings follow CPI of the
 application, and select high memory frequencies for memory intensive phases. % with
 %high CPI.
@@ -36,7 +36,7 @@ inefficiency budget is a function of workload.}
 \end{subfigure}%
 \vspace{0.5em}
 \caption{\textbf{Performance Clusters of \textit{milc.}}
-\textit{Milc} is CPU intensive to a large extent with some memory intensive
+\textit{milc} is CPU intensive to a large extent with some memory intensive
 phases. At higher thresholds, while CPU frequency is tightly bound, performance
 clusters cover a wide range of memory settings due to small performance
 difference across these frequencies. }
@@ -61,7 +61,7 @@ simulation noise, the algorithm selects the settings with highest CPU (first)
 and then memory frequency as this setting is bound to have highest performance among
 the other possibilities.
 
-Figure~\ref{gobmk-optimal} plots the optimal settings for $gobmk$ for all
+Figure~\ref{gobmk-optimal} plots the optimal settings for \textit{gobmk} for all
 benchmark samples (each of length 10~M instructions) across multiple
 inefficiency constraints. At low inefficiencies, the optimal settings follow
 the trends in CPI (cycles per instruction) and MPKI (misses per thousand
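The selection rule this hunk describes, picking the best-performing setting whose inefficiency fits the budget and breaking ties toward the highest CPU and then memory frequency, can be sketched as follows; the tuple layout and sample numbers are illustrative, not the paper's data:

```python
# Hedged sketch of per-sample optimal-setting selection under an
# inefficiency budget. Each candidate is a hypothetical tuple of
# (cpu_mhz, mem_mhz, performance, inefficiency).

def pick_optimal(settings, budget):
    """Best-performing feasible setting; ties broken by CPU, then memory freq."""
    feasible = [s for s in settings if s[3] <= budget]
    # max over (performance, cpu_mhz, mem_mhz) implements the tie-breaking
    # order: performance first, then highest CPU, then highest memory clock.
    return max(feasible, key=lambda s: (s[2], s[0], s[1]))

samples = [
    (1000, 400, 0.90, 1.10),
    (1500, 400, 1.00, 1.25),  # best performance within a 1.3 budget
    (1500, 800, 1.00, 1.45),  # equal performance but over budget
    (2000, 800, 1.20, 1.60),  # fastest overall, but over budget
]
assert pick_optimal(samples, budget=1.3) == (1500, 400, 1.00, 1.25)
assert pick_optimal(samples, budget=2.0) == (2000, 800, 1.20, 1.60)
```

Raising the budget admits the faster, less efficient settings, which is exactly the energy-for-performance trade the inefficiency constraint expresses.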
performance_clusters.tex
@@ -4,7 +4,7 @@
 \centering
 \includegraphics[width=\columnwidth]{./figures/plots/496/stable_line_plots/lbm_stable_lineplot_annotated_5.pdf}
 \vspace{-0.5em}
-\caption{\textbf{Stable Regions and Transitions for \textit{Lbm} with
+\caption{\textbf{Stable Regions and Transitions for \textit{lbm} with
 Threshold of 5\% and Inefficiency Budget of 1.3:} Solid lines represent the
 stable regions and vertical dashed lines mark the transitions made by
 \textit{lbm}.}
system_methodology.tex
@@ -25,7 +25,7 @@ performance for energy savings.
 %voltage could result in data corruption since the memory array itself is
 %asynchronous.
 As no current hardware systems support memory frequency scaling,
-we resort to Gem5~\cite{Binkert:gem5}, a cycle-accurate full system simulator
+we resort to \texttt{Gem5}~\cite{Binkert:gem5}, a cycle-accurate full system simulator
 %as a platform
 to perform our studies.
 
@@ -34,21 +34,21 @@ to perform our studies.
 \centering
 \includegraphics[width=0.75\columnwidth]{./figures/plots/systemBlockDiagram.pdf}
 \caption{\textbf{System Block Diagram}: Blocks that are newly added or
-  significantly modified from Gem5 origin implementation are shaded.}
+  significantly modified from \texttt{Gem5}'s original implementation are shaded.}
 \label{fig-system-block-diag}
 \end{figure}
 
 %We envision a system that consists of a CPU capable of tuning its voltage and
 %frequency and memory that supports frequency scaling.
-Current Gem5 versions provide the infrastructure necessary to change CPU
-frequency and voltage; we extended Gem5 DVFS to incorporate memory frequency
-scaling. As shown in Figure~\ref{fig-system-block-diag}, Gem5 provides a DVFS
+Current \texttt{Gem5} versions provide the infrastructure necessary to change CPU
+frequency and voltage; we extended \texttt{Gem5} DVFS to incorporate memory frequency
+scaling. As shown in Figure~\ref{fig-system-block-diag}, \texttt{Gem5} provides a DVFS
 controller device that provides an interface to control frequency by the OS at
 runtime. We developed a memory frequency governor similar to existing Linux CPU
 frequency governors. Timing and current parameters of DRAM are scaled with its
 frequency as described in the technical note from Micron~\cite{micronpower-TN-url}.
 %that are capable of tuning memory frequency at runtime.
-The blocks that we added or significantly modified from Gem5's original
+The blocks that we added or significantly modified from \texttt{Gem5}'s original
 implementation are shaded in Figure~\ref{fig-system-block-diag}.
 
 \begin{figure*}[t]
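One plausible reading of "timing parameters of DRAM are scaled with its frequency" is that fixed-nanosecond analog constraints are recomputed as clock cycles at each memory frequency; the sketch below illustrates that idea with generic DDR3-style parameter values that are assumptions for the example, not numbers from the Micron note or the paper:

```python
# Hedged sketch: rescaling DRAM timing constraints when the memory
# clock changes. Constraints such as tRCD are fixed in nanoseconds,
# so the controller's cycle counts must be rounded up to whole cycles
# at the new clock. TIMINGS_NS values are illustrative assumptions.
import math

TIMINGS_NS = {"tRCD": 13.75, "tRP": 13.75, "tRAS": 35.0}

def timing_cycles(freq_mhz: float) -> dict:
    """Round each fixed-time constraint up to whole clock cycles."""
    period_ns = 1000.0 / freq_mhz
    return {name: math.ceil(t / period_ns) for name, t in TIMINGS_NS.items()}

assert timing_cycles(800)["tRCD"] == 11  # 13.75 ns at 1.25 ns/cycle
assert timing_cycles(400)["tRCD"] == 6   # 13.75 ns at 2.5 ns/cycle
```

Because the nanosecond constraints stay fixed, halving the memory clock roughly halves the cycle counts, so latency-bound accesses are not slowed by the full clock ratio.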
@@ -75,15 +75,15 @@ and degrade performance simultaneously.}
 
 \subsection{Energy Models}
 \label{subsec-energy-models}
-We developed energy models for the CPU and DRAM for our studies. Gem5 comes
+We developed energy models for the CPU and DRAM for our studies. \texttt{Gem5} comes
 with the energy models for various DRAM chipsets. The
-DRAMPower~\cite{drampower-tool} model is integrated into Gem5 and computes the
+DRAMPower~\cite{drampower-tool} model is integrated into \texttt{Gem5} and computes the
 memory energy consumption periodically during the benchmark execution. However,
-Gem5 lacks a model for CPU energy consumption. We developed a processor power
+\texttt{Gem5} lacks a model for CPU energy consumption. We developed a processor power
 model based on empirical measurements of a PandaBoard~\cite{pandaboard-url}
 evaluation board. The board includes an OMAP4430~chipset with a Cortex~A9
 processor; this chipset is used in the mobile platform we want to emulate, the
-Samsung Nexus S. We ran microbenchmarks designed to stress the PandaBoard to
+Galaxy Nexus S. We ran microbenchmarks designed to stress the PandaBoard to
 its full utilization and measured power consumed using an Agilent~34411A
 multimeter. Because of the limitations of the platform, we could only measure
 peak dynamic power. Therefore, to model different voltage levels we scaled it
@@ -97,7 +97,7 @@ processor is not computing, but unlike leakage power, background power scales
 with clock frequency. We measure background power by calculating the
 difference between the CPU power consumption in its power on idle state and
 deep sleep mode (not clocked). Because background power is clocked, it is
-scaled in a similar manner to dynamic power.  Leakage power comprises up to
+scaled in a similar manner to dynamic power. Leakage power comprises up to
 30\% of microprocessor peak power consumption~\cite{power7} and is linearly
 proportional to supply voltage~\cite{leakage-islped02}.
 
@@ -109,8 +109,8 @@ proportional to supply voltage~\cite{leakage-islped02}.
 
 \subsection{Experimental Methodology}
 Our simulation infrastructure is based on Android~4.1.1 ``Jelly Bean'' run on
-the Gem5 full system simulator. We use default core configuration provided by
-Gem5 in revision 10585, that is designed to reflect ARM Cortex-A15 processor
+the \texttt{Gem5} full system simulator. We use the default core configuration provided by
+\texttt{Gem5} in revision 10585, which is designed to reflect an ARM Cortex-A15 processor
 with L1 cache size of 64~KB with access latency of 2 core cycles and a unified
 L2 cache of size 2~MB with hit latency of 12 core cycles. The CPU and caches
 operate under the same clock domain. For our purposes, we have configured the
@@ -147,12 +147,12 @@ benchmarks that have interesting and unique phases.
 %hours.
 
 We collected samples of a fixed amount of work so that each sample would
-represent the same work even across different frequencies. In Gem5, we collected
+represent the same work even across different frequencies. In \texttt{Gem5}, we collected
 performance and energy consumption data every 10~million user mode
 instructions.
 %this fixed sample of work makes .
 %By collecting data for a fixed amount of work (instructions) we are able to study frequency scaling for workloads; the alternative sampling in time .
-Gem5 provides a mechanism to distinguish between user mode and
+\texttt{Gem5} provides a mechanism to distinguish between user mode and
 kernel mode instructions. We used this feature to remove periodic OS traffic and enable a fair comparison
 across simulations of different CPU and memory frequencies. We used the collected
 performance and energy data to study the impact of workload dynamics on the
@@ -162,7 +162,7 @@ a given inefficiency budget. Note that, all our studies are performed using
 performance or energy. The interplay of performance and energy consumption of
 CPU and memory frequency scaling is complex as pointed out by
 CoScale~\cite{deng2012coscale}. In the next Section, we measure and characterize
-the larger space of all system level performance and energy trade-offs
+the larger space of system level performance and energy trade-offs
 of various CPU and memory frequency settings.
 
 %Although individual energy-performance trade-offs of DVFS for CPU and