Commit 787d6a3551c2fdc6e865c54ffce9c0b9368ebad6

Authored by Anudipa Maiti
1 parent 03df0645

made some small changes

conclusion.tex
... ... @@ -8,7 +8,7 @@ measure by describing the multiple ways in which it would aid in the
8 8 management of energy and other resources on battery-powered smartphones.
9 9 Using an energy consumption dataset collected on \PhoneLab{} we have explored
10 10 separately several potential inputs to a value measure and determined how
11   -they weight energy consumption. And finally, we have presented results from a
  11 +they weight energy consumption. Finally, we have presented results from a
12 12 failed effort to formulate an effective value measure. While this first
13 13 attempt was unsuccessful, we hope to engage the mobile systems community in
14 14 this effort so that more sophisticated and successful value measures can be
... ...
conclusion.tex~ 0 → 100644
  1 +\section{Conclusions}
  2 +\label{sec-conclusion}
  3 +
  4 +To conclude, we have argued that our inability to estimate app value is a
  5 +critical weakness that is threatening our successes at accurately estimating
  6 +and attributing energy consumption. We have motivated the need for a value
  7 +measure by describing the multiple ways in which it would aid in the
  8 +management of energy and other resources on battery-powered smartphones.
  9 +Using an energy consumption dataset collected on \PhoneLab{} we have explored
  10 +separately several potential inputs to a value measure and determined how
  11 +they weight energy consumption. And finally, we have presented results from a
  12 +failed effort to formulate an effective value measure. While this first
  13 +attempt was unsuccessful, we hope to engage the mobile systems community in
  14 +this effort so that more sophisticated and successful value measures can be
  15 +developed.
  16 +
  17 +\section*{Acknowledgments}
  18 +
  19 +Students and faculty working on estimating app value are supported by NSF
  20 +awards
  21 +\href{http://www.nsf.gov/awardsearch/showAward.do?AwardNumber=1205656}{1205656}
  22 +and
  23 +\href{http://www.nsf.gov/awardsearch/showAward.do?AwardNumber=1423215}{1423215}.
  24 +The authors thank the anonymous reviewers for their feedback.
... ...
introduction.tex
... ... @@ -3,7 +3,7 @@
3 3 Measuring app energy consumption\footnote{\small To avoid confusion between
4 4 app and energy usage, we use \textit{consumption} exclusively when referring
5 5 to energy usage and \textit{usage} exclusively when referring to user
6   -interaction with apps.} on mobile devices is nearly a solved problem, due to
  6 +interaction with apps.} on mobile devices is nearly a solved problem. This is due to
7 7 great strides made in both generating and validating energy models that
8 8 deliver accurate runtime energy consumption
9 9 estimates~\cite{mansdi,vedge-nsdi13,pathak2011,pathak2012,yoon} and in
... ... @@ -23,7 +23,7 @@ including:
23 23  
24 24 \item Will this change to an app make it more energy efficient?
25 25  
26   -\item Is a particular app an energy virus?
  26 +\item Is a particular app an \textit{energy virus}?
27 27  
28 28 \item How should the limited energy resources on a given app be prioritized?
29 29  
... ... @@ -41,8 +41,8 @@ of apps in order to evaluate two video conferencing tools, web browsers, or
41 41 email clients. Developers can determine whether a new feature delivers value
42 42 more or less efficiently than the rest of their app and better understand the
43 43 differences in energy consumption across different users. Measuring value
44   -allows a rigorous definition of an \textit{energy virus} as an app that
45   -delivers little or no value per joule, and for systems to reward efficient
  44 +allows a rigorous definition of an \textit{energy virus} as \textit{an app that
  45 +delivers little or no value per joule}, and for systems to reward efficient
46 46 apps by prioritizing limited resources based on app value or energy
47 47 efficiency. After all the progress we have made in computing the
48 48 denominator---energy consumption---we believe that the search for the missing
... ... @@ -56,7 +56,7 @@ different usage patterns. It must be efficient to compute, since it should
56 56 not compete for the same limited energy resources that it is intended to help
57 57 manage. Ideally it should require little to no user input, since this will
58 58 make it burdensome and error-prone. And to make matters worse, there is no
59   -obvious way to measure ground truth to compare against---even in the lab.
  59 +obvious way to measure ground truth to compare against---even in a lab.
60 60 Despite all these challenges, however, even a semi-accurate value measure
61 61 would greatly benefit energy management on battery-constrained smartphones.
62 62 With users continuing to report battery lifetime as their top concern with
... ...
results.tex
... ... @@ -28,7 +28,7 @@ component (the Fuel Gauge) lacks. Changes were distributed to \PhoneLab{}
28 28 participants in November, 2013, via an over-the-air (OTA) platform update.
29 29 The resulting 2~month dataset of 67~GB of compressed log files represents
30 30 \num{6806} user days during which \num{1328}~apps were started \num{277785}
31   -times and used for a total of \num{15224} hours of active use by
  31 +times, and used for a total of \num{15224} hours of active use by
32 32 107~\PhoneLab{} participants.
33 33  
34 34 Our analysis begins by investigating several components of a possible value
... ... @@ -37,7 +37,8 @@ consumed by each app. Next, we formulate a simple measure of content
37 37 delivery by measuring usage of the screen and audio output devices and test
38 38 it through a survey completed by 47~experiment participants. Unfortunately,
39 39 our results are inconclusive and open to several possible interpretations
40   -which we discuss.
  40 +which we discuss. We present our results in tabular format, where for each measure we
  41 +rank the 10 best-performing and 10 worst-performing apps in descending order.
41 42  
42 43 \newpage
43 44  
... ... @@ -152,6 +153,6 @@ interpreted as a sign that we need a more sophisticated value measure
152 153 incorporating more of the potential inputs we have previously discussed.
153 154 However, on one level the results are very encouraging: most users were
154 155 willing to consider removing one or more apps if that app would improve their
155   -battery lifetime. Clearly users are making this decision based on some idea
  156 +battery lifetime. Clearly, users are making this decision based on some idea
156 157 of each app's value---the challenge is to replicate their choices using the
157 158 information we have available to us.
... ...
results.tex~ 0 → 100644
  1 +\section{Results}
  2 +\label{sec-results}
  3 +
  4 +To examine the potential components of a value measure further, we utilize a
  5 +large dataset of energy consumption measurements collected by an IRB-approved
  6 +experiment run on the \PhoneLab{} testbed. \PhoneLab{} is a public smartphone
  7 +platform testbed located at the University at
  8 +Buffalo~\cite{phonelab-sensemine13}. 220~students, faculty, and staff carry
  9 +instrumented Android Nexus~5 smartphones and receive subsidized service in
  10 +return for willingness to participate in experiments. \PhoneLab{} provides
  11 +access to a representative group of participants balanced between genders and
  12 +across a wide variety of age brackets, making our results more broadly
  13 +applicable.
  14 +
  15 +Understanding fine-grained energy consumption dynamics required more
  16 +information than Android normally exposes to apps. In addition, to explore
  17 +components of our value measure we also wanted to capture information about
  18 +app usage---including foreground and background time and use of the display
  19 +and audio interface---that was not possible to measure on unmodified Android
  20 +devices. So to collect our dataset we took advantage of \PhoneLab{}'s ability
  21 +to modify the Android platform itself. We instrumented the
  22 +\texttt{SurfaceFlinger} and \texttt{AudioFlinger} components in the Android platform
  23 +to record usage of the screen and audio, and altered the ActivityManagerService
  24 +package to record energy consumption at each app transition. This allows energy
  25 +consumption by components such as the screen to be accurately attributed to
  26 +the foreground app, a feature that Android's internal battery monitoring
  27 +component (the Fuel Gauge) lacks. Changes were distributed to \PhoneLab{}
  28 +participants in November, 2013, via an over-the-air (OTA) platform update.
  29 +The resulting 2~month dataset of 67~GB of compressed log files represents
  30 +\num{6806} user days during which \num{1328}~apps were started \num{277785}
  31 +times, and used for a total of \num{15224} hours of active use by
  32 +107~\PhoneLab{} participants.
  33 +
  34 +Our analysis begins by investigating several components of a possible value
  35 +measure and shows the effect of using each to weight the overall energy
  36 +consumed by each app. Next, we formulate a simple measure of content
  37 +delivery by measuring usage of the screen and audio output devices and test
  38 +it through a survey completed by 47~experiment participants. Unfortunately,
  39 +our results are inconclusive and open to several possible interpretations
  40 +which we discuss.
  41 +
  42 +\newpage
  43 +
  44 +\subsection{Total Energy}
  45 +
  46 +\input{./figures/tables/tableALL.tex}
  47 +
  48 +Clearly, ranking apps by total energy consumption computed by adding all
  49 +foreground and background energy consumptions over the entire study says
  50 +much more about app popularity than it does about anything else.
  51 +Table~\ref{table-total} shows the top and bottom energy-consuming apps over
  52 +the entire study. As expected, popular apps such as the Android Browser,
  53 +Facebook, and the Android Phone component consume the most energy, while the
  54 +list of low consumers is dominated by apps with few installs. This table does
  55 +serve, however, to identify the popular apps in use by \PhoneLab{}
  56 +participants, and as a point of comparison for the remainder of our results.
  57 +
  58 +\subsection{Power}
  59 +
  60 +Computing each app's power consumption by dividing its total energy consumption
  61 +by the total time it was running, either in the background or
  62 +foreground, reveals more information, as shown in Table~\ref{table-rate}. Our
  63 +results identify Facebook Messenger, Google+, and the Super-Bright LED
  64 +Flashlight as apps that rapidly consume energy, while the Bank of America and
  65 +Weather Channel apps consume energy slowly. Differences between apps in
  66 +similar categories may begin to identify apps with problematic energy
  67 +consumption, such as contrasting the high power consumption of Facebook Messenger
  68 +with other messaging clients such as WhatsApp, Twitter, and Android
  69 +Messaging.
  70 +
  71 +\subsection{Foreground Energy Efficiency}
  72 +
  73 +Isolating the foreground component of execution time provides a better
  74 +measure of value, since it ignores the time that users spend ignoring apps.
  75 +Table~\ref{table-foreground} shows a measure of energy efficiency computed by
  76 +%utilizing foreground time alone as our value measure.
  77 +dividing total foreground energy consumption by total foreground time of an
  78 +app. Some surprising changes
  79 +from the power results can be seen. A number of apps have remained in their former
  80 +categories: Bank of America, which was identified as a low-power app, is also
  81 +a highly efficient app when using foreground time as the value measure; and
  82 +Facebook Messenger, which was identified as a high-power app, is also marked
  83 +as inefficient. Other apps, however, have switched categories. ESPN
  84 +Sportscenter and Yahoo Mail do not consume much power, but also don't spend
  85 +much time in the foreground; interestingly, none of the high-power apps
  86 +looked better when their foreground usage was considered.
  87 +
  88 +\subsection{Content Energy Efficiency}
  89 +
  90 +Finally, we use the data we collected by instrumenting the
  91 +\texttt{SurfaceFlinger} and \texttt{AudioFlinger} components to compute a
  92 +simple measure of content delivery. We measure the audio and video frame
  93 +rates and combine them into a single measure by using bit-rates corresponding
  94 +to a 30~fps YouTube-encoded video and 128~kbps two-channel audio, with the
  95 +weights representing the fact that a single frame of video contains much more
  96 +content than a single sample of audio. We use this combined metric as the
  97 +value measure and again use it to weight the energy consumption of each app,
  98 +with the results shown in Table~\ref{table-content}.
  99 +
  100 +Comparing with the foreground energy efficiency again shows several
  101 +interesting changes. Yahoo Mail, which foreground energy efficiency marked as
  102 +inefficient, looks more efficient when content delivery is considered. While
  103 +it is possible that one \PhoneLab{} participant uses it to read email very
  104 +quickly, it may be more likely that it uses a ``spinner'' or other fancy UI
  105 +elements that generate artificially high frame rates without delivering much
  106 +information. The inability to distinguish between meaningless and meaningful
  107 +video frame content is a significant weakness of this simple approach.
  108 +YouTube and Candy Crush Saga both earn high marks, which is encouraging given
  109 +that they are very different apps but also might be a result of overweighting
  110 +screen refreshes. The Android Clock is also an unsurprising result, as it
  111 +requires almost no energy to generate a relatively large number of screen
  112 +redraws in timer and stopwatch mode.
  113 +
  114 +\subsection{Survey Results and Discussion}
  115 +
  116 +\begin{figure*}[t]
  117 +\centering
  118 +\includegraphics[width=\textwidth]{./figures/survey.pdf}
  119 +
  120 +\caption{\small \textbf{Survey Results.} The height of each bar demonstrates how
  121 +many of the suggested apps the user is willing to remove for better battery
  122 +life, with suggestions based on overall usage or our new content-delivery
  123 +efficiency measure. Our new measure does not convincingly outperform the
  124 +straw man.}
  125 +
  126 +\label{fig-survey}
  127 +
  128 +\end{figure*}
  129 +
  130 +To continue the evaluation of our simple content-based value measure, we
  131 +prepared a survey for the 107~\PhoneLab{} participants who contributed data
  132 +to our experiment. Our goal was to determine if users would be more willing
  133 +to remove inefficient apps, as defined using our content-based measure. As a
  134 +baseline, we also asked users about the apps that consumed the most energy.
  135 +We used each participant's data to generate a custom survey containing
  136 +questions about 9 apps: the 3 least efficient apps as computed by our
  137 +content-based value measure, the 3 apps that used the most energy on their
  138 +smartphone during the experiment, and 3 apps chosen at random. For each we
  139 +asked them a simple question: ``If it would improve your battery life, would
  140 +you uninstall or stop using this app?'' To compute an aggregate score for
  141 +both the content-based and usage-based measures, we give each measure 1~point
  142 +for a ``Yes'', 0.5~points for a ``Maybe'' and 0~points for a ``No''.
  143 +47~participants completed the survey, and the results are shown in
  144 +Figure~\ref{fig-survey}. For each user, if the score of one measure is higher
  145 +than the other, it is considered a ``win'' for the former.
  146 +
  147 +Overall the results are inconclusive, with the content-delivery measure not
  148 +clearly outperforming the straw-man usage measure at predicting which apps
  149 +each user would be willing to remove to save battery life. Given the crude
  150 +nature of our metric, this is not particularly surprising, and can be
  151 +interpreted as a sign that we need a more sophisticated value measure
  152 +incorporating more of the potential inputs we have previously discussed.
  153 +However, on one level the results are very encouraging: most users were
  154 +willing to consider removing one or more apps if that app would improve their
  155 +battery lifetime. Clearly, users are making this decision based on some idea
  156 +of each app's value---the challenge is to replicate their choices using the
  157 +information we have available to us.
... ...
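
The survey scoring described in the Results text above (1 point for a "Yes", 0.5 points for a "Maybe", 0 points for a "No", with a per-user "win" going to the measure with the higher score) can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' analysis code: the data layout, the names `POINTS`, `measure_score`, and `tally_wins`, and the treatment of equal scores as ties are all hypothetical, since the text does not say how ties were handled.

```python
from collections import Counter

# Point values taken from the survey description in the text.
POINTS = {"Yes": 1.0, "Maybe": 0.5, "No": 0.0}


def measure_score(answers):
    """Sum the points for one user's answers about one measure's suggested apps."""
    return sum(POINTS[answer] for answer in answers)


def tally_wins(responses):
    """Count per-user wins for the content-based and usage-based measures.

    `responses` maps a user id to a dict with the keys "content" and "usage",
    each holding that user's answers for the apps suggested by that measure.
    This layout is an assumption made for illustration.
    """
    wins = Counter()
    for answers_by_measure in responses.values():
        content = measure_score(answers_by_measure["content"])
        usage = measure_score(answers_by_measure["usage"])
        if content > usage:
            wins["content"] += 1
        elif usage > content:
            wins["usage"] += 1
        else:
            # Assumption: equal scores count as a tie for neither measure.
            wins["tie"] += 1
    return wins
```

Calling `tally_wins` on a dictionary of per-user answers returns a `Counter` with the number of wins for each measure, which corresponds to the per-user comparison the survey figure summarizes.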
usage.tex
... ... @@ -143,7 +143,7 @@ same approach can also be applied to determine how much of any limited system
143 143 resource to allocate to each app,
144 144 %with high-value apps gaining priority over
145 145 %the processor, memory allocation, networking bandwidth and limited storage.
146   -Together these resources allocation measures can be designed to ensure that
  146 +Together these resource allocation measures can be designed to ensure that
147 147 high-value apps run smoothly at the expense of lower-value apps.
148 148  
149 149 \subsection{Summary of Requirements}
... ...