From 787d6a3551c2fdc6e865c54ffce9c0b9368ebad6 Mon Sep 17 00:00:00 2001 From: anudipa Date: Sat, 27 Dec 2014 13:19:10 -0500 Subject: [PATCH] made some small changes --- conclusion.tex | 2 +- conclusion.tex~ | 24 ++++++++++++++++++++++++ introduction.tex | 10 +++++----- results.tex | 7 ++++--- results.tex~ | 157 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ usage.tex | 2 +- 6 files changed, 192 insertions(+), 10 deletions(-) create mode 100644 conclusion.tex~ create mode 100644 results.tex~ diff --git a/conclusion.tex b/conclusion.tex index 9437d8a..98b1359 100644 --- a/conclusion.tex +++ b/conclusion.tex @@ -8,7 +8,7 @@ measure by describing the multiple ways in which it would aid in the management of energy and other resources on battery-powered smartphones. Using an energy consumption dataset collected on \PhoneLab{} we have explored separately several potential inputs to a value measure and determined how -they weight energy consumption. And finally, we have presented results from a +they weight energy consumption. Finally, we have presented results from a failed effort to formulate an effective value measure. While this first attempt was unsuccessful, we hope to engage the mobile systems community in this effort so that more sophisticated and successful value measures can be diff --git a/conclusion.tex~ b/conclusion.tex~ new file mode 100644 index 0000000..9437d8a --- /dev/null +++ b/conclusion.tex~ @@ -0,0 +1,24 @@ +\section{Conclusions} +\label{sec-conclusion} + +To conclude, we have argued that our inability to estimate app value is a +critical weakness that is threatening our successes at accurately estimating +and attributing energy consumption. We have motivated the need for a value +measure by describing the multiple ways in which it would aid in the +management of energy and other resources on battery-powered smartphones. 
+Using an energy consumption dataset collected on \PhoneLab{} we have explored +separately several potential inputs to a value measure and determined how +they weight energy consumption. And finally, we have presented results from a +failed effort to formulate an effective value measure. While this first +attempt was unsuccessful, we hope to engage the mobile systems community in +this effort so that more sophisticated and successful value measures can be +developed. + +\section*{Acknowledgments} + +Students and faculty working on estimating app value are supported by NSF +awards +\href{http://www.nsf.gov/awardsearch/showAward.do?AwardNumber=1205656}{1205656} +and +\href{http://www.nsf.gov/awardsearch/showAward.do?AwardNumber=1423215}{1423215}. +The authors thank the anonymous reviewers for their feedback. diff --git a/introduction.tex b/introduction.tex index 30d9499..ef062e6 100644 --- a/introduction.tex +++ b/introduction.tex @@ -3,7 +3,7 @@ Measuring app energy consumption\footnote{\small To avoid confusion between app and energy usage, we use \textit{consumption} exclusively when referring to energy usage and \textit{usage} exclusively when referring to user -interaction with apps.} on mobile devices is nearly a solved problem, due to +interaction with apps.} on mobile devices is nearly a solved problem. This is due to great strides made in both generating and validating energy models that deliver accurate runtime energy consumption estimates~\cite{mansdi,vedge-nsdi13,pathak2011,pathak2012,yoon} and in @@ -23,7 +23,7 @@ including: \item Will this change to an app make it more energy efficient? -\item Is a particular app an energy virus? +\item Is a particular app an \textit{energy virus}? \item How should the limited energy resources on a given app be prioritized? @@ -41,8 +41,8 @@ of apps in order to evaluate two video conferencing tools, web browsers, or email clients. 
Developers can determine whether a new feature delivers value more or less efficiently than the rest of their app and better understand the differences in energy consumption across different users. Measuring value -allows a rigorous definition of an \textit{energy virus} as an app that -delivers little or no value per joule, and for systems to reward efficient +allows a rigorous definition of an \textit{energy virus} as \textit{an app that +delivers little or no value per joule}, and for systems to reward efficient apps by prioritizing limited resources based on app value or energy efficiency. After all the progress we have made in computing the denominator---energy consumption---we believe that the search for the missing @@ -56,7 +56,7 @@ different usage patterns. It must be efficient to compute, since it should not compete for the same limited energy resources that it is intended to help manage. Ideally it should require little to no user input, since this will make it burdensome and error-prone. And to make matters worse, there is no -obvious way to measure ground truth to compare against---even in the lab. +obvious way to measure ground truth to compare against---even in a lab. Despite all these challenges, however, even a semi-accurate value measure would greatly benefit energy management on battery-constrained smartphones. With users continuing to report battery lifetime as their top concern with diff --git a/results.tex b/results.tex index f47ac12..9159214 100644 --- a/results.tex +++ b/results.tex @@ -28,7 +28,7 @@ component (the Fuel Gauge) lacks. Changes were distributed to \PhoneLab{} participants in November, 2013, via an over-the-air (OTA) platform update. 
The resulting 2~month dataset of 67~GB of compressed log files represents \num{6806} user days during which \num{1328}~apps were started \num{277785} -times and used for a total of \num{15224} hours of active use by +times, and used for a total of \num{15224} hours of active use by 107~\PhoneLab{} participants. Our analysis begins by investigating several components of a possible value @@ -37,7 +37,8 @@ consumed by each app. Next, we formulate a simple measure of content delivery by measuring usage of the screen and audio output devices and test it through a survey completed by 47~experiment participants. Unfortunately, our results are inconclusive and open to several possible interpretations -which we discuss. +which we discuss. We present our results in tabular format, where for each measure we +rank the 10 best-performing and 10 worst-performing apps in descending order. \newpage @@ -152,6 +153,6 @@ interpreted as a sign that we need a more sophisticated value measure incorporating more of the potential inputs we have previously discussed. However, on one level the results are very encouraging: most users were willing to consider removing one or more apps if that app would improve their -battery lifetime. Clearly users are making this decision based on some idea +battery lifetime. Clearly, users are making this decision based on some idea of each app's value---the challenge is to replicate their choices using the information we have available to us. diff --git a/results.tex~ b/results.tex~ new file mode 100644 index 0000000..acb8815 --- /dev/null +++ b/results.tex~ @@ -0,0 +1,157 @@ +\section{Results} +\label{sec-results} + +To examine the potential components of a value measure further, we utilize a +large dataset of energy consumption measurements collected by an IRB-approved +experiment run on the \PhoneLab{} testbed. \PhoneLab{} is a public smartphone +platform testbed located at the University at +Buffalo~\cite{phonelab-sensemine13}. 
220~students, faculty, and staff carry +instrumented Android Nexus~5 smartphones and receive subsidized service in +return for willingness to participate in experiments. \PhoneLab{} provides +access to a representative group of participants balanced between genders and +across a wide variety of age brackets, making our results more +representative. + +Understanding fine-grained energy consumption dynamics required more +information than Android normally exposes to apps. In addition, to explore +components of our value measure we also wanted to capture information about +app usage---including foreground and background time and use of the display +and audio interface---that was not possible to measure on unmodified Android +devices. So to collect our dataset we took advantage of \PhoneLab{}'s ability +to modify the Android platform itself. We instrumented the +\texttt{SurfaceFlinger} and \texttt{AudioFlinger} components in the Android platform +to record usage of the screen and audio, and altered the ActivityManagerService +package to record energy consumption at each app transition. This allows energy +consumption by components such as the screen to be accurately attributed to +the foreground app, a feature that Android's internal battery monitoring +component (the Fuel Gauge) lacks. Changes were distributed to \PhoneLab{} +participants in November, 2013, via an over-the-air (OTA) platform update. +The resulting 2~month dataset of 67~GB of compressed log files represents +\num{6806} user days during which \num{1328}~apps were started \num{277785} +times, and used for a total of \num{15224} hours of active use by +107~\PhoneLab{} participants. + +Our analysis begins by investigating several components of a possible value +measure and shows the effect of using each to weight the overall energy +consumed by each app. 
Next, we formulate a simple measure of content +delivery by measuring usage of the screen and audio output devices and test +it through a survey completed by 47~experiment participants. Unfortunately, +our results are inconclusive and open to several possible interpretations +which we discuss. + +\newpage + +\subsection{Total Energy} + +\input{./figures/tables/tableALL.tex} + +Clearly, ranking apps by total energy consumption computed by adding all +foreground and background energy consumptions over the entire study says +much more about app popularity than it does about anything else. +Table~\ref{table-total} shows the top and bottom energy-consuming apps over +the entire study. As expected, popular apps such as the Android Browser, +Facebook, and the Android Phone component consume the most energy, while the +list of low consumers is dominated by apps with few installs. This table does +serve, however, to identify the popular apps in use by \PhoneLab{} +participants, and as a point of comparison for the remainder of our results. + +\subsection{Power} + +Computing each app's power consumption by scaling their total energy usage +against the total time they were running, either in the background or +foreground, reveals more information, as shown in Table~\ref{table-rate}. Our +results identify Facebook Messenger, Google+, and the Super-Bright LED +Flashlight as apps that rapidly consume energy, while the Bank of America and +Weather Channel apps consume energy slowly. Differences between apps in +similar categories may begin to identify apps with problematic energy +consumption, such as contrasting the high energy usage of Facebook Messenger +with other messaging clients such as WhatsApp, Twitter, and Android +Messaging. + +\subsection{Foreground Energy Efficiency} + +Isolating the foreground component of execution time provides a better +measure of value, since it ignores the time that users spend ignoring apps. 
+Table~\ref{table-foreground} shows a measure of energy efficiency computed by +%utilizing foreground time alone as our value measure. +dividing total foreground energy consumption by total foreground time of an +app. Some surprising changes +from the power results can be seen. A number of apps have remained in their former +categories: Bank of America, which was identified as a low-power app, is also +a highly efficient app when using foreground time as the value measure; and +Facebook Messenger, which was identified as a high-power app, is also marked +as inefficient. Other apps, however, have switched categories. ESPN +Sportscenter and Yahoo Mail do not consume much power, but also don't spend +much time in the foreground; interestingly, none of the high-power apps +looked better when their foreground usage was considered. + +\subsection{Content Energy Efficiency} + +Finally, we use the data we collected by instrumenting the +\texttt{SurfaceFlinger} and \texttt{AudioFlinger} components to compute a +simple measure of content delivery. We measure the audio and video frame +rates and combine them into a single measure by using bit-rates corresponding +to a 30~fps YouTube-encoded video and 128~kbps two-channel audio, with the +weights representing the fact that a single frame of video contains much more +content than a single sample of audio. We use this combined metric as the +value measure and again use it to weight the energy consumption of each app, +with the results shown in Table~\ref{table-content}. + +Comparing with the foreground energy efficiency again shows several +interesting changes. Yahoo Mail, which foreground energy efficiency marked as +inefficient, looks more efficient when content delivery is considered. 
While +it is possible that one \PhoneLab{} participant uses it to read email very +quickly, it may be more likely that it uses a ``spinner'' or other fancy UI +elements that generate artificially high frame rates without delivering much +information. The inability to distinguish between meaningless and meaningful +video frame content is a significant weakness of this simple approach. +YouTube and Candy Crush Saga both earn high marks, which is encouraging given +that they are very different apps but also might be a result of overweighting +screen refreshes. The Android Clock is also an unsurprising result, as it +requires almost no energy to generate a relatively large number of screen +redraws in timer and stopwatch mode. + +\subsection{Survey Results and Discussion} + +\begin{figure*}[t] +\centering +\includegraphics[width=\textwidth]{./figures/survey.pdf} + +\caption{\small \textbf{Survey Results.} The height of each bar demonstrates how +many of the suggested apps the user is willing to remove for better battery +life, with suggestions based on overall usage or our new content-delivery +efficiency measure. Our new measure does not convincingly outperform the +straw man.} + +\label{fig-survey} + +\end{figure*} + +To continue the evaluation of our simple content-based value measure, we +prepared a survey for the 107~\PhoneLab{} participants who contributed data +to our experiment. Our goal was to determine if users would be more willing +to remove inefficient apps, as defined using our content-based measure. As a +baseline, we also asked users about the apps that consumed the most energy. +We used each participant's data to generate a custom survey containing +questions about 9 apps: the 3 least efficient apps as computed by our +content-based value measure, the 3 apps that used the most energy on their +smartphone during the experiment, and 3 apps chosen at random. 
For each we +asked them a simple question: ``If it would improve your battery life, would +you uninstall or stop using this app?'' To compute an aggregate score for +both the content-based and usage-based measures, we give each measure 1~point +for a ``Yes'', 0.5~points for a ``Maybe'' and 0~points for a ``No''. +47~participants completed the survey, and the results are shown in +Figure~\ref{fig-survey}. For each user, if the score of one measure is higher +than the other, it is considered a ``win'' for the former. + +Overall the results are inconclusive, with the content-delivery measure not +clearly outperforming the straw-man usage measure at predicting which apps +each user would be willing to remove to save battery life. Given the crude +nature of our metric, this is not particularly surprising, and can be +interpreted as a sign that we need a more sophisticated value measure +incorporating more of the potential inputs we have previously discussed. +However, on one level the results are very encouraging: most users were +willing to consider removing one or more apps if that app would improve their +battery lifetime. Clearly, users are making this decision based on some idea +of each app's value---the challenge is to replicate their choices using the +information we have available to us. diff --git a/usage.tex b/usage.tex index 85d9893..c993a4a 100644 --- a/usage.tex +++ b/usage.tex @@ -143,7 +143,7 @@ same approach can also be applied to determine how much of any limited system resource to allocate to each app, %with high-value apps gaining priority over %the processor, memory allocation, networking bandwidth and limited storage. -Together these resources allocation measures can be designed to ensure that +Together these resource allocation measures can be designed to ensure that high-value apps run smoothly at the expense of lower-value apps. \subsection{Summary of Requirements} -- libgit2 0.22.2