Commit fd01472c82940f0fcfb3d8ce615c4a000f5c4c53

Authored by Anudipa Maiti
1 parent bf9918b9

first change for camera ready

abstract.tex
@@ -5,7 +5,7 @@ measures alone are not sufficient to enable effective energy management on @@ -5,7 +5,7 @@ measures alone are not sufficient to enable effective energy management on
5 battery-constrained mobile devices. What is urgently needed is a way to put 5 battery-constrained mobile devices. What is urgently needed is a way to put
6 energy consumption into context by measuring the \textit{value} delivered by 6 energy consumption into context by measuring the \textit{value} delivered by
7 mobile apps. While difficult to compute, an accurate value measure would 7 mobile apps. While difficult to compute, an accurate value measure would
8 -enable cross-app comparison, app improvement, energy virus detection, and 8 +enable cross-app comparison, app improvement, energy inefficient app detection, and
9 effective runtime energy allocation and prioritization. Our paper motivates 9 effective runtime energy allocation and prioritization. Our paper motivates
10 the problem, describes requirements for a value measure, discusses and 10 the problem, describes requirements for a value measure, discusses and
11 evaluates several possible inputs to such a measure, and presents results 11 evaluates several possible inputs to such a measure, and presents results
introduction.tex
@@ -39,9 +39,9 @@ Armed with a measure of value we can return to the difficult questions posed @@ -39,9 +39,9 @@ Armed with a measure of value we can return to the difficult questions posed
39 above. By computing efficiency users can perform apples-to-apples comparisons 39 above. By computing efficiency users can perform apples-to-apples comparisons
40 of apps in order to evaluate two video conferencing tools, web browsers, or 40 of apps in order to evaluate two video conferencing tools, web browsers, or
41 email clients. Developers can determine whether a new feature delivers value 41 email clients. Developers can determine whether a new feature delivers value
42 -more or less efficiently than the rest of their app and better understand 42 +more or less efficiently than the rest of their app and understand better the
43 differences in energy consumption across different users. Measuring value 43 differences in energy consumption across different users. Measuring value
44 -allows a rigorous definition of energy virus as an app that delivers little 44 +allows a rigorous definition of an \textit{energy virus} as an app that delivers little
45 or no value per joule, and for systems to reward efficient apps by 45 or no value per joule, and for systems to reward efficient apps by
46 prioritizing limited resources based on app value or energy efficiency. After 46 prioritizing limited resources based on app value or energy efficiency. After
47 all the progress we have made in computing the denominator---energy 47 all the progress we have made in computing the denominator---energy
@@ -69,7 +69,7 @@ how useful such a measure would be while also formulating design requirements @@ -69,7 +69,7 @@ how useful such a measure would be while also formulating design requirements
69 for the value measure itself. Section~\ref{sec-measure} presents an overview 69 for the value measure itself. Section~\ref{sec-measure} presents an overview
70 of possible inputs into such a measure and discussion of how each could be 70 of possible inputs into such a measure and discussion of how each could be
71 measured and how useful it might be. In Section~\ref{sec-results} we present 71 measured and how useful it might be. In Section~\ref{sec-results} we present
72 -at formulating a value measure based on content delivered through the video 72 +our initial effort at formulating a value measure based on content delivered through the video
73 display and audio output---an attempt that we consider a failure based on the 73 display and audio output---an attempt that we consider a failure based on the
74 result of a user survey, but a failure that we hope sheds some light on this 74 result of a user survey, but a failure that we hope sheds some light on this
75 difficult challenge. 75 difficult challenge.
metric.tex
@@ -95,7 +95,7 @@ can compute accurate values while consuming small amounts of energy. @@ -95,7 +95,7 @@ can compute accurate values while consuming small amounts of energy.
95 However, while this framework is conceptually appealing, fitting each app 95 However, while this framework is conceptually appealing, fitting each app
96 into it requires app-specific features that we are trying to avoid: content 96 into it requires app-specific features that we are trying to avoid: content
97 is measured in messages for the chat client, frames for the video player, and 97 is measured in messages for the chat client, frames for the video player, and
98 -the accuracy of the step value for the pedometer. This raises the question of 98 +the step value accuracy for the pedometer. This raises the question of
99 whether a single measure of content delivery requiring no app-specific 99 whether a single measure of content delivery requiring no app-specific
100 knowledge can be utilized in all cases. We explore this question in more 100 knowledge can be utilized in all cases. We explore this question in more
101 detail, as well as differences between the other value measure inputs we have 101 detail, as well as differences between the other value measure inputs we have
paper.tex
@@ -33,7 +33,7 @@ Apps} @@ -33,7 +33,7 @@ Apps}
33 \href{http://ubicomp.org/ubicomp2014/}{\textit{HotMobile'15}}, February 33 \href{http://ubicomp.org/ubicomp2014/}{\textit{HotMobile'15}}, February
34 12--13, 2015, Santa Fe, NM, USA\\ 34 12--13, 2015, Santa Fe, NM, USA\\
35 ACM 978-1-4503-3391-7/15/02$\ldots$\$15.00.\\ 35 ACM 978-1-4503-3391-7/15/02$\ldots$\$15.00.\\
36 - \url{http://dx.doi.org/10.1145/2699343.2699361}} 36 + \url{http://dx.doi.org/10.1145/2699343.2699360}}
37 37
38 \begin{document} 38 \begin{document}
39 39
results.tex
@@ -19,9 +19,9 @@ app usage---including foreground and background time and use of the display @@ -19,9 +19,9 @@ app usage---including foreground and background time and use of the display
19 and audio interface---that was not possible to measure on unmodified Android 19 and audio interface---that was not possible to measure on unmodified Android
20 devices. So to collect our dataset we took advantage of \PhoneLab{}'s ability 20 devices. So to collect our dataset we took advantage of \PhoneLab{}'s ability
21 to modify the Android platform itself. We instrumented the 21 to modify the Android platform itself. We instrumented the
22 -\texttt{SurfaceFlinger} and \texttt{AudioFlinger} Android platform components  
23 -to record usage of the screen and audio, and altered the Activity Services  
24 -package to record energy consumption at each app transition, allowing energy 22 +\texttt{SurfaceFlinger} and \texttt{AudioFlinger} components in the Android platform
  23 +to record usage of the screen and audio, and altered the ActivityManagerService
  24 +package to record energy consumption at each app transition. This allows energy
25 consumption by components such as the screen to be accurately attributed to 25 consumption by components such as the screen to be accurately attributed to
26 the foreground app, a feature that Android's internal battery monitoring 26 the foreground app, a feature that Android's internal battery monitoring
27 component (the Fuel Gauge) lacks. Changes were distributed to \PhoneLab{} 27 component (the Fuel Gauge) lacks. Changes were distributed to \PhoneLab{}
@@ -37,13 +37,14 @@ consumed by each app. Next, we formulate a simple measure of content @@ -37,13 +37,14 @@ consumed by each app. Next, we formulate a simple measure of content
37 delivery by measuring usage of the screen and audio output devices and test 37 delivery by measuring usage of the screen and audio output devices and test
38 it through a survey completed by 47~experiment participants. Unfortunately, 38 it through a survey completed by 47~experiment participants. Unfortunately,
39 our results are inconclusive and open to several possible interpretations 39 our results are inconclusive and open to several possible interpretations
40 -which we conclude by discussing. 40 +which we discuss.
41 41
42 \subsection{Total Energy} 42 \subsection{Total Energy}
43 43
44 \input{./figures/tables/tableALL.tex} 44 \input{./figures/tables/tableALL.tex}
45 45
46 -Clearly, ranking apps by total energy consumption over the entire study says 46 +Clearly, ranking apps by total energy consumption computed by adding all
  47 +foreground and background energy consumptions over the entire study says
47 much more about app popularity than it does about anything else. 48 much more about app popularity than it does about anything else.
48 Table~\ref{table-total} shows the top and bottom energy-consuming apps over 49 Table~\ref{table-total} shows the top and bottom energy-consuming apps over
49 the entire study. As expected, popular apps such as the Android Browser, 50 the entire study. As expected, popular apps such as the Android Browser,
@@ -70,8 +71,10 @@ Messaging. @@ -70,8 +71,10 @@ Messaging.
70 Isolating the foreground component of execution time provides a better 71 Isolating the foreground component of execution time provides a better
71 measure of value, since it ignores the time that users spend ignoring apps. 72 measure of value, since it ignores the time that users spend ignoring apps.
72 Table~\ref{table-foreground} shows a measure of energy efficiency computed by 73 Table~\ref{table-foreground} shows a measure of energy efficiency computed by
73 -utilizing foreground time alone as our value measure. Some surprising changes  
74 -from the power results can be seen. Some apps have remaining in their former 74 +%utilizing foreground time alone as our value measure.
  75 +dividing total foreground energy consumption by total foreground time of an
  76 +app. Some surprising changes
  77 +from the power results can be seen. A number of apps have remained in their former
75 categories: Bank of America, which was identified as a low-power app, is also 78 categories: Bank of America, which was identified as a low-power app, is also
76 a highly-efficient app when using foreground time as the value measure; and 79 a highly-efficient app when using foreground time as the value measure; and
77 Facebook Messenger, which was identified as a high-power app, is also marked 80 Facebook Messenger, which was identified as a high-power app, is also marked
@@ -82,7 +85,7 @@ looked better when their foreground usage was considered. @@ -82,7 +85,7 @@ looked better when their foreground usage was considered.
82 85
83 \subsection{Content Energy Efficiency} 86 \subsection{Content Energy Efficiency}
84 87
85 -Finally, we the data we collected by instrumenting the 88 +Finally, we use the data we collected by instrumenting the
86 \texttt{SurfaceFlinger} and \texttt{AudioFlinger} components to compute a 89 \texttt{SurfaceFlinger} and \texttt{AudioFlinger} components to compute a
87 simple measure of content delivery. We measure the audio and video frame 90 simple measure of content delivery. We measure the audio and video frame
88 rates and combine them into a single measure by using bit-rates corresponding 91 rates and combine them into a single measure by using bit-rates corresponding
@@ -94,7 +97,7 @@ with the results shown in Table~\ref{table-content}. @@ -94,7 +97,7 @@ with the results shown in Table~\ref{table-content}.
94 97
95 Comparing with the foreground energy efficiency again shows several 98 Comparing with the foreground energy efficiency again shows several
96 interesting changes. Yahoo Mail, which foreground energy efficiency marked as 99 interesting changes. Yahoo Mail, which foreground energy efficiency marked as
97 -inefficiency, looks more efficient when content delivery is considered. While 100 +inefficient, looks more efficient when content delivery is considered. While
98 it is possible that one \PhoneLab{} participant uses it to read email very 101 it is possible that one \PhoneLab{} participant uses it to read email very
99 quickly, it may be more likely that it uses a ``spinner'' or other fancy UI 102 quickly, it may be more likely that it uses a ``spinner'' or other fancy UI
100 elements that generate artificially high frame rates without delivering much 103 elements that generate artificially high frame rates without delivering much
@@ -136,7 +139,8 @@ you uninstall or stop using this app?'' To compute an aggregate score for @@ -136,7 +139,8 @@ you uninstall or stop using this app?'' To compute an aggregate score for
136 both the content-based and usage based measures, we give each measure 1~point 139 both the content-based and usage based measures, we give each measure 1~point
137 for a ``Yes'', 0.5~points for a ``Maybe'' and 0~points for a ``No''. 140 for a ``Yes'', 0.5~points for a ``Maybe'' and 0~points for a ``No''.
138 47~participants completed the survey, and the results are shown in 141 47~participants completed the survey, and the results are shown in
139 -Figure~\ref{fig-survey}. 142 +Figure~\ref{fig-survey}. For each user, if the score of one measure is higher
  143 +than the other, it is considered a ``win'' for the former.
140 144
141 Overall the results are inconclusive, with the content-delivery measure not 145 Overall the results are inconclusive, with the content-delivery measure not
142 clearly outperforming the straw-man usage measure at predicting which apps 146 clearly outperforming the straw-man usage measure at predicting which apps
submitted/3.pdf 0 → 100644
No preview for this file type
submitted/4.pdf 0 → 100644
No preview for this file type
usage.tex
@@ -11,19 +11,19 @@ problem: what is the value of an app? @@ -11,19 +11,19 @@ problem: what is the value of an app?
11 11
12 All smartphone users intuitively realize that smartphone apps differ in 12 All smartphone users intuitively realize that smartphone apps differ in
13 value---an email client, for example, is probably more valuable than a app 13 value---an email client, for example, is probably more valuable than a app
14 -that makes farting sounds. But is it possible to quantify these subjective 14 +that makes random sounds. But is it possible to quantify these subjective
15 distinctions and produce a value measure? To argue that this is possible we 15 distinctions and produce a value measure? To argue that this is possible we
16 present two experiments that elucidate smartphone app value in the form of 16 present two experiments that elucidate smartphone app value in the form of
17 both ordinal and cardinal utilities: 17 both ordinal and cardinal utilities:
18 % 18 %
19 \begin{enumerate} 19 \begin{enumerate}
20 20
21 -\item An adversary will require you to remove some number of apps from your 21 +\item You will be required to remove some number of apps from your
22 smartphone. Order the apps you are currently using from least important to 22 smartphone. Order the apps you are currently using from least important to
23 most important. The N least important apps will be removed. 23 most important. The N least important apps will be removed.
24 24
25 -\item Your smartphone will require you to create an energy budget for the  
26 -apps you use. During any discharging cycle, once an app runs out of energy 25 +\item You will be required to create an energy budget for the
  26 +apps you use on your smartphone. During any discharging cycle, once an app runs out of energy
27 you will not be able to use it until you plug in your smartphone. Allocate 27 you will not be able to use it until you plug in your smartphone. Allocate
28 battery percentages to each app you use. 28 battery percentages to each app you use.
29 29
@@ -31,7 +31,9 @@ battery percentages to each app you use. @@ -31,7 +31,9 @@ battery percentages to each app you use.
31 % 31 %
32 We plan to engage smartphone users in studies to explore in more detail which 32 We plan to engage smartphone users in studies to explore in more detail which
33 of these approaches is more effective, comparing them by comparing users' 33 of these approaches is more effective, comparing them by comparing users'
34 -levels of satisfaction under each scenario. For our value measure we are 34 +levels of satisfaction under each scenario. In the first experiment we ask users
  35 +to uninstall apps because often apps have a background component that keeps consuming
  36 +energy even when not used by users any more. For our value measure we are
35 hopeful that users will prove capable of assigning cardinal utilities to 37 hopeful that users will prove capable of assigning cardinal utilities to
36 apps---as in the second experiment---since this matches most directly with 38 apps---as in the second experiment---since this matches most directly with
37 our proposed value measure and could provide ground truth for a value measure 39 our proposed value measure and could provide ground truth for a value measure
@@ -47,7 +49,7 @@ these setups are the only way or the right way to measure value. In both @@ -47,7 +49,7 @@ these setups are the only way or the right way to measure value. In both
47 cases low value measures have fairly extreme consequences---the app is 49 cases low value measures have fairly extreme consequences---the app is
48 actually removed or rendered unusable. This may cause users to overvalue 50 actually removed or rendered unusable. This may cause users to overvalue
49 essential tools such as communication apps and undervalue inessential apps 51 essential tools such as communication apps and undervalue inessential apps
50 -that nevertheless provide them with a great deal of enjoyment such as a game. 52 +that nevertheless provide them with a great deal of enjoyment such as games.
51 However, given that our goal is a value measure that can be paired with and 53 However, given that our goal is a value measure that can be paired with and
52 used to allocate energy, and that energy exhaustion has such severe 54 used to allocate energy, and that energy exhaustion has such severe
53 consequences on the usability of all apps, a more extreme experimental setup 55 consequences on the usability of all apps, a more extreme experimental setup
@@ -63,10 +65,10 @@ The most powerful use of a value measure would be to compare apps by @@ -63,10 +65,10 @@ The most powerful use of a value measure would be to compare apps by
63 comparing their energy efficiency, therefore overcoming the most critical 65 comparing their energy efficiency, therefore overcoming the most critical
64 flaw in current attempts to compare or categorize apps by their energy 66 flaw in current attempts to compare or categorize apps by their energy
65 consumption alone~\cite{carat-sensys13}. Consider attempting to compare a 67 consumption alone~\cite{carat-sensys13}. Consider attempting to compare a
66 -chat client and videoconferencing app by only measuring their energy 68 +chat client and video conferencing app by only measuring their energy
67 consumption. Unless it is terribly written, the chat client will consume less 69 consumption. Unless it is terribly written, the chat client will consume less
68 energy. But this does not mean that it is efficient, or that the 70 energy. But this does not mean that it is efficient, or that the
69 -videoconferencing app is not. Ultimately, all the energy consumption 71 +video conferencing app is not. Ultimately, all the energy consumption
70 comparison truly reveals is that the two apps do different things---which we 72 comparison truly reveals is that the two apps do different things---which we
71 already knew. 73 already knew.
72 74
@@ -75,20 +77,22 @@ same app difficult. Given an app that consumes twice as much energy on @@ -75,20 +77,22 @@ same app difficult. Given an app that consumes twice as much energy on
75 Alice's smartphone than on Bob's, the question of why is left unanswered by 77 Alice's smartphone than on Bob's, the question of why is left unanswered by
76 pure energy measures. Even if usage time can be used to normalize the 78 pure energy measures. Even if usage time can be used to normalize the
77 comparison, power consumption alone cannot incorporate differences due to the 79 comparison, power consumption alone cannot incorporate differences due to the
78 -different app features or configurations used by Alice and Bob. 80 +different app features or app configurations used by Alice and Bob.
79 81
80 By computing value and, thus, energy efficiency, we can overcome these 82 By computing value and, thus, energy efficiency, we can overcome these
81 weaknesses. A value measure should allow us to compare the efficiency of two 83 weaknesses. A value measure should allow us to compare the efficiency of two
82 apps in different categories based on how efficiently they use energy to 84 apps in different categories based on how efficiently they use energy to
83 -deliver user value, making it possible to compare games to email clients to  
84 -video players. Comparisons within the same app category should allow users to 85 +deliver user value.
  86 +%, making it possible to compare games to email clients to video players.
  87 +Comparisons within the same app category should allow users to
85 select the most efficient email client or web browser. Aggregating results 88 select the most efficient email client or web browser. Aggregating results
86 over all users, differences in app energy efficiency should reflect how well 89 over all users, differences in app energy efficiency should reflect how well
87 the app is written and how well it predicts and adapts to users, not just 90 the app is written and how well it predicts and adapts to users, not just
88 differences in the core features it provides. When comparing two users using 91 differences in the core features it provides. When comparing two users using
89 -the same app, differences in efficiency should reflect different  
90 -configurations or differences in how efficiently the app provides certain  
91 -features. 92 +the same app, differences in efficiency should reflect differences in
  93 +app configurations or app features.
  94 +%different app configurations or differences in how efficiently the app provides certain
  95 +%features.
92 96
93 \subsection{Evaluating App Changes} 97 \subsection{Evaluating App Changes}
94 98
@@ -97,9 +101,11 @@ and deliver more value per joule. Today's energy profiling tools may be able @@ -97,9 +101,11 @@ and deliver more value per joule. Today's energy profiling tools may be able
97 to show the energy impact of adding a new feature or changing the way that a 101 to show the energy impact of adding a new feature or changing the way that a
98 particular feature is implemented, but energy consumption alone is not 102 particular feature is implemented, but energy consumption alone is not
99 sufficient to apply Amdahl's Law properly to the problem of improving app 103 sufficient to apply Amdahl's Law properly to the problem of improving app
100 -energy efficiency. For example, if a particular feature consumes a great deal  
101 -of energy but adds little value, it is possible that it should be eliminated,  
102 -not improved. Overall developers should strive to make the parts of their app 104 +energy efficiency.
  105 +%For example, if a particular feature consumes a great deal
  106 +%of energy but adds little value, it is possible that it should be eliminated,
  107 +%not improved. Overall
  108 +Developers should strive to make the parts of their app
103 that generate a large amount of value as energy-efficient as possible, remove 109 that generate a large amount of value as energy-efficient as possible, remove
104 parts that generate little value while consuming a great deal of energy, and 110 parts that generate little value while consuming a great deal of energy, and
105 defer work on everything else. 111 defer work on everything else.
@@ -108,10 +114,10 @@ defer work on everything else. @@ -108,10 +114,10 @@ defer work on everything else.
108 114
109 A measure of app value makes it possible to produce a rigorous definition of 115 A measure of app value makes it possible to produce a rigorous definition of
110 the term \textit{energy virus}: an app that produces little to no value per 116 the term \textit{energy virus}: an app that produces little to no value per
111 -joule. The choice of threshold will require some study, as it is unlikely 117 +joule. The choice of threshold will require some study, as it is unlikely and
112 impossible to produce a single efficiency cutoff that cleanly separates 118 impossible to produce a single efficiency cutoff that cleanly separates
113 -malicious apps from ones that are merely poorly-written. Note also that this  
114 -definition of energy virus can be made on a per-user basis. This is important 119 +malicious apps from ones that are merely poorly-written. This
  120 +definition of energy virus can also be made on a per-user basis. This is important
115 since a non-malicious but poorly-written app that continues to consume energy 121 since a non-malicious but poorly-written app that continues to consume energy
116 even long after the user has stopped using it---and it has stopped providing 122 even long after the user has stopped using it---and it has stopped providing
117 value---functions as an energy virus for that user, but may not for a user 123 value---functions as an energy virus for that user, but may not for a user
@@ -119,7 +125,7 @@ that interacts with it more frequently. @@ -119,7 +125,7 @@ that interacts with it more frequently.
119 125
120 \subsection{Prioritizing System Resources} 126 \subsection{Prioritizing System Resources}
121 127
122 -An app value measure should able to able to be used to prioritize limited 128 +An app value measure should be able to be used to prioritize limited
123 system resources, particularly energy but also storage, memory, networking 129 system resources, particularly energy but also storage, memory, networking
124 bandwidth and processor time. While mechanisms differ, most previous attempts 130 bandwidth and processor time. While mechanisms differ, most previous attempts
125 to control energy consumption rely on some form of rate control which 131 to control energy consumption rely on some form of rate control which
@@ -141,14 +147,15 @@ are likely many ways to combine energy consumption with a value measure in @@ -141,14 +147,15 @@ are likely many ways to combine energy consumption with a value measure in
141 order to prioritize energy consumption, it is not clear that energy 147 order to prioritize energy consumption, it is not clear that energy
142 consumption can be prioritized effectively without some measure of value. The 148 consumption can be prioritized effectively without some measure of value. The
143 same approach can also be applied to determine how much of any limited system 149 same approach can also be applied to determine how much of any limited system
144 -resource to allocate to each app, with high-value apps gaining priority over  
145 -the processor, memory allocation, networking bandwidth and limited storage.  
146 -Together these resources allocation measures are designed to ensure that 150 +resource to allocate to each app,
  151 +%with high-value apps gaining priority over
  152 +%the processor, memory allocation, networking bandwidth and limited storage.
  153 +Together these resources allocation measures can be designed to ensure that
147 high-value apps run smoothly at the expense of lower-value apps. 154 high-value apps run smoothly at the expense of lower-value apps.
148 155
149 \subsection{Summary of Requirements} 156 \subsection{Summary of Requirements}
150 157
151 -The uses cases above give rise to a set of requirements for a possible value 158 +The use cases above give rise to a set of requirements for a possible value
152 measurement: 159 measurement:
153 % 160 %
154 \begin{itemize} 161 \begin{itemize}
@@ -162,7 +169,7 @@ inputs, requiring that it be calculable given data from a single user. @@ -162,7 +169,7 @@ inputs, requiring that it be calculable given data from a single user.
162 \item It should enable targeted development by highlighting what parts of an 169 \item It should enable targeted development by highlighting what parts of an
163 app generate value and what parts do not. 170 app generate value and what parts do not.
164 171
165 -\item It should be able to be efficiently computed to not overly consume the 172 +\item It should be efficiently computable without unduly consuming the
166 resources that it is designed to help manage. 173 resources that it is designed to help manage.
167 174
168 \item It should be derived with little to no input from the user. 175 \item It should be derived with little to no input from the user.