\section{Results}
\label{sec-results}
To examine the potential components of a value measure further, we utilize a
large dataset of energy consumption measurements collected by an IRB-approved
experiment run on the \PhoneLab{} testbed. \PhoneLab{} is a public smartphone
platform testbed located at the University at
Buffalo~\cite{phonelab-sensemine13}. Its 220~students, faculty, and staff
carry instrumented Android Nexus~5 smartphones and receive subsidized service
in return for their willingness to participate in experiments. \PhoneLab{}
provides access to a participant pool balanced between genders and across a
wide variety of age brackets, making our results more broadly
representative.
Understanding fine-grained energy consumption dynamics required more
information than Android normally exposes to apps. To explore components of
our value measure, we also wanted to capture information about app
usage---including foreground and background time and use of the display and
audio interface---that cannot be measured on unmodified Android devices. To
collect our dataset, we therefore took advantage of \PhoneLab{}'s ability to
modify the Android platform itself. We instrumented the
\texttt{SurfaceFlinger} and \texttt{AudioFlinger} Android platform components
to record usage of the screen and audio, and altered the Activity Services
package to record energy consumption at each app transition. This allows
energy consumed by components such as the screen to be accurately attributed
to the foreground app, a capability that Android's internal battery
monitoring component (the Fuel Gauge) lacks. Our changes were distributed to
\PhoneLab{} participants in November~2013 via an over-the-air (OTA) platform
update.
The resulting two-month dataset of 67~GB of compressed log files represents
\num{6806} user days, during which 107~\PhoneLab{} participants started
\num{1328}~apps \num{277785} times and actively used them for a total of
\num{15224} hours.
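Although the exact format of these log records is beyond the scope of this
section, the analyses below reduce the instrumented log lines to per-app
usage and energy records. The following is a purely illustrative sketch of
such a reduction step; the tag names, the ``tag: JSON payload'' layout, and
the field handling are all assumptions rather than the actual \PhoneLab{}
log format.

\begin{verbatim}
import json

# Hypothetical tags for the instrumented platform components.
INSTRUMENTED_TAGS = {"SurfaceFlinger", "AudioFlinger", "ActivityServices"}

def parse_line(line):
    """Split a logcat-style line into (tag, payload) or return None."""
    tag, _, payload = line.partition(":")
    tag = tag.strip()
    if tag not in INSTRUMENTED_TAGS:
        return None
    try:
        return tag, json.loads(payload)
    except json.JSONDecodeError:
        return None
\end{verbatim}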
Our analysis begins by investigating several components of a possible value
measure and shows the effect of using each to weight the overall energy
consumed by each app. Next, we formulate a simple measure of content
delivery by measuring usage of the screen and audio output devices and test
it through a survey completed by 47~experiment participants. Unfortunately,
our results are inconclusive and open to several possible interpretations,
which we discuss in closing.
\subsection{Total Energy}
\input{./figures/tables/tableALL.tex}
Clearly, ranking apps by total energy consumption over the entire study says
much more about app popularity than it does about anything else.
Table~\ref{table-total} shows the top and bottom energy-consuming apps over
the entire study. As expected, popular apps such as the Android Browser,
Facebook, and the Android Phone component consume the most energy, while the
list of low consumers is dominated by apps with few installs. This table
does serve, however, both to identify the apps popular among \PhoneLab{}
participants and to provide a point of comparison for the remainder of our
results.
\subsection{Power}
Computing each app's power consumption by dividing its total energy usage by
the total time it was running, either in the background or foreground,
reveals more information, as shown in Table~\ref{table-rate}. Our results
identify Facebook Messenger, Google+, and the Super-Bright LED Flashlight as
apps that consume energy rapidly, while the Bank of America and Weather
Channel apps consume energy slowly. Differences between apps in similar
categories may begin to identify apps with problematic energy consumption,
such as contrasting the high energy usage of Facebook Messenger with other
messaging clients such as WhatsApp, Twitter, and Android Messaging.
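As a concrete illustration, the computation behind Table~\ref{table-rate}
amounts to dividing each app's accumulated energy by its accumulated
runtime. A minimal sketch follows, assuming per-record fields
\texttt{energy\_mj} and \texttt{runtime\_s}; these names are hypothetical,
since the dataset schema is not shown here.

\begin{verbatim}
from collections import defaultdict

def average_power_mw(records, time_attr="runtime_s"):
    """Average power per app in milliwatts (mJ/s == mW).
    Record fields (app, energy_mj, runtime_s) are hypothetical."""
    energy = defaultdict(float)   # app -> total energy (mJ)
    seconds = defaultdict(float)  # app -> total time (s)
    for r in records:
        energy[r.app] += r.energy_mj
        seconds[r.app] += getattr(r, time_attr)
    return {app: energy[app] / seconds[app]
            for app in energy if seconds[app] > 0}
\end{verbatim}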
\subsection{Foreground Energy Efficiency}
Isolating the foreground component of execution time provides a better
measure of value, since it ignores the time that users spend ignoring apps.
Table~\ref{table-foreground} shows a measure of energy efficiency computed
using foreground time alone as our value measure. Comparison with the power
results reveals some surprising changes. Certain apps remain in their former
categories: Bank of America, which was identified as a low-power app, is also
a highly-efficient app when using foreground time as the value measure; and
Facebook Messenger, which was identified as a high-power app, is also marked
as inefficient. Other apps, however, have switched categories. ESPN
Sportscenter and Yahoo Mail do not consume much power, but they also do not
spend much time in the foreground; interestingly, none of the high-power apps
looked better when their foreground usage was considered.
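Under the same assumed schema, the foreground measure changes only the
denominator: energy is divided by time spent in the foreground rather than
by total runtime. With the sketch above, this is a one-argument change
(\texttt{fg\_seconds} is again a hypothetical field name).

\begin{verbatim}
# Energy per second of foreground use; higher values suggest
# less efficient apps under this value measure.
foreground_power = average_power_mw(records, time_attr="fg_seconds")
\end{verbatim}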
\subsection{Content Energy Efficiency}
Finally, we use the data we collected by instrumenting the
\texttt{SurfaceFlinger} and \texttt{AudioFlinger} components to compute a
simple measure of content delivery. We measure the audio and video frame
rates and combine them into a single measure using weights derived from the
bit-rates of a 30~fps YouTube-encoded video and 128~kbps two-channel audio,
reflecting the fact that a single frame of video contains much more content
than a single sample of audio. We use this combined metric as the value
measure and again use it to weight the energy consumption of each app, with
the results shown in Table~\ref{table-content}.
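One way to formalize this weighting, as a sketch rather than the exact
formulation used, is
\[
  C = f_v \cdot \frac{B_v}{30} + t_a \cdot B_a,
  \qquad
  \textit{efficiency} = \frac{E}{C},
\]
where $f_v$ is the number of video frames rendered, $t_a$ the seconds of
audio output, $B_v$ the bit-rate of a 30~fps YouTube-encoded video (left
symbolic, since the specific value is not given above), $B_a = 128$~kbps,
and $E$ the app's total energy consumption; each video frame is thus
credited with $B_v/30$ bits of content and each second of audio with $B_a$
bits.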
Comparing with the foreground energy efficiency again shows several
interesting changes. Yahoo Mail, which the foreground measure marked as
inefficient, looks more efficient when content delivery is considered. While
it is possible that one \PhoneLab{} participant uses it to read email very
quickly, it is perhaps more likely that the app uses a ``spinner'' or other
fancy UI elements that generate artificially high frame rates without
delivering much information. The inability to distinguish between meaningless
and meaningful video frame content is a significant weakness of this simple
approach. YouTube and Candy Crush Saga both earn high marks, which is
encouraging given that they are very different apps, but this may also be a
result of overweighting screen refreshes. The Android Clock's high ranking is
also unsurprising, as it requires almost no energy to generate a relatively
large number of screen redraws.
\subsection{Survey Results and Discussion}
\begin{figure*}[t]
\centering
\includegraphics[width=\textwidth]{./figures/survey.pdf}
\caption{\textbf{Participant responses to energy-inefficient app
suggestions.} The height of each bar indicates how many of the suggested
apps the participant was willing to remove for better battery life.}
\label{fig-survey}
\end{figure*}
To evaluate our content-based efficiency metric against the usage-based
metric, we sent our participants a survey asking whether they would remove
the three most energy-inefficient apps identified by each metric to improve
their smartphone's battery life, choosing one of three options: yes, maybe,
or no. In total, 47~participants responded. Figure~\ref{fig-survey} shows
that our efficiency metric did not do a better job than the usage-based
metric. This negative result indicates that our content metric design is too
simplistic to be effective. Screen or audio time alone is not enough to
evaluate the different types of rich content delivered by apps. For example,
our metric cannot distinguish between video content and interactive content.
We also need to be careful about how we assign weights to the multiple
components that consume energy to deliver content.