\section{Value Measure Inputs}
\label{sec-measure}

We now discuss possible inputs to a value measure and how to collect them at
runtime. In each case, we also note how such statistics could be misleading.

\subsection{Overall Usage}

There are a variety of ways to measure overall app usage that could be useful
inputs to our value measure. Total foreground time is straightforward to
measure, particularly on today's smartphones, where one app tends to dominate
the display. However, next-generation smartphone platforms that provide
multiple apps with simultaneous access to the display will complicate this
task by making it more difficult to determine which app the user is paying
attention to. The number of starts is also a potentially useful input, as is
the distribution of interaction times across all times the app was brought to
the foreground.
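A minimal sketch of how such usage statistics might be accumulated from
platform foreground/background events follows; the event format and the
UsageTracker name are our own assumptions, not an existing platform API:

```python
from collections import defaultdict

class UsageTracker:
    """Hypothetical per-app usage accounting. Assumes the platform
    delivers (timestamp, app, event) tuples, where event is either
    "foreground" or "background"."""

    def __init__(self):
        self.total_foreground = defaultdict(float)  # app -> total seconds
        self.starts = defaultdict(int)              # app -> foreground count
        self.sessions = defaultdict(list)           # app -> session lengths
        self._active = {}                           # app -> session start time

    def on_event(self, timestamp, app, event):
        if event == "foreground":
            self._active[app] = timestamp
            self.starts[app] += 1
        elif event == "background" and app in self._active:
            length = timestamp - self._active.pop(app)
            self.total_foreground[app] += length
            self.sessions[app].append(length)
```

Note that with simultaneous multi-app display, the backgrounding event alone
would no longer identify which app held the user's attention.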

While these measures of contact time are intuitive, there are obvious cases
in which they fail, particularly for apps that spend a great deal of time
running in the background in order to deliver a small amount of useful
foreground information---such as a pedometer app.

\subsection{User Interface Statistics}

Patterns of interaction may also be useful to observe, and inputs such as
keystrokes and touchscreen events are simple to track. However, interaction
patterns differ markedly across app categories---users deliver far more
keystrokes to a chat client than to a video player---so interaction
statistics will have to be used in conjunction with complementary value
measure components that offset the differences between high-interaction and
low-interaction apps. This approach also fails when apps deploy confusing or
poorly-designed interfaces that require a great deal of unnecessary
interaction to accomplish simple tasks. Clearly such apps should not be
rewarded.

\subsection{Notification Click-Through Rates}

Another interesting statistic that could provide insight into app value is
how often users view or click through app notifications. When notifications
are delivered but never viewed, it is unclear whether the app needed to
deliver them at all. When clickable notifications---such as those for new
email---provide a way for users to immediately launch the app, the percentage
of notifications that are actually clicked rather than ignored could be used
to at least evaluate how effective the notifications are, and may also
reflect on overall app value.

Notification view and click-through rates also help put into context the
energy used by apps when they are running in the background. Legitimate
background energy consumption should be for one of two purposes: to prepare
the app to deliver more value the next time it is foregrounded, as is the
case when music players download songs and store them locally to reduce their
runtime network usage; or to deliver real-time notifications to the user.
The effectiveness of background energy consumption to fill caches will be
reflected in the app's overall energy usage, since retrieving local content is
more energy efficient than using the network. Effectiveness of background
consumption to deliver notifications may be reflected in the rate at which
notifications are viewed or clicked, since a notification that is not
consumed did not need to be retrieved.
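These rates are straightforward to compute once the platform counts
delivered, viewed, and clicked notifications; the function name and return
format below are illustrative assumptions:

```python
def notification_stats(delivered, viewed, clicked, background_joules):
    """Hypothetical summary of one app's notification effectiveness.
    The counts and the background energy total are assumed to be
    collected by the platform."""
    if delivered == 0:
        return {"view_rate": 0.0, "click_rate": 0.0, "joules_per_view": None}
    return {
        "view_rate": viewed / delivered,
        "click_rate": clicked / delivered,
        # Background energy spent per notification actually consumed.
        "joules_per_view": background_joules / viewed if viewed else None,
    }
```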

However, in some cases apps may do an effective job of summarizing the event
within the notification itself, leaving the user no need to bring the app to
the foreground. Clearly such apps should not be penalized.

\subsection{Content Delivery}

Another approach to measuring value that we feel is promising is to consider
apps as content delivery agents and measure how efficiently they deliver
information to and from the user. Encouragingly, multiple apps that we have
previously considered can fit into this framework:

\begin{itemize}

\item \textbf{Chat client:} the content is the messages exchanged by users,
and efficiency is determined by the amount of screen time and interaction
required to retrieve and render incoming messages and to generate outgoing
replies. Value is measured by the content of the messages, and efficient chat
clients send and receive a large number of messages per joule.

\item \textbf{Video player:} the content is the video delivered to the user,
and efficiency is determined by the amount of network bandwidth and processing
needed to render the video. Value is measured by the information delivered by
the videos, and efficient video players present a large amount of video
content to their users per joule.

\item \textbf{Pedometer:} the content is the step count presented to the
user, and efficiency is determined by the accelerometer sampling rate and any
post-processing required to produce an accurate estimate. Value is measured
by the ability to maintain the step count, and efficient pedometers can
compute accurate values while consuming small amounts of energy.

\end{itemize}
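The common structure of these three examples is a content-per-joule ratio in
which only the choice of content unit is app-specific. A hypothetical sketch:

```python
def content_per_joule(content_units, energy_joules):
    """Generic efficiency ratio: content delivered per joule consumed.
    The app-specific part is deciding what counts as one content unit
    (a message, a second of video, an updated step count)."""
    if energy_joules <= 0:
        return 0.0
    return content_units / energy_joules

# Hypothetical numbers: 120 messages exchanged at a cost of 40 J.
chat_efficiency = content_per_joule(120, 40.0)
```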

However, while this framework is conceptually appealing, fitting each app
into it requires app-specific features that we are trying to avoid: content
is measured in messages for the chat client, in frames for the video player,
and in the accuracy of the step count for the pedometer. This raises the
question of whether a single measure of content delivery requiring no
app-specific knowledge can be applied in all cases. We explore this question
in more detail, along with the differences among the other value measure
inputs we have discussed, through the experiment and results described next.