Spotlight Talk in Workshop: Machine Learning for Mobile Health
Using Wearables for Influenza-Like Illness Detection: The importance of design
Bret Nestor
An estimated one in five Americans uses a consumer wearable sensor to track fitness and other personal health metrics. Recently, these devices have been touted as low-cost vehicles for frequent healthcare monitoring and have received regulatory approval as diagnostic devices for conditions such as atrial fibrillation. Common fitness-tracker measurements, such as heart rate and step count, can be used to implicate underlying health conditions. One application of interest is to anticipate or detect influenza-like illness (ILI). However, timely detection of influenza is challenging, as the virus can be transmitted prior to symptom onset (pre-symptomatic) or by individuals who harbour the virus but never experience symptoms (asymptomatic). Similarly, for COVID-19, another disease that causes ILI, an estimated 44% of viral shedding by symptomatic individuals occurs before symptom onset. We investigate whether ILI (whether caused by influenza, COVID-19, or other diseases) can be detected from wearable sensors and, if so, how early the onset of symptoms can be anticipated. A system that warns users they are about to become ill could reduce viral transmission, mitigating the spread of seasonal influenza and helping to suppress the COVID-19 epidemic.
ILI symptoms can be detected from wearable sensors. For example, body temperature covaries with cardiac rhythm, so a fever manifests as an elevated heart rate. The associated increase in resting heart rate (RHR) during ILI has been demonstrated in previous studies and has been used, with data collected from wearable sensors, to estimate influenza incidence at the population level. Yet individual-level ILI prediction from wearable features has remained elusive, though research is actively underway. Rigorous work is required to evaluate the sensitivity of models that anticipate ILI onset before symptoms are experienced. In this paper we expose potential pitfalls in building ILI prediction models. Specifically, we compare the performance of a model trained and evaluated retrospectively, on a held-out set of subjects, against the same model evaluated prospectively, on a held-out future week of data, mimicking an actual deployment scenario. We show that although performance may drop when the design is focused on deployment, it still improves over naive baselines, indicating potential for real-world application.
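To make the distinction between the two evaluation designs concrete, here is a minimal sketch on synthetic data: a retrospective split that holds out subjects, and a prospective split that holds out a future week. The column names, the synthetic wearable records, and the logistic-regression model are all illustrative assumptions, not the paper's actual features or models.

```python
# Contrast a retrospective (held-out subjects) split with a
# prospective (held-out future week) split on synthetic data.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic daily wearable records: one row per user per day.
n_users, n_days = 200, 120
df = pd.DataFrame({
    "user_id": np.repeat(np.arange(n_users), n_days),
    "date": np.tile(pd.date_range("2020-01-01", periods=n_days), n_users),
    "resting_hr": rng.normal(65, 5, n_users * n_days),
    "steps": rng.normal(8000, 2500, n_users * n_days),
})
# Hypothetical label: days with elevated RHR are loosely tagged as ILI.
df["ili"] = (df["resting_hr"] + rng.normal(0, 5, len(df)) > 72).astype(int)

FEATURES = ["resting_hr", "steps"]

def auroc(train, test):
    """Fit on the training split and report AUROC on the test split."""
    model = make_pipeline(StandardScaler(), LogisticRegression())
    model.fit(train[FEATURES], train["ili"])
    return roc_auc_score(test["ili"], model.predict_proba(test[FEATURES])[:, 1])

# Retrospective design: hold out 20% of subjects; train and test
# cover the same calendar period.
held_out = rng.choice(n_users, size=n_users // 5, replace=False)
retro = auroc(df[~df["user_id"].isin(held_out)], df[df["user_id"].isin(held_out)])

# Prospective design: train on everything up to a cutoff date and
# evaluate on the following week, mimicking deployment on future data.
cutoff = df["date"].max() - pd.Timedelta(days=7)
prosp = auroc(df[df["date"] <= cutoff], df[df["date"] > cutoff])

print(f"retrospective (held-out subjects) AUROC: {retro:.3f}")
print(f"prospective (held-out future week) AUROC: {prosp:.3f}")
```

The prospective split is the harsher test: the model must generalize to a time period it has never seen, including any seasonal or behavioural drift, which is why deployment-focused performance can drop even when the retrospective numbers look strong.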