Microsoft Research has developed a technology that uses consumer smartphone cameras for contact-free physiological measurement, with applications in telehealth and beyond.

Over the past decade, researchers have discovered that increasingly available webcams and cellphone cameras combined with AI algorithms can be used as effective health sensors. These methods involve measurement of very subtle changes in the appearance of the body across time, in many cases changes imperceptible to the unaided human eye, to recover physiological information.

A team of researchers from Microsoft Research, University of Washington, and OctoML has collaborated to create an innovative video-based, on-device optical cardiopulmonary vital sign measurement approach. The approach uses everyday camera technology (such as webcams and mobile devices) and a novel convolutional attention network, called MTTS-CAN, to make real-time cardiopulmonary measurements possible on mobile platforms with state-of-the-art accuracy.

Physiological processes such as blood flow and breathing change the appearance of the body very subtly over time. A smartphone camera can pick up these changes in the light reflected from the body, and the resulting variations in pixel intensity over time can be used to recover their underlying sources (namely, a person’s pulse and respiration). Using optical models grounded in the knowledge of these physiological processes, a video of a person can be processed to determine their pulse rate, respiration, and even the concentration of oxygen in their blood.
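To make the idea of recovering a pulse from pixel intensities concrete, here is a minimal sketch in Python. It is not the MTTS-CAN network described in this article; it is a much simpler classical baseline that looks for the dominant frequency in a brightness trace. The function name `estimate_rate_bpm`, the frequency bands, and the synthetic trace (a stand-in for the average skin-pixel brightness of each video frame) are all assumptions made for illustration.

```python
import numpy as np

def estimate_rate_bpm(signal, fps, low_hz=0.7, high_hz=3.0):
    """Return the strongest frequency in [low_hz, high_hz], in cycles per minute.

    `signal` is assumed to hold one brightness value per video frame.
    """
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()                       # remove the constant skin-tone offset
    window = np.hanning(len(x))            # taper the ends to limit spectral leakage
    power = np.abs(np.fft.rfft(x * window)) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fps)
    band = (freqs >= low_hz) & (freqs <= high_hz)
    peak_hz = freqs[band][np.argmax(power[band])]
    return 60.0 * peak_hz

# Synthetic stand-in for a real recording (illustrative values only):
# 20 seconds at 30 frames per second, with a 1.2 Hz pulse component,
# a 0.25 Hz breathing component, and a little sensor noise.
fps = 30.0
t = np.arange(600) / fps
rng = np.random.default_rng(0)
trace = (120.0
         + 0.5 * np.sin(2 * np.pi * 1.2 * t)
         + 2.0 * np.sin(2 * np.pi * 0.25 * t)
         + rng.normal(0.0, 0.2, t.size))

pulse_bpm = estimate_rate_bpm(trace, fps)                           # pulse band
breath_bpm = estimate_rate_bpm(trace, fps, low_hz=0.1, high_hz=0.5)  # breathing band
print(f"pulse: {pulse_bpm:.1f} per minute, breathing: {breath_bpm:.1f} per minute")
```

The synthetic trace was built with a pulse of 72 beats per minute and 15 breaths per minute, so the two estimates should land close to those values, within the 3-per-minute frequency resolution that a 20-second window allows. Real video is far noisier (motion, lighting changes, compression), which is the kind of difficulty a learned model is meant to handle.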

The technology can be used for everyday purposes such as fitness and well-being, as well as for clinical applications. For everyday consumers, it could make home monitoring and fitness tracking more convenient. Your treadmill or smart at-home fitness equipment could, for example, continuously track your vitals during your run without you needing to wear a device or sync the data. In clinical contexts, camera-based measurements could enable a cardiologist to more objectively analyze a patient’s heart health over a video call.

Perhaps the most obvious application for camera-based physiological sensing is in telehealth. COVID-19 has been linked to an increased risk of myocarditis and other serious cardiac (heart) conditions, and experts are suggesting that particular attention be given to cardiovascular and pulmonary protection during treatment.

In most telehealth scenarios, however, physicians lack access to objective measurements of a patient’s condition because they cannot capture signals such as the patient’s vital signs. Many patients worry about the quality of the diagnosis and care they can receive without objective measurements. Ubiquitous sensing could help transform how telehealth is conducted, and it could also contribute to establishing telehealth as a mainstream form of healthcare.

Finally, the ability to run at a high frame rate enables opportunistic sensing (for example, obtaining measurements each time you look at your phone) and helps capture waveform dynamics that could be used to detect atrial fibrillation and hypertension and to measure heart rate variability. These applications require high frame rates (at least 100 Hz) to yield precise measurements of the waveform dynamics.
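A rough way to see why frame rate matters for waveform timing: if each heartbeat can only be located to the nearest video frame, the measured interval between two beats can be off by up to one frame period. The Python sketch below illustrates that arithmetic. The function name `quantized_intervals_ms` and the beat times are made up for illustration, and the sketch assumes no interpolation between frames, which real systems may use to do better.

```python
import numpy as np

def quantized_intervals_ms(beat_times_s, fps):
    """Inter-beat intervals (ms) when each beat is snapped to the nearest frame."""
    frames = np.round(np.asarray(beat_times_s, dtype=float) * fps)
    return np.diff(frames / fps) * 1000.0

# Hypothetical beat times in seconds (illustrative values, not measured data).
beats = [0.0, 0.812, 1.647, 2.458, 3.291]
true_ms = np.diff(beats) * 1000.0

# Largest interval error at a typical webcam rate and at 100 Hz.
err_30 = np.max(np.abs(quantized_intervals_ms(beats, 30.0) - true_ms))
err_100 = np.max(np.abs(quantized_intervals_ms(beats, 100.0) - true_ms))
print(f"worst interval error: {err_30:.1f} ms at 30 fps, {err_100:.1f} ms at 100 fps")
```

One frame lasts about 33 ms at 30 frames per second but only 10 ms at 100, so the worst-case timing error shrinks accordingly. Beat-to-beat variation in a resting heart is often only tens of milliseconds, which is why coarse frame timing can blur it.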

All the details can be found in the team’s paper, “Multi-Task Temporal Shift Attention Networks for On-Device Contactless Vitals Measurement,” which has been accepted at the 34th Conference on Neural Information Processing Systems (NeurIPS 2020) and will be presented in a Spotlight talk on Monday, December 7, from 6:15 PM to 6:30 PM (PT).