For most of my adult life, I didn’t really think about sleep: It was just an activity that my body required, for about six to eight hours a night, in order to not feel like garbage the next day. Rarely did I pause to consider the quality of my rest or whether my sleep patterns were “normal.” That is, until I was gifted a Fitbit Charge 3.
Now, just about every morning, I open my phone’s Fitbit app and look at the sleep report, which tells me how long I slept and how much time I spent in various sleep stages. Typically, the ritual of poring over my metrics sparks a mix of curiosity (is two hours of REM sleep normal?), alarm (wait, I woke up a dozen times?), and that feeling I get when I read my horoscope: This seems right if I don’t think about it too hard.
I wasn’t sure whether my skepticism was justified, so I did some research. Turns out, sleep is complicated, and I was right to doubt that the device on my wrist is always giving me accurate data. And even if it is, not even sleep researchers could tell me what to do with the information.
Asleep or awake?
Although there’s a lot we don’t understand about sleep, we know that it’s incredibly important: Getting too little on a regular basis is associated with a range of health problems, including diabetes, heart disease, and depression. Animal research suggests that being sleep-deprived for long enough can literally kill you.
Sleep researchers generally agree that most adults need about seven to nine hours of sleep per night to stay in good health. Studies also show that we need high-quality sleep, which means falling asleep relatively quickly, sleeping soundly through the night, and spending most of the time that we’re in bed asleep.
Early versions of Fitbit (as well as two of the current models, Fitbit Inspire and Fitbit Ace 2) focused on providing this sort of basic sleep information by using a tri-axial accelerometer to measure the movement of your wrist, both up and down and side-to-side, as you rest. As Fitbit lead sleep scientist Conor Heneghan put it, “if you’re moving a lot, you’re highly unlikely to be asleep.”
It’s simple logic, and indeed the same general method is used frequently in clinical research, where patients wear a device called an actigraph to track their wrist motion while they sleep. Scientists then use algorithms to translate that motion into basic sleep/wake patterns. A 2011 review paper found that in healthy people, clinical actigraphy devices were able to correctly identify actual sleep as sleep between 87 and 99 percent of the time.
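To make the idea of translating motion into sleep/wake patterns concrete, here is a minimal Python sketch. It is not Fitbit's algorithm or any published clinical one: the function name, the threshold of 40 movement counts, and the two-epoch smoothing window are all assumptions chosen purely for illustration.

```python
from typing import List


def classify_epochs(activity_counts: List[int],
                    wake_threshold: float = 40,
                    window: int = 2) -> List[str]:
    """Label each epoch 'sleep' or 'wake' from wrist-movement counts.

    Hypothetical rule: average the counts over the epoch and its
    neighbors, then call the epoch 'wake' if that average reaches
    the (made-up) threshold.
    """
    labels = []
    n = len(activity_counts)
    for i in range(n):
        start = max(0, i - window)
        end = min(n, i + window + 1)
        neighborhood = activity_counts[start:end]
        smoothed = sum(neighborhood) / len(neighborhood)
        labels.append("wake" if smoothed >= wake_threshold else "sleep")
    return labels


# Invented data: restless at first, then still, with one brief twitch.
counts = [120, 90, 10, 0, 0, 5, 0, 0, 150, 0, 0, 0]
labels = classify_epochs(counts)
print(labels)
```

With these invented numbers, the first three epochs come out as wake and the rest as sleep, because the single twitch is averaged away by its quiet neighbors. A wrist that never moves is always labeled asleep under this rule, whether or not its owner actually is.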
I couldn’t find a similar estimate for the accuracy of accelerometer-based smartwatches, but studies show these devices often compare favorably to their clinical counterparts, according to a review paper published earlier this year. “Overall, they are not so much dissimilar to standard actigraphy,” said Massimiliano de Zambotti, a research scientist in the Human Sleep Research Program at SRI International and lead author on that review.
Where all motion-based sleep trackers fall short is in their ability to detect wake: According to de Zambotti, most devices will only get it right about half the time. That’s because these devices assume that a person lying perfectly still is asleep, and anyone who’s ever had a night of insomnia will know that isn’t necessarily true. Because of this limitation, accelerometer-based sleep tracking tends to overestimate a person’s total time asleep, according to a 2016 review paper.
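The gap between "good at spotting sleep" and "bad at spotting wake" can be shown with a little arithmetic. The Python sketch below compares a hypothetical device against a reference recording, epoch by epoch. The labels are invented, and the function name is mine rather than anything from the studies mentioned here.

```python
def sensitivity_specificity(reference, device):
    """Return (share of true sleep labeled sleep, share of true wake labeled wake)."""
    if len(reference) != len(device):
        raise ValueError("recordings must cover the same epochs")
    sleep_total = sum(1 for r in reference if r == "sleep")
    wake_total = sum(1 for r in reference if r == "wake")
    sleep_hits = sum(1 for r, d in zip(reference, device)
                     if r == "sleep" and d == "sleep")
    wake_hits = sum(1 for r, d in zip(reference, device)
                    if r == "wake" and d == "wake")
    sensitivity = sleep_hits / sleep_total if sleep_total else float("nan")
    specificity = wake_hits / wake_total if wake_total else float("nan")
    return sensitivity, specificity


# Invented night: 8 epochs of real sleep, bracketed by 4 epochs of lying awake.
reference = ["wake"] * 2 + ["sleep"] * 8 + ["wake"] * 2
# The imagined device catches all the sleep but misses half of the wake.
device = ["wake", "sleep"] + ["sleep"] * 8 + ["sleep", "wake"]

sens, spec = sensitivity_specificity(reference, device)
print(sens, spec)
print(device.count("sleep"), "epochs scored as sleep vs", reference.count("sleep"), "actual")
```

In this made-up case the device identifies every epoch of real sleep, yet only half of the wake epochs, so it credits the sleeper with 10 epochs of sleep instead of 8. That is the overestimate in miniature.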
Rebecca Spencer, a neuroscientist at the University of Massachusetts Amherst who has studied the reliability of motion-based sleep trackers, noted that they can also be tricked into thinking a sleeping person is actually awake, if, say, the sleeper is lying in bed with a jumpy dog or restless partner.
“One big failure of all [motion-based] devices: They’re assuming there’s no movement of that wrist except for the person wearing it,” Spencer said.
Still, Spencer felt that for healthy adults, overall trends in how much you’re sleeping could be accurately captured with an accelerometer. So did Andrew Kubala, a PhD candidate at the University of Pittsburgh who recently led a study comparing six commercial smartwatches to an actigraph.
“For the general consumer, if they’re interested in their sleep patterns, I don’t see any issue with buying a [commercial] monitor,” he said. “And they’ll get a good estimate of their sleep.”
But total sleep is just the tip of the iceberg when it comes to understanding our slumber. That’s why newer models of Fitbit and other smartwatches harness more data, including heart rate, to provide insight into the entire sleep cycle.
The sleep cycle
Sleep is far more than a nightly lapse in consciousness. While we’re getting those z’s, the brain and body are doing a lot, cycling through four different sleep stages like a person moving up and down between four floors of a house.
Three of those stages are types of non-rapid eye movement (non-REM) sleep, which sleep scientists have rather unimaginatively dubbed NREM 1-3 (or just N1-N3). In these sleep stages, your heart rate, breathing, and brainwaves get progressively slower as you fall into a deeper and deeper slumber.
The N1 and N2 stages are what Fitbit refers to as “light sleep,” and together they account for most of our sleep on a typical night. The N3 stage, also known as “slow wave sleep” among researchers or “deep sleep” in your Fitbit app, accounts for a smaller fraction of your total sleep, but scientists consider it important for feeling refreshed the next day.
Then there’s REM sleep. During this stage, a person’s eyes move back and forth rapidly, their heart rate and blood pressure increase, their arm and leg muscles become temporarily paralyzed, and their brain activity becomes more similar to what’s seen in wakefulness. This is the sleep stage in which we’re most likely to dream. Research suggests that REM and deep sleep together play an important role in memory consolidation and stabilization.
The gold standard for mapping these sleep stages is a technique known as polysomography, where brainwaves, muscle activity, and eye movement are recorded throughout the night in a lab by placing electrodes all over a person’s body. Two or more professional scorers look at the resultant data, manually score the different sleep stages, and accept the scoring as valid when their agreement exceeds a certain threshold (often, around 90 percent).
A Fitbit obviously can’t sense your brainwaves. Instead, it uses algorithms that combine data on movement and heart rate, as well as demographic information like age and gender (which you enter into the app when you’re setting up your Fitbit) to approximate your nightly oscillations between the various stages. According to a Fitbit-funded study published in 2017, the tracker’s algorithms agree with polysomography around 70 percent of the time for light and REM sleep and 60 percent of the time for deep sleep.
Screenshot of a nightly hypnogram on my Fitbit app. Wow, look how much I slept! Screenshot: Maddie Stone
An independent validation study on the Fitbit Charge 2, led by de Zambotti of SRI, came to fairly similar results: The smartwatch agreed with polysomography 80 percent of the time on light sleep and 75 percent of the time on REM sleep, but the two saw just 50 percent agreement on deep sleep. Across the few published studies, the jury’s out on whether Fitbit’s algorithms over- or underestimate deep sleep, or whether there’s no such bias (as the company’s study says).
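For a sense of what "agreed with polysomography X percent of the time" can mean in practice, here is a rough Python sketch of an epoch-by-epoch comparison, broken out by stage. The studies themselves may compute agreement differently. The stage labels below are invented, and the function name is an assumption for illustration.

```python
from collections import Counter


def per_stage_agreement(psg, device):
    """For each stage in the reference scoring, return the share of its
    epochs that the device labeled with the same stage."""
    if len(psg) != len(device):
        raise ValueError("recordings must cover the same epochs")
    totals = Counter(psg)
    matches = Counter(p for p, d in zip(psg, device) if p == d)
    return {stage: matches[stage] / totals[stage] for stage in totals}


# Invented ten-epoch night, as scored by the lab and by a wrist tracker.
psg = ["light"] * 5 + ["deep"] * 2 + ["rem"] * 3
device = ["light", "light", "light", "light", "deep",
          "light", "deep",
          "rem", "rem", "light"]

agreement = per_stage_agreement(psg, device)
for stage, share in agreement.items():
    print(f"{stage}: {share:.0%}")
```

Here the tracker matches the lab on four of five light-sleep epochs, one of two deep-sleep epochs, and two of three REM epochs. A single overall percentage would hide the fact that deep sleep is where the two disagree most.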
What’s clear across all of this research is that Fitbit’s sleep staging data is, at best, a fuzzy approximation. To Zilu Liang, a researcher at the Kyoto University of Advanced Science who studies consumer wearables and digital health metrics, that’s no surprise.
“We have to measure a lot of bio-signals to understand sleep stages,” she said. “Fitbit only has two sources of information,” motion and heart rate. “I don’t think those two sources are sufficient to accurately infer sleep stages.”
But for the average person, the accuracy of this data shouldn’t really matter. That’s because there’s no scientifically established optimum for sleep architecture, that is, the amount and organization of the various stages. A 2017 meta-analysis that looked at nearly 300 studies to make recommendations about sleep quality had only two “consensus findings” when it comes to the sleep cycle: that for adults, getting too much REM sleep (more than about 40 percent of the total) is probably bad, and getting very little deep sleep (less than 5 percent of the total) is also probably bad.
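Those two consensus findings are simple enough to write down as a check. The Python sketch below uses the 40 percent and 5 percent figures quoted above. Everything else, including the function name, the stage keys, and the sample minutes, is hypothetical, and the result is only as trustworthy as the staging data fed into it.

```python
def stage_flags(minutes_by_stage):
    """Return (fraction of the night per stage, list of consensus flags)."""
    total = sum(minutes_by_stage.values())
    if total <= 0:
        raise ValueError("total sleep time must be positive")
    fractions = {stage: minutes / total
                 for stage, minutes in minutes_by_stage.items()}
    flags = []
    # Thresholds as described in the text: REM above about 40 percent,
    # deep sleep below 5 percent of total sleep.
    if fractions.get("rem", 0.0) > 0.40:
        flags.append("REM sleep above 40 percent")
    if fractions.get("deep", 0.0) < 0.05:
        flags.append("deep sleep below 5 percent")
    return fractions, flags


# Invented seven-hour night.
night = {"light": 240, "deep": 60, "rem": 120}
fractions, flags = stage_flags(night)
print({stage: round(share, 2) for stage, share in fractions.items()})
print(flags)
```

For this invented night, REM comes to roughly 29 percent and deep sleep to roughly 14 percent, so neither flag is raised. Between those two extremes, the research offers no target to aim for.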
“We really don’t have a precise way to measure sleep need, let alone to measure how much of a specific stage is needed,” Spencer said.
Michael Grandner, director of the Sleep and Health Research Program at the University of Arizona and a sleep expert on Fitbit’s advisory board, described the sleep staging data as “ballpark” and said that it’s intended to give folks “a window into what’s going on under the hood.”
“If they feel they’re not sleeping well, it gives them some objective indication that they’re not crazy,” he said. However, he emphasized that the data should not be used to self-diagnose, and that if you’re truly troubled by what you’re seeing, the best course of action is to “go talk to a sleep specialist.”
The sleep staging data is fuzzy, but if you’re going to look at it, it’s probably better to look at your averages than worry about night-to-night variation. Screenshot: Maddie Stone
While the science of sleep continues to develop, what’s an average smartwatch-wearer to make of all this information? If you’re a healthy adult and you don’t suffer from a sleep disorder, you can probably trust your Fitbit to do a pretty good job tracking your total sleep most of the time. If the device is wildly off, you’ll know it: For instance, one time I lay awake almost all night, and my Fitbit, tricked by my apparent lifelessness, told me I slept seven hours.
When it comes to the more fine-grained sleep staging data, focus on the big-picture trends available in your Fitbit app: Are the patterns consistent over time? How does your data compare with that of others of your age and gender? Keep in mind that this data is an approximation, created by a proprietary algorithm that could change at any point. That’s especially true for the brand-new “sleep score” feature, a 0-100 ranking that Fitbit now assigns your nightly slumber and that the company says is based on your heart rate, time asleep, and sleep staging data. Asked how meaningful this number is, Grandner simply said “time will tell.” And remember that not even the world’s top sleep researchers can tell you what is optimal or normal, so if you feel fine, you probably are!
If, on the other hand, you feel crappy and your smartwatch starts registering a change in your sleep patterns, the best course of action is to talk to your doctor about it. Commercial sleep trackers, after all, aren’t medical devices.
Ideally, in a world obsessed with productivity, trackers can help us all pay better attention to our sleep and think about how to improve it. But if you feel like monitoring your metrics is stressing you out more than it’s helping, try taking the smartwatch off for a few nights. Maybe you’ll sleep better knowing you won’t be graded on it in the morning.