
Wearables vs Sleep Labs: Can Consumer Devices Replace Polysomnography

  • Aaqifah Hilmi
  • Jun 3
  • 11 min read

Consumer sleep trackers such as smartwatches, smart rings, and under-mattress sensors have made sleep monitoring more accessible than ever. While these devices can track metrics like heart rate, blood oxygen levels, movement, and sleep stages, they cannot match the clinical accuracy of polysomnography (PSG), the gold standard for sleep assessment. Wearables excel at identifying trends and screening for conditions such as sleep apnea, often achieving high sensitivity, but they still produce false positives and lack direct measurements of brain activity and airflow. As a result, wearables are best viewed as complementary tools for sleep screening and long-term monitoring rather than replacements for professional sleep studies.


For decades, understanding what happens during sleep required a visit to a sleep laboratory. Patients would spend a night connected to an array of sensors measuring brain waves, eye movements, breathing patterns, oxygen levels, heart activity, and muscle movements. This comprehensive test, known as polysomnography (PSG), remains the gold standard for diagnosing sleep disorders.


Figure: Wearables vs Sleep Labs: Can Consumer Devices Replace Polysomnography (Source: SleepCRS)

Today, however, millions of people wake up to sleep scores generated by smartwatches, smart rings, and other wearable devices. These consumer technologies promise insights into sleep quality, recovery, breathing disturbances, and even potential signs of sleep apnea - all from the comfort of home. As these devices become more sophisticated and widely adopted, an important question emerges: can wearables deliver the same level of insight as a clinical sleep study?


The answer is nuanced. Modern wearables have made remarkable progress in tracking sleep-related metrics and can provide valuable long-term data that traditional sleep labs cannot easily capture. Yet they also operate with significant limitations, relying on indirect measurements rather than directly monitoring many of the physiological signals used in clinical diagnosis. Understanding where wearables excel, and where they fall short, is essential for anyone looking to use sleep data to improve their health.


What Are PSG, HSAT and Consumer Sleep Wearables?


Polysomnography (PSG) is the gold-standard sleep test performed in a sleep lab. It records multiple physiological signals, including EEG (brain waves), EOG (eye movements), EMG (muscle tone), ECG (heart), breathing airflow and effort, oxygen saturation (SpO₂), and snore audio, all synchronized overnight. Trained technicians score sleep stages (wake, light, deep, REM) and events such as apneas and leg movements. PSG is highly accurate but expensive, intrusive (many sensors), and usually limited to one night.


Home Sleep Apnea Tests (HSAT) use simplified Type III monitors at home. They typically record airflow, breathing effort, SpO₂ and pulse, but usually no EEG. HSAT devices (often prescription-based) are used to diagnose obstructive sleep apnea (OSA) by estimating the apnea-hypopnea index (AHI). AASM guidelines allow HSAT in uncomplicated patients with a high suspicion of OSA. HSATs have high sensitivity (~95–97%) for moderate-to-severe OSA but modest specificity (~60–80%). They focus only on breathing events, not full sleep architecture.¹


Consumer Sleep Trackers (CSTs) include wearable devices such as smartwatches, wristbands, smart rings, and “nearables” like mattress sensors and phones. They are non-prescription, use optical and motion sensors, and measure proxies: heart rate (via PPG), motion (accelerometer), SpO₂ (pulse oximetry), skin temperature, and sometimes ambient sounds. 


For example, Apple Watch derives its “breathing disturbances” metric from wrist accelerometer data, while Samsung's sleep apnea feature relies on overnight blood-oxygen readings; smart rings use PPG and temperature; under-mattress mats use ballistocardiography² and acoustic sensors. CSTs estimate sleep vs wake and infer sleep stages from heart-rate variability and movement. They can flag possible apnea by noticing repeated dips in SpO₂ or patterns of disturbed breathing, but cannot measure airflow or EEG. Regulatory bodies treat most CSTs as wellness devices: no FDA clearance is required unless a specific medical claim is made.


Current Capabilities of Wearable Sleep Trackers


Wearables combine several sensors to approximate sleep metrics:


  • Photoplethysmography (PPG): PPG is an optical technique that measures pulsatile changes in blood volume. From the PPG signal, devices derive heart rate (HR), heart rate variability (HRV), and often SpO₂. Changes in HR and HRV help infer sleep stages (e.g. lower HR in deep sleep) and breathing irregularities (via drops in pulse amplitude). SpO₂ sensors can additionally track oxygen desaturation events.


  • Accelerometer/Motion sensor: Tracks body movement to distinguish sleep from wake. Prolonged stillness is usually scored as “asleep,” which gives devices high sleep sensitivity but means they often miss quiet wakefulness, i.e. when you’re awake but lying still. Modern wearables also use gyroscopes to improve detection of posture and respiration-related motion (chest/abdominal movement).


  • Respiration surrogates: Some devices derive respiratory rate by analyzing oscillations in the PPG waveform or by measuring subtle chest/abdomen movement (via accelerometer or special bands). Others use microphones to detect snoring or breathing sounds. 


  • Temperature sensors: A few trackers measure skin or near-body temperature. This aids in circadian rhythm analysis and might help distinguish sleep phases.


  • Nearables: Contactless monitors such as under-mattress sensors use ballistocardiography and sound to measure heart rate, respiratory rate, movement and snoring, without any device worn on the body. They can track sleep continuously and unobtrusively.


Overall, wearables provide continuous, nightly data on sleep (often in 30-second or 1-minute epochs) without needing a lab. Battery life permitting, they can run for long periods and so build longitudinal health records.
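To make the heart-rate signals described above more concrete, here is a minimal sketch of how mean heart rate and one common HRV statistic (RMSSD, the root mean square of successive differences) can be computed from a list of inter-beat intervals. The function names and the sample values are illustrative assumptions; commercial devices use proprietary pipelines that are not described in this article.

```python
import math


def mean_heart_rate(ibi_ms):
    """Mean heart rate in beats per minute from inter-beat intervals (ms)."""
    if not ibi_ms:
        raise ValueError("need at least one inter-beat interval")
    return 60000.0 / (sum(ibi_ms) / len(ibi_ms))


def rmssd(ibi_ms):
    """Root mean square of successive differences between intervals (ms)."""
    if len(ibi_ms) < 2:
        raise ValueError("need at least two inter-beat intervals")
    diffs = [b - a for a, b in zip(ibi_ms, ibi_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Hypothetical inter-beat intervals in milliseconds (not real device data).
example_ibi = [1000, 1040, 980, 1020, 960]

print(f"Mean HR: {mean_heart_rate(example_ibi):.1f} bpm")
print(f"RMSSD:   {rmssd(example_ibi):.1f} ms")
```

In practice the intervals would first need artifact removal, since motion can corrupt the PPG signal they are derived from.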


Wearable Accuracy Compared to Polysomnography


Sleep/Wake Detection 


Studies consistently find that modern wearables detect sleep epochs with >90% sensitivity (few true sleep epochs are missed) but with much lower specificity (many epochs of motionless wakefulness are scored as sleep). For instance, a 2025 trial of six wearable devices found sleep sensitivity >90% for all devices, but specificity of only 29–52%.³
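As a rough illustration of how these two figures are obtained, the sketch below compares a device's epoch labels against PSG labels, treating sleep as the positive class. The label encoding ("S" for sleep, "W" for wake) and the toy sequences are assumptions made for the example, not data from the cited trial.

```python
def sleep_wake_agreement(psg, device):
    """Epoch-by-epoch sensitivity and specificity, with sleep as positive.

    psg, device: equal-length sequences of "S" (sleep) or "W" (wake),
    where psg is the reference scoring.
    """
    if len(psg) != len(device):
        raise ValueError("label sequences must have the same length")
    pairs = list(zip(psg, device))
    tp = sum(1 for p, d in pairs if p == "S" and d == "S")  # sleep scored as sleep
    fn = sum(1 for p, d in pairs if p == "S" and d == "W")  # sleep scored as wake
    tn = sum(1 for p, d in pairs if p == "W" and d == "W")  # wake scored as wake
    fp = sum(1 for p, d in pairs if p == "W" and d == "S")  # wake scored as sleep
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy example: 10 true sleep epochs followed by 5 true wake epochs.
psg_labels = list("SSSSSSSSSSWWWWW")
device_labels = list("SSSSSSSSSWSSSWW")

sens, spec = sleep_wake_agreement(psg_labels, device_labels)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Here the device misses one sleep epoch but scores three of the five wake epochs as sleep, which is the pattern of high sensitivity and low specificity described above.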


Thus, wearables catch most true sleep, but often miss brief awakenings or wakefulness (especially if the user is lying still). This leads to overestimated total sleep time and sleep efficiency. Across devices, the consensus is that commercial trackers are good at detecting sleep, but less effective at detecting wake. In practice, a consumer tracker might report “you slept 8.2 h with 87% efficiency,” whereas PSG might say 7.5 h and 78% if the person had multiple brief arousals that the wearable smoothed over.
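The effect on headline numbers can be sketched with a small calculation. The example below assumes 30-second epochs and a hypothetical 8-hour night in which the device scores 120 quiet-wake epochs as sleep; the counts are invented purely to show the arithmetic.

```python
EPOCH_SECONDS = 30  # assumed epoch length


def total_sleep_time_hours(hypnogram, epoch_seconds=EPOCH_SECONDS):
    """Total sleep time in hours from a sequence of "S"/"W" epoch labels."""
    sleep_epochs = sum(1 for label in hypnogram if label == "S")
    return sleep_epochs * epoch_seconds / 3600


def sleep_efficiency_pct(hypnogram):
    """Sleep efficiency: share of in-bed epochs scored as sleep, in percent."""
    if not hypnogram:
        raise ValueError("hypnogram is empty")
    sleep_epochs = sum(1 for label in hypnogram if label == "S")
    return 100 * sleep_epochs / len(hypnogram)


# Hypothetical night: 960 epochs in bed (8 h). The reference scoring finds
# 780 sleep epochs; the device labels 120 of the wake epochs as sleep.
psg_night = ["S"] * 780 + ["W"] * 180
device_night = ["S"] * 900 + ["W"] * 60

for name, night in (("PSG", psg_night), ("Device", device_night)):
    tst = total_sleep_time_hours(night)
    eff = sleep_efficiency_pct(night)
    print(f"{name}: {tst:.1f} h asleep, {eff:.1f}% efficiency")
```

Because both metrics depend only on the count of epochs scored as sleep, every wake epoch the device misses inflates total sleep time and efficiency together.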


Sleep Stage Accuracy


Because wearable devices do not measure brain activity (EEG), they cannot directly determine sleep stages. Instead, they estimate sleep stages using indirect signals such as heart rate, heart rate variability, movement, and blood oxygen patterns.


Studies have shown that wearables generally perform better at identifying deep sleep and REM sleep than light sleep or wakefulness. Many devices tend to overestimate the amount of deep sleep and underestimate light sleep when compared with polysomnography (PSG), the clinical gold standard.


More advanced devices that combine photoplethysmography (PPG) with motion sensors achieve higher accuracy than those relying on movement data alone, with some studies reporting agreement rates of approximately 75–76% for light sleep and over 90% for REM sleep. Even so, while sleep epochs are detected with high sensitivity, wearables still classify some wakefulness as sleep.


Despite these improvements, wearable devices still do not match the accuracy of PSG for detailed sleep staging. They can provide a useful estimate of overall sleep duration and approximate time spent in deep and REM sleep, but they cannot reliably reproduce the complete sleep architecture or distinguish between clinical sleep stages with the precision required for diagnosis.


It is also important to note that most wearable sleep algorithms are developed and validated using people with relatively regular sleep schedules. Accuracy can decline in individuals with fragmented sleep, frequent awakenings, shift work schedules, or multiple daytime naps, making sleep estimates less reliable in these populations.


Sleep Apnea and Respiratory Event Detection


Many modern wearable devices attempt to identify individuals who may be at risk of sleep apnea by monitoring changes in blood oxygen levels (SpO₂), breathing patterns, and related physiological signals.


Research shows that these devices generally have high sensitivity but lower specificity when screening for obstructive sleep apnea (OSA). A recent review found that consumer-grade oxygen-monitoring devices achieved an average sensitivity of approximately 93% and specificity of approximately 63% for detecting moderate-to-severe OSA. This means they are effective at identifying most people who truly have sleep apnea, but they may also incorrectly flag some individuals who do not have the condition.
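How many flagged users actually have the condition depends on how common it is in the group being screened. The sketch below applies Bayes' rule to the sensitivity and specificity quoted above; the 15% prevalence is an assumed figure chosen only for illustration and does not come from the cited review.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a screening test at a given prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)  # P(condition | flagged)
    npv = true_neg / (true_neg + false_neg)  # P(no condition | not flagged)
    return ppv, npv


# Sensitivity and specificity from the text; prevalence is an assumption.
ppv, npv = predictive_values(sensitivity=0.93, specificity=0.63, prevalence=0.15)
print(f"PPV: {ppv:.2f}, NPV: {npv:.2f}")
```

Under that assumed prevalence, fewer than one in three positive flags would correspond to a true case, while a negative result would be reassuring. This is why a flag is treated as a prompt for formal testing and not as a diagnosis.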


Validation studies of several commercially available screening technologies have reported sensitivities ranging from approximately 67% to 88% and specificities between 88% and 96% for detecting moderate-to-severe OSA. These results indicate that wearable-based screening can be useful for identifying people who may require further evaluation.


However, wearable devices are not considered diagnostic tools. A positive screening result should always be followed by a formal sleep assessment, such as a home sleep apnea test (HSAT) or an in-laboratory polysomnography (PSG). While wearables can help identify individuals at higher risk, they cannot definitively diagnose sleep apnea or determine its severity with the same level of confidence as clinical testing.


Regulatory and Intended Use


Most consumer sleep trackers are “wellness” products, not medical devices. The FDA generally does not regulate general-purpose activity and sleep trackers; its oversight applies when a product makes specific medical claims. In recent years, regulators have begun clarifying the boundaries:


  • Intended Use and Regulatory Approval

    Several consumer sleep-monitoring features have received regulatory clearance as over-the-counter screening tools for sleep apnea risk. However, these features are not approved to diagnose sleep apnea. Regulatory labeling typically states that they are intended to identify individuals who may be at increased risk and who should seek further medical evaluation. As a result, these technologies should be viewed as screening tools rather than replacements for formal sleep studies such as polysomnography (PSG).


  • Medical Device Regulations

    Regulatory agencies distinguish between wellness features and medical devices based on the claims made by the manufacturer. Features that provide general health or wellness information (e.g., sleep trends or oxygen saturation monitoring) may not require medical device approval. In contrast, products that claim to detect, diagnose, or guide treatment for a medical condition must undergo additional regulatory review and demonstrate clinical validity. Recent guidance for Software as a Medical Device (SaMD) has further emphasized the need for robust validation when health-related claims are made.


  • International Standards

    Similar regulatory frameworks exist in many countries. Device classification generally depends on the intended use, level of risk, and clinical claims being made. Products marketed for medical screening or diagnostic purposes are typically subject to stricter regulatory requirements than those intended solely for wellness or lifestyle tracking.


In summary, the regulatory context underscores that wearables currently serve as screening and monitoring tools. Official guidelines (e.g. AASM) insist that CST data not be used alone for diagnosis or treatment planning. However, experts acknowledge their utility in engaging patients and enhancing clinic visits with supplemental data. 


Clinical Workflows and Use Cases


Wearable devices are increasingly being used to screen for sleep disorders, prioritize patients for further testing, and monitor health over time. However, they are generally not used to make a final diagnosis.


  • Screening and Early Detection: Patients may wear a sleep tracker for several nights at home, allowing clinicians to review patterns such as low oxygen levels during sleep, frequent heart-rate fluctuations, or repeated apnea alerts. If these patterns consistently suggest a sleep disorder, the patient may be referred for a formal sleep study. For example, a person with high blood pressure who regularly experiences overnight oxygen drops may be recommended for an in-lab polysomnography (PSG) test.


  • Supporting Home Sleep Apnea Testing (HSAT): Some wearable devices are used alongside home sleep apnea tests to provide additional information, such as movement patterns, heart rate, or heart rate variability (HRV). This extra data can help clinicians interpret results more effectively, particularly when the primary HSAT recording is incomplete or inconclusive.


  • Long-Term Monitoring: Unlike PSG, which typically records a single night of sleep, wearables can collect data continuously over weeks or months. This makes them useful for monitoring long-term trends, including:

    • Treatment monitoring: Tracking changes in metrics such as HRV or respiratory rate to assess recovery and overall health.

    • Post-diagnosis follow-up: Individuals using treatments such as CPAP can monitor changes in sleep-related metrics and observe improvements over time.

    • Research and public health: Large-scale wearable datasets are helping researchers study sleep patterns and sleep disorders across populations.

    • Patient engagement: Many people find wearable sleep data motivating and informative. Even though the measurements are not perfect, they can encourage healthier sleep habits, weight management, and more productive discussions with healthcare providers.


Some healthcare providers are also beginning to use wearable data as an initial screening step before recommending more expensive and resource-intensive sleep studies. For example, a patient may be asked to wear an overnight oximeter ring or sleep tracker for several days. If the data shows repeated oxygen desaturations or other concerning patterns, an in-lab PSG may then be ordered to confirm the diagnosis and guide treatment. This "wearable-first, PSG-as-needed" approach may become more common as wearable technology continues to improve.
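A simplified sketch of such a triage step is shown below. It counts desaturation episodes against a trailing-median baseline, converts them to an hourly rate (in the spirit of an oxygen desaturation index), and flags a patient when enough nights exceed a threshold. The 3% drop, one-sample-per-second rate, 15 events/hour threshold, three-night rule, and all function names are assumptions for illustration, not a validated clinical protocol.

```python
from statistics import median


def count_desaturations(spo2, drop_pct=3.0, baseline_window=60):
    """Count episodes where SpO2 falls >= drop_pct below a trailing median.

    spo2: sequence of readings, assumed to be one per second.
    """
    events = 0
    in_event = False
    for i, value in enumerate(spo2):
        window = spo2[max(0, i - baseline_window):i]
        if not window:
            continue  # no baseline yet for the first sample
        below = median(window) - value >= drop_pct
        if below and not in_event:
            events += 1  # count only the start of each episode
        in_event = below
    return events


def events_per_hour(spo2, sample_seconds=1):
    """Desaturation episodes per hour of recording."""
    hours = len(spo2) * sample_seconds / 3600
    return count_desaturations(spo2) / hours


def should_refer(nightly_rates, threshold=15.0, min_nights=3):
    """Flag for a formal study if enough nights exceed the threshold."""
    return sum(1 for rate in nightly_rates if rate >= threshold) >= min_nights


# Synthetic trace: stable at 96% with two short dips (hypothetical data).
trace = [96] * 120 + [92] * 10 + [96] * 120 + [91] * 10 + [96] * 120
print(count_desaturations(trace))                   # episodes found
print(should_refer([18.9, 4.0, 16.2, 20.1, 3.5]))   # hypothetical nightly rates
```

Requiring several elevated nights before referral is one way to reduce the influence of a single noisy recording.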


Limitations of Wearables


Despite technological advances, key gaps remain:


  • No EEG/EOG: Wearable devices do not measure brain activity (EEG) or eye movements (EOG), which are the physiological signals used to determine sleep onset and classify sleep stages according to AASM criteria. As a result, all sleep staging performed by wearables is indirect and based on inferred patterns rather than direct measurement.


  • Lack of Airflow and Respiratory Effort data: Devices that rely primarily on SpO₂ or acoustic signals provide limited insight into the underlying cause of respiratory events. Without direct measurements of airflow and breathing effort, they cannot reliably distinguish between obstructive and central apneas. In addition, they may fail to detect events that disrupt sleep without causing significant oxygen desaturation, such as respiratory-related arousals.


  • Sensor Limitations: PPG signal quality can be compromised by movement (such as tossing and turning in bed), poor sensor contact, darker skin pigmentation, and cold extremities, which may reduce the accuracy of SpO₂ measurements. Moreover, many wearable devices stop recording during excessive movement, potentially missing data during apnea episodes when restlessness and thrashing are most pronounced.


  • Validation and Standards: Consumer devices often lack rigorous independent validation. Even among FDA-cleared features, there is variability in performance. Different devices use proprietary algorithms, making cross-comparison hard. A lack of standardized reporting also makes it challenging to judge reliability. 


  • Data Overload and Noise: Wearables produce large amounts of data. Clinicians may be unprepared to interpret it. False positives can also cause anxiety. Furthermore, sleep varies from night to night, so one odd night may not give an accurate picture overall.
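One simple way to keep a single unusual night from dominating the picture is to look at a rolling median across nights. The sketch below assumes a 7-night window and uses made-up total-sleep-time values; the window length is a design choice, not a recommendation from the sources cited here.

```python
from statistics import median


def rolling_median(values, window=7):
    """Median of the most recent `window` values at each position."""
    if window < 1:
        raise ValueError("window must be at least 1")
    result = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        result.append(median(values[start:i + 1]))
    return result


# Hypothetical nightly total sleep time in hours, with one short night.
nightly_tst_hours = [7.1, 6.9, 7.3, 4.2, 7.0, 7.2, 6.8]

smoothed = rolling_median(nightly_tst_hours)
print([round(v, 2) for v in smoothed])
```

A median is used here instead of a mean because it is less sensitive to one outlying night, which keeps the trend readable for both patients and clinicians.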


When is PSG Still Essential?


PSG remains essential for definitive diagnosis and management.


  • Diagnostic Confirmation: Any positive screening by a wearable (suspected apnea, unusual pattern) should be confirmed by PSG or HSAT before diagnosing OSA or other disorders. Guidelines require a formal sleep test to confirm apnea and determine its severity.


  • Complex Sleep Disorders: PSG is needed for insomnia with co-morbidities, parasomnias, narcolepsy, periodic limb movements, or when patient history is unclear. No wearable can diagnose restless legs syndrome (RLS), narcolepsy/cataplexy, or REM behavior disorder.


  • Treatment Planning and CPAP Titration: PSG determines the optimal CPAP pressure by observing apneas during sleep, a task wearables cannot do. It can also monitor oxygen and carbon dioxide levels more precisely than consumer devices.


  • Legal/occupational requirements: Certain certifications (e.g. pilot health exams) still mandate lab studies.


  • Populations for whom wearables are unreliable: Some patients (for example, those with a large neck circumference, high BMI, or a movement disorder) may not get reliable wearable data. PSG works in a controlled environment even if at-home sensing fails.


Recommendations and Conclusion


  • For Clinicians: Encourage patients to use wearables as a supplement, not a substitute. Review patient-generated sleep data critically and use it to prioritize who needs a sleep study. Consider validated FDA-cleared tools when available for screening. Continue to rely on PSG/HSAT for diagnosis. Educate patients that wearable alarms are alerts, not diagnoses.


  • For Consumers: Wear your tracker regularly for baseline and trend information. If your device flags possible sleep apnea (e.g. “breathing disturbances”), follow up medically. Do not change therapy based on a tracker alone. Recognize that trackers often overestimate sleep time. Use them for insight into sleep habits like sleep consistency and general improvements with lifestyle changes.


  • For Device Designers/Researchers: Focus on improving wake detection (reduce false sleep) and standardizing SpO₂ accuracy across skin tones. Validate algorithms in diverse populations and publish results. Continue developing hybrid approaches such as adding low-profile EEG or airflow sensors to wearables. Work with regulators to clarify pathways for next-gen sleep monitors.


In summary, modern wearables are a valuable “treasure trove” of nightly health data. They augment sleep medicine by making continuous monitoring accessible, but they do not replace the comprehensive insight of a clinical sleep study. As technology and algorithms improve, the gap will narrow, but for now, PSG remains the indispensable gold standard for diagnosing sleep disorders while wearables serve as practical screening and tracking tools.



References:

  1. da Silva Dantas, E. L., Stelzer, F. G., Bernardo, W. M., & Eckeli, A. L. (2025). Oximetry-based devices in diagnosis of obstructive sleep apnea: A systematic review and meta-analysis. Sleep Medicine Reviews, 83, 102139. https://doi.org/10.1016/j.smrv.2025.102139

  2. Ballistocardiography - an overview | ScienceDirect Topics. (n.d.). https://www.sciencedirect.com/topics/medicine-and-dentistry/ballistocardiography

  3. Schyvens, A.-M., Peters, B., Van Oost, N. C., Aerts, J.-M., Masci, F., Neven, A., Dirix, H., Wets, G., Ross, V., & Verbraecken, J. (2025). A performance validation of six commercial wrist-worn wearable sleep-tracking devices for sleep stage scoring compared to polysomnography. Sleep Advances, 6(2). https://doi.org/10.1093/sleepadvances/zpaf021 

  4. New research evaluates accuracy of sleep trackers. Sleep Foundation. (2024, February 9). https://www.sleepfoundation.org/sleep-news/new-research-evaluates-accuracy-of-sleep-trackers

  5. Svensson T, Madhawa K, Nt H, Chung UI, Svensson AK. Validity and reliability of the Oura Ring Generation 3 (Gen3) with Oura sleep staging algorithm 2.0 (OSSA 2.0) when compared to multi-night ambulatory polysomnography: A validation study of 96 participants and 421,045 epochs. Sleep Med. 2024 Mar;115:251-263. doi: 10.1016/j.sleep.2024.01.020. Epub 2024 Jan 26. PMID: 38382312. 

  6. Estimating breathing disturbances and sleep apnea risk from Apple Watch. (n.d.-b). https://www.apple.com/health/pdf/sleep-apnea/Sleep_Apnea_Notifications_on_Apple_Watch_September_2024.pdf 

  7. Lugten, E. (2026, January 12). Withings sleep rx: Allowing sleep specialists to make house calls. Withings USA. https://www.withings.com/us/en/health-solutions/insight-hub/withings-sleep-rx-allowing-sleep-specialists-to-make-house-calls

  8. Khosla, S., Deak, M., Gault, D. et al. Consumer Sleep Technology: An American Academy of Sleep Medicine Position Statement. J Clin Sleep Med 14, 877–880 (2018). https://doi.org/10.5664/jcsm.7128 
