Jeong-Whun Kim, Seok Kim, Borim Ryu, Wongeun Song, Ho-Young Lee, Sooyoung Yoo.
Abstract
Well-defined, large-volume polysomnographic (PSG) data can identify subgroups and predict outcomes of obstructive sleep apnea (OSA). However, current PSG data are scattered across numerous sleep laboratories and stored in different formats in electronic health records (EHRs). This study therefore aimed to convert EHR PSG data into a standardized data format, the Observational Medical Outcomes Partnership (OMOP) common data model (CDM). We extracted the PSG data of a university hospital for the period from 2004 to 2019. We designed and implemented an extract-transform-load (ETL) process to transform the PSG data into the OMOP CDM format and verified the data quality through expert evaluation. We converted the data of 11,797 sleep studies into the CDM, adding 632,841 measurements and 9,535 observations to the existing CDM database. Of 86 PSG parameters, 20 were mapped to the CDM standard vocabulary; the remaining 66 could not be mapped, so new custom standard concepts were created for them. We validated the conversion and the usefulness of the PSG data through patient-level prediction analyses on the CDM data. We believe that this study represents the first CDM conversion of PSG data. In the future, CDM transformation will enable network research in sleep medicine and contribute to more relevant clinical evidence.
Year: 2021 PMID: 33782494 PMCID: PMC8007756 DOI: 10.1038/s41598-021-86564-w
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Conversion of polysomnography into the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) tables.
Polysomnographic parameters included in the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) transformation.
| Category | Polysomnographic parameters |
|---|---|
| Body measurement | Body height (cm), Body weight (kg), Body mass index (BMI), Neck circumference (cm), Waist circumference (cm), Hip circumference (cm), Waist/hip ratio |
| Sleep summary | Sleep efficiency (SE) (%), Sleep latency (SL) (min), Sleep period time (SPT) (min), Total sleep time (TST) (min), Total time analyzed (time in bed, TIB) (min), Wake time after sleep onset (WASO) (min), REM latency from sleep onset |
| Sleep stage | % stage 1 non-rapid eye movement (NREM), % stage 2 NREM, % stage 3 NREM, % stage REM, Time spent during REM (min) |
| Respiratory events | Respiratory disturbance index (RDI), Apnea hypopnea index (AHI) (/h), Apnea index (AI) (/h), Central apnea index (/h), Mixed apnea index (/h), Obstructive apnea index (/h), Hypopnea index (HI) (/h), Hypopnea Index with oxygen desaturation (/h), Hypopnea Index without oxygen desaturation (/h), AHI during supine (/h), AHI during left lateral (/h), AHI during right lateral (/h), AHI during prone (/h), AHI during NREM (/h), AHI during REM (/h), Respiratory effort-related arousal (RERA) |
| Duration of apnea or hypopnea | Longest apnea duration (second), Mean apnea duration (second), Mean hypopnea duration (second), Mean total apnea and hypopnea duration (second) |
| Sleep position | Time spent during Supine position (min), % Time spent during Supine position (%), Time spent during Left Lateral position (min), % Time spent during Left Lateral position (%), Time spent during Right Lateral position (min), % Time spent during Right Lateral position (%), Time spent during Prone position (min), % Time spent during Prone position (%) |
| Arousal | Number of awakenings, Respiratory arousal, Spontaneous arousal, LM with arousals (/h), Periodic limb movement (PLM) arousal |
| Limb movement | Limb movement index (/h), Periodic limb movement index (PLMI) |
| Snoring | Average snoring episode duration (min), Longest snoring episode (min), Number of snoring episodes, Snoring percent time (%), Snoring time (min) |
| Oxygen statistics | %Time of saturation < 60%, %Time of saturation < 70%, %Time of saturation < 80%, %Time of saturation < 90%, Waking oxygen saturation (%), Average oxygen saturation during sleep (%), Lowest oxygen saturation (%), Oxygen desaturation index (ODI) |
| CPAP pressure | Titrated pressure (cmH2O) |
| Questionnaire | Epworth sleepiness scale, Pittsburgh sleep quality index |
| Multiple sleep latency test | REM latency #1 (min), REM latency #2 (min), REM latency #3 (min), REM latency #4 (min), REM latency #5 (min), Sleep latency #1 (min), Sleep latency #2 (min), Sleep latency #3 (min), Sleep latency #4 (min), Sleep latency #5 (min), Mean sleep latency (min) |
| Apnea level manometry test | % Retroglossal obstruction |
| Friedman staging | Tonsil grade, Mallampati grade, Friedman stage |
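The ETL step described in the abstract, in which PSG parameters are loaded into the CDM as measurements and unmapped parameters receive custom concepts, might be sketched as follows. This is a minimal illustration and not the authors' implementation: the `CONCEPT_MAP` lookup, the `MeasurementRow` class, the `transform_psg_record` function, and every concept ID are hypothetical. The only conventions assumed are the column names of the OMOP CDM MEASUREMENT table and the OMOP practice of giving locally defined concepts IDs above 2,000,000,000.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical lookup: PSG source name -> (concept_id, unit source value).
# Every ID here is a placeholder, not a real vocabulary entry. IDs above
# 2,000,000,000 follow the OMOP convention for locally defined concepts.
CONCEPT_MAP = {
    "Apnea hypopnea index (AHI)": (2_000_000_001, "/h"),
    "Sleep efficiency (SE)": (2_000_000_002, "%"),
    "Total sleep time (TST)": (2_000_000_003, "min"),
}


@dataclass
class MeasurementRow:
    """Subset of the OMOP CDM MEASUREMENT columns used in this sketch."""
    person_id: int
    measurement_concept_id: int
    measurement_date: date
    value_as_number: float
    unit_source_value: str
    measurement_source_value: str


def transform_psg_record(person_id, study_date, params):
    """Turn one PSG report (name -> value) into MEASUREMENT rows.

    Returns the rows plus the names that had no mapping, so unmapped
    parameters are surfaced for review rather than silently dropped.
    """
    rows, unmapped = [], []
    for name, value in params.items():
        mapping = CONCEPT_MAP.get(name)
        if mapping is None:
            unmapped.append(name)
            continue
        concept_id, unit = mapping
        rows.append(
            MeasurementRow(person_id, concept_id, study_date, float(value), unit, name)
        )
    return rows, unmapped


rows, unmapped = transform_psg_record(
    person_id=1,
    study_date=date(2019, 6, 1),
    params={"Apnea hypopnea index (AHI)": 12.5, "Snoring time (min)": 40},
)
print(len(rows), unmapped)
```

Returning the unmapped names separately mirrors the situation reported in the abstract, where most parameters had no standard concept and needed a decision before loading.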
Demographic characteristics of all sleep tests converted into the OMOP CDM. Sleep tests performed from February 2004 to June 2019 were extracted, transformed, and loaded into the OMOP CDM.
| Characteristics | Number of records: n (%) | Number of persons: n |
|---|---|---|
| Total | 11,392 | 9577 |
| Male | 8363 (73.4) | 6829 |
| Female | 3029 (26.6) | 2748 |
| ≤ 9 | 205 (1.8) | 190 |
| 10s | 385 (3.4) | 368 |
| 20s | 565 (5.0) | 528 |
| 30s | 1229 (10.8) | 1063 |
| 40s | 2230 (19.6) | 1833 |
| 50s | 2849 (25.0) | 2355 |
| 60s | 2348 (20.6) | 2016 |
| 70s | 1226 (10.8) | 1065 |
| 80s | 346 (3.0) | 313 |
| 90s | 9 (0.1) | 8 |
| 2004 | 319 (2.8) | 288 |
| 2005 | 458 (4.0) | 398 |
| 2006 | 546 (4.8) | 495 |
| 2007 | 702 (6.2) | 600 |
| 2008 | 639 (5.6) | 547 |
| 2009 | 605 (5.3) | 528 |
| 2010 | 604 (5.3) | 523 |
| 2011 | 647 (5.7) | 549 |
| 2012 | 677 (5.9) | 582 |
| 2013 | 685 (6.0) | 600 |
| 2014 | 860 (7.5) | 751 |
| 2015 | 1014 (8.9) | 862 |
| 2016 | 1067 (9.4) | 972 |
| 2017 | 1023 (9.0) | 958 |
| 2018 | 1035 (9.1) | 1010 |
| 2019 | 511 (4.5) | 508 |
| OSA severity* | 11,250 | |
| AHI < 5 | 3209 (28.5) | 3156 |
| Mild OSA (5 ≤ AHI < 15) | 2681 (23.8) | 2622 |
| Moderate OSA (15 ≤ AHI < 30) | 2167 (19.3) | 2091 |
| Severe OSA (AHI ≥ 30) | 3193 (28.5) | 3001 |
*The prevalence of each OSA severity level was calculated from the apnea-hypopnea index (AHI), using only records with AHI values.
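The severity bands in the table above can be written as a small classifier. The sketch below assumes the conventional cut-offs of 5, 15, and 30 events per hour that the table rows use; the function name `classify_osa_severity` is illustrative and does not come from the source. Records without an AHI value return `None`, in line with the footnote that only records with AHI values were counted.

```python
from typing import Optional


def classify_osa_severity(ahi: Optional[float]) -> Optional[str]:
    """Map an apnea-hypopnea index (events per hour) to a severity band.

    Returns None when AHI is missing, so such records can be excluded
    from prevalence counts.
    """
    if ahi is None:
        return None
    if ahi < 0:
        raise ValueError("AHI cannot be negative")
    if ahi < 5:
        return "AHI < 5"
    if ahi < 15:
        return "Mild OSA"
    if ahi < 30:
        return "Moderate OSA"
    return "Severe OSA"


for value in (2.0, 5.0, 14.9, 30.0, None):
    print(value, classify_osa_severity(value))
```

Each lower bound is inclusive, so an AHI of exactly 5, 15, or 30 falls into the higher band.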
Figure 2. Attrition during model development for the best-performing prediction setting.
Prediction model performance on the test data set. The "All covariates" setting used all OMOP CDM variables, including the polysomnography parameter concepts; the "PSG only covariates" setting used only gender, age group, and the polysomnography parameter concepts to develop and train the prediction model.
| Covariate setting | Model | Target size (Test) | Outcome count (Test) | Outcome rate (%) | AUC | AUPRC |
|---|---|---|---|---|---|---|
| All covariates | Random forest | 639 | 71 | 11.11 | 0.751 | 0.289 |
| All covariates | Gradient boosting machine | 483 | 56 | 11.59 | 0.700 | 0.250 |
| All covariates | Lasso logistic regression | 640 | 71 | 11.09 | 0.672 | 0.212 |
| PSG only covariates | Random forest | 638 | 71 | 11.13 | 0.654 | 0.213 |
| PSG only covariates | Gradient boosting machine | 437 | 50 | 11.44 | 0.630 | 0.170 |
| PSG only covariates | Lasso logistic regression | 482 | 56 | 11.62 | 0.598 | 0.164 |
AUC: area under the receiver operating characteristic curve; AUPRC: area under the precision-recall curve.
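The outcome rate column in the table above is the outcome count divided by the target size, expressed as a percentage. The sketch below recomputes it from the reported counts and also relates each AUPRC to that rate, on the general grounds that a classifier with no skill is expected to score an AUPRC close to the outcome prevalence. The helper name `outcome_rate` and the ratio labelled `lift` are illustrative additions, not quantities reported by the authors.

```python
def outcome_rate(outcome_count: int, target_size: int) -> float:
    """Outcome rate in percent: outcomes divided by the target population."""
    if target_size <= 0:
        raise ValueError("target_size must be positive")
    return 100.0 * outcome_count / target_size


# (covariate setting, model, target size, outcome count, AUPRC),
# copied from the performance table above.
RESULTS = [
    ("All covariates", "Random forest", 639, 71, 0.289),
    ("All covariates", "Gradient boosting machine", 483, 56, 0.250),
    ("All covariates", "Lasso logistic regression", 640, 71, 0.212),
    ("PSG only covariates", "Random forest", 638, 71, 0.213),
    ("PSG only covariates", "Gradient boosting machine", 437, 50, 0.170),
    ("PSG only covariates", "Lasso logistic regression", 482, 56, 0.164),
]

for setting, model, target, outcomes, auprc in RESULTS:
    rate = outcome_rate(outcomes, target)
    # Ratio of AUPRC to prevalence; values near 1 suggest little gain
    # over a no-skill baseline.
    lift = auprc / (rate / 100.0)
    print(f"{setting:20s} {model:27s} rate={rate:5.2f}%  AUPRC/prevalence={lift:4.2f}")
```

Because the outcome is rare (about 11% in every setting), reading AUPRC against this baseline gives a fairer picture than the raw value alone.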
Top 20 predictors selected from random forest model. The polysomnography parameters are indicated in bold.
| No | Covariate name | Importance | Covariate mean with outcome | Covariate mean with no outcome |
|---|---|---|---|---|
| 1 | drug_era group during day -7 through 0 days relative to index: Synthetic antispasmodics, amides with tertiary amines | 0.008 | 0.021 | 0.001 |
| 2 | measurement value during day -180 through 0 days relative to index: Triglyceride [Mass/volume] in Serum or Plasma (milligram per deciliter) | 0.008 | 35.158 | 13.156 |
| 3 | measurement value during day -180 through 0 days relative to index: Systolic blood pressure (millimeter mercury column) | 0.007 | 43.961 | 24.646 |
| 4 | | 0.006 | 11.496 | 7.281 |
| 5 | measurement value during day -30 through 0 days relative to index: Gamma glutamyl transferase [Enzymatic activity/volume] in Serum or Plasma (unit per liter) | 0.006 | 5.236 | 1.530 |
| 6 | measurement value during day -180 through 0 days relative to index: Diastolic blood pressure (millimeter mercury column) | 0.006 | 25.845 | 14.697 |
| 7 | drug_era group during day -7 through 0 days relative to index: tiropramide | 0.006 | 0.021 | 0.001 |
| 8 | | 0.006 | 11.496 | 7.281 |
| 9 | | 0.006 | 1.040 | 0.394 |
| 10 | | 0.005 | 11.496 | 7.281 |
| 11 | measurement value during day -7 through 0 days relative to index: Gamma glutamyl transferase [Enzymatic activity/volume] in Serum or Plasma (unit per liter) | 0.005 | 2.923 | 0.726 |
| 12 | | 0.005 | 74.268 | 84.735 |
| 13 | | 0.005 | 1.040 | 0.394 |
| 14 | | 0.005 | 1.040 | 0.394 |
| 15 | | 0.005 | 111.030 | 100.081 |
| 16 | | 0.004 | 11.716 | 7.616 |
| 17 | | 0.004 | 22.177 | 18.561 |
| 18 | | 0.004 | 74.268 | 84.735 |
| 19 | drug_era group during day -30 through 0 days relative to index: tiropramide | 0.004 | 0.021 | 0.002 |
| 20 | drug_era group during day -180 through 0 days relative to index: tiropramide | 0.004 | 0.028 | 0.004 |