Victor Castro, Yuanyuan Shen, Sheng Yu, Sean Finan, Cindy Ta Pau, Vivian Gainer, Candace C Keefe, Guergana Savova, Shawn N Murphy, Tianxi Cai, Corrine K Welt.
Abstract
BACKGROUND: Polycystic ovary syndrome (PCOS) is a heterogeneous disorder because of the variable criteria used for diagnosis. Therefore, International Classification of Diseases 9 (ICD-9) codes may not accurately capture the diagnostic criteria necessary for large-scale PCOS identification. We hypothesized that use of electronic medical record text and data would more specifically capture PCOS subjects.
Year: 2015 PMID: 26510685 PMCID: PMC4625743 DOI: 10.1186/s12958-015-0115-z
Source DB: PubMed Journal: Reprod Biol Endocrinol ISSN: 1477-7827 Impact factor: 5.211
Polycystic ovary syndrome related signs, symptoms, comorbidities, medication, laboratory results, ultrasound findings and other phenotypes abstracted from the medical record to inform feature selection and model training for natural language processing (NLP) analysis
| Feature | Parameter | Source |
|---|---|---|
| PCO morphology | Ovarian volume >10 | Pelvic ultrasound |
| PCO morphology | ≥12 follicles or PCO morphology in text | Pelvic ultrasound |
| Hyperandrogenism | Elevated testosterone, DHEAS or androstenedione | Laboratory data |
| Hyperandrogenism | Hirsutism | Note |
| Hyperandrogenism | Ferriman Gallwey Score | Physical exam |
| Hyperandrogenism | Acne | Physical exam or note |
| Hyperandrogenism | Alopecia, Hair loss, balding | Physical exam or note |
| Irregular menses | Cycle length | Note |
| Irregular menses | Irregular menses, oligomenorrhea, amenorrhea, etc. | Note |
| Hyperandrogenism | Clitoromegaly | Physical exam |
| Associated Features | | |
| Acanthosis Nigricans | Acanthosis | Physical exam |
| Gestational Diabetes | Gestational diabetes | Note |
| Infertility | Anovulatory infertility | Note |
| Obesity | Obesity | Physical exam or note |
| Type 2 diabetes | Type 2 diabetes | Laboratory data or note |
| Pertinent Negatives | | |
| Excessive exercise | Exercise history | Note |
| Chronic opioid or drug use | Substance history | Note |
| Hypothalamic amenorrhea | BMI or Hypothalamic amenorrhea history | Physical exam or note |
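To illustrate how note-derived features from the table above might be turned into model inputs, the following is a minimal sketch of keyword matching over free text. The term lists are copied from the "Parameter" column, but the dictionary `FEATURE_TERMS`, the function `find_features`, and the matching strategy are all assumptions made for illustration; this is not the study's NLP pipeline.

```python
import re

# Hypothetical term lists, taken from the "Parameter" column of the table above.
# Illustrative only: not the dictionary used in the study.
FEATURE_TERMS = {
    "hirsutism": ["hirsutism"],
    "acne": ["acne"],
    "alopecia": ["alopecia", "hair loss", "balding"],
    "irregular_menses": ["irregular menses", "oligomenorrhea", "amenorrhea"],
    "acanthosis": ["acanthosis"],
}


def find_features(note_text):
    """Return the set of feature names with at least one term in the note."""
    text = note_text.lower()
    found = set()
    for feature, terms in FEATURE_TERMS.items():
        for term in terms:
            # Whole-word (or whole-phrase) match, case-insensitive via lower().
            if re.search(r"\b" + re.escape(term) + r"\b", text):
                found.add(feature)
                break
    return found


if __name__ == "__main__":
    print(sorted(find_features("Patient reports hair loss and oligomenorrhea.")))
```

A sketch like this ignores negation ("denies acne") and context (for example, "hypothalamic amenorrhea" is listed as a pertinent negative yet would match the amenorrhea term), which is one reason dedicated clinical NLP tools are normally used instead of plain keyword search.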
Comparison of true polycystic ovary syndrome (PCOS) on chart review in women with PCOS determined using ICD-9 codes or using an algorithm incorporating natural language processing and codified data
| Method | ICD-9 Code | PCOS Algorithm-definite | PCOS Algorithm-probable | p value |
|---|---|---|---|---|
| Number of Charts | 200 | 150 | 41 | |
| Chart Reviewed Definite PCOS (%) | 132 (66) | 98 (65) | 25 (61) | 0.2* |
| Chart Reviewed Probable PCOS (%) | 29 (14.5) | 33 (22) | 7 (17) | 0.2 |
| Not PCOS (%) | 17 (8.5) | 10 (7) | 8 (20) | 0.9 |
| Unable to Determine (%) | 22 (11) | 9 (6) | 1 (2) | 0.04 |
For the algorithm PCOS diagnoses, Definite or Probable were defined by probability cutoff levels. Chart-reviewed definite PCOS had at least two confirmatory diagnostic criteria to support the diagnosis, and probable had at least one confirmatory criterion.
*The p value for the algorithm was calculated using both the definite and probable categories.
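The chart-review rule in the footnote and the counts in the comparison table can be sketched as follows. The function names, the `"unconfirmed"` label for charts with no confirmatory criterion, and the "confirmed fraction" summary (definite plus probable over all charts) are assumptions introduced here for illustration; the table itself reports only the per-category counts and percentages.

```python
def chart_review_category(n_confirmatory_criteria):
    """Apply the footnote rule: >=2 criteria is definite, >=1 is probable."""
    if n_confirmatory_criteria >= 2:
        return "definite"
    if n_confirmatory_criteria >= 1:
        return "probable"
    # Hypothetical label: the footnote does not say how such charts were named.
    return "unconfirmed"


# Chart-review counts copied from the comparison table.
COUNTS = {
    "ICD-9 code": {"definite": 132, "probable": 29, "not_pcos": 17, "undetermined": 22},
    "Algorithm-definite": {"definite": 98, "probable": 33, "not_pcos": 10, "undetermined": 9},
    "Algorithm-probable": {"definite": 25, "probable": 7, "not_pcos": 8, "undetermined": 1},
}


def confirmed_fraction(counts):
    """Fraction of reviewed charts judged definite or probable PCOS."""
    total = sum(counts.values())
    return (counts["definite"] + counts["probable"]) / total


if __name__ == "__main__":
    for method, counts in COUNTS.items():
        print(f"{method}: {confirmed_fraction(counts):.3f} of {sum(counts.values())} charts")
```

Summing each column reproduces the "Number of Charts" row (200, 150 and 41), which is a quick consistency check on the transcribed counts.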
Demographics of subjects chosen for the refined datamart
| | | PCOS Cases | PCOS Controls | p value |
|---|---|---|---|---|
| N | | 6,295 | 59,456 | |
| | | Proportion | Proportion | |
| Gender | Female | 1.00 | 1.00 | |
| Age | 18-25 | 0.15 | 0.11 | |
| | 26-35 | 0.45 | 0.37 | |
| | 36-45 | 0.35 | 0.33 | |
| | 46-55 | 0.05 | 0.13 | |
| | 56-65 | 0.00 | 0.06 | 0.03 |
| Insurance | Private | 0.71 | 0.67 | |
| | Public-Medicaid | 0.05 | 0.08 | |
| | Public-Medicare | 0.01 | 0.02 | |
| | Public-Other | 0.08 | 0.11 | |
| | Other | 0.09 | 0.10 | |
| | Unknown | 0.05 | 0.03 | 0.9 |
| Race | White | 0.63 | 0.64 | |
| | Asian | 0.07 | 0.06 | |
| | Black | 0.08 | 0.08 | |
| | Hispanic | 0.11 | 0.10 | |
| | Other | 0.11 | 0.11 | 1.0 |
| Pregnancy | (Partners Hospital System) | 0.36 | 0.56 | 0.007 |
| Pap Smear | (lifetime history) | 0.29 | 0.21 | 0.3 |
| Smoker | (ever smoked) | 0.08 | 0.05 | 0.6 |
| Type 2 Diabetes | (lifetime history) | 0.08 | 0.02 | 0.1 |
| Hypertension | (lifetime history) | 0.10 | 0.09 | 1.0 |
| Women's Health Visit | (lifetime history) | 0.72 | 1.00 | <0.001 |
| | | Mean (SD) | Mean (SD) | |
| Age | current | 33.55 (7.23) | 36.93 (9.75) | <0.001 |
| BMI | lifetime max | 30.99 (9.02) | 26.85 (6.49) | <0.001 |
| | | Median (Q25, Q75) | Median (Q25, Q75) | |
| Observation Period Start | year, median (IQR) | 2003 (1998, 2008) | 2004 (1999, 2008) | |
| Observation Period End | year, median (IQR) | 2012 (2010, 2013) | 2011 (2010, 2012) | |
| Number of facts | count, median (IQR) | 186 (68, 428) | 226 (90, 524) | |
Controls were slightly older, with a lower BMI. Other factors were not different.
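One way to put the age and BMI differences on a common scale is a standardized mean difference computed from the mean and SD values in the table above. The sketch below uses the unweighted pooled-SD form; the function name and the choice of this particular effect-size formula are assumptions for illustration, and the source does not state which test produced the reported p values.

```python
import math


def standardized_mean_difference(mean_a, sd_a, mean_b, sd_b):
    """(mean_a - mean_b) divided by the root of the average of the two variances."""
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd


# Cases first, controls second; values copied from the demographics table.
smd_age = standardized_mean_difference(33.55, 7.23, 36.93, 9.75)
smd_bmi = standardized_mean_difference(30.99, 9.02, 26.85, 6.49)

if __name__ == "__main__":
    print(f"Age (cases minus controls): {smd_age:.2f}")
    print(f"BMI (cases minus controls): {smd_bmi:.2f}")
```

A negative value means cases had the lower mean (age), and a positive value means cases had the higher mean (lifetime maximum BMI).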