Kiely N James1, Michelle M Clark1, Brandon Camp1, Cyrielle Kint2, Peter Schols2, Sergey Batalov1, Benjamin Briggs1, Narayanan Veeraraghavan1, Shimul Chowdhury1, Stephen F Kingsmore1.
Abstract
To investigate the diagnostic and clinical utility of a partially automated reanalysis pipeline, forty-eight cases of seriously ill children with suspected genetic disease who did not receive a diagnosis upon initial manual analysis of whole-genome sequencing (WGS) were reanalyzed at least 1 year later. Clinical natural language processing (CNLP) of medical records provided automated, updated patient phenotypes, and an automated analysis system delivered limited lists of possible diagnostic variants for each case. CNLP identified a median of 79 new clinical features per patient at least 1 year later. Compared to a standard manual reanalysis pipeline, the partially automated pipeline reduced the number of variants to be analyzed by 90% (range: 74%-96%). In 2 cases, diagnoses were made upon reinterpretation, representing an incremental diagnostic yield of 4.2% (2/48, 95% CI: 0.5-14.3%). Four additional cases were flagged with a possible diagnosis to be considered during subsequent reanalysis. Separately, copy number analysis led to diagnoses in two cases. Ongoing discovery of new disease genes and refined variant classification necessitate periodic reanalysis of negative WGS cases. The clinical features of patients sequenced as infants evolve rapidly with age. Partially automated reanalysis, including automated re-phenotyping through CNLP, has the potential to identify molecular diagnoses with reduced expert labor intensity.
Keywords: Genetic testing; Molecular medicine
Year: 2020 PMID: 32821428 PMCID: PMC7419288 DOI: 10.1038/s41525-020-00140-1
Source DB: PubMed Journal: NPJ Genom Med ISSN: 2056-7944 Impact factor: 8.617
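The incremental diagnostic yield and confidence interval quoted in the abstract can be checked with a short calculation. The sketch below assumes an exact (Clopper-Pearson) binomial interval; the record does not state which method the authors used, so that choice is an assumption, and the function names are illustrative only.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def clopper_pearson(x, n, alpha=0.05, tol=1e-10):
    """Exact two-sided binomial interval (assumed method), solved by bisection."""

    def solve(f, target):
        # f(p) decreases monotonically from 1 to 0 as p goes from 0 to 1,
        # so bisection finds the p at which f(p) equals the target.
        lo, hi = 0.0, 1.0
        while hi - lo > tol:
            mid = (lo + hi) / 2
            if f(mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: P(X >= x | p) = alpha/2, i.e. P(X <= x-1 | p) = 1 - alpha/2
    lower = 0.0 if x == 0 else solve(lambda p: binom_cdf(x - 1, n, p), 1 - alpha / 2)
    # Upper bound: P(X <= x | p) = alpha/2
    upper = 1.0 if x == n else solve(lambda p: binom_cdf(x, n, p), alpha / 2)
    return lower, upper


# 2 new diagnoses among 48 reanalyzed cases, as reported in the abstract
lower, upper = clopper_pearson(2, 48)
print(f"yield = {2 / 48:.1%}, 95% CI = {lower:.1%} to {upper:.1%}")
```

Under that assumption, 2 diagnoses in 48 cases give bounds of roughly 0.5% and 14.3%, in line with the reported interval.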
Demographic and clinical characteristics of the reanalysis cohort.
| Characteristic | Category | Value |
|---|---|---|
| Age at enrollment | | Median: 5 months; range 0.1–238 months |
| | <1 month | 10 (21%) |
| | <6 months | 27 (56%) |
| Age at reanalysis^a | | Median: 24 months; range 0.1–255 months |
| | <6 months | 4 (8%) |
| | <24 months | 23 (49%) |
| Sex | Female | 23 (48%) |
| | Male | 25 (52%) |
| Race and ethnicity | Hispanic/Latino | 24 (50%) |
| | Caucasian | 14 (29%) |
| | Asian/Pacific Islander | 3 (6%) |
| | African/African American | 2 (4%) |
| | Other/unknown | 5 (10%) |
| Primary system involved | Neurological | 13 (27%) |
| | Multiple congenital anomalies | 12 (25%) |
| | Hepatic | 6 (13%) |
| | Hematological | 5 (10%) |
| | Musculoskeletal | 3 (6%) |
| | Pulmonary | 3 (6%) |
| | Cardiac | 2 (4%) |
| | Endocrine/biochemical | 2 (4%) |
| | Gastrointestinal | 2 (4%) |

^a Or age at death (exitus).
Fig. 1 Comparison of partially automated and manual reanalysis pipelines.
The partially automated reanalysis pipeline incorporates automated phenotype extraction from the EHR and variant shortlist generation.
Summary of cases with reported or possible diagnoses upon reanalysis.
| Case | Phenotype | Potential diagnostic gene (RefSeq transcript ID) | Potential diagnostic variant(s) | Zygosity (inheritance) | Potential diagnosis | Classification | Outcome (reason not reported) | Resultant changes in medical care | Changes between initial analysis and reanalysis |
|---|---|---|---|---|---|---|---|---|---|
| 6009 | Small for gestational age and failure to thrive, congenital heart defects, bilateral ectrodactyly | | c.267C>A, p.(Cys89*) | Heterozygous (paternally phased de novo mutation) | Silver–Russell syndrome | P | Reported | Assessment of renal and hepatic function, blood pressure. Avoidance of metronidazole. Consideration of baclofen, carbidopa/levodopa | Correction of manual data labeling error. 2016 classification of this variant as pathogenic (GeneDx; ClinVar accession SCV000491643.1) |
| 6033 | Failure to thrive, intention tremor, ataxic gait, developmental regression, hypotonia, microcephaly, white matter abnormalities on MRI | | c.-15+3G>T; c.1583G>A, p.(Gly528Glu) | Heterozygous (paternal); heterozygous (maternal) | Cockayne syndrome type B | P; LP (after orthogonal functional testing) | Reported | Growth hormone treatment, hypoglycemia prevention, monitoring for premature adrenarche, maxillofacial assessment | Publication of c.1583G>A variant in Cockayne syndrome cohort (Calmels et al.) |
| 6046 | Unprovoked cardiac arrest, sudden infant death syndrome, suspicion of long Q-T syndrome | | c.1574C>T, p.(Ala525Val) | Heterozygous (maternal) | Long Q-T syndrome 4 | VUS; VUS | Queue for periodic reanalysis (VUS) | N/A | 2018 classification of this variant as a VUS in association with long Q-T syndrome (Invitae; ClinVar accession SCV000822853.1) |
| 6062 | Feeding difficulties, ketosis, mast cell activation disorder, hypotonia, joint hypermobility, fine motor delay, fatigue after exercise | | c.-32-13T>G; c.497C>T, p.(Thr166Ile) | Heterozygous (not maternal); heterozygous (maternal) | Glycogen storage disease II | P; VUS | Queue for periodic reanalysis (VUS, unclear phenotypic matching) | N/A | |
| 6068 | Acute lymphoid leukemia in remission, rectal fistula, gallstones, fevers, clotting | | c.2600G>A, p.(Arg867Gln) | Unknown (parental samples unavailable) | Leukemia susceptibility (novel gene/disease association) | VUS | Queue for periodic reanalysis (VUS, unclear gene–disease relationship) | N/A | Publication of this variant segregating in a kindred with thrombocytosis and progression to polycythemia vera in one affected (Maie et al.) |
| 6081 | Arthrogryposis multiplex congenita, micrognathia, desaturations, parenchymal dysplasia, cortical dysplasia, bilateral clubfeet, suspected congenital knee dislocation, uterine growth restriction | | c.122C>T, p.(Ala41Val); c.950C>T, p.(Pro317Leu) | Heterozygous (maternal); heterozygous (paternal) | Ehlers–Danlos syndrome, spondylodysplastic type 2 | VUS; VUS | Queue for periodic reanalysis (VUS, unclear phenotypic matching) | N/A | — |
P, pathogenic; LP, likely pathogenic; VUS, variant of uncertain significance.
Fig. 2 Characterization of HPO term lists automatically extracted from patient EHRs.
a The number of HPO terms extracted at enrollment and reanalysis, with each case represented by two linked points. b The number of HPO subcategories (out of 28 total) represented by terms extracted at enrollment and reanalysis. c The average information content (IC) of HPO terms extracted at enrollment and reanalysis differed significantly (Wilcoxon paired signed-rank test; p < 0.0001). Each case is represented by one point. The two cases for which diagnoses were reported following reanalysis are represented by a red square (case 6009) and a red triangle (case 6033).
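To make the "average information content" in panel c concrete, here is a minimal sketch. It assumes the common definition IC(t) = -log2 f(t), where f(t) is the fraction of annotated entries carrying term t or one of its descendants. The record does not give the definition or the logarithm base the authors used, and the frequencies below are invented for illustration.

```python
from math import log2


def information_content(frequency):
    """IC of a term from its annotation frequency (assumed definition)."""
    return -log2(frequency)


def mean_ic(terms, frequency_by_term):
    """Average IC over a patient's extracted HPO term list."""
    return sum(information_content(frequency_by_term[t]) for t in terms) / len(terms)


# Hypothetical annotation frequencies; real values would come from an
# HPO annotation corpus, which is not part of this record.
frequency_by_term = {
    "HP:0000118": 1.0,    # Phenotypic abnormality (root term, IC = 0)
    "HP:0001250": 0.25,   # Seizure
    "HP:0001252": 0.125,  # Hypotonia
}

# A broad term list scores lower than a more specific one.
broad = mean_ic(["HP:0000118", "HP:0001250"], frequency_by_term)
specific = mean_ic(["HP:0001250", "HP:0001252"], frequency_by_term)
print(broad, specific)
```

Rarer, more specific terms carry higher IC, so a term list that grows more specific between enrollment and reanalysis would show a higher average under this definition.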
Fig. 3 Characteristics of HPO term lists extracted from patient EHRs are correlated with characteristics of resultant variant shortlists.
Each case is represented by one point. a The number of HPO terms extracted at enrollment is positively correlated with the number of Moon variants on the resultant variant shortlist (r = 0.40, p = 0.005). b The number of HPO terms extracted at reanalysis is positively correlated with the number of Moon variants on the resultant shortlist (r = 0.41, p = 0.0032). c The change in input HPO term list size between enrollment and reanalysis is positively correlated with the change in resultant variant shortlist size (r = 0.48, p = 0.0005), as well as with d variant shortlist turnover (r = 0.57, p < 0.0001).
Fig. 4 Overlapping variant shortlists at enrollment and reanalysis.
Cases are ordered chronologically by date at enrollment. Black: variants on initial enrollment shortlist only; medium gray: variants on enrollment and reanalysis shortlists; light gray: variants on reanalysis shortlist only. The percentage next to each case’s bar reflects the number of new variants on the reanalysis shortlist, relative to the size of the initial enrollment shortlist (median = 25%).
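The three bar segments and the per-case percentage described in this caption amount to simple set arithmetic. The sketch below is one way to compute them, assuming each shortlist is a collection of unique variant identifiers; the identifiers and the function name are hypothetical.

```python
def shortlist_overlap(enrollment, reanalysis):
    """Split two variant shortlists into the three segments shown in Fig. 4."""
    enrollment, reanalysis = set(enrollment), set(reanalysis)
    only_enrollment = enrollment - reanalysis   # black segment
    shared = enrollment & reanalysis            # medium gray segment
    only_reanalysis = reanalysis - enrollment   # light gray segment
    # New variants on the reanalysis shortlist, relative to the size of the
    # initial enrollment shortlist (the percentage next to each bar).
    pct_new = (
        100 * len(only_reanalysis) / len(enrollment) if enrollment else float("nan")
    )
    return {
        "only_enrollment": only_enrollment,
        "shared": shared,
        "only_reanalysis": only_reanalysis,
        "pct_new": pct_new,
    }


# Hypothetical variant identifiers for a single case
result = shortlist_overlap(
    enrollment=["v1", "v2", "v3", "v4"],
    reanalysis=["v3", "v4", "v5"],
)
print(result["pct_new"])  # 1 new variant against 4 at enrollment: 25.0
```

Because the denominator is the enrollment shortlist rather than the reanalysis shortlist, the percentage can exceed 100% when reanalysis adds more variants than were on the original list.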