Joseph T Shieh1,2, Monica Penon-Portmann3,4, Karen H Y Wong5, Michal Levy-Sakin5, Michelle Verghese5, Anne Slavotinek3,4, Renata C Gallagher3,4, Bryce A Mendelsohn4, Jessica Tenney4, Daniah Beleford4, Hazel Perry4, Stephen K Chow5, Andrew G Sharo6, Steven E Brenner7, Zhongxia Qi8, Jingwei Yu8, Ophir D Klein3,4,9, David Martin10, Pui-Yan Kwok3,5,11, Dario Boffelli10.
Abstract
Current genetic tests for rare diseases provide a diagnosis in only a modest proportion of cases. The Full-Genome Analysis method, FGA, combines long-range assembly and whole-genome sequencing to detect small variants, structural variants with breakpoint resolution, and phasing. We built a variant prioritization pipeline and tested FGA's utility for diagnosis of rare diseases in a clinical setting. FGA identified structural variants and small variants with an overall diagnostic yield of 40% (20 of 50 cases) and 35% in exome-negative cases (8 of 23 cases); 4 of these were structural variants. FGA detected and mapped structural variants that are missed by short reads, including a non-coding duplication, and phased variants across long distances of more than 180 kb. With the prioritization algorithm, longer DNA technologies could replace multiple tests for monogenic disorders and expand the range of variants detected. Our study suggests that genomes produced from technologies like FGA can improve variant detection and provide higher resolution genome maps for future application.
Year: 2021 PMID: 34556655 PMCID: PMC8460793 DOI: 10.1038/s41525-021-00241-5
Source DB: PubMed Journal: NPJ Genom Med ISSN: 2056-7944 Impact factor: 6.083
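The two yield figures quoted in the abstract follow directly from the case counts. The short sketch below recomputes them; the function name `diagnostic_yield` is illustrative and does not come from the study's pipeline.

```python
def diagnostic_yield(diagnosed: int, tested: int) -> float:
    """Return the diagnostic yield as a percentage of tested cases."""
    if tested <= 0:
        raise ValueError("tested must be a positive number of cases")
    return 100 * diagnosed / tested


# Case counts as reported in the abstract.
overall = diagnostic_yield(20, 50)        # all cases tested with FGA
exome_negative = diagnostic_yield(8, 23)  # cases with a prior non-diagnostic exome

print(f"overall: {overall:.1f}%")                # overall: 40.0%
print(f"exome-negative: {exome_negative:.1f}%")  # exome-negative: 34.8%
```

The exome-negative value of 34.8% rounds to the 35% stated in the abstract.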
Clinical and molecular features of diagnostic cases tested with FGA (n = 20).
| Case | Sex | Event | Diagnostic finding | Condition (OMIM#) | Prior exome | Prior microarray |
|---|---|---|---|---|---|---|
| 0703 | M | SV - Translocation | 46,XY,t(1;9)(p33;p21) | Neuroblastoma, GDD | + | + |
| 5103 | M | SV deletion 36 kb | | | + | + |
| 4203 | M | SV deletion 1480 bp | | DeSanto-Shinawi syndrome (616708) | + | + |
| 4803 | F | SV deletion 5000 bp | | Overlapping with 2p15 deletion syndrome (612513)^a | + | + |
| 2403 | F | Indel/SNV | | Ectodermal dysplasia 14, tooth type ± hypohidrosis (618180) | + | − |
| 1003 | M | Indel/SNV | | Neurodevelopmental disorder, cataracts, growth, dysmorphic facies (618571) | + | + |
| 1903 | M | SNVs; Indel | | Methylmalonic aciduria^b (251110); Mental retardation, Siderius (300263) | + | − |
| 0803 | M | Indel | | Mental retardation-hypotonic facies syndrome, X-linked (309580) | + | + |
| 1703 | F | SV duplication 32 kb | | 2q35 duplication syndrome (185900) | − | + |
| 2303 | F | SV duplication 303 kb | 4p16.3, de novo, chr4:446930-749753 | 4p16.3 duplication | − | + |
| 3103 | F | SNV | | Myhre syndrome (139210) | − | + |
| 0103 | F | Indel | | Coffin-Lowry syndrome (303600) | − | + |
| 0203 | M | SNV | | Cardiofaciocutaneous syndrome 3 (615279) | − | + |
| 0503 | F | SNV | | Epileptic encephalopathy, early infantile (616506) | − | + |
| 1503 | F | Indel; SNVs | | Pituitary deficiency 3 (221750); α-ketoglutarate dehydrogenase (203740) | − | + |
| 2103 | F | SNVs | | LONP1 potential gain-of-function; Maple syrup urine, Ia^b (248600) | − | + |
| 2704 | M | SNVs | | Fraser syndrome 1 (219000) | − | + |
| 3603 | M | SNVs | | Cleft lip/palate dysplasia syndrome (225060) | − | + |
| 4103 | F | Indel | | Short stature, advanced bone age ± early osteoarthritis (165800) | − | + |
| 4903 | F | SNV | | Alport syndrome 3, autosomal dominant (104200) | − | + |

^a Gene likely implicated in deletion condition.
^b Found previously. +, yes; −, no.
Fig. 1 Heterozygous, intronic tandem duplication (32 kb) in NHEJ1.
The region affected (chr2: 219,102,933–219,134,970, 2q35, genome version GRCh38) corresponds to an IHH upstream enhancer and narrows the diagnostic interval for this condition. a Depicts a de novo assembly (light blue) and its alignment to reference (green). The labeled motifs in the reference genome (vertical maroon lines) are duplicated in Haplotype B, and their orientation demonstrates the duplication occurred adjacent to the original sequence, in tandem. b Shows a matrix view of linked reads. The dark orange square in the left panel (proband) illustrates a higher density of barcode overlap in the read matrix compared to parents, indicating the variant likely occurred de novo. c Contains phased haplotypes generated using linked-read data. Haplotype B, in purple, contains the intronic region with a higher number of linked reads due to sequence duplication. Accompanying supplemental data show overlap with the enhancer.
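As a quick consistency check, the coordinates in the Fig. 1 caption can be converted to a span in base pairs and compared with the stated 32 kb. The sketch below is illustrative: the helper names are hypothetical, and treating both endpoints as included (1-based, closed interval) is an assumption, since the caption does not state the coordinate convention.

```python
import re


def parse_region(region: str) -> tuple[str, int, int]:
    """Parse text such as 'chr2: 219,102,933 - 219,134,970' into (chrom, start, end)."""
    pattern = r"\s*(chr\w+)\s*:\s*([\d,]+)\s*[-–—]\s*([\d,]+)\s*"
    match = re.fullmatch(pattern, region)
    if match is None:
        raise ValueError(f"unrecognized region: {region!r}")
    chrom, start, end = match.groups()
    return chrom, int(start.replace(",", "")), int(end.replace(",", ""))


def span_bp(region: str, inclusive: bool = True) -> int:
    """Length of the region in bp; `inclusive` counts both endpoints (assumed convention)."""
    _, start, end = parse_region(region)
    return end - start + (1 if inclusive else 0)


print(span_bp("chr2: 219,102,933 - 219,134,970"))  # 32038
```

Under either convention the span is about 32.0 kb, matching the duplication size given in the caption.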
Comparison of duplication calls between short-read WGS CNV and genome assembly technologies. Calls for the 32 kb intronic NHEJ1 duplication case.
| | Short-read WGS CNV | Linked-reads | De novo assembly |
|---|---|---|---|
| Total number | 1945 | 264 | 225 |
| High quality | 1415 | 2 | n/a |
| Mean size ± SE (bp) | 54,409 ± 21,115 | 78,861 ± 3,124 | 97,654 ± 22,204 |
| Identified | no | yes | yes |
| Correct SV type | − | yes | yes |
| Correct zygosity | − | yes | no |
Short-read WGS CNV = Manta output; Linked-reads = 10x Genomics output; De novo assembly = Bionano optical mapping output.
Fig. 2 Structural rearrangement detection with de novo assembly and linked reads; t(1;9)(p33;p21).
a Contains de novo assemblies of chromosomes 9 and 1. Genomic coordinates are shown in gray at the top and the reference assembly in green (reference GRCh38). The proband assembly map is shown in blue with vertical maroon lines that show matching label patterns. The first and second panels show two de novo assembly maps that align to reference chromosomes 1 and 9 and the translocation breakpoint where the alignment switches. The third panel depicts two assembly maps in chromosome 1 with segments that align and segments that do not align to the reference due to the translocation. b Shows the matrix view of linked reads that contains unexpected barcode overlap (in orange) between chromosomes 1 and 9, corresponding to the intronic point of fusion between the two. This overlap is absent in the parents.
Fig. 3 Deletion disrupting TANGO2 (chr22: 20,039,637–20,075,714 and chr22: 20,041,469–20,075,432, genome version GRCh38).
a De novo assembly (light blue) demonstrates missing sequence labels with respect to reference (green). The orange bracket and gray triangle show the deleted region. b Matrix view with absent signal from intervening region demonstrates proband with biallelic deletion. c Deletion is also seen by drop in coverage in both haplotypes in linked-read data.
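The Fig. 3 caption lists two coordinate pairs for the TANGO2 deletion. One common way to judge whether two such calls describe the same event is reciprocal overlap, sketched below. This is an illustration only: the caption does not say which method produced which interval, and reciprocal overlap is a general convention rather than a step the source attributes to the study. The function and variable names are hypothetical.

```python
def reciprocal_overlap(a: tuple[int, int], b: tuple[int, int]) -> float:
    """Return the smaller of (shared / len(a)) and (shared / len(b)) for two intervals."""
    shared = min(a[1], b[1]) - max(a[0], b[0])
    if shared <= 0:
        return 0.0  # intervals do not overlap
    return min(shared / (a[1] - a[0]), shared / (b[1] - b[0]))


# The two chr22 intervals from the Fig. 3 caption (GRCh38).
call_1 = (20_039_637, 20_075_714)
call_2 = (20_041_469, 20_075_432)

print(f"{reciprocal_overlap(call_1, call_2):.3f}")  # 0.941
```

The second interval lies entirely within the first, so the shared span (33,963 bp) covers about 94% of the larger call.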
Fig. 4 Variant haplotype distinction.
a Shows compound heterozygous TSPEAR variants (NM_144991). Phasing was successful for etiologic variants 184,756 bp apart given a phasing block of 15.1 Mb, which is not possible with short-read sequencing. Maroon and yellow arrows point to each variant. Gray arrowheads point to single-nucleotide polymorphisms that confirm trans orientation in relation to parental haplotypes. b Shows a de novo SMAD4 pathogenic variant (NM_005359) identified by linked-read sequencing, also detectable by short-read sequencing. Haplotype analysis showed the variant occurred on the paternally inherited allele, Haplotype A. Variant position is indicated with a maroon arrow. Gray arrowheads point to.