Gavin R Oliver, Xiaojia Tang, Laura E Schultz-Rogers, Noemi Vidal-Folch, W Garrett Jenkinson, Tanya L Schwab, Krutika Gaonkar, Margot A Cousin, Asha Nair, Shubham Basu, Pritha Chanana, Devin Oglesbee, Eric W Klee.
Abstract
BACKGROUND: RNA sequencing has been proposed as a means of increasing diagnostic rates in studies of undiagnosed rare inherited disease. Recent studies have reported diagnostic improvements in the range of 7.5-35% by profiling splicing, gene expression quantification and allele-specific expression. To date, however, no study has systematically assessed the presence of gene-fusion transcripts in cases of germline disease. Fusion transcripts are routinely identified in cancer studies and are increasingly recognized as having diagnostic, prognostic or therapeutic relevance. Isolated reports exist of fusion transcripts being detected in cases of developmental and neurological phenotypes, and thus systematic application of fusion detection to germline conditions may further increase diagnostic rates. However, current fusion detection methods are unsuited to the investigation of germline disease due to performance biases arising from their development using tumor, cell-line or in silico data.
Year: 2019 PMID: 31577830 PMCID: PMC6774566 DOI: 10.1371/journal.pone.0223337
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Fusion candidate BLAST categorization rationale.
Putative fusion sequences were BLASTN aligned to the human genome and transcriptome to enable categorization. A) Candidates aligning to abundant hematological genes (globins, T-cell receptors) were not considered further due to their overrepresentation in blood samples and observed overrepresentation in fusion analysis results. These might represent artifacts or transient biological events. B & C) Full-length candidates producing unbroken alignments against the human transcriptome or genome were classified as likely known transcripts or genomic sequence, respectively. D) When the candidate produced no alignment against the human genome or transcriptome, or only a partial alignment was possible, the candidate was classified as a likely artifact, potentially containing low-quality or non-human sequence including adapters. E) When the candidate produced multiple alignments within the gene boundaries of a single gene but did not align completely to a known transcript, it was classified as a potential novel transcript of a known gene. This category also has the potential to capture aberrant single-gene events. F) When the candidate produced two hits to separate immunoglobulins, the event was classed as potentially representing immune diversity. Alternatively, these may be generated by alignment artifacts due to high homology between immunoglobulin genes. G) When two distinct alignments were produced against two different chromosomes, the candidate was defined as a potential interchromosomal fusion. Fused genes with known homology were flagged to enable additional checking for alignment artifacts. H) When the candidate aligned to two distinct genes or regions on a single chromosome, it was classified as a potential intrachromosomal fusion. Fused genes with known homology were flagged to enable additional checking for alignment artifacts. Intrachromosomal candidates occurring between neighboring genes were annotated as potential read-through events.
These events could represent true fusions or aberrant transcriptional events but might also represent biologically normal events that occur due to co-transcription of neighboring genes that have yet to be re-classified as single genes. Interchromosomal and intrachromosomal candidates were annotated as homologous when the two hits occurred against known homologous genes based on the Duplicated Genes Database (http://dgd.genouest.org/). Such instances might represent artifacts due to misalignment between closely homologous genes or might equally represent true aberrant events, preferentially occurring due to homology at the genomic sequence level.
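The categorization rules in panels A to H can be read as an ordered decision procedure. The following is a minimal Python sketch of that reading, not the authors' implementation. The `Hit` record, the `classify` function, the 95% coverage threshold, and the two small gene sets are all hypothetical and chosen only for illustration; the homology and read-through annotations are omitted.

```python
from dataclasses import dataclass
from typing import List, Optional

# Illustrative gene sets only (assumption); a real run would need complete lists.
HEMATOLOGICAL = {"HBB", "HBA1", "HBA2", "TRAC", "TRBC1"}
IMMUNOGLOBULIN = {"IGHG1", "IGKC", "IGLC2"}


@dataclass(frozen=True)
class Hit:
    """One alignment of a candidate (hypothetical record layout)."""
    target: str            # "transcriptome" or "genome"
    chrom: str
    gene: Optional[str]    # None when the hit falls outside annotated genes
    q_start: int           # 1-based, inclusive position on the candidate
    q_end: int


def covered_bases(hits: List[Hit]) -> int:
    """Number of candidate bases covered by the union of all hits."""
    total, end = 0, 0
    for start, stop in sorted((h.q_start, h.q_end) for h in hits):
        if stop > end:
            total += stop - max(start, end + 1) + 1
            end = stop
    return total


def classify(query_len: int, hits: List[Hit], min_cover: float = 0.95) -> str:
    """Assign a candidate to one category, checking rules in caption order."""
    if not hits:
        return "likely_artifact"                      # D: no alignment
    genes = {h.gene for h in hits if h.gene}
    if genes & HEMATOLOGICAL:
        return "excluded_hematological"               # A
    needed = min_cover * query_len                    # threshold is an assumption
    for target, label in (("transcriptome", "known_transcript"),
                          ("genome", "genomic_sequence")):
        if any(h.target == target and h.q_end - h.q_start + 1 >= needed
               for h in hits):
            return label                              # B and C: unbroken hit
    if covered_bases(hits) < needed:
        return "likely_artifact"                      # D: partial alignment
    if len(genes) == 1 and all(h.gene for h in hits):
        return "novel_transcript_of_known_gene"       # E
    if len(genes) >= 2 and genes <= IMMUNOGLOBULIN:
        return "immune_diversity"                     # F
    if len({h.chrom for h in hits}) >= 2:
        return "interchromosomal_fusion"              # G
    return "intrachromosomal_fusion"                  # H
```

For example, two half-length hits to SAMD12 and EXT1 on chr8 would fall through to the final rule and be labeled `intrachromosomal_fusion`.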
Fig 2. Fusion categorization workflow and median number of fusion candidates per category.
Unfiltered results from TopHat Fusion were BLASTed, annotated and input into the candidate classification workflow. The median number of events per sample in each category is shown. All candidates classified as potential fusions or read-through events proceeded to a final review stage that determined the phenotypic relevance of the genes to the patient's condition using both automated PCAN analysis and manual review. Candidates classified as most phenotypically relevant were selected for follow-up validation.
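The per-category medians reported in the workflow figure could be computed along the following lines. This is a small sketch under assumptions: the function name `median_per_category`, the input layout (one list of category labels per sample), and the toy labels are hypothetical. A sample with no events in a category is counted as zero for that category.

```python
from collections import Counter
from statistics import median
from typing import Dict, List


def median_per_category(labels_by_sample: Dict[str, List[str]]) -> Dict[str, float]:
    """Median number of events per sample for every observed category."""
    counts = {sample: Counter(labels) for sample, labels in labels_by_sample.items()}
    categories = sorted({cat for c in counts.values() for cat in c})
    # Counter returns 0 for absent categories, so every sample contributes a value.
    return {cat: median(c[cat] for c in counts.values()) for cat in categories}


# Toy input, not data from the study.
toy = {
    "sample_1": ["artifact", "artifact", "read_through"],
    "sample_2": ["artifact"],
    "sample_3": ["artifact", "artifact", "artifact", "read_through"],
}
medians = median_per_category(toy)
```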
Technical details of 16 fusion candidates passing phenotypic review.
| Patient ID | Fusion | Supporting vs Non-Supporting Reads | Fused at Exon Boundaries? | Fusion Preserves Reading Frame? | Inter/Intrachromosomal | Genomic Coordinates (hg19) | Separation on Chromosome (bp) | Transcripts | Strand | Detected by Standard TopHat Fusion Filters? |
|---|---|---|---|---|---|---|---|---|---|---|
| Patient 3 | | 10 vs 18 | Exon-Exon | Yes | Intrachromosomal | chr10:101554225-chr10:101515382 | 38843 | NM_000392 Exon 6 - NM_015960 Exon 9 | Forward-Forward | No |
| Patient 5 | | 23 vs 22 | Exon-Exon | No | Intrachromosomal | chr11:78239888-chr11:78369861 | 129973 | NM_001243251 Exon 6 - NM_001098816 | Reverse-Reverse | No |
| Patient 6 | | 14 vs 6 | Exon-Exon | Yes | Intrachromosomal | chr11:108129802-chr11:107663526 | 466276 | NM_000051 Exon 16 - NM_017515 Exon 8 | Forward-Reverse | Yes |
| | | 43 vs 2 | | | | chr11:107673727-chr11:108137898 | 464171 | NM_017515 Exon 7 - NM_000051 Exon 17 | Reverse-Forward | |
| Patient 7 | | 26 vs 33 | Exon-Exon | No | Intrachromosomal | chr11:111951282-chr11:111907997 | 43285 | NM_001301019 Exon 4 - NM_001931 Exon 6 | Forward-Forward | No |
| Patient 12 | | 19 vs 5 | Exon-Exon | No | Intrachromosomal | chr18:47009954-chr18:46956817 | 53137 | NM_001199356 Exon 6 - NM_017653 Exon 2 | Reverse-Reverse | No |
| Patient 13 | | 11 vs 22 | Exon-Exon | Yes | Intrachromosomal | chr2:32409407-chr2:32340771 | 68636 | NM_001330476 Exon 2 - NM_199436 Exon 5 | Forward-Forward | No |
| Patient 13 | | 4 vs 2 | Exon-Exon | No | Intrachromosomal | chr15:43398140-chr15:43489662 | 91522 | NM_174916 Exon 1 - NM_0001199 Exon 13 | Reverse-Reverse | No |
| Patient 18 | | 7 vs 3 | Exon-Exon | No | Intrachromosomal | chr2:152659521-chr2:152590309 | 69212 | NM_012097 Exon 6 - NM_001271208 Exon 2 | Reverse-Reverse | No |
| Patient 20 | | 29 vs 44 | Exon-Exon | Yes | Intrachromosomal | chr2:74230293-chr2:74173846 | 56447 | NM_001287491 Exon 2 - NM_080916 Exon 3 | Forward-Forward | No |
| Patient 21 | | 15 vs 5 | Exon-Exon | No | Intrachromosomal | chr16:8738582-chr16:8829556 | 90974 | NM_024109 Exon 10 - NM_020686 Exon 2 | Forward-Forward | No |
| Patient 33 | | 33 vs 12 | Exon-Exon | No | Intrachromosomal | chr2:152954844-chr2:153006743 | 51899 | NM_000726 Exon 2 - NM_005843 Exon 2 | Reverse-Reverse | No |
| Patient 36 | | 27 vs 21 | Exon-Exon | Yes | Intrachromosomal | chr1:150737114-chr1:150786715 | 49601 | NM_001199739 Exon 2 - NM_001668 Exon 20 | Reverse-Reverse | No |
| Patient 36 | | 7 vs 45 | Intron-Exon | No | Interchromosomal | chr21:34927578-chr1:157670375 | NA | NM_138927 Exon 3 - NM_001320333 Exon 2 | Reverse-Reverse | No |
| Patient 37 | | 51 vs 120 | Exon-Exon | No | Intrachromosomal | chr16:2633586-chr16:2875971 | 242385 | NM_002613 Exon 10 - ENST00000575739.1 Exon 2 | Forward-Forward | No |
| Patient 37 | | 17 vs 2 | Exon-Exon | No | Intrachromosomal | chr8:119592952-chr8:118849438 | 743514 | NM_001101676 Exon 2 - NM_000127 Exon 2 | Reverse-Reverse | No |
Table 1 describes technical details of the fusion candidates that passed all steps of the categorization pipeline and were putatively determined to have phenotypic relevance. Only one fusion candidate was detected by the standard TopHat Fusion filter settings.
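The "Separation on chromosome" column appears to be the absolute distance between the two breakpoint coordinates, with NA for interchromosomal events. A minimal sketch of that arithmetic follows, assuming the coordinate strings have the `chrA:pos-chrB:pos` form shown in Table 1; the function name `breakpoint_separation` is hypothetical.

```python
import re
from typing import Optional

# Matches e.g. "chr10:101554225-chr10:101515382", tolerating spaces around the hyphen.
_COORD = re.compile(r"^(chr\w+):(\d+)\s*-\s*(chr\w+):(\d+)$")


def breakpoint_separation(coords: str) -> Optional[int]:
    """Distance in bp between two breakpoints, or None if on different chromosomes."""
    match = _COORD.match(coords.strip())
    if match is None:
        raise ValueError(f"unrecognized coordinate string: {coords!r}")
    chrom_a, pos_a, chrom_b, pos_b = match.groups()
    if chrom_a != chrom_b:
        return None  # reported as NA in the table
    return abs(int(pos_a) - int(pos_b))


patient_3 = breakpoint_separation("chr10:101554225-chr10:101515382")
patient_36_inter = breakpoint_separation("chr21:34927578-chr1:157670375")
```

Applied to the Patient 3 row this gives 38843 bp, matching the tabulated value, and the interchromosomal Patient 36 row yields `None`.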
Validation status and phenotypic justification for the 11 fusion candidates selected for validation.
| Patient ID | Fusion | Reason for Interest | Flagged by | Experimental Validation |
|---|---|---|---|---|
| Patient 5 | | Patient was referred due to an epilepsy phenotype. NARS2 mutations are responsible for combined oxidative phosphorylation deficiency, with symptoms including epilepsy. OMIM notes variable penetrance and severity. | PCAN (NARS2 Reactome pathway p-value 0.027) | Positive (PCR, ddPCR) |
| Patient 6 | | The patient carries a single pathogenic mutation in ATM, for which a second hit is sought as mutations are recessive. | Manual analysis & PCAN (ATM gene relative rank 0.002, Reactome pathway p-value 0.037, STRING p-value 0.028) | Positive (PCR, ddPCR, Sanger sequencing, PacBio sequencing) |
| Patient 12 | | Patient symptoms include microcephaly, global developmental delay and scoliosis. Mutations in the DYM gene are responsible for Dyggve-Melchior-Clausen disease, whose symptoms include microcephaly, scoliosis, and psychomotor retardation. | PCAN (DYM gene relative rank 0.067, Reactome pathway, STRING p-value 0.028) | Positive (ddPCR) |
| Patient 13 | | Fusions between these two genes have been previously described in cases of spastic paraplegia. SPAST mutations are responsible for autosomal dominant spastic paraplegia (with which the patient is not diagnosed) but also for various symptoms depending on the mutation, e.g. mild-moderate cognitive defects, stutter, wheelchair bound by age 40 (OMIM). | Manual analysis | Negative (PCR & ddPCR) |
| Patient 18 | | NEB mutations are responsible for nemaline myopathy. Symptoms include hypotonia and delayed motor development. Patient symptoms are developmental delay, hypotonia & laryngomalacia. | PCAN (NEB gene relative rank 0.03) | Positive (ddPCR) |
| Patient 20 | | TET3 is a TET oncogene family member. TET3-DGUOK fusions have been reported in tumors. | Manual analysis | Negative (PCR & ddPCR) |
| Patient 33 | | CACNB4 mutations are associated with episodic ataxia (inc. vertigo, nystagmus, dysarthria) and epilepsy. Patient phenotype is progressive gait difficulty/balance, abnormal brain MRI with atrophy, and progressive cognitive decline. | Manual analysis. Overlap quite weak. | Negative (PCR & ddPCR) |
| Patient 36 | | ZTTK syndrome is caused by haploinsufficiency of SON (AD inheritance). Symptoms include congenital heart defects, developmental delay, strabismus, various facial dysmorphisms, and cleft palate. The patient has all of these plus additional symptoms. | Manual analysis | Positive (ddPCR) |
| Patient 37 | | Both genes fell at the boundaries of a deletion detected in this patient by aCGH. Links to phenotype remain unclear. | Manual analysis; phenotypic relevance unknown but corresponds to a deletion detected by aCGH. | Positive (PCR, ddPCR, aCGH, FISH) |
| Patient 37 | | EXT1 mutations are known to cause many cases of multiple exostoses. Patient has unresolved exostoses. | Manual analysis & PCAN (EXT1 gene relative rank 0.001, Reactome p-value 0.00025, STRING p-value 0.0000066) | Positive (PCR, ddPCR, MIP, aCGH) |
Table 2 describes the 11 fusions selected for validation and the phenotypic evidence putatively linking them to the patient phenotype. Eight of the 11 fusions were successfully validated by orthogonal technologies. Validation status and the technologies used are described.
Fig 3. Diagnostic fusion transcripts identified by RNA-Seq in Mendelian disease cases.
3A) A SAMD12-EXT1 fusion identified in Patient 37 whose phenotype included multiple exostoses. Multiple exostoses are most often attributed to autosomal dominant mutations in EXT1 and EXT2 but extensive clinical testing failed to identify any variants of interest in either gene. RNA-Seq identified a fusion candidate which might be explained by an interstitial deletion based on the genes’ orientation and position on chromosome 8 and would lead to loss of function of both EXT1 and SAMD12 due to loss of coding potential at the fusion boundary. Despite clinical aCGH and MLPA results initially indicating no deletion affecting the putatively conjoined genes, reevaluation of clinical aCGH results appeared suggestive of a mosaic deletion of approximately 604 kb at chr8:118960168–119569348. The deletion was subsequently validated by several orthogonal methods and determined to be diagnostic of the multiple exostoses phenotype. The SAMD12-EXT1 fusion was not detected by standard TopHat filters. 3B) Reciprocal ATM-SLC35F2 and SLC35F2-ATM fusions detected in Patient 6, with a severe combined immunodeficiency phenotype. The patient carried a paternally inherited pathogenic ATM variant for which a second hit was sought due to the autosomal recessive nature of ATM mutations. RNA-Seq revealed reciprocal fusions that were expected to retain their protein-coding potential but lead to aberrant ATM function based on the results of a novel flow cytometry assay. The fusions were experimentally validated by several orthogonal methods and shown to be maternally inherited, equating to compound heterozygous loss of ATM function which was classified as diagnostic of the patient phenotype. These reciprocal fusions were the only members of our validation panel that were detected by standard TopHat filters.