
Challenges in detecting and quantifying intron retention from next generation sequencing data.

Abstract

Intron retention (IR) occurs when an intron is transcribed into pre-mRNA and remains in the final mRNA. An increasing body of literature has demonstrated a major role for IR in numerous biological functions and in disease. Here we give an overview of the different computational approaches for detecting IR events from sequencing data. We show that these are based on different biological and computational assumptions that may lead to dramatically different results. We describe the various approaches for mitigating errors in detecting intron retention and for discovering IR signatures between different conditions.


Keywords: AS, alternative splicing; Bioinformatics; Gene expression; IR, Intron retention; Intron retention; RNA sequencing; RNA-seq, RNA sequencing; mRNA splicing

Year: 2020 PMID: 32206209 PMCID: PMC7078297 DOI: 10.1016/j.csbj.2020.02.010

Source DB: PubMed Journal: Comput Struct Biotechnol J ISSN: 2001-0370 Impact factor: 7.271

Introduction

Amongst the three major types of alternative splicing (AS), namely exon skipping/inclusion, alternative 5′ and 3′ splice‐site selection and intron retention (IR), the latter has until recently been regarded as an oddity in mammals; IR was often added to the list for no other purpose than to be exhaustive. However, recent discoveries about the role IR can play in fine-tuning gene expression [1], [2], [3] as well as the observation of characteristic IR patterns [4], [5], [6], [7], [8], [9] highlight the value of investigating IR in transcriptomic studies. Numerous reports have demonstrated a regulatory role for IR in hematopoiesis [1], [2], neuronal differentiation [10], germ cell differentiation [11] and CD4+ T cell activation [12] amongst others. In addition, a recent analysis of 1812 cancer patient samples showed that over 18% of splicing-associated single nucleotide variants caused IR and most of these events affected tumor suppressor genes [13]. Finally, the analysis of 2573 samples showed that IR occurs in all tissues analyzed and can affect over 80% of all coding genes [4]. Measuring IR can help decipher gene-level variations and inter-connections between transcriptional loads, structural variations and phenotypes [14], [15].
IR has been demonstrated to downregulate gene expression in numerous systems by triggering the nonsense-mediated mRNA decay (NMD) pathway. NMD recognizes transcripts with premature stop codons (PTCs) that could potentially generate C-terminally truncated proteins and degrades them. This surveillance mechanism can thus rapidly degrade IR transcripts if they harbor a PTC. Given that introns are much longer than exons and under less selective pressure to conserve open reading frames, the probability that an IR event harbors a PTC is high and IR transcripts are thus good candidates for degradation via NMD.
Initially, transcriptomic and bioinformatic analyses of NMD concluded that it was not coupled with mRNA splicing and that most PTC-containing transcripts do not have major functional roles [16], thus relegating NMD to the role of scavenger [17]. The same team however recently revised their view on the functional importance of NMD [3] and the current consensus is that it couples with IR (and other forms of AS) to regulate gene expression in numerous systems [1], [3], [18], [19], [20]. The importance of NMD is further underscored by the fact that deletion of its core components results in embryonic lethality [21].
RNA-seq data is well suited for resolving local exon connectivity because sequencing reads are sufficiently long to cover exon-exon junctions. It is also well suited for measuring global gene expression because the high number of reads that map to genes generally enables the use of powerful statistical models. Detecting and measuring IR with RNA-seq is more complex. The technical biases that are known to distort measurements of gene expression levels (e.g., GC content, amplification biases) and of other types of splicing also affect IR measurement. In addition, measurement of intronic expression is challenged by numerous factors. Within introns, highly expressed features such as small nucleolar RNAs, microRNAs or unannotated exons may erroneously inflate count-based measures of intronic expression. Conversely, low complexity regions, common in introns, prevent unique mapping of reads. Because retained introns are generally expressed at a fraction of the level of their flanking exons, uncorrected biases can massively disrupt IR estimation.
Transcriptome-wide evaluation of IR by computational means is still a budding field of investigation. In many studies, IR was assessed via custom and only briefly described procedures, which is probably due, in part, to the fact that few dedicated and comprehensive tools have been published so far.
In the following, we give a survey of the technical biases that confound IR detection and of the available computational methods to tackle IR screening. We emphasize three crucial steps: preparing and quality-controlling the sequencing data and the reference transcriptome; generating metrics that reflect the biological signal of IR transcripts; and using a model to discover condition-specific IR events.

IR detection

Filtering sequencing data

Unlike sequencing reads that map to exon-exon junctions, reads that map to introns can originate from DNA contamination caused by ineffective DNase treatment or from pre-mRNA. One method to detect DNA contamination is to measure reads across splice sites and check that the majority of introns display high splicing efficiency (above 90%) [22]. Another approach, implemented in [4], is to verify that the ratio of the number of reads that map to intergenic regions to the number that map to coding regions is less than 10%. Another source of bias is that intronic reads may originate from nascent RNAs and pre-mRNAs [23]. To lessen signals due to unprocessed transcripts and overlapping antisense transcripts, it is recommended to use poly-A enriched RNA-seq (or cytoplasmic fractionation) and strand-specific protocols [4]. Regarding library size, IR occurs at relatively low frequency in mammals, and introns tend to be substantially longer than exons. Determining an optimal library size obviously depends on many experiment-specific factors and on the IR effect size considered as biologically significant. Nevertheless, 35 million mapped reads for a one-versus-one experiment was suggested as an optimum for detection of differential intron usage (based on a resampling approach, [24]). In order to bypass intronic alignment biases, it has been suggested to consider only splice junction reads or, similarly, to focus only on a window centered on splice sites. Nonetheless, such junction-only analyses are likely to be more affected by splicing variations in flanking exons and to lead to even more unstable estimates. Accordingly, previous studies pointed out that they require higher sequencing depth (at least 70 million reads per sample, ideally more than 150 million reads) [23].
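The two contamination checks mentioned above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the inputs are assumed to be pre-computed read counts and per-intron splicing efficiencies, and combining both checks into a single verdict is our assumption rather than a published procedure.

```python
from typing import Iterable


def intergenic_to_coding_ratio(intergenic_reads: int, coding_reads: int) -> float:
    """Ratio of reads mapped to intergenic regions over reads mapped to coding regions."""
    if coding_reads <= 0:
        raise ValueError("coding_reads must be positive")
    return intergenic_reads / coding_reads


def fraction_efficiently_spliced(efficiencies: Iterable[float],
                                 min_efficiency: float = 0.90) -> float:
    """Fraction of introns whose splicing efficiency exceeds min_efficiency."""
    values = list(efficiencies)
    if not values:
        raise ValueError("at least one intron is required")
    return sum(e > min_efficiency for e in values) / len(values)


def looks_dna_free(intergenic_reads: int, coding_reads: int,
                   efficiencies: Iterable[float],
                   max_ratio: float = 0.10,
                   min_efficiency: float = 0.90) -> bool:
    """Hypothetical combined check (assumption): low intergenic signal AND
    a majority of introns spliced with high efficiency."""
    ratio_ok = intergenic_to_coding_ratio(intergenic_reads, coding_reads) < max_ratio
    majority_ok = fraction_efficiently_spliced(efficiencies, min_efficiency) > 0.5
    return ratio_ok and majority_ok
```

The default thresholds (10% and 90%) are the values quoted in the text; in practice they would probably need to be adapted to the protocol and organism.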

Defining reference intronic sequences

Sequencing reads that map to intronic intervals may originate from several different sources, such as overlapping genes. These confound measurements of the magnitude of true IR. It is therefore crucial to correctly define the intronic intervals that will be used to measure IR. Here, two main approaches have been adopted (cf. Fig. 1), each requiring careful interpretation and specific processing to avoid false positive detection.

Fig. 1

Defining intronic intervals to be analyzed. Comprehensiveness of transcript annotation and the selection of reference intronic sequences have a major impact on IR detection. In the example, we consider a gene having three possible isoforms (A, B and C). Exons are represented as plain rectangles and introns as thick black lines. If only Isoforms B and C were annotated, the starred interval (*) would not be defined as an intron and most likely not detected as retained. Colored boxes indicate whether the annotated introns match the “all introns” or “independent/measurable intron” criteria used by current algorithms.

The all-introns view

A first possibility is to analyze all intronic intervals present in at least one annotated transcript model [4]. Although this screens the largest set of candidates, it comes at the expense of having to deal with peaks of intronic alignments caused by expressed alternative exons and with redundant IR calls due to overlapping introns.

The measurable introns view

Measurable (or independent) introns are (parts of) introns that do not overlap with any annotated exon [24], [26], [27], [30]. They are obtained by subtracting merged exons from genes. This has the advantage of simplifying the analysis of introns flanked by exons with known alternative donor sites, but ignores introns fully overlapped by annotated exons.
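The subtraction of merged exons from a gene interval can be illustrated with a minimal Python sketch. Coordinates are assumed to be half-open integer intervals on a single strand, and the function names are hypothetical; real tools work from annotation files and handle many genes at once.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end)


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals into a sorted, disjoint list."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def measurable_introns(gene: Interval, exons: List[Interval]) -> List[Interval]:
    """Return the parts of the gene interval not covered by any annotated exon."""
    gene_start, gene_end = gene
    introns: List[Interval] = []
    cursor = gene_start
    for start, end in merge_intervals(exons):
        # Clip exons to the gene boundaries and skip those lying outside.
        start, end = max(start, gene_start), min(end, gene_end)
        if start >= end:
            continue
        if start > cursor:
            introns.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < gene_end:
        introns.append((cursor, gene_end))
    return introns
```

For a gene spanning (100, 1000) with exons from several isoforms at (100, 200), (150, 250), (400, 500) and (800, 1000), this yields the intervals (250, 400) and (500, 800). An intron fully covered by an exon of another isoform disappears from the result, which is the limitation noted above.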

Overcoming sequencing and alignment artifacts

Even when appropriate sequencing protocols have been used, several sources of confusion remain that can only be overcome with computational means. These are detailed below and in Fig. 2.

Fig. 2

Potential sources of bias and confusion: a very unfortunate gene. Only intron 3 is retained in this example. In intron 1, expression of an overlapping feature causes a peak in alignments, which can artificially inflate the estimation of IR. Intronic alignments in intron 2 originate from an unannotated exon. Intron 3 is retained but its detection is hampered by multiple biases. First, the presence of a low mappability region (repeated A sequence in red) would result either in a gap or in high uncertainty in read alignments in that region. Secondly, high GC content in the 5′ exon explains the lack of exon-exon junction reads and 3′ exon-intron reads and may affect filtering and IR metrics based on them. Thirdly, due to its length, the intron tends to be more sparsely covered. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Overlapping features

First, the sequencing protocol may capture molecules overlapping or mapping within introns, such as small nucleolar RNAs, micro RNAs, unannotated exons or alternative 5′ and 3′ splice-sites. They form characteristic peaks of reads within intronic regions that may induce false IR detections and result in inaccurate quantification if not properly identified and filtered.

Repeated regions

Relative to exons, introns are long and poorly conserved, and often contain low mappability regions (e.g., duplicated regions like transposons [31], [32], [33] or repeated regions such as microsatellites [34]), which impair correct estimation of IR levels.

Low coverage of flanking exons

Thirdly, sequencing and alignment artifacts may occur in flanking exons. For example, GC rich exons are under-covered and very small exons are more difficult to map. This may perturb or inflate IR measures, especially those that only measure reads directly surrounding the intron. They may also lead to missed IR events as the junctions are weakly supported.

3′ Coverage bias

PolyA-enriched RNA-seq data usually display a marked 3′ coverage bias, so that most 3′ introns are likely to be more covered and thus more easily captured than 5′ ones. This should be kept in mind for any inference regarding any positional bias of intron retention. In practice, the prevailing strategy for classifying introns relies essentially on user-defined thresholds. Here, we summarize the different strategies used by IR detection tools, and indicate the parameter values suggested by their respective authors when they exist (Fig. 3 and Table 1).

Fig. 3

Standard implementation of computational detection of IR events.

Table 1

Computational tools available to perform IR detection and their main features.

Tool	Year	Publication	Language	Intron definition	IR measure	Low mappability correction	Unknown overlapping events detection
MISO [25]	2010	Nature Methods	Python	Independent introns	PSI	No	No
KMA [26]	2015	arXiv	Python and R	Measurable introns	PSI	No	Coverage analysis (Probabilistic test)
iRead [27]	2017	bioRxiv	Python	Independent introns	FPKM	No	Coverage analysis (Shannon entropy)
IRFinder [4]	2017	Genome Biology	C++	All introns	IRratio	Yes	Coverage analysis (Detection of outlier regions)
IntEREst [24]	2018	BMC Bioinformatics	R	Independent introns	PSI or FPKM	Optional	No
ASDT [28]	2018	ATM	Perl	No (Reference-free)	No	No	Yes
JUM [29]	2018	PNAS	Perl	No (Reference-free)	No	No	Yes

Implemented strategies to pinpoint reliable IR events

Although numerous computational methods have been developed to estimate splicing efficiency and to model sequencing errors that may affect its estimation, we list here only those approaches that specifically cater to the difficulties of detecting IR.
Keep Me Around (KMA) [26] uses the measurable introns approach. Transcripts are quantified using eXpress [35] or Kallisto [36], [37] and a PSI value is computed to evaluate IR levels (cf. Quantifying IR levels). Spurious intronic signals are spotted by finding the longest alignment gap in an intron and calculating the probability of observing such a long gap given the intron’s expression, assuming a uniform read distribution. The authors selected introns with at least three uniquely mapped reads, cumulated TPM values for non-IR transcripts greater than 1 and introns with zero-coverage regions longer than 20% of the intron length [38].
IRFinder [4] screens all introns derived from the annotation. Introns that overlap with any known exon or RNA molecule are marked in the output. A procedure is implemented to identify low mappability regions and exclude them and their reads from the subsequent calculation. Other potential artifacts are handled by discarding bases whose read depth is an outlier compared to the average intron depth. The IRratio and several other complementary metrics are then computed to evaluate support for IR. Suggested parameter values for IRFinder are: IRratio > 0.1, and at least 3 reads supporting intron exclusion on both sides.
In iRead [27], reference introns are provided by the user. Suspect cases are identified by their non-uniform coverage. Read coverage uniformity is quantified by the Shannon entropy of the read distribution along the intron, and low entropy is associated with non-uniform read coverage in the intron. By default, iRead selects introns having FPKM > 3, at least one exon-intron junction read and a normalized entropy score > 0.9.
IntEREst [24] computes FPKM and PSI values for independent introns. Optionally, low mappability regions can be excluded from the calculations. No specific guidelines are provided to select IR events, but the output is formatted for use with several differential analysis methods.
It is worth emphasizing that each approach makes use of pre-fixed threshold values for all intronic regions. Most of these values are defined according to the coverage profile expected for (well-behaved) short, medium-coverage introns, and are perhaps the most straightforward way to guard against the bulk of artefactual detections. However, introns form a highly heterogeneous set of regions, differing hugely in length, inner and flanking coverage and sequence features. For example, on genes having sparse coverage, the chances of observing counts on a very small pre-specified area of the genome are quite low. Therefore, filters on the number of junction reads are likely to exclude most of their introns from further analysis. Very long introns are especially problematic: in practice these regions have little chance of being covered over their full extent, and will generally fail the hard cutoffs set by these algorithms. It is thus clear that no universal threshold can be suitable in all cases, and that any rigid thresholding is likely to introduce a severe selection bias. We thus argue that a sensible choice for the various parameters must be intron-specific and encourage the development and use of models that account explicitly for sequence features and coverage variations.
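To make the coverage-uniformity criterion described above for iRead more concrete, the following Python sketch computes a normalized Shannon entropy from read counts in equal-width bins along an intron and applies the quoted default cutoffs. The binning scheme, the normalization by the logarithm of the number of bins and the function names are assumptions made for illustration; they are not taken from the iRead implementation.

```python
import math
from typing import Sequence


def normalized_coverage_entropy(bin_counts: Sequence[int]) -> float:
    """Shannon entropy of the read distribution across bins, scaled to [0, 1].

    1.0 means perfectly uniform coverage; values near 0 mean that reads
    pile up in very few bins (e.g. a peak from an overlapping feature).
    """
    total = sum(bin_counts)
    n_bins = len(bin_counts)
    if n_bins < 2 or total == 0:
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in bin_counts if c > 0)
    return entropy / math.log(n_bins)


def passes_uniformity_filter(fpkm: float, junction_reads: int,
                             bin_counts: Sequence[int],
                             min_fpkm: float = 3.0,
                             min_junction_reads: int = 1,
                             min_entropy: float = 0.9) -> bool:
    """Hypothetical hard-threshold filter using the cutoffs quoted in the text."""
    return (fpkm > min_fpkm
            and junction_reads >= min_junction_reads
            and normalized_coverage_entropy(bin_counts) > min_entropy)
```

An intron with counts such as (5, 5, 5, 5) reaches the maximum score, whereas (0, 0, 40, 0) scores zero and is rejected whatever its FPKM. The sketch also shows the limitation discussed above: a long, sparsely covered intron with many empty bins can fail such a fixed cutoff even when it is genuinely retained.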

Quantifying IR levels

Though essential, devising a computable and robust metric that reflects “splicing efficiency” or, conversely, the level of IR can be difficult. Three types of alignments should be taken into account [18]. Intronic reads and reads that span the flanking exons are informative of the level of IR. In addition, all the remaining alignments can indicate how reliable the sequencing data is and what degree of confidence we may have in each IR event. The most commonly used ratios to quantify IR are the percentage spliced-in and the IR ratio, both described below.

Percentage spliced-in

Alternative splicing event frequencies are commonly quantified by the percentage spliced-in (PSI) ratio [29]. An intronic version has been suggested [19] as the number of reads supporting the retention of the intron against the number of reads supporting its exclusion. In practice, a transcript-level quantification is performed using an annotation of IR-free isoforms augmented with independent introns (taken as dummy transcripts). The PSI for a given intron can be formulated as:
$\mathrm{PSI} = \frac{\text{Intron abundance}}{\text{Intron abundance} + \sum_{t} \text{Transcript abundance}_{t}}$
where the sum is performed across all annotated transcripts t of the same gene not retaining the intron.
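As an illustration of the intronic PSI, the sketch below computes the ratio from transcript-level abundances, assuming that the intron was quantified as a dummy transcript alongside the IR-free isoforms of its gene. The function name and the use of TPM values as the abundance unit are assumptions.

```python
import math
from typing import Iterable


def intron_psi(intron_abundance: float,
               spliced_transcript_abundances: Iterable[float]) -> float:
    """Intron abundance divided by the summed abundance of the intron and of
    all annotated transcripts of the gene that do not retain it."""
    denominator = intron_abundance + sum(spliced_transcript_abundances)
    if denominator == 0:
        return math.nan  # gene not expressed: PSI is undefined
    return intron_abundance / denominator
```

For example, an intron quantified at 2 TPM in a gene whose two IR-free isoforms are quantified at 5 and 3 TPM gets a PSI of 2 / (2 + 5 + 3) = 0.2.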

IR ratio

This metric is meant to reflect splicing efficiency as the portion of informative reads that come from a transcript retaining the intron, that is:
$\mathrm{IRratio} = \frac{\text{Intronic abundance}}{\text{Intronic abundance} + \text{Normal splicing abundance}}$
where Intronic abundance is measured by the median [4] or average [39] intron depth. The abundance of normal splicing is taken as the number of reads spliced across the intron.
These ratios tend to show high fluctuations and their behavior is difficult to model. This may explain why, so far, no approach has been developed to estimate dispersions and confidence intervals. Importantly, this hinders the identification of robust and reproducible patterns based on their observed values. Although these metrics can be employed, as a proxy for splicing efficiency, to call manifest IR events, additional statistics are required to infer intra- and cross-sample variation levels.
One of the major difficulties in quantifying splicing efficiency is that the exons flanking an intron may connect not only to each other but also to other exons from the same gene to form different isoforms. This hampers the estimation of the portion of reads to attribute to the transcripts in which the intron is spliced. The two measures presented above (PSI and IRratio) address this problem differently. To overcome global variations in gene coverage caused by alternative exon usage, the strategy behind the IRratio is to only make use of the junction-crossing reads that hit either of the two exons flanking the intron. The maximum value between the left and right quantities is then taken as a means to account for the existence of multiple isoforms that connect to the flanking exons. However, the number of junction reads tends to be highly dispersed when coverage is high, and to take zero values when coverage is low. This may affect estimation accuracy. On the other hand, by using information across the whole transcript to evaluate the gene coverage, the PSI estimator might be more resistant to these local variations.
Nonetheless, it would be of interest to assess the quality of the PSI estimates on genes which undergo manifold alternative splicing events [40], [41].
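The IR ratio described above can be sketched as follows, assuming per-base intron depths (after any masking of artifact-prone positions) and counts of spliced reads leaving the left and right flanking exons are already available. The function name and argument layout are hypothetical; only the use of the median depth and of the maximum of the two junction counts follows the description in the text.

```python
import math
from statistics import median
from typing import Sequence


def ir_ratio(intron_depths: Sequence[float],
             left_junction_reads: int,
             right_junction_reads: int) -> float:
    """Intronic abundance over intronic abundance plus normal splicing abundance.

    Intronic abundance: median per-base depth across the intron.
    Normal splicing abundance: the larger of the spliced-read counts
    observed at the left and right flanking exons.
    """
    intronic = median(intron_depths) if intron_depths else 0.0
    spliced = max(left_junction_reads, right_junction_reads)
    denominator = intronic + spliced
    if denominator == 0:
        return math.nan  # no informative reads at all
    return intronic / denominator
```

With depths (2, 3, 3, 4, 50) and 20 and 27 spliced reads on the two sides, the median depth is 3 and the ratio is 3 / (3 + 27) = 0.1. The isolated depth of 50, which could come from an overlapping feature, does not move the median, whereas an average-based abundance would be strongly inflated by it.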

Cross-sample comparison

Inferring differences in IR between conditions necessitates a statistical framework to combine biological replicates, assess dispersion of IR level estimates and control the false positive rate. Moreover, sample read abundances need to be normalized to account for variations in library size [42]. In addition, the coverage depth of an intron is correlated with its gene coverage; sample comparison thus necessitates strategies to control for differences in gene expression [23]. To our knowledge, four implemented frameworks currently fulfill these requirements (cf. Table 2). With regard to their statistical methodology, we split them into three families of approaches.

Table 2

Available computational methods to perform IR differential analysis.

Method	Year	Language	IR-specific	IR measure	Normalization For library size	Control for gene expression	Modeling of biological variability	Statistical Framework
edgeR-IR*	2010	R	No/Yes**	Intron bin count	TMM (ref)	No/Yes**	Yes	Generalized Linear Model
DESeq2-IR*	2014	R	No/Yes**	Intron bin count	Variance estimation and rescaling (ref)	No/Yes**	Yes	Generalized Linear Model
DEXSeq-IR*	2012	R	No/Yes**	Intron bin count	Variance estimation and rescaling (ref)	Yes	Yes	Generalized Linear Model
iDiffIR	2018	Python	Yes	Average per base read coverage	TMM (ref)	Yes	Yes	LogFC statistic and Z-test

* These refer to IR-tuned versions of existing software, and may require custom pre-processing.

** After IR-specific tuning.

Intron-bin count-based methods

The first two approaches re-use existing methods that were primarily devised either for gene expression [43], [44] or exon usage [30], [45] analyses, after some reworking of the data to adapt them to IR. They are count-based. To adjust for differences in library size, a gene-wise normalization factor is determined and applied to all (exon and intron) bins, as in usual gene expression differential analyses. The authors of ASpli suggest adjusting each intron bin count B_{i}, in each sample s, by biological condition through:
$\tilde{B}_{i}^{s} = B_{i}^{s} \times \frac{\bar{G}}{G_{Condition}}$
where G_{Condition} is the average gene count in the condition of sample s and \bar{G} the average gene count across all samples. Classical testing procedures (either of edgeR or DESeq2) are then applied to these adjusted counts to infer a set of differential introns.
In the DEXSeq-IR method, for each intron, two count bins are considered: the intron bin and the union of all the remaining bins. The average bin count is modeled via a negative binomial generalized linear model with an interaction term. In more detail, for an intron indexed by i in a sample j, the mean count \mu_{ijl} of bin l is modeled as:
$\log \mu_{ijl} = \beta_{Sample}^{ij} + l \, \beta_{Bin}^{i} + l \, \beta_{Condition}^{i,c(j)}$
where l = 1 for the intron bin and 0 otherwise, and c(j) denotes the condition of sample j. The sample parameter β_{Sample} adjusts internally for the gene expression level, and differences in intron usage between conditions are inferred by testing whether the interaction term is significantly different from zero.
As described previously, to limit spurious noise in intronic read counts, it may be worth further refining intron bin counts by removing reads that map to artifact-prone intervals. Several detection tools already output these corrected read counts [4], [24].
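The gene-level adjustment of intron bin counts described above for ASpli can be illustrated with a short Python sketch. It assumes that the adjustment rescales each intron bin count by the ratio of the average gene count across all samples to the average gene count within the sample's condition; this reading, the function name and the argument layout are assumptions and do not reproduce the ASpli code.

```python
from statistics import mean
from typing import Dict, List, Sequence


def adjust_intron_counts(bin_counts: Sequence[float],
                         gene_counts: Sequence[float],
                         conditions: Sequence[str]) -> List[float]:
    """Rescale per-sample intron bin counts to remove condition-level
    differences in the expression of the host gene.

    All three arguments are per-sample sequences given in the same order.
    """
    overall_mean = mean(gene_counts)
    condition_mean: Dict[str, float] = {
        c: mean([g for g, cond in zip(gene_counts, conditions) if cond == c])
        for c in set(conditions)
    }
    adjusted = []
    for b, c in zip(bin_counts, conditions):
        if condition_mean[c] == 0:
            adjusted.append(0.0)  # gene not expressed in this condition
        else:
            adjusted.append(b * overall_mean / condition_mean[c])
    return adjusted
```

With intron counts (10, 10, 40, 40), gene counts (100, 100, 400, 400) and conditions (A, A, B, B), every adjusted count equals 25: the fourfold increase in intronic reads is entirely explained by the fourfold increase in gene expression, so no differential retention would be called on these adjusted counts.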

Average intron coverage method

The third and most recent approach, iDiffIR (implemented but not yet published, https://bitbucket.org/comp_bio/idiffir, [39], [46]), is primarily designed for IR. IR levels are quantified, from uniquely mapped genomic alignments, by the average per-base read coverage over the intron interval. To account for library size, TMM normalization [47] is applied separately to intron and exon per-base read counts. Per-base counts are further normalized to force overall gene coverage to be equal across conditions. The test statistic is a corrected log fold change between the average read coverage in each of the conditions compared:
$\log FC = \log \frac{a + \bar{c}_{1}}{a + \bar{c}_{2}}$
where \bar{c}_{1} and \bar{c}_{2} denote the average intron read coverage in the two conditions. The correction parameter a is a pseudo-count whose value is chosen to minimize the log fold change and control large values caused by lowly covered introns.
The biological intuition of splicing efficiency translates quantitatively as the proportion of emitted transcripts that retain an intron. However, owing to short read length and the rarity of IR events, this frequency cannot be reliably estimated from RNA-seq data. Metrics that can actually be computed (e.g., PSI, IRratio) are only proxies, and their properties and meaning are still poorly understood. Importantly, it is not certain whether intronic expression can be compared based on these metrics. All the problems discussed previously in the measurement of IR levels have direct repercussions on the detection of differences in IR levels between samples. Ironically, despite numerous efforts to quantify IR events, the cases best suited to most of these approaches remain short introns with high coverage, that is, exon-like introns.
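The effect of the pseudo-count in the corrected log fold change described above for iDiffIR can be seen in a small Python sketch. The base-2 logarithm, the default pseudo-count of 1 and the function name are assumptions; the selection of the pseudo-count value by the tool is not reproduced here.

```python
import math


def corrected_log_fold_change(coverage_1: float, coverage_2: float,
                              pseudo_count: float = 1.0) -> float:
    """Log2 fold change between average intron coverages in two conditions,
    with a pseudo-count added to both terms to damp low-coverage noise."""
    if pseudo_count <= 0:
        raise ValueError("pseudo_count must be positive")
    return math.log2((coverage_1 + pseudo_count) / (coverage_2 + pseudo_count))
```

With a pseudo-count of 1, average coverages of 7 and 1 give log2(8 / 2) = 2. For a lowly covered intron with coverages of 0.2 and 0.02, the uncorrected ratio would suggest a tenfold change, whereas the corrected statistic is about 0.23, which keeps such introns from dominating the ranking.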

Discussion

The recent interest in expressed introns has led to a flourishing number of examples of regulation through intron retention. The accurate detection of retained introns and precise measurement of intronic expression are crucial to these studies. Numerous factors impede the detection of IR from next generation sequencing data. Introns are much longer than exons and thus have a much higher probability of containing overlapping features that may confound the estimation of intronic expression. In addition, introns are enriched in low complexity and repeat sequences that may prevent sequencing data from being uniquely mapped. These factors must be accounted for when detecting IR events. Most computational approaches, however, will introduce a selection bias, as only introns with sufficient coverage can be detected and the statistical power to detect differences between conditions increases with coverage depth and read count [48], [49]. As a result of this bias, gene enrichment tests of genes derived from IR signatures [50] are heavily skewed towards the more highly expressed genes and towards introns that do not contain these confounding features. Despite the recent results that demonstrate a crucial role for IR, very few IR events have been validated in the wet lab, and amongst these an even smaller portion have been investigated for their functional impact. As a consequence, no reliable benchmark of IR detection or differential intronic expression has been published. This lack of reliable controls is, however, temporary, because long read technologies capable of sequencing entire IR transcripts should help resolve most of the detection problems. For now, due to their low coverage, these technologies are far from allowing a comprehensive detection of IR events and even further from allowing a reliable quantification of IR levels between different tissues.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

48 in total

Review 1. Intron retention in mRNA: No longer nonsense: Known and putative roles of intron retention in normal and disease biology.

Authors: Justin J-L Wong; Amy Y M Au; William Ritchie; John E J Rasko
Journal: Bioessays Date: 2015-11-27 Impact factor: 4.345

Review 2. The functional consequences of intron retention: alternative splicing coupled to NMD as a regulator of gene expression.

Authors: Ying Ge; Bo T Porse
Journal: Bioessays Date: 2013-12-18 Impact factor: 4.345

3. Orchestrated intron retention regulates normal granulocyte differentiation.

Authors: Justin J-L Wong; William Ritchie; Olivia A Ebner; Matthias Selbach; Jason W H Wong; Yizhou Huang; Dadi Gao; Natalia Pinello; Maria Gonzalez; Kinsha Baidya; Annora Thoeng; Teh-Liane Khoo; Charles G Bailey; Jeff Holst; John E J Rasko
Journal: Cell Date: 2013-08-01 Impact factor: 41.582

4. Genome-wide characterization of the routes to pluripotency.

Authors: Samer M I Hussein; Mira C Puri; Peter D Tonge; Marco Benevento; Andrew J Corso; Jennifer L Clancy; Rowland Mosbergen; Mira Li; Dong-Sung Lee; Nicole Cloonan; David L A Wood; Javier Munoz; Robert Middleton; Othmar Korn; Hardip R Patel; Carl A White; Jong-Yeon Shin; Maely E Gauthier; Kim-Anh Lê Cao; Jong-Il Kim; Jessica C Mar; Nika Shakiba; William Ritchie; John E J Rasko; Sean M Grimmond; Peter W Zandstra; Christine A Wells; Thomas Preiss; Jeong-Sun Seo; Albert J R Heck; Ian M Rogers; Andras Nagy
Journal: Nature Date: 2014-12-11 Impact factor: 49.962

5. A scaling normalization method for differential expression analysis of RNA-seq data.

Authors: Mark D Robinson; Alicia Oshlack
Journal: Genome Biol Date: 2010-03-02 Impact factor: 13.583

6. A benchmark for RNA-seq quantification pipelines.

Authors: Mingxiang Teng; Michael I Love; Carrie A Davis; Sarah Djebali; Alexander Dobin; Brenton R Graveley; Sheng Li; Christopher E Mason; Sara Olson; Dmitri Pervouchine; Cricket A Sloan; Xintao Wei; Lijun Zhan; Rafael A Irizarry
Journal: Genome Biol Date: 2016-04-23 Impact factor: 13.583

7. An NXF1 mRNA with a retained intron is expressed in hippocampal and neocortical neurons and is translated into a protein that functions as an Nxf1 cofactor.

Authors: Ying Li; Yeou-Cherng Bor; Mark P Fitzgerald; Kevin S Lee; David Rekosh; Marie-Louise Hammarskjold
Journal: Mol Biol Cell Date: 2016-10-05 Impact factor: 4.138

8. Computational approaches for isoform detection and estimation: good and bad news.

Authors: Claudia Angelini; Daniela De Canditiis; Italia De Feis
Journal: BMC Bioinformatics Date: 2014-05-09 Impact factor: 3.169

9. IntEREst: intron-exon retention estimator.

Authors: Ali Oghabian; Dario Greco; Mikko J Frilander
Journal: BMC Bioinformatics Date: 2018-04-11 Impact factor: 3.169

10. Abiotic Stresses Modulate Landscape of Poplar Transcriptome via Alternative Splicing, Differential Intron Retention, and Isoform Ratio Switching.

Authors: Sergei A Filichkin; Michael Hamilton; Palitha D Dharmawardhana; Sunil K Singh; Christopher Sullivan; Asa Ben-Hur; Anireddy S N Reddy; Pankaj Jaiswal
Journal: Front Plant Sci Date: 2018-02-12 Impact factor: 5.753
