Literature DB >> 33085659

Evaluation of both exonic and intronic variants for effects on RNA splicing allows for accurate assessment of the effectiveness of precision therapies.

Anya T Joynt1, Taylor A Evans1, Matthew J Pellicore1, Emily F Davis-Marcisak1, Melis A Aksit1, Alice C Eastman1, Shivani U Patel2, Kathleen C Paul1, Derek L Osorio1, Alyssa D Bowling1, Calvin U Cotton3, Karen S Raraigh1, Natalie E West2, Christian A Merlo2, Garry R Cutting1, Neeraj Sharma1.   

Abstract

Elucidating the functional consequence of molecular defects underlying genetic diseases enables appropriate design of therapeutic options. Treatment of cystic fibrosis (CF) is an exemplar of this paradigm as the development of CFTR modulator therapies has allowed for targeted and effective treatment of individuals harboring specific genetic variants. However, the mechanism of these drugs limits effectiveness to particular classes of variants that allow production of CFTR protein. Thus, assessment of the molecular mechanism of individual variants is imperative for proper assignment of these precision therapies. This is particularly important when considering variants that affect pre-mRNA splicing, thus limiting success of the existing protein-targeted therapies. Variants affecting splicing can occur throughout exons and introns and the complexity of the process of splicing lends itself to a variety of outcomes, both at the RNA and protein levels, further complicating assessment of disease liability and modulator response. To investigate the scope of this challenge, we evaluated splicing and downstream effects of 52 naturally occurring CFTR variants (exonic = 15, intronic = 37). Expression of constructs containing select CFTR intronic sequences and complete CFTR exonic sequences in cell line models allowed for assessment of RNA and protein-level effects on an allele by allele basis. Characterization of primary nasal epithelial cells obtained from individuals harboring splice variants corroborated in vitro data. Notably, we identified exonic variants that result in complete missplicing and thus a lack of modulator response (e.g. c.2908G>A, c.523A>G), as well as intronic variants that respond to modulators due to the presence of residual normally spliced transcript (e.g. c.4242+2T>C, c.3717+40A>G). Overall, our data reveals diverse molecular outcomes amongst both exonic and intronic variants emphasizing the need to delineate RNA, protein, and functional effects of each variant in order to accurately assign precision therapies.

Entities:  

Mesh:

Substances:

Year:  2020        PMID: 33085659      PMCID: PMC7605713          DOI: 10.1371/journal.pgen.1009100

Source DB:  PubMed          Journal:  PLoS Genet        ISSN: 1553-7390            Impact factor:   5.917


Introduction

Splicing of the pre-mRNA to produce mature transcript is a complex process that depends on specific sequence motifs [1-6]. Thus, it is not surprising that genetic variants can disrupt this process through a variety of mechanisms including weakening canonical splice sites, activation of cryptic splice sites, and alteration of splice regulatory sequences. While some of this variation is tolerated and preserves normal splicing to varying degrees (e.g. GC donor splice sites [, many of these variants completely preclude production of full-length protein. In fact, genetic variants that impact splicing have been estimated to account for anywhere from 10% to 50% of disease causing variants [10-12]. The disease liability of variants affecting the canonical splice sites (5’GT and 3’AG) has been long understood since early identification in β-thalassemia and phenylketonuria [7, 13, 14]. While variants that act on splicing in different positions are more varied in their consequences, they are well documented causes of a variety of inherited conditions including β-thalassemia [13], Duchenne muscular dystrophy (DMD) [15], Hutchinson-Gilford progeria syndrome (HGPS) [, and cystic fibrosis (CF) [17-20]. CF is an autosomal recessive, multisystem disorder affecting approximately 70,000 individuals worldwide and is caused by variants in the cystic fibrosis transmembrane conductance regulator (CFTR) gene, which encodes an epithelial chloride channel [21]. Key phenotypic features include elevated sweat chloride (>60mmol/L), diminished lung function, high risk of respiratory infections, and pancreatic insufficiency [21]. Allelic heterogeneity makes CF an informative model for the study of genetic disease as ~2,000 variants (CFTR Mutation Database) have been identified in the CFTR gene, ~10% of which are thought to impact splicing (CFTR Mutation Database). Notably, the severity of clinical presentations of CF have been well correlated with the deleteriousness of CFTR variants [22]. Identifying the disease liability of variants and their specific mechanism of disease has become critical as molecular therapies for CF target the underlying molecular defect, thus applying to select individuals depending on the genetic variants they harbor [21]. As these treatments expand to include more and more individuals with variants affecting protein processing and function, understanding which variants allow for production of protein becomes imperative. While these drugs, known as CFTR modulators, were initially developed with specific and generally more common variants in mind (i.e. F508del, G551D), label expansion has since allowed for treatment of additional CFTR variants [23, 24]. Notably, in vitro data has been used in support of these expansions when in vivo data cannot be obtained [25]. The recent development of a highly effective triple-combination therapy (Trikafta) has expanded treatment to include 90% of individuals with CF [26]. However, the remaining 10% include individuals with rare exonic and intronic variants that have yet to be thoroughly characterized. The drug response of these variants is unknown and unlikely to be evaluated in a clinical trial setting. Therefore, it is paramount that in vitro data accurately reflects in vivo circumstances to allow for expansion of these therapies as well as determination of which individuals will require alternative treatment options. Typical in vitro assessment of CFTR variants has focused on cDNA based systems allowing only protein processing and function to be evaluated [27-30]. While some variants have been considered for RNA-level effects, hybrid minigene systems are often employed, which only allow for assessment of splicing [31-33]. We and others have previously established expression minigenes (EMGs) as a useful model for studying RNA and protein simultaneously [34-39]. Assessment of the downstream consequences of missplicing allows for precise determination of which variants may be eligible for modulator treatment. While ‘missense’ variants are often assumed to cause disease through an impact at the protein-level, and thus assumed to respond to modulators, intronic variants can be incorrectly assumed to allow for no protein production at all and thus no drug response. Here we consider both of these mechanisms and focus on evaluating individual nucleotide changes from RNA-level impacts to ultimate functional consequences in order to assess therapeutic options. By studying individual variants in heterologous expression systems, we determine molecular mechanism and drug response on an allele by allele basis. Complementary assessment in primary nasal epithelial cells derived from individuals with CF allows for confirmation of our in vitro data.

Results

Predicted effects of exonic variants overlook the impact of single nucleotide changes at the RNA level

Exonic variants resulting in a single nucleotide change that could produce an amino acid substitution are typically assumed to be missense variants and are often referred to by their predicted effect at the protein level [40, 41], rather than by their HGVS designation (which we use here for clarity) [42]. This system of evaluating variants overlooks the potential for these nucleotide changes to have RNA-level effects. Here we evaluate exonic CFTR variants for effects on mRNA splicing. Given that the consensus splice sites extend into the exons, we chose naturally occurring CFTR variants located in the first and second (n = 6 beginning of exon variants) as well as penultimate and last (n = 5 end of exon variants) nucleotide of an exon. Additionally, we evaluated the splice-defect potential of CFTR exonic variants that were predicted by the splice algorithm CryptSplice [37], but not located at an intron-exon junction (n = 4 ‘middle’ of exon variants). In total, we chose 15 exonic variants for study, all of which were subsequently evaluated by both CryptSplice and an additional splice prediction tool (SpliceAI, [43]). One variant was assessed in primary human nasal epithelial (HNE) cells and 14 were introduced into expression minigenes (EMGs). EMGs are plasmids containing all of the exons of CFTR and select full-length or abridged CFTR intronic sequences (). These constructs allow for evaluation of the effects of CFTR variants on mRNA splicing of adjacent introns and have been previously shown to faithfully recapitulate in vivo mRNA splicing [35, 36, 38].

Graphical representation of CFTR gene and location of variants analyzed.

Black boxes indicate exons and their size relative to other exons; lines between represent introns that are not to scale. Brackets indicate regions encompassed by expression minigenes (EMG, i: intron). Dots show location of variants studied (green: intronic, blue: exonic). After verifying that the EMGs used in this study generate normally spliced RNA, we used site-directed mutagenesis to introduce 14 exonic variants into relevant EMGs and evaluated their effect on splicing by transfecting HEK293 cells. RT-PCR and Sanger sequencing revealed normal splicing of six variants, while eight variants resulted in production of misspliced CFTR mRNA isoforms (). Of these eight variants, one was located in the first nucleotide of an exon (c.274G>A), three were located in the middle (outside the consensus splice site) of an exon (c.454A>G, c.523A>G, c.2816A>G), and four were located in the last nucleotide of an exon (c.2908G>C, c.2908G>A, c.3717G>C, c.3873G>C). We did not find any variants in the second or penultimate nucleotide of an exon that resulted in misspliced products (c.581G>T, c.3719T>G or c.3872A>G, c.166G>A). Overall, two of the eight misspliced variants produced both misspliced and normally spliced transcript (c.274G>A, c.3873G>C), allowing for production of the full-length protein isoform that would be predicted by looking at the nucleotide substitution as a missense variant (). While misspliced transcript was the major isoform observed for c.3873G>C (studied in detail below), c.274G>A favored the normally spliced product (). Additionally, we studied one exonic variant known to missplice (c.3700A>G, [44]) in HNE cells collected from an individual homozygous for this variant and identified the same misspliced isoform as previously reported, but we also discovered very low levels of normally spliced transcript (). Comparison of experimental outcomes to predictions by in silico tools identified 12/15 exonic variants that were accurately predicted (), with one false negative incorrectly called by both algorithms and two variants deemed “indeterminate” due to a lack of consensus between the two tools (). The numbers shown in the superscript indicate previously published study on the respective variant. Data supporting these findings in . *A prediction was deemed “indeterminate” if the two in silico tools used did not agree (). †This variant is approved for modulator therapy. ‡The primary RNA isoform produced by this variant is normally spliced ().

‘Missense’ variants resulting in only misspliced mRNA isoforms may be misclassified as drug responsive

Given our observation that 6/15 exonic variants studied resulted in complete missplicing, we hypothesized that these ‘missense’ variants could be misclassified as modulator responsive if small molecule drugs were tested on the predicted protein isoforms containing an altered amino acid, as opposed to those generated by the misspliced transcripts. This was of particular concern for CFTR bearing the variant c.2908G>C (predicted effect G970R), which was shown to respond well to ivacaftor in vitro, whereas individuals with CF carrying this variant that were entered into a clinical trial did not respond [28, 45]. Indeed, c.2908G>C was recently reported to alter CFTR mRNA splicing in primary cells, thus explaining the lack of response [46]. Therefore, study of these variants also provided an opportunity to test the ability of EMGs to replicate what has been observed in vivo, and extend splicing analysis to an additional variant in this codon. To this end, we introduced three known CF-causing variants that affect this codon 970 into an EMG containing full-length intron 14, abridged intron 15, full-length intron 16, and abridged introns 17 and 18 (EMG_i14-i18, , left). To quantify the ratio of the CFTR mRNA isoforms, we performed RT-PCR using a FAM-6 tagged primer followed by fragment analysis. Interestingly, the c.2909G>A (predicted effect G970D) variant at the beginning of exon 18 had no effect on CFTR splicing (, right). In contrast, the two variants that alter the last nucleotide of exon 17, c.2908G>A (predicted effect G970S) and c.2908G>C (predicted effect G970R), resulted in missplicing and complete absence of full-length RNA transcript. For both of these variants, the change in the last nucleotide of the exon disrupted the 5’ consensus splice donor site resulting in either complete skipping of exon 17 ( left, isoform 1) or partial skipping of exon 17 (177-nucleotide in-frame deletion) through use of a cryptic splice donor within the exon ( left, isoform 2). As compared to WT EMG_i14-i18 and c.2909G>A EMG_i14-i18, which both produced 100% normally spliced transcript, we found that c.2908G>C EMG_i14-i18 resulted in 78.5%±3.2 of CFTR transcript skipping exon 17 and 21.5%±3.2 of transcript harboring the 177-nucleotide deletion. In contrast, c.2908G>A EMG_i14-i18 favored the 177-nucleotide deletion isoform with 68.3%±2.7 of this product versus 31.1%±2.7 of the shortened isoform (, right).

Variants at codon 970 have different effects on mRNA splicing, protein processing and function of CFTR.

(A) c.2908G>C (predicted effect G970R) and c.2908G>A (predicted effect G970S) alter mRNA splicing, while c.2909G>A (predicted effect G970D) allows synthesis of normally spliced Schematic showing EMG_i14-i18 bearing c.2908G>A or c.2908G>C and CFTR mRNA isoforms produced. Right panel. Fragment analysis of RT-PCR products labelled with 6-FAM reveals the relative abundance of misspliced isoforms of CFTR transcript for EMG_i14-i18 bearing c.2908G>C or c.2908G>A. EMG_i14-i18 bearing c.2909G>A and wildtype (WT) EMG_i14-i18 served as controls. RNA isoform quantity was determined from the area under the curve (AUC) for each RT-PCR (minimum of three technical replicates per construct). The percent shown for each isoform is relative to the sum of the AUC for both isoforms. Error bars represent standard deviations. (B) EMGs with c.2908G>C and c.2908G>A do not produce processed CFTR protein, while c.2909G>A does. Immunoblot of protein lysates from transiently transfected HEK293 cells showing CFTR on top, with Na+,K+-ATPase on bottom as a loading control. Band C is mature, complex glycosylated CFTR protein. Band B is immature, core glycosylated CFTR protein. Negative control is an empty vector plasmid. WT cDNA, F508del cDNA, and WT EMG_i14-i18 served as controls. WT EMG_i14-i18 and EMG_i14-i18 bearing c.2909G>A were loaded at a reduced concentration (10%) as compared to other lysates to facilitate better visualization of less abundant isoforms in adjacent lanes (C) G970D-CFTR generates reduced levels of chloride transport, which is augmented by ivacaftor and lumacaftor treatment. Representative tracings for short circuit current (Isc) assay of CFTR channel function and CFTR modulator response performed on CF bronchial epithelial (CFBE) cells stably expressing CFTR cDNA bearing G970D. Inset- Quantification of change in current (ΔIsc) in response to modulators (minimum of two independent measurements per condition). Data shown as mean±SD. p value determined by one-way ANOVA. *** (p≤0.001), **(p≤0.01), n.s. (not significant, p>0.05) when compared to DMSO treated vehicle control. (D) The Schematic showing EMG_i1-i5 bearing c.523A>G and resulting misspliced CFTR mRNA. Bottom panel. Sanger sequencing results confirm truncation of CFTR exon 5 at c.523 followed by the beginning of CFTR exon 6. (E) Representative tracing for short circuit current (Isc) assay of CFTR channel function performed on CFBE cells stably expressing the c.523A>G cDNA construct. Inset- Comparison of quantification of change in current (ΔIsc) measured for CFBE cells stably expressing either c.523A>G EMG construct or c.523A>G cDNA construct. Data shown as mean±SD. Data underlying graphs in this figure reported in . Of the two splice isoforms observed, we would expect the exon 17 skipped isoform (isoform 1) to result in a frame-shift (Asn886Lysfs5Ter), while the partial exon skipping isoform (isoform 2) would result in an in-frame deletion (Ser912_Gly970del). We have previously shown that frameshifts in this region of the CFTR gene do not result in stable CFTR protein [38], however we wanted to verify that the 177-nucleotide deletion isoform would allow for production of shortened CFTR. To this end, we performed immunoblotting of protein lysates extracted from HEK293 cells transfected with EMG constructs. To assess CFTR protein processing in the absence of splicing effects, we also introduced each variant to a plasmid expressing CFTR cDNA. CFTR processing was assessed by comparing lower molecular weight, core glycosylated immature (band B) protein to the higher molecular weight, complex glycosylated mature (band C) protein. WT cDNA and WT EMG_i14-i18 controls produced predominantly mature CFTR, as well as some immature protein due to overexpression from a constitutive promoter (, lanes 1 and 4). As expected, F508del produced immature protein (, lane 2). Expression of cDNA (that does not require splicing) bearing c.2908G>C (G970R), c.2908G>A (G970S), or c.2909G>A (G970D), generated mature CFTR (, lanes 6,8,10). In contrast, c.2908G>C and c.2908G>A in an EMG (that requires splicing) produced protein of a lower molecular mass than immature, core glycosylated (band B) of their respective cDNA counterparts. This protein isoform is consistent with the predicted 59 amino acid deletion (p.Ser912_Gly970del) that partial exon skipping (177-nucleotide deletion) would produce and cannot be attributed to the exon 17 skipped isoform as the CFTR antibody recognizes an epitope downstream of where this isoform would be expected to truncate. As expected from our observation of normal splicing, c.2909G>A in an EMG produced mature CFTR (, lane 9). To determine whether the mature CFTR protein bearing c.2909G>A (p.Gly970Asp) was functional, we created CF bronchial epithelial (CFBE) cell lines that stably expressed C.2909G>A cDNA from a single integration site as previously described [47]. Cells were grown in monolayers on filters to allow for polarization and mounted in Ussing chambers to assess CFTR chloride ion channel activity by measuring short circuit current (Isc). Addition of forskolin was used to activate channel activity via cAMP-mediated signaling, followed by inhibition with the CFTR-specific inhibitor compound (Inh-172), allowing for quantification of the CFTR-specific change in current (ΔIsc±SD). We observed residual channel activity corresponding to approximately 3% of what we observe in our WT CFTR cDNA cell lines after normalizing for variation in expression (calculated as previously described, [30]). Treatment with the CFTR corrector, lumacaftor, for 24 hours resulted in a moderate increase in channel activity corresponding to a ~1.4-fold change (ΔIsc = 19.55μA/cm2 ±3.62 as compared to a DMSO treated control with ΔIsc = 13.56μA/cm2 ±1.54, black line compared to blue line). Acute treatment with the CFTR potentiator, ivacaftor, also increased channel activity by about 3.6-fold (ΔIsc = 48.5μA/cm2 ±8.31, green line). The combination therapy of 24hr lumacaftor treatment followed by acute ivacaftor treatment resulted in a greater increase than either modulator alone (ΔIsc = 69.8μA/cm2 ±16.1, red line) with a ~5.1-fold increase over DMSO control. Together, these results indicate that only one of three variants at codon 970 predicted to substitute an amino acid allows normal RNA splicing and production of mutant CFTR protein that responds to modulators (c.2909G>A (G970D)). Previous studies have shown that individuals bearing c.2908G>C (G970R) do not respond to modulator treatment [45] and here we have shown that c.2908G>A (G970S) produces the same misspliced mRNA isoforms and would thus be expected to show no response. All four of the ‘middle’ of exon variants were predicted and subsequently shown experimentally to also result in production of predominately misspliced mRNA isoforms. To assess the feasibility of modulator treatment for this type of variant, we performed an in-depth study of c.523A>G (predicted effect I175V). We were particularly interested in this variant because previous studies had indicated that individuals harboring c.523A>G have severe disease [48], while functional testing had an opposite result indicating high levels of function for CFTR protein with the I175V amino acid substitution [49]. We introduced c.523A>G to an EMG containing abridged introns 1–4 and full-length intron 5 (EMG_i1-i5, , top) to determine if missplicing could explain the discrepancy between these two reports, and if the outcome would impact drug response. Transcript sequencing revealed that c.523A>G activated a cryptic 5’ splice site within exon 5 of CFTR that caused an in-frame deletion encompassing the last 57 nucleotides of exon 5 (, bottom). Normally spliced transcript could not be detected. We then wanted to know if the shortened CFTR protein expected to be produced by c.523A>G () would have any residual channel activity, and how this would compare to the function of CFTR protein harboring the I175V amino acid substitution if splicing was not considered. To this end, we created CFBE cell lines that stably expressed c.523A>G on the EMG background and on the cDNA background in order to perform functional testing. While the cDNA bearing c.523A>G generated robust CFTR chloride currents, cells expressing the c.523A>G EMG produced minimal current corresponding to <0.5% of CFTR function observed in a WT EMG_i1-i5 cell line (). Thus, missplicing explains the severe phenotype associated with c.523A>G. In contrast to these EMG results, CFBEs expressing c.523A>G on a cDNA background, representing the effect of the amino acid substitution alone (p.Ile175Val), produced robust forskolin activated and inh-172 inhibited current characteristic of CFTR (). The magnitude of current was comparable to about 38% of WT. Additionally, we found that CFTR bearing I175V was responsive to modulators (). These results show that the amino acid substitution I175V allows for production of modulator responsive CFTR, illustrating that study of this variant on a cDNA background would result in incorrect classification of this variant as modulator responsive.

Residual normal splicing of an exonic variant produces functional, modulator responsive CFTR protein

To study the effect of exonic variants resulting in partial missplicing, we chose c.3873G>C (predicted effect Q1291H) located at the last nucleotide of CFTR exon 23. We were interested in this variant due to variability in disease severity between individuals harboring this variant and previous reports that it affects CFTR mRNA splicing in vivo [50]. Two CFTR mRNA isoforms are produced when c.3873G>C is introduced to an EMG containing abridged introns 21–24 (EMG_i21-i24, , left). RNA isoform 1 was a misspliced product utilizing a cryptic noncanonical GC splice donor site, resulting in retention of 29 nucleotides of CFTR intron 23 (, left). Of note, we also observed a second isoform that was consistent with normal splicing resulting in a full-length transcript harboring the missense change (, left). To determine the ratio of misspliced to normally spliced CFTR mRNA isoforms, we performed fragment analysis. As compared to WT EMG_i21-24, which produced 100% normally spliced transcript, we found that c.3873G>C on the EMG background resulted in 62.8%±2.2 misspliced transcript and 37.2%±2.2 normally spliced transcript (, right). A variant at the end of exon 23 affects splicing of CFTR mRNA and causes an amino acid substitution in CFTR protein (A) The variant c.3873G>C (predicted effect Q1291H) weakens the canonical donor splice site resulting in partial intron retention and residual normally spliced CFTR transcript. Left panel. Schematic showing location of c.3873G>C relative to EMG_i21-i24 and resulting CFTR splice isoforms. Right panel. Fragment analysis of RT-PCR products labelled with 6-FAM reveals the ratio of misspliced to normally spliced CFTR transcript for EMG_i21-i24 bearing c.3873G>C. WT EMG_i21-i24 served as control. RNA isoform quantity was determined from the AUC (minimum of three technical replicates). The percent shown for each isoform is relative to the sum of the AUC for both isoforms. Error bars represent standard deviations. (B) Q1291H-CFTR is highly functional and responds to ivacaftor, lumacaftor, and combination therapy. Representative tracings for short circuit current (Isc) assay of CFTR channel function and CFTR modulator response performed on CFBE cells stably expressing CFTR cDNA bearing c.3873G>C. Inset- Quantification of change in current (ΔIsc) in response to modulators (minimum of three independent measurements per condition). Data shown as mean±SD. p value determined by one-way ANOVA. *** (p≤0.001), **(p≤0.01), n.s. (not significant, p>0.05) when compared to DMSO treated vehicle control. Data underlying graphs in this figure reported in S4 Data. While our findings indicate that individuals with c.3873G>C would be expected to have reduced levels of CFTR transcript as a result of degradation of the misspliced product, we wanted to characterize the channel activity and drug response of the residual full-length protein that would harbor the Q1291H amino acid substitution. We found that cell lines expressing c.3873G>C cDNA showed high levels of CFTR-specific current. This was evidence that the amino acid substitution p.Gln1291His allows for proper folding, processing, and trafficking of CFTR protein, producing short circuit currents consistent with approximately 80% of what we observe in expression matched WT cDNA cell lines (calculated as previously described [30]). In addition, we found that treatment with any or multiple modulators further augmented channel function. Specifically, 24hr treatment with lumacaftor resulted in a ~1.4-fold change over baseline (ΔIsc = 107.2μA/cm2 ±25.7 compared to ΔIsc = 74.5μA/cm2 ±11 for DMSO control, black line compared to blue line), acute treatment with ivacaftor resulted in approximately a 1.9-fold change (ΔIsc = 139.2μA/cm2 ±13.8, green line), and combination therapy resulted in a ~2.2-fold change (ΔIsc = 162.5μA/cm2 ±12.9, red line). This indicates that the small amount of full-length transcript that is generated from CFTR bearing c.3873G>C produces CFTR protein that is partially functional and targetable by modulator therapy.

Variants in intronic splice sites can generate residual full-length wildtype transcript

Having shown that exonic variants can have unanticipated RNA-level effects, we next looked at the more classically considered splice affecting variants—those located within the introns. These variants, particularly those in the canonical and consensus splice sites, are often assumed to result in complete missplicing of the affected transcript, thus making them difficult-to-treat with protein-targeted therapies. To test this hypothesis, we chose 37 naturally occurring CFTR intronic variants to study in our EMG system. 19 were located in the canonical splice sites (+1, +2, -1, or -2) and 15 variants were located in the extended consensus splice sites (+3, +4, +5, and -3, “proximal intronic”). An additional three variants were located outside of the consensus splice sites (“distal intronic”), as previous studies have indicated that these can result in activation of cryptic splice sites and partial missplicing [37]. As with our exonic variants, all variants were assessed by both CryptSplice and SpliceAI. We utilized transient transfection and RT-PCR to evaluate the effect on CFTR mRNA splicing. We found that 19/19 canonical splice site variants resulted in production of one or more misspliced isoforms (). Interestingly, two variants (c.4005+2T>C, c.4242+2T>C) resulted in production of residual full-length transcript due to incomplete missplicng, which allowed for production of full-length CFTR protein (). Notably, both of these variants change the canonical GT donor site to a noncanonical GC, which is used to produce the full-length transcript. We found that 14/15 proximal intronic variants resulted in missplicing, while only one variant (c.164+3_164+4insT) produced WT levels of full-length CFTR transcript. Importantly, five of these 14 misspliced variants allowed for production of residual normally spliced transcript. Of these 15 variants, 11 were accurately assessed by in silico tools with three “indeterminate” calls (due to lack of consensus between the two algorithms, ) and one false negative. Of the three distal intronic variants evaluated, only one (c.3717+40A>G) caused missplicing, consistent with predictions (). Overall, we identified eight intronic variants that could allow for production of residual WT CFTR (). The numbers shown in the superscript indicate previously published study on the respective variant. Data supporting these findings in . *A prediction was deemed “indeterminate” if the two in silico tools used did not agree (S1 Table). †This variant does not cause CF. ⧧This variant is approved for modulator therapy.

Intronic variants that allow for residual full-length protein production are modulator responsive

Given our observation that 8/37 of our intronic variants allowed for production of residual full-length CFTR mRNA, we wanted to determine if these variants resulted in sufficient WT protein production to allow for residual channel function and modulator response. To test this hypothesis, we obtained primary nasal epithelial cells (HNE) from a CF individual with the genotype c.2657+5G>A/W1282X. To verify that our in vitro EMG results were recapitulated in vivo, RNA was obtained from these HNEs and sequencing was performed. We looked for the presence of reads mapping to erroneous exon junctions as visualized in a sashimi plot (). As compared to a WT control, we observed evidence of transcripts missing exon 16 with three reads mapping from exon 15 to exon 17 (). Notably, despite the presence of a nonsense variant on the other allele, 9 reads mapping from exon 15 to exon 16 were observed (compared to 27 reads mapping to this junction in a WT control), corroborating that this variant results in only partial missplicing. This finding was further validated by RT-PCR and sanger sequencing showing that the majority of the normally spliced transcript comes from the c.2657+5G>A allele (). To evaluate CFTR function in this individual, a well-differentiated monolayer culture of HNEs grown at air-liquid interface (ALI) was established on snapwell filters and mounted in Ussing chambers to allow for Isc measurements to be taken. Amiloride was used to inactivate the apically located epithelial sodium channel (ENaC), followed by activation of CFTR via cAMP-mediated signaling with addition of forskolin and 3-isobutyl-1-methylxanthine (IBMX). As with CFBEs, we used inhibition with Inh-172 to allow for determination of the CFTR-specific change in current (ΔIsc ±SEM). Assessment of channel activity confirmed that this residual normal splicing was sufficient to produce functional CFTR protein, as residual function (ΔIsc = 2.9μA/cm2 ±0.12) was observed in untreated cells (, blue line). This function was further augmented by addition of modulator compounds. Notably, the greatest increase in function (ΔIsc = 4.35μA/cm2 ±0.1), was observed when potentiator (ivacaftor) alone was added (, green line), corresponding to 30.6% of what we observe, on average, in WT HNEs (ΔIsc = 14.2μA/cm2 ±4.0). Additionally, HNEs from a second individual who harbors the distal intronic variant c.3717+40A>G in trans with F508del also showed moderate levels of CFTR function (ΔIsc = 3.86μA/cm2 ±0.11 , blue line). This function could be attributed to partial missplicing caused by 3717+40A>G () that leads to production of residual full-length protein [37]. As with c.2657+5G>A, ivacaftor increased 3717+40A>G-CFTR function by ~1.2-fold (ΔIsc = 4.47μA/cm2, , green line), consistent with our observation that ivacaftor elicits a response in WT HNEs () and therefore could be acting on WT-CFTR produced by residual normally spliced transcript from the 3717+40A>G alleles. Interestingly, a combination of correctors (elexacaftor and tezacaftor) and potentiator (ivacaftor) resulted in the greatest increase in function (ΔIsc = 8.72μA/cm2, 61.3% of WT) an ~2.2-fold change, the majority of which could be attributed to the effect of these modulators on F508del-CFTR produced from the second allele (, pink line, [26]).

CFTR intronic splice site variants with residual function are responsive to CFTR modulators.

(A) Sashimi plots showing skipping of exon 16 caused by 2657+5G>A variant. RNA-seq was performed on the total RNA extracted from nasal cells of the individual harboring c.2657+5G>A/W1282X genotype. RNA from healthy individual served as a control. Per-base expression is plotted on y-axis of Sashimi plot. ‘E’ refers to exon locations on x-axis. +5 on the Sashimi plot indicates variant is located in intron 16 (not to scale). Numbers on the Sashimi plot indicate counts of reads spanning exon junctions. ɸ on the illustration indicates relative location of the variant c.2657+5 G>A. (B) Residual full-length . Representative short circuit current (Isc) tracings on Human nasal epithelial (HNE) cells harvested from an individual harboring c.2657+5G>A/W1282X. CFTR channel function and CFTR modulator response were evaluated in well-differentiated airway liquid interface (ALI) culture. (C) Primary cells from an individual harboring c.3717+40A>G/F508del respond to modulator therapy. Representative short circuit current (Isc) tracings on Human nasal epithelial (HNE) cells harvested from an individual harboring c.3717+40A>G/F508del. CFTR channel function and CFTR modulator response were evaluated in well- differentiated airway liquid interface (ALI) culture. (D) Intronic variants that allow for residual normal splicing respond to modulator therapies. Stacked bar graphs indicate effect of modulator treatment on CFBE stable cells expressing different CFTR intronic variants. Change in current (ΔIsc) was defined as the current inhibited by Inh-172 after sustained Isc responses were achieved upon stimulation with forskolin alone or sequentially with ivacaftor. Data shown as mean±SEM (minimum of three independent measurements per condition). p value was determined by one-way ANOVA. ****(p ≤0.0001), n.s. (not significant, p>0.05) when compared to DMSO treated vehicle control. Data underlying graphs in this figure reported in . To investigate the modulator response of additional intronic variants, we stably expressed EMGs bearing eight intronic variants in CFBE cells (). Measure of CFTR channel chloride transport using Isc indicated that cell lines bearing variants shown to produce residual full-length transcript (c.165-3C>T, c.2657+2_2657+3insA, c.2657+5G>A, c.3468+5G>A, c.3717+40A>G, c.4242+T>C) have residual CFTR function, which can be augmented by treatment with corrector only, potentiator only, or combination therapy. Notably, cell lines bearing variants that cause complete missplicing and production of only shortened protein (c.273+1G>A, c.579+1G>T) did not demonstrate any residual channel activity, nor modulator response. These results show that intronic variants have to be experimentally tested to determine which allow for residual normal splicing, and thus production of WT CFTR protein that will respond to modulator therapies.

Discussion

Splicing of mRNA is a critical process in the generation of protein, and disruption of this process has been implicated in a variety of genetic diseases [51]. Regulation of splicing is complex and involves both exonic and intronic sequences. In addition to the critical 5’ and 3’ consensus splice sites, regulatory sequences known as splice silencers and splice enhancers are found in both the exons (ESS, ESE, [52, 53]) and introns (ISS, ISE, [54, 55]). Additionally, interplay between RNA transcription and splicing as well as chromatin-level modulation add even more complexity [56]. Genetic variants can disrupt this process via many different mechanisms (e.g. weakening of canonical splice site, activation of cryptic splice site, disruption of splice regulatory region, etc)[57]. Through the study of 52 naturally occurring CFTR variants (exonic = 15, intronic = 37) we demonstrate the importance of delineating the specific molecular consequences of such variants for the accurate assignment of precision therapies. Our evaluation of exonic variants demonstrates the need to consider the RNA-level impact, as an amino acid substitution is not always the cause of disease. Additionally, we demonstrate that intronic variants, even within the consensus splice site, may retain some level of normal splicing opening up the possibility for protein-level treatment. These two overlooked mechanisms emphasize the importance of the role each specific nucleotide change plays in the manifestation of genetic disease. By using heterologous expression systems to study individual CFTR variants, we were able to evaluate molecular mechanism of disease and drug response on an allele by allele basis and for rare variants for which primary cells are difficult to obtain. Previous work has established our expression minigene (EMG) system as a reliable model for the study of CFTR variants that impact mRNA splicing [34-38]. This approach utilizing minigenes harboring exons and the adjacent native introns (either full-length or abridged) has also been undertaken to assess defects of mRNA splicing caused by disease-associated variants in different genes, e.g., PRPH2 [58], CLRN1 (USH3A)[59], CNGβ1 [60]. A minimum of 200bp of intronic sequence from each end of the intron were used in generating abridged CFTR introns as this has been shown to be sufficient to capture the most critical splice regulatory sequences [61, 62]. One limitation of our model system is that CFTR transcription occurs outside of its native genomic context and only a portion of the gene sequence is present. While our constructs retain full exonic sequences and can thus account for exonic regulatory elements (e.g. composite exonic regulatory element of splicing in CFTR exon 12, [63]), our EMG studies cannot give insight into the effects of intronic sequences outside of the scope of our constructs on RNA splicing and processing. Transient expression of EMGs in HEK293s cannot account for the effects of chromatin, however stably integrated EMGs in our CFBE cell lines are subject to chromatin-level regulation, albeit at a different genomic locus than endogenous CFTR. While CRISPR/Cas and other methods of gene editing can generate cell lines expressing CFTR variants in the native genomic context [64, 65], EMGs provide a more rapid approach for assessing a large number of variants to identify those with unexpected consequences that warrant follow-up studies. Additionally, we show here and in prior work by ourselves and others that our results in EMGs are corroborated by studies in primary cells [20, 35–38, 44, 46, 50, 66]. Overall, we observed a pattern of missplicing consistent with previous observations regarding the degree of tolerance for variation at different exonic nucleotides within the consensus splice sites [6]. Namely, exonic variants that misspliced were more often located in the end of an exon as opposed to the beginning of an exon, consistent with the observation that the exon nucleotides of the 3’ splice site are more tolerant of variation than the exon nucleotides of the 5’ splice site [6]. Notably, we found that the splicing effects of 12/15 exonic variants were correctly predicted by the splice algorithms. Of the three not predicted, two were “indeterminate” (c.3873G>C, c.2909G>A) due to disagreement between the algorithms and were accurately assessed by SpliceAI, but not CryptSplice (). One of these was a false negative for CryptSplice (c.3873G>C) and the other was a false positive (c.2909G>A). The false negative can be explained by the fact that the change in splice site strength sits just above the threshold for Cryptsplice. Additionally, c.3873G>A missplices through use of a noncanonical splice donor site that has a GC dinucleotide rather than a GT, which makes this outcome more difficult to predict. We have yet to identify an explanation for the false positive called only by CryptSplice, however our in vitro results recapitulate what others have reported in primary cells [46], giving additional evidence that the in silico prediction made by CryptSplice is incorrect. The variant which was incorrectly assessed by both algorithms (c.274G>A), resulted in only very low levels of missplicing which is likely why it was a false negative. While the legacy names [41] of these exonic variants are given based on the presumed amino acid substitution, it is critical to note that only 3/9 misspliced variants allow for production of any of the predicted protein isoform. Thus, focusing on the amino acid substitution through in vitro studies in a cDNA based system can be misleading. In CFTR, a good example of this is the variant c.2908G>C (legacy G970R). This variant was evaluated in vitro and shown to respond robustly to modulator therapy [28]. On the basis of these data, individuals with CF who had at least one copy of this variant were enrolled in a clinical trial, but did not respond to therapy [45]. Studying this variant in the context of surrounding introns revealed the cause of this incongruence was an impact on CFTR mRNA splicing [46]. Here we corroborate these results in our EMG system providing further evidence that this is a considerable problem, but one which can be addressed by evaluating each nucleotide substitution and utilizing appropriate model systems. In addition, we show that the variant c.2908G>A (legacy G970S) also results in complete missplicing of CFTR mRNA. Importantly, our results replicate prior studies showing that alteration of this amino acid at a different residue (c.2909G>A) does not have an impact on splicing, emphasizing the importance of considering variants at the nucleotide level. Prior studies in HNE cells from an individual with CF bearing c.2909G>A in trans with F508del showed evidence of modulator response [46] and our study of this variant in a heterologous system provides evidence that G970D-CFTR contributes to this response. Our study of the variant c.523A>G has revealed the potential for a similar misclassification of other splice-effecting variants outside of the consensus splice site. This variant appears to be of very high function when studied on a cDNA background, however the protein produced by the misspliced product (p.Ile175_Glu193del) does not retain any meaningful channel activity, which emphasizes the possibility of misclassification of disease liability and modulator response for exonic variants when relying on cDNA-based systems. Additional CFTR variants have been previously reported to follow this same paradigm of being named for an amino acid substitution that they ultimately do not produce [34, 37, 38, 44, 67]. Together these results suggest that exonic variants that missplice completely are unlikely to respond to modulators, while those that allow for production of some full-length protein have the potential to respond, albeit to a much lesser degree than would be expected if missplicing were not occurring. Additionally, so called ‘synonymous’ variants that would not be expected to result in a protein-level change are known to have RNA-level consequences. We have previously shown that the CFTR variant c.2988G>A, located in the last nucleotide of exon 18, results in partial missplicing [36] and a middle of exon ‘synonymous’ variant, c.2679G>T, has been reported to activate a cryptic splice site in CFTR exon 17 [68]. Importantly, these RNA-level effects of exonic variants have been reported in other disease-associated genes, such as ATP7B [69], DMD [15], and PRPH2 [58], thus our work emphasizes the need for careful evaluation of molecular consequences of exonic variants across the genome. When considering if a variant has been incorrectly assessed, a useful metric is a comparison of genotype and phenotype. This is well demonstrated by the example of c.523A>G. This variant was identified in individuals to cause severe disease [48], but shown in vitro to allow for production of highly functional protein [49]. The contrast between these two findings points to some overlooked mechanism, which we demonstrate here to be splicing. Similarly, high levels of variation in the phenotype of individuals harboring the same variant may be explained by a complex molecular mechanism of disease, such as missplicing. This is demonstrated by the example of c.3873G>C. Clinical features of individuals harboring this variant have a large amount of variability. For example, sweat chloride (a metric shown to correlate with disease severity, [22]) ranges from 35–112 mmol/L in these individuals (CFTR2). This may be explained by variability in the amount of normally spliced transcript between individuals, which would in turn impact the amount of residual CFTR function. Additionally, accurate assessment of disease liability and therapeutic options is particularly complicated for such variants that are exonic but also result in partial missplicing as the residual normal splicing results in a full-length protein that harbors an amino acid substitution. Assignment of precision therapies thus requires determination of both RNA and protein-level effects. In addition to our evaluation of exonic variants, we chose to study 37 intronic variants spread across the CFTR gene and located throughout the introns. While +1, +2, -1, and -2 variants invariably caused missplicing, we observed more tolerance for variation in the +3, -3, and +5 positions consistent with previous reports [6]. Notably, these sites that were less likely to result in missplicing were also less accurately predicted by in silico tools, with our only false negative occurring at the -3 position (c.165-3C>T) as well as three “indeterminate” calls at the +3 and +5 positions (c.579+3A>G, c.579+3A>T, c.579+5G>A). Similar to our observation with c.274G>A (E92K), it is likely that the variant in the -3 position was not properly predicted because the major isoform is normally spliced. While SpliceAI outperformed CryptSplice in evaluation of exonic variants, the three “indeterminate” predictions for intronic variants were the result of false negative calls by SpliceAI indicating that each algorithm has different strengths and weaknesses. Also of note, we identified two variants within the canonical splice site (c.3873+2T>C, c.4242+2T>C) that allowed for retention of some normal splicing. It has long been known that while less common than 5’GT donor sites, 5’GC donors are tolerated by the splicing machinery [7, 13]. Our findings indicated reduced splicing at these sites, consistent with a recent report which estimates that these splice sites can allow for production of up to ~80% of normally spliced transcript [8, 9]. Through evaluation of splice isoforms and CFTR channel function in primary human nasal epithelial cultures (HNE), we were able to demonstrate congruence between results determined in vitro (Table ) and those determined in vivo. Our RNA-seq results show an alternative method of assessing RNA-level effects of variants. Additionally, the variant c.2657+5G>A is already approved for modulator therapy [24], thus showing that our EMG results and our HNE results can provide effective assessment of drug response. Previous studies have demonstrated that c.3717+40A>G allows for production of reduced levels of normally spliced transcript [37] and here we confirm that this could facilitate modulator response in individuals harboring this variant. While some of the response we observe in c.3717+40A>G can be attributed to the in trans F508del allele, complementary results in our heterologous expression system show that c.3717+40A>G likely contributes to this response. Additionally, c.2657+5G>A is in trans with a nonsense variant that underwent nonsense-mediated mRNA decay (NMD) (W1282X, ) and thus we attribute the modulator response observed in HNEs to residual normally spliced transcript from the c.2657+5G>A allele. However, due to inter-individual variability in NMD efficiency some residual nonsense transcripts could be observed [70]. Although in our own studies, we have previously shown in primary cells that W1282X undergoes efficient NMD [39] and functional studies in W1282X homozygote HNEs have yielded no modulator response [71], which is further supported by a lack of clinical improvement in individuals treated with these drugs [72]. Therefore, our results for c.2657+5G>A confirm that modulator therapy can augment WT channel function when missplicing is incomplete and allows for production of residual WT CFTR. We also identified an additional six variants that responded, including c.3717+40A>G, which nicely corroborated our primary cell data. These results strengthen the validity of our in vitro studies and show that EMGs can be a useful tool in the characterization of variants and assignment of precision therapies. Our systematic study of variants across the CFTR gene revealed that 43/52 variants (exonic = 9, intronic = 34) resulted in production of at least one misspliced mRNA isoform. We found that while in silico tools are useful for identifying splice variants, they are not fail-safe and experimental validation is required to determine the degree of missplicing and the downstream functional consequences. Evaluation of these downstream impacts of missplicing was critical for assessment of residual function and modulator response. Both exonic and intronic variants that allow for production of protein should be assessed for treatment with modulator therapy. As modulator therapies for CF continue to improve, additional variants that allow for shortened protein isoforms to be generated may benefit from these types of treatments. However, some variants that result in complete missplicing may require alternative treatment strategies, such as direct modulation of splicing through modified snRNPs [73] or antisense oligonucleotides [74, 75] or correction of the primary genetic defect through gene editing approaches [76, 77]. Overall, we demonstrate the need for careful consideration of the molecular mechanism of a single nucleotide substitution when evaluating genetic variants for assignment of precision therapies. While we use CF here as a model for the study of splice variants, these variants are pervasive across the spectrum of genetic disease. Thus, our findings are a reminder that RNA-level impacts should be considered when assessing any genetic variant. This becomes especially important when therapeutic approaches are dependent on targeting variants of a specific molecular mechanism.

Materials and methods

Ethics statement

This study was approved by the Institutional Review Board at Johns Hopkins Medicine, Baltimore (IRB# NA00029159 and IRB# 00116966). Written informed consent was obtained from all subjects.

Assessment of variants by splice prediction tools

Middle of exon variants were chosen based on initial prediction by CryptSplice as previously described [37], and were then also assessed by the splice prediction algorithm SpliceAI using the recommended threshold of 0.5 [43]. All other variants were chosen to allow for evaluation of the impact on splicing in different regions of the gene and were assessed by CryptSplice and SpliceAI after selection. A prediction for a variant (Tables ) was considered “indeterminate” if the two algorithms did not agree. Predictions for each variant with each algorithm can be found in supporting information ().

Introduction of variants to EMG and cDNA constructs

Expression minigene (EMG) plasmids were created as previously described [36, 38]. Variants were introduced to either EMG or cDNA plasmid using site-directed mutagenesis (SDM) as previously described [30, 36]. Briefly, plasmids were PCR-amplified using primers containing the desired single nucleotide change () and successful addition of the variant was verified by sanger sequencing.

Transfection of HEK293 cells and evaluation of mRNA splicing

Plasmids were transiently introduced to HEK293 cells using Lipofectamine 2000 (ThermoFisher Scientific). Total RNA was collected 48hours post-transfection and 500ng was used as an input for reverse transcription, which was performed using iScript cDNA synthesis kit (BioRad). RT-PCR was then performed as previously described [36]. Briefly, 2μl of cDNA was used as input for PCR amplification using exonic primers spanning the region surrounding relevant introns and KOD Hot Start polymerase master mix (Millipore Sigma). Conditions for PCR were 2 minutes at 95°C followed by 35 cycles of 20 seconds at 95°C, 10 seconds at the annealing temperature (specific to each set of primers), and 20 seconds at 70°C. PCR products were analyzed by gel electrophoresis followed by extraction of relevant bands and Sanger sequencing.

Fragment analysis

Fragment analysis was performed as previously described [36]. Briefly, RT-PCR was performed as described above, with a forward primer bearing a 5’6FAM tag. Each RT-PCR was run in triplicate and relevant bands were gel extracted and purified. Products were separated on Applied Biosystems 3730 DNA Analyzer capillary electrophoresis system with GeneScan-500 Rox (Applied Biosystems) used as an internal size standard. Relative RNA isoform quantity was determined by the area under the curve (AUC) for each isoform compared to the total AUC for both isoforms for each sample and these values were then averaged across the technical replicates.

Immunoblotting

Protein lysates were collected 48 hours after transient transfection of EMG or cDNA plasmids into HEK293 cells. 40μg of lysate were loaded per sample into a 7.5% Criterion TGX protein gel (BioRad). Transfer to PVDF membrane was performed in a Trans-Blot Turbo Transfer System (BioRad). After blocking, the membrane was probed with either mouse monoclonal anti-CFTR antibody 596 binding amino acids 1204–1211 (CFFT, University of North Carolina Chapel Hill) or mouse monoclonal anti-CFTR antibody 570 binding amino acids 731–742 (CFFT, University of North Carolina Chapel Hill) diluted to 1:5,000. Rabbit monoclonal anti-sodium/potassium-ATPase (Abcam) diluted 1:50,000 was used as a loading control. Secondary antibodies were anti-mouse (1:150,000 GE Healthcare) and anti-rabbit (1:100,000 GE Healthcare), respectively. Blots were exposed on film using ECL Primer Western Blotting Detection Reagent (GE Healthcare).

Creation of stable cell lines

CF bronchial epithelial cells stably expressing EMG or cDNA constructs were generated as previously described [30, 38, 78]. Briefly, CFBEs lacking endogenous CFTR expression (CFBE41o-) and containing a single Flp recombinase target site [47] were co-transfected with EMG or cDNA plasmid and pOG44 (a plasmid encoding Flp recombinase) using lipofectamine LTX (ThermoFisher Scientific). Cells were placed under Hygromycin selection until individual clones grew large enough to isolate and propagate into new cell lines. For cDNA cell lines, genomic DNA was extracted from each clone and PCR amplification with overlapping primer sets spanning the entirety of CFTR was used to verify integration. For EMG cell lines, RNA was extracted and reverse transcribed into cDNA, which was then used for integration PCR. For all cell lines, CFTR expression level was verified by qPCR as previously described [30].

Collection and culture of primary HNEs

Primary nasal epithelial cells (HNEs) were collected and cultured as previously described [38, 79–81] under Johns Hopkins University IRB#NA00029159 and IRB#00116966. An anesthetic was applied to the nasal mucosa proceeding brushing of the mid-part of the inferior turbinate. Nasal cells were harvested from both nostrils. Brushes were washed with propagation media as cells were scraped off the brush and the cells and media were collected and placed in a conical tube and pelleted by centrifugation. Brushes were then placed in PBS and vortexed to collect additional cells and this PBS was used to wash the pelleted cells. The cell pellet was then resuspended in 3mL Accutase and placed at 37°C for three minutes. Accutase was inactivated by the addition of propagation media () and cells were pelleted and resuspended in HNE propagation media (). A flask containing a fibroblast feeder layer (3T3-J2, Kerafast) was irradiated in advance in the presence of 30 Gy by setting the CIXD Biological Irradiator at 220V, 13A, and 468 sec. Resuspended cells were added to the feeder layer flask and maintained in propagation media in the presence of 10μM Reagent Y (ROCK inhibitor) (). To facilitate differentiation, 300,000–400,000 cells were moved to snapwell filters coated with collagen type IV (Sigma#C6745-1ML) 0.3 mg/mL, final concentration 50 μg/mL. Cells were kept in propagation media for 5–7 days and upon confluency changed to either differentiation media containing Ultroser G or proprietary PneumaCult ALI media, both of which lack ROCK inhibitor (). One day later, apical media was removed to establish air-liquid interface (ALI) culture. To maintain ALI culture, basolateral media was changed twice per week and the apical side of the filter was washed with 37°C PBS once per week. Prior to functional assessment, cells were grown on ALI culture for 28 days, which is the time required for the appearance of cilia as a marker of complete differentiation ().

Functional assessment and modulator testing

Assessment of CFTR channel function and response to drugs was performed in CFBEs as previously described [30, 47]. Briefly, CFBE stable cell lines were plated on snapwell filters and grown until transepithelial resistance reached ~200μΩ. Filters were mounted in Ussing chambers (Physiological Instruments). A high chloride solution was added to the basolateral chamber and a low chloride solution was added to the apical chamber. After equilibration of currents, 10μM forskolin (Selleckchem) was added to the basolateral side to activate CFTR channels via cAMP signaling. Currents were allowed to plateau and CFTR was inhibited using 10μM Inh-172 (Selleckchem) added to the apical chamber. Modulator testing for cDNA cell lines was carried out with addition of 6μM lumacaftor (Selleckchem) or equivalent volume DMSO 24hrs prior to run and acute apical addition of 10μM ivacaftor (Selleckchem) or equivalent volume of DMSO following plateau after channel activation. For EMG cell lines, correctors (lumacaftor or tezacaftor (Selleckchem)) were added individually or in combination 24hrs prior to run to a final concentration of 3μM. 10μM ivacaftor was added acutely. For all Ussing chamber studies the drop in current produced by the addition of Inh-172Inh-172) was used to quantify CFTR channel function. Assessment of CFTR channel function and response to drugs was performed in primary HNEs as previously described [38]. Briefly, well-differentiated HNEs () on ALI culture were mounted in Ussing chambers and bathed in a chloride solution. After equilibration of currents, 100μM of amiloride (Selleckchem) was added to the apical chamber to inactivate the epithelial sodium channel (ENaC). 10μM forskolin and 100μM 3-isobu-tyl-1-methylxanthine (IBMX, Sigma-Aldrich) were added simultaneously to the basolateral chamber to activate chloride transport through cAMP-mediated channel opening. Specific inhibition of CFTR driven chloride transport was achieved through apical addition of 10μM Inh-172. WT HNE function was determined using an average of values obtained from two unrelated individuals with a total of 11 readings. Modulator testing was performed by 24hr addition of 3μM corrector (lumacaftor, tezacaftor, or tezacaftor+elexacaftor) and acute apical addition of 10μM ivacaftor.

RNA-sequencing

RNA-sequencing preparation and analysis was performed as previously described [38]. Briefly, RNA was extracted from cultured primary HNEs and 1.0μg was used as input for library preparation. 50 million paired end reads were obtained and aligned to the reference genome (hg19) using BowTie2 [82]. Tophat2 [83] was used to determine splice junctions. To allow for visualization of splicing, sashimi plots were generated using the Integrative Genomics Viewer.

Statistical analysis

Statistical analysis was completed using GraphPad Prism (GraphPad Software, San Diego, California USA, www.graphpad.com). One-way ANOVA was performed followed by Dunnett’s test for multiple comparisons. A p-value of <0.05 was considered significant. Raw data underlying all graphs reported in . Web resources CFTR2 (Clinical and Functional Translation of CFTR), https://cftr2.org (version January 10, 2020) CFTR mutation database, www.genet.sickkids.on.ca CryptSplice, https://bitbucket.org/jhucidr/cryptsplice Splice AI, https://github.com/Illumina/SpliceAI

The beginning of exon variant c.274G>A (predicted effect E92K) results in a low level of exon 3 skipping.

Top panel. Representative raw data for fragment analysis of RT-PCR products from evaluation of splicing in HEK293 cells after transient transfection with EMG_i1-i5 bearing c.274G>A. Results show majority normally spliced transcript and a small fraction of products consistent with the size of exon 3 skipped transcript. Bottom panel. Representative raw data for fragment analysis of RT-PCR products from WT EMG_i1-i5 showing normally spliced transcript. (TIF) Click here for additional data file.

In an individual bearing two copies of c.3700A>G (predicted effect I1234V).

c.3700A>G activation of a cryptic splice site results in production of primarily misspliced transcript. Left panel. Fragment analysis of RT-PCR products shows abundance of normally spliced transcript relative to misspliced transcript. Right panel. Screenshot of IGV view showing RNA sequencing reads mapping to region surrounding c.3700A>G. Reduced coverage is evidence of misspliced product, while one read maps to exon/exon junction indicating some normal splicing. (TIF) Click here for additional data file.

CFTR cDNA bearing c.523A>G (175V) produces processed protein, while EMG_i1-i5 bearing the same variant produces only unprocessed CFTR.

Immunoblot showing CFTR on top, with Na+,K+-ATPase on bottom as a loading control. Lanes from a single blot were reordered and split into two panels to allow for appropriate comparison of matched experimentals and controls. Left panel. All constructs driven by EF1α promoter and express CFTR cDNA. WT and F508del served as controls. Negative control is an empty vector plasmid. Right panel. All constructs driven by CMV promoter and express either CFTR cDNA or CFTR EMG_i1-i5, as indicated. WT cDNA, F508del, WT EMG_i1-i5 served as controls. (TIF) Click here for additional data file.

CFBEs stably expressing WT EMG_i1-i15 produce substantial current.

Representative tracing for short circuit current (Isc) assay of CFTR channel function performed on CF bronchial epithelial cells stably expressing the WT EMG_i1-i5 construct. (TIF) Click here for additional data file.

I175V-CFTR responds to modulator therapy.

Representative tracings for short circuit current (Isc) assay of CFTR channel function and CFTR modulator response performed on CFBE cells stably expressing the I175V cDNA construct. Inset- Quantification of change in Isc in response to modulators (minimum of three independent measurements per condition). Data shown as mean±SD. p value determined by one-way ANOVA. *** (p≤0.001), *(p≤0.05) when compared to DMSO treated vehicle control. Data underlying graph in this figure reported in . (TIF) Click here for additional data file.

Residual normal splicing and protein production resulting from c.4242+2T>C variant in the CFTR terminal intron.

(A) Schematic illustration of the Expression minigene i25-i26 and location of the splice-site variant. Full-length introns 25 and 26 were inserted into pcDNA5FRT-CFTR cDNA construct to create the expression minigene. Variant c.4242+2T>C was created by site directed mutagenesis. Two isoforms (aberrant and normal indicated on the labels) were observed in HEK293Flp cells by transient transfection of the EMG harboring the c.4242+2T>C variant. (B) Sanger sequencing confirmed skipping of exon 26 in the aberrant spliced isoform resulting in frameshift and introduction of premature termination codon. The second minor isoform confirmed normal splicing. (C) Immunoblotting of the protein lysates collected from HEK293Flp cells transfected with either WT EMG_i25-26, F508del cDNA or c.4242+2T>C. Non-transfected parental cells served as control. CFTR antibody 596 (CFFT) was used to probe for CFTR protein production. Residual levels of full-length mature and immature protein products denoted as band C and B respectively were produced by c.4242+2T>C variant due to normal splicing. A lower sized truncated CFTR protein fragment resulting from aberrantly spliced isoform was also observed. (TIF) Click here for additional data file.

Residual level of normal spliced wild-type mRNA produced from c.2657+5G>A allele is higher than W1282X mRNA in the primary nasal epithelial cells of individual harboring c.2657+5G>A and W1282X in compound heterozygosity.

(A) Top panel, schematic illustration of the region selected to amplify CFTR. Vertical arrow indicates location of W1282X variant in the context of processed mRNA. Horizontal arrows indicate forward and reverse primers selected from exon 22 and exon 24 respectively for the reverse transcription-polymerase chain reaction (RT-PCR). Bottom panel, shows selection of primers to amplify TATA box binding protein (TBP) gene as control. (B) Ethidium bromide—stained agarose gel to visualize RT-PCR products. RNA was extracted either directly from the nasal cells dislodged from brush using a forcep (non-cultured) or nasal cells expanded in propagation medium containing 10 μM reagent-Y (see methods for culture and S3 Table for recipes). 200 ng RNA was used to prepare cDNA. PCR was performed on either stock cDNA or diluted (1:5). cDNA prepared from the non-CF cultured nasal cells was used as a positive control. The molecular weight of the amplification product matched the expected size products for CFTR (417 base pairs) and TBP (108 base pairs). Faint amplification was achieved for stock cDNA prepared from the brush, and very faint amplification for 1:5 diluted cDNA. TATA box binding protein (TBP) was amplified as control for the quality of RNA. No-RT, used as negative control, contained RNA from the cultured nasal cells of individual with 2789+5/W1282X. (C) Representative electropherogram of the RT-PCR product to assess differential expression of mRNA produced from 2789+5 and W1282X alleles. RT-PCR products obtained were sent for Sanger sequencing. Small peak for “nucleotide A” (indicated by vertical orange arrow) corresponding to W1282X allele, and large for “nucleotide G” corresponding to normal spliced wild-type mRNA produced from 2789+5 were observed. RT-PCR result shown here lends support to RNA-seq data that 2789+5 results in partial missplicing, and corroborates with residual CFTR function in the nasal cells observed by short-circuit current measurements. (TIF) Click here for additional data file.

RNA-sequencing performed on HNEs bearing c.3717+40A>G/F508del shows evidence of missplicing.

Top panel. Sashimi plot shows retention of 40 nucleotides of intron 23 as well as normal splicing. WT HNEs served as control. Numbers indicate number of reads mapping to each splice isoform. Bottom panel. Zoomed in version of sashimi plot for better visualization of intron retention. (TIF) Click here for additional data file.

Assessment of CFTR function and ivacaftor response in WT human nasal epithelial (HNE) cells.

Upper graph, two replicates of short-circuit current (Isc)tracings from the same sample. Lower graph, same Isc tracings as above, but after amiloride inhibition was set to zero to allow for better visualization of the function. (TIF) Click here for additional data file.

Primary human nasal epithelial (HNE) cells at propagation and differentiation stages.

(A) Compound microscopy image of HNE cells propagating in conditionally reprogrammed co-culture with irradiated J2 feeder cell (J2s) at passage 0. Arrow indicates HNE cells growing in island surrounded by J2 fibroblast. (B) Scanning electron microscopy image of well- differentiated 28-days old HNE cells growing in air-liquid interface (ALI) culture. Cells propagating at passage 2 were transferred on to the filters to establish ALI. Arrow indicates appearance of cilia as a marker of differentiation. (TIF) Click here for additional data file.

Splice predictions using bioinformatics tools.

(XLSX) Click here for additional data file.

Site-directed mutagenesis (SDM) primers.

(XLSX) Click here for additional data file.

Reagents used in culturing primary HNEs.

(XLSX) Click here for additional data file.

Additional Sanger sequencing of RT-PCR products S1 Table.

(PDF) Click here for additional data file.

Additional immunoblots S2 Table.

(PDF) Click here for additional data file.

Raw data used in generating graphs in Fig 2.

(XLSX) Click here for additional data file.

Raw data used in generating graphs in Fig 3.

(XLSX) Click here for additional data file.

Raw data used in generating graphs in Fig 4.

(XLSX) Click here for additional data file.

Raw data used in generating graph in S5 Fig.

(XLSX) Click here for additional data file. 4 Aug 2020 * Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out. * Dear Dr Sharma, Thank you very much for submitting your Research Article entitled 'Evaluation of both exonic and intronic variants for effects on RNA splicing allows for accurate assessment of the effectiveness of precision therapies' to PLOS Genetics. Your manuscript was fully evaluated at the editorial level and by independent peer reviewers. The reviewers appreciated the attention to an important topic but identified some aspects of the manuscript that should be improved. You will note that one reviewer felt additional germline editing experiments are necessary. While we disagree that such experiments are required, we do believe that some commentary addressing this point, including strengths, weaknesses, and limitations, would be valuable.  If there are any data available, such as examples of variants tested multiple ways that provide similar or discordant results, that would be ideal, but would not be required. We therefore ask you to modify the manuscript according to the review recommendations before we can consider your manuscript for acceptance. Your revisions should address the specific points made by each reviewer. In addition we ask that you: 1) Provide a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript. 2) Upload a Striking Image with a corresponding caption to accompany your manuscript if one is available (either a new image or an existing one from within your manuscript). If this image is judged to be suitable, it may be featured on our website. Images should ideally be high resolution, eye-catching, single panel square images. For examples, please browse our archive. If your image is from someone other than yourself, please ensure that the artist has read and agreed to the terms and conditions of the Creative Commons Attribution License. Note: we cannot publish copyrighted images. We hope to receive your revised manuscript within the next 30 days. If you anticipate any delay in its return, we would ask you to let us know the expected resubmission date by email to plosgenetics@plos.org. If present, accompanying reviewer attachments should be included with this email; please notify the journal office if any appear to be missing. They will also be available for download from the link below. You can use this link to log into the system when you are ready to submit a revised version, having first consulted our Submission Checklist. While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org. Please be aware that our data availability policy requires that all numerical data underlying graphs or summary statistics are included with the submission, and you will need to provide this upon resubmission if not already present. In addition, we do not permit the inclusion of phrases such as "data not shown" or "unpublished results" in manuscripts. All points should be backed up by data provided with the submission. PLOS has incorporated Similarity Check, powered by iThenticate, into its journal-wide submission system in order to screen submitted content for originality before publication. Each PLOS journal undertakes screening on a proportion of submitted articles. You will be contacted if needed following the screening process. To resubmit, you will need to go to the link below and 'Revise Submission' in the 'Submissions Needing Revision' folder. [LINK] Please let us know if you have any questions while making these revisions. Yours sincerely, Gregory M. Cooper, PhD Associate Editor PLOS Genetics Hua Tang Section Editor: Natural Variation PLOS Genetics Reviewer's Responses to Questions Comments to the Authors: Please note here if the review is uploaded as an attachment. Reviewer #1: The paper from A.T .Juynt et al ,is devoted to the evaluation of both exonic and intronic variants for their effect on RNA splicing and for acurate assessement of the effectiveness in the context of precision therapies . The CFTR gene, responsible of cystic fibrosis, when both alleles are mutated is a perfect model to decipher the relationship between a genetic variant and its fonctional effect both in vitro and in vivo . This is particularly important in the field of cystic fibrosis as more than 2000 mutations or variants have been reported in this gene and since ten years new drugs called modulators (potentiators and correctors) have been discovered and are now proposed to CF patients in the context of allele specific efficiency. By analysis individual CFTR variants they carefully determine the molecular mechanism associated with the change at the nucleotide level and the drug response on an allele by allele basis. In the first part of their results they first selected missenses CFTR variants for evaluating their effect on mRNA splicing .To do that they chose naturally occuring CFTR variants located in the first and the second nucleotide of an exon as well as the last and the penultimate nucleotide .Moreover they selected also the splice –defect potential of CFTR missense variants predicted by the CryptSplice algorithm.They selected 15 variants for this study ,14 of these were introduced into expression minigenes (EMGs) that contained full length or abridged CFTR intronic sequences allowing the evaluation of the effects of CFTR variants on m RNA splicing of adjacent introns. This first part is followed by showing carefully that missense variants could be misclassified as drug responsive ,then they showed that residual normal splicing of amissense variant produces functional ,mudulator responsive CFTR protein and finally they showed that variants in intronic splice sites can generate reduced level of full length wildtype transcript . These data are extremely well performed on 52 naturely occuring variants combining in vitro ,in vivo and functional results .Doing that they demonstrate the need to consider the impact on RNA-level showing that an amino acid sbstitution is not always the cause of the disease and that intronic variants even in a consensus splice signal may retain some level of normal splicing allowing the possibility for protein –level treatment . I think this is an excellent work that could be helpful not only for geneticists working in the field of CF but also for those working in the field of other genetic disorders Reviewer #2: PLOS Genetics CF Joynt and colleagues functionally characterize 52 variants of CFTR in a splicing assay. The majority are intronic, but they find exotic and intronic variants that alter splicing and confirm this in CFTR current assays and in primary cells from patients. They interpret their data in light of CF modulator therapies and provide mechanistic support for variants that respond or don’t respond to modulators. One interesting example was show in figure two where the different mutations in the same codon can causes different ratios of RNA species to be produced. This paper’s study of a large number of variants leads, in my reading of this, to two key findings that should be of broad interest. First is that exonic variants can have “unanticipated RNA-level impacts”, and that genotype specific therapies for CF patients need to consider broader molecular mechanisms. The first finding I would not consider interesting, but not unanticipated, as previous reports have found exonic variants that are pathogenic due to altered protein binding to DNA vs. altered amino acid sequences, and likewise missense exonic variants near splice sites are routinely considered in light of splicing. The second finding currently largely applies to CF where identifying and expanding genotype specific therapies have made outstanding progress, but one hopes can be used as a model for other diseases, where The authors could broaden the interest by emphasizing the carful molecular assessment (i.e. splicing) of all variants is necessary as more diseases move into genotype driven precision therapies. Minor Concerns 1. At various points it is very hard to figure out what they did and to how many variants. They authors should use consistent language, for example they switch from exonic, to missense to “variants of interest” within the same paragraphs on page 7 and 8. As I believe all variants tested are of interest, and exonic ones tested are missense, while all intronic ones are not, I would prefer they simplify and use “exonic” or “intronic” unless necessary to denote a functional consequence they uncover. 2. More description or details on CF for those not PLOS Genetics readers well versed in the disease. For example on #285 “We were interested in this variant due to its varying clinical consequences” they can describe the varying clinical consequences, which are described only as a broad sweat chloride range on line #511. 3. Fig 3B, Add data for WT CFTR so we can see how close is Luma+Iva to WT CFTR. The text states “approximately 80% of WT function” for Gln1291His, but this is not cited in any way and it’s unclear if this is a protein level/single channel measurement, or a measurement from cells genetically homozygous for 3873G>C. 4. Figure 4 showing data in HBE cells from a patient is a nice validation of what they see in EMG assays. I know procuring patient samples for rare genotypes is difficult, but any additional assays in primary cells would greatly strengthen this papers reliance on the EMG assays. Reviewer #3: Comments to Author: The manuscript by Joynt et al describes a need to experimentally validate exonic and intronic substitutions within the CFTR gene for effects on splicing, as this may be overlooked, especially at nonsynonymous substitutions near splice sites. This is especially important for proper characterization of CFTR variants in terms of prescription of relevant CF drugs. This type of analysis may also explain why genotype (and therefore predicted functional change) might not align with expected outcomes in drug studies for particular variants. The authors have used an array of previously designed tools/models to explore the effects of variants on splicing, primarily using expression minigenes (EMGs) containing relevant CFTR intronic sequences, and cDNA constructs to generate stable cell lines or primary HNEs with relevant genotypes in electrophysiology studies to measure channel activity and response to CFTR modulators. The results presented here are an extensive survey of the potential for mischaracterization of CFTR variant impact on CFTR function, however there important concerns that arise from this study as outlined below. Major comments: Though the data presented are no doubt the result of a substantial experimental effort, this paper is more suitable for Molecular Diagnostics Journal than PLOS Genetics, primarily because it is based on outdated methodologies, which do not have the capacity to provide novel insights into the role of variants on CFTR structure/function: 1. The use of expression minigenes (EMGs) and heterologous expression in HEK cells are outdated technologies with known artifacts. This type of study should have been done with CRISPR/Cas9 modification of the endogenous locus in airway and intestinal cell lines at minimum, since primary epithelial cells are rather difficult to engineer or, in iPSCs with subsequent differentiation in to relevant epithelial cells. 2. The authors have ignored an extensive literature on complex intronic and exonic elements that impact spicing, and on the fundamental role of chromatin features on splicing and RNA processing. 3. To define variants as "deep" intronic (eg Table 2) when they are 20 or 40 bp away from the splice site is erroneous. Many CFTR introns are greater than 10 kb long and it is known that there are well defined regulatory elements and splicing variants truly "deep" within these introns, several kb away from intron/exon boundaries. The variants described in this manuscript all lie within an extended splice site machinery. 4. How is the fragment analysis quantified in SFig 2, F2A, F3A, etc? It would seem difficult to reliably quantify PCR amplicons that are gel extracted prior to fragment analysis. The recovery efficiency of gel extraction is very low and variable from extraction to extraction. This could influence the quantification of transcripts derived from the same EMG (as presented in the mentioned figures). 5. If isoform 2 of c.2908 variants produces S912-G970 deletion (Figure 2A, left), and c.2908G>A produces more of this transcript than isoform 1, how can the authors be confident that the protein species detected for the EMG constructs of the two variants that cause missplicing are from isoform 1 and not isoform 2? The predicted protein sizes are not given, nor are any molecular weight markers noted on the western blot in Figure 2B. To better estimate the identity of this protein species, the authors should express the cDNA of these predicted protein species and compare to lanes 5 and 7 in the current western blot. 6. The conclusion made on lines 240-245 are not fully supported by the data presented. cDNA constructs were not made that would generate the predicted protein sequence of the misspliced c.2909 variants, and tested in short circuit current studies, therefore, it is only speculation that these immature protein species would be non-functional. 7. How does the short circuit current experiments for the stable lines in CFBE compare to the parental line (CFBE with Flp integration)? That is, how can the change in channel activity be attributed to the integrated cDNA construct and not to the effect of the drugs on the F508del proteins in these cells (CFBE/CFBE41o- is F508del homozygous), especially since Orkambi (lum/iva) can be prescribed for and ivacaftor has been reported to affect channel activity of F508del in cultured HBE cells (Van Goor, PNAS 2009, PMID: 19846789). The contribution of F508del to the signal is mentioned in the description of the results of figure 4C (pg 18), however again, it is unclear how the authors can conclude with the data presented, that the responses of the cells in figure 4C, especially in regards to ivacaftor alone, are contributed by the 3717+40G>A allele and not F508del. 8. Comprehensive raw data is not shown for “RNA effect” and “Protein present” columns for all variants in both tables, this should be provided in some form as supplementary data, and most especially for variants further explored in subsequent figures. It is unclear if protein production was assessed via western for each variant, or if predictions were made based on sequencing of RNA species. Additional information is also important in order to support statements like those made on line 330, in relation to the two intronic variants that express reduced levels of full-length transcript--was that calculated in relation to the WT EMG construct that harbored these mutations? How much reduction of transcript? Was it significantly different? A quantitative assay is needed to support this statement. This would also be helpful to support the conclusion that the “shortened protein isoform produced by this variant (from the EMG of c.454A>G)….” (line 487) is in fact producing a protein species. 9. HNEs with the genotype c2657+5G>A/W1282X are used to assess missplicing caused by the intronic variant, while comparing to a WT control (Figure 4A). The authors conclude that because W1282X is a nonsense variant, the presence of 9 reads that map from E15 to E16 and 3 reads that map from E15 to E17 as compared to the 27 reads in WT mapping from E15 to E16 mean that the intronic variant only causes partial missplicing. However, since W1282X is in exon 23, and is known to produce some detectable levels of mRNA transcript that can be measured under normal conditions (for example: Linde et al JCI, 2007, PMID: 17290305), the authors should use a W1282X homozygous HNE sample for control to fully conclude that the normal splicing detected between E15/E16 is not contributed by the W1282X allele. Minor comments: Please provide full details of antibodies that were used in this study. It would be useful to remind readers that the CFBE/FLP cell line only has one copy of the FLP recombinase (reference 29), and that therefore integration copy number should not impact EMG or cDNA expression values between cell lines as all lines should have 1 integration Please provide in the methods summary for “Collection and culture of primary HNEs” additional details on growth conditions (ie media used, disassociation reagent, concentration of ROCK inhibitor, approximately how long cells were cultured at ALI, etc)—having to sift through 4 references is tedious. While the focus of this manuscript is on non-synonymous substitutions in exons, there have been multiple reports of synonymous substitutions that alter CFTR splicing. Reference of these studies could further strengthen the argument to further characterize variants by other means than just predicted protein sequence alterations. How were the variants indicated in the tables identified? Were these all the variants from the CFTR Mutation Database (Toronto) or from the CFTR2 list? If the latter, what list version was used? (page 6, lines 127-129, page 15 lines 322-323) What is the scale of the y-axis for the fragment analysis shown (SF1)? Please provide the molecular weight markers for the western blot in Figure 2B, as well as numbering the lanes as described in the text on page 10. References for previously reported primer sequences should be provided within the table For HNE electrophysiology in Figure 4, the results note the % channel function as it corresponds to WT HNE (pg 18), however these traces/values are not provided. ********** Have all data underlying the figures and results presented in the manuscript been provided? Large-scale datasets should be made available via a public repository as described in the PLOS Genetics data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information. Reviewer #1: Yes Reviewer #2: Yes Reviewer #3: Yes ********** PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #1: No Reviewer #2: No Reviewer #3: No 3 Sep 2020 Submitted filename: splicepaper_responsetoreviewers_final.docx Click here for additional data file. 8 Sep 2020 Dear Dr Sharma, We are pleased to inform you that your manuscript entitled "Evaluation of both exonic and intronic variants for effects on RNA splicing allows for accurate assessment of the effectiveness of precision therapies" has been editorially accepted for publication in PLOS Genetics. Congratulations! Before your submission can be formally accepted and sent to production you will need to complete our formatting changes, which you will receive in a follow up email. Please be aware that it may take several days for you to receive this email; during this time no action is required by you. Please note: the accept date on your published article will reflect the date of this provisional accept, but your manuscript will not be scheduled for publication until the required changes have been made. Once your paper is formally accepted, an uncorrected proof of your manuscript will be published online ahead of the final version, unless you’ve already opted out via the online submission form. If, for any reason, you do not want an earlier version of your manuscript published online or are unsure if you have already indicated as such, please let the journal staff know immediately at plosgenetics@plos.org. In the meantime, please log into Editorial Manager at https://www.editorialmanager.com/pgenetics/, click the "Update My Information" link at the top of the page, and update your user information to ensure an efficient production and billing process. Note that PLOS requires an ORCID iD for all corresponding authors. Therefore, please ensure that you have an ORCID iD and that it is validated in Editorial Manager. To do this, go to ‘Update my Information’ (in the upper left-hand corner of the main menu), and click on the Fetch/Validate link next to the ORCID field.  This will take you to the ORCID site and allow you to create a new iD or authenticate a pre-existing iD in Editorial Manager. If you have a press-related query, or would like to know about one way to make your underlying data available (as you will be aware, this is required for publication), please see the end of this email. If your institution or institutions have a press office, please notify them about your upcoming article at this point, to enable them to help maximise its impact. Inform journal staff as soon as possible if you are preparing a press release for your article and need a publication date. Thank you again for supporting open-access publishing; we are looking forward to publishing your work in PLOS Genetics! Yours sincerely, Gregory M. Cooper, PhD Associate Editor PLOS Genetics Hua Tang Section Editor: Natural Variation PLOS Genetics www.plosgenetics.org Twitter: @PLOSGenetics ---------------------------------------------------- Comments from the reviewers (if applicable): ---------------------------------------------------- Data Deposition If you have submitted a Research Article or Front Matter that has associated data that are not suitable for deposition in a subject-specific public repository (such as GenBank or ArrayExpress), one way to make that data available is to deposit it in the Dryad Digital Repository. As you may recall, we ask all authors to agree to make data available; this is one way to achieve that. A full list of recommended repositories can be found on our website. The following link will take you to the Dryad record for your article, so you won't have to re‐enter its bibliographic information, and can upload your files directly: http://datadryad.org/submit?journalID=pgenetics&manu=PGENETICS-D-20-00901R1 More information about depositing data in Dryad is available at http://www.datadryad.org/depositing. If you experience any difficulties in submitting your data, please contact help@datadryad.org for support. Additionally, please be aware that our data availability policy requires that all numerical data underlying display items are included with the submission, and you will need to provide this before we can formally accept your manuscript, if not already present. ---------------------------------------------------- Press Queries If you or your institution will be preparing press materials for this manuscript, or if you need to know your paper's publication date for media purposes, please inform the journal staff as soon as possible so that your submission can be scheduled accordingly. Your manuscript will remain under a strict press embargo until the publication date and time. This means an early version of your manuscript will not be published ahead of your final version. PLOS Genetics may also choose to issue a press release for your article. If there's anything the journal should know or you'd like more information, please get in touch via plosgenetics@plos.org. 14 Oct 2020 PGENETICS-D-20-00901R1 Evaluation of both exonic and intronic variants for effects on RNA splicing allows for accurate assessment of the effectiveness of precision therapies Dear Dr Sharma, We are pleased to inform you that your manuscript entitled "Evaluation of both exonic and intronic variants for effects on RNA splicing allows for accurate assessment of the effectiveness of precision therapies" has been formally accepted for publication in PLOS Genetics! Your manuscript is now with our production department and you will be notified of the publication date in due course. The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript. Soon after your final files are uploaded, unless you have opted out or your manuscript is a front-matter piece, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers. Thank you again for supporting PLOS Genetics and open-access publishing. We are looking forward to publishing your work! With kind regards, Jason Norris PLOS Genetics On behalf of: The PLOS Genetics Team Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom plosgenetics@plos.org | +44 (0) 1223-442823 plosgenetics.org | Twitter: @PLOSGenetics
Table 1

Exonic variants evaluated for impact on CFTR mRNA splicing.

BEGINNING OF EXON (FIRST AND SECOND NUCLEOTIDE)
HGVSPredicted EffectsExperimental Outcomes
Nucleotide changePredicted effect (Legacy)Splicing (in silico prediction)*RNA effectProtein presentModulator response
c.166 G>Ap.Glu56Lys (E56K)Does not misspliceDoes not missplicep.Glu56LysResponsive
c.274 G>Ap.Glu92Lys (E92K)Does not misspliceMissplicesp.Glu92LysResponsive78
c.580 G>Ap.Gly194Arg (G194R)Does not misspliceDoes not missplicep.Gly194Arg
c.581 G>Tp.Gly194Val (G194V)Does not misspliceDoes not missplicep.Gly194Val
c.2909 G>Ap.Gly970Asp (G970D)46IndeterminateDoes not missplicep.Gly970AspResponsive
c.3719 T>Gp.Val1240Gly (V1240G)Does not misspliceDoes not missplicep.Val1240Gly
MIDDLE OF EXON (OUTSIDE OF SPLICE SITE)
Nucleotide changePredicted effect (Legacy)Splicing (in silico prediction)*RNA effectProtein presentModulator response
c.454 A>Gp.Met152Val (M152V)37MissplicesMissplicesshortenedNon-responsive
c.523 A>Gp.Ile175Val (I175V)MissplicesMissplicesshortened
c.2816 A>Gp.His939Arg (H939R)37MissplicesMissplicesshortened
c.3700 A>Gp.Ile124Val (I1234V)44MissplicesMissplicesp.Ile1234ValResponsive
END OF EXON (PENULTIMATE AND LAST NUCLEOTIDE)
Nucleotide changePredicted effect (Legacy)Splicing (in silico prediction)*RNA effectProtein presentModulator response
c.2908 G>Cp.Gly970Arg (G970R)46MissplicesMissplicesshortenedNon-responsive45
c.2908 G>Ap.Gly970Ser (G970S)MissplicesMissplicesshortened
c.3717 G>Cp.Arg1239Ser (R1239S)MissplicesMissplicesshortened
c.3872 A>Gp.Gln1291Arg (Q1291R)Does not misspliceDoes not missplicep.Gln1291Arg
c.3873 G>Cp.Gln1291His (Q1291H)50IndeterminateMissplicesp.Gln1291HisResponsive

The numbers shown in the superscript indicate previously published study on the respective variant.

Data supporting these findings in .

*A prediction was deemed “indeterminate” if the two in silico tools used did not agree ().

†This variant is approved for modulator therapy.

‡The primary RNA isoform produced by this variant is normally spliced ().

Table 2

Intronic variants evaluated for impact on CFTR mRNA splicing.

CANONICAL SPLICE SITE (+1, +2, -1, -2)
HGVS (Legacy)Predicted EffectExperimental Outcomes
Nucleotide changeSplicing (in silico prediction)*RNA effectProtein presentModulator response
c.164+1 G>A (296+1 G>A)MissplicesMissplicesShortened
c.164+2 T>C (296+2 T>C)MissplicesMissplicesShortened
c.165-2 A>G (297–2 A>G)MissplicesMissplicesShortened
c.273+1 G>A (405+1 G>A)MissplicesMissplicesShortenedNon-responsive
c.274-1 G>A (406–1 G>A)MissplicesMissplicesShortened
c.274-2 A>G (406–2 A>G)MissplicesMissplicesShortened
c.489+1 G>T (621+1 G>T)MissplicesMissplicesShortened
c.579+1 G>T (711+1 G>T)MissplicesMissplicesShortenedNon-responsive
c.1584+1 G>A (1716+1 G>A)MissplicesMissplicesShortened
c.1585-1 G>A (1717–1 G>A)36MissplicesMissplicesNo protein
c.2658-1 G>C (2790–1 G>C)MissplicesMissplicesNo protein
c.2658-2 A>G (2790–2 A>G)MissplicesMissplicesNo protein
c.2988+1 G>A (3120+1 G>A)36MissplicesMissplicesNo protein
c.3469-2 A>G (3601–2 A>G)MissplicesMissplicesShortened
c.3717+1 G>A (3849+1 G>A)MissplicesMissplicesShortened
c.3718-1 G>A (3850–1 G>A)MissplicesMissplicesShortened
c.3873+1 G>A (4005+1 G>A)MissplicesMissplicesShortened
c.3873+2 T>C (4005+2 T>C)MissplicesMissplicesResidual full-length
c.4242+2 T>C (4374+2 T>C)MissplicesMissplicesResidual full-lengthResponsive
PROXIMAL INTRONIC (+3, +4, +5, -3)
Nucleotide changeSplicing (in silico prediction)*RNA effectProtein presentModulator response
c.164+3_164+4insT (296+3insT)Does not misspliceDoes not misspliceFull-length
c.165-3 C>T (297–3 C>T)Does not misspliceMissplicesResidual full-lengthResponsive
c.273+3 A>C (405+3 A>C)MissplicesMissplicesShortened
c.489+3 A>G (621+3 A>G)MissplicesMissplicesResidual full-length
c.579+3 A>G (711+3 A>G)IndeterminateMissplicesShortened
c.579+3 A>C (711+3 A>C)MissplicesMissplicesShortened
c.579+3 A>T (711+3 A>T)IndeterminateMissplicesShortened
c.579+5 G>A (711+5 G>A)IndeterminateMissplicesShortened
c.2657+2_2657+3insA (2789+2insA)36MissplicesMissplicesResidual full-lengthResponsive
c.2657+5 G>A (2789+5 G>A)36MissplicesMissplicesResidual full-lengthResponsive
c.3468+2_3468+3insT (3600+2insT)MissplicesMissplicesNo protein
c.3468+5G>A (3600+5 G>A)MissplicesMissplicesResidual full-lengthResponsive
c.3717+4 A>G (3849+4 A>G)MissplicesMissplicesShortened
c.3717+5 G>A (3849+5 G>A)MissplicesMissplicesShortened
c.3718-3 T>G (3850–3 T>G)MissplicesMissplicesShortened
DISTAL INTRONIC
Nucleotide changeSplicing (in silico prediction)*RNA effectProtein presentModulator response
c.164+28 A>G (296+28 A>G)Does not misspliceDoes not misspliceFull-length
c.2620-26 A>G (2752–26 A>G)Does not misspliceDoes not misspliceFull-length
c.3717+40 A>G (3849+40 A>G)37MissplicesMissplicesResidual full-lengthResponsive

The numbers shown in the superscript indicate previously published study on the respective variant.

Data supporting these findings in .

*A prediction was deemed “indeterminate” if the two in silico tools used did not agree (S1 Table).

†This variant does not cause CF.

⧧This variant is approved for modulator therapy.

  82 in total

1.  Isogenic cell models of cystic fibrosis-causing variants in natively expressing pulmonary epithelial cells.

Authors:  Hillary C Valley; Katherine M Bukis; Alisa Bell; Yi Cheng; Eric Wong; Nikole J Jordan; Normand E Allaire; Andrey Sivachenko; Feng Liang; Hermann Bihler; Philip J Thomas; Jerome Mahiou; Martin Mense
Journal:  J Cyst Fibros       Date:  2018-12-15       Impact factor: 5.482

Review 2.  Regulatory feedback from nascent RNA to chromatin and transcription.

Authors:  Lenka Skalska; Manuel Beltran-Nebot; Jernej Ule; Richard G Jenner
Journal:  Nat Rev Mol Cell Biol       Date:  2017-03-08       Impact factor: 94.444

3.  5' splice site GC>GT and GT>GC variants differ markedly in terms of their functionality and pathogenicity.

Authors:  Jin-Huan Lin; Emmanuelle Masson; Arnaud Boulling; Matthew Hayden; David N Cooper; Claude Férec; Zhuan Liao; Jian-Min Chen
Journal:  Hum Mutat       Date:  2020-05-12       Impact factor: 4.878

4.  Unusual splice sites revealed by mutagenic inactivation of an authentic splice site of the rabbit beta-globin gene.

Authors:  B Wieringa; F Meyer; J Reiser; C Weissmann
Journal:  Nature       Date:  1983-01-06       Impact factor: 49.962

5.  Single base-pair substitutions in exon-intron junctions of human genes: nature, distribution, and consequences for mRNA splicing.

Authors:  Michael Krawczak; Nick S T Thomas; Bernd Hundrieser; Matthew Mort; Michael Wittig; Jochen Hampe; David N Cooper
Journal:  Hum Mutat       Date:  2007-02       Impact factor: 4.878

6.  A classification model relative to splicing for variants of unknown clinical significance: application to the CFTR gene.

Authors:  Caroline Raynal; David Baux; Corinne Theze; Corinne Bareil; Magali Taulan; Anne-Françoise Roux; Mireille Claustres; Sylvie Tuffery-Giraud; Marie des Georges
Journal:  Hum Mutat       Date:  2013-04-05       Impact factor: 4.878

7.  The role of exon sequences in splice site selection.

Authors:  A Watakabe; K Tanaka; Y Shimura
Journal:  Genes Dev       Date:  1993-03       Impact factor: 11.361

8.  A novel donor splice site in intron 11 of the CFTR gene, created by mutation 1811+1.6kbA-->G, produces a new exon: high frequency in Spanish cystic fibrosis chromosomes and association with severe phenotype.

Authors:  M Chillón; T Dörk; T Casals; J Giménez; N Fonknechten; K Will; D Ramos; V Nunes; X Estivill
Journal:  Am J Hum Genet       Date:  1995-03       Impact factor: 11.025

9.  Genetic, cell biological, and clinical interrogation of the CFTR mutation c.3700 A>G (p.Ile1234Val) informs strategies for future medical intervention.

Authors:  Steven V Molinski; Tanja Gonska; Ling Jun Huan; Berivan Baskin; Ibrahim A Janahi; Peter N Ray; Christine E Bear
Journal:  Genet Med       Date:  2014-02-20       Impact factor: 8.822

10.  Intronic splicing enhancers, cognate splicing factors and context-dependent regulation rules.

Authors:  Yang Wang; Meng Ma; Xinshu Xiao; Zefeng Wang
Journal:  Nat Struct Mol Biol       Date:  2012-09-16       Impact factor: 15.369

View more
  5 in total

1.  Clotting factor genes are associated with preeclampsia in high-altitude pregnant women in the Peruvian Andes.

Authors:  Maria A Nieves-Colón; Keyla M Badillo Rivera; Karla Sandoval; Vanessa Villanueva Dávalos; Luis E Enriquez Lencinas; Javier Mendoza-Revilla; Kaustubh Adhikari; Ram González-Buenfil; Jessica W Chen; Elisa T Zhang; Alexandra Sockell; Patricia Ortiz-Tello; Gloria Malena Hurtado; Ramiro Condori Salas; Ricardo Cebrecos; José C Manzaneda Choque; Franz P Manzaneda Choque; Germán P Yábar Pilco; Erin Rawls; Celeste Eng; Scott Huntsman; Esteban Burchard; Andrés Ruiz-Linares; Rolando González-José; Gabriel Bedoya; Francisco Rothhammer; Maria Cátira Bortolini; Giovanni Poletti; Carla Gallo; Carlos D Bustamante; Julie C Baker; Christopher R Gignoux; Genevieve L Wojcik; Andrés Moreno-Estrada
Journal:  Am J Hum Genet       Date:  2022-05-18       Impact factor: 11.043

2.  Missplicing suppressor alleles of Arabidopsis PRE-MRNA PROCESSING FACTOR 8 increase splicing fidelity by reducing the use of novel splice sites.

Authors:  Adrián Cabezas-Fuster; Rosa Micol-Ponce; Sara Fontcuberta-Cervera; María Rosa Ponce
Journal:  Nucleic Acids Res       Date:  2022-06-10       Impact factor: 19.160

3.  Congenital nephrotic syndrome in a Hispanic Guatemalan newborn associated with a NPHS1 variant: A case report.

Authors:  Thu T Tran; Vijay G Linga; Mohammed A I Al-Obaide; Daniella Bello-Germino; Mehar Hoda; Olubukunola Adesanya; Tetyana L Vasylyeva
Journal:  Biomed Rep       Date:  2021-11-16

Review 4.  Advances in Preclinical In Vitro Models for the Translation of Precision Medicine for Cystic Fibrosis.

Authors:  Iris A L Silva; Onofrio Laselva; Miquéias Lopes-Pacheco
Journal:  J Pers Med       Date:  2022-08-16

5.  Performance Evaluation of SpliceAI for the Prediction of Splicing of NF1 Variants.

Authors:  Changhee Ha; Jong-Won Kim; Ja-Hyun Jang
Journal:  Genes (Basel)       Date:  2021-08-25       Impact factor: 4.096

  5 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.