Mara Colombo, Giovanna De Vecchi, Laura Caleca, Claudia Foglia, Carla B Ripamonti, Filomena Ficarazzi, Monica Barile, Liliana Varesco, Bernard Peissel, Siranoush Manoukian, Paolo Radice.
Abstract
Several unclassified variants (UVs) have been identified in splicing regions of disease-associated genes, and their characterization as pathogenic mutations or benign polymorphisms is crucial for understanding their role in disease development. In this study, 24 UVs located at BRCA1 and BRCA2 splice sites were characterized by transcript analysis. These results were used to evaluate the ability of nine bioinformatics programs to predict genetic variants causing aberrant splicing (spliceogenic variants) and the nature of the aberrant transcripts. Eleven variants in BRCA1 and 8 in BRCA2, including 8 not previously characterized at the transcript level, were ascertained to affect mRNA splicing. Of these, 16 led to the synthesis of aberrant transcripts containing premature termination codons (PTCs), 2 to the up-regulation of naturally occurring alternative transcripts containing PTCs, and one to an in-frame deletion within the region coding for the DNA binding domain of BRCA2, causing the loss of the ability to bind the partner protein DSS1 and ssDNA. For each computational program, we evaluated the rate of non-informative analyses, i.e., those that did not recognize the natural splice sites in the wild-type sequence, and the rate of false positive predictions, i.e., variants incorrectly classified as spliceogenic, as a measure of their specificity, under conditions setting the sensitivity of predictions to 100%. The best-performing programs were Human Splicing Finder and Automated Splice Site Analyses, both exhibiting 100% informativeness and specificity. For 10 mutations the activation of cryptic splice sites was observed, but we were unable to derive simple criteria to select, among the different cryptic sites predicted by the bioinformatics analyses, those actually used. Consistent with previous reports, our study provides evidence that in silico tools can be used for selecting splice site variants for in vitro analyses. However, the latter remain mandatory for characterizing the nature of aberrant transcripts.
Year: 2013 PMID: 23451180 PMCID: PMC3579815 DOI: 10.1371/journal.pone.0057173
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Experimentally observed effects on mRNA splicing of group A variants and predicted protein change.
| Variant (BIC nomenclature) | Variant (HGVS nomenclature) | SS | mRNA change observed (description) | mRNA change observed (HGVS nomenclature) | Allelic expression of normal transcript(s) | Predicted protein change (description) | Predicted protein change (HGVS nomenclature) | Classification according to current guidelines |
| BRCA1 | | | | | | | | |
| IVS7+2T>G | c.441+2T>G | D | skipping of 62 bp at the 3′-end of exon 7 | r.[380_441del] | mono-allelic | stop at codon 137 | p.Ser127ThrfsX11 | 5 |
| IVS8+2T>A | c.547+2T>A | D | skipping of exon 8 | r.[442_547del] | mono-allelic | stop at codon 198 | p.Gln148AspfsX51 | 5 |
| IVS16+1G>T | c.4986+1G>T | D | retention of 65 bp at the 5′-end of intron 16 | r.[4986_4987ins4986+1_4986+65; 4986+1g>u] | not assessable | stop at codon 1676 | p.Met1663ValfsX14 | 4 or 5 |
| IVS16−1G>A | c.4987−1G>A | A | skipping of exon 17 | r.[4987_5074del] | mono-allelic | stop at codon 1672 | p.Val1665SerfsX8 | 5 |
| IVS20−2delA | c.5278−2delA | A | skipping of exon 21; skipping of 8 bp at the 5′-end of exon 21 | r.[5278_5332del,5278_5285del] | mono-allelic | stop at codon 1774, stop at codon 1826 | p.Phe1761AsnfsX14, p.Ile1760GlyfsX67 | 5 |
| IVS21+1G>A | c.5332+1G>A | D | skipping of exon 21 | r.[5278_5332del] | mono-allelic | stop at codon 1774 | p.Phe1761AsnfsX14 | 5 |
| BRCA2 | | | | | | | | |
| IVS5+1G>A | c.475+1G>A | D | skipping of exon 5 | r.[426_475del] | mono-allelic | stop at codon 165 | p.Pro143GlyfsX23 | 5 |
| IVS5−2A>G | c.476−2A>G | A | skipping of exon 6; up-regulation of Δexons 5–6 isoform | r.[ = , 476_516del,426_516del] | bi-allelic | stop at codon 168, stop at codon 154 | p.Val159GlyfsX10, p.Ser142ArgfsX13 | 4 |
| IVS13−2A>T | c.7008−2A>T | A | skipping of exon 14; skipping of 10 bp at 5′-end of exon 14; skipping of 246 bp at 5′-end of exon 14 | r.[7008_7435del,7008_7017del,7008_7253del] | mono-allelic | stop at codon 2353, stop at codon 2363, stop at codon 3337 | p.Thr2337PhefsX17, p.Thr2337AsnfsX27, p.Thr2337ValfsX1001 | 5 |
| IVS21−1G>A | c.8755−1G>A | A | skipping of exon 22; skipping of exon 22 + 51 bp at the 5′-end of exon 23 | r.[ = , 8755_8953del,8755_9004del] | bi-allelic | stop at codon 2921, stop at codon 2944 | p.Gly2919LeufsX3, p.Gly2919LysfsX26 | 4 |
| IVS22−1delGTTinsAA | c.8954−1_8955delGTTinsAA | A | skipping of 51 bp at the 5′-end of exon 23; skipping of exon 23 | r.[8954_9004del,8954_9117del] | mono-allelic | in-frame deletion of 17 aa, stop at codon 2988 | p.Val2985_Thr3001del, p.Val2985GlyfsX4 | 5 |
Protein change was predicted using ExPASy Proteomics Server (http://www.expasy.ch/);
The classification as class 5 (pathogenic) or class 4 (likely pathogenic) was based on mono- or bi-allelic expression of the normal transcript [23]. Previously characterized variants are indicated;
[22];
[43];
[22], [44], [45];
[43];
[18]. An asterisk indicates variants for which the observed transcript pattern differed from that reported by previous studies (see Table S6). Abbreviations: SS, splice site (D, donor; A, acceptor); BIC, Breast Cancer Information Core (http://research.nhgri.nih.gov/bic/); HGVS, Human Genome Variation Society (http://www.hgvs.org/mutnomen).
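The sizes quoted in the description column can be cross-checked against the r. notation, since the coordinates of a deletion are inclusive. The following is a minimal sketch, assuming only simple `start_enddel` deletions with numeric coding-sequence coordinates; the function name `deletion_summary` and the regular expression are illustrative and are not part of the study's methods. A deletion whose length is a multiple of 3 preserves the reading frame, otherwise a frameshift is expected.

```python
import re

# Hypothetical helper: matches simple deletions written as "start_enddel",
# e.g. "380_441del". Insertions and intronic coordinates are ignored.
_DELETION = re.compile(r"(\d+)_(\d+)del")


def deletion_summary(r_notation):
    """Return a list of (length_in_nt, in_frame) for each simple deletion."""
    summary = []
    for start, end in _DELETION.findall(r_notation):
        length = int(end) - int(start) + 1  # both coordinates are inclusive
        summary.append((length, length % 3 == 0))
    return summary


# Values taken from the table above.
print(deletion_summary("r.[380_441del]"))
print(deletion_summary("r.[8954_9004del,8954_9117del]"))
```

For `r.[380_441del]` this gives a 62-nt out-of-frame deletion, matching the "skipping of 62 bp" entry. For `r.[8954_9004del,8954_9117del]` the first deletion is 51 nt and in frame, consistent with the reported loss of 17 amino acids.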
Experimentally observed effects on mRNA splicing of group B variants and predicted protein change.
| Variant (BIC nomenclature) | Variant (HGVS nomenclature) | SS | mRNA change observed (description) | mRNA change observed (HGVS nomenclature) | Allelic expression of normal transcript(s) | Predicted protein change (description) | Predicted protein change (HGVS nomenclature) | Classification according to current guidelines |
| BRCA1 | | | | | | | | |
| IVS3+3del AAGT | c.134+3_134+6del AAGT | D | up-regulation of Δexon 3 isoform | r.[81_134del] | not assessable | stop at codon 27 | p.Cys27X | 4 or 5 |
| 331G>A | c.212G>A | D | up-regulation of Δexon5q isoform | r.[191_212del] | mono-allelic | stop at codon 64 | p.Cys64X | 5 |
| IVS5−11T>G | c.213−11T>G | A | retention of 59 bp at the 3′-end of intron 5 | r.[212_213ins213-59_213-1; 213-11u>g] | mono-allelic | stop at codon 81 | p.Arg71SerfsX11 | 5 |
| IVS8−3delT | c.548−3delT | A | none | r.[ = ] | bi-allelic | none | p. = | 2 |
| IVS9−4A>G | c.594−4A>G | A | none | r.[ = ] | bi-allelic | none | p. = | 2 |
| 4216G>A | c.4097G>A | A | none | r.[4097g>a] | bi-allelic | aa change at codon 1366 | p.Gly1366Asp | 2 |
| 4603G>T | c.4484G>T | D | skipping of exon 14 | r.[4358_4484del] | mono-allelic | stop at codon 1462 | p.Ala1453GlyfsX10 | 5 |
| IVS16+5G>A | c.4986+5G>A | D | retention of 65 bp at the 5′-end of intron 16 | r.[4986_4987ins4986+1_4986+65; 4986+5g>a] | mono-allelic | stop at codon 1676 | p.Met1663ValfsX14 | 5 |
| 5452A>G | c.5333A>G | A | none | r.[5333a>g] | bi-allelic | aa change at codon 1778 | p.Asp1778Gly | 2 |
| BRCA2 | | | | | | | | |
| 859G>A | c.631G>A | D | skipping of exon 7 | r.[517_631del] | mono-allelic | stop at codon 191 | p.Gly173SerfsX19 | 5 |
| IVS21+3G>C | c.8754+3G>C | D | retention of 46 bp at the 5′-end of intron 21 | r.[8754_8755ins8754+1_8754+46; 8754+4a>g] | mono-allelic | stop at codon 2922 | p.Gly2919ValfsX4 | 5 |
| 9344C>T | c.9116C>T | D | none | r.[9116c>u] | bi-allelic | aa change at codon 3039 | p.Pro3039Leu | 2 |
| 9345G>A | c.9117G>A | D | skipping of exon 23 | r.[8954_9117del] | mono-allelic | stop at codon 2988 | p.Val2985GlyfsX4 | 5 |
Protein change was predicted using ExPASy Proteomics Server (http://www.expasy.ch/);
The classification as class 5 (pathogenic) or class 4 (likely pathogenic) was based on mono- or bi-allelic expression of the normal transcript [23], that of class 2 (likely neutral) on A-GVGD software prediction (http://agvgd.iarc.fr/). Previously characterized variants are indicated;
[22], [50];
[22], [46];
[22], [47], [49];
[21];
[19], [44], [45];
[21], [26];
[20], [22];
[11], [18], [22], [48]. An asterisk indicates variants for which the observed transcript pattern differed from that reported by previous studies (see Table S6). Abbreviations: SS, splice site (D, donor; A, acceptor); BIC, Breast Cancer Information Core (http://research.nhgri.nih.gov/bic/); HGVS, Human Genome Variation Society (http://www.hgvs.org/mutnomen/).
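The classification column in the two tables follows a visible pattern: spliceogenic variants with mono-allelic expression of the normal transcript are class 5, those with bi-allelic expression are class 4, those where allelic expression could not be assessed are "4 or 5", and variants without aberrant transcripts are class 2. The sketch below only restates that pattern as a lookup. The function `provisional_class` is hypothetical, it simplifies the guidelines cited as [23], and it does not model the A-GVGD prediction that the class 2 assignments also relied on.

```python
def provisional_class(aberrant_transcript, allelic_expression):
    """Toy restatement of the classification pattern seen in the tables.

    aberrant_transcript: True if an aberrant mRNA was observed.
    allelic_expression: 'mono-allelic', 'bi-allelic' or 'not assessable',
        referring to expression of the normal transcript.
    """
    if not aberrant_transcript:
        # In the source, class 2 also depended on A-GVGD (not modeled here).
        return "2"
    if allelic_expression == "mono-allelic":
        return "5"
    if allelic_expression == "bi-allelic":
        return "4"
    return "4 or 5"  # allelic expression could not be assessed


# Rows taken from the tables: c.4484G>T, c.476-2A>G, c.4986+1G>T, c.4097G>A.
print(provisional_class(True, "mono-allelic"))
print(provisional_class(True, "bi-allelic"))
print(provisional_class(True, "not assessable"))
print(provisional_class(False, "bi-allelic"))
```

Run on these four rows, the function reproduces the tabulated classes 5, 4, "4 or 5" and 2, respectively.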
Figure 1. RT-PCR analyses of group A variants.
For each variant, the RT-PCR products were characterized by agarose gel electrophoresis and sequencing. Gel images: lane 1, no template; lane 2, genomic DNA used as negative control of the RT-PCR reaction; lane 3, cDNA from the BRCA1/BRCA2 wild-type LCL used as positive control; lane 4, cDNA from the LCL carrying the UV. M, molecular marker (ΦX-174 HaeIII digest). The sizes of the full-length (FL) and aberrant transcripts are reported. Sequencing electropherogram data: (A–D, F) the RT-PCR products were directly sequenced; (E, G–K) sequencing was performed after a band excision or cloning step. (D, G, H) Additional bands due to improper annealing of full-length and aberrant transcripts are indicated by an asterisk. (E) In addition to the full-length and the Ex7_62 bp del aberrant transcript, the naturally occurring isoform lacking the first 3 bp of exon 8 (Ex8_3 bp del) was observed. Ex, exon; I, intron.
Figure 2. RT-PCR analyses of group B variants.
For each variant, the RT-PCR products were characterized by agarose gel electrophoresis and sequencing. Gel images: lane 1, no template; lane 2, genomic DNA used as negative control of the RT-PCR reaction; lane 3, cDNA from the BRCA1/BRCA2 wild-type LCL used as positive control; lane 4, cDNA from the LCL carrying the UV. M, molecular marker (ΦX-174 HaeIII digest). The sizes of the full-length (FL) and aberrant transcripts are reported. Sequencing electropherogram data: (B–G) the RT-PCR products were directly sequenced; (A, H) sequencing was performed after a band excision or cloning step. (H) An additional band due to improper annealing of full-length and aberrant transcripts is indicated by an asterisk. The Ex5del, visible in both sample and control, is a naturally occurring isoform lacking exon 5. (A) In addition to the full-length and the Ex14del aberrant transcript, the naturally occurring isoform lacking the first 3 bp of exon 14 (Ex14_3 bp del) was observed. Ex, exon; I, intron.
Figure 3. Functional analysis of BRCA2 p.Val2985_Thr3001del.
(A) Schematic representation of GST-BRCA2 recombinant proteins. Wild-type and mutant BRCA2 fragments, encoding the DBD and the N-terminal region, were cloned into the pGEX4T1 vector to express GST-BRCA2 fusion proteins under the control of the lacUV5 promoter. BRCA2 amino acid positions, helical domain (HD) and OB fold domains 1, 2, 3 (OB1, OB2, OB3) are indicated. (B) Interaction of wild-type and mutated BRCA2 DBD polypeptides with DSS1. Equivalent amounts of GST-tagged wild-type or mutated BRCA2 fusion proteins were immobilized on GSH-Sepharose beads and challenged with MCF7 lysates as a source of GFP-DSS1. Input (top panel) and pulled-down (middle panel) GFP-DSS1 protein were visualized by Western blotting with anti-GFP antibody. GSH-Sepharose beads and GST protein were used as negative controls. GST-tagged recombinant proteins were visualized by Coomassie staining of the SDS-PAGE gel used in the pull-down experiment (bottom panel). (C) Interaction of wild-type and mutated BRCA2 polypeptides with ssDNA. The mutated and wild-type peptides, removed from glutathione-agarose beads by thrombin digestion, were chromatographed on ssDNA agarose beads. A 200-amino-acid N-terminal peptide was used as negative control. The free (F) and bound (B) fractions were separated, submitted to gel electrophoresis and visualized by Coomassie staining. Immunoblots were scanned using an HP Scanjet G3010 Photo Scanner (Hewlett Packard).
In silico predicted effect of group B variants and comparison with experimental results.
| Variant HGVS-nomenclature | Aberrant mRNAs | SSF | MES | NNSPLICE | GS | HSF | NG2 | SV | SP | ASSA |
| BRCA1 | | | | | | | | | | |
| c.134+3_134+6del AAGT | YES | −100%; S (C) | −100%; S (C) | −100%; S (C) | −100%; S (C) | −25.7%; S (C) | −100%; S (C) | −100%; S (C) | −100%; S (C) | −99.7%; S (C) |
| c.212G>A | YES | −100%; S (C) | −81.6%; S (C) | −100%; S (C) | −100%; S (C) | −13.5%; S (C) | −100%; S (C) | −100%; S (C) | −100%; S (C) | −88.3%; S (C) |
| c.213−11T>G | YES | −100%; S (C) | −100%; S (C) | −4.1%; S (C) | −100%; S (C) | −100%; S (C) | −81.05%; S (C) | |||
| c.548−3delT | NO | −39.8%; S (D) | −2.3%; NS (C) | +0.6%; NS (C) | 0.0%; NS (C) | |||||
| c.594−4A>G | NO | 0.0%; NS (C) | +1.5%; NS (C) | −1.6%; NS (C) | +13.4%; NS (C) | −0.1%; NS (C) | −100%; S (D) | 0.0%; NS (C) | +0.1%; NS (C) | +7.2%; NS (C) |
| c.4097G>A | NO | −4.4%; NS (C) | −23.4%; NS (C) | −13.9%; S (D) | −12.3%; NS (C) | −3.8%; NS (C) | −2.4%; NS (C) | −0.1%; NS (C) | −46.4%; NS (C) | |
| c.4484G>T | YES | −13.4%; S (C) | −47.4%; S (C) | −9.8%; S (C) | −100%; S (C) | −11.2%; S (C) | −44.4%; S (C) | −6.7%; S (C) | −100%; S (C) | −89.8%; S (C) |
| c.4986+5G>A | YES | −100%; S (C) | −100%; S (C) | −100%; S (C) | −15.0%; S (C) | −100%; S (C) | −100%; S (C) | −91.2%; S (C) | ||
| c.5333A>G | NO | +5.4%; NS (C) | +12.3%; NS (C) | +32.5%; NS (C) | +17.7%; NS (C) | +3.9%; NS (C) | 0.0%; NS (C) | +2.6%; NS (C) | 0.0%; NS (C) | +86.6%; NS (C) |
| BRCA2 | | | | | | | | | | |
| c.631G>A | YES | −100%; S (C) | −100%; S (C) | −100%; S (C) | −12.7%; S (C) | −100%; S (C) | −87.5%; S (C) | |||
| c.8754+3G>C | YES | −6.3%; S (C) | −31.5%; S (C) | −35.5%; S (C) | −100%; S (C) | −7.4%; S (C) | −29.0%; S (C) | −8.0%; S (C) | −100%; S (C) | −95.3%; S (C) |
| c.9116C>T | NO | −43.6%; S (D) | −100%; S (D) | 0.0%; NS (C) | 0.0%; NS (C) | −100%; S (D) | −6.7%; NS (C) | |||
| c.9117G>A | YES | −100%; S (C) | −100%; S (C) | −14.7%; S (C) | −100%; S (C) | −100%; S (C) | −87.5%; S (C) | |||
| False positive analyses rate (%) | 0.0 (0/3) | 40.0 (2/5) | 50.0 (2/4) | 0.0 (0/3) | 0.0 (0/5) | 50.0 (1/2) | 0.0 (0/4) | 20.0 (1/5) | 0.0 (0/5) | |
For all computational programs except ASSA, the relative percent differences of the splice site prediction scores (SSPSs) between the wild-type and the mutated sequences are reported. For ASSA, which uses information theory-based values (Ri), the percent differences of binding affinity in the mutated compared to the wild-type sequences are reported. Empty cells indicate that the natural splice site was not recognized by the indicated program. In silico analyses predicting spliceogenic (S) or non-spliceogenic (NS) variants according to the described procedure (see text) are indicated. (C) indicates in silico predictions concordant with in vitro data; (D), discordant predictions. Abbreviations: HGVS, Human Genome Variation Society (http://www.hgvs.org/mutnomen/).
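The two quantities in this table can be illustrated with a short sketch. It assumes the relative percent difference is computed as (mutant − wild type) / wild type × 100, which is the usual reading of the footnote but is not spelled out in this excerpt. The score values passed to `percent_difference` are made up for illustration. The false positive counts are copied from the last row of the table, where each denominator is the number of informative analyses of non-spliceogenic variants for that program.

```python
def percent_difference(wild_type_score, mutant_score):
    """Relative percent change of a splice site score (assumed formula)."""
    return (mutant_score - wild_type_score) / wild_type_score * 100.0


def false_positive_rate(false_positives, informative_analyses):
    """Percentage of non-spliceogenic variants wrongly called spliceogenic."""
    return 100.0 * false_positives / informative_analyses


# (false positives, informative analyses) from the table's last row.
FALSE_POSITIVE_COUNTS = {
    "SSF": (0, 3), "MES": (2, 5), "NNSPLICE": (2, 4),
    "GS": (0, 3), "HSF": (0, 5), "NG2": (1, 2),
    "SV": (0, 4), "SP": (1, 5), "ASSA": (0, 5),
}

# Hypothetical scores: a site scoring 80 in wild type and 60 in the mutant.
print(percent_difference(80.0, 60.0))

for program, (fp, n) in FALSE_POSITIVE_COUNTS.items():
    rate = false_positive_rate(fp, n)
    print(f"{program}: false positives {rate:.1f}% ({fp}/{n}), "
          f"specificity {100.0 - rate:.1f}%")
```

A score that drops to zero in the mutant gives −100% under this formula, which is how a completely abolished site appears in the table.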