Márton Münz, Elise Ruark, Anthony Renwick, Emma Ramsay, Matthew Clarke, Shazia Mahamdallie, Victoria Cloke, Sheila Seal, Ann Strydom, Gerton Lunter, Nazneen Rahman.
Abstract
BACKGROUND: Next-generation sequencing (NGS) offers unprecedented opportunities to expand clinical genomics. It also presents challenges with respect to integration with data from other sequencing methods and historical data. Provision of consistent, clinically applicable variant annotation of NGS data has proved difficult, particularly of indels, an important variant class in clinical genomics. Annotation in relation to a reference genome sequence, the DNA strand of coding transcripts and potential alternative variant representations has not been well addressed. Here we present tools that address these challenges to provide rapid, standardized, clinically appropriate annotation of NGS data in line with existing clinical standards.
Year: 2015 PMID: 26315209 PMCID: PMC4551696 DOI: 10.1186/s13073-015-0195-6
Source DB: PubMed Journal: Genome Med ISSN: 1756-994X Impact factor: 11.117
Fig. 1 Example of an indel with alternative representations. The variant is a ‘GGG’ insertion that overlaps the 5′ boundary of BRCA2 exon 11. This would be annotated as an inframe glycine duplication in the most 3′ representation, as is standard for clinical annotations, but as an intronic insertion with no impact on coding sequence if left-aligned, as is typical for most NGS annotation tools.
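The difference between the two representations in Fig. 1 can be illustrated with a toy example. The sketch below is not CAVA's implementation; the function names `shift_insertion` and `apply_insertion`, the 0-based coordinate convention and the example sequence are assumptions made for illustration. It slides an insertion through a repeat to its left-most or right-most equivalent position on the reference strand. For a transcript on the forward strand the right-most position is the most 3′ one; for a reverse-strand transcript it would be the left-most.

```python
def apply_insertion(ref, pos, ins):
    """Return the sequence obtained by inserting `ins` before 0-based index `pos`."""
    return ref[:pos] + ins + ref[pos:]


def shift_insertion(ref, pos, ins, direction):
    """Slide an insertion to its left-most or right-most equivalent position.

    `pos` is the 0-based index in `ref` before which `ins` is inserted.
    Each step rotates the inserted bases by one, so the resulting
    (mutated) sequence is unchanged; only its description differs.
    """
    if direction == "right":
        # Shift while the next reference base equals the first inserted base.
        while pos < len(ref) and ref[pos] == ins[0]:
            ins = ins[1:] + ins[0]
            pos += 1
    elif direction == "left":
        # Shift while the previous reference base equals the last inserted base.
        while pos > 0 and ref[pos - 1] == ins[-1]:
            ins = ins[-1] + ins[:-1]
            pos -= 1
    else:
        raise ValueError("direction must be 'left' or 'right'")
    return pos, ins


# Toy reference (hypothetical, not the BRCA2 sequence): a GGG repeat.
reference = "ACGGGTT"
left = shift_insertion(reference, 3, "GGG", "left")
right = shift_insertion(reference, 3, "GGG", "right")
```

Both `left` and `right` describe the same mutated sequence. If an exon boundary fell inside the repeat, the two descriptions would land on opposite sides of it, which is the situation shown in Fig. 1.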
CAVA variant classification system
| Class | Description |
|---|---|
| SG | Stop-gain (nonsense) variant caused by base substitution |
| ESS | Any variant that alters essential splice-site base (+1, +2, −1, −2) |
| SS5 | Any variant that alters +5 splice-site base but not an ESS base |
| SS | Any variant that alters splice-site base within the first eight intronic bases flanking exon (i.e., +8 to −8) but not an ESS or SS5 base |
| EE | Variant that alters the first or last three bases of an exon (i.e., the |
| FS | Frameshifting insertion and/or deletion. It alters length and frame of coding sequence |
| IM | Variant that alters initiating methionine start codon |
| SL | Variant that causes a stop-loss (i.e., the stop codon is altered) |
| IF | Inframe insertion and/or deletion. It alters length but not frame of coding sequence |
| NSY | Nonsynonymous variant. It alters amino acid(s) but not coding sequence length |
| SY | Synonymous variant. It does not alter amino acid or coding sequence length |
| INT | Any variant in an intron that does not alter splice-site bases |
| 5PU | Any variant in 5′ untranslated region |
| 3PU | Any variant in 3′ untranslated region |
A variant can only have one CAVA class. If a variant could potentially be included in more than one class, the first class in the list is assigned. For example, a frameshifting deletion that alters the start codon would be CAVA class FS (not IM). Nonsynonymous is also known as missense. Stop-gain is also known as nonsense.
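The "first class in the list" rule can be sketched as a simple priority lookup. This is an illustrative sketch, not CAVA's code: the names `CAVA_CLASS_ORDER` and `assign_class` are hypothetical, and the function assumes that the set of classes a variant could belong to has already been determined elsewhere.

```python
# Class order copied from the table above; earlier entries take priority.
CAVA_CLASS_ORDER = [
    "SG", "ESS", "SS5", "SS", "EE", "FS", "IM",
    "SL", "IF", "NSY", "SY", "INT", "5PU", "3PU",
]


def assign_class(candidates):
    """Return the single highest-priority class from a set of candidate classes."""
    unknown = set(candidates) - set(CAVA_CLASS_ORDER)
    if unknown:
        raise ValueError(f"unknown class label(s): {sorted(unknown)}")
    for cls in CAVA_CLASS_ORDER:
        if cls in candidates:
            return cls
    return None  # no candidate classes were supplied


# A frameshifting deletion that also alters the start codon is reported as FS.
example = assign_class({"IM", "FS"})
```

An ordered list keeps the tie-breaking rule in one place, so the priority can be read directly against the table.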
Comparison of CSN and current nomenclature for exonic base substitutions
| CSN | Current nomenclatureᵃ: nucleotide | Current nomenclatureᵃ: amino acid |
|---|---|---|
| c.1040A>G_p.Gln347Arg | c.1040A>G | p.Gln347Arg |
| c.1911T>C_p.= | c.1911T>C | p.Gly637Gly |
| c.3264T>C_p.= | c.3264T>C | p.Pro1088Pro |
| c.3515C>T_p.Ser1172Leu | c.3515C>T | p.Ser1172Leu |
| c.3516G>A_p.= | c.3516G>A | p.Ser1172Ser |
| c.5682C>G_p.Tyr1894X | c.5682C>G | p.Tyr1894Ter |
| c.5855T>A_p.Leu1952X | c.5855T>A | p.Leu1952Ter |
| c.6131G>T_p.Gly2044Val | c.6131G>T | p.Gly2044Val |
| c.6675A>G_p.= | c.6675A>G | p.Thr2225Thr |
| c.7558C>T_p.Arg2520X | c.7558C>T | p.Arg2520Ter |
| c.8182G>A_p.Val2728Ile | c.8182G>A | p.Val2728Ile |
| c.9976A>T_p.Lys3326X | c.9976A>T | p.Lys3326Ter |
CSN allows easy visual discrimination between the different classes of exonic base substitutions with ‘=’ denoting a synonymous variant, ‘X’ denoting a stop-gain variant and the three letter code of the new amino acid denoting a nonsynonymous variant. CSN includes both the nucleotide and amino acid level descriptions to give a single, unique identifier for each variant.
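The relationship between the two notations in the table can be sketched as a small conversion for exonic base substitutions. This is a minimal illustration only: the function name `to_csn` is hypothetical, it assumes three-letter amino acid codes with `Ter` for a stop codon, and it covers just the three cases shown in the table (synonymous, stop-gain, nonsynonymous), not indels or other variant types.

```python
import re

# Matches protein-level descriptions such as "p.Gln347Arg" or "p.Tyr1894Ter".
_PROTEIN_RE = re.compile(r"p\.([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})")


def to_csn(c_change, p_change):
    """Combine nucleotide- and protein-level descriptions into one CSN-style string."""
    match = _PROTEIN_RE.fullmatch(p_change)
    if match is None:
        raise ValueError(f"unsupported protein description: {p_change!r}")
    ref_aa, position, alt_aa = match.groups()
    if alt_aa == ref_aa:
        protein = "p.="                        # synonymous
    elif alt_aa == "Ter":
        protein = f"p.{ref_aa}{position}X"     # stop-gain
    else:
        protein = p_change                     # nonsynonymous, unchanged
    return f"{c_change}_{protein}"


examples = [
    to_csn("c.1040A>G", "p.Gln347Arg"),
    to_csn("c.1911T>C", "p.Gly637Gly"),
    to_csn("c.5682C>G", "p.Tyr1894Ter"),
]
```

The three `examples` correspond to the first, second and sixth rows of the table.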
ᵃ The current nomenclature given is one of several different notation systems currently in use.
Example default output of CAVA v.1.0
| Chr | Pos | Ref | Alt | Qual | Filter | Type | ENST | Gene | TRINFO | Loc | CSN | Class | SO | Impact | Alt ann | Alt class | Alt SO |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 12009955 | C | T | 200 | PASS | Substitution | ENST00000196061 | PLOD1 | +/40.8 kb/19/2.9 kb | Ex3 | c.294C>T_p.= | SY | Synonymous_variant | 3 | . | . | . |
| 1 | 12919891 | G | T | 200 | PASS | Substitution | ENST00000240189 | PRAMEF2 | +/4.8 kb/4/1.6 kb | Ex3 | c.631G>T_p.Glu211X | SG | Stop_gained | 1 | . | . | . |
| 1 | 14106394 | A | ACTC | 200 | PASS | Insertion | ENST00000235372 | PRDM2 | +/120.2 kb/10/7.9 kb | Ex8 | c.2107_2109dupCCT_p.Pro703dup | IF | Inframe_insertion | 2 | c.2104_2105insCTC_p.Pro703dup | . | . |
| 1 | 15789297 | A | C | 200 | PASS | Substitution | ENST00000359621 | CELA2A | +/15.4 kb/8/0.9 kb | Ex4 | c.297A>C_p.= | SY | Synonymous_variant | 3 | . | . | . |
| 1 | 15812432 | A | G | 200 | PASS | Substitution | ENST00000375910 | CELA2B | +/15.3 kb/8/0.9 kb | Ex6 | c.530A>G_p.Gln177Arg | NSY | Missense_variant | 2 | . | . | . |
| 1 | 16727305 | G | GCTT | 200 | PASS | Insertion | ENST00000335496 | SPATA21 | -/38.8 kb/13/2.0 kb | Ex11 | c.1081_1083dupAAG_p.Lys361dup | IF | Inframe_insertion | 2 | c.1078_1079insAGA_p.Lys361dup | . | . |
| 1 | 22310824 | T | C | 200 | PASS | Substitution | ENST00000337107 | CELA3B | +/12.3 kb/8/0.9 kb | Ex6 | c.642T>C_p.= | EE | Splice_region_variant\|synonymous_variant | 2 | . | . | . |
| 1 | 31905889 | A | ACAG | 200 | PASS | Insertion | ENST00000373710 | SERINC2 | +/25.1 kb/11/2.1 kb | Ex10 | c.1129_1131dupCAG_p.Gln377dup | IF | Inframe_insertion | 2 | c.1116_1117insCAG_p.Gln377dup | . | . |
| 1 | 36937059 | A | G | 200 | PASS | Substitution | ENST00000373103 | CSF3R | -/17.2 kb/17/3.5 kb | Ex10 | c.1260T>C_p.= | SY | Synonymous_variant | 3 | . | . | . |
| 1 | 38023316 | C | T | 200 | PASS | Substitution | ENST00000296218 | DNALI1 | +/9.9 kb/6/2.6 kb | Ex2 | c.260C>T_p.Ala87Val | NSY | Missense_variant | 2 | . | . | . |
| 1 | 43771016 | TA | T | 200 | PASS | Deletion | ENST00000372476 | TIE1 | +/22.1 kb/23/3.9 kb | In3/4 | c.484+5delA | SS5 | Splice_donor_5th_base_variant | 2 | c.484+3delA | . | . |
| 1 | 54605319 | G | GC | 200 | PASS | Insertion | ENST00000371330 | CDCP2 | -/14.8 kb/4/2.7 kb | Ex4 | c.1223_1224insG | FS | Frameshift_variant | 1 | . | . | . |
| 1 | 55251689 | T | C | 200 | PASS | Substitution | ENST00000371276 | TTC22 | -/21.6 kb/7/3.3 kb | Ex5 | c.987A>G_p.= | SY | Synonymous_variant | 3 | . | . | . |
| 1 | 55603581 | T | TA | 200 | PASS | Insertion | ENST00000294383 | USP24 | -/149.0 kb/68/10.8 kb | In26/27 | c.2929-5dupT | SS | Intron_variant\|splice_region_variant | 3 | c.2929-9_2929-8insT | INT | intron_variant |
| 1 | 60503762 | T | C | 200 | PASS | Substitution | ENST00000371201 | C1orf87 | -/83.4 kb/12/2.0 kb | Ex6 | c.765A>G_p.= | SY | Synonymous_variant | 3 | . | . | . |
| 1 | 62232031 | C | T | 200 | PASS | Substitution | ENST00000371158 | INADL | +/421.4 kb/43/8.5 kb | Ex4 | c.270C>T_p.= | SY | Synonymous_variant | 3 | . | . | . |
| 1 | 67155862 | TCTC | T | 200 | PASS | Deletion | ENST00000371037 | SGIP1 | +/210.8 kb/25/4.6 kb | In16/17 | c.1444-8_1444-6delCCT | SS | Intron_variant\|splice_region_variant | 3 | c.1444-10_1444-8delCTC | . | . |
Chr, chromosome; Pos, position; Ref, reference allele; Alt, alternative allele; Qual, quality score; TRINFO, transcript information; Loc, location in transcript; Alt ann, alternative annotation; Alt class, alternative class; Alt SO, alternative SO term