Julie F Foley, Dhiral P Phadke, Owen Hardy, Sara Hardy, Victor Miller, Anup Madan, Kellie Howard, Kimberly Kruse, Cara Lord, Sreenivasa Ramaiahgari, Gregory G Solomon, Ruchir R Shah, Arun R Pandiri, Ronald A Herbert, Robert C Sills, B Alex Merrick.
Abstract
BACKGROUND: The rat genome was sequenced in 2004 with the aim of improving human health, as affected by disease and environmental influences, through gene discovery and animal model validation. Here, we report the development and testing of a probe set for whole exome sequencing (WES) to detect sequence variants in exons and UTRs of the rat genome. Using an in silico approach, we designed probes targeting the rat exome and compared captured mutations in cancer-related genes from four chemically induced rat tumor cell lines (C6, FAT7, DSL-6A/C1, NBTII) with validated cancer genes in the human Catalogue of Somatic Mutations in Cancer (COSMIC) database, as well as with normal rat DNA. Paired fresh frozen (FF) and formalin-fixed, paraffin-embedded (FFPE) liver tissues from naive rats were sequenced to confirm known dbSNP variants and identify any additional variants.
Keywords: C6; COSMIC; DSL-6A/C1; FAT7; NBTII; Next generation sequencing; Sanger; Whole exome sequencing
Year: 2018 PMID: 29925311 PMCID: PMC6011395 DOI: 10.1186/s12864-018-4858-8
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 Bioinformatic evaluation pipeline for variant detection. Initially, raw reads were mapped and trimmed, followed by targeted coverage analysis, filtration and functional annotation of variant calls. The resulting calls were then compared to validated cancer gene variants in the human COSMIC database. The final analysis involved assessment of the mutational spectrum from the tested samples
Rat exome-seq platform study samples
| Sample Name | Type | Treatment | Rat Strain |
|---|---|---|---|
| C6 | Glioma | N,N-nitroso-methylurea | Wistar |
| FAT7 | Nasal cavity squamous cell carcinoma | Formaldehyde | Fischer-344 |
| DSL-6A/C1 | Pancreatic acinar carcinoma | Azaserine | Lewis |
| NBTII | Surface epithelial bladder carcinoma | N-butyl-N-(4-hydroxybutyl)nitrosamine | Wistar |
| FF1 | Fresh frozen | Normal liver | Sprague Dawley |
| FF2 | Fresh frozen | Normal liver | Sprague Dawley |
| FF3 | Fresh frozen | Normal liver | Sprague Dawley |
| FF4 | Fresh frozen | Normal liver | Sprague Dawley |
| FFPE1 | Formalin-fixed, paraffin-embedded | Normal liver | Sprague Dawley |
| FFPE2 | Formalin-fixed, paraffin-embedded | Normal liver | Sprague Dawley |
| FFPE3 | Formalin-fixed, paraffin-embedded | Normal liver | Sprague Dawley |
| FFPE4 | Formalin-fixed, paraffin-embedded | Normal liver | Sprague Dawley |
Summary statistics for rat WES reads
| Sample | Total Reads | Aligned Reads | Aligned Reads (%) | Reads in Target Exons | Reads on Target (%) | Duplicate Reads (%) |
|---|---|---|---|---|---|---|
| C6 | 171 M | 170 M | 98.9 | 135 M | 79.5 | 32.3 |
| DSL-6A/C1 | 179 M | 178 M | 99.1 | 141 M | 79.1 | 27.5 |
| FAT7 | 185 M | 183 M | 99.1 | 146 M | 79.6 | 29.6 |
| NBTII | 176 M | 174 M | 99.1 | 139 M | 79.9 | 27.6 |
| FF1 | 187 M | 185 M | 99.2 | 146 M | 79.0 | 22.7 |
| FF2 | 181 M | 180 M | 99.1 | 141 M | 78.4 | 21.8 |
| FF3 | 189 M | 187 M | 99.0 | 150 M | 79.8 | 24.9 |
| FF4 | 189 M | 187 M | 99.0 | 149 M | 79.9 | 23.0 |
| FFPE1 | 172 M | 159 M | 92.0 | 118 M | 74.4 | 46.9 |
| FFPE2 | 174 M | 165 M | 95.0 | 128 M | 77.3 | 42.7 |
| FFPE3 | 160 M | 134 M | 84.0 | 81 M | 60.3 | 49.1 |
| FFPE4 | 170 M | 150 M | 88.2 | 105 M | 70.4 | 44.1 |
Read length was 101 bp for all samples
M: million
Fig. 2 Breadth of reference genome coverage. The percentage of target bases covering the rat Rn6 reference genome is shown at 1X, 10X, 20X, 30X and 50X depth of coverage. Rigorous testing at 50X demonstrated strong coverage of the rat reference genome by the sequenced fragments
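Breadth of coverage at a given depth is commonly computed as the share of target bases whose read depth meets or exceeds that depth. The following is a minimal sketch of that calculation, not the authors' pipeline. The function name `breadth_of_coverage`, the inclusive comparison, and the toy depth values are assumptions made for illustration.

```python
from typing import Dict, Iterable, Sequence


def breadth_of_coverage(
    depths: Sequence[int],
    thresholds: Iterable[int] = (1, 10, 20, 30, 50),
) -> Dict[int, float]:
    """Return the percentage of target bases with depth >= each threshold."""
    n_bases = len(depths)
    if n_bases == 0:
        raise ValueError("no target bases supplied")
    return {
        t: 100.0 * sum(d >= t for d in depths) / n_bases
        for t in thresholds
    }


# Hypothetical per-base depths for a tiny 10-base target region.
toy_depths = [0, 5, 12, 25, 33, 48, 50, 75, 120, 300]
print(breadth_of_coverage(toy_depths))
```

In practice the per-base depths would come from a coverage tool run over the capture targets; the list above only stands in for that input.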
Fig. 3 Uniformity of coverage for up to 500 bp reads for the cell lines and paired FF-FFPE samples. a-d Depth of coverage distribution for the C6, FAT7, DSL-6A/C1 and NBTII cell lines. e-h Depth of coverage distribution for the fresh frozen (FF) liver tissue. i-l Depth of coverage distribution for the formalin-fixed, paraffin-embedded (FFPE) liver tissue
All exonic variants detected in the rat exome-seq samples
| Sample | Total Number of SNPs | Annotated SNPs (%) (based on dbSNP) | Non-annotated SNPs (%) (based on dbSNP) |
|---|---|---|---|
| C6 | 30,529 | 15,945 (52.2%) | 14,584 (47.8%) |
| FAT7 | 24,167 | 13,818 (57.2%) | 10,349 (42.8%) |
| DSL-6A/C1 | 22,060 | 12,719 (57.7%) | 9341 (42.3%) |
| NBTII | 37,984 | 18,121 (47.7%) | 19,863 (52.3%) |
| FF | 22,685 | 14,240 (62.7%) | 8445 (37.2%) |
| FFPE | 15,944 | 10,193 (63.9%) | 5751 (36.1%) |
Variant detection at 50X depth of coverage and alternate allele depth of 20X
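The depth thresholds in the footnote above can be read as a simple per-variant filter, followed by a split into dbSNP-annotated and non-annotated calls. The sketch below illustrates that reading under stated assumptions: the `VariantCall` record, the function names, and the inclusive thresholds are hypothetical, and the toy calls are invented. Only the C6 counts in the final line come from the table.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class VariantCall:
    """Hypothetical minimal record for one called SNP."""
    depth: int       # total read depth at the site
    alt_depth: int   # reads supporting the alternate allele
    in_dbsnp: bool   # True if the variant is already annotated in dbSNP


def passes_depth_filter(
    call: VariantCall, min_depth: int = 50, min_alt_depth: int = 20
) -> bool:
    # Assumption: both thresholds are inclusive.
    return call.depth >= min_depth and call.alt_depth >= min_alt_depth


def percent(part: int, total: int) -> float:
    """Percentage rounded to one decimal place, as reported in the table."""
    return round(100.0 * part / total, 1)


def annotation_split(calls: Iterable[VariantCall]) -> Tuple[int, int, int]:
    """Return (total kept, annotated, non-annotated) after depth filtering."""
    kept = [c for c in calls if passes_depth_filter(c)]
    annotated = sum(c.in_dbsnp for c in kept)
    return len(kept), annotated, len(kept) - annotated


# Invented toy calls: two pass the filter, two fail on depth or alt depth.
toy_calls = [
    VariantCall(depth=120, alt_depth=38, in_dbsnp=False),
    VariantCall(depth=56, alt_depth=56, in_dbsnp=True),
    VariantCall(depth=49, alt_depth=30, in_dbsnp=True),
    VariantCall(depth=80, alt_depth=19, in_dbsnp=False),
]
print(annotation_split(toy_calls))

# C6 row of the table: 15,945 annotated and 14,584 non-annotated of 30,529.
print(percent(15945, 30529), percent(14584, 30529))
```

Applying `percent` to the C6 counts reproduces the 52.2% and 47.8% shown in the table, which suggests the reported percentages are fractions of the total SNP count per sample.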
Cell line specific exonic variants detected in the rat exome-seq samples
| Sample | Total Number of Variants | Annotated Variants (%) | Non-annotated Variants (%) |
|---|---|---|---|
| C6 | 5387 | 1523 (28.3%) | 3864 (71.7%) |
| FAT7 | 3551 | 1211 (34.1%) | 2340 (65.9%) |
| DSL-6A/C1 | 3277 | 1069 (32.6%) | 2208 (67.4%) |
| NBTII | 10,123 | 2430 (24.0%) | 7693 (76.0%) |
Variant detection at 50X depth of coverage and alternate allele depth of 20X
Variant candidate validation by Sanger sequencing
| Chromosome #: Position | Gene | Exon | Codon Change | C6 Coverage (Total) | C6 Coverage (Allele) | C6 AF | FAT7 Coverage (Total) | FAT7 Coverage (Allele) | FAT7 AF | DSL-6A/C1 Coverage (Total) | DSL-6A/C1 Coverage (Allele) | DSL-6A/C1 AF | NBTII Coverage (Total) | NBTII Coverage (Allele) | NBTII AF |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Chr10:56196111 | | R271H | cGt/cAt | WT | WT | – | | | | WT | WT | – | WT | WT | – |
| Chr10:56195619 | | R211W | Cgg/Tgg | WT | WT | – | WT | WT | – | WT | WT | – | | | |
| Chr10:56195677 | | I230T | aTc/aCc | WT | WT | – | WT | WT | – | WT | WT | – | | | |
| Chr10:48699815 | | E544D | gaG/gaC | | | | | | | | | | | | |
| Chr2:118851700 | | M811T | aTg/aCg | WT | WT | – | WT | WT | – | WT | WT | – | | | |
| Chr2:118831618 | | C90Y | tGt/tAt | | | | WT | WT | – | WT | WT | – | WT | WT | – |
| Chr5 | Cdkn2a | Deletion | | WT | WT | – | WT | WT | – | WT | WT | – | WT | WT | – |
| Chr5 | Cdkn2b | Deletion | | WT | WT | – | WT | WT | – | WT | WT | – | WT | WT | – |
| Chr10:66790063 | Nf1 | Q962* | Cag/Tag | | | | WT | WT | – | WT | WT | – | WT | WT | – |
| Chr1:208011800 | Mki67 | G207R | Ggg/Agg | | | | WT | WT | – | | | | WT | WT | – |
| Chr1:53731379 | Mllt4 | Q1440* | Caa/Taa | WT | WT | – | | | | WT | WT | – | WT | WT | – |
| Chr1:200672370 | Fgfr2 | V70M | Gtg/Atg | | | | | | | WT | WT | – | WT | WT | – |
| Chr8:129618253 | | D32V | gAt/gTt | WT | WT | – | WT | WT | – | | | | WT | WT | – |
| Chr16:50398579 | Fat1 | L2464P | cTc/cCc | WT | WT | – | WT | WT | – | WT | WT | – | | | |
| Chr16:50485429 | Fat1 | M536T | aTg/aCg | | | | | | | WT | WT | – | | | |
| Chr16:50485954 | Fat1 | F361S | tTc/tCc | WT | WT | – | | | | WT | WT | – | WT | WT | – |
| Chr3:72125722 | | M417 | atg/ | WT | WT | – | WT | WT | – | | | | WT | WT | – |
| Chr2:125846677 | | I3077V | Att/Gtt | WT | WT | – | WT | WT | – | | | | WT | WT | – |
| Chr2:125754029 | | S627T | aGt/aCt | WT | WT | – | WT | WT | – | WT | WT | – | | | |
| Chr2:125754184 | | L679F | Ctc/Tcc | WT | WT | – | WT | WT | – | WT | WT | – | | | |
| Chr5:151918885 | | Y658* | taT/taA | WT | WT | – | WT | WT | – | WT | WT | – | | | |
| Chr16:23972323 | Nat1 | S15L | tCa/tTa | WT | WT | – | WT | WT | – | WT | WT | – | | | |
| Chr16:23961976 | | L52* | tTa/tAa | WT | WT | – | WT | WT | – | WT | WT | – | | | |
| Chr8:118824049 | | R792TEPSVR | agg/aCTGAACCTTCAGTTAgg | WT | WT | – | | | | | | | WT | WT | – |
| Chr8:118824049 | | R79TESSVR | agg/aCTGAATCTTCAGTTAgg | WT | WT | – | WT | WT | – | WT | WT | – | 120 | 38 | 0.317 |
| Chr8:87845140 | | T133I | aCc/aTc | | | | WT | WT | – | WT | WT | – | WT | WT | – |
| Chr6:38567303 | | T837N | aCc/aAc | 56 | 56 | 1.000 | | | – | 60 | 60 | 1.000 | | | – |
| Chr6:38567315 | | A841V | gCg/gTg | 53 | 53 | 1.000 | | | – | 58 | 58 | 1.000 | | | – |
| Chr8:58154526 | | K1186R | aAg/aGg | 68 | 68 | 1.000 | 50 | 50 | 1.000 | WT | WT | – | WT | WT | – |
a In the gene column, bold and italics indicate that the variant found in the rat exome platform matches both the amino acid substitution and the location in COSMIC
b In the gene column, italics indicate that the variant found in the rat exome platform matches only the amino acid substitution in COSMIC
c In the gene column, bold, italics and underline indicate that the variant found in the rat exome platform matches only the location in COSMIC
d In a cell line column, bold indicates a homozygous mutation
e In a cell line column, italics indicate a heterozygous mutation
f In a cell line column, entries are given as rat WES platform variant call/Sanger-based sequencing variant call (WT: -, Variant: +). Bold and italics indicate a sequencing discrepancy between WES and Sanger
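The AF values reported with read counts in the table are consistent with allele coverage divided by total coverage; for example, 38 of 120 reads gives 0.317. A minimal sketch of that arithmetic follows. The function names and the 0.9 cutoff used to separate homozygous from heterozygous calls are assumptions for illustration; the source does not state how zygosity was assigned.

```python
def allele_frequency(total_coverage: int, allele_coverage: int) -> float:
    """Alternate allele frequency, rounded to three decimals as in the table."""
    if total_coverage <= 0:
        raise ValueError("total coverage must be positive")
    if not 0 <= allele_coverage <= total_coverage:
        raise ValueError("allele coverage must lie between 0 and total coverage")
    return round(allele_coverage / total_coverage, 3)


def zygosity(af: float, hom_cutoff: float = 0.9) -> str:
    # Hypothetical cutoff: treat calls at or above hom_cutoff as homozygous.
    return "homozygous" if af >= hom_cutoff else "heterozygous"


# Counts taken from table rows that report coverage: 120/38 and 56/56.
for total, allele in [(120, 38), (56, 56)]:
    af = allele_frequency(total, allele)
    print(total, allele, af, zygosity(af))
```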
Fig. 4 Mutational spectrum of the rat exome-seq data. a All exonic variants captured across all samples in the 71 Mb design (50 Mb + UTRs), plus the dbSNP variants, were plotted using the Kullback-Leibler divergence. A high frequency of C > T and T > C mutations was present, with minimal observed differences in the mutational spectrum across all samples and dbSNP. b Hierarchical clustering grouped dbSNP with the normal FF-FFPE tissue and showed divergence of these groups from the tumor cell lines
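One common way to compare mutational spectra is to collapse single-nucleotide substitutions into the six pyrimidine-referenced classes (C>A, C>G, C>T, T>A, T>C, T>G) and then compute the Kullback-Leibler divergence between the resulting frequency vectors. The sketch below shows that approach under stated assumptions; it is not the authors' implementation. The helper names, the pseudocount of 1, and the toy variant lists are hypothetical.

```python
import math
from collections import Counter
from typing import Iterable, List, Sequence, Tuple

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
CLASSES = ("C>A", "C>G", "C>T", "T>A", "T>C", "T>G")


def substitution_class(ref: str, alt: str) -> str:
    """Collapse a substitution onto its pyrimidine-referenced class."""
    if ref in "AG":  # purine reference: report the complementary strand
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


def spectrum(snvs: Iterable[Tuple[str, str]], pseudocount: float = 1.0) -> List[float]:
    """Six-class frequency vector; the pseudocount avoids zero probabilities."""
    counts = Counter(substitution_class(r, a) for r, a in snvs)
    total = sum(counts[c] for c in CLASSES) + pseudocount * len(CLASSES)
    return [(counts[c] + pseudocount) / total for c in CLASSES]


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """D(P || Q) in nats; assumes q has no zero entries."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Invented toy variant lists given as (reference base, alternate base).
sample_a = [("C", "T"), ("G", "A"), ("T", "C")]
sample_b = [("C", "A"), ("G", "T"), ("A", "C")]
print(kl_divergence(spectrum(sample_a), spectrum(sample_b)))
```

Note that the divergence is asymmetric, so `kl_divergence(p, q)` and `kl_divergence(q, p)` generally differ; a clustering step would typically need a symmetric distance derived from it.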
Fig. 5 Mutational spectrum of cell line specific exonic variants. a Filtering FF-FFPE variants out of the exome-seq data separated dbSNP from the tumor cell line samples. b Based on hierarchical clustering of cell line specific variants, filtered for FF-FFPE and dbSNP variants, the mutational spectrum of the rat exome-seq data is closest to COSMIC Signatures 16, 5, 8 and 3