Anthony G Doran, Kim Wong, Jonathan Flint, David J Adams, Kent W Hunter, Thomas M Keane.
Abstract
BACKGROUND: The Mouse Genomes Project is an ongoing collaborative effort to sequence the genomes of the common laboratory mouse strains. In 2011, the initial analysis of sequence variation across 17 strains found 56.7 M unique single nucleotide polymorphisms (SNPs) and 8.8 M indels. We carry out deep sequencing of 13 additional inbred strains (BUB/BnJ, C57BL/10J, C57BR/cdJ, C58/J, DBA/1J, I/LnJ, KK/HiJ, MOLF/EiJ, NZB/B1NJ, NZW/LacJ, RF/J, SEA/GnJ and ST/bJ), cataloguing molecular variation within and across the strains. These strains include important models for immune response, leukaemia, age-related hearing loss and rheumatoid arthritis. We now have several examples of fully sequenced closely related strains that are divergent for several disease phenotypes.
Keywords: Biological pathways; Cancer; Disease; Genomic variation; Laboratory mouse; Mouse genomes; Sequencing; arthritis
Year: 2016 PMID: 27480531 PMCID: PMC4968449 DOI: 10.1186/s13059-016-1024-y
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Fig. 1 Phenotypes and variant statistics for each of the 13 strains. The strains sequenced in this project with a short description and notable phenotypes, followed by the total number of SNPs, indels and large deletions. Individual strain images provided by The Jackson Laboratory (ME, USA). The phylogenetic tree is a genome-wide summary built using all of the HapMap genotypes for each strain [22] and does not reflect local haplotype structure
Sequencing, alignment and variant statistics
| Strain | Raw bases (Gb) | Mapped coverage | SNPs | Private | Indels | Private | SVs | Private |
|---|---|---|---|---|---|---|---|---|
| BUB/BnJ | 135.42 | 38.92 | 4,982,043 | 54,418 | 929,617 | 19,187 | 37,034 | 5375 |
| C57BL/10J | 108.43 | 29.85 | 349,702 | 2733 | 83,930 | 1893 | 12,939 | 9904 |
| C57BR/cdJ | 140.94 | 39.76 | 2,601,138 | 9683 | 485,803 | 6307 | 18,927 | 3462 |
| C58/J | 153.06 | 43.76 | 2,834,841 | 27,805 | 547,151 | 11,726 | 20,666 | 3777 |
| DBA/1J | 139.79 | 38.55 | 5,129,920 | 9001 | 955,621 | 9684 | 34,239 | 4480 |
| I/LnJ | 124.97 | 34.11 | 5,547,267 | 78,583 | 1,013,686 | 24,628 | 36,612 | 6342 |
| KK/HiJ | 151.60 | 44.42 | 5,893,784 | 94,110 | 1,078,555 | 31,084 | 36,588 | 3725 |
| MOLF/EiJ | 112.55 | 30.68 | 19,027,669 | 1,818,029 | 2,892,638 | 367,208 | 79,239 | 17,394 |
| NZB/B1NJ | 133.24 | 36.15 | 5,473,276 | 54,189 | 1,014,782 | 20,670 | 34,406 | 3164 |
| NZW/LacJ | 160.46 | 49.06 | 5,736,044 | 52,902 | 1,051,170 | 21,397 | 38,207 | 3308 |
| RF/J | 147.33 | 43.18 | 5,090,115 | 32,390 | 937,091 | 13,440 | 44,205 | 11,685 |
| SEA/GnJ | 134.23 | 38.67 | 4,603,720 | 30,263 | 857,599 | 12,729 | 33,033 | 3686 |
| ST/bJ | 223.17 | 63.74 | 5,107,777 | 70,064 | 978,238 | 24,325 | 37,350 | 4474 |
Total sequencing in gigabases (Gb), mapped coverage (based on a 3 Gb genome), and the total and private numbers of SNPs, indels and large deletions are shown
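The relationship between the sequencing total and fold coverage in the table can be illustrated with a short calculation. This is a minimal sketch, not the authors' pipeline: it assumes coverage is simply bases divided by the 3 Gb genome size given in the footnote, and the names `depth`, `raw_cov` and `implied_fraction` are hypothetical. The gap between the raw-base depth and the reported mapped coverage is interpreted here as bases that did not contribute to mapped coverage, which is an assumption, since the record does not say how mapped coverage was derived.

```python
GENOME_SIZE_GB = 3.0  # genome size stated in the table footnote


def depth(bases_gb, genome_gb=GENOME_SIZE_GB):
    """Average fold coverage = sequenced bases / genome size."""
    return bases_gb / genome_gb


# BUB/BnJ values taken from the table above
raw_gb, mapped_cov = 135.42, 38.92

# Depth that would result if every raw base contributed to coverage
raw_cov = depth(raw_gb)

# Share of the raw-base depth reflected in the reported mapped coverage
# (assumed interpretation; not stated in the source)
implied_fraction = mapped_cov / raw_cov

print(f"raw depth: {raw_cov:.2f}x, implied fraction mapped: {implied_fraction:.1%}")
```

The same two lines can be applied to any other row of the table by substituting that strain's raw bases and mapped coverage.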
Fig. 2 SNP, indel and SV densities for chromosome 1 of all strains. SNP, indel and SV (insertions and deletions) densities (per Mb) for all variants identified on chromosome 1 for each of the 13 strains
Predicted consequences of variants
| Strain | SNPs: synonymous | SNPs: non-synonymous | SNPs: stop gain/loss | Indels: frameshift variant | Indels: inframe variant | Exonic SVs: deletions | Exonic SVs: insertions |
|---|---|---|---|---|---|---|---|
| BUB/BnJ | 9751 | 6050 | 44 | 41 | 169 | 1247 | 720 |
| C57BL/10J | 581 | 298 | 0 | 4 | 8 | 57 | 102 |
| C57BR/cdJ | 4400 | 2948 | 18 | 26 | 52 | 288 | 227 |
| C58/J | 5238 | 3362 | 23 | 35 | 69 | 330 | 254 |
| DBA/1J | 9782 | 6198 | 39 | 53 | 149 | 419 | 389 |
| I/LnJ | 10389 | 6249 | 42 | 52 | 150 | 690 | 1671 |
| KK/HiJ | 10816 | 6197 | 50 | 56 | 140 | 599 | 462 |
| MOLF/EiJ | 33710 | 17311 | 108 | 172 | 439 | 897 | 1305 |
| NZB/B1NJ | 10403 | 6367 | 43 | 62 | 146 | 500 | 390 |
| NZW/LacJ | 10434 | 6326 | 46 | 61 | 146 | 673 | 425 |
| RF/J | 9891 | 6174 | 39 | 56 | 140 | 2033 | 862 |
| SEA/GnJ | 9309 | 6027 | 38 | 57 | 138 | 1284 | 614 |
| ST/bJ | 9021 | 5578 | 45 | 59 | 133 | 713 | 413 |
Significantly over-represented biological pathways and candidate genes identified in RF/J
| Pathway name | Database | Corrected p value | Candidate genes |
|---|---|---|---|
| Degradation of the extracellular matrix | REACTOME | 4.91 × 10⁻³ | |
| Extracellular matrix organization | REACTOME | 1.15 × 10⁻² | |
| ECM-receptor interaction | KEGG | 1.32 × 10⁻² | |
| Pathways in cancer | KEGG | 4.56 × 10⁻² | |
Candidate genes are those that are found in the respective pathways and contained a missense SNP private to the RF/J strain. Corrected p values are based on the Benjamini–Hochberg method
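For readers unfamiliar with the Benjamini–Hochberg correction named in the table footnote, the adjustment can be sketched as follows. This is an illustrative implementation of the standard procedure, not the code used in the study; the function name `benjamini_hochberg` and the input values are made up for the example and are not the paper's uncorrected p values.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, scaling each by m / rank and
    # taking a running minimum so adjusted values stay monotone
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical uncorrected p values for four pathways
toy_p = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(toy_p))
```

A pathway would then be reported as over-represented when its adjusted value falls below the chosen threshold; every value in the tables here is below 0.05.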
Significantly over-represented biological pathways and candidate genes identified in DBA/1J
| Pathway name | Database | Corrected p value | Candidate genes |
|---|---|---|---|
| Cell adhesion molecules (CAMs) | KEGG | 3.98 × 10⁻⁴ | |
| Type I diabetes mellitus | KEGG | 7.54 × 10⁻³ | |
| Graft-versus-host disease | KEGG | 7.59 × 10⁻³ | |
| Autoimmune thyroid disease | KEGG | 8.55 × 10⁻³ | |
| Allograft rejection | KEGG | 1.01 × 10⁻² | |
| Antigen processing and presentation | KEGG | 1.06 × 10⁻² | |
| Viral myocarditis | KEGG | 1.14 × 10⁻² | |
| Endocytosis | KEGG | 3.07 × 10⁻² | |
| Linoleic acid metabolism | KEGG | 4.49 × 10⁻² | |
| Cytokine-cytokine receptor interaction | KEGG | 4.73 × 10⁻² | |
Candidate genes are genes that are found in the over-represented pathways and which contained a missense, deleterious SNP found in DBA/1J and not in DBA/2J or FVB/NJ. Corrected p values were calculated using the Benjamini–Hochberg method