John R Shorter1, Maya L Najarian2, Timothy A Bell1,3, Matthew Blanchard1, Martin T Ferris1, Pablo Hock1, Anwica Kashfeen2, Kathryn E Kirchoff2, Colton L Linnertz1, J Sebastian Sigmon2, Darla R Miller1, Leonard McMillan2, Fernando Pardo-Manuel de Villena4,3.
Abstract
Two key features of recombinant inbred panels are well-characterized genomes and reproducibility. Here we report on the sequenced genomes of six additional Collaborative Cross (CC) strains and on the inbreeding progress of 72 CC strains. We have previously reported on the sequences of 69 CC strains that were publicly available, bringing the total number of CC strains with whole genome sequence to 75. The sequencing of these six CC strains updates the efforts toward inbreeding undertaken by the UNC Systems Genetics Core. The timing reflects our competing mandates to release to the public as many CC strains as possible while achieving an acceptable level of inbreeding. The six new strains have a higher-than-average founder contribution from non-domesticus strains compared with the previously released CC strains. Five of the six strains also have high residual heterozygosity (>14%), which may be related to non-domesticus founder contributions. Finally, we report updated estimates of residual heterozygosity across the entire CC population, obtained with a novel, simple, and cost-effective genotyping platform on three mice from each strain. We observe a reduction in residual heterozygosity across all previously released CC strains. We discuss the optimal use of different genetic resources available for the CC population.
Keywords: GenPred; Genomic Prediction; MPP; Multiparent Populations; Shared data resources; heterozygosity; inbreeding; whole genome sequence
Year: 2019 PMID: 30858237 PMCID: PMC6505143 DOI: 10.1534/g3.119.400039
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
MRCA information for the six new sequenced strains
| Strain and mouse ID | % Heterozygous in MRCA | % Heterozygous in sequenced sample | % Heterozygous in 3 miniMUGA genotypes |
|---|---|---|---|
| CC078/TauUnc_M1502 | 14.29996 | 1.7 | 5.52 |
| CC079/TauUnc_M1086 | 6.766071 | 4.5 | 8.62 |
| CC080/TauUnc_M1283 | 22.18861 | 3.77 | 17.11 |
| CC081/Unc_M332 | 21.91887 | 3.7 | 18.81 |
| CC082/Unc_M505 | 22.76935 | 5 | 18.84 |
| CC083/Unc_M3234 | 31.48258 | 15.4 | 28.87 |
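The abstract's statement that five of the six strains have residual heterozygosity above 14% can be checked against the table above. The following is a minimal sketch, assuming the 14% threshold refers to the MRCA column; the variable names are illustrative and the values are transcribed from the table.

```python
# Percent heterozygosity per strain, transcribed from the table above.
# Assumption: the ">14%" figure in the abstract refers to the MRCA column.
mrca = {
    "CC078/TauUnc": 14.29996,
    "CC079/TauUnc": 6.766071,
    "CC080/TauUnc": 22.18861,
    "CC081/Unc": 21.91887,
    "CC082/Unc": 22.76935,
    "CC083/Unc": 31.48258,
}
minimuga = {
    "CC078/TauUnc": 5.52,
    "CC079/TauUnc": 8.62,
    "CC080/TauUnc": 17.11,
    "CC081/Unc": 18.81,
    "CC082/Unc": 18.84,
    "CC083/Unc": 28.87,
}

THRESHOLD = 14.0  # percent

# Strains whose MRCA heterozygosity exceeds the threshold.
high_mrca = sorted(s for s, v in mrca.items() if v > THRESHOLD)

# Difference between the miniMUGA estimate and the MRCA estimate
# (negative values mean lower heterozygosity in the miniMUGA samples).
change = {s: round(minimuga[s] - mrca[s], 2) for s in mrca}

print(len(high_mrca), "of", len(mrca), "strains exceed", THRESHOLD, "% in the MRCA")
for strain, delta in change.items():
    print(f"{strain}: {delta:+.2f} percentage points")
```

Under this reading, CC079/TauUnc is the one strain below the threshold.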
Founder haplotype frequency in sequenced samples
| Population | Statistic or strain | A/J | C57BL/6J | 129S1/SvImJ | NOD/ShiLtJ | NZO/HlLtJ | CAST/EiJ | PWK/PhJ | WSB/EiJ |
|---|---|---|---|---|---|---|---|---|---|
| Released CC strains | Average | 0.117 | 0.150 | 0.146 | 0.142 | 0.137 | 0.094 | 0.085 | 0.128 |
| Released CC strains | Min | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| Released CC strains | Max | 0.297 | 0.338 | 0.282 | 0.317 | 0.374 | 0.201 | 0.173 | 0.354 |
| New CC strains | CC078/TauUnc | 0.176 | 0.166 | 0.142 | 0.113 | 0.046 | 0.098 | 0.136 | 0.123 |
| New CC strains | CC079/TauUnc | 0.199 | 0.147 | 0.130 | 0.086 | 0.109 | 0.157 | 0.135 | 0.038 |
| New CC strains | CC080/TauUnc | 0.163 | 0.223 | 0.151 | 0.059 | 0.182 | 0.055 | 0.027 | 0.140 |
| New CC strains | CC081/Unc | 0.089 | 0.148 | 0.181 | 0.154 | 0.153 | 0.071 | 0.101 | 0.096 |
| New CC strains | CC082/Unc | 0.148 | 0.101 | 0.062 | 0.107 | 0.150 | 0.177 | 0.099 | 0.149 |
| New CC strains | CC083/Unc | 0.177 | 0.166 | 0.142 | 0.110 | 0.046 | 0.098 | 0.136 | 0.123 |
| New CC strains | Average | 0.159 | 0.159 | 0.135 | 0.105 | 0.114 | 0.109 | 0.106 | 0.112 |
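The claim that the new strains carry a higher non-domesticus contribution can be illustrated from the table above by summing the CAST/EiJ and PWK/PhJ frequencies, the two founders that are not of domesticus subspecific origin. This is a rough sketch using the tabulated averages only; the dictionary layout and names are illustrative, not part of the original analysis.

```python
# CAST/EiJ and PWK/PhJ haplotype frequencies, transcribed from the table above.
# The non-domesticus contribution is taken here as CAST/EiJ + PWK/PhJ.
released_average = {"CAST/EiJ": 0.094, "PWK/PhJ": 0.085}
new_strains = {
    "CC078/TauUnc": {"CAST/EiJ": 0.098, "PWK/PhJ": 0.136},
    "CC079/TauUnc": {"CAST/EiJ": 0.157, "PWK/PhJ": 0.135},
    "CC080/TauUnc": {"CAST/EiJ": 0.055, "PWK/PhJ": 0.027},
    "CC081/Unc": {"CAST/EiJ": 0.071, "PWK/PhJ": 0.101},
    "CC082/Unc": {"CAST/EiJ": 0.177, "PWK/PhJ": 0.099},
    "CC083/Unc": {"CAST/EiJ": 0.098, "PWK/PhJ": 0.136},
}


def non_domesticus(freqs):
    """Sum of the CAST/EiJ and PWK/PhJ haplotype frequencies."""
    return freqs["CAST/EiJ"] + freqs["PWK/PhJ"]


released_total = round(non_domesticus(released_average), 3)
per_strain = {s: round(non_domesticus(f), 3) for s, f in new_strains.items()}
new_mean = round(sum(per_strain.values()) / len(per_strain), 3)

print("Released strains (average):", released_total)
print("New strains (mean of six):", new_mean)
for strain, total in per_strain.items():
    print(f"  {strain}: {total}")
```

The comparison is between averages; individual new strains vary, and CC080/TauUnc and CC081/Unc fall below the released average on this measure.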
Figure 1. Genomic contribution of new strains to the CC resource on chromosome 5. (A) Haplotype structure for the 6 new sequenced strains. (B) Current haplotype frequency among the previously sequenced 69 strains. Dotted lines represent areas with low PWK or CAST representation. (C) The substrain contribution across the eight CC founders. The CAST founder has genomic regions of domesticus origin within the highlighted area. We use the following colors and letter codes to represent the eight founder strains of the CC: A/J, yellow; C57BL/6J, gray; 129S1/SvImJ, pink; NOD/ShiLtJ, dark blue; NZO/HlLtJ, light blue; CAST/EiJ, green; PWK/PhJ, red; and WSB/EiJ, purple.
Analysis of de novo deletions
| Strain | Chr | "45-mer" start | "45-mer" end | Resolved start | Resolved end | Size (kb) | Reads spanning deletion boundaries | Haplotype | Genes | Regulatory elements | Notes |
|---|---|---|---|---|---|---|---|---|---|---|---|
| CC079/TauUnc | | 102931561 | 102947356 | 102931583 | 102947323 | 15.795 | na | 129S1/SvImJ | None | ENSMUSR00000737813 | |
| CC079/TauUnc | | 11600371 | 11620171 | 11600372 | 11620112 | 19.8 | 11 | PWK/PhJ | | ENSMUSR00000759295, ENSMUSR00000759296, ENSMUSR00000759294 | Independent validation |
| CC079/TauUnc | | 129264616 | 129265201 | 129264705 | 129265185 | 0.585 | 37 | PWK/PhJ | | | |
| CC081/Unc | | 56858446 | 56862406 | 56858470 | 56862407 | 3.96 | 40 | C57BL/6J | None | | |
Figure 2. De novo private deletions in the new CC sequenced strains. (A) A deletion on the X chromosome in CC079/TauUnc not shared with CC058/Unc, which also has a PWK haplotype. (B) The normalized whole-genome coverage of sequence in 1-kb bins for CC079/TauUnc. The deletion spans the pseudogene Gm14515. (C) Assembled sequencing from msBWTs shows the breakpoint of the deletion.
Figure 3. Residual heterozygosity levels in CC populations. (A) Change in CC strain heterozygosity compared to total percent heterozygosity. Red dots represent the 6 new strains. The dark blue line represents the regression of change in heterozygosity on heterozygosity levels in the MRCAs. (B) Heterozygosity levels in sequenced samples compared to heterozygosity levels in miniMUGA samples. The dotted black line marks equal heterozygosity in both sets of samples. Dots to the right of the dotted line are samples where the sequenced sample underestimates heterozygosity, and dots to the left are samples where the sequenced sample overestimates heterozygosity.