Pavel Dobrynin, Shiping Liu, Gaik Tamazian, Zijun Xiong, Andrey A Yurchenko, Ksenia Krasheninnikova, Sergey Kliver, Anne Schmidt-Küntzel, Klaus-Peter Koepfli, Warren Johnson, Lukas F K Kuderna, Raquel García-Pérez, Marc de Manuel, Ricardo Godinez, Aleksey Komissarov, Alexey Makunin, Vladimir Brukhin, Weilin Qiu, Long Zhou, Fang Li, Jian Yi, Carlos Driscoll, Agostinho Antunes, Taras K Oleksyk, Eduardo Eizirik, Polina Perelman, Melody Roelke, David Wildt, Mark Diekhans, Tomas Marques-Bonet, Laurie Marker, Jong Bhak, Jun Wang, Guojie Zhang, Stephen J O'Brien.
Abstract
BACKGROUND: Patterns of genetic and genomic variance are informative in inferring population history for human, model species and endangered populations.
Year: 2015 PMID: 26653294 PMCID: PMC4676127 DOI: 10.1186/s13059-015-0837-4
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Assembly and annotation of the cheetah genome
| Number | Feature | Detail | Size | Source |
|---|---|---|---|---|
| | Genome sequence and assembly | 7 cheetahs | | |
| 1 | | 3 cheetahs | 75 × reference | Table S2 |
| 2 | | 4 cheetahs | 5 × resequencing | Table S2 |
| 3 | SOAP deNovo assembly | | | Tables S1, S4 |
| 4 | Assisted assembly with domestic cat | | | Table S5 |
| | a. Radiation Hybrid map | 3000 markers | | |
| | b. Linkage map | 60,000 SNVs | | |
| 5 | Estimated genome size (assembly and 17-mer) | | 2.375–2.395 Gb | Table S3 |
| 6 | N50 contigs | | 28.2 kbp | Table S4 |
| 7 | N50 scaffolds | | 3.1 Mb | Table S4 |
| 8 | Average GC content | | 0.475 | Figure S3 |
| | Annotation | | | |
| 9 | Coding genes | 20,343 genes | 601.2 Mb | Table S10 |
| 10 | Non-coding RNA | 200,045 loci | 17 Mb | Table S11 |
| | a. microRNA | 43,878 loci | 4.41 Mb | Table S11 |
| | b. small nuclear RNA | 1,605 loci | 186 kbp | Table S11 |
| | c. transfer RNA | 154,031 loci | 12.7 Mb | Table S11 |
| | d. ribosomal RNA | 531 loci | 85 kbp | Table S11 |
| 11 | Single nucleotide variants (SNVs) | 1,820,419 loci | | Tables S15–S20 |
| 12 | Repetitive elements (39.48 % of cheetah genome) | Interspersed repeats | 746 Mb | Tables S6, S7 |
| | | Tandem repeats | 51.2 Mb | Table S8 |
| | | Complex tandem repeats (3,126 loci) | 2.04 Mb | Table S9 |
| | | Microsatellites (487,898 loci) | 23.47 Mb | Table S8 |
| 13 | Genomic rearrangements of cheetah vs domestic cat | | 93 Mb | Figures S5, S6; Tables S13, S14 |
| 14 | Nuclear mitochondrial segments | | 105.6 kbp | Table S12 |
| 15 | Positively selected genes | 946 genes | | Datasheet S5 |
| 16 | GARfield Genome Browser | | | |
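Rows 6 and 7 of the table report N50 values for contigs and scaffolds. As a reminder of what that statistic measures, the following is a minimal sketch of the usual N50 definition (the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length). The function name `n50` and the toy lengths are illustrative assumptions, not taken from the paper or its supplementary tables.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are sorted from longest to shortest and accumulated until
    the running total reaches half of the summed length; the length of
    the sequence that crosses that point is the N50.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical toy assembly (arbitrary units), not real cheetah data.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_contigs)
```

A larger N50 indicates that more of the assembly sits in long sequences, which is why the scaffold value in the table is far larger than the contig value.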
Fig. 1 Estimates of genome diversity in the cheetah genome relative to other mammal genomes. a SNV rate in mammals. SNV rate for each individual was estimated using all variant positions, with repetitive regions not filtered. b SNV density in cheetahs, four other felids and human based upon estimates in 50-kbp sliding windows. A total of 38,661 fragments were shorter than the specified window size and were excluded from further analysis; most of those fragments are contigs shorter than 500 bp. The remaining sequence yielded 46,787 windows with a total length of 2.337 Gb, which were built and analyzed. c Number of SNVs in protein-coding genes in felid genomes. d The cheetah genome is composed of 93 % homozygous stretches. The genome of Boris, an outbred feral domestic cat living in St. Petersburg (top), is compared to Cinnamon, a highly inbred Abyssinian cat (Fca-6.2 reference for domestic cat genome sequence [19, 20], middle), and a cheetah (Chewbacca, bottom) as described here. Approximately 15,000 regions of 100 kbp across the genome for each species were assessed for SNVs. Regions of high variability (>40 SNVs/100 kbp) are colored red; highly homozygous regions (≤40 SNVs/100 kbp) are colored green. The first seven chromosome homologues of the genomes of Boris, Cinnamon and Chewbacca are displayed for direct comparison. The median lengths of homozygosity stretches in cheetahs (seven individuals), African lions (five individuals), Siberian and Bengal tigers, and the domestic cat are presented in Additional file 1: Figure S7.
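The window-based classification described for panel d can be sketched as follows. This is only an illustration under stated assumptions: SNV positions are 0-based coordinates on a single sequence, windows are consecutive and non-overlapping, and a trailing window shorter than the window size is dropped (similar in spirit to the exclusion of short fragments mentioned for panel b). All function names and the toy data are hypothetical and do not come from the authors' pipeline; only the 40 SNVs per 100 kbp threshold is taken from the caption.

```python
WINDOW_BP = 100_000  # region size used for the red/green coloring
THRESHOLD = 40       # SNVs per window separating the two classes


def count_snvs_per_window(positions, seq_length, window_bp=WINDOW_BP):
    """Count SNVs in consecutive non-overlapping windows.

    positions: 0-based SNV coordinates on one sequence.
    Assumption: a trailing window shorter than window_bp is dropped.
    """
    n_windows = seq_length // window_bp
    counts = [0] * n_windows
    for pos in positions:
        idx = pos // window_bp
        if idx < n_windows:
            counts[idx] += 1
    return counts


def classify_windows(counts, threshold=THRESHOLD):
    """Label a window 'variable' if it has more than `threshold` SNVs."""
    return ["variable" if c > threshold else "homozygous" for c in counts]


def homozygous_fraction(labels):
    """Fraction of windows labelled 'homozygous' (0.0 for no windows)."""
    return labels.count("homozygous") / len(labels) if labels else 0.0


# Hypothetical toy sequence of 350 kbp, not real cheetah data.
toy_positions = (
    list(range(0, 50_000, 1_000))             # 50 SNVs in window 0
    + [100_500, 150_000, 199_999]             # 3 SNVs in window 1
    + list(range(200_000, 240_000, 1_000))    # 40 SNVs in window 2
    + [320_000]                               # in the dropped partial window
)
toy_counts = count_snvs_per_window(toy_positions, seq_length=350_000)
toy_labels = classify_windows(toy_counts)
```

In this toy case the third window has exactly 40 SNVs and is therefore labelled homozygous, matching the caption's "≤40" boundary.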
Fig. 2 Comparison of MHC region structure between cheetah and domestic cats. Left side: Two chromosome B2 segments containing domestic cat MHC genes ordered on BAC libraries [29, 30]. Right side: Cheetah scaffolds related to MHC region. Order of scaffolds is based on the results of synteny analysis (light blue fill). Individual genes are denoted by dots and colored according to their MHC class: light blue for extended class II, blue for class II, green for class III, orange for class I, red for olfactory receptors and purple for histones. Genetic diversity in the MHC region was estimated by calculating SNV counts in non-overlapping 50-kbp windows. These counts are visualized by colored lines in the plot; for cats: green for wildcat, red for Boris and purple for Cinnamon; for cheetahs: red for Tanzania and orange for Namibia.
Fig. 3 Demographic history analysis of African cheetah. a Demographic history of two cheetah populations (southern in Namibia and eastern in Tanzania) based on DaDi analyses. Four distinct but plausible model scenarios were simulated with DaDi using the allele frequency spectrum (AFS) data. Model 4 fits the data best; see “Materials and methods” for the decision algorithm pathway that identified model 4 as best. b The first and second graphs represent marginal spectra for a pair of populations. The third graph shows residuals between the model and the observed data. Red or blue residuals indicate that the model predicts too many or too few alleles in a given cell, respectively. The fourth graph shows goodness-of-fit tests based on the likelihood and Pearson’s statistic, with both indicating that our model is a reasonable, though incomplete, description of the data.
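To make the residual and Pearson-statistic language in panel b concrete, here is a minimal sketch for a two-population frequency spectrum stored as a nested list. The caption does not state which residual definition was plotted, so the per-cell Poisson residual below, (model − data) / √model, is an assumption chosen so that positive values mean the model predicts too many alleles (the "red" case in the caption). The function names and the 2 × 2 toy spectra are hypothetical, and every model entry is assumed to be strictly positive.

```python
import math


def poisson_residuals(model, data):
    """Per-cell residuals (model - data) / sqrt(model).

    Positive: the model predicts too many alleles in that cell.
    Negative: the model predicts too few.
    Assumption: every model entry is strictly positive.
    """
    return [
        [(m - d) / math.sqrt(m) for m, d in zip(model_row, data_row)]
        for model_row, data_row in zip(model, data)
    ]


def pearson_chi2(model, data):
    """Pearson's statistic: the sum of squared per-cell residuals."""
    return sum(r * r for row in poisson_residuals(model, data) for r in row)


# Hypothetical 2 x 2 spectra, not values from the paper.
toy_model = [[4.0, 9.0], [16.0, 25.0]]
toy_data = [[6, 9], [12, 30]]
toy_residuals = poisson_residuals(toy_model, toy_data)
toy_chi2 = pearson_chi2(toy_model, toy_data)
```

In practice such a statistic is compared against values obtained from data simulated under the fitted model, which is how a goodness-of-fit test of the kind mentioned in the caption is typically framed.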
Fig. 4 Comparison of Dn/Ds distributions for reproduction-related and all cheetah genes. a Distributions of branch-specific values of Dn/Ds for reproductive system genes. Dn/Ds ratios were calculated for five species (dog, human, cat, tiger and cheetah) based on 500 bootstrap replications and the free-ratio model in PAML [37]. b Distributions of branch-specific Dn/Ds values for four species (dog, cat, tiger and cheetah) and the ancestral reconstructed Felidae branch. Dn/Ds ratios for these branches were based on 200 bootstrap replications of 10 Mb protein-coding sequences.
Fig. 5 Analysis of orthologous gene families. a Unique and shared gene families in the cheetah genome. b Dynamic evolution of ortholog gene clusters. The estimated numbers of ortholog groups in the common ancestral species are shown on the internal nodes. The numbers of orthologous groups that expanded or contracted in each lineage after speciation are shown on the corresponding branch, with + referring to expansion and − referring to contraction. The cheetah genome contained 17,863 orthologous gene families. Among these, 10,983 orthologous gene families were shared by all eight genomes and 12,114 by felids, while 11 orthologous gene families were exclusively shared among Felidae species (cat, lion, tiger and cheetah) and another 112 were exclusively shared by the cheetah and cat (Additional file 3: Datasheet S2). There were 1335 predicted genes containing 2293 InterPro domains unique to cheetahs (Additional file 3: Datasheet S1). Both figures are based on the comparison of orthologous gene families among eight mammalian species.