Jianqing Lin, David Duchêne, Christian Carøe, Oliver Smith, Marta Maria Ciucani, Jonas Niemann, Douglas Richmond, Alex D Greenwood, Ross MacPhee, Guojie Zhang, Shyam Gopalakrishnan, M Thomas P Gilbert.
Abstract
Three principal methods are under discussion as possible pathways to "true" de-extinction: back-breeding, cloning, and genetic engineering.1,2 Although genetic engineering is the approach most likely to apply to the largest number of extinct species, its potential is constrained by the degree to which the genome of the extinct species can be reconstructed. We explore this question using the extinct Christmas Island rat (Rattus macleari), an endemic species that was driven to extinction between 1898 and 1908, as a model.3-5 We first re-sequenced its genome to an average of >60× coverage and mapped it to the reference genomes of different Rattus species. We then explored how evolutionary divergence from the extant reference genome affected the fraction of the Christmas Island rat genome that could be recovered. Our analyses show that even when the extremely high-quality Norway brown rat (R. norvegicus) genome is used as a reference, nearly 5% of the genome sequence is unrecoverable, with 1,661 genes recovered at lower than 90% completeness and 26 completely absent. Furthermore, the affected regions are not randomly distributed: for example, when 90% completeness is used as the cutoff, genes related to immune response and olfaction are disproportionately affected. Ultimately, our approach demonstrates the importance of applying similar analyses to candidates for de-extinction through genome editing, in order to provide critical baseline information about how representative the edited form would be of the extinct species.
Keywords: Christmas Island rat; Rattus macleari; ancient DNA; de-extinction; evolutionary divergence; genomic sequencing; molecular dating
Year: 2022 PMID: 35271794 PMCID: PMC9044923 DOI: 10.1016/j.cub.2022.02.027
Source DB: PubMed Journal: Curr Biol ISSN: 0960-9822 Impact factor: 10.900
Summary of mapping genomic sequencing data of five Rattus species to Norway brown rat reference genome, related to Table S1 and Figure S1
| Dataset | Hit reads | Coverage | MaxDepth | 1× | 5× | 10× |
|---|---|---|---|---|---|---|
| Christmas Island rat/Maclear’s rat (Merged data) | 2892096135 | 60.80977094 | 287 | 0.9515 | 0.9142 | 0.8793 |
| Christmas Island rat/Maclear’s rat (BGISeq data) | 1095698394 | 22.59070271 | 97 | 0.9365 | 0.8601 | 0.766 |
| Christmas Island rat/Maclear’s rat (HiSeq data) | 1794361241 | 38.17020817 | 205 | 0.937 | 0.8771 | 0.816 |
| Christmas Island rat/Maclear’s rat (MiSeq data) | 2036500 | 0.048860062 | 3 | 0.0444 | 0.0001 | 0.0001 |
| Norway brown rat (Simulative ancient DNA) | 2921567669 | 60.77177418 | 114 | 0.9919 | 0.9919 | 0.9919 |
| Norway brown rat (Simulative modern DNA) | 1616683907 | 60.79049948 | 83 | 0.9919 | 0.9919 | 0.9919 |
| Norway brown rat (Real modern DNA, five samples) | 1555330238 | 60.83049687 | 205 | 0.9914 | 0.987 | 0.9759 |
| Norway brown rat (Real modern DNA, four samples) | 1591467983 | 60.79035188 | 184 | 0.9912 | 0.9845 | 0.9593 |
| Himalayan field rat (seven samples) | 2132796965 | 120.4604476 | 447 | 0.9848 | 0.9748 | 0.9664 |
| Himalayan field rat (three samples) | 1015013828 | 57.69764693 | 208 | 0.9805 | 0.9647 | 0.9502 |
| Himalayan field rat (four samples) | 1117783137 | 62.7628007 | 246 | 0.9815 | 0.9657 | 0.9512 |
| Asian house rat | 587134689 | 29.87708106 | 101 | 0.9276 | 0.8634 | 0.8002 |
| Black rat | 1152792528 | 42.48457157 | 139 | 0.9481 | 0.9098 | 0.8815 |
Norway brown rat (Real modern DNA, five samples): China1+Mali+AH2+BJ+Cambodia5
Norway brown rat (Real modern DNA, four samples): Mali+AH1+SD+Cambodia5
Himalayan field rat (three samples): NZ1+NZ2+WH3
Himalayan field rat (four samples): SG1+SG2+WH1+WH2
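The summary columns in the mapping table can be illustrated with a small sketch. It assumes, since the record does not define them, that "Coverage" is the mean per-base depth over the reference, "MaxDepth" is the deepest position, and the 1×/5×/10× columns are the fractions of reference positions covered at or above that depth. The function name `summarize_depth` and the toy depth values are hypothetical; real input would be one depth value per reference position, zero-depth positions included (for example as reported by `samtools depth -a`).

```python
from typing import Dict, Iterable, Sequence


def summarize_depth(depths: Iterable[int],
                    thresholds: Sequence[int] = (1, 5, 10)) -> Dict[str, float]:
    """Summarize per-base depths (one value per reference position, zeros included)."""
    values = list(depths)
    n = len(values)
    if n == 0:
        raise ValueError("no positions supplied")
    summary: Dict[str, float] = {
        "coverage": sum(values) / n,   # assumed meaning: mean depth
        "max_depth": max(values),      # deepest single position
    }
    for t in thresholds:
        # fraction of positions with depth >= t (breadth of coverage at t×)
        summary[f"{t}x"] = sum(1 for d in values if d >= t) / n
    return summary


# Toy input: ten reference positions, three of them uncovered (hypothetical values).
toy_depths = [0, 0, 3, 7, 12, 12, 1, 5, 10, 0]
stats = summarize_depth(toy_depths)
print(stats)
```

Under this reading, a 1× value of 0.9515 for the merged Christmas Island rat data corresponds to the "nearly 5%" of the reference left uncovered that the abstract reports.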
Figure 1. The phylogenetic placement and evolutionary timescale of R. norvegicus, Rattus nitidus, R. macleari, Rattus rattus, and Rattus tanezumi
Numbers following the species names indicate the coverage of genomic sequencing data for the corresponding species when mapped to the Norway brown rat reference genome. Related to Table S6.
Figure 2. Numbers of genes found at different coverage levels after mapping Christmas Island rat genomic sequencing data to the Norway brown rat reference genome
Coverage levels (on a scale of 0–1) are shown next to the figure. Related to Tables S2 and S3 and Figure S2.
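As a rough illustration of per-gene coverage on a 0–1 scale, the sketch below treats a gene's coverage as the number of its reference bases covered by at least one read divided by the gene length, then sorts genes into three groups around the 0.9 cutoff mentioned in the abstract. The gene names, base counts, and the three-way partition (absent, partial, recovered) are assumptions made for illustration and are not taken from the study.

```python
from typing import Dict, List, Tuple


def gene_coverage(covered_bases: int, gene_length: int) -> float:
    """Fraction of a gene's reference bases covered by at least one read."""
    if gene_length <= 0:
        raise ValueError("gene_length must be positive")
    if not 0 <= covered_bases <= gene_length:
        raise ValueError("covered_bases must lie between 0 and gene_length")
    return covered_bases / gene_length


def classify_genes(genes: Dict[str, Tuple[int, int]],
                   cutoff: float = 0.9) -> Dict[str, List[str]]:
    """Group genes by coverage: absent (0), partial (< cutoff), recovered (>= cutoff)."""
    groups: Dict[str, List[str]] = {"absent": [], "partial": [], "recovered": []}
    for name, (covered, length) in genes.items():
        cov = gene_coverage(covered, length)
        if cov == 0:
            groups["absent"].append(name)
        elif cov < cutoff:
            groups["partial"].append(name)
        else:
            groups["recovered"].append(name)
    return groups


# Hypothetical genes: name -> (covered bases, gene length)
toy_genes = {
    "geneA": (1000, 1000),
    "geneB": (850, 1000),
    "geneC": (0, 500),
    "geneD": (450, 500),
}
groups = classify_genes(toy_genes)
print(groups)
```

Whether the study counts completely absent genes among those below the 90% cutoff is not stated here, so the sketch keeps the two groups separate.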
Figure 3. Annotation of genes with unrecoverable regions in the Christmas Island rat genome
(A) GO enrichment (q < 0.05) of Christmas Island rat genes obtained at coverage lower than 0.9. Numbers following bars: the number of genes; x axes: −log10(q value); y axes: the GO terms enriched in genes with coverage lower than 0.9.
(B) KEGG enrichment (q < 0.05) in genes with coverage lower than 0.9; x axes: rich factor, number of genes with coverage lower than 0.9/total genes in KEGG terms; y axes: the KEGG pathways enriched in genes with coverage lower than 0.9.
(C and D) The coverage of genomic sequencing data on major histocompatibility complex (MHC) and vomeronasal receptor (VoR) genes in the Norway brown rat reference genome.
Related to Tables S4 and S5.
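The two axis quantities in panels (A) and (B) are simple transforms, sketched here under the definitions given in the caption: the rich factor is the number of genes with coverage below 0.9 in a KEGG term divided by the total number of genes in that term, and the GO panel plots −log10 of the q value. The function names and the input numbers are hypothetical and serve only to show the arithmetic.

```python
import math


def rich_factor(low_coverage_genes_in_term: int, total_genes_in_term: int) -> float:
    """Genes with coverage < 0.9 in a term, divided by all genes in that term."""
    if total_genes_in_term <= 0:
        raise ValueError("total_genes_in_term must be positive")
    return low_coverage_genes_in_term / total_genes_in_term


def neg_log10_q(q_value: float) -> float:
    """Transform a q value to the -log10 scale used on the GO enrichment axis."""
    if not 0 < q_value <= 1:
        raise ValueError("q_value must be in (0, 1]")
    return -math.log10(q_value)


# Toy numbers (not from the study): 12 of 48 genes in a term fall below 0.9 coverage.
rf = rich_factor(12, 48)
threshold = neg_log10_q(0.05)  # where the q < 0.05 cutoff sits on the -log10 axis
print(rf, round(threshold, 3))
```

On the −log10 scale, terms passing q < 0.05 lie to the right of roughly 1.3, and smaller q values plot further right.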
| REAGENT or RESOURCE | SOURCE | IDENTIFIER |
|---|---|---|
| Dry preserved skin samples of the Christmas Island rat ( | Oxford University Museum of Natural History | 18844 |
| Dry preserved skin samples of the Christmas Island rat ( | Oxford University Museum of Natural History | 18845 |
| Christmas Island rat ( | This study; ENA ( | SAMEA12813846 (18844) |
| Reference genome of Norway brown rat | NCBI website ( | GCF_015227675.2 |
| Reference genome of black rat | NCBI website ( | GCF_011064425.1 |
| Reference genome of house mouse | NCBI website ( | GCF_000001635.27 |
| Norway brown rat resequencing datasets | CNCB-NGDC ( | CRX019583 (Mali), CRX019522 (Cambodia5), CRX019633 (China1), CRX019515 (AH1), CRX019516 (AH2), CRX019517 (BJ1) and CRX019639 (SD1) |
| Black rat resequencing dataset #1 | CNCB-NGDC ( | CRX019632 |
| Black rat resequencing dataset #2 | ENA ( | SAMEA2051945 |
| Black rat resequencing dataset #3 | NCBI ( | SRX9009079 |
| Himalayan field rat resequencing datasets | NCBI ( | SAMN05425704 (NZ2), SAMN05425705 (SG1), SAMN05425706 (SG2), SAMN05425709 (NZ1), SAMN05425641 (WH1), SAMN05425642 (WH2), and SAMN05425643 (WH3) |
| Asian house rat resequencing dataset | NCBI ( | SAMN05425710 |
| House mouse resequencing dataset | NCBI ( | SRX10650663 |
| AdapterRemoval v2.3.1 | Schubert et al., 2016 | |
| bwa | Li and Durbin | |
| Paleomix v1.3.2 | Schubert et al., 2014 | |
| mapDamage v2.2.1 | Jónsson et al. | |
| MEGA X | Kumar et al. | |
| samtools v1.9 | Li et al. | |
| bedtools v2.29.0 | Quinlan | |
| bcftools v1.9 | Li et al. | |
| seqkit v0.16.1 | Shen et al. | |
| gargammel | Renaud et al. | |
| KOBAS 3.0 | Bu et al. | |
| MAFFT v7.4 | Katoh et al. | |
| IQ-TREE v1.6 | Minh et al. | |
| PAML v4.8 | Yang | |