Tze King Tan, Ka Yun Tan, Ranjeev Hari, Aini Mohamed Yusoff, Guat Jah Wong, Cheuk Chuen Siow, Naresh V R Mutha, Mike Rayko, Aleksey Komissarov, Pavel Dobrynin, Ksenia Krasheninnikova, Gaik Tamazian, Ian C Paterson, Wesley C Warren, Warren E Johnson, Stephen J O'Brien, Siew Woh Choo.
Abstract
Pangolins (order Pholidota) are the only mammals covered by scales. We have recently sequenced and analyzed the genomes of two critically endangered Asian pangolin species, the Malayan pangolin (Manis javanica) and the Chinese pangolin (Manis pentadactyla). These complete genome sequences will serve as reference sequences for future research addressing species conservation and advancing knowledge of mammalian biology and evolution. To further facilitate the global research effort in pangolin biology, we developed the Pangolin Genome Database (PGD) as a future hub for hosting pangolin genomic and transcriptomic data and annotations, together with useful analysis tools for the research community. Currently, the PGD provides the reference pangolin genome and transcriptome data, gene sequences and functional information, expressed transcripts, pseudogenes, genomic variations, organ-specific expression data and other useful annotations. We anticipate that the PGD will be an invaluable platform for researchers interested in pangolin and mammalian research. We will continue updating this hub with more data, annotations and analysis tools, particularly from our research consortium.
Database URL: http://pangolin-genome.um.edu.my
Year: 2016 PMID: 27616775 PMCID: PMC5018392 DOI: 10.1093/database/baw063
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 3.451
Summary statistics of two pangolin genome and transcriptome datasets in PGD
| Genome | Malayan Pangolin | Chinese Pangolin |
|---|---|---|
| Number of scaffolds | 81,732 | 87,621 |
| Estimated coverage (X) | 146 | 56 |
| Estimated genome size | 2,549,959,554 bp | 2,205,289,822 bp |
| N50 (bp) | 204,525 | 157,892 |
| # of protein-coding genes | 23,446 | 20,298 |
| # of annotated genes | 21,451 (91%) | 19,287 (95%) |
| # of pseudogenes | 4,660 | 2,416 |
| # of transcripts | 89,751 | NA |
Assembly statistics of the pangolin genomes. Adapted from "Pangolin genomes and the evolution of mammalian scales and immunity" by Choo et al., 2016.
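The N50 values in the table follow the usual assembly convention: the length of the scaffold at which the running total of scaffold lengths, sorted from longest to shortest, first reaches half of the total assembly size. The following is a minimal sketch of that calculation, not the pipeline used for the PGD assemblies. The function name `n50` and the scaffold lengths in `toy_lengths` are hypothetical and chosen only for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one scaffold length")
    # Sort scaffolds from longest to shortest.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first scaffold that brings the running sum to at least
        # half of the assembly size defines the N50.
        if running >= half_total:
            return length


# Hypothetical toy assembly (NOT PGD data): total size is 300 bp.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))  # 80 + 70 = 150 reaches half of 300, so this prints 70
```

A larger N50 generally indicates a more contiguous assembly, which is why the statistic is reported alongside the scaffold count.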
Figure 1. PGD four-tier web application architecture (client workstation, web server, application server and database server).
Figure 2. Schematic structure of the PGD.
Figure 3. A screenshot of the gene details page. This page displays information about a gene, including its sequences and functional annotation.
Figure 4. Pangolin genome browser. Users can turn the annotation tracks on and off in the left panel.
Genome assembly version of each mammalian genome used for multiple sequence alignment
| Animal | Scientific name | Genome assembly |
|---|---|---|
| Dog | Canis lupus familiaris | CanFam3.1 |
| Cat | Felis catus | Felis_catus_8.0 |
| Cow | Bos taurus | Bos_taurus_3.1 |
| Horse | Equus caballus | EquCab_2.0 |
| Human | Homo sapiens | GRCh37.p5 |
| Mouse | Mus musculus | GRCm38.p4 |
Figure 5. Phylogenetic tree of the species involved in the structural alignment.
Figure 6. Web interfaces for data download in PGD.