Ruby Dhar, Ashikh Seethy, Karthikeyan Pethusamy, Sunil Singh, Vishwajeet Rohil, Kakali Purkayastha, Indrani Mukherjee, Sandeep Goswami, Rakesh Singh, Ankita Raj, Tryambak Srivastava, Sovon Acharya, Balaji Rajashekhar, Subhradip Karmakar.
Abstract
BACKGROUND: The Indian peafowl (Pavo cristatus) is native to South Asia and is the national bird of India. Here we present a draft genome sequence of the male blue peacock, generated using Illumina and Oxford Nanopore Technologies (ONT) sequencing.
Keywords: Indian national bird; Oxford Nanopore; Pavo cristatus; genome assembly; peacock
Year: 2019 PMID: 31077316 PMCID: PMC6511069 DOI: 10.1093/gigascience/giz038
Source DB: PubMed Journal: Gigascience ISSN: 2047-217X Impact factor: 6.524
Figure 1: Photograph of the Indian blue peacock (P. cristatus).
Figure 2: Detailed workflow for de novo whole-genome assembly and annotation. LI-PE: long-insert paired-end; QC: quality control; SI-PE: short-insert paired-end.
Raw data statistics of peacock genome reads generated by Illumina HiSeq and ONT
| Sample | Platform | Library and chemistry | No. of reads | Coverage | Sequence Read Archive ID |
|---|---|---|---|---|---|
| SO_6221_SKPea2016_SI | HiSeq | PE-SI (150 * 2) | 489,114,747 | 146.73 | SUB3108018, SAMN07739105 |
| SO_6221_SKPea2016_LI | HiSeq | PE-LI (150 * 2) | 302,884,819 | 90.87 | SUB3108017, SAMN07739104 |
| SO_6221_FPL_3_5KB | HiSeq | MP (150 * 2) | 72,915,033 | 21.87 | SUB3107930, SAMN07739101 |
| SO_6221_FPL_5_7KB | HiSeq | MP (150 * 2) | 47,440,144 | 14.23 | SUB3108015, SAMN07739102 |
| SO_6221_FPL_7_10KB | HiSeq | MP (150 * 2) | 36,464,628 | 10.94 | SUB3108016, SAMN07739103 |
| SO_6221_NP | ONT | 5–341,124 | 366,323 | 2.3 | SUB3108020, SAMN07739107 |
Abbreviations: KB, kilobases; LI, long insert; MP, mate-pair; PE, paired-end; SI, short insert.
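
The Illumina coverage values in the table can be reproduced from the read counts. The sketch below is an illustration, not the authors' stated method. It assumes that "No. of reads" counts read pairs (2 × 150 bp each) and that coverage was computed against a nominal genome size of 1 Gb. Neither assumption is stated in the source; both are inferred from the arithmetic. The function name `fold_coverage` is hypothetical.

```python
def fold_coverage(read_pairs: int,
                  read_length: int = 150,
                  genome_size: int = 1_000_000_000) -> float:
    """Fold coverage = total sequenced bases / genome size.

    Assumptions (not stated in the source): each entry is a read pair,
    and the genome size used for normalisation is 1 Gb.
    """
    sequenced_bases = read_pairs * 2 * read_length
    return sequenced_bases / genome_size


# Read counts copied from the Illumina rows of the table above.
illumina_libraries = {
    "SO_6221_SKPea2016_SI": 489_114_747,
    "SO_6221_SKPea2016_LI": 302_884_819,
    "SO_6221_FPL_3_5KB": 72_915_033,
    "SO_6221_FPL_5_7KB": 47_440_144,
    "SO_6221_FPL_7_10KB": 36_464_628,
}

for name, pairs in illumina_libraries.items():
    print(f"{name}: {fold_coverage(pairs):.2f}x")
```

Under these assumptions the computed values match the Coverage column for all five Illumina libraries. The ONT row is not covered, because its read lengths vary (5 to 341,124 in the table) and a fixed read length does not apply.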
De novo assembly statistics of the peacock genome
| Description | Contigs | ONT scaffolds | Super-scaffolds | GapClosed | ≥1,000 bp | ≥5,000 bp |
|---|---|---|---|---|---|---|
| Contigs | 685,241 | 281,272 | 179,346 | 179,332 | 34,178 | 15,025 |
| Maximum length | 49,159 | 251,510 | 2,390,121 | 2,488,982 | 2,488,982 | 2,488,982 |
| Minimum length | 300 | 5 | 265 | 265 | 1,000 | 5,000 |
| Mean length | 1,360 | 3,250 | 5,111 | 5,729 | | |
| Total length | 932,162,464 | 914,363,908 | 916,720,956 | 1,027,510,962 | 954,449,349 | 915,342,012 |
| Length ≥ 100 bp | 685,241 | 281,271 | 179,346 | 179,332 | 34,178 | 15,025 |
| Length ≥ 200 bp | 685,241 | 281,271 | 179,346 | 179,332 | 34,178 | 15,025 |
| Length ≥ 500 bp | 616,120 | 186,433 | 93,727 | 93,718 | 34,178 | 15,025 |
| Length ≥ 1 kb | 363,428 | 104,479 | 34,168 | 34,178 | 34,178 | 15,025 |
| Length ≥ 10 kb | 1,591 | 24,748 | 9,249 | 10,310 | 10,310 | 10,310 |
| Length ≥ 1 Mb | 0 | 0 | 27 | 37 | 37 | 37 |
| Non-ATGC No. | 350,325 | 42,696,911 | 49,169,831 | 4,043,129 | 4,040,790 | 3,986,487 |
| Non-ATGC percentage | 0.038 | 4.67 | 5.36 | 0.393 | 0.423 | 0.436 |
| N50 value | 1,639 | 14,748 | 168,140 | 190,304 | 218,023 | 232,312 |
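
The N50 row uses the standard assembly contiguity metric: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. A minimal sketch of that definition follows. The function name `n50` and the toy lengths are illustrative and are not taken from the source or from the assembly itself.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest; the N50 is the
    length at which the running total first reaches half of the sum.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("n50() requires at least one sequence length")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Compare doubled running total to avoid fractional halves.
        if running * 2 >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


# Toy example (hypothetical lengths, not from the peacock assembly).
toy_scaffolds = [80, 70, 50, 40, 30, 20]
print(n50(toy_scaffolds))
```

Because N50 depends on the total length, dropping short sequences raises it, which is in line with the larger N50 values reported in the two length-filtered columns of the table.
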
Figure 3: Peacock proteins showing homology. Pie chart showing significant similarity scores of peacock proteins against the NCBI NR database. The pie chart colors are grouped by E-value, from the most significant E-value of 0.0 (red), going clockwise, to the least significant of ∼1E–5 (blue).
Figure 4: Phylogenetic tree generated from homologous proteins from 49 different avian species.
Figure 5: Venn diagram showing common and absent Pfam domains between peacock, chicken, and turkey proteins.
Figure 6: Heat map showing the distribution of Pfam domains in peacock, chicken, and turkey. Each number is the Pfam domain count predicted from the protein sequences. Only Pfam domains with a count of 50 or more in at least one of the species are compared in the heat map.
Figure 7: Venn diagram showing peacock proteins with significant homology to the NCBI NR database, the EuKaryotic Orthologous Groups (KOG) database, and Pfam and GO ontologies.
Figure 8: Circular image of the assembled peacock genome, aligned against the G. gallus genome. The right side of the image represents the reference chicken genome; the left side represents the peacock genome.