Zijun Xiong, Fang Li, Qiye Li, Long Zhou, Tony Gamble, Jiao Zheng, Ling Kui, Cai Li, Shengbin Li, Huanming Yang, Guojie Zhang.
Abstract
BACKGROUND: Geckos are among the most species-rich reptile groups and the sister clade to all other lizards and snakes. Geckos possess a suite of distinctive characteristics, including adhesive digits, nocturnal activity, hard, calcareous eggshells, and a lack of eyelids. However, one gecko clade, the Eublepharidae, appears to be the exception to most of these 'rules': its members lack adhesive toe pads, have eyelids, and lay eggs with soft, leathery eggshells. These differences make eublepharids an important component of any investigation into the underlying genomic innovations contributing to the distinctive phenotypes in 'typical' geckos.
Keywords: Assembly; Eublepharis macularius; Gekkota; Genome sequencing; Leopard gecko
Year: 2016; PMID: 27784328; PMCID: PMC5080775; DOI: 10.1186/s13742-016-0151-4
Source DB: PubMed; Journal: GigaScience; ISSN: 2047-217X; Impact factor: 6.524
Fig. 1 Example of a leopard gecko, Eublepharis macularius (image from Tony Gamble)
Summary statistics of leopard gecko sequence data derived from paired-end sequencing of seven insert libraries using an Illumina HiSeq 2000 platform
| Library insert size (bp) | # Lanes | Read length (bp) | Raw data: total bases (Gb) | Raw data: sequencing depth (X) | High-quality data: total bases (Gb) | High-quality data: sequencing depth (X) |
|---|---|---|---|---|---|---|
| 170 | 2 | 100 | 60.25 | 27.03 | 57.20 | 25.66 |
| 500 | 2 | 150 | 76.08 | 34.13 | 59.36 | 26.63 |
| 800 | 1 | 150 | 27.84 | 12.49 | 15.90 | 7.13 |
| 2000 | 3 | 49 | 58.04 | 26.04 | 34.88 | 15.65 |
| 5000 | 2 | 49 | 33.96 | 15.24 | 10.99 | 4.93 |
| 10,000 | 2 | 49 | 29.17 | 13.09 | 5.09 | 2.28 |
| 20,000 | 1 | 49 | 17.33 | 7.78 | 4.07 | 1.83 |
| Total | 13 | | 302.66 | 135.78 | 187.49 | 84.11 |
Note: Sequencing depth was calculated based on a genome size of 2.23 Gb. High-quality data were obtained by filtering raw data for low-quality and duplicate reads and correcting sequencing errors
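The depth columns follow from dividing sequenced bases by genome size. The sketch below recomputes them in Python from the table values. It assumes the unrounded 17-mer estimate reported in this record (2,229,199,089 bp) rather than the rounded 2.23 Gb; the names `GENOME_SIZE_GB`, `LIBRARIES`, and `depth` are illustrative and do not come from the source. Because the table's Gb values are themselves rounded, recomputed depths can differ from the published ones by about 0.01X.

```python
# Genome size in Gb. Assumption: the unrounded 17-mer estimate from this record
# (2,229,199,089 bp); the note itself quotes the rounded value 2.23 Gb.
GENOME_SIZE_GB = 2.229199089

# (insert size in bp, raw Gb, high-quality Gb), copied from the table.
LIBRARIES = [
    (170, 60.25, 57.20),
    (500, 76.08, 59.36),
    (800, 27.84, 15.90),
    (2000, 58.04, 34.88),
    (5000, 33.96, 10.99),
    (10000, 29.17, 5.09),
    (20000, 17.33, 4.07),
]


def depth(total_gb, genome_gb=GENOME_SIZE_GB):
    """Sequencing depth (X) = total sequenced bases / genome size."""
    return total_gb / genome_gb


for insert_bp, raw_gb, hq_gb in LIBRARIES:
    print(f"{insert_bp:>6} bp  raw {depth(raw_gb):6.2f}X  "
          f"high-quality {depth(hq_gb):6.2f}X")

# Totals across all seven libraries.
raw_total = sum(raw for _, raw, _ in LIBRARIES)
hq_total = sum(hq for _, _, hq in LIBRARIES)
print(f" total    raw {depth(raw_total):6.2f}X  "
      f"high-quality {depth(hq_total):6.2f}X")
```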
Statistics of genome size estimation by 17-mer analysis. The genome size was estimated according to the formula: Genome size = # Kmers/Peak of depth
| Genome | Kmer length (bp) | # Kmers | Peak of depth | Estimated genome size (bp) | Data used (bp) |
|---|---|---|---|---|---|
| Eublepharis macularius | 17 | 46,813,180,882 | 21 | 2,229,199,089 | 53,806,135,250 |
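The formula in the caption can be checked directly against the table values. This is a minimal Python sketch; the function name `estimate_genome_size` is illustrative, and truncating the quotient to a whole number of base pairs is an assumption about how the published figure was rounded.

```python
def estimate_genome_size(total_kmers: int, peak_depth: int) -> int:
    """Genome size = # k-mers / peak of the k-mer depth distribution.

    Integer division truncates to whole base pairs (an assumption about
    the rounding used for the published value).
    """
    return total_kmers // peak_depth


size_bp = estimate_genome_size(46_813_180_882, 21)
print(f"{size_bp:,} bp (~{size_bp / 1e9:.2f} Gb)")
# prints "2,229,199,089 bp (~2.23 Gb)"
```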
Comparison of genome features between Eublepharis macularius and Gekko japonicus
| Genome features | Eublepharis macularius | Gekko japonicus |
|---|---|---|
| Assembled genome size (Gb) | 2.02 | 2.55 |
| Scaffold N50 (kb) | 664 | 685 |
| Contig N50 (kb) | 20.0 | 21.1 |
| Gene Number | 24,755 | 22,487 |
| Repeat content (% of genome) | 42.18 | 48.94 |
Summary statistics of key parameters for 13 reptile genomes
| Species | Common name | Sequencing technology | Sequence coverage | Assembly size (Gb) | Contig N50 (kb) | Scaffold N50 (kb) | References |
|---|---|---|---|---|---|---|---|
| | Green anole lizard | Sanger | 6.0X | 1.78 | 79.9 | 4033 | [ |
| | Chinese alligator | NGS | 109.0X | 2.30 | 23.4 | 2188 | [ |
| | Western painted turtle | Sanger + NGS | 18.0X | 2.59 | 11.9 | 5212 | [ |
| | Green sea turtle | NGS | 82.3X | 2.24 | 20.4 | 3778 | [ |
| | Soft-shell turtle | NGS | 105.6X | 2.21 | 21.9 | 3331 | [ |
| | Burmese python | NGS | 20.0X | 1.44 | 10.7 | 208 | [ |
| | King cobra | NGS | 28.0X | 1.66 | 4.0 | 226 | [ |
| | American alligator | NGS | 156.0X | 2.17 | 7.0 | 509 | [ |
| | Indian gharial | NGS | 81.0X | 2.88 | 14.2 | 127 | [ |
| | Saltwater crocodile | NGS | 74.0X | 2.12 | 32.8 | 205 | [ |
| Gekko japonicus | Japanese gecko | NGS | 131.3X | 2.55 | 21.1 | 685 | [ |
| | Australian dragon lizard | NGS | 179.1X | 1.82 | 31.3 | 2290 | [ |
| Eublepharis macularius | Leopard gecko | NGS | 135.8X | 2.02 | 20.0 | 664 | |
Coverage of core eukaryotic genes (CEGs) in the gecko genome assessed by CEGMA. All CEGs were divided into four groups based on their degree of protein sequence conservation. Group 1 contains the least conserved CEGs and group 4 contains the most conserved
| | Proteins | Completeness (%) | Proteins | Completeness (%) |
|---|---|---|---|---|
| Complete | 210 | 84.68 | 182 | 73.39 |
| Group 1 | 53 | 80.30 | 51 | 77.27 |
| Group 2 | 49 | 87.50 | 44 | 78.57 |
| Group 3 | 52 | 85.25 | 43 | 70.49 |
| Group 4 | 56 | 86.15 | 44 | 67.69 |
| Partial | 225 | 90.73 | 202 | 81.45 |
| Group 1 | 59 | 89.39 | 58 | 87.88 |
| Group 2 | 52 | 92.86 | 47 | 83.93 |
| Group 3 | 55 | 90.16 | 48 | 78.69 |
| Group 4 | 59 | 90.77 | 49 | 75.38 |
Summarized benchmarks in the BUSCO assessment
| BUSCO benchmark | Number | Percentage | Number | Percentage |
|---|---|---|---|---|
| Total BUSCO groups searched | 3023 | | 3023 | |
| Complete single-copy BUSCOs | 1746 | 57.757 | 1528 | 50.546 |
| Complete duplicated BUSCOs | 31 | 1.025 | 27 | 0.893 |
| Fragmented BUSCOs | 551 | 18.227 | 580 | 19.186 |
| Missing BUSCOs | 726 | 24.016 | 915 | 30.268 |
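Each percentage in the BUSCO table is the count divided by the 3023 groups searched. The Python sketch below recomputes them and checks that the categories add up. The dictionary names `first_column` and `second_column` are placeholders, since the column labels did not survive in this record. Treating the duplicated BUSCOs as a subset of the complete ones is an inference from the counts (complete + fragmented + missing equals the total), not something the source states.

```python
TOTAL_GROUPS = 3023  # total BUSCO groups searched, from the table

# Counts copied from the table; column identities are not labeled in this record.
first_column = {"complete": 1746, "duplicated": 31,
                "fragmented": 551, "missing": 726}
second_column = {"complete": 1528, "duplicated": 27,
                 "fragmented": 580, "missing": 915}


def percentage(count, total=TOTAL_GROUPS):
    """Share of searched BUSCO groups, rounded to three decimals as in the table."""
    return round(100 * count / total, 3)


for name, counts in (("first", first_column), ("second", second_column)):
    # Complete + fragmented + missing accounts for every group searched,
    # which suggests duplicated BUSCOs are counted within "complete".
    covered = counts["complete"] + counts["fragmented"] + counts["missing"]
    print(f"{name} column: categories sum to {covered} of {TOTAL_GROUPS}")
    for category, count in counts.items():
        print(f"  {category:<10} {count:>5}  {percentage(count):7.3f}%")
```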
Summary statistics of annotated repeats in the leopard gecko genome assembly
| Repeat type | Total repeat length (bp) | Percentage of genome |
|---|---|---|
| DNA | 69,961,035 | 3.47 |
| LINE | 255,603,529 | 12.67 |
| SINE | 106,528,475 | 5.28 |
| LTR | 64,149,381 | 3.18 |
| Unknown | 390,378,296 | 19.35 |
| Total | 850,708,938 | 42.18 |
Statistics for functional annotation
| Functional database | Number of genes annotated |
|---|---|
| InterPro | 20,958 (84.66 %) |
| GO | 15,873 (64.12 %) |
| KEGG | 16,172 (65.33 %) |
| TrEMBL | 23,139 (93.47 %) |
| SwissProt | 22,347 (90.27 %) |