Warren Brian Simison, James F Parham, Theodore J Papenfuss, Athena W Lam, James B Henderson.
Abstract
Among vertebrates, turtles have many unique characteristics providing biologists with opportunities to study novel evolutionary innovations and processes. We present here a high-quality, partially phased, and chromosome-level Red-Eared Slider (Trachemys scripta elegans, TSE) genome as a reference for future research on turtle and tetrapod evolution. This TSE assembly is 2.269 Gb in length, has one of the highest scaffold N50 and N90 values of any published turtle genome to date (N50 = 129.68 Mb and N90 = 19 Mb), and has a total of 28,415 annotated genes. We introduce synteny analyses using BUSCO single-copy orthologs, which reveal two chromosome fusion events accounting for differences in chromosome counts between emydids and other cryptodire turtles and reveal many fission/fusion events for birds, crocodiles, and snakes relative to TSE. This annotated chromosome-level genome will provide an important reference genome for future studies on turtle, vertebrate, and chromosome evolution.
Keywords: Hi-C; IsoSEQ; assembly; chromosome; linked-reads; reference genome; synteny; turtle
Year: 2020 PMID: 32227195 PMCID: PMC7186784 DOI: 10.1093/gbe/evaa063
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Assemblathon+ Statistics for TSE Genome Assembly (generated with custom asmstats.pl)
| Number of scaffolds >1K nt | 26,710 | 66.60% | |
| Number of scaffolds >10K nt | 2,988 | 7.50% | |
| Number of scaffolds >100K nt | 32 | 0.10% | |
| Number of scaffolds >1M nt | 28 | 0.10% | |
| Number of scaffolds >10M nt | 24 | 0.10% | |
| Mean scaffold size | 56,571 | ||
| Median scaffold size | 1,555 | ||
| N50 scaffold length | 129,675,691 | L50 scaffold count | 6 |
| N60 scaffold length | 126,808,733 | L60 scaffold count | 7 |
| N70 scaffold length | 85,829,911 | L70 scaffold count | 10 |
| N80 scaffold length | 43,716,676 | L80 scaffold count | 13 |
| N90 scaffold length | 19,049,219 | L90 scaffold count | 21 |
| Scaffold %A | 27 | Number of A | 609,556,304 |
| Scaffold %C | 21 | Number of C | 482,905,406 |
| Scaffold %G | 21 | Number of G | 483,034,848 |
| Scaffold %T | 27 | Number of T | 609,529,607 |
| Scaffold %N | 4 | Number of N | 83,700,144 |
| Scaffold %non-ACGTN | 0 | ||
| Number of scaffold non-ACGTN nt | 0 | ||
| Percentage of assembly in scaffolded contigs | 94.50 | ||
| Percentage of assembly in unscaffolded contigs | 5.50 | ||
| Average number of contigs per scaffold | 1.5 | ||
| Average length of break (≥10 Ns) between contigs in scaffold | 4,165 | ||
| Number of contigs | 60,193 | ||
| Number of contigs in scaffolds | 22,314 | ||
| Number of contigs not in scaffolds | 37,879 | ||
| Total size of contigs | 2,185,039,207 | ||
| Longest contig | 1,642,093 | ||
| Shortest contig | 48 | ||
| Number of contigs >1K nt | 45,044 | 74.80% | |
| Number of contigs >10K nt | 19,706 | 32.70% | |
| Number of contigs >100K nt | 6,735 | 11.20% | |
| Number of contigs >1M nt | 21 | 0.00% | |
| Number of contigs >10M nt | 0 | 0.00% | |
| Mean contig size | 36,301 | ||
| Median contig size | 3,258 | ||
| N50 contig length | 189,165 | L50 contig count | 3,255 |
| N60 contig length | 146,678 | L60 contig count | 4,570 |
| N70 contig length | 108,417 | L70 contig count | 6,295 |
| N80 contig length | 71,323 | L80 contig count | 8,775 |
| N90 contig length | 32,113 | L90 contig count | 13,220 |
| Contig %A | 28 | Number of A | 609,556,304 |
| Contig %C | 22 | Number of C | 482,905,406 |
| Contig %G | 22 | Number of G | 483,034,848 |
| Contig %T | 27.9 | Number of T | 609,529,607 |
| Contig %N | 0 | Number of N | 13,042 |
| Contig %non-ACGTN | 0 | ||
| Number of contig non-ACGTN nt | 0 | ||
asmstats.pl is a modification of github.com/ucdavis-bioinformatics/assemblathon2-analysis/blob/master/assemblathon_stats.pl (last accessed April 8, 2020) and is available at github.com/calacademy-research/ccgutils/tree/master/asmstats (last accessed April 8, 2020).
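For readers unfamiliar with the N50/L50 rows in the table, the following is a minimal Python sketch of how such values are conventionally computed from a list of scaffold or contig lengths. It is not taken from asmstats.pl; the function name `nx_lx` and the toy lengths are illustrative assumptions, and the script may treat the threshold boundary differently (here a cumulative length equal to the threshold counts as reaching it).

```python
from typing import Iterable, Tuple


def nx_lx(lengths: Iterable[int], x: float) -> Tuple[int, int]:
    """Return (Nx, Lx) for a collection of sequence lengths.

    Nx is the length of the shortest sequence in the smallest set of
    longest sequences whose combined length reaches x% of the total.
    Lx is the number of sequences in that set.
    """
    if not 0 < x <= 100:
        raise ValueError("x must be in the interval (0, 100]")
    ordered = sorted(lengths, reverse=True)  # longest first
    if not ordered:
        raise ValueError("at least one length is required")

    total = sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # Equivalent to running / total >= x / 100, without dividing.
        if running * 100 >= total * x:
            return length, count
    raise AssertionError("unreachable: the full set always reaches x%")


# Hypothetical toy assembly (not TSE data): total length is 300.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
n50, l50 = nx_lx(toy_lengths, 50)  # threshold 150 is reached after 80 + 70
n90, l90 = nx_lx(toy_lengths, 90)  # threshold 270 is reached after five sequences
print(n50, l50, n90, l90)
```

Read this way, the table's scaffold N50 of 129,675,691 with an L50 of 6 means that the six longest scaffolds together cover at least half of the assembly, and the shortest of those six is 129,675,691 nt long.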
Figure. Circos synteny diagrams displaying chromosomal fissions/fusions within diapsids. The dark red and dark orange lines represent homologous clusters of single-copy orthologs (SCOs) that lie on chromosomes absent from testudinoid genomes (TSE and Gopherus) and that have likely fused with TSE and Gopherus chromosomes 2 and 4. Synteny diagram (A) reveals the fusion of Chelonia chromosomes 16 and 9 with Gopherus chromosomes 2 and 4, respectively. Synteny diagram (B) reveals the fusion of Chelonia chromosomes 16 and 9 with TSE chromosomes 2 and 4, respectively. Synteny diagram (C) reveals that, in both Gopherus and TSE, the dark red and dark orange clusters of SCOs have fused with chromosomes 2 and 4. Synteny diagrams (D) and (E) reveal that the archosaurian Alligator chromosomes 4 and 11 and Gallus chromosomes 26 and 22 have fused with testudinoid (Gopherus and TSE) chromosomes. Synteny diagram (F) reveals that Python chromosome 9 has fused with Gopherus and TSE chromosome 2, whereas the dark orange cluster of SCOs from TSE chromosome 4 is part of Python chromosome 5. The colored bars in the rings represent TSE chromosomes, except in (A), where the colored bars represent Gopherus chromosomes. Black bars represent other chromosomes. Note that all single relocations have been removed for clarity. In the phylogenetic tree of reptile and avian genomes used in this study, the numbers in ovals represent the hypothesized ancestral 2n chromosome number for each lineage based on Bickham and Carr (1983), and the circular diagrams are the synteny diagrams highlighting the chromosomes involved in the indicated fusion event. TSE and Gopherus share the same fusion of these two clusters (diagram C).
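The idea behind reading fusions off shared single-copy orthologs can be illustrated with a short Python sketch. This is not the authors' pipeline: the function `candidate_fusions`, its thresholds `min_scos` and `min_share`, and the toy chromosome names are all hypothetical. It assumes each genome has been reduced to a mapping from BUSCO ID to chromosome name, and it flags a query chromosome as a candidate fusion product when two or more reference chromosomes each place most of their shared SCOs on it. Pairs supported by very few SCOs are dropped, loosely mirroring the removal of single relocations mentioned in the caption.

```python
from collections import Counter, defaultdict
from typing import Dict, List


def candidate_fusions(
    ref_sco: Dict[str, str],
    query_sco: Dict[str, str],
    min_scos: int = 3,       # hypothetical cutoff: ignore sparsely supported pairs
    min_share: float = 0.8,  # hypothetical cutoff: share of a reference chromosome's SCOs
) -> Dict[str, List[str]]:
    """Map each query chromosome to the reference chromosomes that appear fused into it."""
    pair_counts: Counter = Counter()  # (ref_chrom, query_chrom) -> shared SCO count
    ref_totals: Counter = Counter()   # ref_chrom -> SCOs also found in the query
    for busco_id, ref_chrom in ref_sco.items():
        query_chrom = query_sco.get(busco_id)
        if query_chrom is None:
            continue  # ortholog missing from the query genome
        pair_counts[(ref_chrom, query_chrom)] += 1
        ref_totals[ref_chrom] += 1

    sources = defaultdict(list)
    for (ref_chrom, query_chrom), n in pair_counts.items():
        # Keep a pair only if it is well supported and holds most of the
        # reference chromosome's shared orthologs.
        if n >= min_scos and n / ref_totals[ref_chrom] >= min_share:
            sources[query_chrom].append(ref_chrom)

    # A candidate fusion needs at least two reference chromosomes on one query chromosome.
    return {q: sorted(refs) for q, refs in sources.items() if len(refs) >= 2}


# Toy data with hypothetical names: reference chromosomes R2 and R16 both map
# to query chromosome Q2; R4 maps to Q4 apart from one relocated ortholog.
ref = {f"b{i}": "R2" for i in range(5)}
ref.update({f"b{i}": "R16" for i in range(5, 9)})
ref.update({f"b{i}": "R4" for i in range(9, 15)})
query = {f"b{i}": "Q2" for i in range(9)}
query.update({f"b{i}": "Q4" for i in range(9, 14)})
query["b14"] = "Q2"  # single relocation, filtered out by min_scos

fusions = candidate_fusions(ref, query)
print(fusions)
```

The two cutoffs trade sensitivity for noise: lowering `min_share` would also report partial translocations, which may or may not be desirable depending on the question being asked.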