Ran Tian1,2, Kai Han3, Yuepan Geng1, Chen Yang1, Han Guo1, Chengcheng Shi3, Shixia Xu2, Guang Yang2, Xuming Zhou4, Vadim N Gladyshev5, Xin Liu3, Lisa K Chopin6,7, Diana O Fisher8, Andrew M Baker9,10, Natália O Leiner11, Guangyi Fan3,12,13, Inge Seim1,6,7,9.
Abstract
There are more than 100 species of American didelphid marsupials (opossums and mouse opossums). Genomic resources for didelphids are limited, with only two publicly available genome assemblies compared with dozens for their Australasian counterparts. This discrepancy impedes evolutionary and ecological research. To address this gap, we assembled a high-quality chromosome-level genome of the agile gracile mouse opossum (Gracilinanus agilis) using a combination of stLFR sequencing, polishing with mate-pair data, and anchoring onto pseudochromosomes using Hi-C. This species employs a rare life-history strategy, semelparity: all G. agilis males and most females die at the end of their first breeding season after succumbing to stress and exhaustion. The 3.7-Gb chromosome-level assembly, with 92.6% anchored onto pseudochromosomes, has a scaffold N50 of 683.5 Mb and a contig N50 of 56.9 kb. The genome assembly shows high completeness, with a mammalian BUSCO score of 88.1%. Repetitive elements make up around 49.7% of the genome. Gene annotation yielded 24,425 genes, of which 83.9% were functionally annotated. The G. agilis genome is an important resource for future studies of marsupial biology, evolution, and conservation.
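The scaffold and contig N50 values quoted in the abstract follow the standard definition: the length L such that sequences of length L or longer cover at least half of the total assembly. The sketch below illustrates that definition only; the function name `n50` and the toy lengths are hypothetical, and this is not the pipeline the authors used.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a set of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the summed length of all sequences.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no sequence lengths given")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first sequence that pushes the running sum to half of the
        # total defines the N50.
        if running >= half_total:
            return length
    return ordered[-1]  # not reached for non-empty input


# Hypothetical toy contig lengths (not data from the paper).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

Applied to real data, the same function would take the lengths of all contigs (or all scaffolds) in the assembly, which is why the two N50 values differ so widely once contigs are joined into pseudochromosomes.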
Keywords: Gracilinanus; South America; chromosome-level; genome; mouse opossum
Year: 2021 PMID: 34247236 PMCID: PMC8390783 DOI: 10.1093/gbe/evab162
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Overview of the Gracilinanus agilis genome assembly. (a) Assembly circos plot. The outermost segment represents chromosome sequences, with the numbers on the external surface indicating genome size (Mb). Line plots, from outside to inside, respectively, represent the distribution of CDS density (from 0 to 0.15), GC content (from 0.30 to 0.65) and TE ratio (from 0.2 to 1.0). Frequencies were calculated in 500 kb sliding windows. Photography courtesy of Noé U. de la Sancha (Chicago State University and Field Museum of Natural History, Chicago). (b) Circos plot showing shared synteny of G. agilis (chr1–chr7) and the gray short-tailed opossum (Monodelphis domestica) (NC_008801.1-NC_008809.1). Aligned using LASTZ. The synteny blocks are linked using lines colored in accordance with the G. agilis chromosomes. Aligned blocks with length shorter than 10 kb are not shown. Chr7 in G. agilis corresponds to the X chromosome of M. domestica.
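The caption states that the plotted frequencies were calculated in 500 kb sliding windows. A minimal sketch of a windowed GC-content calculation is given below. The step size is not stated in the caption, so the default of non-overlapping windows is an assumption, as is the choice to exclude `N` (gap) bases from the denominator; the function name `windowed_gc` is hypothetical.

```python
from typing import List, Optional, Tuple


def windowed_gc(
    sequence: str,
    window: int = 500_000,
    step: Optional[int] = None,
) -> List[Tuple[int, float]]:
    """Return (window_start, GC fraction) pairs along a sequence.

    Assumptions (not from the source): step defaults to the window size
    (non-overlapping windows), and bases other than A/C/G/T, such as N,
    are ignored when computing the fraction.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    step = window if step is None else step
    if step <= 0:
        raise ValueError("step must be positive")

    seq = sequence.upper()
    results: List[Tuple[int, float]] = []
    for start in range(0, len(seq), step):
        chunk = seq[start:start + window]  # the last window may be shorter
        gc = chunk.count("G") + chunk.count("C")
        at = chunk.count("A") + chunk.count("T")
        called = gc + at
        # Windows made up entirely of gaps have no defined GC content.
        fraction = gc / called if called else float("nan")
        results.append((start, fraction))
    return results


# Hypothetical toy sequence with a 4 bp window, for illustration only.
for start, fraction in windowed_gc("GGCCATATGCATNNNN", window=4):
    print(start, fraction)
```

Passing a `step` smaller than `window` gives overlapping windows, which is closer to the usual meaning of "sliding"; CDS density and TE ratio could be computed with the same windowing over annotation intervals instead of bases.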
Summary of Gracilinanus agilis Genome Assembly and Annotation
| Genome assembly | Estimated genome size | 3.40 Gb |
| | Assembly size (scaffold) | 3.70 Gb |
| | Assembly size (contig) | 3.40 Gb |
| | Hi-C anchored rate | 92.57% |
| | Contig number | 146,614 |
| | Contig N50 | 56.91 kb |
| | Longest contig | 649.78 kb |
| | Scaffold number | 61,400 |
| | Scaffold N50 | 683.52 Mb |
| | Longest scaffold | 801.37 Mb |
| | GC content | 37.87% |
| | Gaps (N) | 8.25% |
| Transposable elements | Annotation | Percent |
| | DNA | 2.18 |
| | LINE | 42.30 |
| | SINE | 11.98 |
| | LTR | 12.05 |
| | Other | 0.000094 |
| | Unknown | 1.64 |
| | Total | 49.71 |
| Protein-coding genes | Predicted genes | 24,425 |
| | Average transcript length | 64,360 bp |
| | Average coding sequence length | 1,510 bp |
| | Average exon length | 179 bp |
| | Average intron length | 8,448 bp |
| | Functionally annotated genes | 20,492 |
| BUSCO | Complete BUSCOs (C) | 8,128 (88.10%) |
| | Complete and single-copy BUSCOs (S) | 7,889 |
| | Complete and duplicated BUSCOs (D) | 239 |
| | Fragmented BUSCOs (F) | 290 |
| | Missing BUSCOs (M) | 808 |
Note.—Hi-C anchored rate refers to the proportion of scaffolded bases assembled onto seven pseudochromosomes. Assembly quality was assessed using BUSCO 5.0.0_cv1 with the 9,226-gene mammalian odb10 data set.
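The percentages reported in the table and abstract can be re-derived from the raw counts. The sketch below does that arithmetic for the BUSCO categories and the functionally annotated fraction, using only the counts listed above; the function and variable names are hypothetical, and this is a consistency check rather than a reimplementation of BUSCO.

```python
def busco_summary(single: int, duplicated: int, fragmented: int,
                  missing: int, dataset_size: int) -> dict:
    """Derive complete-BUSCO counts and percentage from category counts."""
    complete = single + duplicated
    # Every BUSCO gene falls into exactly one of C, F, or M.
    if complete + fragmented + missing != dataset_size:
        raise ValueError("category counts do not add up to the data set size")
    return {
        "complete": complete,
        "complete_pct": round(100 * complete / dataset_size, 2),
    }


# Counts taken from the table; 9,226 is the mammalian odb10 data set size.
summary = busco_summary(single=7889, duplicated=239, fragmented=290,
                        missing=808, dataset_size=9226)
print(summary)  # complete = 8128, complete_pct = 88.1

# Share of predicted genes with a functional annotation.
predicted_genes = 24425
annotated_genes = 20492
annotated_pct = round(100 * annotated_genes / predicted_genes, 1)
print(annotated_pct)  # 83.9
```

The transposable-element classes are deliberately not summed here: the listed class percentages add up to more than the reported 49.71% total, which suggests that the per-class figures overlap.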