A Chromosome-Level Genome of the Agile Gracile Mouse Opossum (Gracilinanus agilis).

Ran Tian, Kai Han, Yuepan Geng, Chen Yang, Han Guo, Chengcheng Shi, Shixia Xu, Guang Yang, Xuming Zhou, Vadim N Gladyshev, Xin Liu, Lisa K Chopin, Diana O Fisher, Andrew M Baker, Natália O Leiner, Guangyi Fan, Inge Seim.

Abstract

There are more than 100 species of American didelphid marsupials (opossums and mouse opossums). Limited genomic resources for didelphids exist, with only two publicly available genome assemblies compared with dozens in the case of their Australasian counterparts. This discrepancy impedes evolutionary and ecological research. To address this gap, we assembled a high-quality chromosome-level genome of the agile gracile mouse opossum (Gracilinanus agilis) using a combination of stLFR sequencing, polishing with mate-pair data, and anchoring onto pseudochromosomes using Hi-C. This species employs a rare life-history strategy, semelparity, and all G. agilis males and most females die at the end of their first breeding season after succumbing to stress and exhaustion. The 3.7-Gb chromosome-level assembly, with 92.6% anchored onto pseudochromosomes, has a scaffold N50 of 683.5 Mb and a contig N50 of 56.9 kb. The genome assembly shows high completeness, with a mammalian BUSCO score of 88.1%. Around 49.7% of the genome contains repetitive elements. Gene annotation yielded 24,425 genes, of which 83.9% were functionally annotated. The G. agilis genome is an important resource for future studies of marsupial biology, evolution, and conservation.
© The Author(s) 2021. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

Keywords: Gracilinanus; South America; chromosome-level; genome; mouse opossum

Year:  2021        PMID: 34247236      PMCID: PMC8390783          DOI: 10.1093/gbe/evab162

Source DB:  PubMed          Journal:  Genome Biol Evol        ISSN: 1759-6653            Impact factor:   3.416


Significance

There is currently a distinct lack of genome assemblies for the more than 100 species of American marsupials, with only two assemblies available, compared with dozens for Australasian marsupials. Here, we present a chromosome-level assembly of the agile gracile mouse opossum (Gracilinanus agilis). This species exhibits semelparity, a rare life-history strategy in which all G. agilis males and most females die at the end of their first breeding season after succumbing to stress and exhaustion. This genome will contribute to research on marsupial biology, evolution, and conservation.

Introduction

Living marsupials fall within seven orders spread unevenly across the Americas and Australasia (Nilsson et al. 2010; Kumar et al. 2017). Dozens of Australasian marsupials have been genetically sequenced or are forthcoming (reviewed in Deakin and O’Neill [2020]), but few American marsupials have been sequenced. Indeed, of the ∼100 species of didelphids (opossums and mouse opossums) (Astúa 2015; Faurby et al. 2018) (supplementary fig. 1, Supplementary Material online), only two genomes are publicly available: the gray short-tailed opossum (Monodelphis domestica) (Mikkelsen et al. 2007) and the Virginia opossum (Didelphis virginiana) (Dudchenko et al. 2018). At least three Australian marsupial genera are characterized by a semelparous reproductive strategy where all males of certain species or populations die at the end of their first breeding season after succumbing to stress and exhaustion (Baker and Dickman 2018; Collett et al. 2018; Mutton et al. 2019). In a small number of didelphids, both sexes are reported to be semelparous in four species of two tribes (three in Thylamyini and one in Marmosini) in the subfamily Didelphinae (Leiner et al. 2008; de Andreazzi et al. 2011; Baladrón et al. 2012; Lopes and Leiner 2015; Puida and Paglia 2015; Hernandez et al. 2018; Zangrandi 2018; Albanese et al. 2021). The genetic basis of marsupial semelparity remains largely unknown, and to understand it we must catalogue various genome sequences across the life-history continuum. Here, we contribute to this effort by presenting a chromosome-level genome assembly of the agile gracile mouse opossum (Gracilinanus agilis)—a small, nocturnal, insectivore–omnivore species inhabiting the tropical savannas of central South America (Gardner 2008).

Results and Discussion

The G. agilis genome was obtained by stLFR sequencing, polishing with mate-pair data, and anchoring onto seven pseudochromosomes (2n = 14) using Hi-C (fig. 1 and supplementary table 1 and fig. 2, Supplementary Material online). The final assembly size (3.7 Gb; including unanchored scaffolds) and GC content (37.87%) (table 1 and supplementary fig. 3, Supplementary Material online) are similar to those of the two other sequenced South American marsupial species: 3.61 Gb and 37.82% for M. domestica and 3.42 Gb and 37.36% for D. virginiana, respectively. The G. agilis assembly has a contig N50 of 56.9 kb and a scaffold N50 of 683.5 Mb (table 1), and the assigned chromosomes are highly homologous to M. domestica assembly MonDom5 (contig N50 108.0 kb; scaffold N50 528.0 Mb) (fig. 1). The G. agilis genome is composed of 49.7% repeat elements, including 42.3% LINEs, 12.1% LTRs, and 12.0% SINEs (table 1 and supplementary tables 2 and 3, Supplementary Material online). Out of 9,226 mammalian BUSCO genes, we recovered 8,128 (88.10%) as complete, of which 7,889 were single-copy (table 1). We also obtained a complete 16,336 bp mitochondrial genome from the mate-pair data (supplementary fig. 4, Supplementary Material online).
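The contig and scaffold N50 values quoted above are length-weighted medians. As a minimal sketch (not the authors' pipeline), N50 can be computed from a list of sequence lengths as follows; the function name and the toy lengths are illustrative assumptions, not data from the G. agilis assembly.

```python
def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together contain at least half of all assembled bases."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy lengths (hypothetical, not taken from the assembly).
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))  # prints 70
```

Because the statistic is weighted by length, a few very long scaffolds dominate it, which is why Hi-C anchoring can raise the scaffold N50 to hundreds of megabases while the contig N50 stays in the kilobase range.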

Overview of the Gracilinanus agilis genome assembly. (a) Assembly circos plot. The outermost segment represents chromosome sequences, with the numbers on the external surface indicating genome size (Mb). Line plots, from outside to inside, respectively, represent the distribution of CDS density (from 0 to 0.15), GC content (from 0.30 to 0.65) and TE ratio (from 0.2 to 1.0). Frequencies were calculated in 500 kb sliding windows. Photography courtesy of Noé U. de la Sancha (Chicago State University and Field Museum of Natural History, Chicago). (b) Circos plot showing shared synteny of G. agilis (chr1–chr7) and the gray short-tailed opossum (Monodelphis domestica) (NC_008801.1-NC_008809.1). Aligned using LASTZ. The synteny blocks are linked using lines colored in accordance with the G. agilis chromosomes. Aligned blocks with length shorter than 10 kb are not shown. Chr7 in G. agilis corresponds to the X chromosome of M. domestica.
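The windowed tracks described in the caption can be illustrated with a short sketch of a per-window GC calculation. This is an assumption-laden example rather than the code used for the figure: the function name is hypothetical, windows are taken as non-overlapping unless a step is given (the caption does not state the step size), and N bases are excluded from the denominator.

```python
def gc_fraction_windows(sequence, window=500_000, step=None):
    """Yield (start, gc_fraction) per window.

    Assumptions: step defaults to the window size (non-overlapping windows),
    and only called bases (A, C, G, T) count toward the denominator.
    """
    step = step or window
    seq = sequence.upper()
    for start in range(0, len(seq), step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        called = gc + chunk.count("A") + chunk.count("T")
        yield start, (gc / called if called else float("nan"))


# Toy sequence with an 8-bp window instead of 500 kb.
for start, gc in gc_fraction_windows("GGCCAATTATATNNNN", window=8):
    print(start, gc)
```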

Table 1

Summary of Gracilinanus agilis Genome Assembly and Annotation

Genome assembly
  Estimated genome size                  3.40 Gb
  Assembly size (scaffold)               3.70 Gb
  Assembly size (contig)                 3.40 Gb
  Hi-C anchored rate                     92.57%
  Contig number                          146,614
  Contig N50                             56.91 kb
  Longest contig                         649.78 kb
  Scaffold number                        61,400
  Scaffold N50                           683.52 Mb
  Longest scaffold                       801.37 Mb
  GC content                             37.87%
  Gaps (N)                               8.25%

Transposable elements
  Annotation                             Percent
  DNA                                    2.18
  LINE                                   42.30
  SINE                                   11.98
  LTR                                    12.05
  Other                                  0.000094
  Unknown                                1.64
  Total                                  49.71

Protein-coding genes
  Predicted genes                        24,425
  Average transcript length              64,360 bp
  Average coding sequence length         1,510 bp
  Average exon length                    179 bp
  Average intron length                  8,448 bp
  Functionally annotated genes           20,492

BUSCO
  Complete BUSCOs (C)                    8,128 (88.10%)
  Complete and single-copy BUSCOs (S)    7,889
  Complete and duplicated BUSCOs (D)     239
  Fragmented BUSCOs (F)                  290
  Missing BUSCOs (M)                     808

Note.—Hi-C anchored rate refers to the proportion of scaffolded bases assembled onto seven pseudochromosomes. Assembly quality was assessed using BUSCO 5.0.0_cv1 with the 9,226-gene mammalian odb10 data set.
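The BUSCO counts in table 1 can be cross-checked with simple arithmetic: complete genes are the sum of single-copy and duplicated genes, and complete, fragmented, and missing genes together should equal the size of the data set. The sketch below uses the counts from the table; the function name and dictionary keys are illustrative assumptions.

```python
def busco_summary(single_copy, duplicated, fragmented, missing):
    """Derive BUSCO totals and percentages from the four raw categories."""
    complete = single_copy + duplicated
    total = complete + fragmented + missing
    return {
        "complete": complete,
        "total": total,
        "complete_pct": 100 * complete / total,
        "fragmented_pct": 100 * fragmented / total,
        "missing_pct": 100 * missing / total,
    }


# Counts as reported in table 1.
summary = busco_summary(single_copy=7889, duplicated=239, fragmented=290, missing=808)
print(summary["complete"], summary["total"], round(summary["complete_pct"], 2))
```

With these inputs the complete count is 8,128 out of 9,226 genes, which matches the 88.10% reported in the table.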

Conclusions

In this work, we report the genome assembly of G. agilis, the third from the more than 100 species of South American marsupials. We hope that our current efforts, which employed the relatively inexpensive (∼USD 1,000 per sample) stLFR sequencing technology (Stiller and Zhang 2019), will provide an impetus for a wave of genomics research in South America. Indeed, a consortium to facilitate such work will be delineated in an upcoming manuscript (Fisher et al., in preparation). Gracilinanus agilis is also one of a handful of South American marsupials that exhibit semelparity and, together with recent genomes of semelparous Australian relatives of genus Antechinus (Brandies et al. 2020; Tian et al. 2021), provides the first of the many genomes required to unravel a complex life-history strategy. Taken together, the chromosome-level G. agilis genome presented in this report should provide a valuable resource for a wide range of research on marsupials.

Materials and Methods

DNA Sequencing

An adult male agile gracile mouse opossum (Gracilinanus agilis; LMUSP501) was sampled in Estação Ecológica do Panga, Uberlândia, MG, Brazil (19° 9′ S, 48° 23′ W) in May 2019. Kidney and liver tissues were sequenced by single-tube Long Fragment Read (stLFR) (Fan et al. 2019; Wang et al. 2019) and short-insert library whole-genome sequencing (WGS) on the BGISEQ-500 platform (2 × 100 bp reads), respectively. A total of ∼358 Gb (∼100×) of stLFR reads were generated. SOAPnuke v1.5 (Chen et al. 2018) was used to filter out low-quality reads, PCR duplicates, and adaptors. Next, ∼264 Gb of filtered (clean) data were assembled using Supernova v2.1.1 (Weisenfeld et al. 2017), and the SOAPdenovo2 module Gapcloser v1.10 (Luo et al. 2012) was used with short-insert library WGS data (∼50×) to close gaps. Genome size was estimated by k-mer analysis of 100 bp paired-end WGS reads using GCE (Genomic Charactor Estimator) v1.0.0 (Marcais and Kingsford 2011) (supplementary fig. 5, Supplementary Material online). Liver Hi-C libraries were sequenced on the BGISEQ-500 platform and quality controlled using HiC-Pro v2.8.0_devel (Servant et al. 2015), resulting in ∼29 Gb of uniquely aligned read pairs. Reads validated by HiC-Pro were used to scaffold contigs into seven chromosome clusters using 3D-DNA v1.12 (Dudchenko et al. 2017). The assembly was further improved by interactive correction using Juicebox v1.11.08 (Durand et al. 2016; Dudchenko et al. 2018). Assembly quality was assessed using BUSCO (Benchmarking Universal Single-Copy Orthologs) v5.0.0_cv1 (Seppey et al. 2019) (mammalia_odb10 gene set). We also generated the complete mitochondrial genome of G. agilis from 100 bp WGS reads (see supplementary methods, Supplementary Material online).
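The approximate fold coverage quoted above follows from dividing the sequenced bases by the genome size. As a rough sketch, the calculation below assumes the k-mer-based estimate of 3.40 Gb from table 1 as the denominator; the text does not state which genome size the reported ∼100× refers to, so this choice and the function name are assumptions.

```python
def fold_coverage(sequenced_gb, genome_size_gb):
    """Average sequencing depth: total sequenced bases / genome size."""
    if genome_size_gb <= 0:
        raise ValueError("genome size must be positive")
    return sequenced_gb / genome_size_gb


ESTIMATED_GENOME_GB = 3.40  # assumption: estimated genome size from table 1

raw_depth = fold_coverage(358, ESTIMATED_GENOME_GB)    # raw stLFR reads
clean_depth = fold_coverage(264, ESTIMATED_GENOME_GB)  # after SOAPnuke filtering
print(f"raw: {raw_depth:.1f}x, clean: {clean_depth:.1f}x")
```

Under this assumption the raw stLFR data correspond to roughly 105× depth, consistent with the reported ∼100×, and the filtered data to somewhat under 80×.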

Genome Annotation

We identified repetitive elements by integrating homology and de novo prediction data. Protein-coding genes were annotated using homology-based prediction, de novo prediction, and RNA-seq-assisted (generated from kidney, skeletal muscle, and liver from two male individuals) prediction methods. For details, see supplementary methods, Supplementary Material online.

Supplementary Material

Supplementary data are available at Genome Biology and Evolution online.
References

1.  Prey productivity and predictability drive different axes of life-history variation in carnivorous marsupials.

Authors:  Rachael A Collett; Andrew M Baker; Diana O Fisher
Journal:  Proc Biol Sci       Date:  2018-10-31       Impact factor: 5.349

2.  Evolution of Marsupial Genomes.

Authors:  Janine E Deakin; Rachel J O'Neill
Journal:  Annu Rev Anim Biosci       Date:  2019-12-11       Impact factor: 8.923

3.  De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds.

Authors:  Olga Dudchenko; Sanjit S Batra; Arina D Omer; Sarah K Nyquist; Marie Hoeger; Neva C Durand; Muhammad S Shamim; Ido Machol; Eric S Lander; Aviva Presser Aiden; Erez Lieberman Aiden
Journal:  Science       Date:  2017-03-23       Impact factor: 47.728

4.  Genome of the marsupial Monodelphis domestica reveals innovation in non-coding sequences.

Authors:  Tarjei S Mikkelsen; Matthew J Wakefield; Bronwen Aken; Chris T Amemiya; Jean L Chang; Shannon Duke; Manuel Garber; Andrew J Gentles; Leo Goodstadt; Andreas Heger; Jerzy Jurka; Michael Kamal; Evan Mauceli; Stephen M J Searle; Ted Sharpe; Michelle L Baker; Mark A Batzer; Panayiotis V Benos; Katherine Belov; Michele Clamp; April Cook; James Cuff; Radhika Das; Lance Davidow; Janine E Deakin; Melissa J Fazzari; Jacob L Glass; Manfred Grabherr; John M Greally; Wanjun Gu; Timothy A Hore; Gavin A Huttley; Michael Kleber; Randy L Jirtle; Edda Koina; Jeannie T Lee; Shaun Mahony; Marco A Marra; Robert D Miller; Robert D Nicholls; Mayumi Oda; Anthony T Papenfuss; Zuly E Parra; David D Pollock; David A Ray; Jacqueline E Schein; Terence P Speed; Katherine Thompson; John L VandeBerg; Claire M Wade; Jerilyn A Walker; Paul D Waters; Caleb Webber; Jennifer R Weidman; Xiaohui Xie; Michael C Zody; Jennifer A Marshall Graves; Chris P Ponting; Matthew Breen; Paul B Samollow; Eric S Lander; Kerstin Lindblad-Toh
Journal:  Nature       Date:  2007-05-10       Impact factor: 49.962

5.  Direct determination of diploid genome sequences.

Authors:  Neil I Weisenfeld; Vijay Kumar; Preyas Shah; Deanna M Church; David B Jaffe
Journal:  Genome Res       Date:  2017-04-05       Impact factor: 9.043

6.  Seasonal changes of faecal cortisol metabolite levels in Gracilinanus agilis (Didelphimorphia: Didelphidae) and its association to life histories variables and parasite loads.

Authors:  S E Hernandez; A L S Strona; N O Leiner; G Suzán; M C Romano
Journal:  Conserv Physiol       Date:  2018-07-18       Impact factor: 3.079

7.  Efficient and unique cobarcoding of second-generation sequencing reads from long DNA molecules enabling cost-effective and accurate sequencing, haplotyping, and de novo assembly.

Authors:  Ou Wang; Robert Chin; Xiaofang Cheng; Michelle Ka Yan Wu; Qing Mao; Jingbo Tang; Yuhui Sun; Radoje Drmanac; Brock A Peters; Ellis Anderson; Han K Lam; Dan Chen; Yujun Zhou; Linying Wang; Fei Fan; Yan Zou; Yinlong Xie; Rebecca Yu Zhang; Snezana Drmanac; Darlene Nguyen; Chongjun Xu; Christian Villarosa; Scott Gablenz; Nina Barua; Staci Nguyen; Wenlan Tian; Jia Sophie Liu; Jingwan Wang; Xiao Liu; Xiaojuan Qi; Ao Chen; He Wang; Yuliang Dong; Wenwei Zhang; Andrei Alexeev; Huanming Yang; Jian Wang; Karsten Kristiansen; Xun Xu
Journal:  Genome Res       Date:  2019-04-02       Impact factor: 9.043

8.  Chromosome-level genome assembly for giant panda provides novel insights into Carnivora chromosome evolution.

Authors:  Huizhong Fan; Qi Wu; Fuwen Wei; Fengtang Yang; Bee Ling Ng; Yibo Hu
Journal:  Genome Biol       Date:  2019-12-06       Impact factor: 13.583

9.  SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler.

Authors:  Ruibang Luo; Binghang Liu; Yinlong Xie; Zhenyu Li; Weihua Huang; Jianying Yuan; Guangzhu He; Yanxiang Chen; Qi Pan; Yunjie Liu; Jingbo Tang; Gengxiong Wu; Hao Zhang; Yujian Shi; Yong Liu; Chang Yu; Bo Wang; Yao Lu; Changlei Han; David W Cheung; Siu-Ming Yiu; Shaoliang Peng; Zhu Xiaoqian; Guangming Liu; Xiangke Liao; Yingrui Li; Huanming Yang; Jian Wang; Tak-Wah Lam; Jun Wang
Journal:  Gigascience       Date:  2012-12-27       Impact factor: 6.524

10.  HiC-Pro: an optimized and flexible pipeline for Hi-C data processing.

Authors:  Nicolas Servant; Nelle Varoquaux; Bryan R Lajoie; Eric Viara; Chong-Jian Chen; Jean-Philippe Vert; Edith Heard; Job Dekker; Emmanuel Barillot
Journal:  Genome Biol       Date:  2015-12-01       Impact factor: 13.583
