Matthew B Couger, Lena Arévalo, Polly Campbell.
Abstract
Genomic data for the closest relatives of house mice (Mus musculus species complex) are surprisingly limited. Here, we present the first complete genome for a behaviorally and ecologically unique member of the sister clade to house mice, the mound-building mouse, Mus spicilegus. Using read cloud sequencing and de novo assembly, we produced a 2.50 Gbp genome with a scaffold N50 of 2.27 Mbp. We constructed >25 000 gene models, of which the majority had high homology to other Mus species. To evaluate the utility of the M. spicilegus genome for behavioral and ecological genomics, we extracted 196 vomeronasal receptor (VR) sequences from our genome and analyzed phylogenetic relationships between M. spicilegus VRs and orthologs from M. musculus and the Algerian mouse, M. spretus. While most M. spicilegus VRs clustered with orthologs in M. musculus and M. spretus, 10 VRs with evidence of rapid divergence in M. spicilegus are strong candidate modulators of species-specific chemical communication. A high-quality assembly and genome for M. spicilegus will help to resolve discordant ancestry patterns in house mouse genomes, and will provide an essential foundation for genetic dissection of phenotypes that distinguish commensal from non-commensal species, and of the social and ecological characteristics that make M. spicilegus unique.
Keywords: Mus spicilegus; de novo genome assembly; mound-building mouse; read cloud; vomeronasal receptors
Year: 2018 PMID: 29794166 PMCID: PMC6027863 DOI: 10.1534/g3.118.200318
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Figure 1. The geographic distribution of the mound-building mouse, Mus spicilegus. Inset: Mound-building mice are highly social and exhibit natural burrowing behavior under laboratory conditions. Au, Austria; Hu, Hungary; Se, Serbia; Bu, Bulgaria; M, Moldova; A, Albania; G, Greece. Distribution based on Coroiu. Photo, AG Ophir.
Mus spicilegus genome and transcriptome raw read and base counts
| Value | 10x Genome | Transcriptome |
|---|---|---|
| Read Pairs | 1 550 168 820 | 19 878 467 |
| Total Bases | 116 262 661 500 | 5 963 540 100 |
Genome, transcriptome, and annotation statistics for M. spicilegus
| Value | |
|---|---|
| Scaffold N50 | 2 198 966 |
| Scaffold N90 | 235 414 |
| Assembly size scaffolds | 2 496 544 896 |
| Contig N50 | 30 918 |
| Contig N90 | 7729 |
| Contig assembly size | 2 390 795 516 |
| Scaffolds 10kb+ N50 | 2 265 242 |
| Scaffolds 10kb+ N90 | 413 257 |
| Size 10kb+ scaffolds | 2 396 298 463 |
| Number of assembled transcripts | 169 733 |
| Total bases in assembled transcriptome | 229 968 259 |
| Transcriptome N50 | 2178 |
| Transcriptome N90 | 536 |
| Number of predicted proteins | 112 521 |
| Number of full length predicted proteins | 55 149 |
| Annotated genome | |
| Number of transcript to genome alignments (GMAP) | 771 752 |
| Number of PASA2 assemblies | 83 465 |
| Number of AUGUSTUS ab initio models | 28 885 |
| Number of protein to genome alignments | 16 665 |
| Number of EVM gene models | 28 624 |
| Number of final gene models with PASA | 26 074 |
| Average gene length | 18 265 |
| Average protein length | 465.2 |
| Average cDNA length | 2476.2 |
| Number of exons | 334 559 |
| Average number of exons/gene | 12.8 |
| Number of genes with Blast hit ≤1e-10 | 25 557 |
Values are reported in base pairs or amino acids.
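
The N50 and N90 values in the table summarize assembly contiguity: N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least 50% of the assembled bases, and N90 is the same at 90%. A minimal sketch of that calculation follows, assuming this standard definition. The function name `nx` and the toy lengths are illustrative and are not taken from the M. spicilegus assembly or from the authors' pipeline.

```python
def nx(lengths, percent):
    """Return the Nx statistic for a list of sequence lengths.

    Nx is the length L such that sequences of length >= L together
    contain at least `percent` percent of all assembled bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence to the shortest, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point rounding issues.
        if running * 100 >= percent * total:
            return length


# Toy scaffold lengths (hypothetical values, for illustration only).
toy_scaffolds = [100, 80, 60, 40, 20]
n50 = nx(toy_scaffolds, 50)
n90 = nx(toy_scaffolds, 90)
print(n50, n90)
```

With the toy input, the two longest scaffolds already cover half of the 300 total bases, so N50 is the length of the second one; N90 needs the four longest.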
Blastp comparison of M. spicilegus gene models to other Mus species, and the largely M. m. domesticus-derived mouse reference genome
| Species or strain | Genome | Positive hits |
|---|---|---|
| | SPRET_EiJ_v1 | 24 779 |
| | WSB_EiJ_v1 | 24 729 |
| C57BL/6J | GRCm38.p5 | 25 006 |
| | CAST_EiJ_v1 | 24 771 |
| | PWK_PhJ_v1 | 24 742 |
| | CAROLI_EiJ_v1.1 | 24 768 |
| Uniprot Trembl | | 25 557 |
Ensembl assembly name.
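
To put the positive-hit counts in proportion, they can be expressed as a share of the 26 074 final gene models reported in the annotation statistics. The sketch below assumes that 26 074 is the appropriate denominator for every row, which the source does not state explicitly. The dictionary keys are the assembly or database names from the table, and the variable names are illustrative.

```python
# Final gene models with PASA, from the annotation statistics table.
FINAL_GENE_MODELS = 26_074

# Positive blastp hits per target, copied from the comparison table.
positive_hits = {
    "SPRET_EiJ_v1": 24_779,
    "WSB_EiJ_v1": 24_729,
    "GRCm38.p5": 25_006,
    "CAST_EiJ_v1": 24_771,
    "PWK_PhJ_v1": 24_742,
    "CAROLI_EiJ_v1.1": 24_768,
    "Uniprot Trembl": 25_557,
}

# Percentage of gene models with a hit (assumed denominator: 26 074).
percent_with_hit = {
    name: round(100 * hits / FINAL_GENE_MODELS, 1)
    for name, hits in positive_hits.items()
}

for name, pct in percent_with_hit.items():
    print(f"{name}: {pct}%")
```

Under that assumption, every target accounts for well over 90% of the gene models, which is consistent with the abstract's statement that the majority of models had high homology to other Mus species.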
Blastp homology table for M. spicilegus top hits to Mus species database
| Species | Number of Hits |
|---|---|
| | 11 800 |
| | 11 029 |
| | 5581 |
| | 2606 |
| | 2842 |
| | 2147 |
| Total | 24 976 |
Figure 2. Phylogenetic relationships among the two major vomeronasal receptor subfamilies, V1Rs (A) and V2Rs (B), in M. spicilegus (MUSP, red branches and gene names), M. musculus (MUMU, black branches and gene names), and M. spretus (SPRET, green branches and gene names). Trees are unrooted cladograms; open circles on nodes indicate bootstrap support >90. Red arrowheads indicate M. spicilegus receptors that are not sister to orthologs with the same name in either M. musculus or M. spretus. Gene names with an underscore and number appended are transcript variants in M. musculus. Gene names with “like” appended are unannotated putative VRs in the M. spretus genome. M. spicilegus VR sequences are provided in Supplemental Material (V1Rs: File S1; V2Rs: File S2).