Thyago Vanderlinde, Eduardo Guimarães Dupim, Nestor O Nazario-Yepiz, Antonio Bernardo Carvalho.
Abstract
Three North American cactophilic Drosophila species, D. mojavensis, D. arizonae, and D. navojoa, are of considerable evolutionary interest owing to the shift from breeding in Opuntia cacti to columnar species. The 3 species form the "mojavensis cluster" of Drosophila. The genome of D. mojavensis was sequenced in 2007, and the genomes of D. navojoa and D. arizonae were sequenced together in 2016 using the same technology (Illumina) and assembly software (AllPaths-LG). Unfortunately, the D. navojoa genome was considerably more fragmented and incomplete than that of its sister species, rendering it less useful for evolutionary genetic studies. The D. navojoa read dataset does not fully meet the strict insert-size requirements of the assembler used (AllPaths-LG), and this incompatibility might explain its assembly problems. Accordingly, when we re-assembled the genome of D. navojoa with the SPAdes assembler, which does not have the strict AllPaths-LG requirements, we obtained a substantial improvement in all quality indicators, such as N50 (from 84 kb to 389 kb) and BUSCO coverage (from 77% to 97%). Here we share a new, improved reference assembly for the D. navojoa genome, along with an RNA-seq transcriptome. Given the basal relationship of the Opuntia-breeding D. navojoa to the columnar-breeding D. arizonae and D. mojavensis, the improved assembly and annotation will allow researchers to address a range of questions associated with the genomics of host shifts, chromosomal rearrangements, and speciation in this group.
Year: 2019 PMID: 30423125 PMCID: PMC6321958 DOI: 10.1093/jhered/esy059
Source DB: PubMed Journal: J Hered ISSN: 0022-1503 Impact factor: 2.645
Figure 1. Evolutionary relationships and host cactus use in the ancestral Drosophila navojoa and the derived Drosophila mojavensis and Drosophila arizonae, members of the mojavensis cluster. Divergence times were taken from Sanchez-Flores et al. 2016.
Assembly statistics for Drosophila navojoa from the original (AllPaths-LG) and new (SPAdes) assemblies compared to that of Drosophila arizonae
| Assembly statistics | D. navojoa, original assembly (AllPaths-LG) | D. navojoa, new assembly (SPAdes) | D. arizonae, original assembly |
|---|---|---|---|
| N50 (bp) | 82455 | 389283 | 171766 |
| Total number of scaffolds | 10779 | 13813 | 5133 |
| Sum (Mbp) | 115 | 147 | 141 |
| Maximum scaffold size (bp) | 1117492 | 3635071 | 1311587 |
| Complete BUSCOs (%) | 76.7 | 97.4 | 93.4 |
| Complete and single-copy BUSCOs (%) | 76.4 | 97 | 92.9 |
| Complete and duplicated BUSCOs (%) | 0.3 | 0.4 | 0.5 |
| Fragmented BUSCOs (%) | 4.8 | 1.4 | 2.1 |
| Missing BUSCOs (%) | 18.5 | 1.2 | 4.5 |
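For readers less familiar with the N50 statistic reported above: it is the scaffold length at which scaffolds of that size or larger contain at least half of the assembled bases. The following is a minimal sketch of that definition, not the authors' pipeline. The function name `n50` and the toy scaffold lengths are hypothetical and are not taken from either assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths (in bp).

    N50 is the length L such that scaffolds of length >= L together
    cover at least half of the total assembly size.
    """
    ordered = sorted(lengths, reverse=True)  # longest scaffolds first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no scaffold lengths supplied")


# Toy example (hypothetical lengths, not real data):
# total = 32, half = 16; the two longest scaffolds (8 + 8) reach 16.
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
toy_n50 = n50(toy_lengths)
```

Because N50 depends on the total assembly size, a larger N50 is most informative when read alongside the assembly sum and BUSCO completeness, as the table does.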
Figure 2. Completeness of a random sample of genes in the original and in the improved assemblies of Drosophila navojoa. The genes were chosen because they are commonly used in phylogenetic studies, without any prior information on their completeness in either assembly. Seven of them are complete in both assemblies (Amyrel, even skipped, engrailed, Dopa decarboxylase, Notum, hedgehog, and Distal-less; we represented only the first one). The remaining 3 genes (patched, ebony, and wingless) are complete in the new assembly but are missing parts (or are altogether absent) in the original assembly. In all cases we used the protein sequence of the Drosophila melanogaster ortholog as the query in a TBLASTN search.
Assembly statistics for Drosophila navojoa transcriptome
| Assembly statistics | Trinity | rnaSPAdes |
|---|---|---|
| Total number of scaffolds | 69635 | 22589 |
| Complete BUSCOs (%) | 89 | 87 |
| Complete and single-copy BUSCOs (%) | 50 | 81 |
| Complete and duplicated BUSCOs (%) | 39 | 6 |
| Fragmented BUSCOs (%) | 8 | 7 |
| Missing BUSCOs (%) | 3 | 6 |
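The BUSCO categories are related by simple arithmetic: complete equals single-copy plus duplicated, and complete, fragmented, and missing together account for 100% of the searched orthologs. The sketch below checks the transcriptome rows above against those two identities. The function name `check_busco` and the rounding tolerance `tol` are assumptions made for illustration; the percentages are copied from the table.

```python
def check_busco(complete, single, duplicated, fragmented, missing, tol=0.5):
    """Return True if a row of BUSCO percentages is internally consistent.

    tol (an assumed value) absorbs rounding in the published percentages.
    """
    parts_ok = abs((single + duplicated) - complete) <= tol
    total_ok = abs((complete + fragmented + missing) - 100.0) <= tol
    return parts_ok and total_ok


# (complete, single-copy, duplicated, fragmented, missing), from the table
transcriptome = {
    "Trinity": (89, 50, 39, 8, 3),
    "rnaSPAdes": (87, 81, 6, 7, 6),
}
results = {name: check_busco(*row) for name, row in transcriptome.items()}
```

Both rows satisfy the identities, so the two assemblers differ mainly in how complete BUSCOs split between single-copy and duplicated, not in overall completeness.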