Thomas C Mathers, Sam T Mugford, Saskia A Hogenhout, Leena Tripathi.
Abstract
The banana aphid, Pentalonia nigronervosa Coquerel (Hemiptera: Aphididae), is a major pest of cultivated bananas (Musa spp., order Zingiberales), primarily due to its role as a vector of Banana bunchy top virus (BBTV), the most severe viral disease of banana worldwide. Here, we generated a highly complete genome assembly of P. nigronervosa using a single PCR-free Illumina sequencing library. Using the same sequence data, we also generated complete genome assemblies of the P. nigronervosa symbiotic bacteria Buchnera aphidicola and Wolbachia. To improve our initial assembly of P. nigronervosa, we developed a k-mer based deduplication pipeline to remove genomic scaffolds derived from the assembly of haplotigs (allelic variants assembled as separate scaffolds). To demonstrate the usefulness of this pipeline, we applied it to the recently generated assembly of the aphid Myzus cerasi, reducing the duplication of conserved BUSCO genes by 25%. Phylogenomic analysis of P. nigronervosa, our improved M. cerasi assembly, and seven previously published aphid genomes, spanning three aphid tribes and two subfamilies, reveals that P. nigronervosa falls within the tribe Macrosiphini, but is an outgroup to other Macrosiphini sequenced so far. As such, the genomic resources reported here will be useful both for understanding the evolution of Macrosiphini and for the study of P. nigronervosa. Furthermore, our approach of using low-cost, high-quality Illumina short reads to generate complete genome assemblies of understudied aphid species will help to fill in genomic black spots in the diverse aphid tree of life.
Keywords: Hemiptera; genome assembly; insect vector; phylogenomics; plant pest
Year: 2020; PMID: 33004433; PMCID: PMC7718742; DOI: 10.1534/g3.120.401358
Source DB: PubMed; Journal: G3 (Bethesda); ISSN: 2160-1836; Impact factor: 3.154
Genome assembly and annotation statistics for P. nigronervosa and M. cerasi
| Species | P. nigronervosa | M. cerasi | M. cerasi |
|---|---|---|---|
| Assembly | Penig_v1 | Mycer_v1.1 | Mycer_v1.2 |
| Base pairs (Mb) | 375.35 | 405.71 | 393.23 |
| % Ns | 0.07 | 0.05 | 0.16 |
| Number of contigs | 20,873 | 51,488 | 45,960 |
| Contig N50 (Kb) | 64.06 | 19.7 | 20.6 |
| Number of scaffolds | 18,348 | 49,286 | 39,595 |
| Scaffold N50 (Kb) | 103.99 | 23.27 | 35.19 |
| Longest scaffold (Kb) | 631.82 | 265.36 | 350.78 |
| Protein coding genes | 27,698 | 28,688 | 31,070 |
| Transcripts | 29,708 | 28,688 | 33,159 |
| Reference | This study | | This study |
Scaffolds split on runs of 10 or more Ns.
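The contig and scaffold N50 values in the table can be illustrated with a minimal Python sketch. It assumes the common definition of N50 (the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length) and derives contigs by splitting scaffolds on runs of 10 or more Ns, as the table note describes. The function names `split_on_n_runs` and `n50` and the toy sequences are hypothetical and are not taken from the study.

```python
import re


def split_on_n_runs(scaffold, min_run=10):
    """Split a scaffold into contigs at runs of at least `min_run` N characters."""
    parts = re.split(r"[Nn]{%d,}" % min_run, scaffold)
    # Drop empty strings produced by N runs at the start or end of a scaffold.
    return [p for p in parts if p]


def n50(lengths):
    """Return N50: the length at which the cumulative sum of the
    longest-first sorted lengths reaches half of the total length."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # empty input


# Toy scaffolds (hypothetical data): only the first has an N run of 10 or more.
scaffolds = ["ACGT" * 5 + "N" * 12 + "ACGT" * 2, "ACGTNNNACGT", "AC"]
contigs = [c for s in scaffolds for c in split_on_n_runs(s)]

scaffold_n50 = n50([len(s) for s in scaffolds])
contig_n50 = n50([len(c) for c in contigs])
print(scaffold_n50, contig_n50)
```

Short N runs (fewer than 10 Ns) are left inside a contig in this sketch, so contig counts and contig N50 depend on the chosen threshold.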
Figure 1. The P. nigronervosa genome assembly is complete and free from duplication and contamination. (a) KAT k-mer spectra plot comparing k-mer content of PCR-free P. nigronervosa Illumina reads to k-mer content of the final P. nigronervosa genome assembly (Penig_v1). Colors indicate how many times fixed-length words (k-mers) from the reads appear in the assembly. Red indicates k-mers found only once in the assembly, black indicates content present in the reads but missing from the assembly, and other colors indicate k-mers that are duplicated in the assembly. The x-axis shows the number of times each k-mer is found in the reads (k-mer multiplicity) and the y-axis shows the count of distinct k-mers in 1x k-mer multiplicity bins. (b) BUSCO analysis of Penig_v1, our updated assembly of M. cerasi (Mycer_v1.2) and published Macrosiphini genome assemblies. Myper_O_v2 = Myzus persicae clone O v2, Acpis_JIC1 = Acyrthosiphon pisum clone JIC1, Mycer_v1.1 = Myzus cerasi v1.1 and Dnox_v1 = Diuraphis noxia v1. The genomes were assessed using the Arthropoda gene set (n = 1,066). (c) Taxon-annotated GC content-coverage plot of the P. nigronervosa Discovar de novo genome assembly (post deduplication and prior to RNA-seq scaffolding; see Methods) showing co-assembly of the aphid and its symbionts. Each circle represents a scaffold in the assembly, scaled by length, and colored by order-level NCBI taxonomy assigned by BlobTools. The x-axis corresponds to the average GC content of each scaffold and the y-axis corresponds to the average coverage based on alignment of P. nigronervosa PCR-free Illumina short reads. Marginal histograms show cumulative genome content (in Kb) for bins of coverage (y-axis) and GC content (x-axis). Arrows highlight scaffolds assigned to the symbiotic bacteria Buchnera aphidicola and Wolbachia, which were removed from the final assembly (Supplementary Figure 2).
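The idea behind the k-mer spectra plot in panel (a) can be sketched in a few lines of Python. This is a toy illustration of the counting logic only, not KAT's implementation: it assumes exact string k-mers, ignores reverse complements, and holds everything in memory. The function names `kmer_counts` and `spectra_matrix` and the example sequences are hypothetical.

```python
from collections import Counter


def kmer_counts(seqs, k):
    """Count every k-mer (as a plain substring) across a list of sequences."""
    counts = Counter()
    for seq in seqs:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def spectra_matrix(reads, assembly, k):
    """Tabulate distinct read k-mers by (read multiplicity, assembly copy number).

    Assembly copy number 0 corresponds to content present in the reads but
    missing from the assembly, 1 to k-mers found once in the assembly, and
    2 or more to k-mers duplicated in the assembly.
    """
    read_counts = kmer_counts(reads, k)
    asm_counts = kmer_counts(assembly, k)
    matrix = Counter()
    for kmer, multiplicity in read_counts.items():
        matrix[(multiplicity, asm_counts.get(kmer, 0))] += 1
    return matrix


# Toy data (hypothetical): two identical reads plus one read absent from the assembly.
reads = ["ACGTA", "ACGTA", "TTTT"]
assembly = ["ACGTA"]
matrix = spectra_matrix(reads, assembly, k=3)
for (multiplicity, copies), n_distinct in sorted(matrix.items()):
    print(multiplicity, copies, n_distinct)
```

Each (read multiplicity, assembly copy number) cell corresponds to one colored segment of a bar in the plot: the read multiplicity selects the bin on the x-axis, the copy number selects the color, and the count of distinct k-mers gives the segment height.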
Figure 2. Maximum likelihood phylogeny of selected aphid species with sequenced genomes based on a concatenated alignment of 4,721 conserved one-to-one orthologs. All branches received maximal support based on the Shimodaira-Hasegawa test (Shimodaira and Hasegawa 1999) implemented in FastTree (Price et al. 2010) with 1,000 resamples. Clades are colored by aphid tribe. Branch lengths are in amino acid substitutions per site.