Alan J Medlar, Petri Törönen, Liisa Holm.
Abstract
We present AAI-profiler, a web server for exploratory analysis and quality control in comparative genomics. AAI-profiler summarizes proteome-wide sequence search results to identify novel species, assess the need for taxonomic reclassification and detect multi-isolate and contaminated samples. AAI-profiler visualises results using a scatterplot that shows the Average Amino-acid Identity (AAI) from the query proteome to all similar species in the sequence database. Taxonomic groups are indicated by colour and marker styles, making outliers easy to spot. AAI-profiler uses SANSparallel to perform high-performance homology searches, making proteome-wide analysis possible. We demonstrate the efficacy of AAI-profiler in the discovery of a close relationship between two bacterial symbionts of an omnivorous pirate bug (Orius) and a thrips (Frankliniella occidentalis), an important pest in agriculture. The symbionts represent novel species within the genus Rosenbergiella, which has so far been described only in floral nectar. AAI-profiler is easy to use: the analysis presented required only two mouse clicks and was completed in a few minutes. AAI-profiler is available at http://ekhidna2.biocenter.helsinki.fi/AAI.
Year: 2018 PMID: 29762724 PMCID: PMC6030964 DOI: 10.1093/nar/gky359
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. (A) The principle of AAI-profiler, using colour to indicate database species related to the Query species at species level (orange), genus level (light gray), or phylum level (yellow). Top: Distributions of pairwise sequence identities between proteins of the Query proteome and their best match in a species in the database. Bottom: Cladogram showing nested taxonomic groupings. Different genes evolve at different rates, broadening the distributions of taxa which are more distantly related to the Query species. The vertical lines indicate that, for a given Query proteome, the proteome-wide average of the pairwise sequence identities (Average Amino-acid Identity, AAI) correlates with taxonomic distance. (B) AAI-profiler scatterplot for bacterial symbiont BFo1 of Frankliniella occidentalis. Selected bacterial genera are highlighted by different colours. The dot nearest coordinates (1,1) is the query species. Species of the genus Erwinia occupy the range AAI > 0.9. The inset shows a pie chart from the taxonomic profile view: the majority of query proteins have a closest match in Erwinia species.
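The averaging described in panel A can be sketched in a few lines of Python. This is only an illustration of the principle, not the AAI-profiler implementation: the function name `average_amino_acid_identity`, the `(query_protein, species, identity)` tuple layout and the toy species labels are all assumptions made for the example, and identities are taken to be fractions between 0 and 1.

```python
from collections import defaultdict


def average_amino_acid_identity(hits):
    """Return {species: AAI} from (query_protein, species, identity) hits.

    Hypothetical input layout: identity is a fraction in [0, 1].
    For each species, only the best match of each query protein is kept,
    and the AAI is the mean identity over those best matches.
    """
    best = {}
    for query, species, identity in hits:
        key = (species, query)
        # Keep the highest identity seen for this query protein in this species.
        if identity > best.get(key, -1.0):
            best[key] = identity

    per_species = defaultdict(list)
    for (species, _query), identity in best.items():
        per_species[species].append(identity)

    return {s: sum(v) / len(v) for s, v in per_species.items()}


# Toy data (invented for illustration only).
hits = [
    ("p1", "species_A", 0.90),
    ("p1", "species_A", 0.70),  # weaker hit to the same species, ignored
    ("p2", "species_A", 0.80),
    ("p1", "species_B", 0.50),
    ("p2", "species_B", 0.60),
]
aai = average_amino_acid_identity(hits)
for species, value in sorted(aai.items()):
    print(f"{species}: AAI = {value:.2f}")
```

In this toy input, species_A ends up with a higher AAI than species_B, which under the principle above would place it taxonomically closer to the query.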
Figure 2. (A) AAI-profiler scatterplot for bacterial symbiont BFo2 of Frankliniella occidentalis. The dot nearest coordinates (1,1) is the query species. (B) AAI-profiler scatterplot of Rosenbergiella nectarea. Only a few Rosenbergiella proteins are included in the UniProt database. The query proteome was obtained from NCBI genomes. Rosenbergiella, OPLPL6 and BFo2 form a closely related group which is distinct from other genera within Erwiniaceae at around 75% AAI.
Figure 3. The Chlamydia trachomatis pan proteome illustrates data contamination due to mislabeled multi-isolate samples. (A) The AAI-profiler scatterplot of the Chlamydia pan proteome shows a complex mixture of several species. The pie chart (inset from taxonomic profile view) shows that only a minor fraction of query proteins in the Chlamydia trachomatis pan proteome have a nearest match in Chlamydia. (B) The AAI-profiler scatterplot of Chlamydia trachomatis strain SwabB4 shows a superposition of two species (Lactobacillus crispatus [orange dots] and Chlamydia trachomatis [blue dots]). The pie charts (inset from taxonomic profile view) show the proportion of the two genera in the sample.
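The pie charts in the taxonomic profile view report, per genus, the share of query proteins whose nearest match falls in that genus. A minimal Python sketch of such a tally follows; it is an assumption-laden illustration rather than the server's code. The function name `taxonomic_profile`, the `(query_protein, genus, identity)` layout and the toy hits are hypothetical.

```python
from collections import Counter


def taxonomic_profile(hits):
    """Return {genus: fraction of query proteins whose nearest match is in it}.

    Hypothetical input layout: (query_protein, genus, identity) tuples,
    where the nearest match of a protein is its highest-identity hit.
    """
    nearest = {}
    for query, genus, identity in hits:
        if query not in nearest or identity > nearest[query][1]:
            nearest[query] = (genus, identity)

    counts = Counter(genus for genus, _identity in nearest.values())
    total = sum(counts.values())
    return {genus: n / total for genus, n in counts.items()}


# Toy mixed sample (invented for illustration only).
sample_hits = [
    ("p1", "Chlamydia", 0.99),
    ("p1", "Lactobacillus", 0.30),
    ("p2", "Lactobacillus", 0.98),
    ("p3", "Lactobacillus", 0.97),
    ("p3", "Chlamydia", 0.40),
    ("p4", "Chlamydia", 0.95),
]
profile = taxonomic_profile(sample_hits)
for genus, fraction in sorted(profile.items()):
    print(f"{genus}: {fraction:.0%} of query proteins")
```

A profile split between two unrelated genera, as in this toy sample, is the kind of pattern that would flag a sample as a possible mixture or contamination.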