Tania Nobre1, M Doroteia Campos1, Eva Lucic-Mercy2, Birgit Arnholdt-Schmitt1.
Abstract
The amount of incorrectly annotated, or simply unannotated, data is increasing in most public databases, undoubtedly because of the rise in sequence data and the more recent boom of genomic projects. Molecular biologists and bioinformaticists should join efforts to tackle this issue. Practical challenges were experienced when studying the alternative oxidase (AOX) gene family, hence the motivation for the present work. Commonly used databases were screened for their capacity to distinguish AOX from the plastid terminal oxidase (also called plastoquinol terminal oxidase; PTOX), and we put forward a simple approach, based on amino acid signatures, that unequivocally distinguishes these gene families. Further, available sequence data on the AOX family in plants were carefully revised to (1) confirm the classification as AOX and (2) identify the AOX family member to which each sequence belongs. We highlight the urgent need for misannotation awareness and re-annotation of public AOX sequences by describing different types of misclassification and the large underestimation of data availability.
Keywords: alternative oxidase; databases; gene annotation; gene family; phylogeny; plastoquinol terminal oxidase; signature-based classification
Year: 2016 PMID: 27379147 PMCID: PMC4909761 DOI: 10.3389/fpls.2016.00868
Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 5.753
Figure 1 (A) Neighbor-joining (NJ) tree of PTOX sequences (orange) and AOX sequences (green), clearly showing that they belong to two separate clades. The logo representation of the AOX/PTOX signatures was constructed with the web interface program WebLogo (Crooks et al., 2004). We propose two group-specific fingerprints: (1) for PTOX, based on 60 sequences, (F)GWWRR and HHLL(I)ME; (2) for AOX, based on 206 sequences, ERMHLVT and YLEEEA (Supplementary Data 2). (B) Gene structure of the Arabidopsis thaliana AOX and PTOX nucleotide sequences. AOX in plants generally presents 4 exons interrupted by 3 introns (evolutionary intron loss or gain resulted in variation of intron numbers in some AOX members; Cardoso et al., 2015); in order of appearance: AT1G32350; AT3G22360; AT3G22370; AT3G27620; AT5G84210. PTOX is typically structured in 9 exons and 8 introns: AT4G22260.
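The fingerprints listed in the Figure 1 caption lend themselves to a simple motif scan. The following is a minimal sketch, not the authors' implementation. It makes two assumptions that the caption does not state: a residue written in parentheses, such as (F) or (I), is treated as optional, and a sequence is assigned to a family only when both of that family's motifs are present. The function name `classify` and the labels "ambiguous" and "unclassified" are hypothetical.

```python
import re

# Fingerprints quoted from the Figure 1 caption.
# ASSUMPTION: a parenthesized residue, (F) or (I), is optional.
SIGNATURES = {
    "PTOX": [re.compile(r"F?GWWRR"), re.compile(r"HHLLI?ME")],
    "AOX": [re.compile(r"ERMHLVT"), re.compile(r"YLEEEA")],
}


def classify(protein_seq: str) -> str:
    """Label one protein sequence as 'AOX', 'PTOX', 'ambiguous', or 'unclassified'.

    ASSUMPTION: a family is matched only if both of its motifs occur.
    """
    # Normalize case and drop whitespace, alignment gaps, and stop symbols.
    seq = re.sub(r"[\s\-*]", "", protein_seq.upper())
    matched = [
        family
        for family, patterns in SIGNATURES.items()
        if all(p.search(seq) for p in patterns)
    ]
    if len(matched) == 1:
        return matched[0]
    return "ambiguous" if matched else "unclassified"
```

A sequence that returns "unclassified" or "ambiguous" would still need manual inspection, for example against the alignment or the gene structure described in panel (B).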
Figure 2 Reconstructed phylogeny of the plant AOX gene family sequences included in this study (a combination of newly collected sequences and the alignment made available by Costa et al.). The optimal substitution model, JTT+I+G, was selected in MrModeltest 2.2 (Posada and Crandall, 1998). The phylogeny corresponds to the majority-rule consensus of trees sampled in a Bayesian analysis conducted using MrBayes version 3.0 (Huelsenbeck and Ronquist, 2001; Ronquist and Huelsenbeck, 2003) with default settings. MCMC runs of 100,000 generations were repeated three times as a safeguard against spurious results; the first 1000 trees were discarded as burn-in; stationarity was confirmed by analysis of the log-likelihoods and the consistency between runs. The numbers above the branches refer to the Bayesian posterior probability of the nodes (more than 50%) derived from 19500 Markov chain Monte Carlo-sampled trees. #Clade containing a putative Solanum tuberosum AOX1 sequence (sequence id PRF: 1588565); likely a misidentification of the organism. *Centaurea maculosa (Asterales) AOX2b sequence (GenBank EH723572.1) does not cluster with the other Asterales.
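The node support values in Figure 2 can be read as the fraction of post-burn-in sampled trees that contain a given clade, with only clades above 50% retained. The sketch below illustrates that idea on toy data; it is not MrBayes code. It assumes each sampled tree is represented as a collection of clades, and each clade as a frozenset of taxon names. The function names and the taxa A, B, C are hypothetical.

```python
from collections import Counter


def clade_support(sampled_trees, burn_in):
    """Return the fraction of post-burn-in trees containing each clade."""
    kept = sampled_trees[burn_in:]  # discard the first `burn_in` trees
    if not kept:
        raise ValueError("no trees left after burn-in")
    # Count each clade at most once per tree.
    counts = Counter(clade for tree in kept for clade in set(tree))
    return {clade: n / len(kept) for clade, n in counts.items()}


def majority_rule(sampled_trees, burn_in, threshold=0.5):
    """Keep only clades whose support exceeds `threshold` (default 50%)."""
    support = clade_support(sampled_trees, burn_in)
    return {clade: s for clade, s in support.items() if s > threshold}


# Toy sample: four trees over hypothetical taxa, first one treated as burn-in.
AB, AC, ABC = frozenset("AB"), frozenset("AC"), frozenset("ABC")
toy_trees = [[AB], [AB, ABC], [AC, ABC], [AB, ABC]]
consensus = majority_rule(toy_trees, burn_in=1)
```

In the toy sample, the clade {A, C} appears in only one of the three retained trees and is therefore dropped from the consensus.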