Dan Søndergaard, Christian Nørgaard Storm Pedersen.
Abstract
P-Type ATPases are part of the regulatory system of the cell, where they are responsible for transporting ions and lipids through the cell membrane. These pumps are found in all eukaryotes, and their malfunction has been found to cause several severe diseases. Knowing which substrate is pumped by a certain P-Type ATPase is therefore vital. The P-Type ATPases can be divided into 11 subtypes based on their specificity, that is, the substrate that they pump. Determining the subtype experimentally is time-consuming. Thus, it is of great interest to be able to accurately predict the subtype from the amino acid sequence alone. We present an approach to P-Type ATPase sequence classification based on k-nearest neighbors, similar to a homology search, and show that this method performs very well and, to the best of our knowledge, better than any existing method despite its simplicity. The classifier is made available as a web service at http://services.birc.au.dk/patbox/ which also provides access to a database of potential P-Type ATPases and their predicted subtypes.
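The idea of classifying a sequence by its k nearest labeled neighbors can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the similarity measure (Jaccard overlap of 3-mers, standing in for whatever alignment or homology score is actually used), the function names, the toy sequences, and the placeholder labels `type_X` and `type_Y`.

```python
from collections import defaultdict


def kmers(seq, n=3):
    """Return the set of overlapping length-n substrings of seq."""
    return {seq[i:i + n] for i in range(len(seq) - n + 1)}


def similarity(a, b, n=3):
    """Jaccard similarity of the two n-mer sets (hypothetical stand-in
    for an alignment-based score)."""
    ka, kb = kmers(a, n), kmers(b, n)
    union = ka | kb
    return len(ka & kb) / len(union) if union else 0.0


def knn_predict(query, labeled, k=1, weighted=False):
    """Predict a label from the k labeled sequences most similar to query.

    Unweighted: each neighbor casts one vote.
    Weighted: each neighbor's vote counts in proportion to its similarity.
    """
    scored = sorted(
        ((similarity(query, seq), label) for seq, label in labeled),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    votes = defaultdict(float)
    for score, label in scored:
        votes[label] += score if weighted else 1.0
    return max(votes, key=votes.get)


# Toy, made-up sequences with placeholder subtype labels.
labeled = [
    ("MKTAYIAKQR", "type_X"),
    ("MKTAYIAKQK", "type_X"),
    ("GGSLLPWWEE", "type_Y"),
    ("GGSLLPWWED", "type_Y"),
]

prediction_1nn = knn_predict("MKTAYIAKQQ", labeled, k=1)
prediction_3nn = knn_predict("GGSLLPWWEA", labeled, k=3, weighted=True)
```

With k = 1 the prediction is simply the label of the single most similar known sequence, which is why the approach resembles a homology search.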
Year: 2015 PMID: 26422234 PMCID: PMC4589233 DOI: 10.1371/journal.pone.0139571
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. The results of 20 runs of 5-fold cross-validation for 1 ≤ k ≤ 50.
The weighted and unweighted approaches both perform well for small k. For k = 1 we obtain an accuracy of 100%. Dots are outliers. Lines show accuracy for reduced datasets.
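The evaluation protocol named in the caption, repeated 5-fold cross-validation over a range of k, might be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' code: the data are made-up one-dimensional points in two well-separated clusters (in place of protein sequences), the distance is an absolute difference, only the unweighted vote is shown, and all function names are hypothetical.

```python
import random
from collections import Counter


def knn_label(train, x, k):
    """Unweighted majority vote among the k training points closest to x."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def cv_accuracy(data, k, folds, rng):
    """One run of fold-wise cross-validation; returns overall accuracy."""
    shuffled = data[:]
    rng.shuffle(shuffled)
    correct = 0
    for f in range(folds):
        # Every folds-th item (offset f) is held out; the rest is training data.
        test = shuffled[f::folds]
        train = [p for i, p in enumerate(shuffled) if i % folds != f]
        correct += sum(knn_label(train, x, k) == label for x, label in test)
    return correct / len(shuffled)


def repeated_cv(data, ks, runs=20, folds=5, seed=0):
    """Map each k to a list of accuracies, one per reshuffled run."""
    rng = random.Random(seed)
    return {k: [cv_accuracy(data, k, folds, rng) for _ in range(runs)] for k in ks}


# Toy data: two clearly separated clusters with placeholder labels.
data = [(i * 0.1, "a") for i in range(10)] + [(10 + i * 0.1, "b") for i in range(10)]
results = repeated_cv(data, ks=[1, 3])
```

Each list in `results` holds one accuracy per run, which is the kind of per-k distribution a box plot would summarize.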