László Dobson, Tamás Langó, István Reményi, Gábor E Tusnády.
Abstract
The Topology Data Bank of Transmembrane Proteins (TOPDB, http://topdb.enzim.ttk.mta.hu) contains experimentally determined topology data of transmembrane proteins. Recently, we have updated TOPDB from several sources and used a newly developed topology prediction algorithm to determine the most reliable topology, with the results of experiments applied as constraints. In addition to collecting the experimentally determined topology data published in the last couple of years, we gathered topographies defined by the TMDET algorithm using 3D structures from the PDBTM. Results of global topology analyses of various organisms, as well as topology data generated by high-throughput techniques, such as the sequential positions of N- or O-glycosylations, were incorporated into the TOPDB database. Moreover, a new algorithm was developed to integrate scattered topology data from various publicly available databases, and a new method was introduced to measure the reliability of predicted topologies. We show that the reliability values correlate strongly with the per-protein topology accuracy of the prediction method used. Altogether, more than 52,000 new topology data and more than 2,600 new transmembrane proteins have been collected since the last public release of the TOPDB database.
Year: 2014 PMID: 25392424 PMCID: PMC4383934 DOI: 10.1093/nar/gku1119
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 19.160
Distribution of experiment types over the TOPDB entries and over the total topology data in the first (purple bars) and the current (red bars) release of the TOPDB database
Figure 1. Distribution of proteins with different numbers of transmembrane segments in the TOPDB database.
Figure 2. Distribution of the calculated reliability in the TOPDB database. Entries were sorted by reliability, and the order number of each entry divided by the size of the TOPDB database (coverage) was plotted against the reliability of the protein at that position in the sorted list.
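The coverage-versus-reliability curve described in the Figure 2 caption can be sketched in a few lines of Python. This is only an illustration of the procedure as the caption states it, not the authors' code. The function name `coverage_curve`, the entry identifiers, and the reliability values are hypothetical, and sorting from most to least reliable is an assumption, since the caption does not give the sort direction.

```python
from typing import Dict, List, Tuple


def coverage_curve(reliability: Dict[str, float]) -> List[Tuple[float, float]]:
    """Return (coverage, reliability) points for entries sorted by reliability."""
    # Assumption: entries are ranked from most to least reliable;
    # the caption does not state the sort direction.
    ranked = sorted(reliability.values(), reverse=True)
    n = len(ranked)
    # The 1-based order number divided by the database size gives the coverage.
    return [(rank / n, value) for rank, value in enumerate(ranked, start=1)]


# Hypothetical entry identifiers and reliability values, for illustration only.
example = {"entry_a": 97.0, "entry_b": 62.5, "entry_c": 88.0, "entry_d": 40.0}
points = coverage_curve(example)
```

Under the descending-order assumption, each point gives the fraction of the database (coverage) whose entries have a reliability at least as high as the plotted value.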