Identification of single-stranded and double-stranded DNA binding proteins based on protein structure.

Wei Wang, Juan Liu, Xionghui Zhou.   

Abstract

BACKGROUND: Protein-DNA interactions are essential for many biological processes. However, the structural mechanisms underlying these interactions are not fully understood. DNA binding proteins can be classified into double-stranded DNA binding proteins (DSBs) and single-stranded DNA binding proteins (SSBs), which take part in different biological functions. DSBs usually act as transcription factors that regulate gene expression, while SSBs usually play roles in DNA replication, recombination, and repair. Understanding the binding specificity of a DNA binding protein therefore helps in studying its function.
RESULTS: In this paper, we investigated the differences between DSBs and SSBs in terms of surface tunnels and OB-fold domain information. We detected the largest cleft on each protein surface and derived from it several features for distinguishing the potential binding interfaces of SSBs and DSBs. We also compared each protein structure with each of six OB-fold protein templates and used the maximal alignment score (TM-score) as the OB-fold feature of the protein. Based on these features, we constructed a support vector machine (SVM) classification model that automatically distinguishes the two kinds of proteins, with prediction accuracies of 87%, 83% and 83% for the HOLO set, APO set and Mixed set, respectively.
CONCLUSIONS: We found that DSBs and SSBs have different ranges of tunnel lengths and tunnel curvatures; moreover, the alignment results with OB-fold templates were also found to be a discriminative feature of SSBs and DSBs. Results of 10-fold cross validation indicate that the new feature set is effective for describing DNA binding proteins. The evaluation on both bound (DNA-bound) and non-bound (DNA-free) proteins shows satisfactory performance of our method.

Year: 2014    PMID: 25474071    PMCID: PMC4243121    DOI: 10.1186/1471-2105-15-S12-S4

Source DB: PubMed    Journal: BMC Bioinformatics    ISSN: 1471-2105    Impact factor: 3.169


Background

DNA binding proteins recognize and bind to DNA, and they play vital roles in many biological processes such as DNA replication, recombination, repair, transcription, translation, and maintenance of telomeres [1-4]. DNA occurs in two forms, single-stranded DNA (ssDNA) and double-stranded DNA (dsDNA). Accordingly, DNA binding proteins are usually divided into single-stranded DNA-binding proteins (SSBs) and double-stranded DNA-binding proteins (DSBs). SSBs bind ssDNA with high affinity and low specificity, and are mainly involved in DNA replication, recombination and repair. DSBs, in contrast, bind particular dsDNA sequences to modulate transcription, cleave DNA molecules, or take part in chromosome packaging and transcription in the cell nucleus. Although SSBs and DSBs have each been studied separately [5-7], little attention has been paid to what gives SSBs and DSBs such different binding specificities. With the development of biotechnology, a large number of proteins have been sequenced. However, SSBs have been shown to have little sequence conservation [8]. Even though DSBs involved in similar functions may share conserved subsequences, DSBs with different functions seem to show few common subsequences. Therefore, it is hard to tell SSBs from DSBs on the basis of sequence alone. Since the structure of a molecule determines its biological function, structural information is expected to provide insight into the binding mechanisms of SSBs and DSBs. Thanks to the progress of the structural genomics project [9], more and more high-resolution 3D structures of DSBs and SSBs are now available, which makes it possible to investigate the structural differences between SSBs and DSBs that are responsible for their binding specificity.
In addition, the results of such an investigation can help to annotate, or refine the annotation of, proteins whose structures are known but whose functions are unknown or not fully understood. In fact, as of Jan. 25, 2013, the Protein Data Bank (PDB) [10] contained 3390 structures of DNA binding proteins (see Additional file 1); only about 30% and 5% of them are annotated as DSBs and SSBs, respectively, and whether the remaining ones are DSBs or SSBs is still unclear. Therefore, a computational method is required to annotate a DNA binding protein as a DSB or an SSB automatically. To address this question, this work characterizes the structural differences between DSBs and SSBs, and then constructs a model that can automatically refine the annotations of DNA binding proteins. The surface of a protein is generally irregular, containing many clefts and grooves of varying shapes and sizes [11]. Previous studies have shown that a large cleft gives the protein an increased opportunity to form interactions with other molecules, particularly small ligands [12,13]. Therefore, some studies have used a particularly large and deep cleft to characterize the binding sites of proteins [11,13,14]. We hypothesize that for DNA binding proteins, the properties of surface clefts may also play an important role in dsDNA/ssDNA binding specificity. Previous results have also shown that although the sequences of different SSBs are very different, their structures contain well-conserved elements: most SSBs contain one or more OB (oligonucleotide/oligosaccharide binding)-fold domains [6,15-18]. A typical OB-fold has a five-stranded beta-sheet coiled to form a closed beta-barrel. This barrel is capped by an alpha-helix located between the third and fourth strands. The OB-fold plays a critical role in binding ssDNA.
Although the OB-fold cannot be said to be unique to SSBs, we think that it should also serve as an important descriptor for distinguishing SSBs from DSBs. In this paper, we investigate the structural differences between the collected SSBs and DSBs and extract structure-based features related to surface clefts and OB-folds. Based on these features, we construct a computational model that automatically classifies a DNA binding protein as a DSB or an SSB using the widely used support vector machine (SVM). The promising performance suggests that our method will be useful for protein function annotation and refinement.

Methods

Data sets

We first extracted the structures of all 3390 DNA binding proteins from the PDB (Jan. 25, 2013 release) according to their annotations; they comprise 1039 DSBs (HOLO 890, APO 149), 158 SSBs (HOLO 70, APO 88) and 2193 unknowns. We then used PISCES (http://dunbrack.fccc.edu/PISCES.php) [19] to obtain a non-redundant set in which every structure was solved either by NMR or by X-ray crystallography at a resolution better than 3 Å, the sequence identity is less than 30%, and the chain length is greater than 40 amino acid residues. As a result, we obtained 204 DSBs (HOLO 154 and APO 50), 75 SSBs (HOLO 37 and APO 38) and 727 unknowns (Additional file 2). For simplicity, we call the set of DNA-bound protein structures the HOLO set and the set of DNA-unbound protein structures the APO set; the proteins in these sets are denoted DSB_holo, SSB_holo, DSB_apo, and SSB_apo hereinafter.
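The per-chain part of these selection criteria can be sketched as a simple filter. This is only an illustration: the `Chain` record, its field names and the method labels are hypothetical, and the 30% sequence-identity culling, which needs pairwise sequence comparison, is left to PISCES and is not reproduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Chain:
    """Hypothetical record for one protein chain (field names are assumptions)."""
    pdb_id: str
    method: str                  # "NMR" or "X-RAY" (labels assumed for this sketch)
    resolution: Optional[float]  # in angstroms; None for NMR structures
    length: int                  # number of amino acid residues


def passes_quality_filter(chain: Chain,
                          max_resolution: float = 3.0,
                          min_length: int = 40) -> bool:
    """Keep chains longer than min_length that are NMR or X-ray better than max_resolution."""
    if chain.length <= min_length:
        return False
    if chain.method == "NMR":
        return True
    return (chain.method == "X-RAY"
            and chain.resolution is not None
            and chain.resolution < max_resolution)


# Made-up example chains, not taken from the data set.
candidates = [
    Chain("AAAA", "X-RAY", 2.1, 250),   # kept
    Chain("BBBB", "X-RAY", 3.4, 300),   # rejected: resolution too coarse
    Chain("CCCC", "NMR", None, 35),     # rejected: chain too short
    Chain("DDDD", "NMR", None, 120),    # kept
]
kept = [c.pdb_id for c in candidates if passes_quality_filter(c)]
```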

Features on clefts

The protein surface has a very complex and irregular shape with concave, convex and flat regions, which affects how the protein interacts with the external environment. Clefts, pockets, and cavities are generally considered to be the active sites on protein surfaces, so studying them is meaningful for understanding protein function. Since it has been reported that a large cleft gives the protein an increased opportunity to form interactions with other molecules [12,13], and particularly large and deep clefts have been used to characterize the binding sites of proteins [11], we consider that for DNA binding proteins the large clefts on the surface may also play an important role in dsDNA/ssDNA binding. In other words, the large clefts on an SSB would be narrow enough to prevent it from binding dsDNA. Several tools have been developed to detect clefts from protein structures, such as HOLE [20], MOLE [21,22], MolAxis [23] and CAVER [24,25]. In this work, we applied the CAVER 3.0 package to detect the clefts on the protein surfaces and to compute the indexes of the largest ones (also called tunnels in this work), in order to investigate whether they can be used to distinguish the potential binding interfaces of SSBs and DSBs. Concretely, we obtained three indexes for each detected tunnel: length, curvature and bottleneck radius.
Length: the length of the path from the start point to the end point along the tunnel axis.
Curvature: the curvature of the tunnel, calculated as Curvature = Length/Distance, where Distance is the length of the straight line from the start point to the end point of the tunnel. The greater the curvature, the more curved the tunnel.
Bottleneck radius: the radius of the narrowest part of the tunnel, where the radius at a given point of the tunnel axis is that of the largest ball that can be centered there without colliding with the input structure.
Because the protein surface contains many tunnels of varying shapes and sizes, the CAVER package returns as many tunnels as possible. For the reason mentioned above, we consider only the largest one, defined as the tunnel that maximizes Length × Bottleneck radius. For example, for protein 1A73, CAVER detects 27 tunnels, shown in Figure 1, and their indexes are listed in Table 1. According to this criterion, tunnel number 25 (Figure 2) is taken as the largest tunnel.
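The curvature definition and the selection criterion can be illustrated with a minimal sketch. The `Tunnel` record and the function names are hypothetical and are not part of CAVER; the five example rows are copied from Table 1 for protein 1A73.

```python
import math
from dataclasses import dataclass


@dataclass
class Tunnel:
    """Hypothetical container for the three indexes of one tunnel."""
    tunnel_id: int
    bottleneck_radius: float
    length: float
    curvature: float


def curvature(length, start, end):
    """Curvature = Length / Distance, with Distance the straight start-to-end line."""
    return length / math.dist(start, end)


def largest_tunnel(tunnels):
    """Return the tunnel that maximizes Length * Bottleneck radius."""
    return max(tunnels, key=lambda t: t.length * t.bottleneck_radius)


# A subset of the rows of Table 1 (protein 1A73).
tunnels = [
    Tunnel(1, 3.52, 2.47, 1.05),
    Tunnel(13, 1.03, 35.71, 1.61),
    Tunnel(21, 0.88, 51.54, 1.62),
    Tunnel(25, 0.74, 62.01, 1.64),
    Tunnel(26, 0.72, 45.18, 3.11),
]
best = largest_tunnel(tunnels)
print(best.tunnel_id, round(best.length * best.bottleneck_radius, 2))
```

Tunnel 25 wins narrowly over tunnel 21 (about 45.89 versus 45.36), which shows that the product criterion favours long tunnels even when their bottleneck is comparatively narrow.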
Figure 1

All detected tunnels of protein 1A73. The CAVER package detects 27 tunnels in protein 1A73; the graph shows the 3D structure of all tunnels in different colours on the protein surface.

Table 1

Index values for all tunnels of 1A73

Tunnel   Bottleneck radius   Length   Curvature
1        3.52                2.47     1.05
2        2.79                3.48     1.26
3        2.54                7.64     1.26
4        1.85                5.85     1.77
5        1.86                12.08    2.02
6        1.33                14.78    1.29
7        1.25                12.68    1.43
8        0.96                13.08    1.39
9        1.09                15.63    1.50
10       1.13                16.26    1.71
11       1.03                29.47    1.57
12       0.98                25.02    1.62
13       1.03                35.71    1.61
14       1.07                33.06    2.00
15       0.77                19.99    1.43
16       0.77                35.07    1.47
17       0.79                25.53    2.09
18       0.71                24.74    1.39
19       0.77                28.35    1.32
20       0.72                38.97    1.78
21       0.88                51.54    1.62
22       0.70                46.82    1.47
23       0.77                36.59    1.40
24       0.73                41.06    1.47
25*      0.74                62.01    1.64
26       0.72                45.18    3.11
27       0.72                47.09    2.18

This table shows the values of bottleneck radius, length and curvature for all tunnels. The row marked with an asterisk (tunnel 25) gives the values of the largest tunnel.

Figure 2

The largest tunnel (tunnel 25) of protein 1A73. The red tunnel is the largest tunnel in terms of maximizing Length × Bottleneck radius.

Feature on OB-fold domain

The OB-fold is a small structural motif that was first characterized in 1992 in four proteins that bind either oligonucleotides or oligosaccharides [26]. Typically, the OB-fold comprises a five-stranded β-sheet coiled to form a closed β-barrel and capped by an α-helix located between the third and fourth β-strands [27-30]. Although the OB-fold has since been observed at protein-protein interfaces as well, the nucleic acid-binding superfamily is the largest among the OB-folds, and proteins containing OB-folds are involved almost whenever single-stranded DNAs or RNAs are present or require manipulation [8]. Since OB-folds are conserved and play important roles in SSB-ssDNA binding, we extract a feature indicating whether a protein contains an OB-fold, in the hope that this feature can distinguish SSBs from DSBs. Because OB-folds have evolved into several variants even though they are well conserved, we chose chain A of six typical proteins (PDB: 1QUQ [31], 1V1Q [32], 4GS3 [33], 3ULL [34], 1O7I [35], 1JMC [36]), shown in Figure 3, as OB-fold templates. Figure 3 shows that these chains contain nothing except OB-fold domains. Moreover, each chain of the first five proteins contains exactly one OB-fold domain. Since 1JMC_A contains two OB-fold domains, we use only one of them as the template.
Figure 3

Six templates of the OB-fold domain. They show structural similarity but different topologies, and their sequence similarity is below 30%.

For an unknown protein, we use the protein structure alignment package TM-align [37] to compare its structure with each of the templates, and we use the maximal alignment score (TM-score) as the OB-fold feature of the protein.
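Taking the maximum over the six templates can be sketched as follows. The sketch assumes that the TM-scores against each template have already been computed by TM-align and collected in a dictionary; the function name and the example scores are hypothetical, and only the template identifiers come from the text.

```python
# Template chains named in the text (chain A of each PDB entry).
OB_FOLD_TEMPLATES = ("1QUQ", "1V1Q", "4GS3", "3ULL", "1O7I", "1JMC")


def ob_fold_feature(tm_scores):
    """Return (best_template, max_tm_score) over the six OB-fold templates.

    tm_scores maps a template identifier to the TM-score of the query
    protein aligned against that template.
    """
    missing = [t for t in OB_FOLD_TEMPLATES if t not in tm_scores]
    if missing:
        raise ValueError(f"no TM-score for templates: {missing}")
    best = max(OB_FOLD_TEMPLATES, key=lambda t: tm_scores[t])
    return best, tm_scores[best]


# Made-up scores for one query protein, for illustration only.
example_scores = {
    "1QUQ": 0.41, "1V1Q": 0.62, "4GS3": 0.38,
    "3ULL": 0.55, "1O7I": 0.47, "1JMC": 0.50,
}
best_template, feature = ob_fold_feature(example_scores)
```

Using the maximum rather than the mean means that a protein needs to resemble only one OB-fold variant to obtain a high feature value.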

Classification model and evaluation

In this work, we used the support vector machine (SVM) to build the classification model. The SVM classifiers were implemented with the Matlab 2012a SVM package, using the Gaussian radial basis function (RBF) as the kernel. To evaluate the prediction results, we used several measures: accuracy, sensitivity, specificity, F-measure, Matthews correlation coefficient and the area under the receiver operating characteristic curve (AUC). Let TP (true positive) be the number of proteins correctly predicted as SSBs, FP (false positive) the number of proteins incorrectly predicted as SSBs, TN (true negative) the number of proteins correctly predicted as DSBs, and FN (false negative) the number of proteins incorrectly predicted as DSBs. The accuracy (ACC), sensitivity (SN), specificity (SP), F-measure (F1) and Matthews correlation coefficient (MCC) are defined as follows:

$$\mathrm{ACC} = \frac{TP + TN}{TP + TN + FP + FN}$$

$$\mathrm{SN} = \frac{TP}{TP + FN}$$

$$\mathrm{SP} = \frac{TN}{TN + FP}$$

$$\mathrm{F1} = \frac{2\,TP}{2\,TP + FP + FN}$$

$$\mathrm{MCC} = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}$$

We use a 10-fold cross-validation test to evaluate the classification performance. Because the two kinds of proteins are imbalanced, in each fold we repeat the following 15 times: randomly down-sample the training set so that it contains equal numbers of SSBs and DSBs, and train a classifier on it. The class label of each test protein is then assigned by voting among the resulting classifiers. To the best of our knowledge, no computational method exists for distinguishing SSBs from DSBs, so we also train a random classifier as the baseline in each test.
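The down-sampling and voting scheme can be sketched in a few lines. This is an illustration under stated assumptions, not the Matlab implementation: a nearest-centroid classifier stands in for the RBF-kernel SVM so that the example needs only the standard library, and all function names and feature values are hypothetical.

```python
import random
from collections import Counter


def balanced_downsample(samples, labels, rng):
    """Randomly keep the same number of samples from every class."""
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    n = min(len(xs) for xs in by_class.values())
    xs_out, ys_out = [], []
    for y, xs in by_class.items():
        for x in rng.sample(xs, n):
            xs_out.append(x)
            ys_out.append(y)
    return xs_out, ys_out


def train_centroid(samples, labels):
    """Stand-in classifier (NOT an SVM): predict the class of the nearest centroid."""
    counts = Counter(labels)
    sums = {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
    centroids = {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

    def predict(x):
        return min(centroids,
                   key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])))
    return predict


def vote_predict(train_x, train_y, test_x, n_rounds=15, seed=0, train=train_centroid):
    """Train n_rounds classifiers on balanced subsamples and take a majority vote."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_rounds):
        xs, ys = balanced_downsample(train_x, train_y, rng)
        votes[train(xs, ys)(test_x)] += 1
    return votes.most_common(1)[0][0]


# Made-up (tunnel length, TM-score) pairs; six DSBs against two SSBs.
train_x = [(60.0, 0.30), (55.0, 0.35), (58.0, 0.28), (62.0, 0.33),
           (50.0, 0.31), (57.0, 0.29), (20.0, 0.60), (25.0, 0.65)]
train_y = ["DSB"] * 6 + ["SSB"] * 2
label_a = vote_predict(train_x, train_y, (22.0, 0.62))
label_b = vote_predict(train_x, train_y, (59.0, 0.30))
```

With an odd number of rounds and two classes the vote cannot tie. In a real pipeline the features would also need scaling before distances or kernels are computed, which this sketch omits.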

Results and discussion

Investigation of the distinguishing ability of the features

Using CAVER 3.0, we detected 990 tunnels in the HOLO set (865 for DSBs, 125 for SSBs) and 1168 tunnels in the APO set (757 for DSBs, 411 for SSBs). According to the maximizing criterion described above, we selected one maximal tunnel for each protein. As a result, we obtained 37 tunnels for bound (DNA-bound) SSBs, 38 tunnels for unbound (DNA-free) SSBs, 154 tunnels for bound DSBs and 51 tunnels for unbound DSBs, with three feature values for each tunnel. Using TM-align, we aligned every protein with each of the six OB-fold templates shown in Figure 3 and took the maximal alignment score as the TM-score of the protein. To investigate the distinguishing ability of the features, we analysed the distribution of each feature statistically (Figure 4). The bottleneck radius shows little difference between DSBs and SSBs in either bound or unbound form; DNA binding proteins in bound form tend to have a larger bottleneck radius than those in unbound form, which may be because the protein usually needs to widen the tunnel to bind DNA. SSBs tend to have smaller tunnel lengths and curvatures than DSBs, and tunnel length seems to discriminate better between DSBs and SSBs than tunnel curvature; moreover, with either feature it seems easier to differentiate DSBs and SSBs in bound form than in unbound form. As expected, SSBs obtain much higher TM-scores than DSBs when compared with the OB-fold templates, indicating that most SSBs have OB-fold-like domains. In conclusion, TM-score, tunnel length and tunnel curvature are usable features for building a model that distinguishes SSBs from DSBs, while bottleneck radius lacks distinguishing ability.
Since the distributions of tunnel length and tunnel curvature are very similar, we further investigated the correlation between these two features. The results, listed in Table 2, show that the two features are positively correlated with each other.
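For reference, a Pearson coefficient between two feature vectors can be computed as in the sketch below. The input values are made up for illustration and are not the paper's data; the P-value reported alongside the coefficient is not computed here.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


# Hypothetical tunnel lengths and curvatures of five proteins.
lengths = [12.0, 25.0, 33.0, 47.0, 62.0]
curvatures = [1.1, 1.4, 1.3, 1.8, 2.0]
r = pearson(lengths, curvatures)
```

A coefficient close to 1 for two features suggests that one of them adds little information once the other is in the model.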
Figure 4

Feature distributions of different kinds of DNA-binding proteins. These graphs show box plots of the four features for the HOLO and APO datasets: (a) tunnel bottleneck radius, (b) tunnel length, (c) tunnel curvature and (d) TM-score.

Table 2

Correlation of tunnel length and tunnel curvature

Dataset    Protein type   Pearson coefficient   P-value
HOLO set   DSBs           0.6929                2.3752e-23
HOLO set   SSBs           0.4484                0.0054
APO set    DSBs           0.9293                7.7890e-23
APO set    SSBs           0.5599                2.5705e-04

This table shows the values of the Pearson coefficient and P-value between tunnel length and curvature. The columns of Pearson coefficient and P-value correspond to the DSBs and SSBs in the HOLO set and APO set, respectively.

Validation of the differentiating features

We carried out validation experiments on the HOLO set and the APO set using one, two or three features to construct the classification models. The validation performances are shown in Tables 3 and 4, respectively. The tables show that the TM-Score feature recognizes SSBs with high accuracy, whereas the tunnel length and curvature features recognize DSBs with high accuracy, meaning that the distinguishing abilities of TM-Score and length/curvature are complementary. The model built with the length feature performs better than the one built with curvature, and better than or nearly equal to the one built with both length and curvature, further confirming that the curvature feature is redundant with the length feature and that adding redundant features to the classification model does not necessarily improve it. Compared with the single-feature models, performance improves significantly when TM-Score is used together with one or more other features, showing that building classification models from complementary features is preferable for discriminating DSBs from SSBs.
Table 3

Performance on HOLO set

Feature                       ACC      SN       SP       AUC      MCC       F1
Length                        0.7470   0.7725   0.7258   0.7539   0.5207    0.7681
Curvature                     0.6808   0.7058   0.6525   0.6818   0.3760    0.6949
TM-Score                      0.7054   1.0000   0.4050   0.6629   0.5018    0.7823
Length+Curvature              0.7725   0.7925   0.7508   0.7738   0.5633    0.7889
Length+TM-Score               0.8434   0.8308   0.8583   0.8476   0.7012    0.8472
Curvature+TM-Score            0.7824   0.7750   0.7925   0.7866   0.5848    0.7903
Length+Curvature+TM-Score     0.8686   0.8725   0.8608   0.8710   0.7497    0.8782
Baseline: Random 1 feature    0.5040   0.5082   0.4859   0.4967   -0.0056   0.5553
Baseline: Random 2 features   0.4951   0.4907   0.5137   0.5028   0.0040    0.5697
Baseline: Total 3 features    0.4938   0.4914   0.5036   0.4968   -0.0043   0.5789

Table 4

Performance on APO set

Feature                       ACC      SN       SP       AUC      MCC       F1
Length                        0.6533   0.4783   0.8267   0.6548   0.3205    0.5732
Curvature                     0.5487   0.3033   0.8000   0.5676   0.1119    0.4005
TM-Score                      0.8015   0.9525   0.6483   0.8117   0.6424    0.8425
Length+Curvature              0.6401   0.4658   0.8192   0.6491   0.3101    0.5689
Length+TM-Score               0.8543   0.9800   0.7242   0.8511   0.7378    0.8848
Curvature+TM-Score            0.8518   0.9850   0.7167   0.8592   0.7365    0.8836
Length+Curvature+TM-Score     0.8310   0.9533   0.7058   0.8286   0.6934    0.8620
Baseline: Random 1 feature    0.4990   0.4991   0.4988   0.4990   -0.0025   0.4712
Baseline: Random 2 features   0.4991   0.4973   0.5016   0.4998   -0.0014   0.4875
Baseline: Total 3 features    0.5019   0.5054   0.4973   0.5019   0.0028    0.5035

Independent test on APO set

In many cases, it is easier to collect information on DNA binding proteins in the bound form than in the unbound form, yet we need to know whether an unknown unbound protein is an SSB or a DSB. We therefore trained the classifier on the HOLO set and tested it on the APO set. The results, listed in Table 5, show that the structural information on tunnels and OB-folds does reflect the differences between SSBs and DSBs and can thus be used as discriminant features to build the classification model.
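Scoring a held-out test of this kind reduces to counting the four outcomes and applying the evaluation measures. The sketch below assumes the standard confusion-matrix definitions of ACC, SN, SP, F1 and MCC, with SSBs as the positive class as in the Methods; the counts are hypothetical and are not taken from Table 5.

```python
import math


def metrics(tp, fp, tn, fn):
    """Standard measures from confusion counts (SSB = positive, DSB = negative)."""
    acc = (tp + tn) / (tp + fp + tn + fn)
    sn = tp / (tp + fn)                      # sensitivity: fraction of SSBs recovered
    sp = tn / (tn + fp)                      # specificity: fraction of DSBs recovered
    f1 = 2 * tp / (2 * tp + fp + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"ACC": acc, "SN": sn, "SP": sp, "F1": f1, "MCC": mcc}


# Hypothetical outcome of testing on 38 SSBs and 51 DSBs.
m = metrics(tp=30, fp=11, tn=40, fn=8)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

Unlike accuracy, MCC stays near zero for a classifier that ignores the features, which is why it is a useful companion measure when the two classes differ in size.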
Table 5

Performance of the independent test

Feature                       ACC      SN       SP       AUC      MCC       F1
Length+TM-Score               0.7191   0.8235   0.5789   0.7012   0.7097    0.4179
Curvature+TM-Score            0.6854   0.7451   0.6053   0.6752   0.6389    0.3531
Length+Curvature+TM-Score     0.7303   0.7647   0.6842   0.7245   0.6842    0.4489
Baseline: Random 1 feature    0.5009   0.5059   0.4943   0.4997   0.0004    0.4927
Baseline: Random 2 features   0.5011   0.5014   0.5007   0.5010   0.0022    0.5312
Baseline: Total 3 features    0.5006   0.5041   0.4958   0.5007   -0.0002   0.5323

Prediction on mixed set

In practice, the available dataset often includes proteins in both bound and unbound forms, and we still need to know whether an unknown DNA binding protein is an SSB or a DSB. We therefore carried out validation experiments on the mixed set using one, two or three features to construct the classification models. The results are listed in Table 6. Among the single features, TM-Score still recognizes SSBs with high accuracy. Compared with the single-feature models, the best model, which uses all three features, performs much better, with an accuracy of 0.8251, MCC of 0.6632, SN of 0.8605 and SP of 0.7904. We then trained the classifier on the mixed set and used it to predict the 727 unknown proteins. The classification results are listed in Additional file 2.
Table 6

Performance on mixed set

Feature                       ACC      SN       SP       AUC      MCC       F1
Length                        0.6161   0.7132   0.5170   0.6237   0.2446    0.6548
Curvature                     0.5984   0.6887   0.5063   0.6097   0.2018    0.6374
TM-Score                      0.7558   0.9846   0.5230   0.7571   0.5756    0.8104
Length+Curvature              0.6114   0.7004   0.5211   0.6169   0.2300    0.6457
Length+TM-Score               0.7984   0.8880   0.7089   0.8076   0.6222    0.8259
Curvature+TM-Score            0.7701   0.9407   0.5980   0.7802   0.5844    0.8133
Length+Curvature+TM-Score     0.8251   0.8605   0.7904   0.8265   0.6632    0.8372
Baseline: Random 1 feature    0.5030   0.5070   0.4918   0.4992   -0.0011   0.5201
Baseline: Random 2 features   0.4996   0.4986   0.5024   0.5001   0.0013    0.5369
Baseline: Total 3 features    0.5014   0.5021   0.4992   0.5007   0.0015    0.5525

Conclusion

Despite many similar properties, dsDNA and ssDNA are distinct entities that are recognized differently by specialized dsDNA and ssDNA binding proteins, respectively. The binding interfaces of SSBs and DSBs are thus expected to differ in their geometrical features, consistent with the different nature of dsDNA and ssDNA [29,38,39]. While the sequence and structural properties of DSB and SSB binding interfaces have been studied during the last decade [28,40], little research has addressed distinguishing DSB and SSB binding interfaces computationally. In this study, we investigated surface tunnel features of SSBs and DSBs and found that they have different ranges of tunnel lengths and tunnel curvatures; moreover, the alignment results with OB-fold templates were also found to be a discriminative feature of SSBs and DSBs. We therefore made a first attempt at a method that computationally distinguishes SSBs from DSBs based on these discriminant features, and obtained satisfactory results. Protein surface features should also be useful for analysing other types of molecular interactions, such as protein-ligand, protein-RNA, and protein-protein complexes, and for studying a variety of proteins, multiple binding sites or a specific family of proteins. These problems would require modelling interface surfaces with different characteristics, such as compatibility, different sizes, and cooperativity between surfaces; thus new surface features in addition to the solid angle may be needed.

Abbreviations

DSBs: double-stranded DNA binding proteins; SSBs: single-stranded DNA binding proteins; ssDNA: single-stranded DNA; dsDNA: double-stranded DNA; OB-fold: oligonucleotide/oligosaccharide binding fold; ACC: accuracy; SN: sensitivity; SP: specificity; F1: F-measure; MCC: Matthews correlation coefficient; AUC: area under the receiver operating characteristic curve; TP: true positive; FP: false positive; TN: true negative; FN: false negative.

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

W.W. and J.L. contributed to the software design and testing. W.W. and X.Z. implemented the software. W.W. and J.L. wrote this paper. All authors read and approved the final manuscript.

Additional file 1

This file contains the complete list of PDB codes for the set of DNA-binding proteins.

Additional file 2

This file describes the classification results for the unknown proteins obtained with the mixed set classifier.
  39 in total

1.  The Protein Data Bank.

Authors:  H M Berman; J Westbrook; Z Feng; G Gilliland; T N Bhat; H Weissig; I N Shindyalov; P E Bourne
Journal:  Nucleic Acids Res       Date:  2000-01-01       Impact factor: 16.971

Review 2.  Nucleic acid recognition by OB-fold proteins.

Authors:  Douglas L Theobald; Rachel M Mitton-Fry; Deborah S Wuttke
Journal:  Annu Rev Biophys Biomol Struct       Date:  2003-02-18

3.  Structure of PolC reveals unique DNA binding and fidelity determinants.

Authors:  Ronald J Evans; Douglas R Davies; James M Bullard; Jeffrey Christensen; Louis S Green; Joseph W Guiles; Janice D Pata; Wendy K Ribble; Nebojsa Janjic; Thale C Jarvis
Journal:  Proc Natl Acad Sci U S A       Date:  2008-12-23       Impact factor: 11.205

4.  HOLE: a program for the analysis of the pore dimensions of ion channel structural models.

Authors:  O S Smart; J G Neduvelil; X Wang; B A Wallace; M S Sansom
Journal:  J Mol Graph       Date:  1996-12

5.  Protein clefts in molecular recognition and function.

Authors:  R A Laskowski; N M Luscombe; M B Swindells; J M Thornton
Journal:  Protein Sci       Date:  1996-12       Impact factor: 6.725

6.  Structure of the single-stranded-DNA-binding domain of replication protein A bound to DNA.

Authors:  A Bochkarev; R A Pfuetzner; A M Edwards; L Frappier
Journal:  Nature       Date:  1997-01-09       Impact factor: 49.962

7.  The crystal structure of the complex of replication protein A subunits RPA32 and RPA14 reveals a mechanism for single-stranded DNA binding.

Authors:  A Bochkarev; E Bochkareva; L Frappier; A M Edwards
Journal:  EMBO J       Date:  1999-08-16       Impact factor: 11.598

8.  Structural dynamics and single-stranded DNA binding activity of the three N-terminal domains of the large subunit of replication protein A from small angle X-ray scattering.

Authors:  Dalyir I Pretto; Susan Tsutakawa; Chris A Brosey; Amalchi Castillo; Marie-Eve Chagot; Jarrod A Smith; John A Tainer; Walter J Chazin
Journal:  Biochemistry       Date:  2010-04-06       Impact factor: 3.162

9.  CAVER: a new tool to explore routes from protein clefts, pockets and cavities.

Authors:  Martin Petrek; Michal Otyepka; Pavel Banás; Pavlína Kosinová; Jaroslav Koca; Jirí Damborský
Journal:  BMC Bioinformatics       Date:  2006-06-22       Impact factor: 3.169

10.  TM-align: a protein structure alignment algorithm based on the TM-score.

Authors:  Yang Zhang; Jeffrey Skolnick
Journal:  Nucleic Acids Res       Date:  2005-04-22       Impact factor: 16.971
