Chowdhury Nilkanta, Angshuman Bagchi.
Abstract
DNA-protein interactions play vital roles in the central dogma of molecular biology. Proper interactions between DNA and proteins lead to biological phenomena such as transcription, translation, and replication. However, the mechanisms of these well-known processes vary between prokaryotic and eukaryotic organisms, and their exact molecular mechanisms are unknown. It is therefore of interest to report a comparative estimate of the properties of DNA binding proteins from prokaryotic and eukaryotic organisms. We analyzed sequence-based features, such as the frequencies of amino acids and of amino acid groups, in the proteins of prokaryotes and eukaryotes using statistical measures. We document the general pattern of differences between these DNA binding proteins, with a view to developing a prediction system that discriminates between prokaryotic and eukaryotic DNA binding proteins.
Keywords: DNA binding proteins; Distribution of amino acid residues; Prokaryotic and Eukaryotic Organisms; Transcription factors
Year: 2018 PMID: 30237677 PMCID: PMC6137564 DOI: 10.6026/97320630014315
Source DB: PubMed Journal: Bioinformation ISSN: 0973-2063
Figure 1: Flowchart diagram of the in-house Python tool.
The distribution of the dataset.
| Dataset | Prokaryote | Eukaryote Set-1 | Eukaryote Set-2 |
| DNA Binding Protein (DBP) dataset | 1-270 | 1-270 | 78-347 |
| Transcription Factor (TF) dataset | 1-92 | 1-92 | 91-182 |
Figure 2: Bar-graph representation of amino acid and amino acid group frequencies in prokaryotes and eukaryotes (Blue: Prokaryote; Red: Eukaryote Set-1; Green: Eukaryote Set-2).
Figure 3: Amino acid and amino acid group frequencies from the TF dataset.
Figure 4: Amino acid and amino acid group frequencies from the DBP dataset.
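The frequency features plotted in these figures can be illustrated with a minimal sketch. It is not the authors' in-house tool: the function names and, in particular, the `GROUPS` mapping are assumptions made for illustration, since the record does not list the amino acid groups actually used.

```python
from collections import Counter

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical grouping, for illustration only; the groups used in the
# paper are not given in this record.
GROUPS = {
    "positive": "KRH",
    "negative": "DE",
    "polar": "STNQ",
    "aliphatic": "AVLIM",
    "aromatic": "FWY",
    "special": "CGP",
}


def residue_frequencies(sequence):
    """Return the relative frequency of each standard amino acid."""
    seq = sequence.upper()
    # Non-standard symbols (X, B, Z, gaps, ...) are ignored.
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def group_frequencies(sequence):
    """Sum the per-residue frequencies within each (assumed) group."""
    freqs = residue_frequencies(sequence)
    return {
        name: sum(freqs[aa] for aa in members)
        for name, members in GROUPS.items()
    }
```

Calling `residue_frequencies` on each sequence of a dataset and averaging per class would give one value per residue and class, which is the kind of quantity a bar graph like those above compares.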
Results obtained from WEKA analysis.
| (Transcription Factor Set-1) |
| Total Number of Instances | 184 |
| Correctly Classified Instances | 94.02% |
| Incorrectly Classified Instances | 5.98% |
| === Detailed Accuracy By Class === |
| | TP Rate | FP Rate | Precision | Recall | F-Measure | MCC | ROC Area | PRC Area | Class |
| | 0.924 | 0.043 | 0.955 | 0.924 | 0.939 | 0.881 | 0.94 | 0.92 | Prokaryot |
| | 0.957 | 0.076 | 0.926 | 0.957 | 0.941 | 0.881 | 0.94 | 0.908 | Eukaryot |
| Weighted Avg. | 0.94 | 0.06 | 0.941 | 0.94 | 0.94 | 0.881 | 0.94 | 0.914 | |
| (Transcription Factor Set-2) |
| Total Number of Instances | 184 |
| Correctly Classified Instances | 93.48% |
| Incorrectly Classified Instances | 6.52% |
| === Detailed Accuracy By Class === |
| | TP Rate | FP Rate | Precision | Recall | F-Measure | MCC | ROC Area | PRC Area | Class |
| | 0.924 | 0.054 | 0.944 | 0.924 | 0.934 | 0.87 | 0.935 | 0.911 | Prokaryot |
| | 0.946 | 0.076 | 0.926 | 0.946 | 0.935 | 0.87 | 0.935 | 0.902 | Eukaryot |
| Weighted Avg. | 0.935 | 0.065 | 0.935 | 0.935 | 0.935 | 0.87 | 0.935 | 0.907 | |
| (DNA Binding Protein Set-1) |
| Total Number of Instances | 540 |
| Correctly Classified Instances | 88.33% |
| Incorrectly Classified Instances | 11.67% |
| === Detailed Accuracy By Class === |
| | TP Rate | FP Rate | Precision | Recall | F-Measure | MCC | ROC Area | PRC Area | Class |
| | 0.863 | 0.096 | 0.9 | 0.863 | 0.881 | 0.767 | 0.883 | 0.845 | Prokaryot |
| | 0.904 | 0.137 | 0.868 | 0.904 | 0.886 | 0.767 | 0.883 | 0.833 | Eukaryot |
| Weighted Avg. | 0.883 | 0.117 | 0.884 | 0.883 | 0.883 | 0.767 | 0.883 | 0.839 | |
| (DNA Binding Protein Set-2) |
| Total Number of Instances | 540 |
| Correctly Classified Instances | 90% |
| Incorrectly Classified Instances | 10% |
| === Detailed Accuracy By Class === |
| | TP Rate | FP Rate | Precision | Recall | F-Measure | MCC | ROC Area | PRC Area | Class |
| | 0.904 | 0.104 | 0.897 | 0.904 | 0.9 | 0.8 | 0.9 | 0.859 | Prokaryot |
| | 0.896 | 0.096 | 0.903 | 0.896 | 0.9 | 0.8 | 0.9 | 0.861 | Eukaryot |
| Weighted Avg. | 0.9 | 0.1 | 0.9 | 0.9 | 0.9 | 0.8 | 0.9 | 0.86 | |
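As a reading aid, the per-class figures above can be reproduced from a 2x2 confusion matrix. The sketch below uses Transcription Factor Set-1 and assumes the 184 instances split into 92 prokaryotic and 92 eukaryotic proteins, as the dataset distribution table suggests. The counts of 85 and 88 correctly classified proteins are inferred from the reported TP rates; they are not stated in the source. The function name `binary_metrics` is likewise only illustrative.

```python
import math


def binary_metrics(tp, fn, fp, tn):
    """Standard metrics for the 'positive' class of a 2x2 confusion matrix."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # identical to the TP rate
    fp_rate = fp / (fp + tn)
    f_measure = 2 * precision * recall / (precision + recall)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    return {
        "precision": precision,
        "recall": recall,
        "fp_rate": fp_rate,
        "f_measure": f_measure,
        "mcc": mcc,
        "accuracy": accuracy,
    }


# Inferred counts (an assumption, see above): prokaryote is the positive class,
# with 85 of 92 prokaryotes and 88 of 92 eukaryotes classified correctly.
tf_set1 = binary_metrics(tp=85, fn=7, fp=4, tn=88)
for name, value in tf_set1.items():
    print(f"{name}: {value:.3f}")
```

Under these assumed counts, the values round to the Prokaryot row of Transcription Factor Set-1 (TP rate 0.924, FP rate 0.043, precision 0.955, F-measure 0.939, MCC 0.881) and to an overall accuracy of 94.02%.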