E Ashrafi, A Alemzadeh, M Ebrahimi, E Ebrahimie, N Dadkhodaei, M Ebrahimi.
Abstract
Phytoremediation refers to the use of plants for extraction and detoxification of pollutants, providing a new and powerful weapon against a polluted environment.
Keywords: ATPase pumps; bioinformatics; environment; heavy metals; modeling; transporter
Year: 2011 PMID: 21573033 PMCID: PMC3091408 DOI: 10.4137/BBI.S6206
Source DB: PubMed Journal: Bioinform Biol Insights ISSN: 1177-9322
The most important protein features for discriminating hyperaccumulator pumps from nonhyperaccumulators, as identified by different weighting algorithms (a value nearer to 1 indicates that the attribute is more effective in generating a hyperaccumulator pump).
| Gly-Glu count | 1.00 | |
| Ser-Tyr count | 0.91 | |
| Lys-Ser count | 0.88 | |
| Lys count | 0.84 | |
| Frequency of Cys-Glu, Cys count | 0.83 | |
| Frequency of Lys-Ser | 0.80 | |
| Ile-Cys count, frequency of Asp-Cys | 0.79 | |
| Frequency of Asn-Lys | 0.7 | |
| Asp-Cys count, hydrophilic residues, Gly-Asn | 0.75 | |
| Frequency of Phe-Cys, Asn-Lys count | 0.74 | |
| Ser count | 0.73 | |
| Asp-Ser count | 0.71 | |
| Frequency of Cys | 0.70 | |
| Glu-Asn count | 1.00 | |
| Frequency of Glu-Asn | 0.92 | |
| Ser-Ala count | 0.88 | |
| Gly-Asp count | 0.86 | |
| Gly-Pro count | 0.85 | |
| Frequency of Ser-Ala | 0.83 | |
| Gln-Val count | 0.79 | |
| Leu-Gln count | 0.78 | |
| Frequency of Gly-Pro | 0.74 | |
| Frequency of Ser-Ser; Ser-Asn Gln-Ile and Ser-Asn counts | 0.73 | |
| Glu-Lys count; frequency of Gln-Val | 0.72 | |
| Frequency of Ser-Cys; Val-Phe, Arg-Leu, Asp-Pro counts | 0.71 | |
| Frequency of Val-Ser, Asp-Pro | 0.7 | |
| Frequency of Phe-His | 1.00 | |
| Phe-His count | 0.83 | |
| Cys-Met count, frequency of Cys-Met | 0.73 | |
| Gly count | 1.00 | |
| Gly count | 1.00 | |
| Val-Phe count | 0.7 | |
| Gly count | 1.00 | |
| Trp-Asn count | 0.88 | |
| Trp-Tyr count | 0.78 | |
| Reduced extinction coefficient at 280 nm | 1.00 | |
| Gly count | 1.00 | |
| Val-Phe count | 0.92 | |
| Frequency of Val-Phe | 0.88 | |
| Frequency of Gln-Val | 0.57 | |
| Val-Phe count | 1.00 | |
| Frequency of Val-Phe | 0.98 | |
| Frequency of Lys-Ser | 0.96 | |
| Lys-Ser count | 0.96 | |
| Met-Lys count | 0.92 | |
| Frequency of Leu-Gln; Gly-Asn count | 0.88 | |
| Thr-Ser, Gly, Val-Glu, Asp-Pro, Gln-Ile, Gly-Pro, Phe-His, Asp-Phe, Arg-Gly, Arg-Leu, Pro-Thr counts; frequency of Gly-Arg, Asp-Phe, Arg-Leu, Val-Glu | 0.83 | |
| Ser-Cys, Ala-Leu, Gly-Trp, Lys-Pro, Phe-Ala, Tyr-Pro, Ala-His, Pro-Arg counts; frequency of Glu-Asp, Tyr-Pro | 0.77 | |
| Pro-Ile count | 0.73 | |
| Gly-Leu, Sulfur, Cys-Cys, Glu, Phe-Glu, Met-Thr, Tyr-His, Cys-Gly, Asp-Thr, Pro-Ser, Arg-Pro, Gln-Cys counts; negatively charged residues, Leu-His, Cys-Pro, Ser-Thr; frequency of His-Glu, Asp, Thr-Ala; negatively charged residues, Pro, Cys-Cys, Trp-Lys, Asp-Thr, Gln-Cys, Trp, Leu-Trp, Pro-Thr | 0.71 |
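Most attributes in the table above are dipeptide counts (for example "Gly-Glu count") or dipeptide frequencies (for example "Frequency of Lys-Ser"). The following is a minimal sketch of how such features could be derived from a sequence written in one-letter codes. It assumes that dipeptides are counted as overlapping adjacent pairs and that frequency means the count divided by the number of adjacent pairs; the source does not state the exact definition, and the function name and toy sequence are hypothetical.

```python
from collections import Counter


def dipeptide_features(sequence: str) -> dict:
    """Return the count and frequency of every overlapping dipeptide.

    Assumption: frequency = count / (number of adjacent residue pairs).
    """
    seq = sequence.upper()
    # Every window of two adjacent residues is one dipeptide occurrence.
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    counts = Counter(pairs)
    total = len(pairs)
    return {
        pair: {"count": n, "frequency": n / total}
        for pair, n in counts.items()
    }


# Hypothetical toy sequence, not one of the P1-ATPase sequences in the study.
toy_sequence = "GEGEKSC"
features = dipeptide_features(toy_sequence)
print(features["GE"])  # Gly-Glu occurs twice among six adjacent pairs
```

A sequence shorter than two residues yields an empty dictionary, so no division by zero occurs.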
Figure 1. Tree induced by decision tree algorithm on discretized data with gain ratio criterion.
Abbreviations: H, hyperaccumulator; T, tolerant.
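The gain ratio criterion named in the Figure 1 caption can be illustrated with a short calculation. The sketch below uses the textbook definition (information gain divided by the split information of the attribute) on a hypothetical discretized attribute; the toy bins and class labels are invented for illustration and are not data from the study, and the software the authors used may differ in detail.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of discrete labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(values, labels):
    """Information gain of splitting on `values`, divided by split information."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted entropy of the class labels after the split.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    # Split information penalizes attributes with many small branches.
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical discretized attribute and pump classes (H or T).
bins = ["low", "low", "mid", "mid", "high", "high"]
classes = ["T", "T", "H", "H", "H", "T"]
print(round(gain_ratio(bins, classes), 3))
```

A tree learner evaluates this score for every candidate attribute and splits on the one with the highest value.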
Mean ± standard error of the mean for performances of rule induction and tree induction models.
| Discretized | 66.62 | 10.24 | 69.24 | 9.35 | 82.78 | 13.65 | 84.0 | 12.4 | 62.9 | 12.1 | 49.2 | 21.9 | |||
| Numerical | 70.67 | 8.27 | 77.18 | 9.24 | 74.94 | 7.12 | 83.1 | 0.08 | 68.1 | 11.7 | 59.3 | 11.6 | |||
| Decision tree | Gain ratio | 72.57 | 6.25 | 86.34 | 12.23 | 69.11 | 16.48 | 0.847 | 0.044 | 0.745 | 0.058 | 0.685 | 0.069 | ||
| Information gain | 75.24 | 10.66 | 78.55 | 11.21 | 81.47 | 13.61 | 0.870 | 0.105 | 0.679 | 0.165 | 0.649 | 0.180 | |||
| Gini index | 69.57 | 10.32 | 72.93 | 13.36 | 87.22 | 14.24 | 0.787 | 0.216 | 0.628 | 0.168 | 0.479 | 0.166 | |||
| Accuracy | 56.86 | 12.11 | 70.95 | 17.63 | 51.00 | 16.76 | 0.774 | 0.110 | 0.677 | 0.129 | 0.617 | 0.137 | |||
| ID3 | Gain ratio | 74.00 | 9.73 | 84.62 | 10.67 | 71.44 | 10.59 | 0.889 | 0.109 | 0.654 | 0.181 | 0.569 | 0.212 | ||
| Information gain | 80.86 | 8.94 | 89.74 | 10.23 | 78.28 | 12.83 | 0.936 | 0.092 | 0.716 | 0.179 | 0.718 | 0.139 | |||
| Gini index | 74.00 | 17.05 | 84.94 | 15.07 | 70.14 | 19.04 | 0.769 | 0.119 | 0.633 | 0.168 | 0.519 | 0.222 | |||
| Accuracy | 65.10 | 8.90 | 84.57 | 11.24 | 53.50 | 14.22 | 0.808 | 0.082 | 0.612 | 0.153 | 0.526 | 0.146 | |||
| Decision tree | Gain ratio | 80.10 | 10.34 | 91.89 | 10.81 | 75.53 | 17.07 | 0.960 | 0.067 | 0.524 | 0.065 | 0.658 | 0.166 | ||
| Information gain | 81.38 | 8.93 | 86.84 | 9.56 | 82.47 | 12.39 | 0.917 | 0.073 | 0.750 | 0.144 | 0.693 | 0.152 | |||
| Gini index | 73.05 | 8.60 | 78.58 | 11.65 | 78.28 | 6.79 | 0.861 | 0.079 | 0.705 | 0.140 | 0.641 | 0.124 | |||
| Accuracy | 74.62 | 9.21 | 81.31 | 5.66 | 74.97 | 15.92 | 0.863 | 0.042 | 0.771 | 0.075 | 0.711 | 0.088 | |||
| ID3 | Gain ratio | 80.14 | 10.08 | 86.17 | 10.08 | 80.67 | 14.96 | 0.950 | 0.066 | 0.507 | 0.022 | 0.638 | 0.161 | ||
| Information gain | 80.10 | 8.80 | 88.38 | 8.80 | 77.17 | 13.30 | 0.930 | 0.103 | 0.706 | 0.183 | 0.705 | 0.151 | |||
| Gini index | 82.24 | 7.52 | 90.32 | 7.49 | 79.56 | 12.93 | 0.917 | 0.116 | 0.684 | 0.159 | 0.703 | 0.122 | |||
| Accuracy | 80.29 | 9.94 | 90.97 | 8.02 | 75.22 | 16.14 | 0.931 | 0.075 | 0.843 | 0.085 | 0.761 | 0.113 | |||
Abbreviations: SE, standard error of the mean; AUC, area under curve.
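Each performance value in the table above is paired with a standard error of the mean. As a minimal sketch of that summary, the code below computes the mean and SE from a list of per-fold scores, assuming the scores come from repeated validation folds. The number of folds and the scores themselves are hypothetical, since this record does not give the validation scheme.

```python
import math
import statistics


def mean_and_sem(values):
    """Return (mean, standard error of the mean) for a list of scores.

    SE = sample standard deviation / sqrt(number of scores).
    """
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


# Hypothetical per-fold accuracies (percent), not values from the study.
fold_accuracies = [70.0, 80.0, 75.0, 85.0, 90.0]
mean, sem = mean_and_sem(fold_accuracies)
print(f"{mean:.2f} ± {sem:.2f}")
```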
Rule sets (with support >50%) induced by FP-growth itemset mining on discretized data (Pt was cation transport ATPase (P-type) family; group was animal was Cu transporter (low 0–0.35, mid 0.35–0.5, high >0.5)).
| 0.856 | Pro-Cys count was mid | ||
| 0.842 | Protein family was Pt | ||
| 0.801 | Frequency of Pro-Cys was mid | ||
| 0.801 | Pro-Cys count was mid | Frequency of Pro-Cys was mid | |
| 0.705 | Pro-Cys count was mid | Protein family was Pt | |
| 0.685 | Frequency of Gly-Ile was mid | ||
| 0.664 | Group was animal | ||
| 0.664 | Protein family was Pt | Frequency of Pro-Cys was mid | |
| 0.664 | Pro-Cys count was mid | Protein family was Pt | Frequency of Pro-Cys was mid |
| 0.630 | Leu-Val count was mid | ||
| 0.623 | Val-Leu count was high | ||
| 0.616 | Frequency of Gly-Thr was mid | ||
| 0.589 | Pro-Cys count was mid | Frequency of Gly-Ile was mid | |
| 0.589 | Protein family was Pt | Group was animal | |
| 0.582 | Frequency of Thr-Gly was high | ||
| 0.582 | Frequency of Leu-Ile was mid | ||
| 0.582 | Gly-Ile count was high | ||
| 0.575 | Frequency of Ile-Val was high | ||
| 0.575 | Frequency of Lys-Arg was mid | ||
| 0.575 | Frequency of His-Pro was mid | ||
| 0.575 | Frequency of Pro-Cys was mid | Frequency of Gly-Ile was mid | |
| 0.575 | Pro-Cys count was mid | Frequency of Pro-Cys was mid | Frequency of Gly-Ile was mid |
| 0.568 | Frequency of Leu-Val was mid | ||
| 0.568 | Leu-Ile count was mid | ||
| 0.568 | Pro-Cys count was mid | Group was animal | |
| 0.568 | Protein family was Pt | Frequency of Gly-Ile was mid | |
| 0.562 | Thr-Leu count was mid | ||
| 0.562 | Pro-Cys count was mid | Leu-Val count was mid | |
| 0.562 | Frequency of Pro-Cys was mid | Group was animal | |
| 0.562 | Pro-Cys count was mid | Frequency of Pro-Cys was mid | Group was animal |
| 0.555 | Thr-Val count was high | ||
| 0.555 | Thr-Gly count was high | ||
| 0.555 | Lys-Arg count was mid | ||
| 0.548 | Frequency of Val-Leu was high | ||
| 0.548 | Frequency of Thr-Arg was mid | ||
| 0.548 | Pro-Cys count was mid | Frequency of Gly-Thr was mid | |
| 0.541 | Frequency of Ile-Gly was mid | ||
| 0.541 | Ile-Pro count was mid | ||
| 0.541 | Leu-Met count was mid | ||
| 0.541 | Protein family was Pt | Val-Leu count was high | |
| 0.534 | Frequency of Val-Val was mid | ||
| 0.534 | Frequency of Gly-Leu was mid | ||
| 0.534 | Ile-Val count was high | ||
| 0.527 | Frequency of Thr-Val was mid | ||
| 0.527 | Frequency of Leu-Ala was mid | ||
| 0.527 | His-Pro count was mid | ||
| 0.527 | Pro-Cys count was mid | Val-Leu count was high | |
| 0.527 | Pro-Cys count was mid | Frequency of Ile-Val was high | |
| 0.521 | Ile-Gly count was mid | ||
| 0.521 | Phe-Gly count was mid | ||
| 0.521 | Pro-Cys count was mid | Frequency of His-Pro was mid | |
| 0.521 | Protein family was Pt | Frequency of His-Pro was mid | |
| 0.521 | Frequency of Pro-Cys was mid | Leu-Val count was mid | |
| 0.521 | Pro-Cys count was mid | Frequency of Pro-Cys was mid | Leu-Val count was mid |
| 0.514 | Frequency of Pro-Val was mid | ||
| 0.514 | Ala-Gln count was mid | ||
| 0.514 | Pro-Cys count was mid | Frequency of Thr-Gly was high | |
| 0.514 | Pro-Cys count was mid | Gly-Ile count was high | |
| 0.514 | Frequency of Pro-Cys was mid | Val-Leu count was high | |
| 0.514 | Pro-Cys count was mid | Frequency of Pro-Cys was mid | Val-Leu count was high |
| 0.507 | Frequency of Val-Ser was mid | ||
| 0.507 | Frequency of Thr-Cys was mid | ||
| 0.507 | Frequency of Phe-Gly was mid | ||
| 0.507 | Val-Gly count was mid | ||
| 0.507 | Val-Glu count was mid | ||
| 0.507 | Thr-Ala count was high | ||
| 0.507 | Leu-Gly count was mid | ||
| 0.507 | Leu-Ala count was high | ||
| 0.507 | Ala-Thr count was mid | ||
| 0.507 | Pro-Cys count was mid | Frequency of Leu-Ile was mid | |
| 0.507 | Protein family was Pt | Leu-Val count was mid | |
| 0.507 | Protein family was Pt | Frequency of Ile-Val was high | |
| 0.507 | Frequency of Pro-Cys was mid | Frequency of Gly-Thr was mid | |
| 0.507 | Frequency of Pro-Cys was mid | Frequency of Ile-Val was high | |
| 0.507 | Frequency of Ile-Val was high | Ile-Val count was high | |
| 0.507 | Frequency of Lys-Arg was mid | Lys-Arg count was mid |
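The support values in the table above are the fraction of sequences in which every item of a set holds at once. The sketch below illustrates this with the bin limits from the caption (low 0–0.35, mid 0.35–0.5, high >0.5). It enumerates candidate itemsets by brute force rather than implementing FP-growth itself, which suffices for a toy example but would not scale to the full attribute set. The assumption that attribute values are scaled to the 0–1 range, the handling of bin boundaries, and the toy rows are all hypothetical.

```python
from itertools import combinations


def discretize(value: float) -> str:
    """Map a value to the caption's bins (boundary handling is an assumption)."""
    if value <= 0.35:
        return "low"
    if value <= 0.5:
        return "mid"
    return "high"


def frequent_itemsets(transactions, min_support=0.5, max_size=3):
    """Brute-force search for itemsets with support above min_support."""
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    result = {}
    for size in range(1, max_size + 1):
        for candidate in combinations(items, size):
            # Support = share of transactions containing every item.
            support = sum(set(candidate) <= t for t in transactions) / n
            if support > min_support:
                result[candidate] = support
    return result


# Hypothetical scaled attribute values for four sequences.
rows = [
    {"Pro-Cys count": 0.40, "Val-Leu count": 0.70},
    {"Pro-Cys count": 0.45, "Val-Leu count": 0.60},
    {"Pro-Cys count": 0.42, "Val-Leu count": 0.20},
    {"Pro-Cys count": 0.10, "Val-Leu count": 0.80},
]
transactions = [
    {f"{name} was {discretize(value)}" for name, value in row.items()}
    for row in rows
]
itemsets = frequent_itemsets(transactions)
for itemset, support in itemsets.items():
    print(f"{support:.3f}", " & ".join(itemset))
```

Because a superset can never be more frequent than any of its subsets, every multi-item row in the table has a support no higher than the rows for its individual items.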
Accession, metals, type of pump, and organism of each amino acid sequence of P1-ATPase.
| Accession | Metals | Type of pump | Organism |
| --- | --- | --- | --- |
| Q70Q04 | Zn/Cd | H | |
| Q9UVL6 | Cu | H | |
| Q9P983 | Cd | H | |
| Q9P458 | Cu | H | |
| Q96WX2 | Cu | H | |
| Q941L1 | Cu | H | |
| Q92T56 | Zn/Cd/Pb | H | |
| Q8ZS90 | Cu/Ag | H | |
| Q8H028 | Cu | H | |
| Q88CP1 | Cd | H | |
| Q7XU05 | Cu | H | |
| Q70LF4 | Zn/Cd | H | |
| Q6ZDR8 | Cu | H | |
| Q6JAG2 | Cu | H | |
| Q6H7M3 | Cu | H | |
| Q6H6Z1 | Cu | H | |
| Q69AX6 | Zn/Cd/Co | H | |
| Q655X4 | Cu | H | |
| Q5AQ24 | Cu | H | |
| Q5API0 | Cu | H | |
| Q59465 | Zn/Cd/Co | H | |
| Q59385 | Cu | H | |
| Q4WQF3 | Cu | H | |
| Q3ZDL9 | Zn/Cd | H | |
| Q2I7E8 | Cd | H | |
| Q10QZ3 | Cu | H | |
| Q10QZ2 | Cu | H | |
| Q0JB51 | Cu | H | |
| Q0E3J1 | Cu | H | |
| Q0DAA4 | Cu | H | |
| P38360 | Cd | H | |
| B8BBV4 | Cu/Ag | H | |
| B8B185 | Cu/Ag | H | |
| B8APM8 | Cu/Ag | H | |
| B8AIJ3 | Cu/Ag | H | |
| B8ADR7 | Cu/Ag | H | |
| B6HT11 | Cu | H | |
| B6HC49 | Cu | H | |
| B6H689 | Cu | H | |
| B6H165 | Cu | H | |
| B6GWG5 | Cu | H | |
| B5VEN9 | Cd | H | |
| B3LML9 | Cd | H | |
| B2Y4P1 | Zn/Cd | H | |
| B2Y4N2 | Zn/Cd | H | |
| B2Y4N1 | Zn/Cd | H | |
| B2APT4 | Cu | H | |
| B2AAH3 | Cu | H | |
| B0Y4L9 | Cu | H | |
| B0XWU3 | Cu | H | |
| A6ZLN2 | Cd | H | |
| A5DRE2 | Cu | H | |
| A5DHC6 | Cu | H | |
| A3BU99 | Cu | H | |
| A3BEE3 | Cu | H | |
| A3AWA4 | Cu | H | |
| A1CL19 | Cu | H | |
| A1CII4 | Cu | H | |
| Q60048 | Cd | S | |
| Q31HQ5 | Cu/Ag | S | |
| Q31H35 | Cu2+/Cu/Mg | S | |
| Q31E73 | Cu/Ag | S | |
| Q31DS4 | Cu/Ag | S | |
| B5AXL4 | Cu | S | |
| Q9ZHC7 | Cu | T | |
| Q9SZW4 | Zn/Cd | T | |
| Q9SH30 | Cu | T | |
| Q9S7J8 | Cu | T | |
| Q9JZI0 | Cu | T | |
| Q9I147 | Zn/Cd/Pb | T | |
| Q9C594 | Cu | T | |
| Q94KD6 | Cu | T | |
| Q8ZRG7 | Cu/Ag | T | |
| Q8VPE6 | Cu2+/Cu/Ag | T | |
| Q8RVG7 | Cd | T | |
| Q8LPW1 | Zn/Cd | T | |
| Q8L158 | Zn/Cd | T | |
| Q8H384 | Zn/Cd | T | |
| Q88RT8 | Co | T | |
| Q830Z1 | Co | T | |
| Q7Y051 | Cu | T | |
| Q7SGS2 | Cu | T | |
| Q7S316 | Zn/Cd/Pb | T | |
| Q7RZE4 | Cu | T | |
| Q7A3E6 | Cu | T | |
| Q75C31 | Cu | T | |
| Q750J2 | Cu | T | |
| Q72N56 | Cu/Ag | T | |
| Q6MK07 | Cu/Ag | T | |
| Q6JAH7 | Cu | T | |
| Q6JAG3 | Cu | T | |
| Q6CS43 | Cu | T | |
| Q6CKX1 | Cu | T | |
| Q6BVG6 | Cu | T | |
| Q6BIS6 | Cu | T | |
| Q654Y9 | Co | T | |
| Q5K722 | Cu | T | |
| Q58AE3 | Cu/Ag/Zn/Cd/Pb | T | |
| Q4WYE4 | Cu | T | |
| Q4PI36 | Cu | T | |
| Q4PFU4 | Cu | T | |
| Q3MNJ6 | Cu/Ag | T | |
| Q3E9R8 | Cu | T | |
| Q12685 | Cu | T | |
| Q0WUP4 | Cd | T | |
| Q0WPL5 | Cu | T | |
| Q0D7L9 | Zn/Cd/Pb | T | |
| P37617 | Zn/Cd/Pb/Au | T | |
| P32113 | Cu | T | |
| P20021 | Cd | T | |
| P0A503 | Zn/Cd/Pb | T | |
| P05425 | Cu2+/Cu/Ag | T | |
| O67432 | Cu/Ag | T | |
| O67203 | Cu2+ | T | |
| O64474 | Zn/Cd | T | |
| O32220 | Cu | T | |
| O32219 | Zn/Cd/Co | T | |
| O31688 | Co | T | |
| B9WHL7 | Cu | T | |
| B9W8U7 | Cu | T | |
| B8PIS7 | Cu | T | |
| B8PD13 | Cu | T | |
| B8B248 | Zn/Cd/Pb | T | |
| B8B1T9 | Co | T | |
| B6TVS8 | Cu | T | |
| B6K2D1 | Cu | T | |
| B5AXM3 | Cu | T | |
| B5AXJ3 | Cu | T | |
| B5AXJ0 | Cu | T | |
| B5AXI8 | Cu | T | |
| B5AXI7 | Cu | T | |
| B5AXI6 | Cu | T | |
| B4FW89 | Co | T | |
| B3LG21 | Cu | T | |
| A9NIX0 | Zn/Cd/Pb | T | |
| A8FHF8 | Cu/Ag | T | |
| A8FHE7 | Zn/Cd/Co | T | |
| A8FCJ1 | Co | T | |
| A7ISW5 | Cu | T | |
| A6ZYM2 | Cu | T | |
| A5E2U1 | Cu | T | |
| A5E1L1 | Cu | T | |
| A3LVL5 | Cu | T | |
| A3LRS8 | Cu | T | |
| A3GG72 | Cu | T | |
| A3BI12 | Zn/Cd/Pb | T | |
| A3BF39 | Zn/Cd/Pb | T | |
| A2YJN9 | Zn/Cd/Pb | T | |
| A2YED2 | Zn/Cd/Pb | T | |
| A1D6E8 | Cu | T | |
| A1CW79 | Cu | T | |
| Q8J286 | Cu | ||
| Q0WXV8 | Cu | ||
| Q0SAU6 | Cu/Ag | ||
| B8PCW0 | Zn/Cd/Pb | ||
| B2WP89 | Cu | ||
| B2WCY5 | Cu | ||
| B2W577 | Cu | ||
| B0STR2 | Cu | ||
| A7TLU7 | Cu | ||
| A7JVC8 | Cu | ||
| A6SEF3 | Cu | ||
| A6SAI2 | Cu | ||
| A6RXG0 | Cu | ||
| A6RAT8 | Cu | ||
| A6R8J5 | Cu | ||
| A4RDM4 | Cu | ||
| A4QR04 | Cu |
Abbreviations: H, hyperaccumulator; T, tolerant; S, sensitive.
Standard amino acid abbreviations.
| Amino acid | Three-letter code | One-letter code |
| --- | --- | --- |
| Alanine | Ala | A |
| Arginine | Arg | R |
| Asparagine | Asn | N |
| Aspartic acid | Asp | D |
| Cysteine | Cys | C |
| Glutamic acid | Glu | E |
| Glutamine | Gln | Q |
| Glycine | Gly | G |
| Histidine | His | H |
| Isoleucine | Ile | I |
| Leucine | Leu | L |
| Lysine | Lys | K |
| Methionine | Met | M |
| Phenylalanine | Phe | F |
| Proline | Pro | P |
| Serine | Ser | S |
| Threonine | Thr | T |
| Tryptophan | Trp | W |
| Tyrosine | Tyr | Y |
| Valine | Val | V |
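The abbreviation table above links the three-letter names used in the attribute tables (for example "Gly-Glu") to the one-letter codes in which protein sequences are normally stored. A minimal sketch of that conversion follows; the dictionary is taken directly from the table, while the function name is hypothetical.

```python
# Three-letter to one-letter codes, as listed in the abbreviation table.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Glu": "E", "Gln": "Q", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def dipeptide_to_one_letter(name: str) -> str:
    """Convert a hyphenated three-letter name such as 'Gly-Glu' to 'GE'.

    Raises KeyError for anything that is not a standard abbreviation.
    """
    parts = [part.strip().capitalize() for part in name.split("-")]
    return "".join(THREE_TO_ONE[part] for part in parts)


print(dipeptide_to_one_letter("Gly-Glu"))  # GE
print(dipeptide_to_one_letter("Pro-Cys"))  # PC
```

Raising an error on unknown abbreviations is a deliberate choice here, since it flags misspelled residue names instead of silently skipping them.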