Zhengxi Li, Zuorui Shen, Jingjiang Zhou, Lin Field.
Abstract
Chemosensory proteins (CSPs) are identifiable by four spatially conserved cysteine residues in their primary structure or by two disulfide bridges in their tertiary structure, according to the previously identified olfactory specific-D related proteins. A genomics- and bioinformatics-based approach is taken in the present study to identify putative CSPs in the malaria-carrying mosquito, Anopheles gambiae. The results show that five of the nine annotated candidates are the most likely CSPs of A. gambiae. This study lays the foundation for further functional identification of Anopheles CSPs, though all of these candidates need additional experimental verification.
Year: 2003 PMID: 15629057 PMCID: PMC5172414 DOI: 10.1016/s1672-0229(03)01034-9
Source DB: PubMed Journal: Genomics Proteomics Bioinformatics ISSN: 1672-0229 Impact factor: 7.691
Table 1 Previously Identified Insect Chemosensory Proteins*
| Order | Species | Protein name | Length (a.a.) | Accession No. | References |
|---|---|---|---|---|---|
| Hymenoptera | | ASP3c | 130 | AF481963 | |
| Lepidoptera | | CLP-1 | 130 | U95046 | |
| | | SAP1 | 105 | AF117574 | |
| | | SAP3 | 126 | AF117585 | |
| | | SAP2 | 127 | AF117592 | |
| | | SAP5 | 231 | AF117594 | |
| | | SAP4 | 127 | AF117599 | |
| | | BmorCSP2 | 120 | AF509238 | |
| | | BmorCSP1 | 127 | AF509239 | |
| | | CSP-MbraA1 | 112 | AF211177 | |
| | | CSP-MbraA2 | 112 | AF211178 | |
| | | CSP-MbraA3 | 112 | AF211179 | |
| | | CSP-MbraA4 | 112 | AF211180 | |
| | | CSP-MbraA5 | 112 | AF211181 | |
| | | CSP-MbraB1 | 108 | AF211182 | |
| | | CSP-MbraB2 | 108 | AF211183 | |
| | | CSP-MbraA6 | 128 | AF255918 | |
| | | CSP-MbraB3 | 108 | AF255919 | |
| | | CSP-MbraB4 | 108 | AF255920 | |
| | | HvirCSP2 | 126 | AY101511 | |
| | | HvirCSP1 | 114 | AY101512 | |
| | | HvirCSP3 | 106 | AY101513 | |
| | | SAP | 111 | AY026760 | unpublished |
| | | CSP-Harm | 127 | AF368375 | unpublished |
| | | CSP-Hzea | 128 | AF448448 | unpublished |
| Diptera | | A10 | 155 | U05244 | |
| | | RH70879p | 124 | BT001865 | unpublished |
| | | PEBmeIII | 158 | U08281 | |
| | | SAP-1 | 127 | AF437891 | |
| Orthoptera | | CSP-sg1 | 109 | AF070961 | |
| | | CSP-sg2 | 109 | AF070962 | |
| | | CSP-sg3 | 103 | AF070963 | |
| | | CSP-sg4 | 109 | AF070964 | |
| | | CSP-sg5 | 109 | AF070965 | |
| | | OS-D1 | 103 | AJ251075 | |
| | | OS-D2 | 120 | AJ251076 | |
| | | OS-D3 | 125 | AJ251077 | |
| | | OS-D4 | 125 | AJ251078 | |
| | | OS-D5 | 125 | AJ251079 | |
| Phasmatodea | | CSP-ec1 | 107 | AF139196 | |
| | | CSP-ec2 | 102 | AF139197 | |
| | | CSP-ec3 | 107 | AF139198 | |
| Dictyoptera | | p10 | 130 | AF030340 | |
*GenBank (04/2003). Protein names are the names registered in GenBank; accession numbers are GenBank accession numbers.
Fig. 1 Multiple alignment showing the absolutely conserved domain in insect CSPs: Cx(6,8)Cx(18)Cx(2)Cx(3). All acronyms in the figure refer to the protein names listed in Table 1.
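The conserved-cysteine pattern in the Fig. 1 caption can be searched for in a peptide sequence with a regular expression. The following is a minimal sketch, not the CSPMOT tool named in the table below: it assumes that `x(n)` means exactly n arbitrary residues and `x(m,n)` means m to n arbitrary residues (PROSITE-like shorthand). The function names `motif_to_regex` and `find_motif` are hypothetical and introduced only for illustration.

```python
import re

# Motif exactly as written in the Fig. 1 caption.
CSP_MOTIF = "Cx(6,8)Cx(18)Cx(2)Cx(3)"


def motif_to_regex(motif: str) -> str:
    """Turn 'x(n)' / 'x(m,n)' gaps into regex wildcards; other letters stay literal.

    Assumption: 'x' stands for any residue, as in PROSITE-style patterns.
    """
    return re.sub(r"x\((\d+(?:,\d+)?)\)", lambda m: ".{" + m.group(1) + "}", motif)


def find_motif(sequence: str, motif: str = CSP_MOTIF):
    """Return 0-based, half-open (start, end) spans of non-overlapping motif matches."""
    pattern = re.compile(motif_to_regex(motif))
    return [m.span() for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    # Synthetic sequence built to satisfy the spacing; not a real CSP.
    demo = "MK" + "C" + "A" * 6 + "C" + "A" * 18 + "C" + "AA" + "C" + "AAA" + "GG"
    print(motif_to_regex(CSP_MOTIF))
    print(find_motif(demo))
```

A sequence passing this filter only shows the cysteine spacing; it would still need the similarity search and structural checks described in the tables that follow.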
Table 2 Anopheles CSP Candidates Found by CSPMOT and BLAST (E*-values<0.0001)
| Peptide ID | Previously identified insect CSPs | | | | | | |
|---|---|---|---|---|---|---|---|
| agCP10968 | SAP-1 | ASP3c | SAP2 | CSP-Harm | HvirCSP2 | OS-D3 | CSP-sg1 |
| | (6e-18) | (7e-18) | (6e-17) | (1e-16) | (2e-16) | (2e-16) | (2e-16) |
| | OS-D1 | CSP-sg4 | CSP-sg2 | CSP-MbraA6 | OS-D4 | CSP-Hzea | OS-D5 |
| | (2e-16) | (3e-16) | (3e-16) | (3e-16) | (4e-16) | (5e-16) | (3e-16) |
| | SAP4 | PEBmeIII | CSP-sg5 | OS-D2 | A10 | CSP-ec3 | CSP-MbraA3 |
| | (7e-16) | (9e-16) | (9e-16) | (2e-15) | (2e-15) | (2e-15) | (2e-15) |
| | CSP-MbraA1 | CSP-MbraA2 | CSP-sg3 | p10 | CSP-MbraA5 | SAP3 | |
| | (2e-15) | (3e-15) | (3e-15) | (3e-15) | (7e-15) | (8e-15) | |
| agCP11079 | SAP-1 | PEBmeIII | ASP3c | p10 | HvirCSP2 | SAP4 | A10 |
| | (1e-45) | (1e-34) | (2e-34) | (4e-28) | (3e-27) | (1e-26) | (2e-26) |
| | OS-D2 | CSP-MbraA6 | OS-D3 | SAP3 | BmorCSP1 | SAP5 | CSP-sg1 |
| | (6e-26) | (2e-23) | (4e-23) | (7e-23) | (8e-23) | (9e-23) | (1e-22) |
| | CLP-1 | CSP-MbraA3 | CSP-sg4 | HvirCSP1 | CSP-sg2 | OS-D4 | OS-D5 |
| | (2e-22) | (3e-22) | (4e-22) | (5e-22) | (6e-22) | (7e-22) | (7e-22) |
| | CSP-sg4 | OS-D1 | CSP-MbraA2 | CSP-sg3 | SAP2 | CSP-MbraA3 | agCP11435 |
| | (1e-21) | (1e-21) | (2e-21) | (3e-21) | (4e-21) | (4e-21) | (5e-08) |
| agCP11481 | ASP3c | PEBmeIII | A10 | SAP4 | HvirCSP2 | SAP5 | SAP3 |
| | (4e-28) | (3e-27) | (3e-26) | (3e-25) | (4e-25) | (4e-24) | (2e-23) |
| | CSP-sg2 | SAP-1 | CSP-sg1 | OS-D2 | CSP-MbraA6 | CLP-1 | CSP-sg5 |
| | (1e-22) | (1e-22) | (2e-22) | (3e-22) | (7e-22) | (1e-21) | (2e-21) |
| | CSP-sg4 | CSP-MbraA3 | CSP-sg3 | CSP-MbraA2 | OS-D3 | HvirCSP1 | CSP-MbraA1 |
| | (2e-21) | (9e-21) | (1e-20) | (2e-20) | (3e-20) | (4e-20) | (6e-20) |
| | CSP-MbraA3 | CSP-MbraB1 | SAP2 | CSP-MbraA4 | CSP-MbraA5 | BmorCSP1 | agCP11435 |
| | (7e-20) | (8e-20) | (1e-19) | (2e-19) | (2e-19) | (3e-19) | (1e-06) |
| agCP11484 | SAP-1 | ASP3c | PEBmeIII | HvirCSP2 | p10 | CSP-MbraA6 | OS-D2 |
| | (3e-55) | (1e-31) | (5e-32) | (6e-29) | (4e-28) | (3e-27) | (9e-27) |
| | SAP4 | BmorCSP1 | CLP-1 | SAP2 | A10 | SAP3 | SAP5 |
| | (1e-26) | (2e-26) | (2e-26) | (1e-25) | (6e-25) | (4e-24) | (6e-24) |
| | CSP-Hzea | CSP-MbraA2 | HvirCSP1 | CSP-MbraA1 | CSP-MbraA3 | CSP-Harm | CSP-MbraA4 |
| | (3e-23) | (5e-23) | (5e-23) | (7e-23) | (7e-23) | (9e-23) | (2e-22) |
| | CSP-MbraA5 | CSP-sg1 | CSP-sg4 | CSP-sg2 | OS-D3 | OS-D1 | CSP-ec1 |
| | (2e-22) | (4e-22) | (8e-22) | (1e-21) | (2e-21) | (3e-21) | (3e-21) |
| | CSP-sg5 | agCP11435 | | | | | |
| | (3e-21) | (2e-08) | | | | | |
| agCP11532 | RH70879 | ASP3c | SAP5 | CSP-sg4 | CSP-MbraA6 | CSP-sg5 | CSP-sg2 |
| | (5e-30) | (4e-10) | (6e-08) | (6e-08) | (1e-07) | (1e-07) | (2e-07) |
| | HvirCSP2 | CSP-sg1 | SAP2 | CSP-ec3 | BmorCSP2 | CSP-sg3 | OS-D3 |
| | (2e-07) | (3e-07) | (4e-07) | (7e-07) | (7e-07) | (9e-07) | (1e-06) |
| | CLP-1 | CSP-MbraA3 | CSP-MbraA1 | CSP-MbraA2 | CSP-MbraA4 | SAP4 | CSP-MbraA5 |
| | (1e-06) | (1e-06) | (1e-06) | (1e-06) | (1e-06) | (2e-06) | (2e-06) |
| | OS-D2 | OS-D1 | A10 | CSP-Hzea | HvirCSP3 | CSP-ec1 | |
| | (3e-06) | (5e-06) | (1e-05) | (1e-05) | (1e-05) | (3e-05) | |
| agCP11545 | SAP-1 | PEBmeIII | ASP3c | p10 | HvirCSP2 | A10 | OS-D2 |
| | (3e-40) | (2e-33) | (2e-32) | (1e-30) | (6e-30) | (1e-28) | (1e-27) |
| | SAP4 | CSP-MbraA6 | CSP-sg1 | BmorCSP1 | SAP3 | CSP-sg4 | CSP-sg2 |
| | (3e-26) | (1e-25) | (3e-24) | (3e-24) | (3e-24) | (6e-24) | (6e-24) |
| | OS-D3 | SAP5 | CSP-sg5 | CSP-sg3 | HvirCSP1 | OS-D4 | OS-D1 |
| | (1e-23) | (2e-23) | (3e-23) | (3e-23) | (4e-23) | (5e-23) | (7e-23) |
| | CSP-MbraA3 | CSP-MbraA5 | CLP-1 | CSP-ec1 | CSP-MbraA2 | CSP-Hzea | agCP11435 |
| | (2e-22) | (2e-22) | (2e-22) | (5e-22) | (7e-22) | (1e-21) | (7e-09) |
| agCP6514 | no matches | | | | | | |
| agCP12965 | no matches | | | | | | |
| agCP11435 | OS-D1 | OS-D3 | OS-D5 | OS-D4 | SAP-1 | SAP4 | |
| | (2e-12) | (5e-10) | (7e-10) | (1e-09) | (2e-09) | (2e-08) | (3e-08) |
| | PEBmeIII | OS-D2 | HvirCSP2 | ASP3c | CSP-sg5 | CSP-ec2 | CSP-sg4 |
| | (3e-08) | (4e-08) | (8e-08) | (1e-07) | (3e-07) | (5e-07) | (5e-07) |
| | CSP-sg3 | CSP-sg2 | CSP-sg1 | CSP-ec1 | CSP-ec3 | p10 | CSP-MbraA6 |
| | (6e-07) | (9e-07) | (9e-07) | (2e-06) | (2e-06) | (5e-05) | (5e-05) |
| | CSP-Hzea | CSP-Harm | CSP-MbraA3 | CSP-MbraA1 | | | |
| | (1e-04) | (1e-04) | (1e-04) | (1e-04) | | | |
No significant hits obtained by BLAST;
BLAST identified;
The E value is a parameter that describes the number of hits one can “expect” to see just by chance when searching a database of a particular size. All acronyms of insect CSPs refer to protein names listed in Table 1.
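The dependence of the E-value on database size can be made concrete with the standard bit-score form of the BLAST statistic, E = m · n · 2^(−S′), where S′ is the bit score, m the query length and n the total database length. The sketch below is illustrative only: the source does not report bit scores or database sizes, BLAST itself uses length-corrected ("effective") values of m and n, and the helper names `expect_value` and `passes_cutoff` are hypothetical. Only the 0.0001 cutoff is taken from the table title.

```python
def expect_value(bit_score: float, query_len: int, db_len: int) -> float:
    """Expected number of chance hits: E = m * n * 2**(-S').

    Assumption: raw lengths are used; BLAST applies effective-length corrections.
    """
    return query_len * db_len * 2.0 ** (-bit_score)


def passes_cutoff(e_value: float, cutoff: float = 1e-4) -> bool:
    """True if the hit is below the E-value cutoff stated in the table title."""
    return e_value < cutoff


if __name__ == "__main__":
    # Hypothetical numbers chosen only to show the effect of database size.
    for db_len in (10**6, 10**9):
        e = expect_value(bit_score=60.0, query_len=127, db_len=db_len)
        print(db_len, e, passes_cutoff(e))
```

For a fixed bit score, E grows linearly with database length, which is why the same alignment can be significant against a small database and not against a large one.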
Table 3 Chromosomal Location, New ORFs, Signal Peptides, and Biochemical Properties of Anopheles CSP Candidates
| Celera_ID | GB_ID | Chrom | Scaffold No. | Original length (a.a.) | New ORF length (a.a.) | Signal peptide | MW (kDa) | pI | Hydrophobic a.a. (%) |
|---|---|---|---|---|---|---|---|---|---|
| agCP10968 | EAA12703 | 3R | AAAB01008964 | 127 | 109 | none | 12.3 | 9.5 | 25.7 |
| agCP11079 | EAA12353 | 3R | AAAB01008964 | 143 | 127 | 1–17 | 14.8 | 5.4 | 33.1 |
| agCP11481 | EAA12591 | 3R | AAAB01008964 | 137 | 123 | 1–19 | 14.3 | 9.4 | 29.3 |
| agCP11484 | EAA12322 | 3R | AAAB01008964 | 149 | 127 | 1–17 | 14.7 | 8.6 | 33.1 |
| agCP11532 | EAA12601 | 3R | AAAB01008964 | 150 | 117 | 1–33 | 12.9 | 9.8 | 41.0 |
| agCP11545 | EAA12338 | 3R | AAAB01008964 | 141 | 126 | 1–17 | 14.6 | 8.6 | 31.0 |
| agCP11435 | EAA12702 | 3R | AAAB01008964 | 102 | 137 | 1–16 | 15.7 | 5.0 | 35.0 |
| agCP12965 | EAA05664 | 3L | AAAB01008834 | 173 | 137 | none | 14.1 | 3.5 | 23.4 |
| agCP6514 | EAA10937 | 2L | AAAB01008960 | 132 | 117 | 1–19 | 13.0 | 8.6 | 25.6 |
Table 4 Secondary Structure of Anopheles CSP Candidates*
| Peptide ID | Helix (%) | Sheet (%) | Loop (%) | Class | Globularity |
|---|---|---|---|---|---|
| agCP10968 | 25.70 | 15.50 | 58.80 | mixed | appears not to be globular |
| agCP11079 | 72.40 | 0.00 | 27.60 | all-alpha | may be globular, but it is not as compact as a domain |
| agCP11481 | 71.50 | 0.00 | 28.50 | all-alpha | may be globular, but it is not as compact as a domain |
| agCP11484 | 71.70 | 1.60 | 26.80 | all-alpha | appears as compact, as a globular domain |
| agCP11532 | 72.70 | 0.00 | 27.40 | all-alpha | appears as compact, as a globular domain |
| agCP11545 | 68.20 | 2.40 | 29.40 | all-alpha | may be globular, but it is not as compact as a domain |
| agCP11435 | 75.90 | 0.00 | 24.10 | all-alpha | may be globular, but it is not as compact as a domain |
| agCP12965 | 13.10 | 16.80 | 70.10 | mixed | appears as compact, as a globular domain |
| agCP6514 | 10.30 | 0.00 | 89.70 | mixed | appears not to be globular |
*Prediction server: http://www.sbg.bio.ic.ac.uk/3dpssm.
Fig. 2 Homology modelling of Anopheles CSP candidates with 1K19_A_MbraCSP (PDB ID: 1K19) as the model. A: 3D structure of 1K19_A_MbraCSP showing the position of the two disulfide bridges; B-J: predicted 3D structures of Anopheles CSPs, including agCP11079, agCP10968, agCP11435, agCP11481, agCP11484, agCP11532, agCP11545, agCP12965, and agCP6514. The figures were generated in Swiss-PdbViewer.
Fig. 3 Partial alignment of candidate Anopheles CSPs with the previously identified CSPs, showing the surplus sequence of agCP11435 that causes its 3D-structural abnormality. All acronyms of the gene names refer to the protein names listed in Table 1.