Comparative genomics of odorant binding proteins in Anopheles gambiae, Aedes aegypti, and Culex quinquefasciatus.

Malini Manoharan, Matthieu Ng Fuk Chong, Aurore Vaïtinadapoulé, Etienne Frumence, Ramanathan Sowdhamini, Bernard Offmann.

Abstract

About 1 million people die each year from diseases spread by mosquitoes, so understanding how mosquitoes identify their hosts through olfaction is of considerable importance. The role of odorant binding proteins (OBPs) in the primary molecular events of olfaction in mosquitoes is becoming an important focus of biological research in this area. Here, we present a comprehensive comparative genomics study of OBPs in the three disease-transmitting mosquito species Anopheles gambiae, Aedes aegypti, and Culex quinquefasciatus, starting with the identification of 110 new OBPs in these three genomes. We have characterized their genomic distribution and their orthologous and phylogenetic relationships. The diversity and expansion observed in the Aedes and Culex genomes suggest that the OBP gene family acquired functional diversity concurrently with functional constraints posed on these two species. Sequences with unique features, such as the "two-domain OBPs" (previously known as Atypical OBPs) and "MinusC OBPs," have been characterized in mosquito genomes. The extensive comparative genomics featured in this work hence provides useful primary insights into the role of OBPs in the molecular adaptations of the mosquito olfactory system and could provide more clues for the identification of potential targets for insect repellents and attractants.

Year:  2013        PMID: 23292137      PMCID: PMC3595023          DOI: 10.1093/gbe/evs131

Source DB:  PubMed          Journal:  Genome Biol Evol        ISSN: 1759-6653            Impact factor:   3.416


Introduction

The spread of infectious diseases among humans is mediated primarily by mosquitoes, often described as the world’s most dangerous animals. Among them, anthropophilic mosquitoes such as Anopheles gambiae, Anopheles funestus, Aedes albopictus, Aedes aegypti, and Culex quinquefasciatus are the most effective transmitters of viruses and parasites. They are responsible for the spread of a number of life-threatening diseases such as malaria, dengue, and West Nile encephalitis and, more recently, Chikungunya, which has a lower mortality rate than the other diseases. According to the World Health Organization, global climate change is expanding the mosquitoes’ range, heightening the risk of disease for millions of additional people. Primary prevention, either by controlling the population of these vectors or by preventing the interaction between the vector and the host, is one of the most important means of curbing the spread of these diseases. Understanding the molecular mechanism of human host recognition mediated by olfaction would help in identifying new strategies for preventing this primary contact. Volatile products secreted by the human host in the process of metabolism are responsible for the attraction of these vectors to the host. The ability of insects, like that of mammals, to recognize and discriminate thousands of odorant molecules relies on specialized chemosensitive neural cells expressing olfactory receptor proteins (ORs), which reside within segregated compartments called sensilla. Each sensillum is a hair-like structure bathed in the sensillum lymph, which contains a number of secreted proteins (McKenna et al. 1994; Pikielny et al. 1994; Wang et al. 1999). The odorant binding proteins (OBPs) are important water-soluble components of this sensillum lymph. They were first identified in the moth as pheromone binding proteins (PBPs) (Vogt and Riddiford 1981). These globular proteins are believed to bind different odorant molecules (Plettner et al. 2000), owing to their high divergence within the family, and to transport them to their respective olfactory receptors, triggering the mechanism of olfaction (Pelosi and Maida 1995).
The arthropod OBPs form a large specific multi-gene family. They are 10–30 kDa globular, water-soluble proteins characterized by a specific six α-helical domain comprising six highly conserved cysteines that have distinct disulphide connectivities. These structural features are now considered the hallmark of this protein family (Calvo et al. 2002; Valenzuela et al. 2002; Calvo et al. 2006). OBPs have been identified in a number of insect species, including four dipteran species: Drosophila melanogaster (Galindo and Smith 2001; Graham and Davies 2002; Hekmat-Scafe et al. 2002; Valenzuela et al. 2002; Zhou et al. 2004; Vieira et al. 2007; Vieira and Rozas 2011), A. gambiae (Vogt 2002; Xu et al. 2003; Zhou et al. 2004; Li et al. 2005; Vieira and Rozas 2011), Aed. aegypti (Zhou et al. 2008), and C. quinquefasciatus (Pelletier and Leal 2009, 2011). These proteins are very divergent in sequence, and sequence identities between family members from different species can drop as low as 8% (Vieira and Rozas 2011). In Drosophila, two subgroups have been identified: (i) OBPs lacking two of the six conserved cysteines, called MinusC OBPs, and (ii) OBPs carrying additional conserved cysteines, called PlusC OBPs (Hekmat-Scafe et al. 2002). The MinusC OBPs typically lack the second and fifth Cys residues. However, this definition appears to be somewhat ambiguous, since three Drosophila OBPs in this cluster contain all six hallmark cysteines (Pelosi and Maida 1995). MinusC OBPs have not been described to date in mosquito genomes.
In mosquitoes, three subfamilies of OBP genes have been characterized so far: (i) the Classic OBPs, which carry the six conserved cysteines that form the characteristic motif of the OBP family; (ii) the PlusC OBPs, which have the same conserved cysteines and disulphide connectivity but contain six additional cysteines with novel disulphide connectivities; and (iii) the Atypical OBPs, which are among the longest known OBPs and were initially described as containing a single Classic OBP domain at the N-terminus extended by a less well characterized C-terminal extension. Very recently, it was shown that Atypical OBPs comprise two domains that are in fact homologous to the Classic OBP domain, and they were hence considered “dimer OBPs” (Vieira and Rozas 2011). In A. gambiae and Aed. aegypti, OBPs from all three subfamilies have been reported to date, whereas in C. quinquefasciatus only the Classic and PlusC members of this family have been reported so far (Pelletier and Leal 2009, 2011); Atypical OBPs have not yet been reported in this genome. An additional multi-gene family, the D7 salivary proteins, is known to be distantly related to the arthropod OBP superfamily (Calvo et al. 2002, 2006, 2009). There are two types of D7 salivary proteins in mosquito genomes, the short and the long forms, which contain one and two OBP-like domains, respectively (Valenzuela et al. 2002; Kalume et al. 2005; Choumet et al. 2007). The available structures of the D7 proteins indicate that the domains adopt a fold similar to that of the OBP domains but decorated with additional structural features and a seventh helix. In the two-domain D7 protein, the C-terminal OBP-like domain has been shown to bind biogenic amines in A. gambiae and Aed. aegypti (Mans et al. 2008; Calvo et al. 2009), while the N-terminal domain in Aed. aegypti was shown to have a specific bioactive lipid-binding activity (Calvo et al. 2009).
These members serve as important representatives for the construction of phylogenetic trees, acting as outgroups for the OBP gene family in the current analysis. This work describes the identification and extension of OBPs in the mosquito genomes of A. gambiae, Aed. aegypti, and C. quinquefasciatus. We provide a significant extension of the OBP gene family, with a total of 110 new members in these three genomes, and report the presence of all three classes of OBPs in the three mosquito genomes. In particular, we identified the Atypical class of OBPs in C. quinquefasciatus. We further confirm that “Atypical OBPs” are composed of two domains that are homologous to Classic OBPs and provide an in-depth characterization of their origin and structural features. This work also provides a comprehensive and robust subclassification of the different OBP classes through structure-based alignments and phylogenetic analysis, which could possibly reflect the functional divergence of these proteins. We also provide a detailed primary structural and phylogenetic characterization of all these novel OBP subtypes. An extensive set of supplementary materials that detail our analyses and results is provided.

Results

Extension of the OBP Family in All Three Mosquito Genomes

In previously published work, 65 OBPs from A. gambiae (Vogt 2002; Xu et al. 2003; Zhou et al. 2004, 2008), 64 from Aedes aegypti (Zhou et al. 2008), and 53 from C. quinquefasciatus (Pelletier and Leal 2009) had been identified. These OBPs were classified by these groups into three main subfamilies, Classic, PlusC, and Atypical, based on sequence features (fig. 1). Only very recently, Vieira and Rozas (2011) added four new putative genes to the A. gambiae OBP gene repertoire and 13 PlusC OBPs to the C. quinquefasciatus genome. These new genes were also identified by our sequence searches and bioinformatics analysis (see Materials and Methods) (table 1 and supplementary table S1, Supplementary Material online). The FASTA sequences of the identified genes are available for download as supplementary material.
Fig. 1

Cysteine conservation patterns across the different subfamilies and subgroups of OBPs from Anopheles gambiae, Aedes aegypti, and Culex quinquefasciatus genomes. The six conserved cysteines in GOBP domain are denoted C1–C6. The six additional cysteines in the C-term of the Atypical OBPs are denoted C1′–C6′. The lines connecting cysteines represent the disulphide bonds and dotted lines represent the additional disulphide bonds in the PlusC OBPs.

Table 1

Identification of OBPs in Anopheles gambiae, Aedes aegypti, and Culex quinquefasciatus Genomes

Subfamily
Classic  PlusC  Atypical  Not Determined  New Total
A. gambiae
    Previously reported(a)  29  16  16  69
    Newly identified  4  4
Aed. aegypti
    Previously reported(b)  33  17  14  111
    Newly identified  6  10  31
C. quinquefasciatus
    Previously reported(c)  48  109
    Newly identified  21  12  26  2

Note.—The table shows statistics of previously and newly identified OBP members (AgamOBP65 to AgamOBP68, AaegOBP67 to AaegOBP114, CquiOBP54 to CquiOBP112) in all three mosquito genomes. Detailed results are provided in accompanying supplementary tables S1, Supplementary Material online.

(a) Vogt (2002); Xu et al. (2003); Zhou et al. (2004); and Vieira and Rozas (2011).

(b) Zhou et al. (2008) and Pelletier et al. (2009).

(c) Pelletier et al. (2009).

In this study, we report a major expansion of the Atypical OBP subfamily in mosquitoes: 31 new members (AaegOBP84 to AaegOBP114) were identified in Aed. aegypti, and these interestingly show high sequence similarity to the 26 new Atypical members (CquiOBP75–CquiOBP100) from the C. quinquefasciatus genome that are reported in this work (supplementary table S1 and , Supplementary Material online). In the Classic OBP subfamily, we have annotated six new members in the Aed. aegypti genome and 21 members in the C. quinquefasciatus genome. In addition, 10 new members have been added to the PlusC subfamily of the Aed. aegypti genome. Altogether, this amounts to the addition of 110 members to the OBP gene family of mosquitoes [including the sequences identified by Vieira and Rozas (2011) and Pelletier and Leal (2011)].

Two-Domain OBPs and MinusC OBPs

Owing to the low sequence identity and the length variations observed between the members of the OBP family, a structure-based alignment was used to align them (see Materials and Methods). This greatly improved the quality of the alignment compared with regular multiple sequence alignments, in particular for (i) the precise classification of the new OBPs into the three different subfamilies and (ii) the identification of residues in structurally conserved positions that would have been missed otherwise (supplementary fig. S3, Supplementary Material online). The conservation pattern of cysteines across the different classes was clearly highlighted in these structure-based alignments but could not be obtained with ordinary sequence alignment methods. We refer to the cysteine positions in this article by numbering them C1 to C6 according to the order of their positions in the Classic OBP proteins. A detailed schematic representation featuring the cysteine spacing and conservation together with the predicted disulphide patterns is given in figure 1. Overall, the six cysteine residues involved in disulphide bond formation, which are considered the hallmark of this protein family (Calvo et al. 2002; Valenzuela et al. 2002; Calvo et al. 2006), are well conserved across the Classic, PlusC, and Atypical subclasses. Interestingly, sequences that lack the C2 and C5 cysteines were observed in the alignments. OBPs that lack these two particular cysteines, called MinusC OBPs, have been characterized and expressed in other insects such as Drosophila, Bombyx mori, Tribolium castaneum, and Apis mellifera (Vieira and Rozas 2011), but their presence in mosquito genomes has not been shown previously. AaegOBP78 from Aed. aegypti and 15 proteins from C. quinquefasciatus (CquiOBP59–CquiOBP62, CquiOBP64–CquiOBP74) were found to lack these two cysteines.
As all these sequences retained the N-terminal signal peptide or showed the presence of the PBP/GOBP domain, they were retained in our analysis as MinusC OBPs (supplementary tables S3 and S4, Supplementary Material online). We also observed interesting cysteine conservation patterns among the Atypical OBPs. The Atypical OBPs were previously described as proteins that hold a Classic OBP domain at the N-terminal end with an uncharacterized C-terminal domain. However, close analysis of the extended C-terminal end of Atypical members highlighted the presence of six additional cysteines conserved within this subfamily, with a spacing pattern very similar to that of the conserved cysteines (C1–C6) at their N-terminal end. This cysteine conservation pattern in the Atypical OBPs became apparent only through the annotation of new members in this subfamily and, to our knowledge, has never been described before. We hence propose to annotate these cysteines as C1′–C6′. This remarkable conservation of cysteines is believed to hold important evolutionary information (Thangudu et al. 2005, 2008). Following this, we characterized the homologues of each of the two domains and identified their closest Classic OBP homologue in their corresponding genomes and also in the Drosophila genome, which confirms that the Atypical OBPs are indeed “two-domain OBPs.” It is noteworthy that within the Atypical (two-domain) subfamily, a distinctive subtype called matype2 (see below and fig. 1) showed the presence of only six cysteines (C1, C3, C4, C6, C4′, and C6′), whereas the other subtypes carry all 12 cysteines. The Cys conservation pattern of the N-terminal domain of these OBPs is similar to that of the MinusC OBPs; the C-terminal domain, however, has lost more cysteines.
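The cysteine bookkeeping described above can be illustrated with a short sketch. It is not the authors' pipeline: it assumes, hypothetically, that each protein has already been reduced to the set of conserved-cysteine labels found in a structure-based alignment, and it writes the primed labels with an ASCII apostrophe (C1' to C6'). The function name `describe_cysteine_pattern` and the wording of its output are illustrative choices only.

```python
# Illustrative sketch only: summarise which hallmark cysteines a protein retains.
# Input labels ("C1".."C6", "C1'".."C6'") are assumed to come from an alignment
# made beforehand; nothing here reproduces the software used in the paper.

N_TERMINAL = ["C1", "C2", "C3", "C4", "C5", "C6"]
C_TERMINAL = [label + "'" for label in N_TERMINAL]


def describe_cysteine_pattern(present):
    """Return a short description of the conserved-cysteine pattern."""
    present = set(present)
    missing_n = [c for c in N_TERMINAL if c not in present]
    kept_c = [c for c in C_TERMINAL if c in present]

    if not missing_n:
        n_pattern = "all six hallmark cysteines"
    elif missing_n == ["C2", "C5"]:
        n_pattern = "MinusC-like (lacks C2 and C5)"
    else:
        n_pattern = "other (lacks " + ", ".join(missing_n) + ")"

    # Primed labels are only present for two-domain (Atypical) proteins.
    if kept_c:
        return f"{n_pattern}; second domain retains {len(kept_c)} of 6 cysteines"
    return n_pattern


if __name__ == "__main__":
    print(describe_cysteine_pattern(N_TERMINAL))
    print(describe_cysteine_pattern(["C1", "C3", "C4", "C6"]))
    # Six-cysteine pattern quoted in the paragraph above for matype2.
    print(describe_cysteine_pattern(["C1", "C3", "C4", "C6", "C4'", "C6'"]))
```

A label set is used rather than raw sequence positions because, as the text notes, the cysteine columns were only recovered reliably from the structure-based alignment.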

Analysis of OBP Genes: Orthology across the Three Genomes and Their Corresponding Distribution

We investigated the orthology and gene distribution of OBPs in the three genomes. At the date of this work, an assembled genome was available only for A. gambiae, in Ensembl Genomes and VectorBase (version 3.4). The chromosomal mapping of each of the OBP genes in Anopheles is hence known with precision (fig. 2). Their chromosomal distribution in the Anopheles genome is featured in supplementary fig. S1 and further referenced in supplementary table S1, Supplementary Material online. Although the syntenic relationship between the chromosome arms in A. gambiae and the corresponding orthologous chromosome arms in Culex and Aedes was established by Arensburger et al. (2010) with the help of genetic markers (supplementary table S2, Supplementary Material online), the genomic data of these two Culicinae species are available only as supercontig fragments (Nene et al. 2007; Arensburger et al. 2010) and are yet to be assembled. In these two genomes, a few supercontigs (about 10%) harbor markers that allow their chromosomal localization (Arensburger et al. 2010). Very few of these anchor supercontigs hosted OBP genes. Most supercontigs containing OBP genes did not harbor any genomic markers and hence cannot be assigned to a chromosome in Aedes and Culex. However, in many cases, direct orthologs in the Anopheles genome could be identified (fig. 2, supplementary fig. S1 and supplementary table S1, , and , Supplementary Material online). OBP orthologs were identified using the reciprocal BLAST hit approach (Moreno-Hagelsieb and Latimer 2008), which is widely used in the detection of orthologs.
As illustrated in figures 2 and 3 and in supplementary figure S1, Supplementary Material online, three-way orthology (1:1:1) between OBP genes in the three genomes was identified in 31 cases, while two-way orthology (1:1) between OBP genes from only two genomes was identified in 5 cases between Anopheles and Culex, 6 between Anopheles and Aedes, and 19 between Aedes and Culex (fig. 3), thus confirming the genetic proximity between Aed. aegypti and C. quinquefasciatus. Our analysis was found to be in complete agreement with the microsynteny analysis described very recently by Pelletier and Leal (2011), indicating that the orthology detected may serve as the basis for further syntenic analysis.
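The reciprocal BLAST hit idea can be sketched in a few lines. This is a minimal illustration, not the procedure actually run for this study: it assumes the BLAST results for both search directions have already been parsed into (query, subject, bit score) tuples, and the gene names and scores are made up. The helper names `best_hits` and `reciprocal_best_hits` are hypothetical.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    `hits` is an iterable of (query, subject, bit_score) tuples, for example
    parsed from a tabular BLAST report. On ties the first hit seen is kept.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs in which each gene is the other's best hit."""
    forward = best_hits(a_vs_b)
    backward = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if backward.get(b) == a)


if __name__ == "__main__":
    # Toy scores with made-up gene names; not data from the paper.
    a_vs_b = [("geneA1", "geneB1", 250.0), ("geneA1", "geneB2", 90.0),
              ("geneA2", "geneB2", 180.0), ("geneA3", "geneB2", 120.0)]
    b_vs_a = [("geneB1", "geneA1", 245.0), ("geneB2", "geneA2", 175.0),
              ("geneB2", "geneA3", 110.0)]
    print(reciprocal_best_hits(a_vs_b, b_vs_a))
```

In the toy data, geneA3 has geneB2 as its best hit, but geneB2 prefers geneA2, so geneA3 is left without a 1:1 partner. One plausible way to obtain 1:1:1 groups is to keep only triples whose three pairwise comparisons all agree; whether the authors did exactly this is not stated here.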
Fig. 2

Chromosomal localization of odorant binding proteins from Anopheles gambiae. Details about the gene names used here are shown in supplementary table S1, Supplementary Material online. Shown to the right of the gene names are their direct three-way orthologs detected in the Aedes aegypti and Culex quinquefasciatus genomes by the reciprocal BLAST hit methodology. Classic OBPs are featured in blue, PlusC OBPs in red, and two-domain (or Atypical) OBPs in green. The chromosomes were drawn to scale in MapDraw software. The positions of and the distances between gene loci are indicated in megabases. When only two-way orthology (1:1) was detected rather than three-way orthology (1:1:1), the CquiOBP and AaegOBP orthologs are separately connected to the corresponding AgamOBP gene.

Fig. 3

Analysis of orthologous OBP genes shared across three mosquito species, Anopheles gambiae, Aedes aegypti, and Culex quinquefasciatus. The Venn diagrams indicate the number of inferred orthologous genes shared among the mosquito species: (a) number of A. gambiae OBP genes orthologous to Aed. aegypti and C. quinquefasciatus; (b) number of Aed. aegypti OBP genes orthologous to A. gambiae and C. quinquefasciatus; (c) number of Culex OBP genes orthologous to A. gambiae and Aed. aegypti; (d) overall number of orthologous groups across the three mosquito species. The orthologs were identified using the reciprocal BLAST hit approach. The number of genes that share a three-way (1:1:1) orthology between the three species is 31. The number of genes in a species that have two-way orthology (1:1) with each of the two other species but no three-way orthology is indicated in parentheses and, for a given species, should be counted only once. For example, in (a), the total number of OBP genes in A. gambiae is 30 + 3 + 2 + 31 + (3) = 69, since three genes in A. gambiae have two-way orthology (1:1) with genes in both C. quinquefasciatus and Aed. aegypti but not a three-way orthology. Detailed listings of the orthology analysis are provided in supplementary table S1, , and , Supplementary Material online.
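The counting rule in this caption can be checked with a few lines of arithmetic. The sketch below only restates the panel (a) numbers quoted above. The meaning attached to the terms 30, 3, and 2 is an assumption, inferred from the pairwise counts given in the main text (6 two-way cases between Anopheles and Aedes, 5 between Anopheles and Culex), and the variable names are illustrative.

```python
# Panel (a) terms as quoted in the caption: 30 + 3 + 2 + 31 + (3) = 69.
# The interpretation of the first three terms is an assumption (see lead-in).
no_ortholog = 30            # assumed: A. gambiae genes without a 1:1 ortholog
aedes_only = 3              # assumed: 1:1 with Aed. aegypti only
culex_only = 2              # assumed: 1:1 with C. quinquefasciatus only
three_way = 31              # 1:1:1 orthology across the three species
both_but_not_three_way = 3  # value in parentheses, counted once

total = (no_ortholog + aedes_only + culex_only
         + three_way + both_but_not_three_way)

# Genes in the parenthesised group contribute to both pairwise tallies.
two_way_with_aedes = aedes_only + both_but_not_three_way
two_way_with_culex = culex_only + both_but_not_three_way

print(total, two_way_with_aedes, two_way_with_culex)
```

Under this reading, the parenthesised genes appear in both pairwise tallies but only once in the species total, which is what "counted only once" in the caption refers to.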

Interestingly, the overwhelming majority of the OBP genes are organized in gene clusters in the three genomes (supplementary fig. S1, Supplementary Material online). The clusters are mainly composed of gene duplicates. The genes in these genomic clusters hence share high sequence identity (data not shown) and are phylogenetically very close (see below), as confirmed by inparalogy data from the inParanoid database (O’Brien et al. 2005). The extension of the OBP gene repertoire in Aed. aegypti and C. quinquefasciatus with respect to A. gambiae was mainly driven by these gene duplication events, which are more numerous in these two Culicinae species. There are a total of 12 OBP gene clusters in the Aed. aegypti genome and 13 in the C. quinquefasciatus genome, compared with 6 clusters in A. gambiae. The largest gene clusters are found in Aedes and Culex, and a few clusters contain as many as 12 genes. Of the 26 newly identified Atypical (two-domain) OBP genes from C. quinquefasciatus, 21 are distributed into three main gene clusters (fig. 2 and supplementary fig. S1, Supplementary Material online). Similarly, 10 of the 12 newly identified PlusC proteins are distributed into three gene clusters.

Phylogeny-Based OBP Clusters

As expected and as reported previously, OBP family members showed high divergence. The average sequence identities between OBP genes in A. gambiae, Aed. aegypti, and C. quinquefasciatus are 12.5%, 12.8%, and 13.1%, respectively, and their phylogenetic tree (see Materials and Methods) also indicated high sequence divergence (supplementary fig. S2, Supplementary Material online). However, the comparative analysis of the different subfamilies of OBPs in the mosquito genomes provided more meaningful clustering patterns within each subfamily. The analysis was based on sequence alignments and phylogenetic trees constructed using sequences from individual subfamilies from all three mosquito genomes and, in the case of the Classic members, the Drosophila OBPs (Hekmat-Scafe et al. 2002). A bootstrap consensus tree was constructed using the neighbor joining method (Saitou and Nei 1987) with all the Classic OBPs from the three mosquito genomes and D. melanogaster, with 1000 bootstrap replicates (fig. 4). Grouping the Classic OBPs into clusters based on significant bootstrap values (50% cutoff) revealed 18 possible subtypes. These clusters carried orthologous and paralogous sequences from the three genomes. A few members from the mosquito genomes clustered with Drosophila OBPs (Hekmat-Scafe et al. 2002), and these clusters were named after their closest Drosophila OBPs. Among these, OS-E/OS-F, Pbprp1, LUSH, OBP19a, and Pbprp4 have been described previously (Xu et al. 2003; Zhou et al. 2008; Pelletier and Leal 2009). However, one member from C. quinquefasciatus has now been annotated in each of the two subtypes OS-E/OS-F (CquiOBP58) and OBP19a (CquiOBP57). The huge expansion of sequences (CquiOBP25–CquiOBP42) observed by Pelletier and Leal (2009) was found to be homologous to AaegOBP57 and AgamOBP13 and was indeed closely related to Pbprp2/Pbprp5 of Drosophila.
CquiOBP55 and AaegOBP83, identified in this analysis, are orthologs of AgamOBP29 and homologous to OBP59a of Drosophila, and they have an unusually long sequence, as recently mentioned by Vieira and Rozas (2011). Three orthologous OBP sequences, AgamOBP9, AaegOBP22, and CquiOBP43, clustered with the Drosophila MinusC members OBP99a, OBP44a, and OBP99b with considerable bootstrap support; of these, OBP99a alone retains all six cysteines, while the two others lack the C2 and C5 cysteines (see below). Among the Drosophila MinusC OBPs, three members (Obp83f, Obp99a, and Obp99d) retain all six conserved cysteines, whereas four members (Obp8a, Obp44a, Obp99b, and Obp99c) lack the C2 and C5 cysteines. Therefore, the mosquito OBPs that cluster with these Drosophila OBPs do not represent true MinusC OBPs. The other clusters, which do not have a close Drosophila homologue, are named mclassic1–9 (fig. 4 and supplementary fig. S3, Supplementary Material online). In addition to these subtypes, one group of 16 members lacking the C2 and C5 cysteines and displaying outstanding sequence features (supplementary fig. S3, Supplementary Material online) has been named “Bombyx mori MinusC” because of its homology with the B. mori MinusC sequences, although its branch holds a bootstrap value of only 35%. This homology was determined using BLAST analysis and confirmed with the inParanoid eukaryotic ortholog database (O’Brien et al. 2005). Other subtype classifications of the Classic members were also similar to the clustering seen in the inParanoid database.
Fig. 4

Unrooted phylogenetic tree of Classic OBPs in the three mosquito genomes and in Drosophila melanogaster. The Anopheles gambiae, Aedes aegypti, and Culex quinquefasciatus members are colored in mustard, pink, and turquoise, respectively. The bootstrap values of the branches are indicated on the nodes as percentages. The names of the clusters identified within the Classic OBP subfamily are indicated on the branches. Detailed alignments of the members inside each cluster are provided in supplementary figure S3, Supplementary Material online.

As shown in figure 5, the PlusC OBPs formed seven major phylogenetic clusters based on a bootstrap cutoff value of 50%, but we further subdivided them into 11 subtypes (mplus1–mplus11). Indeed, although the interior node of the mplus7–11 cluster holds a bootstrap value of 57%, we separated its members into different subtypes because they clearly hold distinct sequence features (supplementary fig. S3, Supplementary Material online). Furthermore, analysis of the chromosomal localization of PlusC members from A. gambiae shows that mplus11 subtype members are specific to chromosome 3L, while all other PlusC OBPs are distributed on chromosome 2L. At this stage, it is difficult to interpret the molecular background behind this clustering.
Fig. 5

Unrooted phylogenetic tree of PlusC OBPs in the three mosquito genomes. The Anopheles gambiae, Aedes aegypti, and Culex quinquefasciatus members are colored in mustard, pink, and turquoise, respectively. The bootstrap values of the branches are indicated on the nodes as percentages. The names of the clusters identified within the PlusC OBP subfamily are indicated on the branches. Detailed alignments of the members inside each cluster are provided in supplementary figure S3, Supplementary Material online.

The Classic subfamily members from the three genomes share an average sequence identity of 15.5%, while the PlusC OBPs share 17.3% average sequence identity. No distinct sequence features could be observed at the subfamily level (Classic, PlusC, and Atypical) because of high sequence divergence. Nevertheless, a close examination of the alignments for the different clusters that contain orthologous sequences from the three genomes within each subfamily indicates that the phylogenetic clusters established in this study tend to have specific sequence patterns (supplementary fig. S3 and , Supplementary Material online). Some subgroups are characterized by a relatively low average sequence identity, such as the B. mori MinusC subgroup within the Classic OBPs (21.5%), mclassic9 (23.3%), and mplus9 (24.3%), while other subgroups share markedly higher sequence identities, such as OS-E/OS-F (55.2%), Pbprp4 (60.2%), and mclassic4 (77.3%).
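An average sequence identity of the kind quoted above can be computed from an alignment as in the following sketch. How identity was defined in this study is not stated in this section, so the convention used here is an assumption: sequences are already aligned to equal length with "-" as the gap character, and identity is the fraction of matching residues over columns where neither sequence has a gap. The function names and the toy alignment are hypothetical.

```python
from itertools import combinations


def pairwise_identity(seq_a, seq_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def average_identity(aligned):
    """Mean percent identity over all unordered pairs of aligned sequences."""
    pairs = list(combinations(aligned, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(pairwise_identity(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Made-up toy alignment, not OBP data.
    toy = ["MKC-LAC", "MRCALAC", "MKC-VTC"]
    print(round(average_identity(toy), 1))
```

Other conventions, such as dividing by the full alignment length or by the shorter sequence, give different percentages, so values are only comparable when the same definition is used throughout.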

Sequence Specific Clustering of Two-Domain OBPs

The Atypical OBPs, unlike the Classic members, formed four major clusters based on bootstrap values, which are named matype1–matype4 in this study (fig. 6) and showed distinct sequence features (supplementary fig. S3, Supplementary Material online). The matype1 subtype forms the smallest of the four clusters, with two members from each genome, and it is separated from the other three subtypes with high bootstrap values. The matype2 subtype forms a distinctive type of Atypical members holding only six cysteines in total (C1, C3, C4, C6, C1′, and C6′) out of the 12 conserved cysteines characteristic of the other subtypes of this subfamily (fig. 1 and supplementary fig. S3, Supplementary Material online). The matype2 subtype thus stands out as a distinctive type whose N-terminal domain lacks the C2 and C5 cysteines, as described above. The matype4 members all share a deletion of about 15 residues between C1 and C2, which is the distinguishing feature of this subtype. The matype1 members are orthologous to AgamOBP39, which is located on chromosome 2R, a chromosome arm otherwise populated with Classic members, supporting their close relation to the Classic members as observed in the phylogeny of the individual genomes. The matype2 members, intriguingly, share orthology with OBPs from A. gambiae that were mapped to chromosome X, whereas matype3 and matype4 members share orthology with AgamOBPs distributed over chromosomes 3R and 3L.

Unrooted phylogenetic tree of Atypical odorant binding proteins in the three mosquito genomes. The Anopheles gambiae, Aedes aegypti, and Culex quinquefasciatus members are colored in mustard, pink, and turquoise, respectively. The bootstrap values of the branches are indicated on the nodes in percentage values. The names of identified clusters inside the Atypical OBPs subfamily are indicated on the branches. Detailed alignments of the members inside each cluster are provided in supplementary figure S3, Supplementary Material online.


Discussion

Evolutionary Aspects of OBP Gene Family in Mosquitoes

The C. quinquefasciatus (Arensburger et al. 2010) and Aed. aegypti (Nene et al. 2007) genomes, which code for 109 and 111 OBPs, respectively, have a considerably larger OBP gene repertoire than the related A. gambiae genome, which harbors only 67 OBPs. A. gambiae belongs to the subfamily Anophelinae, whereas Aed. aegypti and C. quinquefasciatus belong to the subfamily Culicinae. These two subfamilies of mosquitoes are estimated to have diverged ∼120 My ago (Reidenbach et al. 2009). The increase in the number of genes indicates lineage-specific expansions of the OBP gene family among the mosquito species. The OBP gene family has previously been shown to follow a birth-and-death model of evolution, based on several observations: multiple gene gain and loss events in lineages, a decrease in the number of orthology groups with increasing divergence time, and an uneven phylogenetic subfamily distribution across species (Vieira and Rozas 2011). Similar observations are made here for the OBP genes of Anopheles, Culex, and Aedes: a number of gene gains are observed in Aedes and Culex; the distribution of subfamilies is uneven (the MinusC subfamily is absent in Anopheles); and the number of orthologous sequences is higher between Aedes and Culex and lower with respect to Anopheles, which is more distantly related to these two species. This provides further support for the existing view that the OBP gene family evolves under a birth-and-death model. Furthermore, the appearance of new genes and subfamilies in Aedes and Culex could relate to a requirement for these genes in the environmental adaptation of these species.

Evolution of MinusC Proteins in the Mosquito Genomes

The MinusC subfamily of OBPs was first identified in the Drosophila genome, with some of its members lacking the second and fifth cysteine residues (Hekmat-Scafe et al. 2002), and was later identified in other species, including Apis mellifera (Forêt and Maleszka 2006) and B. mori (Gong et al. 2009; Yoshizawa et al. 2011). In the case of mosquitoes, interestingly, MinusC OBPs are not present in A. gambiae but did appear in the Culicinae lineage. The close homology of these MinusC members with the B. mori MinusC OBPs suggests that they could have a common ancestor. Thus, it can be said that the MinusC OBPs appeared in the Endopterygota lineage of insects, which dates back ∼300 My. This suggests that these OBPs appeared earlier in evolution and not only in the Drosophilidae, Bombyx/Tribolium, and Apis lineages as believed earlier (Vieira and Rozas 2011). However, the absence of these OBPs in A. gambiae is intriguing and suggests that they could have undergone species-specific expansions. This in fact supports the birth-and-death model of evolution observed in the OBP family of proteins. Separately, the matype2 members of the two-domain OBP subfamily, which retain only six cysteines, interestingly lack the C2 and C5 cysteines in the N-terminal domain. Because the absence of the C2 and C5 cysteines is the characteristic feature of the MinusC proteins, this raises an important question about the evolutionary link between these members.

Atypical OBPs are Indeed Two-Domain OBPs

The increase in the number of Atypical OBPs in the three mosquito genomes revealed important facets of this subfamily of proteins. We have shown that the Atypical OBPs, so far identified only in mosquitoes, are indeed two-domain OBPs. This was also reported by Vieira and Rozas (2011), whose phylogenetic analysis placed them in the dimer OBP clade. We provide further evidence for this by characterizing each of the Atypical OBP domains by its closest homologue in the Classic OBP subfamily of the corresponding genome. Very interestingly, the Classic OBP members obtained as hits by each of these domains were mainly found among the mclassic9, mclassic8, and Obp99a members (table 2). Atypical OBPs are indeed found to share closer phylogenetic proximity to OBPs from the mclassic9, mclassic8, and Obp99a phylogenetic clusters, although without significant bootstrap support because of sequence divergence (supplementary fig. S2, Supplementary Material online). Moreover, Atypical gene clusters in A. gambiae are localized in close chromosomal proximity to gene clusters that contain Classic OBPs from one of these three groups. On chromosome 2R, the matype1 members AgamOBP39 and AgamOBP40 are localized in the same gene cluster as the mclassic9 members AgamOBP11, AgamOBP12, and AgamOBP14. Likewise, on chromosome X, the matype2 OBPs AgamOBP34 to AgamOBP37 are localized proximal to AgamOBP8 and AgamOBP9, which belong to the Obp99a phylogenetic cluster. Similarly, on chromosome 3L, the matype3 members AgamOBP31, AgamOBP44, and AgamOBP45, which form a gene cluster, are in close proximity to AgamOBP22, which belongs to the mclassic8 group. Another interesting observation is that OBP members from these three phylogenetic clusters (mclassic9, mclassic8, and Obp99a) are closely related to the MinusC group of proteins in the Drosophila genome (fig. 4), and it has been established that the Drosophila Dimer OBPs 83cd and 83ef (Zhou et al. 2004), which are proteins that hold two OBP domains, are related to these Drosophilidae MinusC proteins.
Table 2

Analysis of the Two Putative OBP Domains (N-term and C-term) of Atypical OBPs from Anopheles gambiae, Aedes aegypti, and Culex quinquefasciatus

Columns: Mosquito Atypical OBP (ID | Phylogenetic Subgroup); Mosquito Classic OBP Closest Homologues (N-term | Phylogenetic Subgroup | E-value | C-term | Phylogenetic Subgroup | E-value); Drosophila OBP Closest Homologues (N-term | E-value | C-term | E-value)
AAEL009597matype1AAEL005772Obp99a1e-10AAEL004342mclassic9a2e-14Obp99b5e-11Obp99a5e-09
AaegOBP40AaegOBP22AaegOBP18
AAEL009599matype1AAEL005772Obp99a1e-10AAEL007014No group2e-04Obp99a9e-08Obp99a8e-04
AaegOBP41AaegOBP22AaegOBP79
AGAP002190matype1AGAP000278OBP99a1e-10AGAP002189mclassic9b2e-15Obp99b6e-09Obp99a5e-09
AgamOBP39AgamOBP9AgamOBP14
AGAP002191matype1AGAP002188No group2e-08Obp44a7e-07Obp99a5e-08
AgamOBP40AgamOBP12
AGAP011647matype1AGAP010409mclassic82e-11AGAP002025mclassic9b5e-09Obp99a7e-09Obp99a4e-03
AgamOBP30AgamOBP22AgamOBP11
CPIJ015732matype1CPIJ010787mclassic9a2e-10CPIJ016343mclassic9b2e-17Obp99b4e-07Obp99a9e-10
CquiOBP85CquiOBP51CquiOBP63
CPIJ015733matype1CPIJ010787mclassic9a1e-10CPIJ010782maclassic9b4e-04Obp44a3e-07Obp99a3e-07
CquiOBP86CquiOBP51CquiOBP46
AAEL000318matype2AAEL007003No group5e-07AAEL004342mclassic9a1e-03Obp44a2e-04Obp99c1e-04
AaegOBP92AaegOBP80AaegOBP18
AAEL000319matype2AAEL002617mclassic3a2e-03AAEL007014No group3e-04Obp44a4e-03
AaegOBP93AaegOBP12AaegOBP79
AAEL000344matype2AAEL007003No group1e-04AAEL007014No group4e-02Obp44a4e-06Obp99b3e-03
AaegOBP94AaegOBP80AaegOBP79
AAEL000350matype2AAEL011730mclassic83e-05AAEL002587mclassic3b1e-04Obp56d2e-03Obp99b5e-07
AaegOBP95AaegOBP81AaegOBP11
AAEL000377matype2AAEL004343mclassic9a4e-06AAEL007014No group1e-03Obp44a3e-09Obp99b7e-04
AaegOBP89AaegOBP19AaegOBP79
AAEL001153matype2AAEL007003No group7e-05AAEL013018OS-E/OS-F3e-02Obp99c1e-05
AaegOBP106AaegOBP80AaegOBP3
AAEL001174matype2AAEL004343mclassic9a4e-07AAEL004343mclassic9a9e-07Obp44a7e-05Obp44a3e-05
AaegOBP98AaegOBP19AaegOBP19
AAEL001179matype2AAEL004343mclassic9a8e-08AAEL007014No group1e-02Obp99b5e-07Obp44a8e-05
AaegOBP99AaegOBP19AaegOBP79
AAEL001189matype2AAEL004343mclassic9a8e-07AAEL007003No group8e-04Obp44a1e-04Obp44a2e-03
AaegOBP105AaegOBP19AaegOBP80
AAEL004516matype2AAEL004343mclassic9a2e-04Obp44a4e-04
AaegOBP104AaegOBP19
AAEL009433matype2AAEL004343mclassic9a2e-06Obp99c0.002
AaegOBP109AaegOBP19
AAEL013719matype2AAEL004343mclassic9a1e-03
AegOBP90AaegOBP19
AAEL013720matype2AAEL007003No group2e-06AAEL004343mclassic9a1e-04Obp44a1e-03
AaegOBP91AaegOBP80AaegOBP19
AAEL014874matype2AAEL004343mclassic9a2e-06OBP99c2e-03
AaegOBP108AaegOBP19
AAEL014876matype2AAEL011730mclassic83e-09Obp99c1e-05
AaegOBP107AaegOBP81
AGAP000641/644matype2AGAP013182ND3e-09AGAP002025mclassic9b4e-10Pbprp21e-04Pbprp15e-04
AgamOBP34/37AgamOBP59AgamOBP11
AGAP000642matype2AGAP013182ND2e-08AGAP002025mclassic9b2e-06Obp56d1e-05
AgamOBP35AgamOBP59AgamOBP11
AGAP000643matype2AGAP013182ND8e-09AGAP002025mclassic9b2e-06Obp56d1e-05
AgamOBP36AgamOBP59AgamOBP11
CPIJ001690matype2CPIJ009937mclassic82e-07Obp99a4e-07
CquiOBP92CquiOBP44
CPIJ003863matype2CPIJ009937mclassic82e-05CPIJ010782maclassic9b4e-02Obp44a4e-07
CquiOBP89CquiOBP44CquiOBP46
CPIJ003865matype2CPIJ009937mclassic84e-06CPIJ010782maclassic9b2e-02Obp44a7e-07
CquiOBP88CquiOBP44CquiOBP46
CPIJ003866matype2CPIJ009937mclassic83e-07CPIJ010782maclassic9b1e-04Obp44a1e-09Obp99b1e-03
CquiOBP90CquiOBP44CquiOBP46
CPIJ003867matype2CPIJ009937mclassic81e-09CPIJ010782maclassic9b4e-02OBP99a9e-07
CquiOBP91CquiOBP44CquiOBP46
CPIJ017163matype2CPIJ009937mclassic82e-06CPIJ017326Obp99a6e-02Obp44a4e-04Obp56g4e-03
CquiOBP99CquiOBP44CquiOBP43
CPIJ017164matype2CPIJ009937mclassic86e-06CPIJ010789mclassic72e-02Obp44a2e-04
CquiOBP97CquiOBP44CquiOBP53
CPIJ017165matype2CPIJ009937mclassic87e-10CPIJ010782maclassic9b1e-02Obp99c2e-04
CquiOBP98CquiOBP44CquiOBP46
CPIJ017166matype2CPIJ009937mclassic81e-03CPIJ016343mclassic9b3e-02Obp44a1e-04
CquiOBP95CquiOBP44CquiOBP63
CPIJ017167matype2CPIJ009937mclassic81e-03CPIJ001365Pbprp11e-02Obp44a4e-04Obp56d7e-05
CquiOBP96CquiOBP44CquiOBP7
CPIJ017168matype2CPIJ016951Obp19a6e-04CPIJ016343mclassic9b3e-02
CquiOBP101CquiOBP57CquiOBP63
CPIJ017169matype2CPIJ009937mclassic88e-03CPIJ010789mclassic72e-02Obp44a3e-03Obp99b2e-03
CquiOBP100CquiOBP44CquiOBP53
CPIJ017170matype2CPIJ009937mclassic83e-02CPIJ016343mclassic9b6e-03Obp44a6e-03
CquiOBP94CquiOBP44CquiOBP63
AAEL006385matype3AAEL002596mclassic3a8e-04AAEL004343mclassic9a2e-06Obp56d8e-04Obp99a9e-08
AaegOBP33AAEgOBP9AaegOBP19
AAEL006387matype3AAEL002617mclassic3a1e-05AAEL004343mclassic9a3e-06Obp56d3e-05Obp99a5e-07
AaegOBP29AaegOBP12AaegOBP19
AAEL006393matype3AAEL002617mclassic3a1e-05AAEL004343mclassic9a3e-06Obp56d3e-05Obp99a5e-07
AaegOBP28AaegOBP12AaegOBP19
AAEL006396matype3AAEL002617mclassic3a1e-05AAEL004342mclassic9a2e-03Obp56d1e-05Obp99a4e-05
AaegOBP31AaegOBP12AaegOBP18
AAEL006398matype3AAEL011730mclassic82e-02AAEL011730mclassic81e-06Obp56d9e-03Obp99a1e-06
AaegOBP32AaegOBP81AaegOBP81
AGAP010648matype3AGAP002025mclassic9b1e-10AGAP002025mclassic9b1e-08Obp99a2e-05Obp99b5e-04
AgamOBP44AgamOBP11AgamOBP11
AGAP010649matype3AGAP013182ND5e-09AGAP002025mclassic9b1e-12Obp99a8e-07Obp99b6e-08
AgamOBP31AgamOBP59AgamOBP11
AGAP010650matype3AGAP013182ND1e-11AGAP002189mclassic9b3e-11Obp99b9e-05Obp99a3e-10
AgamOBP45AgamOBP59AgamOBP14
CPIJ009038matype3CPIJ009937mclassic89e-08CPIJ009937mclassic87e-08Obp56d2e-03Obp99a1e-10
CquiOBP87CquiOBP44CquiOBP44
CPIJ017342matype3CPIJ009937mclassic82e-08CPIJ006551Obp19a2e-05Obp56c7e-03Obp99b1e-08
CquiOBP93CquiOBP44CquiOBP11
AAEL000796matype4AAEL011730mclassic88e-04AAEL004343mclassic9a6e-07Obp56i1e-05
AaegOBP96AaegOBP81AaegOBP19
AAEL000821matype4AAEL011730mclassic81e-05AAEL005770Obp99a1e-06
AaegOBP6AaegOBP81AaegOBP21
AAEL000827matype4AAEL007014No group4e-03AAEL004342mclassic9a4e-05Obp99a5e-05
AaegOBP84AaegOBP79AaegOBP18
AAEL000831matype4AAEL011730mclassic81e-02AAEL002596mclassic3a3e-05Obp56g1e-04
AaegOBP85AaegOBP81AaegOBP9
AAEL000833matype4AAEL004339mclassic74e-03AAEL004343mclassic9a8e-05Obp99d5e-05
AaegOBP7AaegOBP17AaegOBP19
AAEL000835matype4AAEL011730mclassic88e-03AAEL004343mclassic9a6e-07Obp56i1e-05
AaegOBP97AaegOBP81AaegOBP19
AAEL000837matype4AAEL011730mclassic86e-08Pbprp27e-04
AaegOBP112AaegOBP81
AAEL001487matype4AAEL011730mclassic84e-03Obp51a1e-02
AaegOBP114AaegOBP81
AAEL003311matype4AAEL011730mclassic87e-05AAEL002596mclassic3a2e-04Obp99b1e-03
AaegOBP111AaegOBP81AaegOBP9
AAEL003315matype4AAEL011730mclassic82e-08AAEL005770Obp99a2e-04Obp99c4e-03
AaegOBP16AaegOBP81AaegOBP21
AAEL003511matype4AAEL011730mclassic82e-04AAEL004343mclassic9a1e-06Obp99a2e-05
AaegOBP87AaegOBP81AaegOBP19
AAEL003513matype4AAEL011730mclassic85e-07AAEL005770Obp99a1e-07Obp99a2e-05
AaegOBP100AaegOBP81AaegOBP21
AAEL003525matype4AAEL011730mclassic82e-03AAEL005770Obp99a4e-07Obp99a2e-04
AaegOBP101AaegOBP81AaegOBP21
AAEL003538matype4AAEL011730mclassic85e-07AAEL005770Obp99a1e-07OBP99a2e-05
AaegOBP102AaegOBP81AaegOBP21
AAEL004856matype4AAEL011730mclassic83e-05AAEL007014No group6e-06Obp99a1e-04
AaegOBP86AaegOBP81AaegOBP79
AAEL010714matype4AAEL011730mclassic83e-05AAEL005770Obp99a2e-07Obp99a8e-06
AaegOBP45AaegOBP81AaegOBP21
AAEL010718matype4AAEL011730mclassic85e-07AAEL005770Obp99a2e-06Obp56g6e-03Obp99a1e-04
AaegOBP44AaegOBP81AaegOBP21
AAEL010872matype4AAEL011730mclassic89e-05AAEL004342mclassic9a5e-05Pbprp58e-04
AaegOBP46AaegOBP81AaegOBP18
AAEL010874matype4AAEL011730mclassic83e-06AAEL005770Obp99a3e-06Obp99b3e-05
AaegOBP88AaegOBP81AaegOBP21
AAEL010875matype4AAEL011730mclassic83e-06AAEL005770Obp99a2e-05Obp99d7e-05
AaegOBP103AaegOBP81AaegOBP21
CPIJ000653matype4CPIJ016343maclassic9b2e-06Obp99b3e-04
CquiOBP84CquiOBP63
CPIJ008154matype4CPIJ014525maclassic61e-02CPIJ016343mclassic9b7e-05
CquiOBP83CquiOBP24CquiOBP63
CPIJ008155matype4CPIJ010789mclassic72e-03CPIJ016343mclassic9b1e-06Obp99a2e-04
CquiOBP79CquiOBP53CquiOBP63
CPIJ008156matype4CPIJ010789mclassic74e-02CPIJ010782maclassic9b2e-11Obp99a2e-08
CquiOBP80CquiOBP53CquiOBP46
CPIJ008157matype4CPIJ010787mclassic9a1e-02CPIJ010782maclassic9b1e-09Obp99a6e-08
CquiOBP76CquiOBP51CquiOBP46
CPIJ008158matype4CPIJ016343mclassic9b1e-08CPIJ016343mclassic9b8e-07Obp99a1e-05Obp99a3e-08
CquiOBP77CquiOBP63CquiOBP63
CPIJ008159matype4CPIJ009937mclassic86e-03CPIJ016343mclassic9b4e-10Obp99a6e-09
CquiOBP78CquiOBP44CquiOBP63
CPIJ008160matype4CPIJ009937mclassic81e-02CPIJ016343mclassic9b1e-09Obp44a3e-03
CquiOBP81CquiOBP44CquiOBP63
CPIJ008161matype4CPIJ010789mclassic72e-02CPIJ016343mclassic9b2e-09Obp99b3e-06
CquiOBP82CquiOBP53CquiOBP63
AAEL008640AAEL011730mclassic82e-08AAEL011730mclassic81e-05
AaegOBP113AaegOBP81AaegOBP81
AAEL014430AAEL007003No group2e-08AAEL011730mclassic82e-05Obp99c3e-07Obp44a1e-06
AaegOBP58AaegOBP80AaegOBP81
AAEL014431AAEL011730mclassic84e-12AAEL004342mclassic9a9e-09Obp99b1e-07Obp99b3e-03
AaegOBP110AaegOBP81AaegOBP18
AGAP000580AGAP002189mclassic9b2e-06AGAP002025mclassic9b4e-05Obp99b6e-05Obp99c2e-06
AgamOBP38AgamOBP14AgamOBP11
AGAP000638AGAP010409mclassic82e-11AGAP002025mclassic9b8e-07Obp99a1e-06Obp99c1e-03
AgamOBP32AgamOBP22AgamOBP11
AGAP000640AGAP010409mclassic82e-11AGAP002025mclassic9b8e-07Obp99a1e-06Obp99c1e-03
AgamOBP33AgamOBP22AgamOBP11
AGAP005182AGAP013182ND8e-07AGAP002025mclassic9b7e-11Obp44a2e-04
AgamOBP41AgamOBP59AgamOBP11
AGAP009065AGAP013182ND6e-10AGAP002025mclassic9b2e-08Obp99a5e-05Obp99c7e-05
AgamOBP42AgamOBP59AgamOBP11
AGAP009402AGAP010409mclassic85e-11AGAP002189mclassic9b6e-16Obp99a4e-06Obp99a2e-09
AgamOBP43AgamOBP22AgamOBP14

Note.—The table shows top hits results of the BLAST search among all mosquito Classic OBPs and Drosophila OBPs after splitting the Atypical proteins into their two respective putative domains.

The recent report of a functional dimer in C. quinquefasciatus (Mao et al. 2010) supports the current speculations on Atypical members, pointing to the importance of two-domain proteins in the binding of relatively large ligands. Thus, it is confirmed that the Atypical OBP members are indeed two-domain OBPs, which were previously observed in Drosophila as Dimer OBPs, and that they are no longer specific to the mosquito genomes as reported earlier (Xu et al. 2003). Furthermore, the fact that the matype2 members carry only 6 cysteines, in place of the 12 cysteines found in the other two-domain OBPs, is suggestive of a possible adaptation of the fold, with 3 disulphide bonds in place of the 6 disulphide bonds of the other types. The striking localization of the A. gambiae matype2 OBP genes on the X chromosome further raises the possibility that these proteins are important in the blood feeding mechanism of female mosquitoes. Interestingly, most of the members of the two-domain OBP subfamily are reported as differentially expressed with respect to blood time series, which adds to the importance of these proteins in host recognition (Dissanayake et al. 2010). Ecological adaptations might have driven the need for the observed expansion in the two-domain OBP gene repertoire in the Aed. aegypti and C. quinquefasciatus genomes when compared with A. gambiae. Our observations indicate that this expansion most probably occurred through gene duplication events in localized genome regions, which led to the observed gene clusters.
We hence hypothesize that two distinct mechanisms could underlie the emergence of Atypical genes in mosquitoes. The observations made in the A. gambiae genome support the first hypothesis, that two-domain OBPs might have originated from gene duplicates of mclassic9, mclassic8, or Obp99a related members and their subsequent gene fusion, leading to Atypical genes coclustered with their Classic counterparts. The observations made in Aed. aegypti and C. quinquefasciatus support the second, complementary hypothesis, whereby the Atypical genes have undergone further gene duplications, probably in response to ecological constraints in these mosquito lineages. Our analysis hence supports the proposition that the Atypical OBPs be renamed two-domain OBPs. Their future structural characterization and ligand binding profiling would be of significant importance in deciphering their contribution to olfaction in mosquitoes.

Materials and Methods

Sequence Searches

The databases of predicted protein sequences of the three mosquito genomes, A. gambiae (annotation AgamP3.4), Aed. aegypti (annotation AaegL1.1), and C. quinquefasciatus (annotation CpipJ1.2), were downloaded from VectorBase (Lawson et al. 2009) version 3.4 (http://www.vectorbase.org, last accessed January 9, 2013) and Ensembl Genomes (Hubbard et al. 2009). The putative OBPs in the three mosquito species were identified with PSI-BLAST (Altschul et al. 1997) using 10 Drosophila query sequences belonging to three different subfamilies (Classic/General, PlusC, and MinusC OBPs), with an E-value cutoff of 3e−10 (Vieira et al. 2007) and an alignment length cutoff of 75% with respect to the query sequence. At this level, all of the previously identified members in the three genomes were recovered, along with a few additional members. A second run of PSI-BLAST was initiated with the hits from the previous runs. Using this protocol, it was possible to pick up not only all the OBPs reported so far (Vogt 2002; Xu et al. 2003; Zhou et al. 2004, 2008; Pelletier and Leal 2009, 2011; Vieira and Rozas 2011) but also a remarkable number of additional members. The additional sequences were checked for the presence of a signal peptide using the SignalP server (Petersen et al. 2011), for a PBP/GOBP domain using CD-Search (Marchler-Bauer and Bryant 2004) in the case of Classic OBPs, and by alignment of the new sequences with their subfamily members in the case of Atypical and PlusC proteins. The D7 proteins, which were identified using this method but are considered a distinct family of proteins related to the OBPs, were also retained for further analysis and used as an outgroup in the construction of phylogenetic trees. The orthologous sequences were identified based on the reciprocal best hit approach using BLAST (Moreno-Hagelsieb and Latimer 2008). The newly added sequences were named according to the naming conventions used in the earlier reports (Vogt 2002; Xu et al. 2003; Zhou et al. 2004, 2008; Pelletier and Leal 2009).
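
The two cutoffs applied to the search hits (E-value and alignment length relative to the query) can be illustrated with a minimal Python sketch. This is not the pipeline used in the study: the `Hit` record, the `filter_hits` function, and the assumption that hits have already been parsed from a BLAST report into simple records are all hypothetical and are introduced here only for illustration.

```python
from typing import Dict, Iterable, List, NamedTuple


class Hit(NamedTuple):
    """One parsed search hit (hypothetical record, not a BLAST data type)."""
    query: str
    subject: str
    evalue: float
    align_len: int  # number of query residues covered by the alignment


def filter_hits(
    hits: Iterable[Hit],
    query_lengths: Dict[str, int],
    max_evalue: float = 3e-10,   # E-value cutoff quoted in the text
    min_coverage: float = 0.75,  # alignment length cutoff relative to the query
) -> List[Hit]:
    """Keep hits that pass both the E-value and the query-coverage cutoff."""
    kept = []
    for hit in hits:
        coverage = hit.align_len / query_lengths[hit.query]
        if hit.evalue <= max_evalue and coverage >= min_coverage:
            kept.append(hit)
    return kept
```

In this sketch a hit is retained only if both conditions hold, so a very significant but short local match (for example, one covering half of the query) is discarded.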

Multiple Sequence Alignment

The multiple sequence alignment forms the basis for any analysis of a family of proteins, and an accurate alignment is therefore essential. The error rate in the alignment increases with the divergence of the proteins. Structure-based alignments are considered to be the most accurate form of alignment; hence, in this study, a structure alignment was used in constructing the alignments. The structure alignment was constructed from 10 OBPs of the OBP gene family using COMPARER (Sali and Blundell 1990). However, the use of the structure alignment as profiles was restricted to seven members in the case of OBPs and two members for the D7 family due to the limited amount of structural data (data not shown). The OBP and D7 sequences were aligned to their respective structure alignments as profiles, and a combined alignment of the two families of proteins was constructed using the profile–profile alignment option of ClustalX (Thompson et al. 1994, 1997; Jeanmougin et al. 1998). The alignments were truncated, based on the structure alignment, at the N-terminal end, which corresponds to the signal peptide region that has a high substitution rate; the C-terminal ends were retained, however, because of the extended C-terminal region of the Atypical subfamily members of the OBP family. This method was applied for aligning the sequences in all three genomes. Alignments for the different subclasses were constructed with sequences from all three mosquito genomes and, in the case of the Classic subfamily, along with Drosophila sequences. The alignments of the Atypical and PlusC subclasses of OBPs were, however, not based on the structure alignment.

Phylogenetic Analysis

The phylogenetic trees were inferred using the Neighbor-Joining method (Saitou and Nei 1987) in MEGA 4.0 (Tamura et al. 2007). The percentage of replicate trees in which the associated sequences cluster together in the bootstrap test (1000 replicates) is shown next to the branches of the bootstrap consensus trees (Felsenstein 1985). Branches with <50% bootstrap support were collapsed for the PlusC and Atypical OBP trees. The branches were not collapsed for the Classic OBP tree; however, the subtype definition was still based on the 50% bootstrap cutoff. The evolutionary distances were computed using the Poisson correction method (Zuckerkandl and Pauling 1965) and are in units of the number of amino acid substitutions per site. Positions containing alignment gaps and missing data were eliminated only in pairwise sequence comparisons (pairwise deletion option). The trees were rooted at the branches of the D7 family of proteins, which was considered as an outgroup (supplementary fig. S2, Supplementary Material online). The trees of the different subclasses (figs. 4–6) used for the comparative analysis of the different genomes were analyzed as unrooted trees.
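
As an illustration of the distance measure named above, the Poisson-corrected distance between two aligned protein sequences is d = −ln(1 − p), where p is the proportion of differing sites. The following Python sketch computes it with pairwise deletion of gapped or missing positions. It is an independent, minimal example and not the MEGA implementation; the function name `poisson_distance` and the choice of `-` and `?` as gap and missing-data symbols are assumptions made for this example.

```python
import math

# Symbols treated as gap / missing data in this sketch (an assumption).
GAP_CHARS = {"-", "?"}


def poisson_distance(seq_a: str, seq_b: str) -> float:
    """Poisson-corrected distance, d = -ln(1 - p), with pairwise deletion.

    p is the fraction of differing residues among the aligned columns in
    which neither sequence has a gap or missing-data symbol.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    differences = 0
    for a, b in zip(seq_a, seq_b):
        if a in GAP_CHARS or b in GAP_CHARS:
            continue  # pairwise deletion: skip this column for this pair only
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    p = differences / compared
    if p >= 1.0:
        return math.inf  # the correction is undefined when every site differs
    return -math.log(1.0 - p)
```

With pairwise deletion, a column is ignored only for the pair in which one of the two sequences has a gap, so highly divergent families with many indels still contribute most of their aligned positions to each comparison.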

Orthology, Paralogy, Chromosomal Mapping, and Tentative Syntenic Analysis

OBP orthologs were identified using the reciprocal best BLAST hit approach (Moreno-Hagelsieb and Latimer 2008), which is widely used in the detection of orthologs. The inParanoid database (O’Brien et al. 2005) was used to examine the inparalogous relationships between OBPs. At the time of this work, assembled genome data were available only for A. gambiae in Ensembl Genomes (Hubbard et al. 2009) and VectorBase (Lawson et al. 2009). The chromosomal locations of OBPs from A. gambiae were identified using these data. The genome data of Aed. aegypti and C. quinquefasciatus as featured to date in Ensembl Genomes and VectorBase are not yet assembled into chromosomes and were used to map the OBP genes in these genomes at the supercontig level. The exact chromosomal locations are known for only about 10% of their supercontigs, very few of which harbor OBP genes. Orthologous OBP genes identified as described above were used to establish putative synteny between chromosomal segments from A. gambiae and supercontigs from the two Culicinae species. The genes were mapped to their respective locations on the chromosomes or supercontigs (supplementary fig. S1, Supplementary Material online). The chromosomes of A. gambiae were used as reference and are represented as yellow bars, and the contigs of Aedes and Culex are represented in purple and green, respectively. The direct three-way (1:1:1) orthology relationships among the three genomes are represented as green lines. The two-way (1:1) orthology relationships between two species are represented as black lines, and the inparalogy relationships are represented as red lines. The figures of the chromosomal mapping were drawn to scale using Adobe Illustrator CS5.
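
The reciprocal best hit criterion can be sketched in a few lines of Python. This is a generic illustration of the idea, not the procedure run in the study: it assumes that the BLAST results for each search direction have already been reduced to (query, subject, bit score) triples, and the function names and gene identifiers used in the usage example are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

HitTriple = Tuple[str, str, float]  # (query, subject, bit score)


def best_hits(hits: Iterable[HitTriple]) -> Dict[str, str]:
    """Return, for each query, the subject with the highest bit score."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(
    a_vs_b: Iterable[HitTriple], b_vs_a: Iterable[HitTriple]
) -> List[Tuple[str, str]]:
    """Pairs (a, b) where b is the best hit of a and a is the best hit of b."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)
```

For example, with hypothetical identifiers, `reciprocal_best_hits([("a1", "b1", 200.0), ("a2", "b1", 150.0)], [("b1", "a1", 198.0)])` returns `[("a1", "b1")]`: `a2` also hits `b1`, but `b1` points back to `a1`, so only the first pair is reported as a putative 1:1 ortholog pair.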

Atypical Domain Analysis

The two constitutive PBP/GOBP domains of Atypical OBPs were further characterized for their relationship with Classic or PlusC OBPs. For each Atypical OBP, the boundary between the N-term and C-term PBP/GOBP domains was manually delimited. This was performed by subjecting the full-length sequence to Pfam (Finn et al. 2010) and the Conserved Domain Database (Marchler-Bauer et al. 2011) and was further validated by analyzing the cysteine profiles. Each N-term and C-term domain thus delimited was then subjected to a PSI-BLAST search (E-value cutoff of 1e−2) against a database containing all OBPs from the same mosquito species, in an attempt to find putative distantly related single-domain OBPs. A similar search was performed against a database of Drosophila OBPs.
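
The kind of cysteine profile used to check a domain boundary can be sketched as follows. The boundaries in this study were delimited manually, so this Python example only shows how a chosen boundary splits a sequence and how cysteine positions and spacings can be tabulated for each half. The function names and the toy sequence in the usage note are hypothetical and do not correspond to any real OBP.

```python
from typing import List, Tuple


def split_domains(seq: str, boundary: int) -> Tuple[str, str]:
    """Split a sequence after residue `boundary` (1-based, chosen manually)."""
    if not 0 < boundary < len(seq):
        raise ValueError("boundary must fall inside the sequence")
    return seq[:boundary], seq[boundary:]


def cysteine_positions(seq: str) -> List[int]:
    """1-based positions of cysteine residues in a sequence."""
    return [i + 1 for i, residue in enumerate(seq.upper()) if residue == "C"]


def cysteine_spacing(seq: str) -> List[int]:
    """Number of residues between consecutive cysteines."""
    positions = cysteine_positions(seq)
    return [b - a - 1 for a, b in zip(positions, positions[1:])]
```

As a usage example, `split_domains("AACAAACAAAACGGCGGGCGC", 12)` returns `("AACAAACAAAAC", "GGCGGGCGC")`, and `cysteine_spacing` gives `[3, 4]` for the first half and `[3, 1]` for the second. Comparing such spacings between the two halves and with single-domain OBPs is one simple way to see whether each half carries a plausible cysteine pattern.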

Supplementary Material

Supplementary figures S1–S3 and tables S1–S4 are available at Genome Biology and Evolution online (http://www.gbe.oxfordjournals.org/).
References (51 in total)

1.  Identification and expression of odorant-binding proteins of the malaria-carrying mosquitoes Anopheles gambiae and Anopheles arabiensis.

Authors:  Zheng-Xi Li; John A Pickett; Linda M Field; Jing-Jiang Zhou
Journal:  Arch Insect Biochem Physiol       Date:  2005-03       Impact factor: 1.698

2.  Definition of general topological equivalence in protein structures. A procedure involving comparison of properties and relationships through simulated annealing and dynamic programming.

Authors:  A Sali; T L Blundell
Journal:  J Mol Biol       Date:  1990-03-20       Impact factor: 5.469

3.  The neighbor-joining method: a new method for reconstructing phylogenetic trees.

Authors:  N Saitou; M Nei
Journal:  Mol Biol Evol       Date:  1987-07       Impact factor: 16.240

4.  Members of a family of Drosophila putative odorant-binding proteins are expressed in different subsets of olfactory hairs.

Authors:  C W Pikielny; G Hasan; F Rouyer; M Rosbash
Journal:  Neuron       Date:  1994-01       Impact factor: 17.173

5.  The major acid soluble proteins of adult female Anopheles darlingi salivary glands include a member of the D7-related family of proteins.

Authors:  E Calvo; A G deBianchi; A A James; O Marinotti
Journal:  Insect Biochem Mol Biol       Date:  2002-11       Impact factor: 4.714

6.  Intriguing olfactory proteins from the yellow fever mosquito, Aedes aegypti.

Authors:  Yuko Ishida; Angela M Chen; Jennifer M Tsuruda; Anthon J Cornel; Mustapha Debboun; Walter S Leal
Journal:  Naturwissenschaften       Date:  2004-08-24

7.  Pheromone binding and inactivation by moth antennae.

Authors:  R G Vogt; L M Riddiford
Journal:  Nature       Date:  1981 Sep 10-16       Impact factor: 49.962

8.  Genome-wide analysis of the odorant-binding protein gene family in Drosophila melanogaster.

Authors:  Daria S Hekmat-Scafe; Charles R Scafe; Aimee J McKinney; Mark A Tanouye
Journal:  Genome Res       Date:  2002-09       Impact factor: 9.043

9.  The salivary glands and saliva of Anopheles gambiae as an essential step in the Plasmodium life cycle: a global proteomic study.

Authors:  Valérie Choumet; Annick Carmi-Leroy; Christine Laurent; Pascal Lenormand; Jean-Claude Rousselle; Abdelkader Namane; Charles Roth; Paul T Brey
Journal:  Proteomics       Date:  2007-09       Impact factor: 3.984

10.  Identification of a distinct family of genes encoding atypical odorant-binding proteins in the malaria vector mosquito, Anopheles gambiae.

Authors:  P X Xu; L J Zwiebel; D P Smith
Journal:  Insect Mol Biol       Date:  2003-12       Impact factor: 3.585
