James H Thomas, Hugh M Robertson.
Abstract
BACKGROUND: Chemoreceptor proteins mediate the first step in the transduction of environmental chemical stimuli, defining the breadth of detection and conferring stimulus specificity. Animal genomes contain families of genes encoding chemoreceptors that mediate taste, olfaction, and pheromone responses. The size and diversity of these families reflect the biology of chemoperception in specific species.

RESULTS: Based on manual curation and sequence comparisons among putative G-protein-coupled chemoreceptor genes in the nematode Caenorhabditis elegans, we identified approximately 1300 genes and 400 pseudogenes in the 19 largest gene families, most of which fall into larger superfamilies. In the related species C. briggsae and C. remanei, we identified most or all genes in each of the 19 families. For most families, C. elegans has the largest number of genes and C. briggsae the smallest number, suggesting changes in the importance of chemoperception among the species. Protein trees reveal family-specific and species-specific patterns of gene duplication and gene loss. The frequency of strict orthologs varies among the families, from just over 50% in two families to less than 5% in three families. Several families include large species-specific expansions, mostly in C. elegans and C. remanei.

CONCLUSION: Chemoreceptor gene families in Caenorhabditis species are large and evolutionarily dynamic as a result of gene duplication and gene loss. These dynamics shape the chemoreceptor gene complements in Caenorhabditis species and define the receptor space available for chemosensory responses. To explain these patterns, we propose the gray pawn hypothesis: individual genes are of little significance, but the aggregate of a large number of diverse genes is required to cover a large phenotype space.
Year: 2008 PMID: 18837995 PMCID: PMC2576165 DOI: 10.1186/1741-7007-6-42
Source DB: PubMed Journal: BMC Biol ISSN: 1741-7007 Impact factor: 7.431
Figure 1. Sample protein tree. This tree is a section from the complete SRG protein tree (Additional file 20). Protein names are colored by species (green is Caenorhabditis elegans, blue is C. briggsae, and red is C. remanei). In addition to identifiers, each name includes the genome position of the corresponding gene. Open circles on branches indicate a branch support value of 0.9 or higher, as computed by phyml-alrt. The scale bar indicates number of amino acid changes per site. Probable strict ortholog trios are marked with filled black squares. A representative gene expansion in C. elegans is marked and a view of the gene arrangement is expanded to the right (adapted from the WormBase genome browser, WS170). An alignment of four of the C. elegans proteins from this gene expansion is shown in the lower right. Blue coloring is proportional to amino acid conservation.
Figure 2. Negative correlation between orthology and nonfunctional gene frequencies in C. elegans. Each of the 19 gene families is plotted once. The X-axis is the fraction of C. elegans genes in the family with single orthologs in both C. briggsae and C. remanei. The Y-axis is the fraction of C. elegans genes in the family with probable defective alleles in the N2 reference genome sequence. The correlation shown is much stronger than that between family size and fraction of defective genes (see Additional file 9). R is the Pearson correlation coefficient.
E-box matches for all promoters
| Superfamily | Family | Promoters | E-box Matches | Percent with E-box | P-value (uncorrected) |
| Sra | sra | 39 | 0 | (0) | NS |
| Sra | srab | 27 | 1 | (3.7) | NS |
| Sra | srb | 19 | 0 | (0) | NS |
| solo | srbc | 64 | 0 | (0) | NS |
| Str | srd | 71 | 0 | (0) | NS |
| Sra | sre | 55 | 14 | 25.5 | <0.0001 |
| Srg | srg | 69 | 4 | 5.8 | NS |
| Str | srh | 304 | 104 | 34.2 | <0.0001 |
| Str | sri | 78 | 39 | 50.0 | <0.0001 |
| solo | srsx | 35 | 1 | (2.9) | NS |
| Srg | srt | 73 | 1 | (1.4) | NS |
| Srg | sru | 45 | 4 | 8.9 | 0.02 |
| Srg | srv | 36 | 2 | 5.6 | NS |
| solo | srw | 148 | 8 | 5.4 | 0.04 |
| Srg | srx | 137 | 3 | 2.2 | NS |
| Srg | srxa | 16 | 2 | 12.5 | NS |
| solo | srz | 104 | 51 | 49.0 | <0.0001 |
| Str | str/srj | 323 | 9 | 2.8 | NS |
| All SR | | 1675 | 246 | 14.7 | |
| All genes | | 20569 | 500 | 2.4 | NA |
E-box hits are the 500 highest scoring motif matches in promoters from -200 to -20 relative to the translation start codon. Percent values based on zero or one hit are in parentheses to indicate high uncertainty. The uncorrected P-value shown was determined by comparison with the number of hits in all promoters (Fisher's exact test for small families and the χ-square approximation for large families). After Bonferroni correction for multiple testing, the sru and srw P-values are not significant. The str and srj families are closely related and were analyzed together. A few small chemoreceptor families are not shown individually, so the numbers shown do not add up to the SR total. SR, all members of putative chemoreceptor families; NA, not applicable; NS, not significant.
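The test described in the table note can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' actual procedure: it assumes each family is compared against all remaining promoters (the note does not say exactly how the 2x2 table was built), it assumes a two-sided test, and it assumes 18 tests for the Bonferroni correction (one per family row shown). The function names `fisher_exact_two_sided` and `bonferroni` are hypothetical helpers written for this example.

```python
from math import comb


def fisher_exact_two_sided(hits_a, total_a, hits_b, total_b):
    """Two-sided Fisher's exact P-value comparing hits_a/total_a with hits_b/total_b."""
    n = total_a + total_b          # all promoters in both groups
    hits = hits_a + hits_b         # all E-box hits in both groups
    denom = comb(n, total_a)

    def prob(x):
        # Hypergeometric probability of x hits falling in group A.
        return comb(hits, x) * comb(n - hits, total_a - x) / denom

    lo = max(0, total_a - (n - hits))
    hi = min(total_a, hits)
    p_obs = prob(hits_a)
    # Sum every outcome that is no more likely than the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def bonferroni(p_value, n_tests):
    """Bonferroni-corrected P-value, capped at 1."""
    return min(1.0, p_value * n_tests)


# Counts taken from the table: the sru row and the "All genes" row.
ALL_PROMOTERS, ALL_HITS = 20569, 500
SRU_PROMOTERS, SRU_HITS = 45, 4
N_TESTS = 18  # assumption: one test per family row shown in the table

# Assumption: sru is compared against every promoter outside sru.
p_sru = fisher_exact_two_sided(
    SRU_HITS, SRU_PROMOTERS,
    ALL_HITS - SRU_HITS, ALL_PROMOTERS - SRU_PROMOTERS,
)
p_sru_corrected = bonferroni(p_sru, N_TESTS)

print(f"sru uncorrected P = {p_sru:.3f}")
print(f"sru Bonferroni-corrected P = {p_sru_corrected:.3f}")
```

Under these assumptions the sru family falls below 0.05 before correction and above it afterwards, which matches the qualitative statement in the note. The exact values need not match the published column, since the authors' table construction and number of tests may differ.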