Kelly E O'Quin, Daniel Smith, Zan Naseer, Jane Schulte, Samuel D Engel, Yong-Hwee E Loh, J Todd Streelman, Jeffrey L Boore, Karen L Carleton.
Abstract
BACKGROUND: Divergence within cis-regulatory sequences may contribute to the adaptive evolution of gene expression, but functional alleles in these regions are difficult to identify without abundant genomic resources. Among African cichlid fishes, the differential expression of seven opsin genes has produced adaptive differences in visual sensitivity. Quantitative genetic analysis suggests that cis-regulatory alleles near the SWS2-LWS opsins may contribute to this variation. Here, we sequence BACs containing the opsin genes of two cichlids, Oreochromis niloticus and Metriaclima zebra. We use phylogenetic footprinting and shadowing to examine divergence in conserved non-coding elements, promoter sequences, and 3'-UTRs surrounding each opsin in search of candidate cis-regulatory sequences that influence cichlid opsin expression.
Year: 2011 PMID: 21554730 PMCID: PMC3116502 DOI: 10.1186/1471-2148-11-120
Source DB: PubMed Journal: BMC Evol Biol ISSN: 1471-2148 Impact factor: 3.260
Figure 1. Conservation between . A) SWS1 opsin-containing region. B) SWS2-LWS opsin-containing region. C) RH2 opsin-containing region. Top line represents O. niloticus BAC sequence. Conserved non-coding elements (CNEs) are numbered and highlighted in red; repetitive sequences are highlighted in green; promoter sequences later examined for interspecific polymorphism are highlighted in blue.
List of candidate transcription factors surveyed in this study
| Transcription Factor | Symbol | OMIM1 # | TESS2 # (mice) | Opsin(s) affected | Ref(s) |
|---|---|---|---|---|---|
| Activator Protein 1 | AP-1 | 165160 | T00032 | | [ |
| Cone-rod homeobox-protein | CRX/OTX | 602225 | T03461 | | [ |
| Nuclear Factor kappa B | NFκB | 164011 | T00588 | | [ |
| Photoreceptor-specific nuclear receptor | PNR | 604485 | T03723* | SWS | [ |
| Retinoic Acid Receptor α | RARα | 180240 | T01327 | | [ |
| Retinoic Acid Receptor β | RARβ | 180220 | T01328 | | [ |
| Retinoic Acid Receptor γ | RARγ | 180190 | T01329 | | [ |
| Retinoid X Receptor α | RXRα | 180245 | T01331 | - | - |
| Retinoid X Receptor β | RXRβ | 180246 | T01332 | - | - |
| Retinoid X Receptor γ | RXRγ | 180247 | T01333 | SWS | [ |
| Thyroid Hormone Receptor α | THRα | 190120 | T01173 | | [ |
| Thyroid Hormone Receptor β | THRβ | 190160 | T00851* | | [ |
1 Online Mendelian Inheritance in Man (http://www.ncbi.nlm.nih.gov/omim)
2 Transcription Element Search System (http://www.cbil.upenn.edu/cgi-bin/tess/tess)
* TESS # for human sequences
Assembly statistics for the O. niloticus and M. zebra opsin-containing BACs
| Species | Opsin array | Clone ID | Estimated clone size (bp) | Sequencing method | Contig size (bp) | Reads assembled (%) | GenBank accession nos. |
|---|---|---|---|---|---|---|---|
| O. niloticus | | T4057DH09 | 210,000 | ABI, 454 | 171,838 | 77 K + 3 K | |
| O. niloticus | | T4075AE05 | 184,000 | ABI | 171,742 | 3072 | |
| O. niloticus | | T4024BG04 | 200,000 | ABI | 177,366 | 3072 | |
| M. zebra | | Mz042C6 | 87,000 | 454 | 77,652 | 79,892 | |
| M. zebra | | Mz045P9 | 96,000 | 454 | 107,624 | 43,135 | |
| M. zebra | | Mz088M22 | 133,000 | 454 | 83,463 | 21,758 | |
1 Estimated clone size based on pulsed-field gel electrophoresis.
Figure 2. Alignment of two putative . A) Alignment of CNE 7a from five fish genomes to dre-miR-726. This region is highly similar among all fish species examined. Black box indicates the mature microRNA sequence. B) Alignment of CNE 7b from five fish genomes to the human LWS-LCR. These sequences show regions of high similarity between humans and fishes. Asterisks (*) indicate positions that are identical among all taxa; colons (:) indicate positions that are identical among four out of five taxa. Boxes highlight conserved transcription factor binding sites.
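The conservation symbols described in the Figure 2 caption can be illustrated with a minimal Python sketch. It assumes pre-aligned sequences of equal length and generalizes "four out of five taxa" to "all but one sequence"; the function name `conservation_line`, the treatment of gap-majority columns, and the toy alignment are illustrative assumptions, not taken from the paper.

```python
from collections import Counter

def conservation_line(aligned_seqs):
    """Return one symbol per alignment column.

    '*' : every sequence has the same residue
    ':' : all but one sequence share the residue (4 of 5 for five taxa)
    ' ' : anything less conserved
    """
    if not aligned_seqs:
        return ""
    n = len(aligned_seqs)
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all aligned sequences must have the same length")

    symbols = []
    for column in zip(*aligned_seqs):
        residue, count = Counter(column).most_common(1)[0]
        if residue == "-":
            # Assumption: columns dominated by gaps are not marked as conserved.
            symbols.append(" ")
        elif count == n:
            symbols.append("*")
        elif count == n - 1:
            symbols.append(":")
        else:
            symbols.append(" ")
    return "".join(symbols)

# Hypothetical five-taxon toy alignment (not real CNE 7 sequence).
toy_alignment = ["ACGTA", "ACGTA", "ACGTC", "ACCTG", "ACGTT"]
print(repr(conservation_line(toy_alignment)))
```

In the toy alignment, the third column differs in one taxon and is marked with a colon, while the last column is variable and is left blank.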
Comparison of sequence similarity and TFBS/miRNA target site divergence for putative cis-regulatory regions surrounding the opsin arrays of O. niloticus and M. zebra
| Region | | Identity (%) | Dxy1 (%) | Length | Length | TFBS divergent | TFBS shared | Est. Pdiv2 (%) | p-value3 |
|---|---|---|---|---|---|---|---|---|---|
| CNE4 | 1 | 96.84 | 3.23 | 158 | 158 | 0 | 2 | 0.0 | 1.000 |
| | 2 | 96.22 | 3.88 | 240 | 239 | 2 | 6 | 25.0 | 0.130 |
| | 3 | 94.74 | 4.53 | 349 | 359 | 7 | 1 | 87.5 | < 0.001* |
| | 4 | 98.31 | 1.70 | 240 | 241 | 2 | 0 | 100.0 | 0.006 |
| | 5 | 96.14 | 3.97 | 207 | 207 | 1 | 0 | 100.0 | 0.080 |
| | 6 | - | - | 300 | - | - | - | - | - |
| | 7 | 97.16 | 2.89 | 882 | 885 | 1 | 8 | 11.1 | 0.528 |
| | 8 | 88.46 | 4.86 | 779 | 799 | 3 | 9 | 25.0 | 0.065 |
| | 9 | 93.93 | 6.33 | 313 | 313 | 1 | 3 | 25.0 | 0.283 |
| | 10 | 97.64 | 2.40 | 127 | 127 | 0 | 0 | - | - |
| | 11 | 95.97 | 4.14 | 124 | 124 | 1 | 1 | 50.0 | 0.154 |
| | 12 | 95.53 | 4.61 | 246 | 249 | 1 | 3 | 25.0 | 0.284 |
| | 13 | 97.66 | 2.37 | 214 | 214 | 1 | 9 | 10.0 | 0.566 |
| | 14 | 88.97 | 4.71 | 999 | 1404 | 1 | 9 | 10.0 | 0.566 |
| | 15 | 95.32 | 4.84 | 428 | 428 | 3 | 6 | 33.3 | 0.030 |
| | 16 | 91.21 | 9.35 | 182 | 191 | 0 | 2 | 0.0 | 1.000 |
| | 17 | 96.14 | 3.96 | 311 | 313 | 2 | 3 | 40.0 | 0.054 |
| | 18 | 93.25 | 7.07 | 1087 | 976 | 5 | 13 | 27.8 | 0.012 |
| | 19 | - | - | 69 | - | - | - | - | - |
| | 20 | 98.88 | 1.13 | 358 | 38 | 1 | 13 | 7.1 | 1.000 |
| Proximal | | 97.56 | 2.48 | 1000 | 1000 | 1 | 16 | 5.9 | 1.000 |
| | | 94.80 | 5.38 | 1000 | 1000 | 10 | 11 | 47.6 | < 0.001* |
| | | 91.77 | 8.60 | 1000 | 1000 | 14 | 19 | 42.4 | < 0.001* |
| | | 61.35 | 9.40 | 1000 | 1000 | 15 | 7 | 68.1 | < 0.001* |
| | | 71.49 | 26.37 | 1000 | 1000 | 18 | 10 | 64.3 | < 0.001* |
| | | 97.19 | 2.87 | 1000 | 1000 | 11 | 12 | 47.8 | < 0.001* |
| | | 81.96 | 16.31 | 1000 | 1000 | 4 | 10 | 28.6 | 0.021 |
| 3'-UTR6 | | 93.39 | 6.92 | 189 | 189 | 1 | 4 | 20.0 | 0.341 |
| | | 94.04 | 6.21 | 438 | 442 | 4 | 9 | 30.8 | 0.016 |
| | | 93.26 | 7.06 | 465 | 460 | 4 | 11 | 26.7 | 0.027 |
| | | 93.15 | 7.18 | 310 | 319 | 4 | 4 | 50.0 | 0.002* |
| | | 96.74 | 3.33 | 217 | 242 | 1 | 3 | 25.0 | 0.284 |
| | | 98.37 | 1.64 | 123 | 123 | 0 | 1 | 0.0 | 1.000 |
| | | 95.90 | 4.21 | 124 | 137 | 4 | 1 | 80.0 | < 0.001* |
1 Pairwise sequence divergence between O. niloticus and M. zebra, corrected for multiple hits.
2 Actual proportion of divergent TFBSs observed for O. niloticus and M. zebra.
3 P-values for the Exact binomial test at a null proportion divergence = 8%. Tests marked with an asterisk (*) are significant after Bonferroni correction for multiple comparisons.
4 See Additional file 6 for individual counts of each TFBS identified for the CNEs.
5 See Figure 3 for individual counts of each TFBS identified for the proximal promoters.
6 See Additional file 7 for individual counts of each microRNA target site identified for the 3'-UTRs.
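Footnotes 2 and 3 can be made concrete with a short Python sketch. It computes Pdiv as divergent / (divergent + shared) and an exact binomial p-value against the 8% null. The one-sided upper tail is an assumption here: the footnote does not state the sidedness, although an upper-tail test gives values close to several tabulated rows (for example 2 divergent and 6 shared sites gives roughly 0.130). The function names are illustrative.

```python
from math import comb

NULL_PDIV = 0.08  # null proportion of divergent sites, taken from footnote 3

def pdiv(divergent, shared):
    """Proportion of divergent sites; None when no sites were found."""
    total = divergent + shared
    if total == 0:
        return None
    return divergent / total

def upper_tail_binomial_p(divergent, shared, p0=NULL_PDIV):
    """Exact P(X >= divergent) for X ~ Binomial(divergent + shared, p0).

    Assumption: one-sided (upper-tail) test; None when no sites were found.
    """
    n = divergent + shared
    if n == 0:
        return None
    return sum(
        comb(n, k) * p0**k * (1 - p0) ** (n - k)
        for k in range(divergent, n + 1)
    )

# Counts for one CNE row of the table: 2 divergent and 6 shared TFBS.
d, s = 2, 6
print(f"Pdiv = {100 * pdiv(d, s):.1f}%  p = {upper_tail_binomial_p(d, s):.3f}")
```

A Bonferroni correction, as used for the asterisked entries, would then compare each p-value against 0.05 divided by the number of tests performed.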
Figure 3. Transcription factor binding site diversity within opsin proximal promoters. A - G) Distribution of ten transcription factor binding sites (TFBS) in the proximal promoters of each opsin in O. niloticus and M. zebra. TFBS labelled in blue are present in O. niloticus only, those in red are present in M. zebra only, and those in black are found in both species. Sites labelled simply RAR correspond to all three retinoic acid receptor paralogs. The orientation of factors above or below the central reference line has no special meaning, although O. niloticus-only sites are generally above the line, and M. zebra-only sites are below it. H) Comparison of the average number of binding sites for each transcription factor in the proximal promoters of the opsins and seven randomly-selected, non-opsin genes in O. niloticus. On average, the opsins contain significantly greater numbers of binding sites for these transcription factors compared to the non-opsin genes.
Conserved microRNA target sites within the 3'-UTRs of each opsin in O. niloticus and M. zebra
| Opsin | miRNA | Target | Conserved1 | Function and expression | Ref(s) |
|---|---|---|---|---|---|
| | miR-725 | TGACTGAG | GA | Expressed in fins | [ |
| | miR-217 | ATGCAGTA | GA | Alters | [ |
| | miR-181a | AGAATGTA | DR | T-cell regulation; found in eye | [ |
| | miR-23b | TATGTGAA | TR | Ganglion apoptosis; found in eye | [ |
| | miR-96 | TTGCCAAA | OL | Sensory organ specific; found in eye | [ |
| | miR-182a | TTGCCAAA | OL | Sensory organ specific; found in eye | [ |
| | miR-728 | TTTAGTAA | GA,TN,TR | Unknown; found in eye | [ |
| | miR-722* | GCAAAAAA | TR | Unknown; found in eye | [ |
1 Other fish species in which this target site is also found: GA = stickleback (G. aculeatus), DR = zebrafish (D. rerio), TR = fugu (T. rubripes), TN = pufferfish (T. nigroviridis), OL = medaka (O. latipes)
* This site is present in O. niloticus only
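As a rough illustration of how the 8-nt target sites listed above could be located in a 3'-UTR, the Python sketch below scans a sequence for exact matches. Exact string matching is a simplifying assumption, since the prediction method actually used is not described in this section. The function name `find_target_sites` and the toy UTR are hypothetical; the target strings are copied from the table.

```python
def find_target_sites(utr, target):
    """Return 0-based start positions of every exact (possibly overlapping) match."""
    if not target:
        raise ValueError("target must be a non-empty sequence")
    # Normalise case and treat RNA (U) and DNA (T) notation alike.
    utr = utr.upper().replace("U", "T")
    target = target.upper().replace("U", "T")
    positions = []
    start = utr.find(target)
    while start != -1:
        positions.append(start)
        start = utr.find(target, start + 1)
    return positions

# Target sequences taken from the table above.
TARGETS = {
    "miR-725": "TGACTGAG",
    "miR-96": "TTGCCAAA",
    "miR-728": "TTTAGTAA",
}

# Hypothetical toy UTR, constructed only to demonstrate the scan.
toy_utr = "GGATTGCCAAACCTGACTGAGTT"
for mirna, target in TARGETS.items():
    print(mirna, find_target_sites(toy_utr, target))
```

Running the same scan on the orthologous UTR from each species and comparing the hit lists would separate shared sites from species-specific ones.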
Polymorphism statistics for 8 candidate cis-regulatory regions in 18 Lake Malawi cichlid species
| Opsin | Length (bp) | S1 | s2 | H3 | π4 | C5 | D6 | CRX7 |
|---|---|---|---|---|---|---|---|---|
| | 1000 | 16 | 5 | 17 | 0.0020 | 0.983 | -1.4424 | 1 |
| | 694 | 2 | 1 | 3 | 0.0008 | 0.997 | 0.2951 | 0 |
| | 1000 | 7 | 1 | 6 | 0.0010 | 0.992 | -1.1518 | 2 |
| | 950 | 17 | 3 | 15 | 0.0022 | 0.982 | -1.1050 | 2 |
| | 956 | 12 | 2 | 11 | 0.0012 | 0.987 | -1.2394 | 0 |
| CNE 10 | 882 | 12 | 1 | 10 | 0.0021 | 0.986 | -0.2311 | 0 |
| | 442 | 2 | 0 | 4 | 0.0013 | 0.995 | 0.4486 | NA |
| | 436 | 1 | 0 | 2 | 0.0006 | 0.998 | 0.0298 | NA |
1 Total number of segregating sites
2 Total number of segregating sites that are singletons
3 Total number of haplotypes
4 Nucleotide diversity
5 Sequence conservation
6 Tajima's D
7 Total number of segregating sites that interrupt predicted CRX binding sites
* Statistics presented for in/del polymorphism
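The statistics in footnotes 1, 4 and 6 can be sketched in Python as follows. This is a minimal illustration using the standard Tajima (1989) formulas, assuming gap-free aligned sequences of equal length with one sequence per sampled chromosome; it is not necessarily the software or settings used for the table, and the function names and toy alignment are hypothetical.

```python
from itertools import combinations
from math import sqrt

def segregating_sites(seqs):
    """Number of alignment columns with more than one allele (S)."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)

def mean_pairwise_differences(seqs):
    """Average number of differences over all sequence pairs (k)."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total / len(pairs)

def nucleotide_diversity(seqs):
    """Per-site nucleotide diversity (pi = k / alignment length)."""
    return mean_pairwise_differences(seqs) / len(seqs[0])

def tajimas_d(seqs):
    """Tajima's D; None when it is undefined (n < 4 or no segregating sites)."""
    n = len(seqs)
    s = segregating_sites(seqs)
    if n < 4 or s == 0:
        return None
    k = mean_pairwise_differences(seqs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    return (k - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))

# Hypothetical toy alignment: one intermediate-frequency SNP at the last site.
toy = ["ACGT", "ACGA", "ACGT", "ACGA"]
print(segregating_sites(toy), nucleotide_diversity(toy), tajimas_d(toy))
```

An excess of rare variants (many singletons relative to S) pushes D below zero, which is the direction of most D values in the table.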
Figure 4. Interspecific polymorphism in eight putative . A - H) Minor allele frequency (MAF; in red) and nucleotide diversity (π; in black) calculated in a sliding window across the proximal promoter regions of five opsins (A - E), CNE 7 (LWS-LCR) (F), and two opsin 3'-UTRs (G - H) using 18 Lake Malawi cichlid species. Numbers above peaks of MAF and π denote the position of SNPs analyzed for allelic-association with opsin expression (see Table 6); asterisks (*) denote polymorphisms that interrupt CRX binding sites.
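The sliding-window profile described in the Figure 4 caption can be sketched as below. The window size and step are not given in this section, so `window` and `step` are hypothetical parameters, and the per-site MAF definition (frequency of the second most common allele) is an assumption made for illustration.

```python
from collections import Counter

def minor_allele_frequency(column):
    """Frequency of the second most common allele in one alignment column."""
    counts = Counter(column)
    if len(counts) < 2:
        return 0.0  # monomorphic site
    return counts.most_common(2)[1][1] / len(column)

def sliding_window_mean(values, window, step=1):
    """Return (start, mean) for each full window of per-site values."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [
        (start, sum(values[start:start + window]) / window)
        for start in range(0, len(values) - window + 1, step)
    ]

def maf_profile(seqs, window, step=1):
    """Sliding-window mean MAF across an alignment of equal-length sequences."""
    per_site = [minor_allele_frequency(column) for column in zip(*seqs)]
    return sliding_window_mean(per_site, window, step)

# Hypothetical toy alignment of four alleles; only the last site varies.
toy = ["ACGT", "ACGA", "ACGT", "ACGA"]
print(maf_profile(toy, window=2))
```

For scale, eight copies of the minor allele among 36 sampled alleles gives 8/36 ≈ 0.222. Per-site π values can be passed through `sliding_window_mean` in the same way.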
Results of allelic association between SNPs underlying peaks of nucleotide diversity and opsin expression in 18 Lake Malawi cichlid species
| Polymorphism distance from TSS | Type | MAF1 | | t-value | P-value |
|---|---|---|---|---|---|
| SWS1 -54 | C*T | 0.222 | -0.279 | -0.911 | > 0.05 |
| SWS2B -208 | C*T | 0.417 | < 0.001 | 0.003 | > 0.05 |
| SWS2B -55 | 1 bp indel | 0.444 | 0.240 | 0.789 | > 0.05 |
| SWS2A -224* | C*T | 0.222 | 0.127 | 1.037 | > 0.05 |
| SWS2A -217* | 8 bp indel | 0.194 | 0.392 | 1.841 | 0.087 |
| RH2B -308 | C*G | 0.167 | -0.245 | -0.893 | > 0.05 |
| RH2B -161 | C*T | 0.111 | 0.263 | 3.447 | 0.004 |
| LWS -208 | C*T | 0.167 | 0.355 | 1.002 | > 0.05 |
| CNE-7 183 | A*T | 0.222 | 0.055 | -0.673 | > 0.05 |
| CNE-7 570 | C*T | 0.417 | 0.608 | 2.237 | 0.041 |
| SWS2B-UTR 197 | A*C | 0.306 | 0.349 | 1.264 | > 0.05 |
* These polymorphisms interrupt CRX transcription factor binding sites
Figure 5. Divergence among coding and non-coding regions in . A) Pairwise sequence divergence (Dxy) between O. niloticus and M. zebra for different coding and non-coding regions of the opsin-containing BACs. Average Jukes-Cantor-corrected Dxy is higher among 5' proximal promoter regions for each opsin. B) Venn diagram of proportion of shared and divergent TFBS and microRNA target sites among non-coding regions examined in this study. Opsin promoter regions exhibit slightly elevated proportions of divergent sites compared to either CNEs or 3'-UTRs. C) Comparison of proportion divergent TFBS/miRNA target sites (Pdiv) and pairwise sequence divergence (Dxy). Non-coding sequences with elevated Pdiv do not necessarily exhibit increased Dxy, even among proximal promoter regions. Filled points are those sequences with Pdiv values that differ significantly from 8% (see also Table 3).
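The Jukes-Cantor correction mentioned for Dxy can be written as d = -(3/4) ln(1 - 4p/3), where p is the observed proportion of differing sites. The Python sketch below is a minimal illustration with a hypothetical function name. It assumes p comes from substitutions only, so regions whose reported identity is lowered by indels will not reproduce the tabulated Dxy from identity alone.

```python
from math import log

def jukes_cantor_distance(p):
    """Jukes-Cantor corrected distance for an observed proportion of differences p."""
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to be defined")
    return -0.75 * log(1 - 4 * p / 3)

# Example: a region with 96.84% identity, i.e. p = 0.0316 observed differences.
identity = 0.9684
d = jukes_cantor_distance(1 - identity)
print(f"corrected divergence = {100 * d:.2f}%")
```

For small p the correction is slight (here the corrected value is only a little above the raw 3.16%), and it grows as multiple substitutions at the same site become more likely.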