Kristopher J L Irizarry, Randall L Bryden.
Abstract
Color variation provides the opportunity to investigate the genetic basis of evolution and selection. Reptiles are less studied than mammals. Comparative genomics approaches allow for knowledge gained in one species to be leveraged for use in another species. We describe a comparative vertebrate analysis of conserved regulatory modules in pythons aimed at assessing bioinformatics evidence that transcription factors important in mammalian pigmentation phenotypes may also be important in python pigmentation phenotypes. We identified 23 python orthologs of mammalian genes associated with variation in coat color phenotypes for which we assessed the extent of pairwise protein sequence identity between pythons and mouse, dog, horse, cow, chicken, anole lizard, and garter snake. We next identified a set of melanocyte/pigment associated transcription factors (CREB, FOXD3, LEF-1, MITF, POU3F2, and USF-1) that exhibit relatively conserved sequence similarity within their DNA binding regions across species based on orthologous alignments across multiple species. Finally, we identified 27 evolutionarily conserved clusters of transcription factor binding sites within ~200-nucleotide intervals of the 1500-nucleotide upstream regions of AIM1, DCT, MC1R, MITF, MLANA, OA1, PMEL, RAB27A, and TYR from Python bivittatus. Our results provide insight into pigment phenotypes in pythons.
Year: 2016 PMID: 27698666 PMCID: PMC5028829 DOI: 10.1155/2016/1286510
Source DB: PubMed Journal: Adv Bioinformatics ISSN: 1687-8027
Figure 1. Color and pattern phenotypes in ball pythons (Part 1). Observed variation in pigmentation and color morphological phenotypes provides evidence for genetic modulation of these phenotypes. (a) Homozygous cinnamon (codominant), heterozygous banana (codominant). (b) Homozygous piebald (recessive), heterozygous pastel (codominant). (c) Heterozygous black pastel (codominant), heterozygous pinstripe (dominant). (d) Wild type. (e) Piebald (recessive). (f) Heterozygous cinnamon (codominant), heterozygous pastel (codominant). (g) Heterozygous pastel (codominant). (h) Heterozygous Lesser (codominant), heterozygous pinstripe (dominant). (i) Homozygous piebald (recessive). (j) Pinstripe (dominant).
Python orthologs of mammalian coat color genes and associated pairwise percent identity.
| Gene | pyth2hum | pyth2mouse | pyth2dog | pyth2horse | pyth2cow | pyth2chick | pyth2anole | pyth2Tham_sirt |
|---|---|---|---|---|---|---|---|---|
| ASIP | 46 | 39 | 45 | 39 | 48 | 52 | 79 | 86 |
| KIT | 64 | 64 | 64 | 64 | 65 | 71 | 83 | 89 |
| MGRN1 | 74 | 73 | 77 | 73 | 74 | 78 | 80 | 92 |
| PMEL | 40 | 48 | 39 | 41 | 39 | 50 | 66 | 77 |
| OCA2 | 86 | 83 | 86 | 86 | 73 | 89 | 93 | 82 |
| MYO5A | 89 | 88 | 88 | 86 | 89 | 89 | 92 | 82 |
| TRPM1 | 76 | 74 | 76 | 76 | 76 | 84 | 86 | 91 |
| RAB27A | 90 | 92 | 90 | 92 | 91 | 93 | 94 | 95 |
| SLC24A5 | 76 | 71 | 75 | 76 | 75 | 77 | 85 | NA |
| EDN3 | 42 | 41 | 43 | 48 | 42 | 64 | 60 | 78 |
| MITF | 80 | 77 | 80 | 79 | 72 | 88 | 94 | 94 |
| SLC2A9 | 68 | 67 | 68 | 71 | 68 | 73 | 83 | 88 |
| RAB38 | 74 | 76 | 76 | 78 | 78 | 82 | 83 | NA |
| PAX3 | 98 | 96 | 97 | 97 | 99 | 97 | 99 | 99 |
| ATRN | 79 | 80 | 80 | 95 | 77 | 83 | 87 | 91 |
| OSTM1 | 62 | 57 | 59 | 62 | 57 | 60 | 66 | 86 |
| MCOLN3 | 82 | 80 | 82 | 78 | 81 | 85 | 90 | 95 |
| KITLG | 40 | 49 | 46 | 46 | 46 | 62 | 73 | 85 |
| TYR | 71 | 72 | 73 | 73 | 74 | 76 | 81 | 87 |
| LYST | 64 | 62 | 63 | 63 | 63 | 67 | 79 | 92 |
| SOX10 | 82 | 83 | 83 | 79 | 82 | 95 | 97 | 99 |
| STX17 | 74 | 71 | 74 | 74 | 74 | 76 | 81 | 87 |
| EDNRB | 64 | 61 | 62 | 61 | 60 | 80 | 79 | 64 |
pyth2hum: python and human; pyth2mouse: python and mouse; pyth2dog: python and dog; pyth2horse: python and horse; pyth2cow: python and cow; pyth2chick: python and chicken; pyth2anole: python and anole lizard; pyth2Tham_sirt: python and garter snake. Identity: percent identity of the best BLAST hit between proteins.
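The identity values in the table come from best BLAST hits. As a rough illustration of what a pairwise percent identity measures, the sketch below computes it from two protein strings that are already aligned. The function name, the gap character, and the toy sequences are assumptions made for illustration. Identity is taken here as identical columns divided by alignment length, which is one common convention and may not match the exact computation behind the table.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where both sequences carry a gap are ignored; a residue paired
    with a gap counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # column carries no information
        columns += 1
        if a == b:
            matches += 1
    return 100.0 * matches / columns if columns else 0.0


# Toy aligned fragments (hypothetical, not taken from the python genome).
fragment_a = "MKT-LLVA"
fragment_b = "MKSALL-A"
print(percent_identity(fragment_a, fragment_b))  # 5 identical columns out of 8
```

Rounding such a value to the nearest integer gives numbers on the same scale as the table entries.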
Figure 3. Heat map of orthologous coat color genes illustrating pairwise percent identity. Twenty-three orthologous genes modulating pigmentation and coat color in mammals are shown as a heat map illustrating pairwise protein sequence percent identity based on the BLASTP best hit between the listed species. The highest percent identity is shown in red, moderate percent identity is shown in magenta, and the lowest percent identity is shown in blue. The pleiotropic effects associated with each gene are included within the parentheses immediately following each gene name. Note: gray boxes (RAB38 and SLC24A5) under pyth2Tham_sirt indicate missing sequences; in this case, the garter snake orthologs of these two genes were not available in the NCBI database. pyth2hum: python and human; pyth2mouse: python and mouse; pyth2dog: python and dog; pyth2horse: python and horse; pyth2cow: python and cow; pyth2chick: python and chicken; pyth2anole: python and anole lizard; pyth2Tham_sirt: python and garter snake.
Pigmentation gene expression network components organized by reference and component type.
| Reference | Transcription factors | Promoter/target gene | Regulatory elements | Transcription factor binding sequences | Evolutionary conservation | Year |
|---|---|---|---|---|---|---|
| Bentley et al., 1994 | USF, MITF, SP1 | Tyrosinase, 115 bp fragment | M-box, CR1, CR2 | CATGTG, GGTGGA, GTGATAAT | Turtle, quail, human | 1994 |
| Besch and Berking, 2014 | POU domain TFs, POU3F2 (Brn2) | MITF promoter | | ATGCAAAT | | 2014 |
| Fuse et al., 1996 | MITF | MITF promoter, melanocyte | GATA, CRE, TATA-like | CATGTG | | 1996 |
| Gorkin et al., 2012 | SOX10, MITF, TEAD1, JUND, FOS, JUN | 6-mers and 7-mers are more informative in these analyses than k-mers of other lengths | 2489 putative melanocyte enhancers | ACA[AGC]AG (SOX10); CAC[AG][TG]G (MITF); GGAAT[GT][TC] (TEAD1); [AG]TGA[CG]TCA (JUND, FOS, JUN) | | 2012 |
| Loftus et al., 2009 | MITF, SOX10 | Gpnmb | E-box motif | CACGTG, | Mouse, rat, dog, cat, horse, platypus, chicken | 2009 |
| Moro et al., 1999 | MITF, AP-1, AP-2, SP-1 | MC1R | E-box, TATA box, AP-1, AP-2, SP-1 | GCCTGCGG (AP-2); GCCCGGGG (AP-2); TGACTCAG (AP-1); CAAGTG (E-box); CAGAGT (E-box); CACCTG (E-box); CAGGTG (E-box); CAGGTG (E-box); CAGCTG (E-box) | | 1999 |
| Murisier et al., 2006 | SOX10, SP-1 | Tyrp1 melanocytes | E-box, SP-1 sites | AACAAA; CANNTG (E-box consensus); [AT][AT]CAA[TA] (SP-1) | | 2006 |
| Murisier et al., 2007 | | Tyrosinase, retinal pigmented epithelium | DRE located at −15 kb acts as a strong transcriptional enhancer in melanocytes | | Conserved noncoding sequences (CNS) that might represent putative novel regulatory elements of the Tyr gene | 2007 |
| Murisier et al. | MITF, SOX10, Brn2, Mitf, Otx2, Pax2, Pax3, Pax6, Sox10, Tbx2, USF-1 | Tyrosinase, core enhancer requires the E-boxes and Sox10 motifs | DRE (distal regulatory element), E-box, M-box | CANNTG (E-box); AGTCANNTGCT (M-box); [AT][AT]CAA[TA] (potential SOX binding) | | 2007 |
| Schwahn et al., 2005 | LEF-1 | MITF, tyrosinase, DCT, TRP-1, PMEL17; model for DCT expression in proliferating and senescent normal human melanocytes | M-box that includes the MITF CATGTG, which overlaps with ER-α, USF-1, TFE-3, Isl-1, and AP-1 binding elements | CTTTGGGTCATGTG (LEF-1 and M-box); GGTCATGTGCT (estrogen RE) | | 2005 |
| Vachtenheim and Borovanský, 2010 | MITF, PAX3, LEF-1, SOX10, CREB, POU3F2, USF-1, p53, Tbx2 | Mc1R, EDNRB, RAB27A, OA1, PMEL17, MLANA, Gpnmb, melanostatin I, Aim1, TYR, TRP-1, TRP-2 (DCT), | M-box, E-box, MITF binding motifs | MITF binding sites: AGTCATGTGCT, ACATGTGA, AATCATGTGCT, | | 2010 |
| Vance and Goding, 2004 | MITF, SOX10, LEF1, CREB, PAX3 | MITF promoter | M-box (T) CATGTG (A) | –268 CATTGTC –262 (SOX10), | | 2004 |
| Wan et al., 2011 | MITF, SOX10, PAX3, STAT3, CREB, LEF-1, ITF2, FOXD3, BRN2 (POU3F2) | MITF promoter, tyrosinase promoter | | | | 2011 |
| Watanabe et al., 2002 | MITF, SOX10, PAX3 | MITF M promoter | Distal enhancer | [AT][AT]CAA[AT]G (SOX10); CATTGAA (SOX10-s1); AACAAA (SOX10-s2); TTTTGTT (SOX10-s3); AACAAAA (SOX10-s4) | | 2002 |
Relevant information extracted from references is presented in the table. References are ordered alphabetically by the first author.
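Consensus sequences like those listed in the table (for example CANNTG for the E-box, or bracketed classes such as [AT][AT]CAA[AT]G) can be searched for in an upstream region with ordinary regular expressions. The following is a minimal sketch of such a scan on both strands. It is not the pipeline used in the study; the function names, the motif labels, and the toy sequence are assumptions made for illustration.

```python
import re

# A few consensus strings from the table; the dictionary keys are illustrative labels.
MOTIFS = {
    "E-box": "CANNTG",
    "M-box core": "CATGTG",
    "POU3F2 octamer": "ATGCAAAT",
    "SOX10": "[AT][AT]CAA[AT]G",
}


def consensus_to_regex(consensus: str) -> re.Pattern:
    """Turn a consensus into a regex; N means any base, [..] classes pass through."""
    pattern = consensus.upper().replace("N", "[ACGT]")
    # The lookahead lets overlapping occurrences all be reported.
    return re.compile(f"(?=({pattern}))")


def reverse_complement(seq: str) -> str:
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def scan(seq: str, motifs: dict) -> list:
    """Return sorted (start, strand, label, matched_text) tuples.

    Starts are 0-based offsets on the forward strand. Palindromic motifs
    are reported once per strand.
    """
    seq = seq.upper()
    rc = reverse_complement(seq)
    hits = []
    for label, consensus in motifs.items():
        regex = consensus_to_regex(consensus)
        for m in regex.finditer(seq):
            hits.append((m.start(), "+", label, m.group(1)))
        for m in regex.finditer(rc):
            width = len(m.group(1))
            hits.append((len(seq) - m.start() - width, "-", label, m.group(1)))
    return sorted(hits)


# Hypothetical 20-nt test sequence, not a real python upstream region.
for hit in scan("GGATGCAAATCCCATGTGAA", MOTIFS):
    print(hit)
```

Short consensus strings match frequently by chance, so hits from a scan like this are only candidates; the study additionally relies on clustering and evolutionary conservation.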
InterPro domain annotation for pigmentation associated python transcription factors.
| Gene | Protein name | StartPos | EndPos | DomainId | Domain name |
|---|---|---|---|---|---|
| CREB | cAMP response element-binding protein | 87 | 146 | IPR003102 | Coactivator CBP, pKID |
| CREB | cAMP response element-binding protein | 284 | 342 | IPR004827 | Basic-leucine zipper domain |
| FOXD3 | Forkhead box D3 | 71 | 173 | IPR011991 | Winged helix-turn-helix DNA binding domain |
| LEF-1 | Lymphoid enhancer-binding factor 1 | 1 | 65 | IPR027397 | Catenin binding domain |
| LEF-1 | Lymphoid enhancer-binding factor 1 | 1 | 209 | IPR013558 | CTNNB1 binding, N-terminal |
| LEF-1 | Lymphoid enhancer-binding factor 1 | 293 | 375 | IPR009071 | High mobility group box domain |
| MITF | Microphthalmia-associated transcription factor | 303 | 370 | IPR011598 | Myc-type, basic helix-loop-helix (bHLH) domain |
| MITF | Microphthalmia-associated transcription factor | 393 | 519 | IPR021802 | MiT/TFE transcription factors, C-terminal |
| MITF | Microphthalmia-associated transcription factor | 4 | 142 | IPR031867 | MiT/TFE transcription factors, N-terminal |
| POU3F2 | POU Class 3 Homeobox 2 | 51 | 125 | IPR000327 | POU-specific domain |
| POU3F2 | POU Class 3 Homeobox 2 | 125 | 205 | IPR009057 | Homeodomain-like |
| USF-1 | Upstream transcription factor 1 | 193 | 286 | IPR011598 | Myc-type, basic helix-loop-helix (bHLH) domain |
Start and end positions are provided for each domain detected in the transcription factor protein sequences.
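The domain coordinates in the table can be held in a small lookup structure to ask which annotated domain, if any, covers a given residue. The sketch below assumes the start and end positions are 1-based and inclusive, which the source does not state; the `Domain` type and the `domains_at` helper are hypothetical names introduced for illustration.

```python
from typing import List, NamedTuple


class Domain(NamedTuple):
    gene: str
    start: int  # assumed 1-based, inclusive
    end: int    # assumed 1-based, inclusive
    interpro_id: str
    name: str


# Rows copied from the InterPro annotation table.
DOMAINS = [
    Domain("CREB", 87, 146, "IPR003102", "Coactivator CBP, pKID"),
    Domain("CREB", 284, 342, "IPR004827", "Basic-leucine zipper domain"),
    Domain("FOXD3", 71, 173, "IPR011991", "Winged helix-turn-helix DNA binding domain"),
    Domain("LEF-1", 1, 65, "IPR027397", "Catenin binding domain"),
    Domain("LEF-1", 1, 209, "IPR013558", "CTNNB1 binding, N-terminal"),
    Domain("LEF-1", 293, 375, "IPR009071", "High mobility group box domain"),
    Domain("MITF", 303, 370, "IPR011598", "Myc-type, basic helix-loop-helix (bHLH) domain"),
    Domain("MITF", 393, 519, "IPR021802", "MiT/TFE transcription factors, C-terminal"),
    Domain("MITF", 4, 142, "IPR031867", "MiT/TFE transcription factors, N-terminal"),
    Domain("POU3F2", 51, 125, "IPR000327", "POU-specific domain"),
    Domain("POU3F2", 125, 205, "IPR009057", "Homeodomain-like"),
    Domain("USF-1", 193, 286, "IPR011598", "Myc-type, basic helix-loop-helix (bHLH) domain"),
]


def domains_at(gene: str, position: int) -> List[Domain]:
    """Return every annotated domain of `gene` that covers `position`."""
    return [d for d in DOMAINS if d.gene == gene and d.start <= position <= d.end]


for d in domains_at("MITF", 310):
    print(d.interpro_id, d.name, d.end - d.start + 1)
```

Under the inclusive-coordinate assumption, position 125 of POU3F2 falls in both the POU-specific domain and the homeodomain-like region, because the two annotations share that boundary.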
Figure 2. Color and pattern phenotypes in ball pythons (Part 2). Observed variation in pigmentation and color morphological phenotypes provides evidence for genetic modulation of these phenotypes. (a) Heterozygous spider (dominant). (b) Homozygous Clown (recessive). (c) Heterozygous Mojave (codominant), heterozygous Enchi (codominant). (d) Homozygous piebald (recessive), heterozygous spider (dominant). The examples of ball python (Python regius) morphological phenotypes are used to illustrate phenotypic diversity in pythons and are not meant to suggest any direct causative relationship between the color and pigmentation phenotypes illustrated and the resulting analysis of Burmese python genomic sequence.
Figure 4. Orthologous multiple sequence alignment of MITF across species. Strong patterns of conservation are observed from alignment position 130 until alignment position 293. A second long block of conservation begins at position 347 and continues until the end of the alignment at position 575. The DNA binding region of MITF is contained within the Myc-type, basic helix-loop-helix (bHLH) domain, spanning position 303 to position 370 (this region is highlighted in yellow in the figure). This region of the alignment is extremely well conserved, with only a single amino acid differing between the species at alignment position 418, where the mammals (mouse, dog, and horse) have a threonine (T) while all of the nonmammals have an alanine (A). The highlighted green region of the mouse protein indicates the portion of the mouse DNA binding region for which a protein structure cocrystal exists.
Figure 5. Orthologous multiple sequence alignment of FOXD3 across species. Highly conserved regions are colored in blue. The winged helix-turn-helix DNA binding region spans a region beginning at position 71 and ending at position 173. This region is highly conserved. The single nonconserved residue within the DNA binding domain occurs at position 148 in the alignment where human, dog, and painted turtle have asparagine (N) while chicken, garter snake, anole lizard, and python have serine (S).
Figure 6. Orthologous multiple sequence alignment of LEF-1 across species. Note regions of divergence as well as some areas of strong conservation. Specifically, the N-terminal region exhibits some variability in amino acids, exemplified by variable length repeats of the amino acid glycine near the beginning of the sequences. Stronger patterns of conservation are observable beginning at alignment position 146 and continuing to alignment position 220, which partly overlaps with the CTNNB1 binding, N-terminal domain (position 1 to position 209 in the python sequence). The DNA binding region of LEF-1 exhibits identical amino acid composition among all the species except for the king cobra (for which the sequence is annotated as partial).
Figure 7. Orthologous multiple sequence alignment of CREB across species. CREB sequences exhibit considerable conservation throughout much of the alignment, with a few notable point differences within the conserved regions. The basic-leucine zipper DNA binding domain occurs in the C-terminal region (284 to 342 in the python sequence). Within the DNA recognition region, strong conservation is observed among the species, and only king cobra exhibits considerable sequence divergence in this region.
Figure 8. Orthologous multiple sequence alignment of POU3F2 across species. The alignment shows marked divergence within the N-terminal portions of the protein and strong conservation within the C-terminal region. The pattern of sequence variation observed within the N-terminal region is consistent with properties associated with transcriptional activator domains where lengths of consecutive repeats of basic amino acids vary among species. Between alignment position 151 and alignment position 160, a group of glutamines (Q) are observed with noticeable differences among the species. However, the DNA binding region of POU3F2, which is composed of the POU-specific domain (python position 51 to position 125) and the homeodomain (python position 125 to position 205), exhibits perfect sequence identity among all of the species.
Figure 9. Orthologous multiple sequence alignment of USF-1 across species. The alignment shows patterns of conservation as well as divergence between the species. The N-terminal region exhibits some distinct differences between the species interspersed with groups of strong conservation. The Myc-type, basic helix-loop-helix (bHLH) DNA binding domain spans the region of the python protein beginning at position 193 and ending at position 286. Within this portion of the alignment, eight specific amino acids exhibit divergence between the species.
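The alignment captions above report how many residues differ between species inside each DNA binding domain. One simple way to obtain such a count is to list the alignment columns within a region that contain more than one distinct residue. The sketch below does this for a toy alignment; the function name, the 1-based column convention, and the sequences are assumptions for illustration and are not taken from the actual alignments.

```python
from typing import Dict, List, Tuple


def variable_columns(
    alignment: Dict[str, str], start: int, end: int, ignore_gaps: bool = True
) -> List[Tuple[int, str]]:
    """List (column, residues) for columns in [start, end] that are not invariant.

    Columns are 1-based and inclusive. With ignore_gaps=True, gap characters
    are not counted as a distinct residue.
    """
    seqs = list(alignment.values())
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    variable = []
    for col in range(start, end + 1):
        residues = {s[col - 1] for s in seqs}
        if ignore_gaps:
            residues.discard("-")
        if len(residues) > 1:
            variable.append((col, "".join(sorted(residues))))
    return variable


# Toy alignment (hypothetical fragments) with a single T/A difference.
toy = {
    "mouse": "MLEKRRTHNL",
    "python": "MLEKRRAHNL",
    "chicken": "MLEKRRAHNL",
}
print(variable_columns(toy, 1, 10))
```

Applied to a real alignment, the start and end arguments would be the alignment columns that correspond to the domain, which can differ from positions in the ungapped python protein.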
Figure 10. TYR, DCT, and AIM1 upstream regions and associated candidate enhancer modules. (a) The upstream region from the python tyrosinase gene (TYR) contains a cluster of pigmentation associated transcription factor binding sites located between position 1 and position 200. This cluster includes binding sites for POU3F2, USF-1, LEF-1, and FOXD3. A second cluster occurs between position 410 and position 640 that contains sites for FOXD3, LEF-1, USF-1, and POU3F2. The third cluster is located from position 1210 to position 1410 and contains binding sites for CREB, MITF, POU3F2, and USF-1. (b) The upstream region of python dopachrome tautomerase (DCT) contains a cluster of binding sites from position 675 to position 800 that includes sites for LEF-1, POU3F2, MITF, and FOXD3. A second cluster occurs from position 800 to position 1000 and includes binding sites for transcription factors FOXD3, POU3F2, USF-1, LEF-1, and MITF. The third cluster contains binding sites for LEF-1, MITF, USF-1, POU3F2, and FOXD3. (c) The upstream region from Absent In Melanoma 1 (AIM1) contains a cluster of transcription factor binding sites between position 725 and position 925 for POU3F2, USF-1, MITF, FOXD3, and LEF-1. A second cluster of sites beginning at position 950 and ending at position 1150 contains binding sites for POU3F2 and FOXD3. A third cluster of binding sites beginning at position 1200 and ending at position 1500 has binding sites for USF-1, POU3F2, and LEF-1.
Figure 11. MLANA, RAB27A, and MC1R upstream regions and associated candidate enhancer modules. (a) The upstream region from melan-A (MLANA) contains a cluster of binding sites beginning at position 275 and ending at position 475 that includes recognition sites for LEF-1, POU3F2, USF-1, and FOXD3. A second cluster of binding sites begins at position 650 and extends to position 850 and contains binding sites for POU3F2, LEF-1, and FOXD3. The third cluster of sites occurs between position 1100 and position 1300 and contains binding sites for FOXD3, POU3F2, and USF-1. (b) The upstream region for RAS oncogene family member 27A (RAB27A) has a cluster of transcription factor binding sites beginning at position 250 and ending at position 450 that includes sites for POU3F2, USF-1, and FOXD3. The second cluster of binding sites begins at position 500 and ends at position 700 and contains sites for POU3F2, USF-1, FOXD3, and LEF-1. The third cluster begins at position 1050 and ends at position 1250 and contains binding sites for LEF-1, USF-1, CREB, FOXD3, and POU3F2. (c) The upstream region for melanocortin 1 receptor (MC1R) contains a cluster of transcription factor binding sites for USF-1 and POU3F2 between position 150 and position 350. A second cluster begins at position 725 and ends at position 925 with binding sites for USF-1, LEF-1, FOXD3, POU3F2, and MITF. A third cluster of binding sites is located between position 1225 and position 1425 with sites for LEF-1, USF-1, POU3F2, and FOXD3.
Figure 12. OA1, MITF, and PMEL upstream regions and associated candidate enhancer modules. (a) The upstream region for ocular albinism type 1 (OA1) contains a cluster of transcription factor binding sites beginning at position 150 and spanning to position 400 that includes recognition sites for POU3F2, USF-1, FOXD3, and LEF-1. A second cluster of binding sites occurs between position 550 and position 750 and contains sites for USF-1, FOXD3, and POU3F2. The third cluster of transcription factor binding sites begins at position 1350 and ends at position 1500 and contains sites for transcription factors USF-1, MITF, and LEF-1. (b) The upstream region for microphthalmia-associated transcription factor (MITF) has a cluster of transcription factor binding sites beginning at position 375 and ending at position 575. The sites within this first cluster include those for POU3F2, FOXD3, MITF, CREB, and LEF-1. A second cluster of binding sites starts at position 1050 and ends at position 1250 with recognition sites for POU3F2, FOXD3, and LEF-1. The third cluster of binding sites begins at position 1300 and extends to position 1500 and contains sites for transcription factors FOXD3, POU3F2, USF-1, and LEF-1. (c) The upstream region for premelanosome protein (PMEL) contains a cluster of transcription factor binding sites between position 125 and position 325 that includes sites for USF-1, FOXD3, and LEF-1. A second cluster of binding sites begins at position 890 and extends to position 1090 with recognition sites for LEF-1, POU3F2, and USF-1. The third cluster of transcription factor binding sites begins at position 1225 and ends at position 1425 and contains binding sites for FOXD3, USF-1, and MITF.
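The captions for Figures 10 to 12 describe clusters of binding sites that fall within roughly 200-nucleotide intervals of each 1500-nucleotide upstream region. The source does not spell out the clustering procedure, so the following is only a sketch of one plausible approach: a greedy scan that reports non-overlapping windows containing sites for a minimum number of distinct transcription factors. The function name, the `min_factors` threshold, and the toy site list are assumptions made for illustration.

```python
from typing import List, Tuple

Site = Tuple[int, str]  # (position in the upstream region, transcription factor)


def find_clusters(
    sites: List[Site], window: int = 200, min_factors: int = 3
) -> List[Tuple[int, int, List[str]]]:
    """Greedily group sites into non-overlapping windows of at most `window` nt.

    Returns (first_position, last_position, sorted_factor_names) for each
    window holding sites for at least `min_factors` distinct factors.
    """
    ordered = sorted(sites)
    clusters = []
    i = 0
    while i < len(ordered):
        start = ordered[i][0]
        j = i
        # Extend the window while sites lie within `window` nt of its start.
        while j < len(ordered) and ordered[j][0] - start < window:
            j += 1
        members = ordered[i:j]
        factors = sorted({tf for _, tf in members})
        if len(factors) >= min_factors:
            clusters.append((start, members[-1][0], factors))
            i = j  # continue after this cluster so clusters do not overlap
        else:
            i += 1
    return clusters


# Hypothetical site positions, not the coordinates reported in the figures.
toy_sites = [
    (20, "POU3F2"), (75, "USF-1"), (140, "LEF-1"), (190, "FOXD3"),
    (600, "MITF"),
    (1220, "CREB"), (1300, "MITF"), (1390, "USF-1"),
]
for cluster in find_clusters(toy_sites):
    print(cluster)
```

A greedy left-to-right scan is simple but depends on where each window starts; a sliding window that maximizes the number of distinct factors would be a reasonable alternative design.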