Evolutionary Origins of Pax6 Control of Crystallin Genes.

Ales Cvekl, Yilin Zhao, Rebecca McGreal, Qing Xie, Xun Gu, Deyou Zheng.

Abstract

The birth of novel genes, including their cell-specific transcriptional control, is a major source of evolutionary innovation. The lens-preferred proteins, crystallins (vertebrates: α- and β/γ-crystallins), provide a gateway to study eye evolution. Diversity of crystallins was thought to originate from convergent evolution through multiple, independent formation of Pax6/PaxB-binding sites within the promoters of genes able to act as crystallins. Here, we propose that αB-crystallin arose from a duplication of small heat shock protein (Hspb1-like) gene accompanied by Pax6-site and heat shock element (HSE) formation, followed by another duplication to generate the αA-crystallin gene in which HSE was converted into another Pax6-binding site. The founding β/γ-crystallin gene arose from the ancestral Hspb1-like gene promoter inserted into a Ca2+-binding protein coding region, early in the cephalochordate/tunicate lineage. Likewise, an ancestral aldehyde dehydrogenase (Aldh) gene, through multiple gene duplications, expanded into a multigene family, with specific genes expressed in invertebrate lenses (Ω-crystallin/Aldh1a9) and both vertebrate lenses (η-crystallin/Aldh1a7 and Aldh3a1) and corneas (Aldh3a1). Collectively, the present data reconstruct the evolution of diverse crystallin gene families.
© The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

Keywords:  Pax6; aldehyde dehydrogenase; crystallin; eye evolution; heat shock responsive element; lens; small heat shock protein

Year:  2017        PMID: 28903537      PMCID: PMC5737492          DOI: 10.1093/gbe/evx153

Source DB:  PubMed          Journal:  Genome Biol Evol        ISSN: 1759-6653            Impact factor:   3.416


Introduction

The birth of novel genes with potentially new functions is thought to be a major contributor to evolution (Kaessmann 2010). Evolution from ancestral proteins is a process by which mutation and selection generate new material for improved functions and/or the formation of novel biological structures, that is, morphological evolution (Carroll 2008). It has been shown that the total number of protein-coding genes in the genomes of diverse model metazoan organisms is similar. However, besides differences in protein content and sequences, the underlying functional differences of the encoded information are determined by differences in the wiring of the gene regulatory networks (GRNs) that control both formation and function of individual tissues and organs (Davidson and Erwin 2006; Wagner 2007; Carroll 2008; Davidson 2010). Although a number of important insights into molecular evolution have been achieved since the dawn of DNA sequencing, particularly regarding coding sequences, the evolution of regulatory sequences is not well understood (Wray et al. 2003). The key principles of metazoan evolution include gene duplication to generate novel genes (Conant and Wolfe 2008), changes in gene control mechanisms (Piatigorsky and Wistow 1991; Wray et al. 2003; Wagner 2007; Carroll 2008), and activity of mobile genetic elements (Kazazian 2004). Simulation of processes required to establish novel cis-acting sites through local point mutations (Stone and Wray 2001; Wagner et al. 2007; Behrens and Vingron 2010), identification of prospective cis-sites in DNA-repetitive sequences (Zhou et al. 2002; Zemojtel et al. 2009), and comparisons of homologous regulatory sequences in related species (Lowe et al. 2011) indicate that new cis-sites can be generated on microevolutionary timescales (Stone and Wray 2001; Stern and Orgogozo 2008; Behrens and Vingron 2010).
Eye evolution exemplifies a multitude of challenges to test evolutionary theories; at the same time, the eye provides a considerable number of molecular insights that directly link tissue morphogenesis with conserved and/or novel gene functions (Lamb et al. 2007; Nilsson 2009, 2013; Oakley and Speiser 2015). Reconstruction of key processes in eye evolution has now become possible (Gehring and Ikeo 1999; Gehring 2004; Carroll 2008; Vopalensky and Kozmik 2009; Gehring 2012; Vopalensky et al. 2012). To simplify analysis of rapidly evolving eye complexity, four discrete stages have been proposed (Nilsson 2009, 2013). Nondirectional photoreception (stage I) requires only a few photosensitive cells. Directional photoreception (stage II) requires a combination of pigmented cells that shield one or multiple photoreceptors forming the eyespot. Low-resolution vision (stage III), achieved by pit/cup eyes, does not employ any focusing optics. Finally, high-resolution vision (stage IV) added an ocular lens as a focusing apparatus to generate the camera-like eye (Jonasova and Kozmik 2008). The ocular lens is a cellular structure mostly composed of highly elongated lens fiber cells, the crystallin-producing cellular factories (Piatigorsky 1981; Bassnett et al. 2011). At present, camera-like eyes (supplementary fig. S1, Supplementary Material online) are found in some jellyfish (Cnidarians), a range of invertebrates including scallops, squid and octopus (Mollusks), and in vertebrates starting during the Cambrian revolution (525 Ma, e.g., lampreys, supplementary fig. S1, Supplementary Material online). The fossil record shows that primitive eyes existed prior to this period and were further improved to provide high-resolution vision during the Cambrian period (Land and Nilsson 2002). Thus, understanding lens evolution is intimately linked with studies of its principal structural proteins, the crystallins, as well as their encoding genes (Piatigorsky and Wistow 1991; Piatigorsky 2007).
Crystallins are water-soluble proteins that are highly expressed in lens fiber cells (Wistow and Piatigorsky 1988); however, there is no apparent similarity between crystallins of vertebrates and invertebrates. In vertebrates, crystallins are represented by over a dozen proteins that are classified by their elution profiles in gel filtration as α-, β- and γ-crystallins (Piatigorsky 1981). It was proposed earlier that either αB- or βB2-crystallin could be the “original” vertebrate crystallin, as both of them are expressed outside of the eye (Wistow 1993). The pair of αA- and αB-crystallins are structurally and functionally related to small heat shock proteins, called Hspbs (Ingolia and Craig 1982). Accordingly, the mammalian αB-crystallin transcriptional regulatory regions contain multiple heat shock responsive elements (HSEs) (Klemenz et al. 1991; Somasundaram and Bhat 2000, 2004). Hspbs form a family of ten vertebrate genes (Kappe et al. 2003; Franck et al. 2004; Wistow 2012). Vertebrate lenses also contain a variable number of proteins from the β/γ-crystallin superfamily, which is composed of at least 14 members (Kappe et al. 2010). This β/γ-crystallin superfamily also includes Ca2+-binding proteins encoded by microbial genomes (Kappe et al. 2010; Srivastava et al. 2014). In contrast, different abundant proteins were found in invertebrate lenses, including S-crystallins (squid and octopus), Ω-crystallins (scallop and squid), and J-crystallins (jellyfish) (supplementary fig. S1, Supplementary Material online). These crystallins are identical to basic metabolic enzymes, such as aldehyde dehydrogenases (Aldhs, Ω-crystallins) and glutathione-S-transferases (S-crystallins) (Piatigorsky 2007; Marchitti et al. 2008; Chen et al. 2013; Vasiliou et al. 2013). In particular, both scallop (Placopecten magellanicus) and squid (Ommastrephes sloani pacificus) express Aldh1a9 as their Ω-crystallin (Tomarev and Piatigorsky 1996).
In addition, other crystallins were identified in jellyfish (Tripedalia cystophora) with more distant similarities to known proteins, including ADP-ribohydroxylase (J1-crystallins) and saposins (J3-crystallin) (Piatigorsky and Kozmik 2004). In the absence of any common denominator between vertebrate and invertebrate crystallins, the remarkable diversity of crystallins was thought to originate, through convergent evolution, from their ability to accumulate to high concentrations while maintaining their solubility. To explain the origin of crystallin genes, a concept of “gene recruitment” was proposed (Piatigorsky and Wistow 1991). The hallmark of this idea is that gene duplication is not essential to establish a novel function; only the transcriptional regulation of a gene needs to change to gain a new function. Thus, the acquisition of lens-preferred (“lens-specific”) mechanism(s) of gene expression allows quantitatively significant increases in expression of these proteins in the lens, while maintaining appropriate levels and timing of expression in other tissues/organs (Piatigorsky and Wistow 1991). Despite the differences among vertebrate and invertebrate crystallins in the origins of their protein-coding sequences, a series of studies of their transcriptional control revealed a remarkable similarity in the use of an evolutionarily conserved DNA-binding transcription factor, Pax6 (Callaerts et al. 1997; Chow and Lang 2001; Piatigorsky 2007). However, none of the earlier studies examined the contribution of divergent and convergent evolutionary processes to explain how the lens-specific GRNs have evolved. Transcriptional studies of vertebrate α-, β-, and γ-crystallins have shown Pax6, c-Maf, Hsf4, Prox1, and Sox1 to be their key DNA-binding factors (supplementary fig. S2, Supplementary Material online) (Cvekl and Duncan 2007; Cvekl et al. 2015). Both jellyfish J2- and J3-crystallins are regulated by PaxB (Kozmik et al. 2003), and scallop Ω-crystallin/Aldh1a9 is directly regulated by Pax6 (Carosa et al. 2002). PaxB is a Pax gene that lost a major part of its homeodomain following gene duplications of the ancestral Pax gene, which was formed by a proto-Pax transposase inserted into a homeobox-containing genomic region about 1 billion years ago (Breitling and Gerber 2000). Taken together, several lines of evidence support the model of a conversion of a preexisting gene into a crystallin-coding gene via acquisition of Pax6/PaxB-binding sites (Cvekl and Piatigorsky 1996; Kozmik et al. 2003; Kozmik et al. 2008).
Key questions for understanding the origins of individual crystallin genes are 1) how lens regulatory mechanisms evolved in specific gene(s) and 2) why proteins encoded by these genes evolved successfully as lens crystallins. Here we propose that Pax6-binding sites could evolve from preexisting cis-sites governing stress-response. Multiple sequence alignments and phylogenetic tree analyses of Hspb/Cryab/Cryaa genes demonstrate that an ancestral Hspb1-like gene evolved into a pair of αA- and αB-crystallin paralogs. Our data also show that the ancestral β/γ-crystallin could have been formed in the cephalochordate/tunicate lineage as a gene fusion in which the Hspb1-like gene provided a promoter region inserted 5′ to the coding region of a Ca2+-binding protein. Finally, notable similarities between multiple invertebrate crystallin promoters raise the possibility that they evolved from an ancestral Aldh gene promoter.

Materials and Methods

Sequence Data Retrieval

The amino acid and nucleotide coding sequences were obtained from sequenced genomes available from the UCSC Genome Browser and from the Department of Energy (DOE) Joint Genome Institute.

Annotation of Crystallin Promoters

The following “consensus” binding site sequences were used: HSE, 5′-GAANNTTCNNGAA-3′ (Fujimoto et al. 2008); P6CON (Epstein et al. 1994); and the in vivo Pax6 consensus derived from ChIP-seq (Sun et al. 2015).
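As an illustration of how a consensus such as the HSE above can be used to annotate a promoter, the following minimal Python sketch scans both strands of a DNA string for matches. It is not the authors' annotation pipeline: the function names (`revcomp`, `find_sites`) and the test sequence are hypothetical, only exact matches to the consensus are reported, and `N` is treated as any base.

```python
import re

# IUPAC codes needed for the HSE consensus; N matches any base.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}

HSE_CONSENSUS = "GAANNTTCNNGAA"  # consensus quoted in the Methods text


def revcomp(seq):
    """Reverse complement of a sequence over the alphabet A, C, G, T, N."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def find_sites(sequence, consensus=HSE_CONSENSUS):
    """Return sorted (start, strand) tuples for exact consensus matches.

    Start positions are 0-based on the given (forward) sequence. A lookahead
    is used so that overlapping matches are all reported.
    """
    sequence = sequence.upper()
    hits = set()
    for strand, motif in (("+", consensus), ("-", revcomp(consensus))):
        pattern = "(?=(" + "".join(IUPAC[base] for base in motif) + "))"
        for match in re.finditer(pattern, sequence):
            hits.add((match.start(), strand))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical test sequence containing one forward-strand HSE.
    print(find_sites("CCGAAGGTTCTAGAACC"))
```

Real promoter annotation usually also tolerates mismatches or uses position weight matrices; the exact-match scan here only shows the basic idea.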

Multiple Sequence Alignments

Multiple sequence alignments of DNA and protein sequences were conducted using T-coffee (Notredame et al. 2000), MUSCLE (http://www.ebi.ac.uk/Tools/msa/muscle/) (Edgar 2004), and MAFFT (version 7) (Katoh and Standley 2013). The phylogenetic tree analysis was conducted with the maximum likelihood method and bootstrap with 1,000 calculations using MEGA7 (Kumar et al. 2016). All alignments and phylogenetic analyses used default settings unless otherwise stated.

Expression of Cryaa, Cryab, Hspb1, Hspb2, and Hspb6 in Mouse Heart and Lens

Mouse embryonic E16.5 lenses (n = 8) and newborn hearts (n = 2) were obtained from CD-1 mice (Charles River Laboratories). Dissected tissues were incubated overnight in 35 mm dishes at 37 °C in 5% CO2 in 2 ml of DMEM (10% FBS). To treat the tissues with heat shock, the media was replaced with DMEM (10% FBS) at 42 °C and the dishes were incubated for 1 h at 42 °C in 5% CO2. The tissues were then returned to 37 °C media and allowed to recover for 0.5, 1, and 4 h at 37 °C in 5% CO2. Total RNAs were prepared using the RNeasy Mini Kit (QIAGEN, Valencia, CA), followed by quantitative RT-PCR analysis as described elsewhere (Xie and Cvekl 2011). Ct values were normalized using GAPDH and fold changes of gene expression were calculated relative to untreated tissues. Primer sequences used for quantitative RT-PCR were: Cryaa forward 5′-GAGATTCACGGCAAACACAA-3′; Cryaa reverse 5′-ACATTGGAAGGCAGACGGTA-3′; Cryab forward 5′-TCTCTCCGGAGGAACTCAAA-3′; Cryab reverse 5′-TCCGGTACTTCCTGTGGAAC-3′; Hspb1 forward 5′-CCGGAAATACACGCTCCCTC-3′; Hspb1 reverse 5′-ATGGTGATCTCCGCTGACTG-3′; Hspb2 forward 5′-CCGAGTACGAATTTGCCAACC-3′; Hspb2 reverse 5′-CCCGAGGCCGAACATAGTAG-3′; Hspb6 forward 5′-TGTCCACGGACTCTGGGTAT-3′; Hspb6 reverse 5′-TGAATCCGTGTTCATCCGGG-3′; GAPDH forward 5′-CCAATGTGTCCGTCGTGGATCT-3′; GAPDH reverse 5′-GTTGAAGTCGCAGGAGACAACC-3′.
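Normalizing threshold-cycle values to a reference gene and expressing the result relative to untreated tissue is commonly done with the 2^-ΔΔCt calculation. The Python sketch below assumes that method and roughly 100% amplification efficiency; the text defers to Xie and Cvekl (2011) for the procedure actually used, and the function name and the Ct values in the example are hypothetical.

```python
def fold_change(ct_target_treated, ct_ref_treated,
                ct_target_control, ct_ref_control):
    """Relative expression by the 2^-ddCt method (assumed, see lead-in).

    Each argument is a threshold-cycle value: the target gene and the
    reference gene (e.g. GAPDH) in treated and in untreated tissue.
    """
    # Normalize the target gene to the reference gene within each sample.
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    # Compare treated with untreated; a lower Ct means more transcript.
    delta_delta = delta_treated - delta_control
    return 2.0 ** (-delta_delta)


if __name__ == "__main__":
    # Hypothetical values: the target crosses threshold two cycles earlier
    # after heat shock while the reference gene is unchanged.
    print(fold_change(22.0, 18.0, 24.0, 18.0))
```

A value above 1 indicates induction relative to the untreated control, and a value below 1 indicates repression.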

Transcriptional Regulation by Putative Pax6-Binding Sites Identified in B. floridae

Six copies of candidate Pax6-binding sites (WT1 and WT2), their mutants (M1 and M2), and “optimized” sites (O1 and O2) were synthesized by GenScript (Piscataway, NJ) and cloned into an E4 TATA-luc minimal promoter, followed by transient cotransfection experiments as we described earlier (Chauhan et al. 2004a, 2004b). The cells used included the mouse lens epithelial cell line αTN4-1 and P19 embryonic carcinoma cells, as used previously to evaluate transactivation by Pax6 (Xie and Cvekl 2011; Chauhan et al. 2004a).

Results

A Pax6 Consensus-Binding Site Resembles Recognition Motifs Recognized by Stress-Induced TFs

Pax6 binds to DNA through a combinatorial use of its paired domain (PD) and homeodomain (HD) (fig. 1). By footprinting, it occupies 16–20 bp long sites (Epstein et al. 1994; Xu et al. 1999; Yang and Cvekl 2005) that contain variants of a 14 bp consensus sequence, P6CON (Epstein et al. 1994), which is highly similar to the recently established in vivo 15 bp Pax6 consensus sequence (Sun et al. 2015). Such Pax6-binding sequences could have evolved through random mutations and/or through a smaller number of nucleotide changes in preexisting cis-sites.
Fig. 1. Comparison of Pax6 consensus binding site with three stress-response sites. (A) Alignments of in vivo Pax6 consensus (Sun et al. 2015) and P6CON (Epstein et al. 1994) with stress-response elements, including canonical trimeric HSE, ARE, and half of p53 (1/2 TP53) sites. Five highly interactive DNA-binding subdomains are found in Pax6: β-turn, PAI, RED, linker (L), and HD (Xu et al. 1999). (B) Identification of optimal HSEs (underlined, conserved nucleotide, and purple) in a set of four validated Pax6-binding sites, and in multiple “optimal” Pax6-binding sites generated by Selex (Xie and Cvekl 2011). (C) A model for the immediate utilization and simple conversion of a HSE-ARE and HSE-TATA box into a Pax6-binding site without any nucleotide change (left panel) or via a single nucleotide substitution (arrow) coupled with a single nucleotide deletion (Δ) (right panel), respectively.

Here, we hypothesize that a Pax6-binding site could have rapidly emerged from related cis-regulatory regions in various promoters and enhancers of genes during the course of eye evolution. Given the link between vertebrate crystallins and stress-response, we evaluated the relationship between HSEs, antioxidant response elements (AREs), p53-binding sites, and both the in vivo Pax6 consensus sequence and the Pax6 paired domain consensus, P6CON (fig. 1). This “optimal” Pax6 site could indeed be aligned with these stress-response elements (fig. 1). Among a group of ∼60 validated Pax6-binding sites (Xie and Cvekl 2011), an appreciable number preserve high sequence similarity to HSE motifs (fig. 1). To illustrate the minimal nucleotide changes required to generate a Pax6-binding site, we show two plausible model “seed” sequences. First, a HSE-ARE tandem is already a Pax6-binding site with two nucleotide mismatches (fig. 1). Second, a region composed of a single HSE and a canonical TATA-box would require only one nucleotide substitution (A to G) and a single nucleotide deletion (ΔT) to convert this array of cis-elements into a Pax6-binding site (fig. 1). This scenario of Pax6 site emergence is conceivably more parsimonious than the alternative that a random 14-bp sequence was converted into a Pax6 site. In addition, a consensus ARE, identified in promoters of squid crystallins (Tomarev et al. 1994), partially overlaps with the “optimal” Pax6-binding site (fig. 1). Considering stress regulation at the level of transcription by p53 (Lohrum and Vousden 1999; Danilova et al. 2008), coupled with the recent identification of p53-binding sites in distal regions of mouse Cryaa and Cryba1 genes (Ji et al. 2013), we also noticed similarities between a Pax6-binding site and a half of the p53-decameric sequence (fig. 1). These sequence comparisons suggest that an individual gene can be recruited into Pax6-dependent gene regulation through a small number of nucleotide changes within preexisting cis-elements, in agreement with a recent systematic analysis reporting that binding sites of a number of transcription factors indeed differ by only a few nucleotides (Payne and Wagner 2014).

Gene Duplication and Transcriptional Regulation of Hspb1, Hspb2, Hspb6, and Cryab/Hspb5

To reconstruct the molecular evolution underlying the birth of crystallin genes in vertebrate genomes, it is necessary to identify an ancestral gene and to track the changes in transcriptional mechanisms that proceeded towards high lens-preferred gene expression. A group of three mammalian proteins, Hspb1, Hspb6, and Hspb2, is highly related to both α-crystallins (Kappe et al. 2003; Franck et al. 2004; Basha et al. 2012). However, a precise evolutionary history of this group of genes is still not clear from the protein analysis, even if exon/intron boundaries and other features are considered (Franck et al. 2004). To resolve this issue, we hypothesized that the analysis of transcriptional regulatory sequences of Hspb1, Hspb2, Hspb6, and Cryab could provide valuable clues to explain the process that led to the evolution of lens-preferred expression of an ancestral Cryab gene. Previous studies identified multiple HSEs in human and mouse HSPB1/Hspb1 promoters (Gaestel et al. 1993; Oesterreich et al. 1996). In contrast, regulation of the Hspb6 and Hspb2 genes is poorly understood (Hough et al. 2002; Li et al. 2007; Swamynathan and Piatigorsky 2007). Initial studies of αB-crystallin gene regulation identified a distal HSE in the muscle-lung-heart (MLH) enhancer (−427/−254) and a proximal HSE adjacent to the TATA-box (fig. 2; Klemenz et al. 1991; Somasundaram and Bhat 2004; Jing et al. 2013). To align regulatory sites of Hspb1, Hspb2, Hspb6, and Cryab, we first identified HSEs (fig. 2). These HSEs were classified as “optimal” (at least two perfect consensus trinucleotides; fig. 2, solid circles) and “imperfect” (a sequence with only one consensus trinucleotide and/or with at least three scattered “core” nucleotides; Fujimoto et al. 2008; fig. 2, dashed circles). All four genes examined showed comparable arrangements of multiple HSEs in their 5′-flanking regions (fig. 2). At the nucleotide level, these promoter regions show notable similarities (fig. 2), including the 5′-distal 81 bp region (fig. 2), a central 78 bp region with two perfect HSE trinucleotides in Hspb1, Hspb6, and Cryab genes (fig. 2), and a 194 bp long promoter region (fig. 2). The central region (fig. 2) harbors a characteristic HSE motif, 5′-GAAGTTTC-3′. Phylogenetic analysis of Hspb1, Hspb2, Hspb6, and αB-crystallin proteins suggests that Hspb6 and Cryab could be derived from a putative Hspb1-like ancestor gene, which could also have given rise to the Hspb2 gene, as supported by reasonably good bootstrap values (fig. 2). Since both the HSE and the Pax6-site evolved from the proximal promoter region (fig. 2), we call this region the “HSE/Pax6 precursor binding region.” Additional comparison of their regulatory DNA sequences provided further support for the putative evolutionary events. A duplication of an ancestral Hspb produced two pairs, Hspb2/Cryab (two genes in head-to-tail orientation) and Hspb1/Hspb6 (currently on two different chromosomes; fig. 2). Note that a similar conclusion can be drawn if additional protein and DNA sequences, including human, mouse, and naked mole rat Hspbs, are analyzed (supplementary fig. S3, Supplementary Material online). Although some degree of uncertainty remains in the sequence of evolutionary events, taken together, these analyses uncover both evolutionarily conserved sequences and seed sequences for novel transcriptional control elements in the 5′-flanking promoter regions among this family of vertebrate Hspb genes.
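The “optimal” versus “imperfect” distinction described above can be sketched in Python as a count of perfect consensus trinucleotides within a 13-bp window aligned to 5′-GAANNTTCNNGAA-3′. This is a simplified illustration only: it ignores the “scattered core nucleotides” criterion and the reverse orientation, the function name `classify_hse` and the example windows are hypothetical, and it is not the procedure used to produce fig. 2.

```python
# Positions and sequences of the three trinucleotide units in the
# 13-bp HSE consensus GAANNTTCNNGAA (0-based offsets).
UNITS = ((0, "GAA"), (5, "TTC"), (10, "GAA"))


def classify_hse(window):
    """Classify a 13-bp window by its number of perfect HSE trinucleotides.

    Returns "optimal" for two or three perfect units, "imperfect" for
    exactly one, and "none" otherwise (a simplification, see lead-in).
    """
    window = window.upper()
    if len(window) != 13:
        raise ValueError("expected a 13-bp window")
    perfect = sum(window[start:start + 3] == unit for start, unit in UNITS)
    if perfect >= 2:
        return "optimal"
    if perfect == 1:
        return "imperfect"
    return "none"


if __name__ == "__main__":
    # Hypothetical windows, not taken from the mouse promoters.
    for candidate in ("GAAGGTTCTAGAA", "GAAGGTTTTAGCA", "ACGTACGTACGTA"):
        print(candidate, classify_hse(candidate))
```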
Fig. 2. Sequence similarities in noncoding regions of mouse Hspb1, Hspb2, Hspb6, and Cryab genes. (A) Diagrammatic representation of promoter regions of mouse Hspb1, Hspb2, Hspb6, and Cryab genes. “Optimal” (“imperfect”) HSEs are indicated by full (dashed) circles, respectively (see text for details). The regions are partitioned into the 5′-distal, central, and promoter. (B) Sequence alignment of “distal” (81 bp) and “central” (78 bp) regions containing multiple HSEs using T-coffee. Conserved nucleotides in the 5′-GAANNTTC-3′ motifs (horizontal bars) are shown in bold. 100%-conserved nucleotides are shown in red. “Imperfect” HSEs are underlined. (C) Sequence alignment of the “proximal” promoter region (194 bp), including the TATA-box (horizontal bracket) and start site of transcription (+1, hooked arrows) using T-coffee. A “weak” trimeric HSE is present in both Hspb1 and Hspb2 promoters, though at different locations. (D) Phylogenetic analysis of the protein sequences (aligned with MAFFT using E-INS-i for multiple conserved regions and long gaps, with the “leave gappy regions” option; other settings as default) and regulatory regions (aligned with T-coffee) of mouse Hspb1, Hspb2, Hspb6, and Cryab. Drosophila melanogaster small heat shock protein 22 (dmHsp22) was also included in the protein analysis as the outgroup. The clustering was carried out in MEGA7 using the maximum likelihood method and bootstrap with 1,000 calculations (Kumar et al. 2016). Additional alignments including human, mouse, and naked mole rat sequences are shown in supplementary figure S3, Supplementary Material online.

Comparative Analysis of Mouse αA- and αB-Crystallin Promoters

Duplication of the ancestral αB-crystallin into the αA-crystallin-like gene (supplementary fig. S3, Supplementary Material online) most likely occurred in the early vertebrates, as Cryaa homologues are not found in the sea urchin, amphioxus, sea squirt, and lamprey genomes (data not shown); however, it is not known how much of the Cryab-containing sequence was duplicated to generate the earliest precursor of the Cryaa gene. Using the VISTA alignment, we found sequence similarities over 9 kb of 5′-sequences of the mouse Cryab/Cryaa loci that include the Hspb2-Cryab loci (fig. 3 and supplementary fig. S4A, Supplementary Material online) as well as the distal lens-specific enhancer (DCR1) (Yang et al. 2006a) in the Cryaa locus (fig. 3 and supplementary fig. S4A, Supplementary Material online). Importantly, similar findings were obtained from analysis of the evolutionarily more distant elephant shark and zebrafish genomes (supplementary fig. S4B and C, Supplementary Material online). In summary, these studies show that the Cryaa gene originated from a duplication of over 9 kb of genomic DNA (5′-sequences and coding regions), including the Hspb2-Cryab head-to-tail paired genes, that occurred in the early fish ancestor.
Fig. 3. Evolution of αA-crystallin via gene duplication of the inverted tandem Hspb2/αB-crystallin and nucleotide changes in its 5′-proximal promoter region. (A) Schematic representation of mouse Hspb2/Cryab and Cryaa loci including their promoter regions. The HSE and Pax6-binding sites are shown as circles and ovals, respectively. (B) Pair-wise alignment of mouse lens-specific Cryab and Cryaa promoters using T-coffee. Both αB-crystallin promoter fragments (−162/+45) and (−115/+45) shown here are lens-specific (Gopal-Srivastava et al. 1996). The αA-crystallin promoter (−88/+46) is also lens-specific (Wawrousek et al. 1990). Conserved nucleotides in the 5′-GAANNTTC-3′ HSE motifs are shown in bold. Pax6-binding sites, blue; first ATG codon, green; “imperfect” HSE in the Cryaa promoter, HSE-like. (C) Five alignments (Alt I–V) of the 5′-promoter proximal regions of mouse Cryab × Cryaa. Alt I, see alignment in supplementary figure S5, Supplementary Material online; Alt II, see alignment in panel B; Alt III and IV, manually adjusted/gaps changed; Alt V, a result of 9 kb pair-wise alignment of mouse, elephant shark, and zebrafish Cryab × Cryaa loci (supplementary fig. S4A–C, Supplementary Material online, respectively).

As defined by transgenic studies, the shortest lens-specific αA-crystallin promoter (fig. 3) resides within the region of −88/+46 (Wawrousek et al. 1990). Similarly, the minimal αB-crystallin promoter resides within a −115/+44 fragment (fig. 3) (Gopal-Srivastava et al. 1996). Lens-specific expression of both αA- and αB-crystallin promoters is driven by Pax6 and c-Maf (fig. 3 and supplementary fig. S2, Supplementary Material online) (Yang et al. 2004; Yang and Cvekl 2005), and by Hsf4 binding to the Cryab promoter (Somasundaram and Bhat 2004). A comparison of the two promoters indicates an HSE to Pax6 replacement in the αA-crystallin promoter, supported by a pair-wise alignment of mouse Cryab and Cryaa promoters (fig. 3). The alignment between these regulatory elements remains nearly intact even if the corresponding Hspb1, Hspb2, and Hspb6 sequences are added (compare fig. 3 and supplementary fig. S5, Supplementary Material online). However, subtle differences in the Cryab × Cryaa pair-wise alignment exist, depending on the method used and the gap penalty values (fig. 3 and supplementary fig. S5, Supplementary Material online), as alignments of short DNA regulatory regions remain challenging (Wray et al. 2003; Meireles-Filho and Stark 2009). Nevertheless, all these alignments support the idea that the evolution of the TATA-proximal Pax6-binding site in the αA-crystallin promoter was aided by the “preexisting” HSE region. Differences in the number, structure, and location of HSEs among the Hspb/crystallin promoters could elicit different regulation of these genes under heat shock conditions (Jing et al. 2013). For example, the cell-specific environment of the lens provides Pax6 and Hsf4, whereas these proteins are not significantly expressed in the heart (Walther and Gruss 1991; Fujimoto et al. 2004), and heat-inducible Hsf1 and Hsf2 might show different levels of expression in these tissues, especially in early development (Somasundaram and Bhat 2004).
We evaluated the response of Cryab, Cryaa, Hspb1, Hspb2, and Hspb6 following the 42 °C/1 h heat shock treatments in isolated mouse embryonic lens and newborn heart. In E16.5 lens, expression of Hspb1 and Hspb6 was induced during the entire time window examined (0.5–4 h). In contrast, no induction of expression of Cryaa, Cryab, or Hspb2 was found (fig. 4). In newborn heart, Hspb6 and Hspb2 transcripts were most abundant 1 h following the stress treatment, and expression of Cryab was also increased following the heat shock (fig. 4). Expression of Cryaa was so low in heart that it was not possible to accurately observe the effect of heat shock on its expression. Thus, although HSE-like sequences exist in the Cryaa promoter (fig. 3), they are not functional. While the lack of heat induction of Cryab in lens may look surprising in light of Cryab heat-inducibility in the heart, expression of Hsf1 and Hsf2 may be too low at this stage of lens morphogenesis (Somasundaram and Bhat 2000, 2004). We conclude that the unique tissue-specific heat shock responses of the individual Cryab, Cryaa, Hspb1, Hspb2, and Hspb6 genes reflect pressure on DNA to retain the HSEs while the lens environment evolved mechanisms utilizing lens-specific factors, such as Hsf4 and Pax6 (Cvekl and Ashery-Padan 2014).
Fig. 4. Regulation of Cryaa, Cryab, Hspb1, Hspb2, and Hspb6 by heat shock is tissue-specific. (A) E16.5 lenses and (B) P1 hearts were subjected to heat shock treatment at 42 °C for 1 h. Dissected tissues were allowed to recover at 37 °C for 0.5, 1, and 4 h. RT-PCR analysis of the genes was performed using total RNAs isolated from the tissues, and fold changes of expression were calculated relative to untreated controls (grey bars). Expression of Cryaa in heart tissues was too low to measure accurately and was therefore omitted. Biological repeats were performed and similar trends observed. P values were calculated using the Student’s t test (*P < 0.05, **P < 0.001).
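
The fold changes in figure 4 are expressed relative to untreated controls. The sketch below illustrates that normalization only. The function name, the use of a simple ratio of replicate means, and all numbers are assumptions made for illustration; they are not the authors' RT-PCR quantification procedure or their data.

```python
from statistics import mean


def fold_change(treated, control):
    """Return mean(treated) / mean(control) for replicate expression values."""
    if not treated or not control:
        raise ValueError("both groups need at least one measurement")
    baseline = mean(control)
    if baseline <= 0:
        raise ValueError("control mean must be positive")
    return mean(treated) / baseline


# Hypothetical normalized expression values (arbitrary units), NOT study data.
untreated = [1.0, 1.2, 0.8]
recovery = {  # recovery time in hours -> replicate values after heat shock
    0.5: [2.0, 2.2, 1.8],
    1: [3.0, 3.3, 2.7],
    4: [1.5, 1.5, 1.5],
}

# Fold change at each recovery time point relative to the untreated control.
fold_by_time = {hours: fold_change(values, untreated)
                for hours, values in recovery.items()}
```

A ratio above 1 at a given recovery time would correspond to induction relative to the control bar; testing significance (the Student’s t test mentioned in the caption) is deliberately left out of this sketch.
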


A Possible Origin of the Ciona intestinalis β/γ-Crystallin Gene from an Hspb1 Fragment Inserted 5′-of the β/γ-Coding Sequences

Multiple microbial genomes encode proteins that can serve as the founding members of the β/γ-crystallin superfamily (Mishra et al. 2014). Two precursors of vertebrate β/γ-crystallin genes have been identified previously in lower invertebrates: geodin and a β/γ-crystallin-like gene. In the sea sponge Geodia cydonium, geodin is an intron-less gene encoding a 163 amino acid protein (supplementary fig. S6, Supplementary Material online), the most ancient member of the β/γ-crystallin superfamily in metazoans (Di Maro et al. 2002; Giancola et al. 2005; Srivastava et al. 2014). The tunicate Ciona intestinalis genome harbors an ancient β/γ-crystallin-like gene with multiple exons (Shimeld et al. 2005). Its current open reading frame (ORF) lacks as many as 108 amino acid residues from the putative N-terminal region compared with the mouse βB2- and γA-crystallins (supplementary fig. S6, Supplementary Material online). Analysis of the sea urchin (Strongylocentrotus purpuratus) and amphioxus (Branchiostoma floridae) genomes revealed a novel β/γ-crystallin-like ORF in amphioxus (Bf Crybg) (supplementary fig. S6, Supplementary Material online). These analyses led us to hypothesize that the amphioxus Bf Crybg gene (see fig. 6) is an ancestral form of the sea squirt β/γ-crystallin-like gene and that the vertebrate β/γ-crystallins evolved from this precursor via gene duplication that included their 5′-regulatory regions.
Fig. 6. Origin of the β/γ-crystallin gene family. (A) Genomic organization of the Ciona β/γ-crystallin-like gene and its 1.2 kb promoter (the numbering of nucleotides is from Shimeld et al. 2005). Shown are the TATA-box (blue rectangle), start sites of transcription (+1, hooked arrows), putative HSEs (beige circles) and Pax6-binding sites (blue ovals), downstream HSE and Pax6 sites (beige and blue horizontal lines, respectively), and six short repetitive sequences (horizontal arrows, consensus 5′-GGGGTT-3′). (B) Genomic organization of the B. floridae β/γ-crystallin-like gene and its putative promoter. The 16 bp and 4 bp repetitive sequences are shown by purple and blue arrows. (C) Genomic organization of the B. floridae Hspb1-like gene and its putative promoter. Four repetitive sequences (each ∼200 bp) are shown as green arrows. (D) The pair-wise alignment centered around the conserved HSEs, generated by T-coffee. (E) The alignment between the Ciona β/γ-crystallin promoter and an upstream region of the B. floridae Hspb1-like gene, generated by LAGAN (Berezikov et al. 2004).

To test this hypothesis, we aligned 16 proteins, including all 13 mouse β/γ-crystallins, plus amphioxus Crybg (BfCrybg), sea squirt Crybg (Cionabg), and geodin as an obvious “outlier.” The alignments were conducted by T-coffee. As shown in figure 5, the separation of the β- and γ-crystallin subfamilies is clear, whereas both invertebrate Crybg proteins and geodin were placed into a separate clade. Although these alignments support a common evolutionary origin of coding sequences among the β/γ-crystallins as proposed earlier (Kappe et al. 2010), it remains possible that their promoters and regulatory elements formed separately. We thus examined their 5′-regions prior to and including the first ATG codon (Cryba1, Crybb1, Crybb2, Crygd, Crygf, Crygs, and Crygn, lengths between 414 and 504 bp; Ciona Crybg, 537 bp). Notably, the phylogenetic tree (fig. 5) generated from the DNA regulatory regions up to the ATG differed from the tree based on protein sequences, since Ciona Crybg was placed in a clade with Crygs. Nevertheless, at the nucleotide level, multiple sequence alignment of eight promoter regions, including seven β/γ-crystallins and Ciona Crybg, revealed a number of conserved blocks as well as gaps (supplementary fig. S7, Supplementary Material online). Taken together, the current data support the idea that the ancestor of the sea squirt Crybg likely represents the founding member of the β/γ-crystallin family, including their promoters.
Fig. 5. Phylograms of crystallin proteins and their 5′-promoter regions. (A) Tree constructed from protein coding sequences without the distance correction. (B) Tree constructed from promoter sequences up to the starting ATG codon. Both phylogenetic trees were constructed using T-coffee for alignment and the maximum likelihood method with 1,000 bootstrap calculations in MEGA7.

A series of Ciona Crybg promoter fragments (−1,122/+110 to −275/+110) have been studied in sea squirt larvae and transgenic frogs (Shimeld et al. 2005; Chen et al. 2014); however, the origin of this promoter and its possible regulation by Pax6 and the HSEs identified here (fig. 6) were not discussed. The transition from the amphioxus to the sea squirt β/γ-crystallin-like gene could have happened by various mechanisms, such as trans-insertion of a novel genomic fragment 5′ of the β/γ-like ORF, cis-deletion of an internal genomic fragment, a combination of these processes, or cumulative nucleotide modifications of the promoter region. We first considered an “insertional” model from “scratch” as proposed elsewhere (Kaessmann 2010) and searched for the possible “donor” sequence. As the putative inserted sequence has a conserved 8 bp HSE (5′GT-3′) similar to what is found in the mouse Hspb1 and Hspb6 promoters (fig. 2), an intriguing possibility arose that the evolution of both α- and β/γ-crystallin promoters could be linked. The amphioxus β/γ-crystallin and Hspb1-like genes are shown in figure 6B and C, respectively. We found that the amphioxus Hspb1-like gene promoter also harbors a conserved 8 bp HSE (5′GT-3′) (fig. 6). Conservation of this HSE across evolutionarily diverse genomes is evident from multiple sequence alignments of five Hspb1-related genes, including D. melanogaster Hsp23, mouse Hspb1, mouse Hspb6, C. intestinalis Crybg, and the B. floridae Hspb1-like gene, demonstrating conservation of this sequence over a region of ∼150 bp (supplementary fig. S8, Supplementary Material online).
Importantly, the “conserved” 5′-GT-3′ HSEs in the sea squirt Crybg and amphioxus Hspb1 genomic fragments were precisely aligned when five Hspb1 genomic fragments were used (see supplementary fig. S8, Supplementary Material online). Likewise, their pair-wise alignment, including 146 and 149 bp with centrally positioned HSEs, matched these HSEs (fig. 6). Finally, a pair-wise alignment between the functionally defined C. intestinalis β/γ-crystallin promoter (Shimeld et al. 2005) and the 5′-flanking region of the B. floridae Hspb1-like gene confirmed similarities between these sequences over a span of 2 kb (fig. 6). These findings indicate that a common cephalochordate/tunicate ancestor Hspb1-like gene provided a genomic fragment that was inserted 5′ of the β/γ-crystallin-like ORF to produce the ancestral β/γ-crystallin gene.
The possible origin of Pax6-binding sites from HSE-rich sequences described above was experimentally tested using a specific region of the B. floridae Hspb1-like gene. Using a 14 bp Pax6 consensus binding site (P6CON), we identified two putative Pax6-like sequences, Bf-WT1 and Bf-WT2, separated by the 5′-GT-3′ HSE (fig. 7). Six copies of the wild-type sites (Bf-WT1 and Bf-WT2), their point mutants (Bf-M1 and Bf-M2), and optimized Pax6-binding sites (Bf-O1 and Bf-O2) were inserted 5′ of the E4 TATA-box in the luciferase reporter vector pGL3, as we described elsewhere (Chauhan et al. 2004a). Transient cotransfections with Pax6 cDNA were conducted in both lens (expressing Pax6) and nonlens (no Pax6 expression) cultured cells (fig. 7). The results showed that reporter genes driven by the putative and mutated Pax6-binding sites were not activated by Pax6. In contrast, four individual nucleotide changes towards the “optimal” Pax6-binding sites, Bf-O1 and Bf-O2, produced a pair of reporters activated by Pax6 in both cell culture systems. Taken together, these experiments showed that a small number of nucleotide changes is sufficient to produce functional Pax6-binding sites from regions rich in HSE-like motifs.
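
The search for putative Pax6-like sequences with a consensus site, including sites that lie on the reverse complement strand, can be illustrated with a small scanner. This is a sketch under stated assumptions, not the procedure used in the study: the function names are hypothetical, and the 6-letter motif and test sequence are toy placeholders rather than P6CON or any B. floridae sequence.

```python
# IUPAC nucleotide codes mapped to the bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def mismatches(site, consensus):
    """Count positions where the site base is not allowed by the consensus."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have equal length")
    return sum(base not in IUPAC[code]
               for base, code in zip(site.upper(), consensus.upper()))


def scan(seq, consensus, max_mismatches):
    """List (strand, offset, window, mismatches) hits on both strands.

    Offsets on the '-' strand are counted along the reverse complement.
    """
    width = len(consensus)
    hits = []
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for i in range(len(s) - width + 1):
            window = s[i:i + width]
            n = mismatches(window, consensus)
            if n <= max_mismatches:
                hits.append((strand, i, window, n))
    return hits


# Toy motif and sequence (hypothetical, for illustration only).
toy_hits = scan("AATCATGCTT", "GCWTSA", max_mismatches=0)
```

In this toy case the only perfect match is found on the reverse complement strand. Raising `max_mismatches` returns weaker candidate sites, which is the kind of tolerance needed when looking for precursor-like sequences.
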
Fig. 7. Functional evaluation of the HSE to Pax6-binding site conversion model within the B. floridae Hspb1-like gene. (A) Identification of HSEs and two putative Pax6-binding sites. The binding sites were found on the reverse complement (RC) strand. (B) Relative luciferase expression driven by six copies of the wild type (Bf-WT1 and Bf-WT2), their mutants (Bf-M1 and Bf-M2), and optimized Pax6-binding sites (Bf-O1 and Bf-O2) in lens (αTN4) and embryonic carcinoma (P19) cells.


Invertebrate Crystallin Promoters: Connections to Aldhs?

Camera-like eyes are also found in multiple evolutionarily distant invertebrates, and Ω-, S-, and J-crystallins have been identified in their lenses (Tomarev and Piatigorsky 1996; Piatigorsky and Kozmik 2004; Jonasova and Kozmik 2008). The overarching hypothesis is that invertebrate Ω-crystallin genes evolved by gene duplication from an ancestral Aldh gene. Aldh genes encode enzymes critical for the biosynthesis of retinoic acid, a key ligand employed by retinoic acid-activated nuclear receptors (Cvekl and Wang 2009; Kumar et al. 2012). Importantly, these proteins protect cells from UV light damage and oxidative stress (Lassen et al. 2008; Chen et al. 2013). In scallop, the lens and cornea form a continuous tissue and both highly express Ω-crystallin/Aldh1a9 (Piatigorsky et al. 2000). In addition to the major S-crystallins, the Ω-crystallin is also expressed in squid lenses (Tomarev and Piatigorsky 1996). Aldh1a1/Raldh1, Aldh1a2/Raldh2, and Aldh1a3/Raldh3 are expressed in the mammalian lens, neuroretina, and retinal pigmented epithelium (RPE) (Cvekl and Wang 2009; Kumar et al. 2012), and Aldh1a3 is a Pax6-regulated gene (Suzuki et al. 2000). Furthermore, Aldh3a1 is a major component of the mouse cornea with low lens expression (Lassen et al. 2007; Chen et al. 2013, 2016), and is also directly regulated by Pax6 (Davis et al. 2008). In elephant shrew (a small mammal), η-crystallin is encoded by an Aldh1a7 gene (Graham et al. 1996). Multiple sequence alignments of Ω-crystallin/Aldh1a9, η-crystallin/Aldh1a7, Aldh3a1, Aldh1a1, Aldh1a2, and Aldh1a3 proteins (supplementary fig. S9A, Supplementary Material online) reveal extensive amino acid similarities supporting their common evolutionary origin. We next aligned their promoters, experimentally defined for mouse Aldh3a1 (Davis et al. 2008) and scallop Ω-crystallin/Aldh1a9 (Carosa et al. 2002), and deduced from genome annotation for mouse Aldh1a1, Aldh1a2, Aldh1a3, and Aldh1a7.
Many blocks of conservation were identified in the alignment of two evolutionarily distant genomes (mouse and scallop) whose common ancestor is at least 650 million years old (supplementary fig. S9B, Supplementary Material online). We therefore wondered whether it is possible to reconstruct various aspects of invertebrate crystallin gene recruitment through the analysis of Ω-crystallin/Aldh1a9 promoter formation by considering two models: a single origin of the invertebrate crystallin promoter, and convergent evolution assisted by preexisting cis-regulatory sites. To test the first model, we aligned five known promoters: scallop Ω-, jellyfish J1A- and J2-, and squid SL11 and SL20 S-crystallin (fig. 8). This alignment shows that both jellyfish J1A- and J2-crystallin promoters form a group with Ω-crystallin/Aldh1a9, whereas SL11 and SL20 form their own clade (fig. 8). Multiple blocks of conservation combined with gap patterns therefore support the intriguing possibility of a common ancestral origin of these promoters. To probe the model of preexisting sites, we focused on the idea that the invertebrate Pax6 sites also evolved from HSEs and/or AREs. In the sea scallop (Placopecten magellanicus) Ω-crystallin/Aldh1a9 promoter, three Pax6-binding sites were identified and characterized previously (fig. 9) (Carosa et al. 2002). The distal Pax6 site overlaps with an ARE (fig. 9). The cubozoan jellyfish (Tripedalia cystophora) J2-crystallin promoter fragment −148/+10 contains three functional PaxB-binding sites (fig. 9; Kozmik et al. 2008). Site C overlaps an ARE, while two putative HSE motifs are located in the adjacent 5′-flanking region. The J3-crystallin promoter (Kozmik et al. 2003) contains a pair of PaxB/Pax6-binding sites, though its complete promoter sequence remains to be reported. Finally, two squid (Ommastrephes sloani) crystallin promoters have been characterized using vertebrate lens nuclear extracts (Tomarev et al. 1994).
The SL11 and SL20 crystallin promoters are similar (46% identity across 204 bp). In the SL11-crystallin promoter, two putative Pax6-binding sites overlap with AREs (fig. 9). In the SL20-crystallin promoter, a putative Pax6-binding site also partially overlaps with an ARE. The Pax6-binding sites (n = 16, eight experimentally validated) identified here were aligned with a Pax6 consensus site (P6CON; Epstein et al. 1994; Xie and Cvekl 2011) and a Pax6 “optimal” binding site (Sun et al. 2015) to demonstrate that many of these sites contain five or more mismatches (supplementary fig. S10, Supplementary Material online). Thus, it is possible that Pax6/PaxB-binding sites in invertebrate crystallin promoters could also originate from AREs in the ancestral crystallin-precursor genes by the process of convergent evolution.
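
A figure such as the 46% identity across 204 bp quoted above for the SL11 and SL20 promoters is a count of identical columns in a pair-wise alignment. The sketch below shows one common way to compute it. How gapped columns were treated in the study is not stated, so counting every column with at least one residue in the denominator is an assumption, and the aligned strings are toy placeholders rather than squid sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns holding the same residue in both rows.

    Columns that are gaps in both rows are ignored; columns with a gap in
    only one row count as non-identical (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b)
               for a, b in zip(aligned_a.upper(), aligned_b.upper())
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("alignment has no informative columns")
    identical = sum(a == b and a != gap for a, b in columns)
    return 100.0 * identical / len(columns)


# Toy gapped alignment (hypothetical): 7 identical columns out of 10.
toy_identity = percent_identity("ACGT-ACGTA", "ACGTTAC-TT")
```

Because the result depends on the gap convention and on the alignment method and penalties, identity values for short regulatory regions are best read as approximate.
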
Fig. 8. Multiple sequence alignments of five invertebrate crystallin promoters. (A) Phylogram of the promoter sequences: scallop Ω-crystallin/Aldh1a9, squid SL11 and SL20 S-crystallin promoters, jellyfish J1A- and J2-crystallin promoters. The phylogram employed sequences from the 5′-end through the TATA-box using the alignments in panel B with the maximum likelihood method and 1,000 bootstrap calculations. (B) Multiple sequence alignments of the invertebrate crystallin promoters. The alignment shown was generated from two blocks based on the use of the TATA-box and experimentally defined transcriptional start sites (Tomarev et al. 1994; Carosa et al. 2002; Kozmik et al. 2008) using T-coffee. The 5′-box included all sequences up to the TATA-box. The downstream box started from the TATA-box and included the first ATG sequences in the J1A, J2, and SL11 sequences. 100% nucleotide conservation, light blue; at least 60% conservation, dark blue. The clustering was carried out by T-coffee.

Fig. 9. Analysis of invertebrate Ω-, J2-, SL11-, and SL20-crystallin promoters. (A) Scallop Ω-crystallin promoter. (B) Jellyfish J2-crystallin promoter. (C) Squid S-crystallin SL11 and SL20 promoters. Pax6-binding sites (blue font), ARE and TATA-box (horizontal brackets), HSEs (bold and underlined), start sites of transcription (horizontal arrows), coding sequences (green font).


Discussion

Reconstruction of the evolutionary history of any family of genes encoding proteins that determine a unique cell type-specific property, such as lens transparency, is aided by a) following general principles of evolution applied to the formation of novel cell types/tissues/organs (Arendt et al. 2016), b) analysis of functionally annotated genomic DNA regions from related and distant organisms (Park et al. 2016), and c) detailed knowledge of the molecular mechanisms that govern tissue-specific gene expression (Davidson and Erwin 2006; Sun and Kim 2011; Ettensohn 2013). The present study aimed to examine the origin of lens-specific Pax6-binding sites, identify ancestral genes from which diverse families of crystallins evolved, and search for one or more unifying principles of crystallin gene recruitment throughout metazoan evolution. The studies were conducted to shed new light on evolutionary changes in the genome and their consequences for pathways underlying eye evolution in both vertebrate and invertebrate organisms (Gehring and Ikeo 1999; Lamb 2013; Nilsson 2013). Thus, we aimed to address the potential use of parallel and convergent evolutionary steps and/or their combinations. The present data shed new light on ocular lens evolution through the identification of complementary evolutionary mechanisms, including gene duplication, formation of Pax6-binding sites from preexisting cis-elements, and insertion-mediated de novo gene formation. Through convergent evolution, any Pax6-binding site could evolve from nonfunctional and/or preexisting cis-sites near active transcriptional control regions. The complementary hypothesis states that the initial formation of a Pax6-dependent regulatory region occurred through the processes described above but only in one or very few genes.
If expression of the regulated gene provides a selective advantage for the cell expressing it, the gene could be “recruited” with an opportunity to maintain the original function and/or produce novel function(s) (Piatigorsky 2006). In the case of α-crystallins, the original function could be similar to that of other small heat shock proteins, that is, to protect proteins from misfolding and inhibit programmed cell death (Arrigo and Simon 2010). The present data show that Pax6-binding sites in α-crystallin genes could indeed emerge from proximal promoter sequences, called here the “HSE/Pax6 precursor binding region”, found in the ancestral Hspb1/6 genes. The conservation of distal HSEs makes this region available for evolutionary changes, first towards formation of additional HSEs, as in the αB-crystallin promoter, and subsequently towards loss of the nonessential HSE in the αA-crystallin promoter. It is important to state that the current Hspb1, Hspb6, Cryab, and Cryaa promoter sequences do not provide any definitive insight into the putative nucleotide sequence of the ancestral Hspb1-like gene. Similarly, multiple HSE-like motifs found in both amphioxus and sea squirt β/γ-crystallin-like genes could provide specific “seed” sequences for the emergence of new Pax6-binding sites. The β/γ-crystallin family of genes expanded by gene duplication while retaining and evolving their 5′-flanking sequences. From a purely evolutionary perspective, it is easier to envision a common origin from a single “improbable” event than many parallel processes of convergent evolution within a family of duplicated genes (Gehring 2004, 2012). An interesting question is how consensus Pax6-binding sites are related to the optimal binding sites of all known transcription factors, as any of these sites has the potential to give rise to a new site recognized by different factors and thereby change gene regulatory mechanisms.
Comparisons of optimal binding sites have an inherent limitation, as it has been proposed that promoter/enhancer-mediated transcriptional regulation offers better opportunities if the individual regulatory sites are “suboptimal” (Crocker et al. 2015; Jolma et al. 2015). An unbiased analysis using the TOMTOM motif comparison software (http://meme-suite.org/tools/tomtom) to compare the Pax6 motif with all vertebrate TF motifs in the Jaspar database (http://jaspar.genereg.net/) identified binding motifs for glial cells missing homolog 1 (GCM1) and T-box transcription factors (e.g., Tbr1), which use DNA-binding domains distinct from the helix-turn-helix domains forming the Pax6 paired domain, that is, metal-chelating and T-box DNA-binding domains, respectively. This analysis illustrates that there are many routes by which preexisting cis-sites can be modified towards Pax6-binding sites, and the actual mechanisms ultimately selected by evolution depend on the transcription factors available and the gene product function(s). Our studies suggest that multiple invertebrate crystallins originated from an ancestral Aldh gene, a pair of vertebrate α-crystallins originated from an Hspb1 gene, and 14 vertebrate β/γ-crystallins emerged from a single β/γ-crystallin-like gene (see fig. 10). The common denominator between Aldh and Hspb genes is their stress-inducibility and the inhibition of apoptosis by the encoded proteins. Small heat shock proteins are multifunctional proteins used to protect cells from heat shock and other types of stress (Arrigo and Simon 2010). Mouse Aldh3a1 possesses both antiapoptotic function and stress inducibility (Choudhary et al. 2005; Lassen et al. 2007), likely mediated by at least two HSEs and an ARE predicted here in the Ω-crystallin/Aldh1a9 promoter.
Fig. 10. Summary of the evolution model of the multigene α- and β/γ-crystallin families. (A) The process of gene duplication originating from an ancestral Hspb1 gene to produce a pair of Cryab and Cryaa genes. (B) The process of gene fusion to create the ancestral β/γ-crystallin-like gene. Tentative acquisition of HSE and Pax6 sites is shown in brackets.

The recruitment of small heat shock proteins, other stress-related proteins, and ubiquitous enzymes to function as lens crystallins has been extensively discussed earlier, mostly in terms of their ability to support lens transparency and refractive power (Piatigorsky 2006). It is obvious that any crystallin protein needs to be water soluble at a very high concentration, and selection pressure had to favor the right proteins. Since evolution works by improving something that already works, it is unlikely that water solubility and concentration had any impact at the beginning of this process. It has been proposed that elongation of lens fiber cells requires elaborate changes in the cellular cytoskeleton as well as changes in osmoregulation that would preferentially select small heat shock proteins (Wistow 1993). Indeed, both αB-crystallin and small heat shock proteins interact with actin and have roles in cytoskeleton assembly (Xi et al. 2006; Ghosh et al. 2007; Singh et al. 2007) and play several noncanonical roles in lens and nonlens cells (Gangalum and Bhat 2009; Gangalum et al. 2011). It is also possible that β/γ-crystallins originally served as UV light filters (Hibbert et al. 2015). We propose here that the reason small heat shock proteins were initially selected is their strong antiapoptotic function (Arrigo and Simon 2010; Basha et al. 2012) coupled with their compatibility with cytoskeleton remodeling during cellular elongation (Wistow 1993). The hallmark of lens differentiation is an organized degradation of organelles inside the lens fiber cells to prevent light scattering.
Degradation of organelles combined with preservation of cells is a unique property of the lens, and thus a reasonable initial force to improve lens function. However, there is an internal conflict between the degradation of organelles, which leads to apoptosis in other cell types, and the necessity to make organelle-free fiber cells as the structural foundation of the bulk of the lens. Elevated expression of an ancestral crystallin/small heat shock protein would provide the antiapoptotic cellular environment needed to preserve the cells while they undergo the demise of the mitochondria, endoplasmic reticulum, Golgi apparatus, and finally nuclei. The αA-crystallin indeed protects lens epithelial cells from apoptosis during mitosis (Xi et al. 2003b). Loss of both αA- and αB-crystallins in the double-knockout mouse lens resulted in caspase 3/6-dependent lens fiber cell disintegration (Morozov and Wawrousek 2006). These studies demonstrate that a specific antiapoptotic role of α-crystallins is essential for lens viability. Interestingly, specific mutations in both α- and γ-crystallins inhibit the denucleation process (Sandilands et al. 2002; Graw et al. 2004; Gupta et al. 2011), further supporting the link between small heat shock proteins and denucleation of lens fiber cells. Finally, it is possible that the Pax6/α-crystallin regulatory mechanism evolved in ancestral neuronal cells prior to its subsequent deployment in the ancestral lens fibers. This inference is supported by recent studies demonstrating that Pax6 regulates expression of αA-, βB2-, and other crystallins in a subset of olfactory bulb neurons, and that the key role of αA-crystallin in these cells is inhibition of apoptosis (Ninkovic et al. 2010). Likewise, what are the selective advantages of recruiting Aldhs as lens crystallins? In the scallop eye, Ω-crystallin/Aldh1a9 is expressed in the joint cornea/lens structure (Piatigorsky et al. 2000).
The Ω-crystallin/Aldh1a9 is both a major crystallin in scallop and a minor crystallin in squid lens (Tomarev and Piatigorsky 1996). In the present-day mammalian eye, Aldh3a1 is the major corneal crystallin (Lassen et al. 2007) and protects cells from UV light damage (Chen et al. 2013). Thus, Aldh-based proteins were successfully employed by lens and cornea in evolutionarily very distant organisms (Jonasova and Kozmik 2008). The rodent Aldh1a3 and Aldh3a1 (Suzuki et al. 2000; Davis et al. 2008) and scallop Ω-crystallin/Aldh1a9 (Carosa et al. 2002) are all Pax6-regulated genes. Although we do not know if the ancestral “prototype” Aldh gene was also regulated by Pax6/PaxB, the direct link between Pax6/PaxB and Aldh fits the “intercalary evolution” hypothesis, which explains both the initiating and universal roles of Pax6/PaxB in eye formation and evolution (Gehring and Ikeo 1999), as Aldh1a1, Aldh1a2, and Aldh1a3 are important enzymes in the metabolism of retinoic acid, a key morphogen that controls eye development (Cvekl and Wang 2009; Kumar et al. 2012). Our data also support the intriguing possibility that Aldh gene recruitment could have been accompanied by duplication of their promoter sequences and their insertion in other parts of the ancestral genomes to give rise to the J- and S-crystallin promoters. Taken together, prior to lens evolution, Aldh gene(s) were already important for the formation of the most primitive eyes, and were available to expand their roles as lens and corneal structural proteins.

Conclusions

In summary, the present study builds on detailed knowledge of the lens-specific regulatory mechanisms of both vertebrate and invertebrate crystallin genes through their transcriptional control by Pax6 and the more ancestral PaxB in jellyfish. Our data suggest that lens morphogenesis can be unified by the use of small heat shock proteins for the formation of the vertebrate lens, and that the nonvertebrate crystallins could follow the same strategy via the antiapoptotic and UV-filtering activities of Aldhs. Although our data on the origin of Pax6 regulation of vertebrate crystallins support the model of HSE-like to Pax6 conversion followed by gene duplication, the origin of invertebrate crystallin promoters is less clear and awaits complete DNA sequences of the invertebrate genomes of interest. Finally, it is possible to envision experimental testing of the current models, as it is now feasible to screen large promoter libraries using massively parallel reporter assays (Inoue and Ahituv 2015).

Supplementary Material

Supplementary data are available at Genome Biology and Evolution online.

Author Contributions

Designed research: A.C. and D.Z. Performed experiments: A.C., Y.Z., R.S.M., Q.X., and D.Z. Data analysis and interpretation: A.C., Y.Z., R.S.M., Q.X., X.G., and D.Z. Wrote the manuscript: A.C. and D.Z.