Daniel J Macqueen, Daniel Garcia de la Serrana, Ian A Johnston.
Abstract
Whole-genome duplication (WGD) was experienced twice by the vertebrate ancestor (2 rounds; 2R), again by the teleost fish ancestor (3R) and most recently in certain teleost lineages (4R). Consequently, vertebrate gene families are often expanded in 3R and 4R genomes. Arguably, many types of "functional divergence" present across 2R gene families will exceed that between 3R/4R paralogs of genes comprising 2R families. Accordingly, 4R offers a form of replication of 2R. Examining whether this concept has implications for molecular evolutionary research, we studied insulin-like growth factor (IGF) binding proteins (IGFBPs), whose six 2R family members carry IGF hormones and regulate interactions between IGFs and IGF1-receptors (IGF1Rs). Using phylogenomic approaches, we resolved the complete IGFBP repertoire of 4R-derived salmonid fishes (19 genes; 13 more than human) and established evolutionary relationships/nomenclature with respect to WGDs. Traits central to IGFBP action were determined for all genes, including atomic interactions in IGFBP-IGF1/IGF2 complexes regulating IGF-IGF1R binding. Using statistical methods, we demonstrate that attributes of these protein interfaces are overwhelmingly a product of 2R IGFBP family membership, explain 49-68% of variation in IGFBP mRNA concentration in several different tissues, and strongly predict the strength and direction of IGFBP transcriptional regulation under differing nutritional states. The results support a model where vertebrate IGFBP family members evolved divergent structural attributes to provide distinct competition for IGFs with IGF1Rs, predisposing different functions in the regulation of IGF signaling. Evolution of gene expression then acted to ensure the appropriate physiological production of IGFBPs according to their structural specializations, leading to optimal IGF signaling according to nutritional status and the endocrine/local mode of action.
This study demonstrates that relatively recent gene family expansion can facilitate inference of functional evolution within ancient genetic systems.
Year: 2013 PMID: 23360665 PMCID: PMC3670735 DOI: 10.1093/molbev/mst017
Source DB: PubMed Journal: Mol Biol Evol ISSN: 0737-4038 Impact factor: 16.240
Table 1. Details of the Complete Salmonid IGFBP Gene System, Including Nomenclature, Gene Structures, Primary Protein Sequences, and Identity between Paralogs of 2R Family Members.
| Proposed Salmonid Name | Previous Salmonid Name/s | NCBI Accession (Atlantic Salmon) | Gene Length, Intron–Exon Structure (a) | Amino Acid Length/Molecular Weight (kDa) | Amino Acid % Identity with 3R/4R Relatives |
|---|---|---|---|---|---|
| IGFBP-1A1 | | cDNA: JX565543; gDNA: AGKD01032414 | 2,030 bp. E1–379, I1–291, E2–173, I2–502, E3–129, I3–426, E4–134 | 268/28.7 | 79 vs. 1A2; 57 vs. 1B1; 57 vs. 1B2 |
| IGFBP-1A2 | Novel | cDNA: JX565544; gDNA: AGKD01161668 | >1,835 bp. E1–379, I1–254, E2–173, I2–399, E3–129, I3–>501, E4–126 | 268/28.7 | 79 vs. 1A1; 54 vs. 1B1; 53 vs. 1B2 |
| IGFBP-1B1 | | cDNA: JX565545; gDNA: AGKD01000719 | 6,149 bp. E1–397, I1–172, E2–140, I2–5,021, E3–129, I3–170, E4–120 | 245/26.5 | 85 vs. 1B2; 57 vs. 1A1; 54 vs. 1A2 |
| IGFBP-1B2 | Novel | cDNA: JX565546; gDNA: AGKD01088518 | 1,866 bp. E1–366, I1–343, E2–140, I2–552, E3–129, I3–227, E4–126 | 247/26.5 | 85 vs. 1B1; 57 vs. 1A1; 53 vs. 1A2 |
| IGFBP-2A | IGFBP-2 | cDNA: JX565547; gDNA: AGKD01016051, AGKD01019007 | >10,353 bp. E1–365, I1–>6,555, E2–183, I2–2,248, E3–197, I3–721, E4–159 | 280/31.1 | 61 vs. 2B1; 64 vs. 2B2 |
| IGFBP-2B1 | IGFBP-2B | cDNA: JX565548; gDNA: AGKD01042581, AGKD01190121 | >11,278 bp. E1–413, I1–>9,037, E2–147, I2–1,384, E3–141, I3–>156, E4–180 | 283/31.6 | 77 vs. 2B2; 61 vs. 2A |
| IGFBP-2B2 | Novel | cDNA: JX565549; gDNA: AGKD01106875, AGKD01074801, AGKD01075328 | >29,024 bp. E1–365, I1–>3,258, E2–183, I2–8,670, E3–177, I3–1,430, E4–141 | 286/32.1 | 77 vs. 2B1; 64 vs. 2A |
| IGFBP-3A1 | IGFBP-3 | cDNA: JX565550; gDNA: AGKD01052954, AGKD01064800, AGKD01079623 | >23,610 bp. E1–364, I1–>15,854, E2–277, I2–718, E3–120, I3–>6,151, E4–126 | 296/31.8 | 81 vs. 3A2; 59 vs. 3B1; 62 vs. 3B2 |
| IGFBP-3A2 | Novel | cDNA: JX565551; gDNA: AGKD01110782, AGKD01363123, AGKD01394173, AGKD01187220 | >13,204 bp. E1–364, I1–>6,934, E2–284, I2–>2,196, E3–120, I3–2,468, E4–126 | 294/32.0 | 81 vs. 3A1; 57 vs. 3B1; 58 vs. 3B2 |
| IGFBP-3B1 | Novel | cDNA: JX565552; gDNA: AGKD01000719 | 8,480 bp. E1–370, I1–1,654, E2–266, I2–3,337, E3–120, I3–2,606, E4–127 | 293/32.4 | 83 vs. 3B2; 59 vs. 3A1; 57 vs. 3A2 |
| IGFBP-3B2 | Novel | cDNA: JX565553; gDNA: AGKD01022138 | 11,644 bp. E1–364, I1–1,468, E2–257, I2–1,364, E3–120, I3–7,945, E4–126 | 288/32.0 | 83 vs. 3B1; 62 vs. 3A1; 58 vs. 3A2 |
| IGFBP-4 | IGFBP-4 | cDNA: JX565554; gDNA: AGKD01005530 | 12,679 bp. E1–361, I1–9,459, E2–164, I2–574, E3–126, I3–1,857, E4–138 | 262/28.7 | N/A |
| IGFBP-5B1 | IGFBP-5 | cDNA: JX565555; gDNA: AGKD01005278 | 13,537 bp. E1–334, I1–12,362, E2–224, I2–181, E3–120, I3–178, E4–138 | 271/30.0 | 96 vs. 5B2; 74 vs. 5A |
| IGFBP-5B2 | IGFBP-5.1 | cDNA: JX565556; gDNA: AGKD01022725, AGKD01053266 | >10,612 bp. E1–334, I1–>9,475, E2–227, I2–182, E3–120, I3–142, E4–132 | 270/29.9 | 96 vs. 5B1; 72 vs. 5A |
| IGFBP-5A | IGFBP-5.2 | cDNA: JX565557; gDNA: AGKD01113827, AGKD01019007 | >5,800 bp. E1–334, I1–>208, E2–203, I2–152, E3–120, I3–>4,784, E4–? | 269/29.6 | 74 vs. 5B1; 72 vs. 5B2 |
| IGFBP-6A1 | Novel | cDNA: JX565558; gDNA: AGKD01008228 | 1,720 bp. E1–343, I1–723, E2–36, I2–143, E3–121, I3–250, E4–104 | 204/21.9 | 89 vs. 6A2; 52 vs. 6B1; 56 vs. 6B2 |
| IGFBP-6A2 | Novel | cDNA: JX565559; gDNA: AGKD01099665 | 1,723 bp. E1–343, I1–712, E2–34, I2–165, E3–123, I3–232, E4–114 | 204/22.0 | 89 vs. 6A1; 51 vs. 6B1; 55 vs. 6B2 |
| IGFBP-6B1 | IGFBP-6 | cDNA: JX565560; gDNA: AGKD01054596 | 2,953 bp. E1–340, I1–1,076, E2–35, I2–787, E3–120, I3–424, E4–170 | 199/21.5 | 89 vs. 6B2; 52 vs. 6A1; 51 vs. 6A2 |
| IGFBP-6B2 | Novel | cDNA: JX565561; gDNA: AGKD01135063 | 3,174 bp. E1–331, I1–1,053, E2–35, I2–953, E3–120, I3–568, E4–114 | 202/21.9 | 89 vs. 6B1; 56 vs. 6A1; 55 vs. 6A2 |
a Spans the whole coding sequence only. "E" and "I" denote exons and introns, respectively, such that E1 and I1 describe exon 1 and intron 1 on the sense strand; the number after the en dash (e.g., E1–379) is the length in nucleotides, and a leading ">" marks a length known only as a lower bound.
b Shimizu, Kishimoto, et al. (2011).
c Kamangar et al. (2006).
d Bower et al. (2008).
e Shimizu, Suzuki, et al. (2011).
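The gene-structure notation lends itself to a simple consistency check: where every exon and intron is fully resolved, the segment lengths should add up to the stated gene length. The following is a minimal Python sketch of that check, using the structure string listed for cDNA JX565559. The function name `parse_structure` and the parsing rules are illustrative assumptions, not part of the original study.

```python
import re

# Structure cell for the gene with cDNA accession JX565559 (from the table).
CELL = "1,723 bp. E1–343, I1–712, E2–34, I2–165, E3–123, I3–232, E4–114"

def parse_structure(cell):
    """Split a structure cell into its stated length and its segments.

    Returns (stated_bp, segments); each segment is (label, length, lower_bound),
    where lower_bound is True for values written with a leading '>'.
    """
    head, _, body = cell.partition(" bp. ")
    stated_bp = int(head.lstrip(">").replace(",", ""))
    segments = [
        (label, int(digits.replace(",", "")), bound == ">")
        for label, bound, digits in re.findall(
            r"([EI]\d+)–(>?)(\d+(?:,\d{3})*)", body
        )
    ]
    return stated_bp, segments

stated_bp, segments = parse_structure(CELL)
exon_bp = sum(n for label, n, _ in segments if label.startswith("E"))
intron_bp = sum(n for label, n, _ in segments if label.startswith("I"))

print(stated_bp, exon_bp, intron_bp)     # 1723 614 1109
print(exon_bp + intron_bp == stated_bp)  # True
```

For rows containing ">" values, the same sum gives only a lower bound on gene length, so an exact match should not be expected there.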
Fig. 1. (A) ML family member tree for IGFBP-1; ML/NJ trees for other family members are provided in the supplementary material S1 and figs. S1–S5, Supplementary Material online. The positions of 3R and 4R were inferred according to criteria set out in the main text. In this tree, the topology is consistent with duplication of IGFBP-1 during 3R, producing IGFBP-1A and -1B (after Kamei et al. 2008) before these genes duplicated again during 4R producing IGFBP-1A1, -1A2, -1B1, and -1B2 (our 4R nomenclature, table 1). Node bootstrap support values exceeding 50% are shown. Accession numbers and Ensembl gene identifiers are provided. Novel genes are boxed in green and novel sequences highlighted bold. The scale represents the number of inferred substitutions per site. (B) ML tree of the complete Atlantic salmon IGFBP gene system, which recaptures 3R and 4R inferred from family member reconstructions (A; supplementary material S1 and figs. S1–S5, Supplementary Material online). The positions of 1R and 2R are based on comparative genomics (after Ocampo Daza et al. 2011). Branching patterns within two evident IGFBP "metaclades" are sensitive to the tree-building method and statistically poorly supported (compare B and supplementary material S1 and fig. S7, Supplementary Material online). Green branches highlight novel salmonid IGFBP genes.
Fig. 2. (A) Chimera renderings of a modeled ternary complex containing Atlantic salmon IGF1, NBP-4, and CBP-4. CBP and NBP surfaces are shown (transparent in the upper and lower images, respectively). IGF1 is shown as a ribbon, with sidechains displayed for residues contacting NBP or CBP. Inferred atomic-level interactions are shaded red, between NBP–IGF1 and NBP–CBP in the upper image and CBP–IGF1 and CBP–NBP in the lower image. The NBP thumb is highlighted by an arrow and the main IGF-binding region is evident as a large patch of red shading on the NBP surface. (B) Bar charts comparing core IGFBP family members in terms of the number of atomic-level contacts comprising three interfaces (identified) in IGFBP–IGF1 and IGFBP–IGF2 complexes. For each IGFBP family member, the left and right hand bars show contacts made in IGFBP–IGF1 and IGFBP–IGF2 complexes, respectively. All data are means + SD with n equal to the 4R gene number. Different letters indicate significant differences (P < 0.01) between IGFBP family members compared separately for models containing IGF1 and IGF2.
Fig. 3. Examples of the NBP thumb region of Atlantic salmon IGFBP–IGF1 complexes representing three IGFBP family members. Residues comprising the NBP thumb and CBP interface are colored gray and white, respectively, with the corresponding surfaces portrayed as meshes. IGF1 residues predicted to interact with the IGF1R are colored green (or blue otherwise). All residues are labeled with details provided of interactions with other surfaces in the complex.
Fig. 4. mRNA expression of 19 IGFBP genes in 11 juvenile Atlantic salmon tissues. (A) qPCR-derived expression levels portrayed in the style of a northern-dot blot, scaled to be relative across genes. The area of black circles represents the mean expression level and the distance between the circumference of black and dotted circles the SD (n = 4). Gene-by-gene bar graphs showing the same data are provided in the supplementary material S1 and figs. S9 and S10, Supplementary Material online. (B) Unsupervised hierarchical clustering of IGFBP gene expression. Numbers at branch nodes are AU P values (Suzuki and Shimodaira 2006) based on 5,000 bootstrap iterations.
Table 2. Regression Associating IGFBP Gene Expression Levels with the Number of Atomic-Level Contacts at Key Interfaces in IGFBP-IGF1/IGF2 Complexes.
| mRNA Level | Best Regression Model | Variation Explained (%) | P | Mallows' Cp, S |
|---|---|---|---|---|
| Liver | = −4.03 + 0.216α + 0.399β | 67.7 | | 3.0, 2.74 |
| Gill | = 17.0 − 0.371α | 53.9 | | 2.3, 3.15 |
| Skin | = 16.9 − 0.368α | 52.8 | | 2.9, 4.32 |
| Eye | = 16.6 − 0.355α | 49.2 | | 3.9, 3.31 |
| Fast muscle | = 12.7 − 0.373α + 0.259β | 60.8 | | 3.0, 3.01 |
| Head kidney | = 7.34 − 0.255α + 0.404β | 55.4 | 0.008 | 3.0, 3.21 |
| Heart | = 16.0 − 0.327α | 42.0 | 0.009 | 1.4, 3.53 |
| Lower intestine | = 15.9 − 0.323α | 41.4 | 0.010 | 3.9, 3.52 |
| Brain | = 15.8 − 0.319α | 40.1 | 0.011 | 1.3, 4.40 |
| Spleen | = 14.8 − 0.280α | 31.1 | 0.031 | 1.4, 3.82 |
| Liver | = −4.43 + 0.249α + 0.355β | 63.2 | | 3.0, 2.93 |
| Gill | = 16.7 − 0.341α | 49.8 | | 1.4, 3.29 |
| Skin | = 16.6 − 0.339α | 49.1 | | 1.6, 3.31 |
| Eye | = 15.9 − 0.310α | 41.0 | 0.010 | 1.9, 3.57 |
| Fast muscle | = 15.8 − 0.30α | 40.8 | 0.010 | 3.1, 3.56 |
| Heart | = 15.2 − 0.285α | 34.7 | 0.021 | 1.0, 3.74 |
| Lower intestine | = 15.0 − 0.275α | 32.8 | 0.026 | 2.2, 3.77 |
| Brain | = 14.9 − 0.272α | 31.8 | 0.028 | 1.0, 3.82 |
| Spleen | = 14.5 − 0.254α | 28.1 | 0.042 | 2.2, 3.90 |
Note: < > = number of predicted atomic-level contacts at a protein interface. α = IGF1 < > NBP thumb. β = NBP thumb < > CBP. Underlined probability values remain significant after Bonferroni correction.
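Each fitted model in the table is a linear function of the contact counts α and β, so a predicted mRNA level follows by direct substitution. The sketch below evaluates the first liver and skin models listed in the table. The contact counts passed in are hypothetical, the helper name `predict_level` is an assumption, and the predicted values are on whatever scale the original regressions used, which the table does not state.

```python
# Coefficients (intercept, alpha, beta) copied from the first liver and
# skin rows of the table. A beta coefficient of 0.0 means the model has
# no beta term.
MODELS = {
    "Liver": (-4.03, 0.216, 0.399),
    "Skin": (16.9, -0.368, 0.0),
}

def predict_level(tissue, alpha, beta=0.0):
    """Evaluate intercept + b_alpha * alpha + b_beta * beta for one tissue."""
    intercept, b_alpha, b_beta = MODELS[tissue]
    return intercept + b_alpha * alpha + b_beta * beta

# Hypothetical contact counts, chosen only to illustrate the arithmetic.
liver = predict_level("Liver", alpha=20, beta=10)
skin = predict_level("Skin", alpha=20)
print(round(liver, 2), round(skin, 2))  # 4.28 9.54
```

The opposite signs on α are the point of interest: under these models, more IGF1 to NBP thumb contacts raise the predicted liver level but lower the predicted skin level.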
Fig. 5. Example scatterplots showing the association between atomic-level contacts within modeled IGFBP–IGF1 complexes and IGFBP mRNA expression in tissues. Family members are colored as in figure 2. (A) Illustrates how IGFBP liver mRNA level is positively correlated with the combined number of contacts made between the NBP thumb and IGF1 and between the NBP thumb and CBP. (B) Illustrates how IGFBP skin mRNA level is negatively associated with the number of predicted contacts between the NBP thumb and IGF1. Associated results are given in table 2.
Fig. 6. (A) Scatterplot showing the association between the number of α and β contacts in modeled IGFBP–IGF1 ternary complexes and IGFBP mRNA regulation in liver at different nutritional states. The highest ranked IGFBPs on the y axis showed the greatest downregulation during 18 h ad libitum feeding that followed a 72 h period of feed restriction. IGFBPs above and below the dotted line were downregulated and upregulated, respectively. (B) Expression data for the two boxed IGFBP genes in (A). On the x axis, C, F, and R indicate control, fasted, and refeeding states. On the y axis, mRNA expression levels are scaled such that the C-state mean equals one; the two charts are not on equivalent scales and should only be compared to indicate the strength and direction of transcriptional regulation. Error bars represent standard deviation. IGFBP family members are colored as in figure 2.
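The scaling described for panel (B), in which each gene's expression is divided by its control-state mean, can be sketched as follows. All values are invented for illustration and the helper name `scale_to_control` is an assumption; the original analysis pipeline is not described here.

```python
from statistics import mean, stdev

# Hypothetical raw expression values for one gene in each nutritional state.
raw = {
    "C": [2.0, 4.0, 3.0, 3.0],   # control
    "F": [6.0, 9.0, 7.5, 7.5],   # fasted
    "R": [1.5, 3.0, 1.5, 3.0],   # refeeding
}

def scale_to_control(groups, control="C"):
    """Divide every value by the control-state mean, so the control mean is 1."""
    ref = mean(groups[control])
    return {state: [v / ref for v in values] for state, values in groups.items()}

scaled = scale_to_control(raw)
for state, values in scaled.items():
    # Mean and standard deviation on the control-relative scale.
    print(state, round(mean(values), 2), round(stdev(values), 2))
```

Because each gene is divided by its own control mean, scaled values show fold change relative to control within a gene but are not comparable in absolute terms between genes, which is why the two charts should be compared only for strength and direction of regulation.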