Yuh-Chyang Charng, Lung-Hsin Hsu, Li-Yu Daisy Liu.
Abstract
In exonization events, Ds1 may provide donor and/or acceptor sites for splicing after inserting into genes and be incorporated into new transcripts with new exon(s). In this study, we examined the protein variants of Ds1 exonization that yield additional functional profile(s). Unlike Ds exonization, which creates new profiles mostly by incorporating flanking intron sequences with the Ds message, Ds1 exonization additionally creates new profiles through the presence or absence of Ds1 messages. The number of unique functional profiles harboring Ds1 messages is 1.3-fold higher than that of functional profiles without Ds1 messages. The 11 highly similar protein isoforms at a single insertion site also contribute to proteome complexity enrichment by exclusively creating new profiles. In particular, Ds1 exonization produces 459 unique profiles, of which 129 cannot be built by Ds. We thus conclude that Ds and Ds1 are independent but synergistic in their capacity to enrich proteome complexity through exonization.
Keywords: Ds1 transposon; alternative splicing; exonization; nonsense-mediated decay pathway
Year: 2017 PMID: 28469376 PMCID: PMC5395267 DOI: 10.1177/1176934317690410
Source DB: PubMed Journal: Evol Bioinform Online ISSN: 1176-9343 Impact factor: 1.625
Figure 1. (A) Classification of the profiles built by Ds1 and its flanking sequences. (B) Ds1 and Ds sequences yielding splice acceptor (A) or donor (D) junctions (arrows) as well as premature termination codons in exonized transcripts (bold). The translated products of Ds1 are also shown; the gain or loss of 7 amino acids (boxes) that results from using D3 or D1 was important in composing functional profiles. The terminal repeat sequences of each transposable element are underlined. Note that Ds could provide 5 donors (R1, R2, R3, R4, and F1) but no acceptor. The donor F1 is used for exonization by the opposite insertion pattern.
Number of variants, profiles, and profiles per variant yielded by each donor, acceptor, and donor-acceptor combination.
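| Donor/acceptor | Number of variants: interior | Number of variants: C-terminal | Number of variants: interior + C-terminal | Donor/acceptor | Number of profiles: interior | Number of profiles: C-terminal | Number of profiles: interior + C-terminal | Donor/acceptor | Profiles per variant: interior | Profiles per variant: C-terminal | Profiles per variant: avg. |
|---|---|---|---|---|---|---|---|---|---|---|---|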
| D1 | 834 649 | 459 547 | 1 294 196 | D1 | 1 649 507 | 1 604 133 | 3 253 640 | D1 | 1.9763 | 3.4907 | 2.5140 |
| D2 | 538 473 | 578 306 | 1 116 779 | D2 | 1 259 036 | 2 234 207 | 3 493 243 | D2 | 2.3382 | 3.8634 | 3.1280 |
| D3 | 976 104 | 506 912 | 1 483 016 | D3 | 2 066 945 | 1 821 524 | 3 888 469 | D3 | 2.1175 | 3.5934 | 2.6220 |
| Sub-total | 2 349 226 | 1 544 765 | 3 893 991 | Sub-total | 4 975 488 | 5 659 864 | 10 635 352 | Avg. | 2.1440 | 3.6492 | 2.7547 |
| D1A1 | 140 406 | 500 362 | 640 768 | D1A1 | 386 510 | 1 652 797 | 2 039 307 | D1A1 | 2.7528 | 3.3032 | 3.1826 |
| D1A2 | 88 020 | 429 108 | 517 128 | D1A2 | 213 532 | 1 315 830 | 1 529 362 | D1A2 | 2.4259 | 3.0664 | 2.9574 |
| D2A1 | 95 605 | 485 683 | 581 288 | D2A1 | 288 212 | 1 729 509 | 2 017 721 | D2A1 | 3.0146 | 3.5610 | 3.4711 |
| D2A2 | 139 554 | 443 916 | 583 470 | D2A2 | 793 056 | 1 613 625 | 2 406 681 | D2A2 | 5.6828 | 3.6350 | 4.1248 |
| D3A1 | 156 369 | 554 037 | 710 406 | D3A1 | 460 560 | 1 870 458 | 2 331 018 | D3A1 | 2.9453 | 3.3761 | 3.2812 |
| D3A2 | 96 485 | 486 622 | 583 107 | D3A2 | 250 906 | 1 503 768 | 1 754 674 | D3A2 | 2.6005 | 3.0902 | 3.0092 |
| Sub-total | 716 439 | 2 899 728 | 3 616 167 | Sub-total | 2 392 776 | 9 685 987 | 12 078 763 | Avg. | 3.2370 | 3.3387 | 3.3377 |
| A1 | 653 001 | 2 911 994 | 3 564 995 | A1 | 92 | 6 951 881 | 6 951 973 | A1 | 0.0001 | 2.3873 | 1.9501 |
| A2 | 584 570 | 2 599 057 | 3 183 627 | A2 | 49 | 6 280 165 | 6 280 214 | A2 | 0.0001 | 2.4163 | 1.9727 |
| Sub-total | 1 237 571 | 5 511 051 | 6 748 622 | Sub-total | 141 | 13 232 046 | 13 232 187 | Avg. | 0.0001 | 2.4018 | 1.9614 |
| Total | 4 303 236 | 9 955 544 | 14 258 780 | Total | 7 368 405 | 28 577 897 | 35 946 302 | Avg. | 1.7123 | 2.8706 | 2.5210 |
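The "profiles per variant" columns can be re-derived from the two count columns. The sketch below uses only the interior counts for D1, D2, and D3 and the interior totals, copied from the table. It suggests that the sub-total "Avg." entry matches the unweighted mean of the per-donor ratios, whereas the "Total" row matches the pooled ratio of all profiles to all variants. The variable and function names are illustrative and do not come from the original study.

```python
# Interior counts copied from the table: (number of variants, number of profiles).
interior = {
    "D1": (834_649, 1_649_507),
    "D2": (538_473, 1_259_036),
    "D3": (976_104, 2_066_945),
}


def profiles_per_variant(variants: int, profiles: int) -> float:
    """Average number of functional profiles carried by one variant."""
    return profiles / variants


# Ratio for each donor site, as in the right-hand block of the table.
per_site = {site: profiles_per_variant(v, p) for site, (v, p) in interior.items()}

# Unweighted mean of the three per-donor ratios (sub-total "Avg." entry).
mean_of_ratios = sum(per_site.values()) / len(per_site)

# Pooled ratio over the same three donors (all profiles / all variants).
pooled_donor_ratio = (
    sum(p for _, p in interior.values()) / sum(v for v, _ in interior.values())
)

# Pooled ratio for the "Total" row, interior columns.
total_interior_ratio = profiles_per_variant(4_303_236, 7_368_405)

for site, ratio in per_site.items():
    print(f"{site}: {ratio:.4f}")
print(f"mean of ratios: {mean_of_ratios:.4f}")
print(f"pooled donor ratio: {pooled_donor_ratio:.4f}")
print(f"total interior ratio: {total_interior_ratio:.4f}")
```

The two averaging conventions give different numbers for the donor sub-total, so the sub-total and total rows should not be compared as if they were computed the same way.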
Figure 2. The proportions of functional variants: (A) all variants, (B) all variants, (C) interior variants, and (D) C-terminal variants. A indicates acceptor; D, donor; DA, donor and acceptor.
Figure 3. The proportions and numbers of functional profiles: (A) interior profiles, (B) C-terminal profiles, (C) numbers of interior profiles (in thousands), and (D) numbers of C-terminal profiles (in thousands). A indicates acceptor; D, donor; DA, donor and acceptor.
Figure 4. Distinct donor/acceptor combinations (ie, D1A2, D2A1, and D3A2) resulting in proteins that differ from other proteins by only a few amino acids.
Unique functional profiles yielded by gaining or losing 7 amino acids (either “VGNGIYS” or “GRKRYLF”), which are exonized using the D1 and D3 donors because the splice junction of D3 is located 21 bp downstream of that of D1.
| Class | Gain or loss of | Profile ID |
|---|---|---|
| I02-D3 | Gain VGNGIYS | PS00098; PS00186; PS00371; PS00447; PS01067 |
| I02-D3 | Gain GRKRYLF | PS00636; PS00743; PS00761 |
| I12-D3 | Gain VGNGIYS | PS00098; PS00186; PS00371; PS00420; PS00551; PS00595; PS00878; PS01067; PS01143 |
| I12-D3 | Gain GRKRYLF | PS00027; PS00041; PS00636; PS01143 |
| I22-D3 | Gain GRKRYLF | PS00009 |
| I04-D1 | Loss VGNGIYS | PS00012; PS00027; PS00028; PS00029; PS00041; PS00059; PS00079; PS00086; PS00098; PS00189; PS00211; PS00212; PS00216; PS00251; PS00285; PS00310; PS00356; PS00358; PS00371; PS00389; PS00445; PS00527; PS00583; PS00589; PS00592; PS00615; PS00636; PS00652; PS00666; PS00678; PS00770; PS00778; PS00878; PS00909; PS00957; PS01008; PS01067; PS01145; PS01249; PS01353 |
| I04-D1 | Loss GRKRYLF | PS00007; PS00029; PS00041; PS00079; PS00136; PS00189; PS00299; PS00451; PS00464; PS00595; PS00652; PS00698; PS00743; PS01202 |
| I04-D3 | Gain VGNGIYS | PS00052; PS00292; PS00634; PS01094; PS01109; PS01171 |
| I04-D3 | Gain GRKRYLF | PS01117 |
| I14-D1 | Loss VGNGIYS | PS00012; PS00027; PS00028; PS00029; PS00053; PS00062; PS00079; PS00086; PS00095; PS00098; PS00128; PS00133; PS00146; PS00163; PS00186; PS00189; PS00194; PS00211; PS00212; PS00251; PS00262; PS00285; PS00299; PS00316; PS00338; PS00358; PS00371; PS00389; PS00392; PS00445; PS00551; PS00589; PS00592; PS00615; PS00636; PS00657; PS00672; PS00678; PS00818; PS00889; PS00914; PS01067; PS01103; PS01186; PS01275 |
| I14-D1 | Loss GRKRYLF | PS00007; PS00022; PS00024; PS00028; PS00029; PS00063; PS00079; PS00107; PS00133; PS00194; PS00216; PS00232; PS00236; PS00280; PS00296; PS00362; PS00410; PS00422; PS00451; PS00464; PS00595; PS00678; PS01159; PS01186; PS60014 |
| I14-D3 | Gain VGNGIYS | PS00218; PS00559; PS00605; PS01094; PS01109 |
| I14-D3 | Gain GRKRYLF | PS00605; PS00634; PS01117 |
| I24-D1 | Loss VGNGIYS | PS00165; PS00187; PS00237; PS00304; PS00671; PS00778; PS00915; PS01319 |
| I24-D1 | Loss GRKRYLF | PS00018; PS00213 |
| I24-D3 | Gain VGNGIYS | PS00062; PS00079; PS00107; PS00170; PS00211; PS00259; PS00290; PS00380; PS00588; PS00589; PS00598; PS00606; PS00636; PS01032 |
| I24-D3 | Gain GRKRYLF | PS00070; PS00214; PS00674; PS01238 |
| C02-D3 | Gain VGNGIYS | PS00098; PS00186; PS00189; PS00551 |
| C02-D3 | Gain GRKRYLF | PS00583; PS00636; PS00761 |
| C12-D3 | Gain VGNGIYS | PS00073; PS00186; PS00189; PS00371; PS00447 |
| C12-D3 | Gain GRKRYLF | PS00041; PS00098; PS00636; PS00761 |
| C22-D3 | Gain GRKRYLF | PS00009 |
| C04-D1 | Loss VGNGIYS | PS00028; PS00029; PS00041; PS00061; PS00079; PS00211; PS00389; PS00551 |
| C04-D1 | Loss GRKRYLF | PS00007; PS00028; PS00029; PS00041; PS00159; PS00189; PS00464; PS00583; PS01176; PS01249 |
| C04-D3 | Gain VGNGIYS | PS00107; PS01047; PS01094 |
| C04-D3 | Gain GRKRYLF | PS00527; PS01117; PS01143 |
| C14-D1 | Loss VGNGIYS | PS00012; PS00022; PS00028; PS00029; PS00061; PS00073; PS00079; PS00086; PS00132; PS00133; PS00163; PS00189; PS00211; PS00285; PS00371; PS00392; PS00447; PS00527; PS00605; PS00678; PS00878; PS00889 |
| C14-D1 | Loss GRKRYLF | PS00007; PS00021; PS00022; PS00028; PS00029; PS00063; PS00079; PS00223; PS00270; PS00296; PS00527; PS00652; PS00678; PS00761; PS01186 |
| C14-D3 | Gain VGNGIYS | PS00214; PS00217; PS00622; PS01047 |
| C14-D3 | Gain GRKRYLF | PS00217 |
| C24-D1 | Loss VGNGIYS | PS00217; PS00615 |
| C24-D1 | Loss GRKRYLF | PS00018; PS00027; PS00213 |
| C24-D3 | Gain VGNGIYS | PS00092; PS00107; PS00205 |
| C24-D3 | Gain GRKRYLF | PS00012; PS00216; PS00276 |
| C44-D1 | Loss VGNGIYS | PS00435 |
Note that the total number of profiles composing a class may not be equal to the number of unique profiles of that class shown in Figures 2 and 3 because messages other than these 7 amino acids can also build functional profiles.
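The 7-residue gain or loss follows from the 21 bp offset between the D1 and D3 splice junctions stated in the table caption. As a small arithmetic check, assuming the standard 3 nucleotides per codon and that the offset lies entirely within coding sequence (variable names are illustrative):

```python
# Offset between the D1 and D3 splice junctions, as stated in the caption.
OFFSET_BP = 21
# Standard codon length (assumption stated in the lead-in).
CODON_LENGTH = 3

# The two 7-residue messages named in the caption.
messages = ("VGNGIYS", "GRKRYLF")

# Number of whole codons spanned by the offset.
residues = OFFSET_BP // CODON_LENGTH
# An offset that is a multiple of 3 does not shift the downstream reading frame.
keeps_frame = OFFSET_BP % CODON_LENGTH == 0

print(residues, keeps_frame)
print([len(m) for m in messages])
```

Because 21 is a multiple of 3, switching between D1 and D3 adds or removes whole codons without changing the reading frame of the sequence that follows.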
Unique translated Ds1 messages for the functional profiles of exonized protein isoforms yielded by joining specific donor and acceptor sites of D1A2, D2A1, and D3A2.
| Unique translated Ds1 message | Interior | C-terminal |
|---|---|---|
| | | |
| RDENDY | | PS00007; PS00189 |
| RDENDYH | — | |
| RDENDYHFHP | PS00028 | |
| GMKTI | — | PS00371 |
| GMKTII | PS00079; | PS00079; PS00636 |
| GMKTIIT | — | PS01067 |
| GMKTIITFI | | PS00098; PS00356 |
| GMKTIITFIP | PS00041; PS00107; PS00189; | PS00041; PS00189; PS00223; PS00622; PS00634; PS00716; PS00838; PS01047 |
| MKTIITFIP | — | |
| | | |
| RDENG | — | PS00761 |
| RDENGRKR | PS00041 | — |
| NGRK | PS00009 | PS00009 |
| RKRS | | PS00004 |
| SDYHFHP | | PS00214 |
| GMKTV | — | PS00371 |
| GMKTVG | PS00186; PS00589; PS00878 | PS00186; PS00420; PS00589; PS01067 |
| GMKTVGNA | | PS00012 |
| GMKTVGNAQII | — | |
| GMKTVGNAQIITF | — | |
| GMKTVGNAQIITFIP | PS00041; PS00716; PS01047 | PS00716; PS01047 |
| TVGNAQIITFIP | PS00223 | — |
| NAQIITFIP | PS00189 | — |
| | | |
| RDENG | — | PS00761 |
| RDENGRKR | PS00041 | — |
| RDENGRKRY | PS00636 | PS00636 |
| RDENGRKRYL | — | |
| RDENGRKRYLFD | — | |
| NGRK | PS00009 | PS00009 |
| GMKTV | — | PS00371 |
| GMKTVG | PS00186; PS00589; PS00878 | PS00186; PS00420; PS00589; PS01067 |
| GMKTVGNGI | — | PS00098 |
| GMKTVGNGIY | — | PS00189 |
| VGNGIYSIITFIP | PS00107 | PS00107 |
| GNGIYSIITFIP | PS00189 | — |
| NGIYSIITFIP | | |
| GIYSIITFIP | PS00079 | PS00079 |
| YSIITFIP | | PS00027 |
| SIITFIP | | PS00392 |
The resulting variants differ from others by only a few amino acids, which build profiles that are unique to only 1 (bold) or 2 patterns. Although PS00189, PS00371, PS01067, and PS00041 are present in all three patterns, each profile was built by an independent translated Ds1 message (see text). Profiles yielded by gaining or losing 7 amino acids (either “VGNGIYS” or “GRKRYLF”), which are exonized using D1A2 and D3A2, are not shown.
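The statement that four profiles occur in all three patterns can be checked with a set intersection. The sketch below pools the interior and C-terminal profile IDs that are legible in each of the three blocks of the table above. The labels `block_1` to `block_3` simply follow table order and are placeholders; which donor/acceptor combination each block corresponds to is not assumed here, and any IDs missing from the table as reproduced are not included.

```python
# Profile IDs legible in each of the three table blocks (interior and
# C-terminal pooled). Block labels are placeholders in table order.
blocks = {
    "block_1": {
        "PS00007", "PS00189", "PS00028", "PS00371", "PS00079", "PS00636",
        "PS01067", "PS00098", "PS00356", "PS00041", "PS00107", "PS00223",
        "PS00622", "PS00634", "PS00716", "PS00838", "PS01047",
    },
    "block_2": {
        "PS00761", "PS00041", "PS00009", "PS00004", "PS00214", "PS00371",
        "PS00186", "PS00589", "PS00878", "PS00420", "PS01067", "PS00012",
        "PS00716", "PS01047", "PS00223", "PS00189",
    },
    "block_3": {
        "PS00761", "PS00041", "PS00636", "PS00009", "PS00371", "PS00186",
        "PS00589", "PS00878", "PS00420", "PS01067", "PS00098", "PS00189",
        "PS00107", "PS00079", "PS00027", "PS00392",
    },
}

# Profiles found in every block.
shared_by_all = set.intersection(*blocks.values())

print(sorted(shared_by_all))
# ['PS00041', 'PS00189', 'PS00371', 'PS01067']
```

The result agrees with the four profiles named in the note; the table shows that in each block they are attached to a different translated Ds1 message.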
Figure 5. The distributions of (A and B) numbers and (C and D) unique numbers of profiles per intron in rice yielded by Ds1 and Ds exonization, respectively.