Abstract
The noncanonical 5' intron donor splice sites GA and GG are exceedingly rare in described eukaryotic genomes; however, they are present in ∼12% of introns in the genome of the copepod Eurytemora affinis. Failure to recognize the high frequency of these donor sites compromised the modeling of genes in this newly sequenced genome, including 10 conserved ionotropic glutamate receptor (GluR) family genes curated herein. These introns appear to have been acquired recently, along with many additional idiosyncratic introns. Their high frequency implies the evolution of modified intron donor splice site recognition in this copepod.
Keywords: Eurytemora; GA donors; copepod genome; intron evolution; noncanonical intron donor splice sites
Year: 2017 PMID: 29079681 PMCID: PMC5714493 DOI: 10.1534/g3.117.300189
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Features of 10 conserved IR and related ionotropic GluR genes and proteins in E. affinis
| Gene | Scaffold# | Length (bp) | Exons | Models | Amino Acids | GA/GG Donors |
|---|---|---|---|---|---|---|
| IR8aF | 68 | 33,279 (26,505) | 33 (23) | 3 | 874 (569) | 5/0 |
| IR21a | 3 | 33,150 (16,252) | 23 (13) | 2 | 763 (493) | 4/1 |
| IR25aF | 269 | 90,005 (90,005) | 34 (34) | 5 | 907 (907) | 4/2 |
| IR76bF | 532 | 18,715 (27,613) | 19 (15) | 2 | 480 (417) | 1/0 |
| IR93a | 35 | 18,850 (15,580) | 20 (19) | 1 | 942 (935) | 0/1 |
| GluR1 | 77 | 69,907 (30,397) | 25 (13) | 2 | 960 (449) | 3/0 |
| GluR2 | 5 | 24,096 (14,472) | 23 (14) | 2 | 923 (615) | 2/0 |
| NMDAR1F | 43 | 63,037 (31,051) | 36 (20) | 4 | 1095 (736) | 2/0 |
| NMDAR2-1 | 101 | 123,322 (102,094) | 37 (28) | 8 | 1055 (847) | 6/1 |
| NMDAR2-2F | 141 | 86,450 (76,362) | 47 (38) | 4 | 1030 (823) | 4/0 |
Lengths are from start to stop codon in large scaffolds, excluding exons present on short separate scaffolds or those that were built de novo, both of which presumably belong in sequence gaps in the large scaffolds, the lengths of which are included in these counts. Only coding exons are included (IR76b and NMDAR2-2 have single noncoding 5′ exons). Models are the number of models in the automated gene set available at the i5k Workspace@NAL genome browser (EAFF_v0.5.3), and usually not all exons are modeled. Numbers in parentheses are for the proteins reported in Eyun . Suffix “F” after gene name indicates that the genome assembly had to be repaired for a complete gene model to be built (details of each gene model are provided in File S1).
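The GA/GG donor counts in the table can be tallied to check them against the totals quoted elsewhere in this record. The following is a minimal Python sketch, not part of the original analysis: the per-gene counts are copied from the "GA/GG Donors" column, the figure of 261 canonical-donor introns is taken from the Figure 1 caption, and the variable names are illustrative assumptions.

```python
# Per-gene (GA, GG) donor counts, copied from the "GA/GG Donors" column above.
donors = {
    "IR8aF": (5, 0),
    "IR21a": (4, 1),
    "IR25aF": (4, 2),
    "IR76bF": (1, 0),
    "IR93a": (0, 1),
    "GluR1": (3, 0),
    "GluR2": (2, 0),
    "NMDAR1F": (2, 0),
    "NMDAR2-1": (6, 1),
    "NMDAR2-2F": (4, 0),
}

# Number of introns with canonical donors, as stated in the Figure 1 caption.
CANONICAL_INTRONS = 261

ga_total = sum(ga for ga, _ in donors.values())
gg_total = sum(gg for _, gg in donors.values())
noncanonical = ga_total + gg_total

# Share of introns in these 10 genes that use a GA or GG donor.
fraction = noncanonical / (noncanonical + CANONICAL_INTRONS)

print(f"GA donors: {ga_total}")
print(f"GG donors: {gg_total}")
print(f"Noncanonical total: {noncanonical}")
print(f"Noncanonical fraction: {fraction:.1%}")
```

The totals come to 31 GA and 5 GG donors, which matches the 36 noncanonical sites in the Figure 1 caption. The resulting share for these 10 genes is about 12.1%, in line with the genome-wide ∼12% given in the abstract, although the two figures describe different sets of introns.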
Figure 1. Sequence logos showing information content for the 36 noncanonical GA and GG 5′ intron donor splice sites and the 3′ acceptor sites for these introns, compared with sites for 261 introns with canonical donors in 10 conserved large ionotropic glutamate receptor family genes in the copepod E. affinis. (A) Ten bases of exon and 13 bases of intron sequence are shown for the donors. (B) Sixteen bases of intron and seven bases of exon sequence are shown for the acceptors. Sequence logos for frequencies of nucleotides are shown in Figure S1.
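The information content plotted in a sequence logo is conventionally computed per alignment column as R = log2(4) − H, where H is the Shannon entropy of the nucleotide frequencies in that column. The sketch below illustrates that calculation in Python. It is an assumption-laden illustration and not the authors' pipeline: the function name `information_content` is hypothetical, the small-sample correction that logo tools often apply is omitted, and the example sites are invented toy strings, not E. affinis sequences.

```python
import math
from collections import Counter


def information_content(sites, alphabet_size=4):
    """Return per-column information content in bits for aligned sites.

    Uses R = log2(alphabet_size) - H with no small-sample correction.
    """
    if not sites:
        raise ValueError("at least one site is required")
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")

    max_bits = math.log2(alphabet_size)
    bits = []
    for column in zip(*sites):
        counts = Counter(column)
        total = len(column)
        # Shannon entropy of the observed nucleotide frequencies.
        entropy = -sum(
            (count / total) * math.log2(count / total)
            for count in counts.values()
        )
        bits.append(max_bits - entropy)
    return bits


# Invented toy alignment: the first column is fully conserved,
# the second is uniformly distributed over the four bases.
toy_sites = ["GA", "GC", "GG", "GT"]
print(information_content(toy_sites))
```

A fully conserved position reaches the 2-bit maximum and a position with all four bases at equal frequency carries 0 bits, which is why strongly constrained donor positions appear as tall letters in a logo.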