Bing-Bing Wang, Volker Brendel.
Abstract
A total of 74 small nuclear RNA (snRNA) genes and 395 genes encoding splicing-related proteins were identified in the Arabidopsis genome by sequence comparison and motif searches, including the previously elusive U4atac snRNA gene. Most of the genes have not been studied experimentally. Classification of these genes and detailed information on gene structure, alternative splicing, gene duplications and phylogenetic relationships are made accessible as a comprehensive database of Arabidopsis Splicing Related Genes (ASRG) on our website.
Year: 2004 PMID: 15575968 PMCID: PMC545797 DOI: 10.1186/gb-2004-5-12-r102
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Arabidopsis snRNA genes
| GeneID | Chromosome | Strand | From | To | Length (nucleotides) | e-value | Similarity | GenBank ID |
| At5g49054 | 5 | - | 19903323 | 19903158 | 166 | 1E-89 | 1-166, 100% | gi17660 | |
| At4g23415 | 4 | + | 12225621 | 12225786 | 166 | 1E-58 | 1-166, 92% | gi22293582 | |
| At5g51675 | 5 | + | 21013986 | 21014149 | 164 | 4E-55 | 3-166, 91% | ||
| At5g25774 | 5 | - | 8972971 | 8972807 | 165 | 2E-51 | 1-166, 90% | gi22293583 | |
| At1g08115 | 1 | - | 2538238 | 2538073 | 166 | 1E-46 | 1-166, 89% | gi22293581 | |
| At3g05695 | 3 | + | 1681815 | 1681977 | 163 | 4E-40 | 4-166, 87% | ||
| At3g05672 | 3 | + | 1657766 | 1657928 | 163 | 4E-40 | 4-166, 87% | gi22293580 | |
| At5g27764 | 5 | + | 9832576 | 9832740 | 165 | 1E-39 | 1-166, 87% | ||
| At5g26694 | 5 | - | 9494594 | 9494430 | 165 | 1E-27 | 1-166, 84% | ||
| At1g11884 | 1 | - | 4007396 | 4007236 | 161 | 1E-18 | 4-61, 93%; 80-166, 88% | ||
| At4g16645 | 4 | + | 9370786 | 9370841 | 56 | 7E-17 | 4-59, 94% | ||
| At4g23565 | 4 | - | 12298871 | 12298802 | 70 | 1E-15 | 94-163, 90% | ||
| At5g49524 | 5 | - | 20112431 | 20112275 | 157 | 2E-14 | 4-50, 91%; 91-166, 88% | ||
| At1g35354 | 1 | + | 12986822 | 12986908 | 87 | 1E-06 | 10-60, 88%; 84-118, 88% | ||
| At1g16825 | 1 | + | 5758381 | 5758575 | 195 | 2E-88 | 1-196, 96% | ||
| At3g57645 | 3 | + | 21357718 | 21357913 | 196 | 1E-107 | 1-196, 100% | gi17661 | |
| At3g57765 | 3 | - | 21408595 | 21408400 | 196 | 1E-95 | 1-196, 97% | gi17662 | |
| At3g56825 | 3 | - | 21052994 | 21052800 | 195 | 5E-86 | 1-196, 95% | gi17663 | |
| At5g09585 | 5 | + | 2975013 | 2975208 | 196 | 7E-79 | 1-196, 93% | gi17664 | |
| At3g56705 | 3 | + | 21015472 | 21015667 | 196 | 1E-83 | 1-196, 94% | gi17665 | |
| At5g61455 | 5 | - | 24730829 | 24730634 | 196 | 5E-86 | 1-196, 95% | gi17666 | |
| At5g67555 | 5 | + | 26966884 | 26967079 | 196 | 5E-86 | 1-196, 95% | ||
| At4g01885 | 4 | + | 815273 | 815466 | 194 | 2E-82 | 1-194, 94% | gi17667 | |
| At2g02938 | 2 | + | 849777 | 849972 | 196 | 3E-93 | 1-196, 96% | gi22293586 | |
| At2g02940 | 2 | + | 852859 | 853054 | 196 | 3E-93 | 1-196, 96% | ||
| At1g09805/09895 | 1 | - | 3180736 | 3180547 | 190 | 8E-85 | 1-190, 95% | ||
| At2g20405 | 2 | + | 8809169 | 8809364 | 196 | 3E-81 | 1-196, 94% | gi22293584 | |
| At1g14165 | 1 | + | 4842274 | 4842469 | 196 | 3E-81 | 1-196, 94% | gi22293585 | |
| At5g62415 | 5 | + | 25083790 | 25083985 | 196 | 4E-74 | 1-196, 92% | ||
| At5g57835 | 5 | - | 23448717 | 23448522 | 196 | 2E-67 | 1-196, 92% | ||
| At5g14545 | 5 | - | 4690105 | 4690008 | 98 | 3E-44 | 1-98, 97% | ||
| At3g26815 | 3 | + | 9881236 | 9881303 | 68 | 2E-14 | 1-68, 89% | ||
| At5g49056 | 5 | - | 19902970 | 19902817 | 154 | 4E-80 | 1-154, 99% | gi17673 | |
| At3g06900 | 3 | - | 2178343 | 2178190 | 154 | 2E-75 | 1-154, 98% | gi17674 | |
| At5g49526 | 5 | - | 20112072 | 20112030 | 43 | 2E-11 | 15-57, 95% | gi17675 | |
| At1g49242/49235 | 1 | - | 18222354 | 18222201 | 154 | 2E-75 | 1-154, 98% | gi22293588 | |
| At5g25776 | 5 | - | 8972618 | 8972465 | 154 | 1E-70 | 1-154, 96% | ||
| At1g11886 | 1 | - | 4007020 | 4006867 | 154 | 1E-70 | 1-154, 96% | gi22293587 | |
| At5g27766 | 5 | + | 9832934 | 9833083 | 150 | 7E-66 | 1-150, 96% | ||
| At5g26996 | 5 | - | 9494230 | 9494081 | 150 | 7E-66 | 1-150, 96% | ||
| At1g79965 | 1 | + | 30086031 | 30086168 | 138 | 9E-47 | 18-154, 92% | ||
| At1g35356 | 1 | + | 12987189 | 12987313 | 125 | 3E-34 | 1-124, 90% | ||
| At1g68395 | 1 | + | 25647322 | 25647396 | 75 | 9E-07 | 18-37, 100%; 60-102, 90% | ||
| At3g55865 | 3 | - | 20740607 | 20740503 | 105 | 6E-35 | 1-105, 94% | gi17676 | |
| At3g55855 | 3 | - | 20736881 | 20736780 | 102 | 7E-38 | 1-102, 96% | gi22293592 | |
| At1g65115 | 1 | + | 24194482 | 24194586 | 105 | 1E-39 | 1-105, 96% | ||
| At1g70185 | 1 | + | 26433396 | 26433497 | 102 | 7E-38 | 1-102, 96% | gi22293590 | |
| At3g55645 | 3 | + | 20653843 | 20653947 | 105 | 3E-37 | 1-105, 95% | ||
| At1g24105/24095 | 1 | - | 8525204 | 8525103 | 102 | 2E-35 | 1-102, 95% | gi22293591 | |
| At1g04475 | 1 | - | 1215831 | 1215730 | 102 | 2E-35 | 1-102, 95% | gi22293589 | |
| At4g02535 | 4 | - | 1114629 | 1114528 | 102 | 1E-30 | 2-103, 93% | ||
| At3g25445 | 3 | - | 9227212 | 9227116 | 97 | 1E-20 | 5-101, 89% | ||
| At1g79545 | 1 | - | 29928543 | 29928447 | 97 | 1E-20 | 5-101, 89% | ||
| At5g14547 | 5 | - | 4690412 | 4690370 | 43 | 3E-12 | 24-67, 97% | ||
| At5g54065 | 5 | - | 21957066 | 21957023 | 44 | 2E-10 | 20-64, 95% | ||
| At1g71355 | 1 | + | 26895255 | 26895298 | 44 | 2E-10 | 20-64, 95% | ||
| At5g53745 | 5 | - | 21829988 | 21829943 | 46 | 3E-09 | 24-70, 93% | ||
| At3g14735 | 3 | + | 4951596 | 4951697 | 102 | 1E-51 | 1-102, 100% | gi16516 | |
| At3g13855 | 3 | + | 4561111 | 4561212 | 102 | 2E-49 | 1-102, 99% | gi16517 | |
| At5g46315 | 5 | + | 18804616 | 18804717 | 102 | 2E-49 | 1-102, 99% | gi16518 | |
| At5g62995 | 5 | + | 25296825 | 25296926 | 102 | 1E-51 | 1-102, 100% | ||
| At4g27595 | 4 | + | 13782215 | 13782316 | 102 | 1E-51 | 1-102, 100% | ||
| At4g03375 | 4 | - | 1483121 | 1483020 | 102 | 1E-51 | 1-102, 100% | ||
| At4g33085 | 4 | - | 15965258 | 15965158 | 101 | 8E-37 | 1-101, 94% | ||
| At4g35225 | 4 | + | 16754836 | 16754931 | 96 | 1E-32 | 1-102, 93% | ||
| At2g15532 | 2 | + | 6784793 | 6784869 | 77 | 7E-25 | 4-80, 93% | ||
| At1g52605 | 1 | + | 19596398 | 19596476 | 96 | 2E-19 | 4-99, 87% | ||
| At1g53465 | 1 | - | 19960538 | 19960485 | 54 | 9E-09 | 21-74, 88% | ||
| At3g45705 | 3 | + | 16792802 | 16792888 | 87 | 2E-06 | 1-46, 89%; 62-100, 89% | ||
| At5g11085 | 5 | - | 3522167 | 3522143 | 25 | 9E-06 | 1-25, 100% | ||
| At1g61275 | 1 | + | 22606785 | 22606960 | 176 | 1E-95 | 1-176, 100% | †gi22293600 | |
| At5g40395 | 5 | - | 16183534 | 16183413 | 122 | 1E-63 | 1-122, 100% | † | |
| At1g21395 | 1 | - | 7491489 | 7491378 | 112 | 5E-20 | 1-65, 95%; 81-110, 93% | ||
| At4g16065 | 4 | + | 9096374 | 9096532 | 159 | N/A | N/A |
Chromosomal locations were determined by conducting BLAST searches against the Arabidopsis genome (Release 5.0). *The gene used for query in the BLAST search; †atU12 and atU6atac sequences, which were experimentally identified [28]. Their sequences were compiled manually from the cited paper. The GenBank gi numbers for the chromosome sequences used are as follows: chromosome 1, 42592260; chromosome 2, 30698031; chromosome 3, 30698537; chromosome 4, 30698542; chromosome 5, 30698605.
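The From, To, Length and Strand columns are redundant with one another, so the table can be sanity-checked mechanically. The following is a minimal sketch, assuming coordinates are inclusive and that a minus-strand hit is listed with From greater than To (both conventions are inferred from the rows, not stated in the source). The function names are hypothetical, and only a handful of rows are copied in for illustration.

```python
# A few rows copied from the table: (GeneID, strand, From, To, printed length).
ROWS = [
    ("At5g49054", "-", 19903323, 19903158, 166),
    ("At4g23415", "+", 12225621, 12225786, 166),
    ("At5g14545", "-", 4690105, 4690008, 98),
    ("At1g52605", "+", 19596398, 19596476, 96),
    ("At4g16065", "+", 9096374, 9096532, 159),
]


def span_length(start, end):
    """Inclusive length of a genomic interval, independent of strand."""
    return abs(end - start) + 1


def strand_from_coords(start, end):
    """Assumed convention: From <= To means plus strand, otherwise minus."""
    return "+" if start <= end else "-"


def check_rows(rows):
    """Return the GeneIDs whose printed length or strand disagrees with the coordinates."""
    mismatches = []
    for gene_id, strand, start, end, printed_length in rows:
        length_ok = span_length(start, end) == printed_length
        strand_ok = strand_from_coords(start, end) == strand
        if not (length_ok and strand_ok):
            mismatches.append(gene_id)
    return mismatches


if __name__ == "__main__":
    print(check_rows(ROWS))
```

On these sample rows the check flags only At1g52605, whose coordinates span 79 nucleotides while the Length column reads 96. Whether the coordinates or the length is the transcription error cannot be decided from the table alone.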
Figure 1. Sequence alignments of U4atac and U6atac snRNAs. The tentative Arabidopsis U4atac snRNA was aligned against the human U4atac snRNA (U62822) using CLUSTAL W [22]. Possible sequence domains are indicated by different background colors, with cyan indicating transcription signals (USE, upstream sequence element; TATA, TATA box), green indicating the region involved in the stem-loop-stem structure, and pink indicating the domain that binds Sm proteins. The corresponding interaction region in U6atac snRNA is also marked in green. Red background indicates G-T base-pairs in the stem-loop structure. Grey letters indicate the genome sequence upstream and downstream of the putative U4atac gene. Asterisks (upper panel) and black shading (lower panel) show conserved positions in the alignment.
Figure 2. Chromosomal locations of Arabidopsis snRNAs. Chromosomes 1 to 5 are represented to scale by the long thick lines in dark green. The small bars above the chromosomes indicate the presence of an snRNA gene in that region. Different colors represent different snRNA types: red, U1 snRNA; magenta, U2 snRNA; blue, U4 snRNA; green, U5 snRNA; yellow, U6 snRNA; black, minor snRNA. The seven U1-U4 snRNA gene clusters (red-blue) and the single U2-U5 snRNA gene cluster (magenta-green) are indicated by red circles.
Arabidopsis splicing-related proteins
| Human homologs | S.c. | GeneID | Chromosome | Tnb | AltS | Chromosomal duplication | Protein domain | Reference |
| SmB | SmB1 | At5g44500 | 5 | 7 | >4-5a | Sm, 1 | |||
| At4g20440 | 4 | 21 | IntronR (1); | >4-5a | Sm, 1 | ||||
| SmD1 | SmD1 | At3g07590 | 3 | 7 | IntronR (1); | Sm, 1 | |||
| At4g02840 | 4 | 13 | Sm, 1 | ||||||
| SmD2 | SmD2 | At2g47640 | 2 | 7 | AltA (1); AltD (1); | Sm, 1 | |||
| At3g62840 | 3 | 25 | AltA (1); | Sm, 1 | |||||
| SmD3 | SmD3 | At1g76300 | 1 | 9 | >1-1c | Sm, 1 | |||
| At1g20580 | 1 | 7 | >1-1c | Sm, 1 | |||||
| SmE | SmE | At4g30330 | 4 | 2 | >2-4b | Sm, 1 | |||
| At2g18740 | 2 | 10 | AltA (1); | >2-4b | Sm, 1 | ||||
| SmF | SmF | At4g30220 | 4 | 6 | Sm, 1 | ||||
| SmG | SmG | At2g23930 | 2 | 13 | Sm, 1 | ||||
| At3g11500 | 3 | 9 | Sm, 1 | ||||||
| LSM2 | LSm2 | At1g03330 | 1 | 7 | Sm, 1 | ||||
| LSM3 | LSm3 | At1g21190 | 1 | 6 | >1-1c | Sm, 1 | |||
| At1g76860 | 1 | 16 | >1-1c | Sm, 1 | |||||
| LSM4 | LSm4 | At5g27720 | 5 | 13 | Sm, 1 | ||||
| LSM5 | LSm5 | At5g48870 | 5 | 7 | AltA (1); | Sm, 1 | [47] | ||
| LSM6 | LSm6 | At3g59810 | 3 | 7 | >2-3 | Sm, 1 | |||
| At2g43810 | 2 | 5 | >2-3 | Sm, 1 | |||||
| LSM7 | LSm7 | At2g03870 | 2 | 6 | Sm, 1 | ||||
| LSM8 | LSm8 | At1g65700 | 1 | 9 | Sm, 1 | ||||
| LSM1 | LSm1 | At1g19120 | 1 | 8 | Sm, 1 | ||||
| At3g14080 | 3 | 9 | IntronR (1); | Sm, 1 | |||||
| U1A Subunit | Mud1 | At2g47580 | 2 | 14 | ExonS (1); | RRM, 2 | [49] | ||
| U1C Subunit | Yhc1 | At4g03120 | 4 | 5 | C2H2, 1; mrCtermi, 3 | ||||
| U1-70K | Snp1 | At3g50670 | 3 | 32 | IntronR (1); | RRM, 1 | [48] | ||
| - | Prp39 | At1g04080 | 1 | 12 | ExonS (6); | HAT, 7; TPR-like, 1 | |||
| At5g46400 | 5 | 1 | HAT, 4; | ||||||
| FBP11 | Prp40 | At1g44910 | 1 | 10 | IntronR (1); | WW, 2; FF, 5 | |||
| FBP11 | Prp40 | At3g19670 | 3 | 5 | WW, 2; FF, 5 | ||||
| Luc7-like protein | Luc7 | At3g03340 | 3 | 6 | DUF259, 1 | ||||
| At5g17440 | 5 | 8 | DUF259, 1 | ||||||
| Related to Luc7-like protein | Luc7 | At5g51410 | 5 | 7 | IntronR (1); | DUF259, 1 | |||
| U2A' Subunit | Lea1p | At1g09760 | 1 | 21 | LRR 4; | ||||
| U2B" Subunit | Msl1p | At1g06960 | 1 | 6 | AltD (1); | >1-2a | RRM, 2 | ||
| At2g30260 | 2 | 13 | AltA (1); IntronR (1); | >1-2a | RRM, 2; | ||||
| SF3a120/SAP114 Subunit | Prp21p | At1g14650 | 1 | 17 | AltB (1); | SWAP/Surp, 2; Ubiquitin, 1 | |||
| At1g14640 | 1 | SWAP/Surp, 2 | |||||||
| At5g06520 | 5 | SWAP/Surp, 4 | |||||||
| At4g16200 | 4 | 1 | SWAP/Surp, 3 | ||||||
| At4g15580 | 4 | SWAP/Surp, 3; Ubiquitin, 1 | |||||||
| SF3a60/SAP61 Subunit | Prp9p | At5g06160 | 5 | 10 | AltD (1); | C2H2, 1 | |||
| SF3a66/SAP62 Subunit | Prp11p | At2g32600 | 2 | 13 | C2H2, 1; | ||||
| SF3b120/SAP130 Subunit | Rse1p | At3g55200 | 3 | 6 | CPSF_A, 1; WD40-like, 1 | [50] | |||
| At3g55220 | 3 | 7 | CPSF_A, 1; WD40-like, 1 | [50] | |||||
| SF3b150/SAP145 Subunit | Cus1p | At4g21660 | 4 | 16 | PSP, 1; DUF382, 1 | ||||
| At1g11520 | 1 | ||||||||
| SF3b160/SAP155 Subunit | Hsh155 | At5g64270 | 5 | 11 | HEAT, 1; ARM, 2; SAP_155, 1 | ||||
| SF3b53/SAP49 Subunit | Hsh49p | At2g18510 | 2 | 20 | RRM, 2 | ||||
| At2g14550 | 2 | RRM, 2 | |||||||
| p14 | Snu17p | At5g12190 | 5 | 7 | RRM, 1; | ||||
| At2g14870 | 2 | RRM, 1; | |||||||
| SF3b 14b /PHP5A | Rds3p | At1g07170 | 1 | 10 | >1-2a | UPF0123, 1; | |||
| At2g30000 | 2 | 8 | >1-2a | UPF0123, 1; | |||||
| SF3b 10 | At4g14342 | 4 | 11 | SF3b10, 1; | |||||
| At3g23325 | 3 | 6 | SF3b10, 1; | ||||||
| 15 kD Subunit | Dib1p | At5g08290 | 5 | 28 | DIM1, 1; Thioredoxin_2; 1 | ||||
| 40 kD Subunit | At2g43770 | 2 | 21 | WD-40, 7; | |||||
| 100 kD Subunit | Prp28p | At2g33730 | 2 | 13 | DEAD, 1; Helicase_C, 1 | ||||
| 102 KD/Prp6-like | Prp6p | At4g03430 | 4 | 18 | Ubiquitin, 1; TPR, 3; HAT, 15; TPR-like, 2; Prp1_N, 1 | ||||
| 116 kD Subunit /elongation | Snu114p | At1g06220 | 1 | 19 | ExonS (1); | EFG_C, 1; GTP_EFTU, 1; GTP_EFTU_D2; 1; Small_GTP, 1; EFG_IV, 1; | |||
| At5g25230 | 5 | EFG_C, 1; GTP_EFTU, 1; GTP_EFTU_D2; 1; EFG_IV, 1; | |||||||
| At1g56070 | 1 | 214 | EFG_C, 1; GTP_EFTU, 1; GTP_EFTU_D2; 1; EFG_IV, 1; | ||||||
| At3g22980 | 3 | 3 | EFG_C, 1; GTP_EFTU, 1; Small_GTP, 1; | ||||||
| 200 kD Subunit/Helicase | Brr2p | At5g61140 | 5 | 11 | IntronR (1); | DEAD, 2; Helicase_C, 2; Sec63, 2; ARM, 1 | |||
| At1g20960 | 1 | 23 | DEAD, 2; Helicase_C, 2; Sec63, 2 | ||||||
| At2g42270 | 2 | 5 | DEAD, 2; Helicase_C, 2; Sec63, 2 | ||||||
| At3g27730 | 3 | DEAD, 1; Sec63, 1; RuvA domain 2-like, 1 | |||||||
| 220 kD Subunit | Prp8p | At1g80070 | 1 | 33 | Mov34, 1 | ||||
| At4g38780 | 4 | 2 | Mov34, 1 | ||||||
| U4/U6-90K / SAP90 | Prp3p | At1g28060 | 1 | 10 | |||||
| At3g55930 | 3 | ||||||||
| At3g56790 | 3 | ||||||||
| U4/U6-60K / SAP60 | Prp4p | At2g41500 | 2 | 8 | WD-40, 7; SFM, 1; WD40-like, 1 | ||||
| U4/U6-20K / CYP20 | At2g38730 | 2 | 11 | Pro_isomerase, 1 | |||||
| U4/U6-61KD | Prp31 | At1g60170 | 1 | 26 | Nop, 1 | ||||
| At3g60610 | 3 | Nop, 1 | |||||||
| U4/U6-15.5K | Snu13p | At5g20160 | 5 | 18 | IntronR (2); | Ribosomal_L7Ae, 1 | |||
| At4g12600 | 4 | 14 | Ribosomal_L7Ae, 1 | ||||||
| At4g22380 | 4 | 9 | Ribosomal_L7Ae, 1 | ||||||
| Tri-65 KD | Snu66p | At4g22350 | 4 | 7 | UCH; 1; ZnF_UBP, 1 | ||||
| At4g22290 | 4 | 20 | UCH; 1; ZnF_UBP, 1; Pentaxin, 1 | ||||||
| At4g22410 | 4 | UCH; 1; ZnF_UBP, 1 | |||||||
| Tri-110 KD | SAD1 | At5g16780 | 5 | 7 | SART-1, 1 | ||||
| Tri-27 kD/RY1 | At5g57370 | 5 | 14 | ||||||
| hSnu23/FLJ31121 | Snu23p | At3g05760 | 3 | 7 | ZnF_U1, 1; | ||||
| U11/U12-35K | At2g43370 | 2 | 7 | IntronR (1); | RRM, 1 | ||||
| U11/U12-25K (-99 protein) | At3g07860 | 3 | 6 | IntronR (2); | C2H2, 1; | ||||
| U11/U12-65K | At1g09230 | 1 | 15 | AltA (1); | RRM, 2;PHOSPHOPANTETHEINE, 2; | ||||
| U11/U12-31K (MADP1) | At3g10400 | 3 | 5 | RRM, 1;CCHC, 1; | |||||
| U2AF35 | At1g27650 | 1 | 26 | RRM, 1; CCCH, 2; | |||||
| At5g42820 | 5 | 8 | RRM, 1; CCCH, 2; | [58] | |||||
| U2AF65 | Mud2 | At1g60900 | 1 | 10 | RRM, 3; | [58] | |||
| At4g36690 | 4 | 29 | AltA (1); IntronR (2); | RRM, 2; | [58] | ||||
| At2g33440 | 2 | 2 | RRM, 1 | ||||||
| At1g60830 | 1 | ||||||||
| U2AF35 related protein | At1g10320 | 1 | RRM, 1; CCCH, 2; | ||||||
| SF1/BBP | At5g51300 | 5 | 23 | IntronR (1); | RRM, 1; CCHC, 2; KH, 1; | ||||
| CBP20 | Cbc1 | At5g44200 | 5 | 8 | RRM, 1 | [56] | |||
| CBP80 | Cbc2p | At2g13540 | 2 | 21 | MIF4G, 1; ARM, 3 | [56] | |||
| PTB/hnRNP I | At1g43190 | 1 | 26 | RRM, 4; | |||||
| At3g01150 | 3 | 21 | AltD (1); ExonS (1); | RRM, 2 | |||||
| At5g53180 | 5 | 17 | ExonS (1); | RRM, 2 | |||||
| SC35 | At5g64200 | 5 | 32 | AltD (1); | RRM, 1; | [61] | |||
| SRrp40/TASR-2 | At1g55310 | 1 | 12 | IntronR (1); | >1-3b | RRM, 1 | [63] | ||
| At3g13570 | 3 | 32 | ExonS (2); IntronR (4); | >1-3b | RRM, 1 | [61] | |||
| At3g55460 | 3 | 14 | ExonS (1); | RRM, 1 | [61] | ||||
| At5g18810 | 5 | 5 | RRM, 1 | [61] | |||||
| SF2/ASF | At1g02840 | 1 | 37 | AltA (1); IntronR (1); | >1-4 | RRM, 2 | [64,67] | ||
| At4g02430 | 4 | 13 | AltA (1); ExonS (1); IntronR (4); | >1-4 | RRM, 2 | ||||
| At3g49430 | 3 | 3 | ExonS (1); IntronR (1); | RRM, 2 | |||||
| At1g09140 | 1 | 15 | AltA (1); | RRM, 2 | [65] | ||||
| 9G8 | At4g31580 | 4 | 26 | >2-4e | RRM, 1; CCHC, 1 | [63,66] | |||
| At2g24590 | 2 | 7 | >2-4e | RRM, 1; CCHC, 1 | [63,66] | ||||
| At1g23860 | 1 | 18 | RRM, 1; CCHC, 1 | [63,66] | |||||
| At2g37340 | 2 | 30 | IntronR (1); | >2-3 | RRM, 1; CCHC, 2 | [61] | |||
| At3g53500 | 3 | 36 | AltA (1); IntronR (3); | >2-3 | RRM, 1; CCHC, 2 | [61] | |||
| - | At2g46610 | 2 | 23 | AltD (1); IntronR (1); | >2-3 | RRM, 2 | |||
| At3g61860 | 3 | 17 | AltA (1); | >2-3 | RRM, 2 | [59] | |||
| At5g52040 | 5 | 34 | AltA (1); | >4-5b | RRM, 2 | [59] | |||
| At4g25500 | 4 | 15 | ExonS (1); IntronR (1); | >4-5b | RRM, 2 | [59] | |||
| hPrp43 | Prp43p | At5g14900 | 5 | HA2, 1 | |||||
| At3g62310 | 3 | 17 | AltA (1); | >2-3 | DEAD, 1; Helicase_C, 1; HA2, 1 | ||||
| At2g47250 | 2 | 14 | >2-3 | DEAD, 1; Helicase_C, 1; HA2, 1 | |||||
| SR140 | At5g25060 | 5 | 11 | Surp, 1;RRM, 1;, 1;RPR, 1; | |||||
| At5g10800 | 5 | 2 | Surp, 1;RRM, 1;RPR, 1; | ||||||
| SPF45 | At1g30480 | 1 | 9 | D111/G-patch domain, 1; RRM, 1; | |||||
| SPF30 | At2g02570 | 2 | 9 | AltA (1); | Tudor, 1; | ||||
| hPrp19* | Prp19p | At1g04510 | 1 | 18 | >1-2a | WD-40, 7; Ubox, 1; | |||
| At2g33340 | 2 | 27 | IntronR (1); | >1-2a | WD-40, 7; Ubox, 1; | ||||
| CDC5* | Cef1 | At1g09770 | 1 | 12 | SANT, 2; | [104] | |||
| PRL1* | Prp46p | At4g15900 | 4 | 14 | WD-40, 2;WD40like, 1; | ||||
| At3g16650 | 3 | 6 | WD-40, 2;WD40like, 1; | ||||||
| AD-002* | Cwc15p | At3g13200 | 3 | 22 | Cwf_Cwc_15, 1; | ||||
| HSP73/HSPA8* | At3g12580 | 3 | 35 | Hsp70, 1; | |||||
| At5g42020 | 5 | 51 | IntronR (1); | Hsp70, 1; | |||||
| At5g02500 | 5 | 553 | IntronR (1); | Hsp70, 1; | |||||
| SPF27/BCAS2* | At3g18165 | 3 | 15 | BCAS2, 1; | |||||
| beta catenin-like 1* | At3g02710 | 3 | 12 | Armadillo, 1;ARM, 1; | |||||
| hSyf1 | Syf1p | At5g28740 | 5 | 7 | TPR, 1;HAT, 10;TPRlike, 3; | ||||
| hSyf3/CRN | Syf3 | At5g45990 | 5 | TPR, 1; HAT, 14; TPR-like, 2 | |||||
| At3g13210 | 3 | TPR, 1; HAT, 12; TPR-like, 2 | |||||||
| At5g41770 | 5 | 13 | TPR, 1; HAT, 14; TPR-like, 2 | ||||||
| At3g51110 | 3 | 8 | TPR, 1; HAT, 9; TPR-like, 1 | ||||||
| hIsy1 | Isy1p | At3g18790 | 3 | 10 | Isy1, 1; | ||||
| GCIP p29 | Syf2 | At2g16860 | 2 | 12 | |||||
| SKIP | Prp45p | At1g77180 | 1 | 28 | SKIP/SNW, 1; | ||||
| hECM2 | Ecm2p | At1g07360 | 1 | 21 | >1-2a | RRM, 1;CCCH, 1; | |||
| At2g29580 | 2 | 10 | >1-2a | RRM, 1;CCCH, 1; | |||||
| At5g07060 | 5 | CCCH, 1; | |||||||
| KIAA0560 | At2g38770 | 2 | 11 | ||||||
| MGC23918 | At3g05070 | 3 | 7 | ||||||
| G10 | Cwc14p | At4g21110 | 4 | 12 | G10, 1; | ||||
| Cyp E | At2g21130 | 2 | 4 | >2-4c | Pro_isomerase, 1 | ||||
| At4g38740 | 4 | 59 | >2-4c | Pro_isomerase, 1; | |||||
| At2g16600 | 2 | 39 | >2-4a | Pro_isomerase, 1 | |||||
| At4g34870 | 4 | 80 | >2-4a | Pro_isomerase, 1; | |||||
| PPIase-like 1 | At2g36130 | 2 | 10 | Pro_isomerase, 1; | |||||
| NPW38 | At2g41020 | 2 | 16 | AltD (1); IntronR (1); | WW, 2; | ||||
| N-CoR1 | At3g52250 | 3 | 3 | SANT, 2;Homeodomain_like, 2; | |||||
| hPrp4 kinase | At3g25840 | 3 | 13 | ExonS (1); | Pkinase, 1;TyrKc, 1;S_Tkc, 1;, 1;Kinase_like, 1; | ||||
| At1g13350 | 1 | 5 | IntronR (1); | Pkinase, 1;TyrKc, 1;S_Tkc, 1;, 1;Kinase_like, 1; | |||||
| At3g53640 | 3 | Pkinase, 1;TyrKc, 1;S_Tkc, 1;, 1;Kinase_like, 1; | |||||||
| FBP-21 | At1g49590 | 1 | 12 | ExonS (1); IntronR (3); | C2H2, 1; | ||||
| TBL1-rp 1 | At5g67320 | 5 | 14 | WD-40, 5;Peptidase_S9A_N, 1;LisH, 1;WD40like, 1; | |||||
| Smc-1 | At3g54670 | 3 | 12 | ATP_GTP_A_BS, 1;SMC_N, 1;SMC_C, 1;ABC_transporter, 1;SMC_hinge, 1; | |||||
| ALY | Yra1p | At5g02530 | 5 | 19 | IntronR (1); | RRM, 1; | |||
| At5g59950 | 5 | 16 | IntronR (1); | RRM, 1; | |||||
| At5g37720 | 5 | 17 | >1-5b | RRM, 1; | |||||
| At1g66260 | 1 | 38 | ExonS (1); | >1-5b | RRM, 1; | ||||
| Y14 | At1g51510 | 1 | 10 | IntronR (1); | RRM, 1;RBM8, 4; | ||||
| Srm160-like | At2g29210 | 2 | 18 | AltA (1); | PWI, 1 | ||||
| Magoh | At1g02140 | 1 | 19 | Mago_nashi, 1; | |||||
| Nuk-34/eIF4A3/DDX48 | At3g19760 | 3 | 50 | >1-3a | DEAD, 1;Helicase_C, 1; | ||||
| At1g51380 | 1 | 5 | >1-3a | DEAD, 1;Helicase_C, 1; | |||||
| RNPS1 | At1g16610 | 1 | 27 | AltA (1); | RRM, 1 | [63] | |||
| UAP56 | At5g11200 | 5 | 21 | AltA (1); | DEAD, 1; Helicase_C, 1 | ||||
| At5g11170 | 5 | 25 | DEAD, 1; Helicase_C, 1 | ||||||
| pinin | At1g15200 | 1 | 9 | AltA (1); | Pinin/SDK/memA, 1; | ||||
| Prp22 | Prp22 | At3g26560 | 3 | 11 | DEAD, 1; Helicase_C, 1; S1, 1; HA2, 1; | ||||
| At1g26370 | 1 | 5 | DEAD, 1; Helicase_C, 1; HA2, 1 | ||||||
| At1g27900 | 1 | 15 | DEAD, 1; Helicase_C, 1; HA2, 1 | ||||||
| Prp17 | Prp17p | At1g10580 | 1 | 10 | WD-40, 7; | ||||
| At5g54520 | 5 | 5 | AltA (1); | WD-40, 6; | |||||
| Prp18 | Prp18 | At1g03140 | 1 | 16 | Prp18, 1; SFM 1; | ||||
| At1g54590 | 1 | Prp18, 1 | |||||||
| Slu7 | Slu7p | At1g65660 | 1 | 6 | |||||
| At4g37120 | 4 | 11 | |||||||
| At3g45950 | 3 | ||||||||
| Prp16 | Prp16p | At5g13010 | 5 | 22 | DEAD, 1; Helicase_C, 1; HA2, 1 | ||||
| SRm300 | At3g23900 | 3 | 5 | AltD (1); | RRM, 1; Filamin/ABP280 repeat, 1 | ||||
| hTra-2/SFRS10 | At1g07350 | 1 | 25 | ExonS (1); IntronR (3); | RRM, 1 | ||||
| Prp2 | At1g32490 | 1 | 9 | >1-2c | DEAD, 1; Helicase_C, 1; HA2, 1 | ||||
| At2g35340 | 2 | >1-2c | DEAD, 1; Helicase_C, 1; HA2, 1 | ||||||
| At4g16680 | 4 | DEAD, 1; Helicase_C, 1; HA2, 1 | |||||||
| Prp5 | At3g09620 | 3 | DEAD, 1; Helicase_C, 1 | ||||||
| At1g20920 | 1 | 11 | DEAD, 1; Helicase_C, 1 | ||||||
| At2g47330 | 2 | 9 | DEAD, 1; Helicase_C, 1 | ||||||
| hDbr1 | dbr1 | At4g31770 | 4 | 12 | Metallophos, 1; DBR1, 1 | ||||
| Lammer/CLK kinase | At3g53570 | 3 | 11 | AltA (1); IntronR (3); | PKinase, 1; TyrKc, 1; S_Tkc, 1; PKinase-like, 1 | [74] | |||
| At4g24740 | 4 | 9 | ExonS (1); | PKinase, 1; TyrKc, 1; S_Tkc, 1; PKinase-like, 1 | [74] | ||||
| At4g32660 | 4 | 9 | AltD (1); IntronR (1); | PKinase, 1; TyrKc, 1; S_Tkc, 1; PKinase-like, 1 | [74] | ||||
| SRPK1 | At2g17530 | 2 | 7 | >2-4a | PKinase, 1; TyrKc, 1; S_Tkc, 1; PKinase-like, 1 | ||||
| At4g35500 | 4 | 10 | >2-4a | PKinase, 1; TyrKc, 1; S_Tkc, 1; PKinase-like, 1 | |||||
| SRPK2 | At5g22840 | 5 | 2 | PKinase, 1; TyrKc, 1; S_Tkc, 1; PKinase-like, 1 | |||||
| At3g53030 | 3 | 7 | PKinase, 1; TyrKc, 1; S_Tkc, 1; PKinase-like, 1 | ||||||
| At3g44850 | 3 | 1 | PKinase, 1; TyrKc, 1; PKinase-like, 1 | ||||||
| HnRNP A/B | At1g18630 | 1 | 5 | >1-1c | RRM, 1 | ||||
| At1g74230 | 1 | 14 | >1-1c | RRM, 1; Eggshell, 4 | |||||
| At4g13850 | 4 | 17 | >3-4 | RRM, 1 | |||||
| At3g23830 | 3 | 12 | AltA (1); | >3-4 | RRM, 1 | ||||
| At5g61030 | 5 | 8 | RRM, 1; PfkB_Kinase, 1 | ||||||
| At2g16260 | 2 | RRM, 1 | |||||||
| At2g21660 | 2 | 182 | AltD (1); IntronR (3); | >2-4c | RRM, 1 | [77] | |||
| At4g39260 | 4 | 67 | AltB (1); AltD (1); IntronR (5); | >2-4c | RRM, 1 | [77] | |||
| hnRNP A/B | At4g14300 | 4 | 4 | RRM, 2 | [11] | ||||
| At2g33410 | 2 | 13 | RRM, 2 | [11] | |||||
| At5g55550 | 5 | 13 | IntronR (3); | >4-5c | RRM, 2 | [11] | |||
| At4g26650 | 4 | 21 | >4-5c | RRM, 2 | [11] | ||||
| At5g47620 | 5 | 12 | AltD (2); | RRM, 2 | [11] | ||||
| At3g07810 | 3 | 18 | AltA (1); | RRM, 2; FKBP_PPIASE_2, 2 | [11] | ||||
| At1g58470 | 1 | 6 | RRM, 2 | ||||||
| At5g40490 | 5 | 3 | RRM, 2; Eggshell, 4 | ||||||
| At1g17640 | 1 | RRM, 2 | |||||||
| At3g13224 | 3 | 16 | IntronR (1); | RRM, 2;HUDSXLRNA, 2; | |||||
| At3g56860 | 3 | 23 | IntronR (1); | >2-3 | RRM, 2 | [78] | |||
| At2g41060 | 2 | 9 | >2-3 | RRM, 2 | [78] | ||||
| At3g15010 | 3 | 10 | IntronR (1); | RRM, 2 | [78] | ||||
| hnRNP E1/E2 | At3g04610 | 3 | 10 | KH, 3; | |||||
| hnRNP F/ hnRNP H | At5g66010 | 5 | 9 | AltA (1); | RRM, 2 | [11] | |||
| At3g20890 | 3 | RRM, 2 | [11] | ||||||
| hnRNP G | At5g04280 | 5 | 6 | RRM, 1; CCHC, 1 | |||||
| At3g26420 | 3 | 35 | AltA (1); | RRM, 1; CCHC, 1 | |||||
| At1g60650 | 1 | 7 | RRM, 1; CCHC, 1 | ||||||
| hnRNP P2 | At1g50300 | 1 | 9 | RRM, 1; ZnF_RBZ, 2 | |||||
| hnRNP R/Q | At4g00830 | 4 | 19 | RRM, 3; | |||||
| At3g52660 | 3 | 1 | >2-3 | RRM, 3; | |||||
| At2g44710 | 2 | 13 | >2-3 | RRM, 3 | |||||
| CUG-BP | At4g03110 | 4 | 4 | AltA (1); IntronR (1); | RRM, 3; HUDSXLRNA, 4 | [11] | |||
| At1g03457 | 1 | 9 | AltA (1); | RRM, 3; HUDSXLRNA, 4 | [11] | ||||
| (CUG-BP) | At4g16280 | 4 | 13 | AltB (1); IntronR (1); | RRM, 2; WW, 1 | [81] | |||
| At2g47310 | 2 | 6 | RRM, 2; WW, 1 | ||||||
| At1g54080 | 1 | 48 | AltA (1); | >1-3b | RRM, 3 | [84] | |||
| At3g14100 | 3 | 13 | >1-3b | RRM, 3 | [84] | ||||
| At1g17370 | 1 | 17 | RRM, 3 | [84] | |||||
| At2g22090 | 2 | 15 | >2-4c | RRM, 1 | [78] | ||||
| At2g22100 | 2 | 2 | >2-4c | RRM, 1 | [78] | ||||
| At2g19380 | 2 | 1 | RRM, 1; C2H2, 3 | [78] | |||||
| At5g54900 | 5 | 42 | >4-5c | RRM, 3 | [85] | ||||
| At4g27000 | 4 | 52 | >4-5c | RRM, 3 | [85] | ||||
| At1g11650 | 1 | 53 | RRM, 3 | [85] | |||||
| At5g19350 | 5 | 10 | RRM, 3 | ||||||
| At1g49600 | 1 | 10 | >1-3a | RRM, 3 | [85] | ||||
| At3g19130 | 3 | 21 | >1-3a | RRM, 3 | [85] | ||||
| At1g47490 | 1 | 23 | IntronR (1); | RRM, 3 | [85] | ||||
| At1g47500 | 1 | 12 | RRM, 3 | [85] | |||||
| At4g16830 | 4 | 34 | HANP4_PAI-RBP1, 1 | [105] | |||||
| At4g17520 | 4 | 29 | >4-5a | HANP4_PAI-RBP1, 1 | [105] | ||||
| At5g47210 | 5 | 67 | IntronR (1); | >4-5a | HANP4_PAI-RBP1, 1 | ||||
Gene names were kept consistent with names used in previous publications or derived from the names of the respective homologs (yeast names are given in the S.c. column, where available). The Tnb column gives the number of cognate cDNAs and ESTs supporting the gene structure. The AltS column indicates evidence for alternative splicing, including alternative donor site (AltD), alternative acceptor site (AltA), alternative position (AltP, both acceptor and donor sites are different), exon skipping (ExonS), and intron retention (IntronR). Chromosomal duplication indicates a known chromosome duplication region. Functional groups of proteins are separated by long lines spanning all columns. Different members in the group are separated by short lines starting at the Arabidopsis gene name. Genes duplicated in Arabidopsis are clustered together with no line between them. A dashed line separates the Prp19 complex from other 35S U5-associated proteins, and * indicates proteins in that complex.

Abbreviations for domains are as follows: ABC_transporter: ABC transporter; Armadillo: Armadillo; ARM: ARM repeat fold; ATP_GTP_A_BS: ATP/GTP-binding site motif A (P-loop); BCAS2: Breast carcinoma amplified sequence 2; C2H2: Zn-finger, C2H2 matrin type; C2H2: Zn-finger, C2H2 type; CCCH: Zn-finger, C-x8-C-x5-C-x3-H type; CCHC: Zn-finger, CCHC type; CPSF_A: CPSF A subunit, C-terminal; Cwf_Cwc_15: Cwf15/Cwc15 cell cycle control protein; DBR1: Lariat debranching enzyme, C-terminal; DEAD: ATP-dependent helicase, DEAD-box; DEAD: DEAD/DEAH box helicase; DIM1: Pre-mRNA splicing protein; DUF259: Protein of unknown function DUF259; DUF382: Protein of unknown function DUF382; EFG_C: Elongation factor G, C-terminal; EFG_IV: Elongation factor G, domain IV; Eggshell: Eggshell protein; FF: FF domain; FKBP_PPIASE_2: Peptidylprolyl isomerase, FKBP-type; G10: G10 protein; GTP_EFTU_D2: Elongation factor Tu, domain 2; GTP_EFTU: Protein synthesis factor, GTP-binding; HA2: Helicase-associated region; HANP4_PAI-RBP1: Hyaluronan/mRNA binding protein; HAT: RNA-processing protein, HAT helix; Helicase_C: Helicase, C-terminal; Homeodomain_like: Homeodomain-like; Hsp70: Heat shock protein Hsp70; HUDSXLRNA: Paraneoplastic encephalomyelitis antigen; Isy1: Isy1-like splicing; Kinase_like: Protein kinase-like; LisH: Lissencephaly type-1-like homology motif; LRR: Leucine-rich repeat; Mago_nashi: Mago nashi protein; Metallophos: Metallophosphoesterase; MIF4G: Initiation factor eIF-4 gamma, middle; Mov34: Mov34/MPN/PAD-1; mrCtermi: Molluscan rhodopsin C-terminal tail; Nop: Pre-mRNA processing ribonucleoprotein, binding region; Peptidase_S9A_N: Peptidase S9A, prolyl oligopeptidase, N-terminal beta-propeller domain; PfkB_Kinase: Carbohydrate kinase, PfkB; PHOSPHOPANTETHEINE: Phosphopantetheine attachment site; Pinin/SDK/memA: Pinin/SDK/memA protein; Pkinase: Protein kinase; Pro_isomerase: Peptidyl-prolyl cis-trans isomerase, cyclophilin type; Prp18: Prp18 domain; Prp1_N: PRP1 splicing factor, N-terminal; PSP: PSP, proline-rich; PWI: Splicing factor PWI; RBM8: RNA binding motif protein 8; Ribosomal_L7Ae: Ribosomal protein L7Ae/L30e/S12e/Gadd45; RPR: Regulation of nuclear pre-mRNA protein; RRM: RNA-binding region RNP-1 (RNA recognition motif); S1: RNA binding S1; SANT: Myb DNA-binding domain; SAP_155: Splicing factor 3B subunit_1; SART-1: SART-1 protein; Sec63: Sec63 domain; SF3b10: Splicing factor 3B subunit 10; SFM: Splicing factor motif; SKIP/SNW: SKIP/SNW domain; Small_GTP: Small GTP-binding protein domain; SMC_C: Structural maintenance of chromosome protein SMC, C-terminal; SMC_hinge: SMCs flexible hinge; SMC_N: SMC protein, N-terminal; Sm_like_riboprot: Small nuclear-like ribonucleoprotein; Sm: Small nuclear ribonucleoprotein (Sm protein); S_Tkc: Serine/threonine protein kinase; Surp: SWAP/Surp; Thioredoxin_2: Thioredoxin domain 2; TPRlike: TPR-like; TPR: TPR repeat; Tudor: Tudor domain; TyrKc: Tyrosine protein kinase; Ubox: Zn-finger, modified RING; UCH: Peptidase C19, ubiquitin carboxyl-terminal hydrolase family 2; UPF0123: Protein of unknown function UPF0123; WD-40: G-protein beta WD-40 repeat; WD40like: WD40-like; WW: WW/Rsp5/WWP domain; ZnF_RBZ: Zn-finger, Ran-binding; ZnF_U1: Zn-finger, U1-like; ZnF_UBP: Zn-finger in ubiquitin thiolesterase.
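The AltS cells follow a regular pattern, an event code followed by a number in parentheses and a semicolon, so they can be parsed into a structured form. The sketch below is illustrative only: the function names are hypothetical, and the meaning of the parenthesized number (presumably a count of observed events) is an assumption, since the legend does not define it. Some cells in the table also carry the code AltB, which the legend does not define; the parser passes such codes through rather than guessing their meaning.

```python
import re

# Event codes as defined in the table legend.
ALTS_LABELS = {
    "AltA": "alternative acceptor site",
    "AltD": "alternative donor site",
    "AltP": "alternative position",
    "ExonS": "exon skipping",
    "IntronR": "intron retention",
}

# One entry looks like "IntronR (4)"; entries are separated by semicolons.
_ENTRY = re.compile(r"([A-Za-z]+)\s*\((\d+)\)")


def parse_alts(cell):
    """Parse an AltS cell such as 'AltA (1); IntronR (4);' into {code: number}."""
    events = {}
    for code, number in _ENTRY.findall(cell):
        events[code] = events.get(code, 0) + int(number)
    return events


def describe(code):
    """Spell out an event code; codes absent from the legend are reported as such."""
    return ALTS_LABELS.get(code, "not defined in the legend")


if __name__ == "__main__":
    parsed = parse_alts("AltA (1); ExonS (1); IntronR (4);")
    for code, number in parsed.items():
        print(code, number, describe(code))
```

An empty cell parses to an empty dictionary, which keeps genes without alternative-splicing evidence distinguishable from genes with it.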
Duplication source involving Arabidopsis splicing-related proteins
| | Genes | Family* | Single/multiple | Duplication ratio | Duplication events | Chromosomal duplications | Chromosomal duplication ratio |
| snRNP proteins | 91 | 54 | 27/27 | 50.0% | 37 | 7 | 18.9% |
| Splicing factors | 109 | 58 | 33/25 | 43.1% | 51 | 14 | 27.5% |
| Splicing regulator | 60 | 18 | 4/14 | 77.8% | 42 | 11 | 26.2% |
| Total | 260 | 130 | 64/66 | 50.8% | 130 | 32 | 24.6% |
*Family counts both single-copy genes and multiple-copy gene families. The Chromosomal duplication ratio column gives the fraction of all duplication events caused by chromosomal duplications.
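The derived columns of this table can be reproduced from the raw counts. The sketch below rests on two relations inferred from the numbers rather than stated in the source: the duplication ratio is the share of families with more than one member, and the number of duplication events equals genes minus families (a family of n genes implies n − 1 duplications). The variable and function names are hypothetical.

```python
# Raw counts copied from the table; "single" and "multiple" split the Family column.
COUNTS = {
    "snRNP proteins": dict(genes=91, families=54, single=27, multiple=27, chrom_dups=7),
    "Splicing factors": dict(genes=109, families=58, single=33, multiple=25, chrom_dups=14),
    "Splicing regulator": dict(genes=60, families=18, single=4, multiple=14, chrom_dups=11),
    "Total": dict(genes=260, families=130, single=64, multiple=66, chrom_dups=32),
}


def derive(genes, families, single, multiple, chrom_dups):
    """Recompute the derived columns from the raw counts (percentages, one decimal)."""
    assert single + multiple == families
    events = genes - families  # assumed: n genes in a family imply n - 1 duplications
    return {
        "duplication_ratio": round(100 * multiple / families, 1),
        "duplication_events": events,
        "chromosomal_ratio": round(100 * chrom_dups / events, 1),
    }


if __name__ == "__main__":
    for category, counts in COUNTS.items():
        print(category, derive(**counts))
```

Under these assumptions every derived value in the table is reproduced, which suggests the inferred relations match how the columns were computed.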
Alternative splicing in splicing-related genes
| | Genes | AltA | AltD | AltP | ExonS | IntronR | Overall | Ratio |
| snRNP proteins | 91 | 6 | 3 | 1 | 3 | 11 | 22 | 23.2% |
| Splicing factors | 109 | 14 | 5 | 0 | 11 | 21 | 38 | 34.9% |
| Splicing regulator | 60 | 8 | 4 | 2 | 1 | 12 | 20 | 33.3% |
| Total | 260 | 28 | 12 | 3 | 15 | 44 | 80 | 30.8% |
The column entries are the numbers of genes in which the respective alternative splicing events can occur. AltA, alternative acceptor site; AltD, alternative donor site; AltP, alternative intron position (both acceptor and donor sites are different); ExonS, exon skipping; IntronR, intron retention. The Overall and Ratio columns give the number and fraction of genes with any type of alternative splicing, respectively.
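As the legend describes, Ratio is Overall divided by Genes, and the Total row is the column-wise sum of the three categories. A minimal sketch of both checks follows, with hypothetical names and the counts copied from the table. Overall is not the sum of the event columns, because one gene can show several event types.

```python
# Columns: Genes, AltA, AltD, AltP, ExonS, IntronR, Overall (copied from the table).
TABLE = {
    "snRNP proteins": (91, 6, 3, 1, 3, 11, 22),
    "Splicing factors": (109, 14, 5, 0, 11, 21, 38),
    "Splicing regulator": (60, 8, 4, 2, 1, 12, 20),
}
PRINTED_TOTAL = (260, 28, 12, 3, 15, 44, 80)


def column_sums(table):
    """Sum each column over the category rows."""
    return tuple(sum(column) for column in zip(*table.values()))


def ratio(genes, overall):
    """Percentage of genes with any alternative splicing, to one decimal."""
    return round(100 * overall / genes, 1)


if __name__ == "__main__":
    print(column_sums(TABLE) == PRINTED_TOTAL)
    for category, row in TABLE.items():
        print(category, ratio(row[0], row[-1]))
    print("Total", ratio(PRINTED_TOTAL[0], PRINTED_TOTAL[-1]))
```

The column sums match the printed Total row, and the recomputed ratios agree with the table for splicing factors (34.9%), splicing regulators (33.3%) and the total (30.8%). For snRNP proteins, 22 of 91 genes gives 24.2%, whereas the table prints 23.2%.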
Figure 3. Phylogenetic tree of the SC35 protein family. The phylogenetic tree was constructed on the basis of protein sequence alignments of the SC35 homologs in human, Drosophila, Arabidopsis and rice. The GenBank accession numbers for the sequences are as follows: hsSC35, Q01130; hsSRrp40, AAL57514; hsSRrp35, AAL57515; dmSC35, AAF53192; atSC35, NP_851261; atSR33/SCL33, NP_564685; atSCL30a, NP_187966; atSCL30, NP_567021; atSCL28, NP_197382; osSC35a, BAC79909; osSC35b, BAD09319; osSR33-1, AAP46199; osSCL30a/SR33-2, BAC799901; osSCL30-2, BAD19168. The sequences were aligned using CLUSTAL W [22] with default parameters, and the phylogenetic tree was produced according to the neighbor-joining method using PAM substitution model distances as implemented in the PHYLIP package [103].