Alexei Fedorov, Jesse Stombaugh, Michael W Harr, Saihua Yu, Lorena Nasalean, Valery Shepelev.
Abstract
Based on comparative genomics, we created a bioinformatic package for computer prediction of small nucleolar RNA (snoRNA) genes in mammalian introns. The core of our approach was the use of the Mammalian Orthologous Intron Database (MOID), which contains all known introns within the human, mouse and rat genomes. Introns of orthologous genes from these three species that occupy the same position relative to the reading frame are grouped in a special orthologous intron table. Our program SNO.pl searches for conserved snoRNA motifs within MOID and reports all cases in which characteristic snoRNA-like structures are present in all three orthologous introns of human, mouse and rat. Here we report an example of using SNO.pl to search for a particular pattern of conserved C/D-box snoRNA motifs (canonical C- and D-boxes and a 6 nt long terminal stem). In this computer analysis, we detected 57 triplets of snoRNA-like structures in the three mammals. Among them, 15 triplets represented known C/D-box snoRNA genes. Six triplets represented snoRNA genes that had only been partially characterized in the mouse genome. One triplet represented a novel snoRNA gene, and three others represented putative snoRNAs. Our programs are publicly available and can be easily adapted and/or modified for searching any conserved motifs within mammalian introns.
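The search pattern described in the abstract (canonical C- and D-boxes plus a 6 nt terminal stem, required in all three orthologous introns) can be illustrated with a minimal Python sketch. This is not the SNO.pl code. The box consensus sequences are the commonly cited RUGAUGA and CUGA written in the DNA alphabet; the placement of the stem immediately outside the boxes, the acceptance of G-T wobble pairs, the `min_len`/`max_len` bounds and all function names are assumptions made for illustration.

```python
import re

# Canonical C-box (RUGAUGA) and D-box (CUGA), written in the DNA alphabet.
C_BOX = re.compile(r"(?=([AG]TGATGA))")
D_BOX = "CTGA"
STEM_LEN = 6  # terminal stem length named in the abstract

# Allowed base pairs; accepting G-T wobble pairs is an assumption of this sketch.
PAIRS = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G"), ("G", "T"), ("T", "G")}

def forms_stem(left, right):
    """True if `left` (5'->3') pairs base by base with `right` read 3'->5'."""
    return len(left) == len(right) and all(
        (a, b) in PAIRS for a, b in zip(left, reversed(right))
    )

def find_cd_candidates(intron, min_len=60, max_len=120):
    """Return (start, end) spans of C/D-box snoRNA-like structures (0-based, end exclusive).

    min_len and max_len are hypothetical size bounds, not values from the paper.
    """
    seq = intron.upper()
    hits = []
    for m in C_BOX.finditer(seq):
        c_start = m.start()
        if c_start < STEM_LEN:
            continue  # no room for the 5' half of the stem
        d_pos = seq.find(D_BOX, c_start + 7)
        while d_pos != -1:
            start = c_start - STEM_LEN
            end = d_pos + len(D_BOX) + STEM_LEN
            if end > len(seq) or end - start > max_len:
                break  # later D-boxes only give longer candidates
            if end - start >= min_len and forms_stem(
                seq[start:c_start], seq[d_pos + len(D_BOX):end]
            ):
                hits.append((start, end))
            d_pos = seq.find(D_BOX, d_pos + 1)
    return hits

def conserved_in_triplet(human, mouse, rat):
    """Report a triplet only if every orthologous intron contains a candidate."""
    return all(find_cd_candidates(s) for s in (human, mouse, rat))
```

Requiring a hit in all three orthologous introns is what makes the search comparative: a motif that appears by chance in one species is discarded unless the mouse and rat (or human) introns at the same position carry it too.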
Year: 2005 PMID: 16093549 PMCID: PMC1184218 DOI: 10.1093/nar/gki754
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Results of an online Blast2 alignment of the 1708 nt long third intron of the human ribosomal protein S3a gene and its mouse ortholog (the 1990 nt long third intron of the mouse rps3a gene). (A) Dot plot of the intron comparison obtained with the online bl2seq program using default parameters. (B) Alignment of the human and mouse introns. The sequences of the snoRNA C- and D-boxes inside these introns are boxed.
Table 1. Description of computed snoRNA and snoRNA-like sequences detected inside each intron of 57 orthologous intron triplets from the human, mouse and rat genomes
| Category | Number of orthologous intron triplets with snoRNA-like sequences | Description of computed snoRNA-like structures |
|---|---|---|
| Known snoRNA | 15 | U103 (AY349604); U103 (AY349604); U38a (NR_001456); U20 (Z34290); Z25 (HSAJ10666); U73b (NG_000961); U73 (Z83330); U14 (NR_001452); mgU2-19/30 (BK005567.1); Z32 (HSAJ9638); U59 (X96659); Z17a (HSA224024); U41 (X96640); U32 (NR_000021); U33 (NR_000020); U61 (X96661) |
| Partially characterized snoRNA in mouse | 6 | MBII-316 (AF357335); MBII-295 (AF357354); MBII-166 (AF357343); MBII-82 (AF357319); MBII-115 (AF357349); MBII-55 (AF357318) |
| Novel snoRNA | 1 | (37..78) segment has 88% sequence identity to human U53 snoRNA (X96652) and (59..79) segment is identical to mouse MBII-35 (AF357377) |
| Putative snoRNA | 3 | Inside middle-size introns with conserved sequence motifs |
| Extra-long introns with false-positive snoRNA-like sequences | 32 | These snoRNA-like structures were found in introns longer than 100 000 nt in all three species |
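As a quick consistency check, the category counts in the table can be summed and the extra-long-intron criterion written as a predicate. The sketch below is illustrative only: the counts are copied from the table, the 100 000 nt threshold is the one stated in its last row, and the function name and the example intron lengths in the usage note are hypothetical.

```python
# Triplet counts per category, copied from the table above.
counts = {
    "Known snoRNA": 15,
    "Partially characterized snoRNA in mouse": 6,
    "Novel snoRNA": 1,
    "Putative snoRNA": 3,
    "Extra-long introns with false-positive snoRNA-like sequences": 32,
}
total = sum(counts.values())  # should match the 57 triplets reported

LONG_INTRON_NT = 100_000  # threshold stated in the last table row

def is_extra_long_triplet(lengths):
    """True if the introns in all three species exceed the length threshold."""
    return all(n > LONG_INTRON_NT for n in lengths)
```

For example, `is_extra_long_triplet((150_000, 120_000, 100_001))` is true, while a triplet containing a 1708 nt intron is not flagged.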
Table 2. Description of partially characterized, novel and putative snoRNAs detected by the SNO.pl program
| snoRNA name (gene) | Species, chromosome | Intron identifier in MOID | Genomic location (within GenBank contig) | Putative targets for ASEs |
|---|---|---|---|---|
| MBII-316 ( | Hs, chr 2 | INTRON_5 2295_NT_022184 | NT_022184 (7952461..7952549) | AS1: gagtcgggg 28S rRNA (3 G–T) (1340–1348); AS2: cacagccaaggga 28S rRNA (1 G–T) (3843–3855) |
| | Mm, chr 17 | INTRON_5 17444_NT_039658 | NT_039658 (5183792..5183880) | AS1?; AS2: cacagccaaggga 28S rRNA (1 G–T) (3520–3532) |
| | Rn, chr 6 | INTRON_5 7091_NW_047756 | NW_047756 (comp6117735..6117827) | AS1: gagtcaggg 28S rRNA (2 G–T) (1252–1260); AS2: cacagccaaggga 28S rRNA (1 G–T) (3589–3601) |
| MBII-295 ( | Hs, chr 9 | INTRON_4 9675_NT_008470 | NT_008470 (comp32963706..32963791) | AS1: gtctgccctat 18S rRNA (1 G–T) (351–361) |
| | Mm, chr 2 | INTRON_7 1548_NT_039206 | NT_039206 (comp14828672..14828757) | AS1: gtctgccctat 18S rRNA (1 G–T) (352–362) |
| | Rn, chr 3 | INTRON_5 3929_NW_047653 | NW_047653 (comp3192607..3192692) | AS1: gtctgccctat 18S rRNA (1 G–T) (353–363) |
| MBII-166 ( | Hs, chr 11 | INTRON_30 11040_NT_009237 | NT_009237 (comp45571179..45571289) | AS1: tggcccttg 28S rRNA (1 G–T) (2721–2729) |
| | Mm, chr 2 | INTRON_31 1817_NT_039207 | NT_039207 (32430849..32430959) | AS1: tggcccttg 28S rRNA (1 G–T) (2487–2495) |
| | Rn, chr 3 | INTRON_27 4231_NW_047657 | NW_047657 (17383684..17383793) | AS1: tggcccttg 28S rRNA (1 G–T) (2575–2583) |
| MBII-82 ( | Hs, chr 16 | INTRON_6 15357_NT_010498 | NT_010498 (24186109..24186198) | AS1: gaagagacatgaga 28S rRNA (1 G–T) (3920–3933) |
| | Mm, chr 8 | INTRON_6 9497_NT_078575 | NT_078575 (comp35865888..35865972) | AS1: gaagagacatgag 28S rRNA (1 G–T) (3597–3609) |
| | Rn, chr 19 | INTRON_2 16923_NW_047536 | NW_047536 (comp3195818..3195902) | AS1: gaagagacatgag 28S rRNA (2 G–T) (3666–3678) |
| MBII-115 ( | Hs, chr 19 | INTRON_10 17981_NT_011109 | NT_011109 (20527299..20527410) | AS2? |
| | Mm, chr 7 | INTRON_10 7232_NT_039395 | NT_039395 (comp177705..177815) | AS2: ccccgggcg 28S rRNA (2 G–T) (1022–1030) |
| | Rn, chr 1 | INTRON_10 500_NW_047555 | NW_047555 (comp21675179..21675289) | AS2: ccccgggcg 28S rRNA (2 G–T) (1086–1084) |
| MBII-55 ( | Hs, chr 20 | INTRON_3 18370_NT_011387 | NT_011387 (2574856..2574934) | AS1: ggattgacagatt 18S rRNA (0 G–T) (1285–1297) |
| | Mm, chr 2 | INTRON_2 2112_NT_039207 | NT_039207 (70965250..70965321) | AS1: ggattgacagat 18S rRNA (0 G–T) (1285–1296) |
| | Rn, chr 3 | INTRON_3 4524_NW_047658 | NW_047658 (8066063..8066135) | AS1: ggattgacagatt 18S rRNA (0 G–T) (1289–1301) |
| Novel 1 ( | Hs, chr 2 | INTRON_11 2295_NT_022184 | NT_022184 (7966780..7966861) | AS2: cagccaagggaa 28S rRNA (1 G–T) (3845–3856) |
| | Mm, chr 17 | INTRON_11 17444_NT_039658 | NT_039658 (5194095..5194174) | AS2: cagccaagggaa 28S rRNA (1 G–T) (3522–3533) |
| | Rn, chr 6 | INTRON_12 7091_NW_047756 | NW_047756 (comp6107760..6107839) | AS2: cagccaagggaa 28S rRNA (1 G–T) (3591–3602) |
| Putative 1 ( | Hs, chr 3 | INTRON_5 4087_NT_005612 | NT_005612 (8672246..8672320) | ? |
| | Mm, chr 16 | INTRON_6 16457_NT_096987 | NT_096987 (comp20526051..20526125) | ? |
| | Rn, chr 11 | INTRON_7 12205_NW_047355 | NW_047355 (9207232..9207306) | ? |
| Putative 2 ( | Hs, chr 9 | INTRON_5 9220A_NT_008413 | NT_008413 (comp32977364..32977439) | AS1: ccatgaacgag 18S rRNA (3 G–T) (1627–1637) |
| | Mm, chr 4 | INTRON_4 3776_NT_039260 | NT_039260 (comp15064752..15064844) | AS1: ccatgaacgag 18S rRNA (3 G–T) (1627–1637) |
| | Rn, chr 5 | INTRON_4 6027_NW_047711 | NW_047711 (comp32384161..32384253) | AS1: ccatgaacgag 18S rRNA (3 G–T) (1627–1637) |
| Putative 3 ( | Hs, chr 15 | INTRON_15 14222_NT_010194 | NT_010194 (22053882..22053953) | ? |
| | Mm, chr 2 | INTRON_15 2063_NT_039207 | NT_039207 (67744852..67744930) | ? |
| | Rn, chr 3 | INTRON_4 4479_NW_047658 | NW_047658 (4776101..4776177) | ? |
The first column lists the names of the snoRNAs, followed by the GenBank identifier of the human gene inside which each snoRNA was found. The last column lists putative targets for the snoRNA antisense elements (ASEs) that were detected by the program TARGET.pl. The number of non-Watson–Crick (non-WC) G–T pairs is shown in parentheses, followed by the position of the target within the rRNA sequence. Cases in which no ASE targets were detected are denoted by ‘?’.
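The kind of search summarized in the last column, an antisense element matched against an rRNA with a limited number of G–T pairs, can be sketched as follows. This is not the TARGET.pl code: the function name, the `max_gt` cutoff and the choice to scan every window of the rRNA are assumptions, and the sequences in the usage note are toy strings rather than real rRNA.

```python
# Pairing rules for an antisense element (ASE) read 5'->3' against a target
# read 3'->5', in the DNA alphabet. G-T stands for the G-U wobble pair.
WATSON_CRICK = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "T"), ("T", "G")}

def find_targets(ase, rrna, max_gt=3):
    """Yield (start, end, n_gt) for rRNA segments that pair with `ase`.

    Positions are 1-based and inclusive; n_gt counts non-WC G-T pairs.
    max_gt is a hypothetical cutoff, not a value taken from the paper.
    """
    ase, rrna = ase.upper(), rrna.upper()
    n = len(ase)
    for i in range(len(rrna) - n + 1):
        window = rrna[i:i + n]
        gt = 0
        for a, b in zip(ase, reversed(window)):
            if (a, b) in WOBBLE:
                gt += 1
            elif (a, b) not in WATSON_CRICK:
                break  # mismatch: this window cannot be a target
        else:
            if gt <= max_gt:
                yield (i + 1, i + n, gt)
```

With the toy ASE `GATTC`, scanning `CCCCGAATCCCCC` yields one perfect target at positions 5 to 9, and scanning `CCCCGAATTCCCC` yields the same span with one G–T pair, mirroring the "(n G–T) (start–end)" notation used in the table.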
Figure 2. Sequences and conserved motifs of partially characterized, novel and putative snoRNAs that were detected by the SNO.pl program. C-, C′-, D- and D′-boxes are boxed. ASE-1s are underlined by a single line, ASE-2s are underlined by a double line. Hypothetical ASEs, which do not have strong targets, are underlined by a dotted line. All ASE targets are listed in Table 2.