Deborah J Good, Matthew A Kocher.
Abstract
The SNORD116 small nucleolar RNA locus (SNORD116@) is contained within the long noncoding RNA host gene SNHG14 on human chromosome 15q11-q13. The SNORD116 locus is a cluster of 28 or more C/D box small nucleolar RNAs (snoRNAs; SNORDs). Individual RNAs within the cluster are tandem, highly similar sequences, referred to as SNORD116-1, SNORD116-2, etc., with the entire set referred to as SNORD116@. There are also related SNORD116 loci on other chromosomes, and these additional loci are conserved among primates. Inherited chromosomal 15q11-q13 deletions encompassing the SNORD116@ locus are causative for the paternally-inherited/maternally-imprinted genetic condition Prader-Willi syndrome (PWS). Using in silico tools, along with molecular-based and sequence-based confirmation, phylogenetic analysis of the SNORD116@ locus was performed. The consensus sequence for the SNORD116@ snoRNAs from various species was determined both for all the SNORD116 snoRNAs and for those grouped by sequence and location according to a human grouping convention. The implications of these findings are put in perspective for studying SNORD116 in patients with inherited Prader-Willi syndrome, as well as in model organisms.
Keywords: Prader–Willi Syndrome; imprinting; phylogenetic analysis; snoRNA
Year: 2017 PMID: 29189765 PMCID: PMC5748676 DOI: 10.3390/genes8120358
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.096
SNORD116 snoRNA clusters in different species.
| Common Name | Chromosome | Synteny with Human Chromosome 15 | Cluster Size (bp) | Strand | Number of Transcripts (with Perfectly Homologous C/C’ and D/D’ Boxes) | Number of Annotated Transcripts |
|---|---|---|---|---|---|---|
| Human | 15 | - | 56,781 | Forward | 30 (24) | 30 |
| Chimpanzee | 15 | yes | 66,103 | Forward | 28 (22) | 0 |
| Rhesus macaque | 7 | yes | 61,342 * | Forward | 29 (26) | 29 |
| Rabbit | 17 | yes | 72,915 | Reverse | 29 (22) | 0 |
| Rat | 1 | yes | 163,162 @ | Reverse | 26 (15) | 26 |
| Mouse | 7 | yes | 45,634 (Cluster 1) 133,627 (Cluster 2) ^ | Reverse | 71 (64) | 17 |
* Missing one ~6.5 kb contig within the cluster; @ missing six contigs, totaling ~50 kb within the region; ^ missing one ~50 kb contig between clusters.
Non-cluster paralogs to SNORD116.
| Human Chromosome Number (Accession Number) | Location in Humans | Presence of Homologous C/C’ and D/D’ Boxes to | Chimpanzee Chromosome (synteny)/Location/ Homologous C/D Boxes? | Rhesus Macaque Chromosome (synteny)/Location/Homologous C/D Boxes? |
|---|---|---|---|---|
| 1(ENST00000365628.1) | Intronic | No | 3 (no)/intergenic/yes | 1 (yes)/intronic/no |
| 9(ENST00000517176.1) | Intronic | No | 9 (yes)/intronic/no | 15 (yes)/intronic/no |
| 13(ENST00000391251.1) | Intergenic | No | N/A (scaffold) (yes)/intergenic/no | 17 (yes)/intergenic/no |
Figure 1. Comparison of genomic sequences of the SNORD116 locus from model organisms used for most biological research. Sequences from human (Homo sapiens), chimpanzee (Pan troglodytes), rhesus (Macaca mulatta), rabbit (Oryctolagus cuniculus), rat (Rattus norvegicus) and mouse (Mus musculus) were analyzed and the composite sequence is shown. Nucleotides A, G, C and T are shown where they are present in 90% or more of the transcripts at that position, while IUPAC codes are used for positions with one or more variable nucleotides. (A) Alignment of SNORD116@ consensus sequences displaying sites of non-strict homology. A dot (.) indicates non-strict homology with the human sequence for the given nucleotide position. (B) Alignment showing all nucleotides of the consensus sequences used in (A). Sequences used for the analysis were obtained from Ensembl builds. The build and the number of SNORD116 sequences analyzed are shown after the common name of the organism: human (n = 30, GRCh38.p10), chimp (n = 28, CHIMP2.1.4), rhesus (n = 29, Mmul_8.0.1), rabbit (n = 29, OryCun_2.0), rat (n = 26, Rnor_6.0) and mouse (n = 17, GRCm38.p5). The C and C’ boxes are highlighted in yellow, while the D and D’ boxes are highlighted in blue. Nucleotides that do not meet the 90% frequency threshold are indicated using IUPAC ambiguity codes. Grey shading indicates regions of insertion/deletion at some sites of the group. The frequency threshold for qualifying as an in/del site is 10% or greater. A tilde (~) indicates a gap in the consensus sequence compared to the aligned consensus sequence. An asterisk (*) indicates perfect homology with the human sequence for all nucleotides in the site above.
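The consensus rule described in the caption (a single nucleotide at 90% frequency or greater, otherwise an IUPAC ambiguity code, with in/del sites flagged at 10% or greater) can be illustrated with a minimal Python sketch. This is not the authors' pipeline. The function name `consensus`, the `-` gap character in the input, and the choice to compute base frequencies over non-gap bases only are assumptions made for illustration.

```python
from collections import Counter

# Standard IUPAC nucleotide codes, keyed by the set of bases each one represents.
IUPAC_CODES = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M", frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V", frozenset("ACGT"): "N",
}


def consensus(aligned, base_threshold=0.90, indel_threshold=0.10, gap="-"):
    """Return (consensus string, per-column in/del flags) for aligned sequences.

    Hypothetical helper: assumes pre-aligned, equal-length strings over A/C/G/T
    plus the gap character.
    """
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    letters, indel_flags = [], []
    for column in zip(*aligned):
        bases = [b.upper() for b in column if b != gap]
        # Flag the column as an in/del site when enough sequences carry a gap.
        gap_fraction = (len(column) - len(bases)) / len(column)
        indel_flags.append(gap_fraction >= indel_threshold)
        if not bases:
            letters.append("~")  # no sequence has a base at this column
            continue
        top_base, top_count = Counter(bases).most_common(1)[0]
        if top_count / len(bases) >= base_threshold:
            letters.append(top_base)  # one nucleotide meets the threshold
        else:
            # Otherwise report the ambiguity code covering every observed base.
            letters.append(IUPAC_CODES[frozenset(bases)])
    return "".join(letters), indel_flags


# Toy alignment (not real SNORD116 data): 8 of 10 sequences carry C at column 2.
toy = ["ACGT"] * 8 + ["ATG-"] * 2
toy_consensus, toy_indels = consensus(toy)
```

In the toy alignment, column 2 falls below the 90% threshold and is reported as `Y` (C or T), and column 4 is flagged as an in/del site because 20% of the sequences have a gap there.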
Homology comparison of consensus sequences for SNORD116 groupings between human and rat, rabbit and mouse. The number of homologous nucleotide sites is displayed.
| | Non-Strict Homology: Human Group I (96 Nucleotides) | Non-Strict Homology: Human Group II (92 Nucleotides) | Non-Strict Homology: Human Group III (96 Nucleotides) | Strict Homology: Human Group I (96 Nucleotides) | Strict Homology: Human Group II (92 Nucleotides) | Strict Homology: Human Group III (96 Nucleotides) |
|---|---|---|---|---|---|---|
| Rat 116@ | 64 (66.7%) | 65 (70.7%) | 52 (54.2%) | 59 (61.5%) | 57 (62.0%) | 30 (31.3%) |
| Rat Group I | 76 (79.2%) | 79 (85.9%) | 65 (67.7%) | 69 (71.9%) | 66 (71.7%) | 48 (50.0%) |
| Rat Group II | 64 (66.7%) | 60 (65.2%) | 53 (55.2%) | 59 (61.5%) | 55 (59.8%) | 40 (41.7%) |
| Rabbit 116@ | 73 (76.0%) | 69 (75.0%) | 58 (60.4%) | 68 (70.8%) | 60 (65.2%) | 45 (46.9%) |
| Rabbit Group I | 86 (89.6%) | 78 (84.8%) | 64 (66.7%) | 78 (81.3%) | 64 (69.6%) | 47 (49.0%) |
| Rabbit Group II | 57 (59.4%) | 59 (64.1%) | 48 (50.0%) | 54 (56.3%) | 53 (57.6%) | 41 (42.7%) |
| Mouse 116@ | 81 (84.4%) | 81 (88.0%) | 64 (66.7%) | 75 (78.1%) | 66 (71.7%) | 47 (49.0%) |
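Each table entry is a count of homologous sites followed by that count as a percentage of the human consensus length (for example, 64 of 96 sites is 66.7%). The Python sketch below shows one way such counts could be computed. The record does not define strict and non-strict homology precisely, so the definitions used here are assumptions: a site counts as strict when both consensus sequences carry the same symbol, and as non-strict when the nucleotide sets implied by their IUPAC codes overlap. The function names are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

# Bases represented by each IUPAC nucleotide code.
IUPAC_SETS = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def count_homology(reference, other):
    """Return (strict, non_strict) site counts for two aligned consensus strings.

    Assumed definitions: strict = identical symbol; non-strict = overlapping
    IUPAC base sets. Gap symbols (anything not in IUPAC_SETS) never match.
    """
    if len(reference) != len(other):
        raise ValueError("consensus sequences must be aligned to equal length")
    strict = non_strict = 0
    for ref_sym, other_sym in zip(reference.upper(), other.upper()):
        ref_bases = set(IUPAC_SETS.get(ref_sym, ""))
        other_bases = set(IUPAC_SETS.get(other_sym, ""))
        if not ref_bases or not other_bases:
            continue  # gap or unknown symbol on either side
        if ref_sym == other_sym:
            strict += 1
        if ref_bases & other_bases:
            non_strict += 1
    return strict, non_strict


def percent_of_reference(count, reference_length):
    """Express a site count as a percentage of the reference length, to 0.1%."""
    value = Decimal(100 * count) / Decimal(reference_length)
    # Round half up so that exact .x5 values round the same way as in the table.
    return float(value.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP))


# Toy strings (not real consensus sequences): R (A or G) overlaps A but is not identical.
toy_strict, toy_non_strict = count_homology("ACGTR", "ACGAA")
```

With these definitions, the non-strict count is never smaller than the strict count, which matches the pattern in the table. `percent_of_reference(64, 96)` gives 66.7, as in the Rat 116@ row.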
Figure 2. Consensus sequences for respective SNORD116@ transcript clusters (groups). The threshold frequency for single nucleotide inclusion in the consensus is 90%. Nucleotides that do not meet or exceed the 90% frequency threshold are indicated using IUPAC ambiguity codes. In/del sites are highlighted in light gray. The frequency threshold for qualifying as an in/del site is 10% or greater. C/C’ boxes are highlighted in yellow. D/D’ boxes are highlighted in light blue. A dot (.) indicates non-strict homology with the human sequence for the given nucleotide position. A tilde (~) indicates a gap in the consensus sequence compared to other consensus sequences. An asterisk (*) indicates perfect homology with the human sequence for all nucleotides in the site above. (A) Group I consensus analysis; (B) Group II consensus analysis; (C) Group III consensus analysis; (D) Mouse-human consensus analysis. For this analysis, nucleotides represented by IUPAC codes are shown in pink.