SAGE detects microRNA precursors.

Xijin Ge, Qingfa Wu, San Ming Wang.

Abstract

BACKGROUND: MicroRNAs (miRNAs) have been shown to play important roles in regulating gene expression. Since miRNAs are often evolutionarily conserved and their precursors can be folded into stem-loop hairpins, many miRNAs have been predicted. Yet experimental confirmation is difficult since miRNA expression is often specific to particular tissues and developmental stages.
RESULTS: Analysis of 29 human and 230 mouse longSAGE libraries revealed the expression of 22 known and 10 predicted mammalian miRNAs. Most were detected in embryonic tissues. Four SAGE tags detected in human embryonic stem cells specifically match a cluster of four human miRNAs (mir-302a, b, c&d) known to be expressed in embryonic stem cells. LongSAGE data also suggest the existence of a mouse homolog of human and rat mir-493.
CONCLUSION: The observation that some orphan longSAGE tags uniquely match miRNA precursors provides information about the expression of some known and predicted miRNAs.

Year:  2006        PMID: 17090314      PMCID: PMC1636050          DOI: 10.1186/1471-2164-7-285

Source DB:  PubMed          Journal:  BMC Genomics        ISSN: 1471-2164            Impact factor:   3.969


Background

MicroRNAs (miRNAs) are endogenous, ~22 nucleotide (nt) noncoding RNAs that play important roles in gene expression regulation by base-pairing with messenger RNAs [1]. A single miRNA can down-regulate a large number of target mRNAs [2]. Since most miRNA precursors can be mapped to ~60–120 nt long conserved genomic regions and can be folded into hairpin structures, miRNAs can be predicted from genomic sequences with high sensitivity [3-9]. Experimental confirmation and functional analysis of these predicted miRNAs, however, remain a challenge.

Serial analysis of gene expression (SAGE) collects short 14–21 nt tags from the 3' ends of transcripts, immediately after specific restriction enzyme cutting sites; the most frequently used site is "CATG", which is recognized by NlaIII [10]. A recently developed variation of this technique, known as longSAGE, collects 21 bp tags, which are long enough for genomic mapping and specific annotation [11]. Unlike DNA microarrays, which depend on a pre-defined gene set, SAGE is an exploratory method for transcriptome analysis. Many orphan SAGE tags that cannot be associated with any known transcript represent potential novel transcripts [12].

Primary miRNAs transcribed by polymerase II are processed by the nuclear Drosha enzyme to give pre-miRNAs, which are then exported into the cytoplasm and give rise to mature miRNAs. At least some primary miRNAs are known to be capped and polyadenylated in the nucleus [13]. Because a recent analysis of ESTs identified 26 known miRNAs [14], SAGE might also be able to detect some primary miRNAs. To investigate whether this is the case, we mined the large number of human and mouse longSAGE tags deposited in public databases and compared these tags with the sequences of pre-miRNAs.

Results and discussion

To identify a set of SAGE tags that could theoretically be contributed by miRNAs, we searched for "CATG" sites in known miRNA precursors. Among the 332 known human miRNAs in miRBase [15], 92 (28%) bear such sites. Similarly, 64 (24%) of the 270 known mouse miRNAs could contribute SAGE tags. To increase coverage, we also included longSAGE tags uniquely mapped to genomic loci that are very close (within 30 bp) to known hairpin sequences. We did so because the complex process of miRNA biogenesis is still not well understood and the complete primary transcription units, which can be significantly longer than the ~60–120 bp hairpin sequence, have not been defined for most miRNAs. After extension, the number of human and mouse miRNAs associated with longSAGE tags increased to 130 (39%) and 99 (37%), respectively. Thus, SAGE can theoretically detect about one-third of known miRNAs. Additional File 1 lists all these miRNAs and the corresponding longSAGE tags.

These virtual tags were then compared with experimentally observed tags in 29 human and 120 mouse longSAGE libraries in the Gene Expression Omnibus database [16] and in 110 mouse longSAGE libraries representing various tissues in multiple developmental stages from the Mouse Atlas of Gene Expression website [17]. We identified nine longSAGE tags matched to human miRNAs and 16 matched to mouse miRNAs. These tags were then mapped to human or mouse genomic sequences and annotated with available mRNAs and ESTs. After removing tags that may have originated from known genes (e.g., tags mapping to the sense strand of an exon, including UTRs) and those that mapped to multiple genomic loci, we identified eight human and 14 mouse longSAGE tags that represent known miRNAs (Table 1).
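The coverage percentages quoted in this section can be re-derived from the counts given in the text. The short Python sketch below does only that arithmetic; the dictionary layout and the function name `percent` are illustrative choices, not part of the original analysis.

```python
# Counts quoted in the text: (miRNAs with a usable longSAGE tag, known miRNAs).
counts = {
    ("human", "hairpin only"): (92, 332),
    ("mouse", "hairpin only"): (64, 270),
    ("human", "hairpin +/- 30 bp"): (130, 332),
    ("mouse", "hairpin +/- 30 bp"): (99, 270),
}


def percent(part, total):
    """Return part/total as a percentage rounded to the nearest integer."""
    return round(100 * part / total)


coverage = {key: percent(*value) for key, value in counts.items()}

for (species, region), pct in coverage.items():
    print(f"{species:5s} {region:18s} {pct}%")
```

The rounded values agree with the 28%, 24%, 39% and 37% reported above.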
Table 1

LongSAGE tags matched to known and predicted miRNA precursors.

| miRNA (1) | longSAGE tags (2) | Chr. | EST | #libs | Tag counts | Tissue (mouse Theiler Stage) |
|---|---|---|---|---|---|---|
| Human SAGE tags matched to known miRNAs | | | | | | |
| hsa-mir-302a | TTTTGGTGATGGTAAGT | 4q25 | No | 1 | 1 | Embryonic stem cell |
| hsa-mir-302b (3) | GAAGTGCTTTCTGTGAC | 4q25 | Yes | 5 | 9 | Embryonic stem cell |
| hsa-mir-302c | TTTCAGTGGAGGTGTCT | 4q25 | Yes | 1 | 2 | Embryonic stem cell |
| hsa-mir-302d | TTTGAGTGTGGTGGTTC | 4q25 | No | 4 | 6 | Embryonic stem cell |
| hsa-mir-7-1 (4) | CCTCTACAGGACAAATG | 9q21 | No | 3 | 3 | White blood cell, breast tumor, stem cell |
| hsa-let-7i (4) | GCCCTGGCTGAGGTAGT | 12q14 | No | 4 | 4 | Embryonic stem cells and Fetal brain |
| hsa-mir-21 | GCTGTACCACCTTGTCG | 17q23 | Yes | 2 | 2 | White blood cell, breast tumor |
| hsa-mir-125a (4) | TTGCCAGTCTCTAGGTC | 19q13 | No | 1 | 1 | breast tumor (myofibroblast) |
| Human SAGE tags matched to predicted miRNAs | | | | | | |
| Lim et al. [4] | CTACTCTCACTGAGTAC | 5p21 | No | 1 | | Embryonic stem cell |
| cand525-HS | CGGAGCCCCCGGGCTTG | 11q13 | No | 4 | | Embryonic stem cell and breast & lung cancer |
| Mouse SAGE tags matched to known miRNAs | | | | | | |
| mmu-mir-29b-2 (3) | GTGGCTTAGATTTTTCC | 1qH6 | Yes | 2 | 2 | Heart bulbous cordis (TS14 embryo) |
| mmu-mir-205 | GAGCTGCCAGCGGTGGA | 1qH6 | Yes | 7 | 17 | Brain, forelimb & skin (embryo) |
| mmu-mir-130a | CCTTTGCTGCTGGCCGG | 2qD | Yes | 1 | 1 | Branchial Arch embryonic tissue |
| mmu-mir-133a-2 | GCCCAGCCAGAGGACAC | 2qH4 | Yes | 4 | 4 | Heart ventricle (TS14-19), Sk. muscle (TS25) |
| mmu-mir-29a | ACCTCTTGTGACCCCTT | 6qA3 | No | 1 | 1 | Ovary (21 days post natal) |
| mmu-mir-425 | GAAAGTGCTTTGGAATG | 9qF2 | Yes | 3 | 8 | Visual cortex (27 days post natal), Pancreas (TS20) |
| mmu-mir-331 (4) | CAAGCTGAAAGCACTCC | 10qC2 | Yes | 1 | 1 | Brain – Amygdala (Post natal day 7) |
| mmu-let-7i | GCCCTGGCTGAGGTAGT | 10qD2 | Yes | 4 | 4 | Sm. Intestine (TS24), Lung (TS26), Neural tube (TS13), Pancreas |
| mmu-mir-21 | GCTGTACCACCTTGTCG | 11qC | Yes | 2 | 2 | Placenta (TS22), large intestine (TS24) |
| mmu-mir-196a-1 | GCAGTTACTGCTTCTTG | 11qD | No | 3 | 3 | Endoderm (Definitive), kidney (TS24), testis |
| mmu-mir-337 (3) | CAGGAGTTGATTGCACA | 12qF1 | No | 1 | 1 | Adrenal gland (TS22 embryo) |
| mmu-mir-485 (4) | TGTGATACTTGGAGAGA | 12qF1 | No | 1 | 1 | Skin (TS21) |
| mmu-mir-351 (3,4) | GCACCTCCGTTTCCCTG | XqA5 | Yes | 2 | 2 | Heart ventricle (TS14 embryo) |
| mmu-mir-92-2 | CCCATTCATCCACAGGT | XqA5 | No | 1 | 1 | Thymus (TS23 embryo) |
| Mouse SAGE tags matched to predicted miRNAs | | | | | | |
| cand185-MM | TGACTGCCTGTCTGTGC | 1qH2 | Yes | 1 | 1 | Fibroblast cell line (embryonic) |
| cand847-HS | GTGAGCAGATGATGCAT | 2qH1 | Yes | 1 | 1 | Brain – Whole (TS20 embryo) |
| cand136-MM | AACATTATTTCTTGTGT | 4qD2 | No | 1 | 1 | Adult testis |
| cand407-MM | GAGTCTTCCAAGCCAAG | 4qF4 | No | 3 | 3 | Adult bladder & mammary gland |
| cand913-RN | TGACGTCTGAGGAGCGG | 11qE2 | Yes | 1 | 1 | Brain telencephalon dorsal (TS20 embryo) |
| cand219-MM | GCCAATCTCCTTTCGGC | 12qF1 | Yes | 2 | 6 | Visual cortex (27 days post natal) |
| cand202-MM | GTAGGCTTTCATTCATT | 12qF1 | No | 2 | 2 | Branchial arch embryonic tissue (TS15) |
| cand239-MM | CAGCTTTGGAGACGCCA | 16qC3 | No | 1 | 1 | Brain – Preplate (TS21 embryo) |
| cand525-MM | CGGAGCCCCCGGGCTTG | 19qA | No | 9 | 9 | Multiple normal tissues |

(1) Except the one marked as Lim et al. [4], all predicted miRNAs are based on Berezikov et al. [7]. Cand219-MM and Cand202-MM are also predicted by Sewer et al. [9].

(2) The "CATG" site preceding each SAGE tag has been omitted.

(3) SAGE tags match miRNA hairpin sequences without extension.

(4) These miRNAs are listed as known in miRBase based on their homology to entries in other species.

Among the eight human miRNAs whose expression was detected by SAGE tags, four (mir-302a, b, c&d) mapped to a 600 bp region of Chr. 4q25 (Fig. 1). Another member of the cluster, mir-367, was not detected because it lacks a "CATG" site. This miRNA cluster is known to be specifically expressed in human embryonic stem cells [18], which is in accord with the source of the SAGE libraries in which the tags were observed (see Table 1; detailed information about the SAGE libraries is available in Additional File 2).
Figure 1

Four human longSAGE tags specifically mapped to a cluster of four miRNAs on Chromosome 4. These evolutionarily conserved miRNAs are transcribed from the antisense strand of an intron of the HDCMA18P gene.

The large amount of mouse longSAGE data provides rich information about the tissues and developmental stages in which the 14 known miRNAs are expressed. In the mouse embryo at Theiler Stage 14, for example, we observed the expression of mir-133a-2 and mir-351 in heart ventricle. At the same stage, SAGE detected the expression of mir-29b-2 in heart bulbous cordis. The expression of mir-29b and mir-133 in the heart has been confirmed by northern blot [19].

LongSAGE data also indicate the expression of "known" but unconfirmed miRNAs, such as let-7i in human embryonic stem cells and fetal brain tissues. Although human let-7i is listed as a known miRNA in miRBase [15] based on its mouse homolog, its expression has not yet been experimentally confirmed in humans. Similarly, longSAGE tags suggest the expression of two human (mir-7-1 and mir-125a) and three mouse (mir-331, mir-351, and mir-485) miRNAs that have not been experimentally confirmed (Table 1). LongSAGE data thus provide hints about the expression of unconfirmed miRNAs.

LongSAGE data also provide evidence for the existence of some predicted miRNAs. Two human and seven mouse miRNAs predicted by Lim et al. [4], Berezikov et al. [7] and Sewer et al. [9] are supported by SAGE tags (Table 1). One mouse miRNA candidate, cand202-MM, predicted by both Berezikov et al. [7] and Sewer et al. [9], is highly homologous to human and rat mir-493. The presence of its SAGE tag in two mouse SAGE libraries strongly supports the existence of mouse mir-493. Two mouse SAGE tags map to genomic loci that are highly homologous to predicted human (cand847-HS) and rat (cand913-RN) miRNAs. The information about the tissue and stage of expression might facilitate the experimental confirmation of these predicted miRNAs.

The use of SAGE tags to detect miRNA precursors is limited, however. For example, longSAGE tags are subject to sequencing errors. Also, 21 bp tags do not provide the full sequences of miRNA precursors. Therefore, further studies are needed to confirm our findings.

Conclusion

In summary, the available longSAGE tags indicate the expression of eight human and 14 mouse known miRNA precursors and provide evidence for the existence of two human and seven mouse predicted miRNAs. Although the number of miRNAs detected is limited, SAGE data provide useful information on miRNA expression. Together with recent longSAGE-based studies that identified many novel antisense transcripts in mouse [21] and human [22], this study again shows that longSAGE is an effective technology for exploratory transcriptome analysis.

Methods

Genomic coordinates of 332 human and 270 mouse hairpin sequences were downloaded from miRBase [15] as our collection of known miRNAs. Because pre-miRNAs could be longer than these hairpin sequences, the sequences were extended by 30 bp in both directions on the corresponding genomic sequences. In addition, miRNAs predicted by Lim et al. [4], Berezikov et al. [7] and Sewer et al. [9] were downloaded from the respective journal web sites. These sequences were then searched for "CATG" sites, and the 17 bp tag following each site was extracted. These virtual SAGE tags were linked to miRNAs for further analysis.

The 29 human and 120 mouse longSAGE libraries were retrieved from the Gene Expression Omnibus database [16]. Another 110 mouse longSAGE libraries were downloaded from the Mouse Atlas of Gene Expression web site [17]. Pooling multiple libraries for each species led to a total of 632,813 unique human tags and 1,902,036 unique mouse tags. These experimental tags were then compared to the virtual tags extracted from miRNA sequences. Only virtual tags whose sequence was identical to that of an experimental tag were considered confirmed.

For annotation, matched human and mouse tags were mapped to human (Mar. 2006 assembly, hg18) and mouse (Aug. 2005 assembly, mm7) genomic sequences, respectively, using BLAT [20]. All tags mapped to multiple genomic loci or to exons of known genes were excluded. Tags mapped to UTR regions were retained only if the tag was transcribed from the opposite strand.
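The tag extraction and matching steps described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `virtual_tags` and `confirmed_tags`, the toy precursor names, and the flanking bases in the toy sequences are all hypothetical, and the sketch scans only the strand supplied, without genomic mapping or exon filtering.

```python
ANCHOR = "CATG"  # NlaIII recognition site
TAG_LEN = 17     # bases kept after the anchor, as stated in the Methods


def virtual_tags(sequence, anchor=ANCHOR, tag_len=TAG_LEN):
    """Return the tag_len-base tag that follows each anchor site in sequence."""
    sequence = sequence.upper()
    tags = []
    pos = sequence.find(anchor)
    while pos != -1:
        start = pos + len(anchor)
        tag = sequence[start:start + tag_len]
        if len(tag) == tag_len:  # skip sites too close to the end
            tags.append(tag)
        pos = sequence.find(anchor, pos + 1)
    return tags


def confirmed_tags(precursors, observed):
    """Map each precursor name to its virtual tags found in the observed set.

    Only exact, full-length sequence identity counts as a match.
    """
    hits = {}
    for name, sequence in precursors.items():
        matched = [tag for tag in virtual_tags(sequence) if tag in observed]
        if matched:
            hits[name] = matched
    return hits


# Toy input (hypothetical): flanking bases are invented; only the 17-base
# tag in the first sequence is taken from Table 1.
precursors = {
    "toy-precursor-A": "GGACATGTTTTGGTGATGGTAAGTCCA",
    "toy-precursor-B": "ACGTACGTACGTACGTACGTACGT",  # no CATG site, no tag
}
observed = {"TTTTGGTGATGGTAAGT", "AAAAAAAAAAAAAAAAA"}

hits = confirmed_tags(precursors, observed)
print(hits)
```

Storing the pooled experimental tags in a set keeps each lookup constant-time, which matters when the pooled libraries contain on the order of a million unique tags.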

Authors' contributions

XG, QW and SMW conceived the study and participated in study design. XG did computational analyses. XG and SMW wrote the manuscript. All authors read and approved the final manuscript.

Additional File 1

Virtual SAGE tags extracted from known miRNA precursors. This file contains 310 longSAGE tags extracted from known miRNA precursor sequences.

Additional File 2

Detailed description of the SAGE libraries that include tags representing miRNA precursors. This file gives the tissue type and developmental stage (if available) in which the SAGE tags listed in Table 1 were detected. SAGE library IDs are also given for further enquiry in the original databases.
References (10 of 22 listed)

1.  W James Kent. BLAT--the BLAST-like alignment tool. Genome Res. 2002-04.

2.  Lee P Lim; Margaret E Glasner; Soraya Yekta; Christopher B Burge; David P Bartel. Vertebrate microRNA genes. Science. 2003-03-07.

3.  Lee P Lim; Nelson C Lau; Earl G Weinstein; Aliaa Abdelhakim; Soraya Yekta; Matthew W Rhoades; Christopher B Burge; David P Bartel. The microRNAs of Caenorhabditis elegans. Genes Dev. 2003-04-02.

4.  Yonatan Grad; John Aach; Gabriel D Hayes; Brenda J Reinhart; George M Church; Gary Ruvkun; John Kim. Computational and experimental identification of C. elegans microRNAs. Mol Cell. 2003-05.

5.  Jianjun Chen; Miao Sun; Sanggyu Lee; Guolin Zhou; Janet D Rowley; San Ming Wang. Identifying novel transcripts and novel genes in the human genome by using novel SAGE tags. Proc Natl Acad Sci U S A. 2002-09-04.

6.  Mariana Lagos-Quintana; Reinhard Rauhut; Abdullah Yalcin; Jutta Meyer; Winfried Lendeckel; Thomas Tuschl. Identification of tissue-specific microRNAs from mouse. Curr Biol. 2002-04-30.

7.  Saurabh Saha; Andrew B Sparks; Carlo Rago; Viatcheslav Akmaev; Clarence J Wang; Bert Vogelstein; Kenneth W Kinzler; Victor E Velculescu. Using the transcriptome to annotate the genome. Nat Biotechnol. 2002-05.

8.  Sam Griffiths-Jones; Russell J Grocock; Stijn van Dongen; Alex Bateman; Anton J Enright. miRBase: microRNA sequences, targets and gene nomenclature. Nucleic Acids Res. 2006-01-01.

9.  Sung-Chou Li; Chao-Yu Pan; Wen-chang Lin. Bioinformatic discovery of microRNA precursors from human ESTs and introns. BMC Genomics. 2006-07-03.

10.  Eric C Lai; Pavel Tomancak; Robert W Williams; Gerald M Rubin. Computational identification of Drosophila microRNA genes. Genome Biol. 2003-06-30.
