Housheng He, Jie Wang, Tao Liu, X Shirley Liu, Tiantian Li, Yunfei Wang, Zuwei Qian, Haixia Zheng, Xiaopeng Zhu, Tao Wu, Baochen Shi, Wei Deng, Wei Zhou, Geir Skogerbø, Runsheng Chen.
Abstract
The number of annotated protein coding genes in the genome of Caenorhabditis elegans is similar to that of other animals, but the extent of its non-protein-coding transcriptome remains unknown. Expression profiling on whole-genome tiling microarrays applied to a mixed-stage C. elegans population verified the expression of 71% of all annotated exons. Only a small fraction (11%) of the polyadenylated transcription is non-annotated and appears to consist of approximately 3200 missed or alternative exons and 7800 small transcripts of unknown function (TUFs). Almost half (44%) of the detected transcriptional output is non-polyadenylated and probably not protein coding, and of this, 70% overlaps the boundaries of protein-coding genes in a complex manner. Specific analysis of small non-polyadenylated transcripts verified 97% of all annotated small ncRNAs and suggested that the transcriptome contains approximately 1200 small (<500 nt) unannotated noncoding loci. After combining overlapping transcripts, we estimate that at least 70% of the total C. elegans genome is transcribed.
Year: 2007 PMID: 17785534 PMCID: PMC1987347 DOI: 10.1101/gr.6611807
Source DB: PubMed Journal: Genome Res ISSN: 1088-9051 Impact factor: 9.043