Amit N Khachane, Paul M Harrison.
Abstract
BACKGROUND: The role of long non-coding RNAs (lncRNAs) in controlling gene expression has garnered increased interest in recent years. Sequencing projects, such as Fantom3 for mouse and H-InvDB for human, have generated abundant data on transcribed components of mammalian cells, the majority of which appear not to be protein-coding. However, much of the non-protein-coding transcriptome could merely be a consequence of 'transcription noise'. It is therefore essential to use bioinformatic approaches to identify the likely functional candidates in a high-throughput manner.
Year: 2010 PMID: 20428234 PMCID: PMC2859052 DOI: 10.1371/journal.pone.0010316
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. A schematic representation of the discovery pipeline for conserved expressed long non-protein-coding RNAs (lncRNAs).
General statistics for the 78 conserved lncRNAs.
| Category | Subcategory | Number of cases |
| Syntenically conserved | With significant sequence homology | 11 |
| Syntenically conserved | Without significant sequence homology | 67 |
| Genomic location | Protein-coding region | 57 |
| Genomic location | Non-protein-coding region | 21 |
| Spliced forms | Spliced | 64 |
| Spliced forms | Non-spliced | 14 |
*The BLAST e-value threshold was set to <1×10⁻⁶. The protein-coding region annotations were taken from the ENSEMBL website (www.ensembl.org).
A summary of the analysis results for preservation, sequence conservation and occurrence of secondary structure motifs in mouse lncRNAs that have orthologous human counterparts (with BLAST homology).
| Fantom3 entries | H-InvDB entries (syntenic to mouse lncRNA and having BLAST homology, e-value <0.01) | Preservation in other mammals | Sequence identity between conserved lncRNAs (identity between orthologous flanking regions in brackets) | Conservation of secondary structure motifs |
| A230108N10 | HIT000394689.1 | HMDC | 80.3% (not conserved) | no |
| D730031O06 | HIT000294155.8 | HMDC | 64.3% (not conserved) | yes |
| A430070C22 | HIT000091723.8 | HMDC | 10.12% (not conserved) | no |
| 1600017P15 | HIT000389575.3 | HM | 72% (not conserved) | no |
| A130061G12 | HIT000294554.8 | HMDC | 25.8% (34.8%) | yes |
| 2600002C05 | HIT000323535.8 | HMDC | 43.4% (not conserved) | yes |
| 9530073M10 | HIT000093538.10 | HMDC | 66.1% (40.15%) | no |
| 5430433I11 | HIT000282711.8 | HMDC | 87.9% (17.2%) | yes |
| 5330421F07 | HIT000248175.9 | HMDC | 28.1% (7.2%) | yes |
| 1110021C24 | HIT000292834.10 | HMDC | 48.1% (11.3%) | no |
| G370125G16 | HIT000430538.1 | HMDC | 39.4% (not conserved) | no |
*‘H’ refers to human, ‘M’ to mouse, ‘D’ to dog and ‘C’ to cow.
Note: For the calculation of sequence conservation, orthologous sequences to mouse lncRNAs were identified in the human genome using synteny maps and BLAST searches (e-value<0.01) and subjected to further evolutionary analysis.
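The e-value cut-off mentioned in the note can be illustrated with a small filter over BLAST tabular output. This is a minimal sketch, not the authors' pipeline: it assumes hits were exported in the standard 12-column tabular format (`-outfmt 6`), and the function name `filter_hits` and the default threshold argument are illustrative choices.

```python
import csv

# Default column order of BLAST+ tabular output (-outfmt 6).
BLAST_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]

def filter_hits(handle, max_evalue=0.01):
    """Return (query, subject, percent identity, e-value) for hits below max_evalue.

    `handle` is any iterable of tab-separated lines, e.g. an open file.
    The 0.01 default mirrors the threshold quoted in the note; it is an
    assumption that the same cut-off applies to every comparison.
    """
    hits = []
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        record = dict(zip(BLAST_COLUMNS, row))
        evalue = float(record["evalue"])
        if evalue < max_evalue:
            hits.append(
                (record["qseqid"], record["sseqid"], float(record["pident"]), evalue)
            )
    return hits
```

Passing an open file handle, for example `filter_hits(open("hits.tsv"))` with a hypothetical file name, returns only the hits that pass the cut-off; synteny would still have to be checked separately.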
Figure 2. A schematic representation of the different genomic regions from which lncRNAs originate, relative to the structure of a protein-coding gene.
List of lncRNAs associated with known genes implicated in cancer pathogenesis.
| H-Inv id | Gene name | Cancer | References |
| HIT000067299.10 | | Lymphoma | CGMIM |
| HIT000064387.8 | | Leukemia | CGMIM |
| HIT000257890.10 | | Breast, Melanoma, Ovarian | CGMIM |
| HIT000277951.8 | | Leukemia | CGMIM |
| HIT000323535.8 | | Leukemia | CGMIM |
| HIT000389429.2 | | Colorectal | CGMIM |
| HIT000327147.7 | | Colorectal | CGMIM |
| HIT000079026.8 | | Brain, Breast, Colorectal, Prostate, Ovarian | CGMIM |
| HIT000067550.9 | | Brain, Melanoma | CGMIM |
| HIT000276030.9 | | Breast, Ovarian | CGMIM |
| HIT000284226.9 | | Oncogene | OMIM |
| HIT000024195.13 | | Melanoma | |
| HIT000383650.1 | | Basal cell carcinoma | OMIM |
| HIT000243731.8 | | Reduced expression in pituitary tumors; its overexpression reverts growth hormone hypersecretion | |
| HIT000248175.9 | | Colorectal carcinomas | |
| HIT000075518.7 | | Breast cancer | |
| HIT000071420.7 | | Prostate cancer, Breast cancer | |
| HIT000389219.2 | | Pulmonary chondroid hamartoma, Uterine leiomyomas | |
| HIT000089413.9 | | Acute promyelocytic leukemia | |
| HIT000330125.5 | | Non-Hodgkin lymphoma, Urinary bladder cancer susceptibility, Colorectal adenoma susceptibility | |
Note: The ‘CGMIM’ database is accessible at http://www.bccrc.ca/ccr/CGMIM/ and the ‘OMIM’ database at www.ncbi.nlm.nih.gov/omim/.
Figure 3. A model for antisense regulation of target mRNA transcripts by lncRNAs.
The lncRNA sequences HIT000079026.8 and HIT000091723.8 are complementary to the UTRs of the protein-coding transcripts ENST00000393449 and ENST00000383790, respectively.
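A complementary relationship of this kind can be checked computationally by reverse-complementing one sequence and looking for shared stretches in the other. The sketch below is only an illustration under simplifying assumptions: it considers exact Watson-Crick matches on DNA-alphabet sequences (no G-U wobble pairs, mismatches or gaps), and the function names are hypothetical rather than taken from the paper.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA-alphabet sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def longest_complementary_stretch(lncrna, utr):
    """Return the longest UTR segment that pairs perfectly with the lncRNA.

    The lncRNA is reverse-complemented, then the longest common substring
    with the UTR is found by dynamic programming over two rows.
    """
    target = utr.upper()
    probe = reverse_complement(lncrna)
    best_len, best_end = 0, 0
    prev = [0] * (len(target) + 1)
    for i in range(1, len(probe) + 1):
        curr = [0] * (len(target) + 1)
        for j in range(1, len(target) + 1):
            if probe[i - 1] == target[j - 1]:
                # Extend the match that ended at the previous position pair.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], j
        prev = curr
    return target[best_end - best_len:best_end]
```

For real transcripts a local alignment tool would normally be preferred, since antisense pairing rarely requires a perfect match over the whole region.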
Figure 4. Assessment for protein-coding ability.
Comparison between Ka/Ks values of long ORFs derived from six-frame conceptual translation for human-mouse lncRNAs and orthologous neighboring protein-coding genes.
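Six-frame conceptual translation, the step that yields the long ORFs compared here, can be sketched as follows. This is a minimal illustration assuming the standard genetic code and an ORF defined as a stretch from a methionine to the next stop codon or the end of the sequence; the authors' exact ORF definition is not stated in this excerpt, and the Ka/Ks calculation itself is not shown.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def reverse_complement(seq):
    """Reverse complement of a DNA-alphabet sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def translate(seq):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )

def six_frame_translations(seq):
    """Translate three forward and three reverse-complement reading frames."""
    seq = seq.upper()
    frames = []
    for strand in (seq, reverse_complement(seq)):
        for offset in range(3):
            frames.append(translate(strand[offset:]))
    return frames

def longest_orf(seq):
    """Longest peptide starting at 'M' and ending before a stop or sequence end."""
    best = ""
    for protein in six_frame_translations(seq):
        for segment in protein.split("*"):
            start = segment.find("M")
            if start != -1 and len(segment) - start > len(best):
                best = segment[start:]
    return best
```

The longest ORF from each lncRNA and its ortholog could then be aligned codon by codon and passed to a Ka/Ks estimator, a step left out of this sketch.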