An update on LNCipedia: a database for annotated human lncRNA sequences.

Pieter-Jan Volders1, Kenneth Verheggen2, Gerben Menschaert3, Klaas Vandepoele4, Lennart Martens2, Jo Vandesompele1, Pieter Mestdagh5.   

Abstract

The human genome is pervasively transcribed, producing thousands of non-coding RNA transcripts. The majority of these transcripts are long non-coding RNAs (lncRNAs) and novel lncRNA genes are being identified at a rapid pace. To streamline these efforts, we created LNCipedia, an online repository of lncRNA transcripts and annotation. Here, we present LNCipedia 3.0 (http://www.lncipedia.org), the latest version of the publicly available human lncRNA database. Compared to the previous version of LNCipedia, the database grew over five times in size, gaining over 90,000 new lncRNA transcripts. Assessment of the protein-coding potential of LNCipedia entries is improved with state-of-the-art methods that include large-scale reprocessing of publicly available proteomics data. As a result, a high-confidence set of lncRNA transcripts with low coding potential is defined and made available for download. In addition, a tool to assess lncRNA gene conservation between human, mouse and zebrafish has been implemented.
© The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

Year:  2014        PMID: 25378313      PMCID: PMC4383901          DOI: 10.1093/nar/gku1060

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

Over the past decade long non-coding RNAs (lncRNAs) have emerged as a large class of functional non-coding RNAs (ncRNAs) (1). Defined as ncRNA transcripts longer than 200 nucleotides, lncRNAs have been shown to function mainly as transcriptional regulators by interaction with other biomolecules, such as proteins (2–4) and microRNAs (5). They are involved in a wide range of processes including cardiac development (6), dosage compensation (7,8) and cancer (2,9–10). Several specialist databases concerning lncRNA have been developed. Well-known examples are lncRNAdb, which focuses on lncRNAs with described functions (11), and NONCODE (12,13). In addition to these general lncRNA databases, databases that describe specific lncRNA subclasses have been compiled as well. LncRNAdisease contains lncRNAs with published disease associations (14) while lncRNAs targeted by microRNAs can be found in DIANA-LncBase (15). Distinguishing coding from ncRNA sequences is an important step, both in the ncRNA and the protein research field. Classic approaches are based on either open reading frame (ORF) length, ORF conservation or structural protein domains (16). Recent computational methods make use of more complex features or machine learning approaches. Notable examples are the Coding-Potential Calculator (CPC), Coding-Potential Assessment Tool (CPAT) and PhyloCSF. CPC utilizes a support vector machine trained on features that describe long, high-quality ORFs with sequence similarity (BLASTX) to known proteins (17). CPAT is a logistic regression model that only uses sequence-derived features, such as ORF size, codon and hexamer usage bias (18). In contrast to CPC and CPAT, PhyloCSF employs codon substitution frequencies in whole-genome multi-species alignments and maximum likelihood trees to distinguish between coding and non-coding loci (19). 
ORF length is either directly or indirectly used in all these computational prediction methods yet ORFs yielding short peptides (<100 amino acids) are difficult to predict. The discovery of functional peptides shorter than 100 amino acids, like the Drosophila gene tarsal-less (tal), thus raised the possibility that several lncRNAs are actually misclassified protein-coding genes encoding micropeptides (20,21). As small ORFs can also occur by chance in long transcripts, many well-described lncRNAs harbor non-functional ORFs (22). In addition to small ORFs, the in silico prediction of coding ORFs is further complicated by the existence of non-canonical (non-AUG) start codons (23). Experimental procedures to detect translated ORFs and their products have been developed as well. One such method is referred to as ribosome profiling and is based on deep sequencing of ribosome-protected mRNA fragments. Although many ncRNAs show ribosome occupancy, by using initiation-specific translation inhibitors in combination with ribosome profiling, researchers were able to map translation initiation sites (TIS) with base pair resolution and improve the detection of true ORFs (23,24). Other researchers were able to use the periodicity of ribosome movement on the mRNA to define actively translated ORFs (25). In addition to ribosome profiling, mass spectrometry has been applied in the search for novel peptides arising from lncRNAs (26,27). Several authors report small numbers of (micro) peptides arising from lncRNAs using either ribosome profiling or mass spectrometry. The debate on the putative function and total number of these peptides is still ongoing (26–28). Here, we report on LNCipedia 3.0, the latest version of our publicly available lncRNA database. In version 3.0, our major improvement is the evaluation of protein-coding potential with state-of-the-art algorithms and data sets. 
As such we have generated a high-confidence data set that excludes lncRNAs with possible protein-coding potential. In addition, a new tool to assess the conservation of lncRNA genes has been implemented. The database content has been updated and now contains over five times the number of transcripts compared to the first version.

MATERIALS AND METHODS

Locus conservation

The upstream and downstream protein-coding genes that flank a human lncRNA gene are queried in the public Ensembl (29) MySQL database (version 73). For both genes, the orthologs in mouse and zebrafish are obtained using the Ensembl Compara API (version 73). If any pair of orthologs are neighboring genes, the locus is reported as conserved.
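The neighbor test described above can be sketched as follows. This is a minimal illustration, not the Ensembl Compara API: the function name `locus_conserved`, the dictionary layout of `target_chromosomes`, and the definition of "neighboring" as adjacent positions in the protein-coding gene order of one chromosome are all assumptions made for the example.

```python
from typing import Dict, Iterable, List, Tuple


def locus_conserved(
    up_orthologs: Iterable[str],
    down_orthologs: Iterable[str],
    target_chromosomes: Dict[str, List[str]],
) -> bool:
    """Return True if any (upstream, downstream) ortholog pair is adjacent.

    target_chromosomes maps a chromosome name to its protein-coding genes
    in genomic order (hypothetical input; in practice this would be derived
    from the target species annotation).
    """
    # Index every gene by (chromosome, rank in the coding-gene order).
    position: Dict[str, Tuple[str, int]] = {}
    for chrom, genes in target_chromosomes.items():
        for rank, gene in enumerate(genes):
            position[gene] = (chrom, rank)

    down_list = list(down_orthologs)
    for up in up_orthologs:
        for down in down_list:
            if up not in position or down not in position:
                continue
            up_chrom, up_rank = position[up]
            down_chrom, down_rank = position[down]
            # Assumption: "neighboring" means same chromosome, adjacent rank,
            # in either orientation.
            if up_chrom == down_chrom and abs(up_rank - down_rank) == 1:
                return True
    return False
```

Checking every ortholog pair, rather than only one-to-one orthologs, mirrors the "any pair" wording of the criterion; a stricter definition would report fewer conserved loci.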

PhyloCSF

Whole-genome alignments of 46 species are obtained from the UCSC website (30) and processed using the PHAST (31) package (version 1.3) to obtain the required input format for PhyloCSF (19). To validate our workflow, we benchmarked PhyloCSF with transcripts annotated in Ensembl (version 75). Transcripts with biotype ‘lincRNA’ or ‘antisense’ (20 320 transcripts) serve as negative set while transcripts with biotype ‘protein_coding’ and an annotated coding sequence (36 959 transcripts) serve as positive set.

TIS

Ribosome profiling sequencing data of HEK-293 cells treated with cycloheximide (CHX) and lactimidomycin (LTM) were processed (24). Two technical replicates of both treatments were pooled (Bioproject http://www.ncbi.nlm.nih.gov/bioproject/PRJNA171327: runs SRR618770 and SRR618771 for CHX and runs SRR618772 and SRR618773 for LTM). The reads were first clipped to remove their 3′ cloning adaptor sequence using the FASTX-Toolkit (fastx_clipper tool). Unclipped and clipped reads shorter than 25 nt were discarded. The remaining reads were mapped using the RNA-seq STAR aligner (32), sequentially using indices based on the following sequences: (i) Phix genome (widely used as a quality control for Illumina sequencing runs), (ii) Homo sapiens rRNA (Refseq IDs NR_003285.2, NR_003286.1, NR_003287.1, NR_023363.1) and (iii) the human reference genome (downloaded from the igenomes repository http://support.illumina.com/sequencing/sequencing_software/igenome.ilmn, using the H. sapiens genome build GRCh37 and Ensembl annotation version 70). The human STAR index was built taking into account the splice site annotation from Ensembl. Only uniquely mapped reads that are between 28 and 35 nt long were retained. Footprint alignments were assigned to a specific P-site nucleotide based on the fragment length (the 5′ offset is set to respectively 12, 13 or 14 for profiles with length ≤ 30 nt, 31–33 nt, or ≥ 34 nt (23)).
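The length filter and P-site assignment described above can be written down compactly. The sketch below assumes an unspliced alignment given by a 0-based leftmost genomic coordinate, a read length and a strand; the function names and the coordinate convention are illustrative assumptions, and spliced reads would need the offset to be walked along the aligned blocks instead.

```python
from typing import Optional


def p_site_offset(length: int) -> int:
    """5' offset by footprint length: 12 (<= 30 nt), 13 (31-33 nt), 14 (>= 34 nt)."""
    if length <= 30:
        return 12
    if length <= 33:
        return 13
    return 14


def p_site_position(start: int, length: int, strand: str = "+") -> Optional[int]:
    """Return the P-site coordinate of a footprint, or None if it is filtered out.

    start is the 0-based leftmost genomic coordinate of the alignment
    (an assumed convention). Only footprints of 28-35 nt are retained.
    """
    if not 28 <= length <= 35:
        return None
    offset = p_site_offset(length)
    if strand == "+":
        # The 5' end of the read is the leftmost aligned base.
        return start + offset
    # On the minus strand the 5' end is the rightmost aligned base.
    return start + length - 1 - offset
```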

PRoteomics IDEntifications (PRIDE) reprocessing

The processing pipeline consists of three major modules. The first module is based on the PRIDE automated spectrum annotation pipeline (pride-asap) (33), and is used to reverse engineer the original search parameters from submitted data. The key parameters extracted by pride-asap in this stage are the allowable mass errors, the post-translational modifications (PTMs) to consider, and the enzyme used. Recent developments in this module have greatly improved the PTM inference by considering the modifications found in the PSI-mod (34) and Unimod (35) databases, as well as the frequency of occurrence of these modifications. Two thresholds are calculated based on this information, with the first one serving as a lower threshold to exclude very low abundance modifications while the second threshold is used to determine whether a sufficiently abundant modification is to be considered as either variable or fixed. A second development has been the impromptu determination of the protease used in the original experiment. Instead of assuming the use of trypsin, the pride-asap module now calculates the most likely enzyme based on all peptide sequences reported in PRIDE for that experiment. Overall, these updates to the module reduce the search space to consider, providing faster processing times and leaving less room for false-positive matches. The second module handles the peptide-to-spectrum matching, relying on SearchGUI (36) to automatically run multiple search engines in parallel; in this case OMSSA (37) and X!Tandem (38). SearchGUI is configured to use the target/decoy approach (39), in which both the original (target) sequence database and a reversed (decoy) version of that database are searched. Matches from the latter can then be used to determine a false discovery rate (FDR) (39). 
The third and final module uses PeptideShaker (http://peptide-shaker.googlecode.com) and the compomics-utilities library (40) to collect, process and analyze the results generated by SearchGUI.
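To make the target/decoy idea concrete, the sketch below reverses a protein sequence to build a decoy and estimates the FDR above a score threshold as the ratio of decoy to target matches. This ratio is one common simple estimator and is used here as an assumption; it is not claimed to be the calculation performed by SearchGUI or PeptideShaker. The function names and the `(score, is_decoy)` tuple layout are hypothetical.

```python
from typing import Iterable, Tuple


def make_decoy(sequence: str) -> str:
    """Build a decoy entry by reversing the target sequence."""
    return sequence[::-1]


def fdr_at_threshold(psms: Iterable[Tuple[float, bool]], threshold: float) -> float:
    """Estimate the FDR among target PSMs scoring at or above threshold.

    psms holds (score, is_decoy) pairs. The estimate is
    (#decoy matches) / (#target matches), a simple and widely used
    approximation; 0.0 is returned when no target passes the threshold.
    """
    targets = 0
    decoys = 0
    for score, is_decoy in psms:
        if score < threshold:
            continue
        if is_decoy:
            decoys += 1
        else:
            targets += 1
    return decoys / targets if targets else 0.0
```

Because the decoy database has the same size and composition as the target database, a random match is assumed to be equally likely to hit either, which is what justifies using the decoy count as a proxy for false target matches.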

RESULTS

LNCipedia 3.0 content

LNCipedia 1.0 (41) combined sequences and annotation from three different public resources, namely, Ensembl (29,42), Human body map lincRNAs (43) and the lncRNA database (11). In LNCipedia version 3.0, we have complemented these resources with four additional public data sets (Table 1). Two of these data sets are obtained from databases (44,45), and two from lncRNA research articles describing RNA sequencing workflows and reporting on novel lncRNAs (46,47). As with LNCipedia 1.0, redundant transcripts are merged into the same record. The result of this extension and integration of sources is that LNCipedia 3.0 represents a more than 5-fold increase in transcript content over version 1.0 (Figure 1). The majority of these transcripts (80%) are found in new loci and as such give rise to novel lncRNA genes.
Table 1.

Overview of data sources contributing to lncRNA content in LNCipedia 3.0

Source                                Version       Number of transcripts
Ensembl (42)                          75            23 498
Refseq (44)                           March 2014    6917
Nielsen et al. (46)                   -             7656
Hangauer et al. (47)                  -             5339
NONCODE (45)                          4             93 164
LNCipedia (41)                        1.0           21 504
Total number of unique transcripts                  113 513
Figure 1.

LNCipedia has grown substantially since its first release. The first version (41) was based on sequences and annotation from three different sources and was made available to the public in 2012. For the 2013 release of LNCipedia (unpublished), no additional sources were used, but the different sources were updated to the most recent version. For version 3.0 of LNCipedia, both new sources were added and existing sources were updated.

In LNCipedia 1.0 we introduced a universal lncRNA nomenclature to overcome the confusion caused by the use of different identifiers by different authors and databases. As was suggested by others, we named lncRNAs after neighboring protein-coding genes on the same strand (48). In LNCipedia 3.0, we hold true to this strategy. Existing genes are expanded when novel transcripts have overlapping exons and new genes are created when a transcript does not share exonic sequence with any existing gene. The identification of orthologous lncRNAs is an important step for animal modeling and functional research across species. Conservation of gene order is a straightforward metric often used in comparative genomics. We applied the concept of gene order conservation to determine the orthologous locus of a lncRNA in another species. Using the Ensembl Compara API, we have assessed the conservation in the order of the flanking protein-coding genes. Currently, orthologs for non-coding genes are not as well annotated as for protein-coding genes; flanking non-coding genes were therefore not taken into account. When the order is conserved in mouse or zebrafish we report the locus as conserved. In this way, we find locus conservation for 55% of the human lncRNA genes in mouse, and for 27% in zebrafish (Figure 2). The majority of the conserved loci in zebrafish are also conserved in mouse, as one would expect. While locus conservation is no proof for the functional conservation of the lncRNA itself, it may serve as a first step in finding the orthologous lncRNA.
Figure 2.

Many lncRNA loci are conserved in mouse or zebrafish. Locus conservation is a novel tool to determine the orthologous locus of a human lncRNA in another species. When the order of the flanking protein-coding genes is conserved in another species, the lncRNA locus is considered conserved. The majority of the conserved loci in zebrafish are also conserved in mouse; this fraction is depicted in gray.

Protein-coding potential

For collection of lncRNA transcript sequences, we rely on public data sets that are often contaminated with small numbers of transcripts harboring coding ORFs (25,26). While we already presented several measures to assess this problem (41), we further expanded these with state-of-the-art tools and included additional lncRNA transcript data sets. One such measure is the PhyloCSF (19) score. We have benchmarked PhyloCSF using Ensembl transcripts and we have determined 41 as an optimal threshold for the PhyloCSF score resulting in a precision of 95% and sensitivity of 91% (Supplemental Material and Figures). From the empirical cumulative distribution (Figure 3a) it is apparent that LNCipedia most likely contains a considerable fraction of protein-coding sequences. When applying our pre-computed cutoff, these transcripts add up to about 26% of the collection. Figure 3c shows the distribution of these putative coding transcripts among the different sources used for LNCipedia. It is clear that some lncRNA data sets suffer more from contamination of coding sequences than others. Strikingly, nearly 50% of Refseq annotated non-coding sequences are predicted to be coding according to the PhyloCSF score cutoff. It is no surprise that the lowest number of coding sequences is observed in Cabili et al. and Hangauer et al. as these studies applied PhyloCSF as a filter in their workflow.
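The benchmarking step can be illustrated with a small function that scores a cutoff against a positive (coding) and a negative (non-coding) set. The function name `benchmark_threshold` and the plain lists of scores are assumptions for the example; a transcript is called coding when its score is strictly greater than the threshold, following the "greater than 41" wording used for the cutoff.

```python
from typing import Sequence, Tuple


def benchmark_threshold(
    coding_scores: Sequence[float],
    noncoding_scores: Sequence[float],
    threshold: float = 41.0,
) -> Tuple[float, float]:
    """Return (precision, sensitivity) when calling score > threshold 'coding'.

    coding_scores: scores of the positive set (annotated protein-coding).
    noncoding_scores: scores of the negative set (annotated non-coding).
    """
    true_pos = sum(1 for s in coding_scores if s > threshold)
    false_neg = len(coding_scores) - true_pos
    false_pos = sum(1 for s in noncoding_scores if s > threshold)

    called = true_pos + false_pos
    precision = true_pos / called if called else float("nan")
    positives = true_pos + false_neg
    sensitivity = true_pos / positives if positives else float("nan")
    return precision, sensitivity
```

Sweeping `threshold` over a range of values and picking the point that balances the two metrics is one way to arrive at a cutoff; the selection criterion actually used for the reported value is described in the Supplemental Material, not here.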
Figure 3.

Different methods suggest contamination of coding sequences in lncRNA data sets. (a) PhyloCSF benchmarking and score distributions. We can observe a considerable difference between the score distributions of coding and non-coding transcripts in the Ensembl data set. In addition, while the great majority of LNCipedia is presumably non-coding, it also contains a fraction of transcripts with a PhyloCSF score in the coding range. (b) Transcripts with a TIS have a significantly higher PhyloCSF score (Mann–Whitney U test) compared to other transcripts. (c) Several public lncRNA resources suffer from considerable contamination with protein-coding sequences. The percentage of transcripts with PhyloCSF score greater than 41 is shown for the different sources in LNCipedia 3.0. Two sources already filtered with PhyloCSF are depicted in gray. In the case of RefSeq, only entries with property “biomol_ncrna_lncrna” were considered.

Another measure to assess protein-coding potential is the use of ribosome profiling to map TIS. When we map the TIS observed in HEK-293 (24) to LNCipedia entries, we find 4154 transcripts with at least one TIS. 
Of note, these transcripts have significantly higher PhyloCSF scores (Figure 3b), which is a good validation of both methods.

PRIDE

Similar to the rapid growth of LNCipedia, the submission of mass spectrometry data to the PRIDE repository has flourished as well (49). While these increased collections of lncRNAs and mass spectrometry data provide even more means to detect potentially coding lncRNAs, they also require much more compute power to process. The only way to analyze these data in a timely fashion is to make use of parallelization on a compute cluster or through grid computing (50). We have therefore set up such a grid environment based on dedicated hardware running a collection of Linux virtual machines, allowing us to re-analyze the full human complement of PRIDE in under a week. At the time of writing, the pipeline has been run on 2493 PRIDE experiments, containing 39 463 035 fragmentation mass spectra and covering all 68 annotated human tissues in the public repository. This resulted in a total of 8 064 657 peptide-to-spectrum matches (PSMs), of which 747 305 were matched to lncRNAs in LNCipedia (393 859 matched the target database and 353 446 matched the decoy database). Of these PSMs, 18 929 target sequences (representing 2040 transcripts, from 1770 genes) had an identification confidence higher than 90% (in contrast to only 2001 decoy sequences that had such a high confidence). Of note, the estimation of the FDR remains a complex issue in these very broad searches (51,52), and care should be taken to interpret these results. Indeed, as supplementary Figures S1 and S2 illustrate, while the confidence compares reasonably well with the estimated FDR, especially at higher confidences (higher than 90%), the evolution of the FDR toward the higher confidences is very different between the UniProtKB-SwissProt-derived identifications and the lncRNA matches. No significantly higher PhyloCSF score was found for transcripts containing PSMs with identification confidence higher than 90%. 
In addition, no significant overlap is observed between the set of transcripts identified in PRIDE and the sets containing TIS and smORFs. This observation illustrates the unique nature of the PRIDE analysis and strongly suggests its ability to detect coding potential not predicted by other methods.

HIGH-CONFIDENCE SET

Since LNCipedia contains a non-negligible number of putative coding transcripts, we propose a filtering strategy to create a stringent or high-confidence data set. Four groups of putative coding transcripts are removed (Figure 4, Supplementary Figure S3). The first group consists of 253 lncRNAs containing small ORFs (smORFs) (25). Bazzini et al. developed an approach to detect smORFs using ribosome profiling whereby the periodicity of ribosome movement on actively translated ORFs is used to distinguish coding from non-coding sequences. A second approach to apply ribosome profiling in the quest for novel coding RNAs has been described by Lee et al. (24). Using LTM, a ribosome inhibitor specific to initiating ribosomes, TIS were mapped in HEK-293 cells. Note that 4127 lncRNA transcripts containing at least one TIS are thus withdrawn. While these transcripts have a good chance to give rise to peptides, it is important to consider that a negative result does not guarantee the opposite. The transcript may not be expressed or translated in the sample. The next filtering step is based on PhyloCSF (19). As discussed earlier, this algorithm can distinguish between coding and non-coding sequences with high accuracy. As such, 27 293 transcripts with a PhyloCSF score higher than 41 are discarded. Finally, the 2040 PSM-containing transcripts from the PRIDE reprocessing pipeline are excluded as well. The resulting set of 80 216 transcripts (71% of LNCipedia 3.0) representing 48 028 genes (76%) is referred to as ‘high-confidence set’ and is available for download on the LNCipedia website.
Figure 4.

Transcripts with a likely coding potential are removed in the definition of a high-confidence set. Transcripts containing small ORFs (25), TIS (24), PhyloCSF score greater than 41 or PSMs with an identification confidence higher than 90% are excluded.
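The four exclusion criteria amount to a set difference, which can be sketched as below. The function name `high_confidence_set`, the argument layout and the transcript identifiers in the usage example are hypothetical; only the criteria themselves (smORF, TIS, PhyloCSF score greater than 41, PSM with confidence higher than 90%) come from the text.

```python
from typing import Dict, Iterable, Set


def high_confidence_set(
    all_transcripts: Iterable[str],
    smorf_transcripts: Iterable[str],
    tis_transcripts: Iterable[str],
    phylocsf_scores: Dict[str, float],
    psm_transcripts: Iterable[str],
    phylocsf_cutoff: float = 41.0,
) -> Set[str]:
    """Return transcripts that pass all four coding-potential filters.

    psm_transcripts is expected to already be restricted to transcripts
    with at least one PSM above the chosen identification confidence.
    Transcripts without a PhyloCSF score are kept (an assumption).
    """
    phylocsf_coding = {
        tid for tid, score in phylocsf_scores.items() if score > phylocsf_cutoff
    }
    # The union handles transcripts flagged by more than one method.
    excluded = (
        set(smorf_transcripts)
        | set(tis_transcripts)
        | phylocsf_coding
        | set(psm_transcripts)
    )
    return set(all_transcripts) - excluded


# Toy usage with made-up identifiers.
kept = high_confidence_set(
    all_transcripts=["t1", "t2", "t3", "t4", "t5"],
    smorf_transcripts=["t1"],
    tis_transcripts=["t2"],
    phylocsf_scores={"t2": 80.0, "t3": 55.5, "t4": -12.0},
    psm_transcripts=[],
)
```

Taking the union first means a transcript flagged by several methods is removed once, so the size of the final set is not simply the total minus the sum of the four group sizes.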

CONCLUSION AND FUTURE DIRECTION

With over 90 000 new transcripts, LNCipedia content increased 5-fold since its first publication in 2012. This makes it to our knowledge the largest publicly available human lncRNA resource. Furthermore, we improved the evaluation of coding potential with state-of-the-art algorithms, published data sets and an improved PRIDE reprocessing pipeline. In addition, we have developed a locus conservation analysis tool, which can aid in the search for lncRNA orthologs or prioritization of lncRNAs for animal studies. As in the previous years, LNCipedia will be updated when new lncRNA data sets are available. With the arrival of a new human reference genome (GRCh38), an important improvement to the database will be remapping chromosomal positions to this new reference genome. We will also continue to automatically run searches against the ever-growing contents of the PRIDE database on a routine basis. Furthermore, we will improve the specificity of the PRIDE searches by taking possible contamination from viral sequences into account. In conclusion, LNCipedia 3.0 provides significant improvements over the previous version in terms of data content and data annotation.

AVAILABILITY

LNCipedia 3.0 can be accessed through a web interface at www.lncipedia.org. Exports are available in FASTA, GFF, GTF or BED format for both the entire lncRNA collection and the high-confidence set. In addition, Integrative Genome Viewer (IGV) users have the option of loading an IGV-optimized data set directly in the application. As in version 1.0, the database can be queried by chromosomal position or (partial) sequence. We encourage the lncRNA research community to contribute to LNCipedia by submitting newly discovered lncRNAs and by adding PubMed literature records to existing entries using the web interface.

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online.
  51 in total

1.  The Ensembl genome database project.

Authors:  T Hubbard; D Barker; E Birney; G Cameron; Y Chen; L Clark; T Cox; J Cuff; V Curwen; T Down; R Durbin; E Eyras; J Gilbert; M Hammond; L Huminiecki; A Kasprzyk; H Lehvaslaiho; P Lijnzaad; C Melsopp; E Mongin; R Pettett; M Pocock; S Potter; A Rust; E Schmidt; S Searle; G Slater; J Smith; W Spooner; A Stabenau; J Stalker; E Stupka; A Ureta-Vidal; I Vastrik; M Clamp
Journal:  Nucleic Acids Res       Date:  2002-01-01       Impact factor: 16.971

2.  A method for assessing the statistical significance of mass spectrometry-based protein identifications using general scoring schemes.

Authors:  David Fenyö; Ronald C Beavis
Journal:  Anal Chem       Date:  2003-02-15       Impact factor: 6.986

3.  The UCSC Table Browser data retrieval tool.

Authors:  Donna Karolchik; Angela S Hinrichs; Terrence S Furey; Krishna M Roskin; Charles W Sugnet; David Haussler; W James Kent
Journal:  Nucleic Acids Res       Date:  2004-01-01       Impact factor: 16.971

4.  Unimod: Protein modifications for mass spectrometry.

Authors:  David M Creasy; John S Cottrell
Journal:  Proteomics       Date:  2004-06       Impact factor: 3.984

5.  Open mass spectrometry search algorithm.

Authors:  Lewis Y Geer; Sanford P Markey; Jeffrey A Kowalak; Lukas Wagner; Ming Xu; Dawn M Maynard; Xiaoyu Yang; Wenyao Shi; Stephen H Bryant
Journal:  J Proteome Res       Date:  2004 Sep-Oct       Impact factor: 4.466

6.  Characterization of HULC, a novel gene with striking up-regulation in hepatocellular carcinoma, as noncoding RNA.

Authors:  Katrin Panzitt; Marisa M O Tschernatsch; Christian Guelly; Tarek Moustafa; Martin Stradner; Heimo M Strohmaier; Charles R Buck; Helmut Denk; Renée Schroeder; Michael Trauner; Kurt Zatloukal
Journal:  Gastroenterology       Date:  2006-08-14       Impact factor: 22.682

7.  Requirement for Xist in X chromosome inactivation.

Authors:  G D Penny; G F Kay; S A Sheardown; S Rastan; N Brockdorff
Journal:  Nature       Date:  1996-01-11       Impact factor: 49.962

8.  Tsix, a gene antisense to Xist at the X-inactivation centre.

Authors:  J T Lee; L S Davidow; D Warshawsky
Journal:  Nat Genet       Date:  1999-04       Impact factor: 38.330

9.  NONCODE: an integrated knowledge database of non-coding RNAs.

Authors:  Changning Liu; Baoyan Bai; Geir Skogerbø; Lun Cai; Wei Deng; Yong Zhang; Dongbo Bu; Yi Zhao; Runsheng Chen
Journal:  Nucleic Acids Res       Date:  2005-01-01       Impact factor: 16.971

10.  The PRoteomics IDEntifications (PRIDE) database and associated tools: status in 2013.

Authors:  Juan Antonio Vizcaíno; Richard G Côté; Attila Csordas; José A Dianes; Antonio Fabregat; Joseph M Foster; Johannes Griss; Emanuele Alpi; Melih Birim; Javier Contell; Gavin O'Kelly; Andreas Schoenegger; David Ovelleiro; Yasset Pérez-Riverol; Florian Reisinger; Daniel Ríos; Rui Wang; Henning Hermjakob
Journal:  Nucleic Acids Res       Date:  2012-11-29       Impact factor: 16.971
