Dataset for regulation between lncRNAs and their nearby protein-coding genes in human cancers.

Zhi Liu, Juncheng Dai, Hongbing Shen.

Abstract

This article contains data related to the research article entitled "Systematic analysis reveals long noncoding RNAs regulating neighboring transcription factors in human cancers" (Liu et al., 2018, in press) [1]. Long noncoding RNAs (lncRNAs) are proposed to play essential roles in modulating the expression of nearby loci. In this study, we systematically investigated the relationship between lncRNAs and their neighboring genes based on the genomic location of genes and the transcriptome expression profiles of TCGA samples across 12 tumor types. Position conservation analysis was applied to find lncRNAs conserved by position across vertebrate species. Gene ontology enrichment analysis identified transcription factor (TF) genes as a specific type of protein-coding gene adjacent to highly positionally conserved lncRNAs. The expression correlation between lncRNAs and their adjacent TFs was assessed across tumors to define significantly co-expressed lncRNA-TF pairs, and a causal inference test (CIT) was used to infer the causal regulation of lncRNAs on their nearby TF genes. A list of candidate lncRNA/TF regulation pairs in tumors is provided.

Keywords:  Cancer; LncRNA; Transcription factors

Year:  2018        PMID: 30229065      PMCID: PMC6141275          DOI: 10.1016/j.dib.2018.06.048

Source DB:  PubMed          Journal:  Data Brief        ISSN: 2352-3409


Specifications Table

Value of the data

• The position conservation analysis of lncRNAs across species provides a reference for inferring the functionality of lncRNAs from the perspective of conservation.
• The significant adjacency between positionally conserved lncRNAs and TF genes provides clues for studying the mechanisms by which lncRNAs regulate gene expression.
• The provided list of candidate lncRNA/TF regulation pairs can be used for experimental validation to investigate the function of lncRNAs in tumors.

Data

GO enrichment of protein-coding genes near lncRNAs

GO terms enriched among protein-coding genes located within 1 Mb upstream or downstream of lncRNA loci are presented in Table S1.

Position conservation of lncRNAs

The presence or absence of syntenic counterparts of human lncRNAs in other vertebrate species is listed in Table S2. LncRNAs that have syntenic lncRNAs in at least four species were classified as highly conserved (HC) and used in the following analysis. In total, 769 lncRNA/TF pairs were classified as HC pairs (Table S3). The detailed results are described in [1].
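The HC rule above can be sketched in a few lines of Python. This is only an illustration: the input layout (a mapping from lncRNA ID to the set of species with a syntenic counterpart) and all identifiers are assumptions, not the format of Table S2.

```python
from typing import Dict, Set

MIN_SPECIES = 4  # threshold stated in the text: "at least four species"


def classify_highly_conserved(
    synteny: Dict[str, Set[str]], min_species: int = MIN_SPECIES
) -> Set[str]:
    """Return lncRNA IDs with syntenic counterparts in >= min_species species."""
    return {lnc for lnc, species in synteny.items() if len(species) >= min_species}


# Hypothetical toy input (placeholder IDs, not real data).
toy_synteny = {
    "lncA": {"mouse", "rat", "cow", "chicken"},
    "lncB": {"mouse", "rat"},
    "lncC": set(),
}
hc_lncrnas = classify_highly_conserved(toy_synteny)
print(sorted(hc_lncrnas))
```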

Co-expression between lncRNA and TF genes

Of the 769 HC lncRNA/TF pairs, 266 were significantly correlated in at least one tumor type, involving 159 TF genes and 253 lncRNAs (Table S4). Of those, 206 were consistently co-expressed in at least two tumor types.

Candidate lncRNA/TF regulation pairs

To prioritize the true lncRNA/TF regulatory pairs involved in tumors, we combined the results of the co-expression analysis (Table S4) and the CIT (Table S5) and took advantage of the pan-cancer dataset to define a confident list of pairs as those that passed both the co-expression test and the CIT in more than two tumor types. Finally, we provide a list of 28 lncRNA/TF regulation pairs (Table 1).
Table 1

Candidate lncRNA/TF regulation pairs in TCGA tumors.

lncRNA          TF gene   Tumor types
SENCR           ETS1      BLCA, BRCA, HNSC, KIRC, LUAD, LUSC, OV, STAD
RP11-290F20.2   CEBPB     BLCA, BRCA, HNSC, KIRC, LUSC, STAD
RP11-290F20.1   CEBPB     BLCA, BRCA, HNSC, KIRC, LUSC, STAD
PVT1            MYC       BRCA, HNSC, KIRC, LUSC, OV, STAD
KB-1732A1.1     KLF10     BLCA, BRCA, HNSC, KIRC, LUSC, OV
AF064858.8      ETS2      HNSC, KIRC, LUAD, LUSC, OV
AF064858.11     ETS2      HNSC, KIRC, LUAD, LUSC, OV
RP11-796E10.1   SP3       HNSC, LUAD, LUSC, STAD
RP11-57H14.4    TCF7L2    BRCA, LUAD, LUSC, OV
RP11-290F20.3   CEBPB     BRCA, LUAD, LUSC, STAD
LINC00511       SOX9      BRCA, KIRC, LUAD, STAD
CASC15          SOX4      BRCA, KIRC, LUSC, STAD
RP6-109B7.4     PPARA     BRCA, KIRC, OV
RP11-57A1.1     SOX9      KIRC, LUAD, STAD
RP11-567M16.1   NFATC1    HNSC, LUSC, OV
RP11-51B23.3    TEAD1     BLCA, BRCA, LUAD
RP11-472N13.3   ZEB1      BLCA, BRCA, STAD
RP11-439L18.2   HIVEP2    BRCA, KIRC, STAD
RP11-397A16.2   TCF4      HNSC, KIRC, LUSC
RP11-330O11.3   ZEB1      BLCA, LUAD, LUSC
RP11-221N13.3   HMGA2     BLCA, HNSC, LUSC
PITRM1-AS1      KLF6      BRCA, KIRC, LUAD
LINC01152       SOX9      BRCA, LIHC, LUSC
LINC00261       FOXA2     LUSC, OV, STAD
GATA6-AS1       GATA6     LUAD, LUSC, STAD
CTD-2532K18.2   MSX2      HNSC, KIRC, LUSC
CCAT1           MYC       HNSC, LUSC, STAD
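The selection rule for the confident list (passing both the co-expression test and the CIT in more than two tumor types) can be sketched as a set intersection per pair. The sketch assumes that "passed both" means both tests were passed in the same tumor type; the input layout and the placeholder identifiers are hypothetical and are not taken from Tables S4 or S5.

```python
from typing import Dict, List, Set, Tuple

Pair = Tuple[str, str]  # (lncRNA, TF gene)


def confident_pairs(
    coexp: Dict[Pair, Set[str]],
    cit: Dict[Pair, Set[str]],
    min_tumors: int = 3,  # "more than two tumor types"
) -> Dict[Pair, List[str]]:
    """Keep pairs that pass both tests in at least min_tumors tumor types.

    coexp / cit map each pair to the tumor types in which it passed the
    co-expression test / the CIT, respectively.
    """
    selected = {}
    for pair in coexp.keys() & cit.keys():
        shared = coexp[pair] & cit[pair]  # tumor types passing both tests
        if len(shared) >= min_tumors:
            selected[pair] = sorted(shared)
    return selected


# Hypothetical toy input with placeholder names.
toy_coexp = {
    ("lncX", "TF1"): {"BRCA", "KIRC", "LUAD", "OV"},
    ("lncY", "TF2"): {"BRCA", "KIRC", "STAD"},
}
toy_cit = {
    ("lncX", "TF1"): {"BRCA", "KIRC", "LUAD"},
    ("lncY", "TF2"): {"BRCA", "KIRC"},
}
selected = confident_pairs(toy_coexp, toy_cit)
print(selected)
```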

Experimental design, materials, and methods

Data and preprocessing

We downloaded TCGA lncRNA and coding gene expression data from the TANRIC database [2] (http://ibl.mdanderson.org/tanric/_design/basic/index.html) and the Broad Institute GDAC Firehose (http://gdac.broadinstitute.org), respectively. Only samples with paired lncRNA and mRNA expression profiles were used in this study. LncRNAs with RPKM >0.1 and coding genes with RPKM >1 in at least 5% of the samples in each tumor type were retained for the following analysis (Table 2).
Table 2

The number of tumor samples and expressed lncRNAs across tumors.

Tumor   No. of tumor samples   No. of expressed lncRNAs
BLCA    252                    5979
BRCA    837                    5941
COAD    157                    1612
HNSC    426                    5149
KIRC    448                    6183
KIRP    198                    5864
LIHC    200                    4969
LUAD    488                    6288
LUSC    220                    6206
OV      412                    6197
STAD    285                    6070
THCA    497                    5122
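The expression filter described in the preprocessing step (RPKM >0.1 for lncRNAs, RPKM >1 for coding genes, in at least 5% of samples per tumor type) might look like the following minimal sketch. The dictionary layout and gene names are illustrative assumptions; this is not the code used in the study.

```python
from typing import Dict, List, Sequence

LNCRNA_RPKM_MIN = 0.1   # threshold for lncRNAs, from the text
CODING_RPKM_MIN = 1.0   # threshold for coding genes, from the text
MIN_FRACTION = 0.05     # "at least 5% of the samples"


def is_expressed(
    rpkm: Sequence[float], threshold: float, min_fraction: float = MIN_FRACTION
) -> bool:
    """True if RPKM exceeds threshold in at least min_fraction of samples."""
    if not rpkm:
        return False
    n_above = sum(1 for value in rpkm if value > threshold)
    return n_above / len(rpkm) >= min_fraction


def filter_expressed(expr: Dict[str, List[float]], threshold: float) -> List[str]:
    """Return the gene IDs that pass the filter within one tumor type."""
    return [gene for gene, values in expr.items() if is_expressed(values, threshold)]


# Hypothetical toy profiles for one tumor type with 20 samples.
toy_lncrnas = {
    "lnc1": [0.0] * 18 + [0.5, 0.8],   # above 0.1 in 2 of 20 samples
    "lnc2": [0.05] * 20,               # never above 0.1
}
toy_coding = {
    "geneA": [2.0] * 20,               # above 1 in all samples
    "geneB": [0.5] * 19 + [1.0],       # never strictly above 1
}
kept_lncrnas = filter_expressed(toy_lncrnas, LNCRNA_RPKM_MIN)
kept_coding = filter_expressed(toy_coding, CODING_RPKM_MIN)
print(kept_lncrnas, kept_coding)
```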

Positional conservation of human lncRNAs across species

Annotations of protein-coding gene orthologs were obtained from EnsemblCompara [3], and lncRNA annotations in ten other species were downloaded from the NONCODE database [4]. To identify syntenic counterparts of human lncRNAs in other species, we used the method proposed by Hezroni et al. [5]. Briefly, when comparing the human genome (H) with the genome of species A, and when considering orthologous protein-coding genes G1 and G2, we first identified lncRNAs within nt of G1 in H and within nt of G2 in A. A lncRNA was considered to be "upstream" of the protein-coding gene when it overlapped the gene or ended 5′ to its 5′ end, and "downstream" when it overlapped the gene or started 3′ to its 3′ end. Two lncRNAs, L1 from H and L2 from A, were considered syntenic if they were both upstream or both downstream of G1 and G2, respectively, with the same relative orientations.
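The upstream/downstream and orientation rules can be sketched as follows. This is an illustrative reading of the description above, not the implementation of Hezroni et al.: the `Feature` type, the 1-based inclusive coordinates, and the example loci are assumptions, and the distance cutoff (whose value is not given in the text) is not applied.

```python
from dataclasses import dataclass
from typing import Set


@dataclass(frozen=True)
class Feature:
    """A gene or lncRNA locus; 1-based inclusive coordinates (assumed)."""
    start: int
    end: int
    strand: str  # "+" or "-"


def relations(lnc: Feature, gene: Feature) -> Set[str]:
    """Positions of a lncRNA relative to a gene, in the gene's 5'->3' frame."""
    if lnc.start <= gene.end and lnc.end >= gene.start:
        return {"upstream", "downstream"}  # overlap counts as both
    left_of_gene = lnc.end < gene.start
    if gene.strand == "+":
        return {"upstream"} if left_of_gene else {"downstream"}
    return {"downstream"} if left_of_gene else {"upstream"}


def same_orientation(lnc: Feature, gene: Feature) -> bool:
    """True if the lncRNA and the gene are on the same strand."""
    return lnc.strand == gene.strand


def is_syntenic(l1: Feature, g1: Feature, l2: Feature, g2: Feature) -> bool:
    """L1/G1 in one genome, L2/G2 (ortholog of G1) in the other."""
    shared_side = relations(l1, g1) & relations(l2, g2)
    return bool(shared_side) and same_orientation(l1, g1) == same_orientation(l2, g2)


# Hypothetical loci: both lncRNAs lie upstream of and antisense to their gene.
g1, l1 = Feature(1000, 2000, "+"), Feature(500, 800, "-")
g2, l2 = Feature(5000, 6000, "-"), Feature(6500, 7000, "+")
l3 = Feature(4000, 4500, "+")  # downstream of g2, so not syntenic with l1
print(is_syntenic(l1, g1, l2, g2), is_syntenic(l1, g1, l3, g2))
```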

Co-expression between lncRNA and their nearby TF genes

The Pearson correlation coefficient was used to analyze co-expression between lncRNAs and their nearby TF genes. Co-expressed gene pairs were defined as those with an absolute Pearson correlation coefficient ≥0.25 and an FDR-adjusted p-value ≤0.05.
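A minimal, standard-library sketch of this thresholding is given below. Two choices are assumptions rather than details from the text: the p-value is approximated with the Fisher z-transform (the text does not state how p-values were computed), and the FDR adjustment uses the Benjamini-Hochberg procedure. The profiles and pair names are hypothetical.

```python
import math
from statistics import NormalDist

R_MIN = 0.25  # absolute correlation threshold, from the text
Q_MAX = 0.05  # FDR threshold, from the text


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def approx_p_value(r, n):
    """Two-sided p-value via the Fisher z-transform (normal approximation)."""
    if abs(r) >= 1.0:
        return 0.0
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def coexpressed_pairs(profiles):
    """profiles maps (lncRNA, TF) -> (lncRNA values, TF values) in one tumor type."""
    pairs = list(profiles)
    rs = [pearson_r(*profiles[p]) for p in pairs]
    ps = [approx_p_value(r, len(profiles[p][0])) for p, r in zip(pairs, rs)]
    qs = bh_adjust(ps)
    return [p for p, r, q in zip(pairs, rs, qs) if abs(r) >= R_MIN and q <= Q_MAX]


# Hypothetical profiles over 30 samples: one correlated pair, one not.
x = [float(i) for i in range(30)]
noise = [1.0 if i % 2 == 0 else -1.0 for i in range(30)]
toy_profiles = {
    ("lncX", "TF1"): (x, [a + e for a, e in zip(x, noise)]),
    ("lncY", "TF2"): (x, noise),
}
hits = coexpressed_pairs(toy_profiles)
print(hits)
```

In practice, an exact t-distribution p-value from a statistics library would normally replace the normal approximation used here.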

Causal inference analysis of lncRNA/TF regulation

The lncRNA-TF-target regulation relationships were assessed using the causal inference test (CIT) [6] to test the regulation chain and to select the possible lncRNA-TF regulation pairs. Briefly, the CIT comprises statistical tests for four conditions, all of which must be met for the TF-mediated causal classification: (1) lncRNA and TF target are associated, (2) lncRNA is associated with TF after adjusting for TF target, (3) TF is associated with TF target after adjusting for lncRNA, and (4) lncRNA is independent of TF target after adjusting for TF. The CIT p-value was defined as the maximum of the component test p-values, and a multivariate linear regression was used in the four component tests. The targets of each TF were obtained from the TRRUST database [7], which collects transcriptional regulatory relationships unraveled by sentence-based text mining.
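The way the four component tests combine into a single CIT p-value can be illustrated as below. The sketch only shows the "maximum of the component p-values" step; how each component p-value is computed (in particular for the independence condition (4)) is not shown, and the class, field names, and numbers are hypothetical.

```python
from dataclasses import astuple, dataclass


@dataclass(frozen=True)
class CitComponents:
    """p-values of the four CIT component tests for one lncRNA/TF/target trio."""
    p_lnc_target: float            # (1) lncRNA associated with TF target
    p_lnc_tf_given_target: float   # (2) lncRNA ~ TF, adjusting for TF target
    p_tf_target_given_lnc: float   # (3) TF ~ TF target, adjusting for lncRNA
    p_indep_given_tf: float        # (4) lncRNA independent of target given TF


def cit_p_value(components: CitComponents) -> float:
    """Omnibus CIT p-value: the maximum of the four component p-values."""
    return max(astuple(components))


# Hypothetical component p-values (not results from the study).
toy = CitComponents(0.001, 0.01, 0.02, 0.03)
p_cit = cit_p_value(toy)
print(p_cit)
```

Taking the maximum means the trio is called causal at level α only if every one of the four conditions is supported at that level.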
Subject area: Biology
More specific subject area: Gene expression
Type of data: Tables
How data was acquired: Gene expression data derived from RNA-seq were downloaded from the TANRIC and TCGA databases.
Data format: Analyzed
Experimental factors: The expression of lncRNAs and protein-coding genes was extracted from the total expression profiles.
Experimental features: Position conservation analysis was conducted on lncRNAs across ten vertebrate species to find lncRNAs conserved by position. Co-expression analysis and the causal inference test were used to infer causal relationships between lncRNAs and their adjacent TF genes.
Data source location: N/A
Data accessibility: With this article
Related research article: Systematic analysis reveals long noncoding RNAs regulating neighboring transcription factors in human cancers. BBA Molecular Basis of Disease. (In press)

References

[1] Zhi Liu, Juncheng Dai, Hongbing Shen. Systematic analysis reveals long noncoding RNAs regulating neighboring transcription factors in human cancers. Biochim Biophys Acta Mol Basis Dis. 2018-06-01.

[2] Jun Li, Leng Han, Paul Roebuck, Lixia Diao, Lingxiang Liu, Yuan Yuan, John N Weinstein, Han Liang. TANRIC: An Interactive Open Platform to Explore the Function of lncRNAs in Cancer. Cancer Res. 2015-07-24.

[3] Javier Herrero, Matthieu Muffato, Kathryn Beal, Stephen Fitzgerald, Leo Gordon, Miguel Pignatelli, Albert J Vilella, Stephen M J Searle, Ridwan Amode, Simon Brent, William Spooner, Eugene Kulesha, Andrew Yates, Paul Flicek. Ensembl comparative genomics resources. Database (Oxford). 2016-05-02.

[4] Yi Zhao, Hui Li, Shuangsang Fang, Yue Kang, Wei Wu, Yajing Hao, Ziyang Li, Dechao Bu, Ninghui Sun, Michael Q Zhang, Runsheng Chen. NONCODE 2016: an informative and valuable data source of long non-coding RNAs. Nucleic Acids Res. 2015-11-19.

[5] Hadas Hezroni, David Koppstein, Matthew G Schwartz, Alexandra Avrutin, David P Bartel, Igor Ulitsky. Principles of long noncoding RNA evolution derived from direct comparison of transcriptomes in 17 species. Cell Rep. 2015-05-07.

[6] Joshua Millstein, Bin Zhang, Jun Zhu, Eric E Schadt. Disentangling molecular relationships with a causal inference test. BMC Genet. 2009-05-27.

[7] Heonjong Han, Jae-Won Cho, Sangyoung Lee, Ayoung Yun, Hyojin Kim, Dasom Bae, Sunmo Yang, Chan Yeong Kim, Muyoung Lee, Eunbeen Kim, Sungho Lee, Byunghee Kang, Dabin Jeong, Yaeji Kim, Hyeon-Nae Jeon, Haein Jung, Sunhwee Nam, Michael Chung, Jong-Hoon Kim, Insuk Lee. TRRUST v2: an expanded reference database of human and mouse transcriptional regulatory interactions. Nucleic Acids Res. 2018-01-04.