RNA-seq data of invasive ductal carcinoma and adjacent normal tissues from a Korean patient with breast cancer.

Ji Hyung Hong1, Yoon Ho Ko1,2, Keunsoo Kang3.   

Abstract

Invasive ductal carcinoma is the most common type of breast cancer. Here, we provide a whole transcriptome shotgun sequencing (RNA-seq) dataset generated from ten samples of invasive ductal carcinoma tissue and three samples of adjacent normal tissue from a single Korean breast cancer patient (luminal B subtype). Differentially expressed genes (DEGs) were identified using a false discovery rate (FDR)-adjusted p-value cutoff of 0.05. Gene ontology analysis identified several key pathways, including lymphocyte activation. A list of differentially expressed genes is provided. The raw data were uploaded to the Sequence Read Archive (SRA) database under BioProject ID PRJNA432903.

Keywords:  Breast cancer; Invasive ductal carcinoma; Korean; Luminal B subtype; RNA-seq

Year:  2018        PMID: 29900229      PMCID: PMC5996721          DOI: 10.1016/j.dib.2018.03.079

Source DB:  PubMed          Journal:  Data Brief        ISSN: 2352-3409


Value of the data

• These RNA-seq data provide deep sequencing of ten samples of invasive ductal carcinoma tissue and three samples of adjacent normal tissue from a Korean breast cancer patient (luminal B subtype).
• The heterogeneous expression data from spatially distinct tumor samples can be used for various evaluation purposes.
• Gene ontology analysis revealed that lymphocyte activation and the PPAR signaling pathway are significantly up- and down-regulated, respectively, in breast cancer tissue compared to adjacent normal tissue.

Data

Total RNA was extracted from ten samples of cancer tissue (invasive ductal carcinoma; luminal B subtype) and three samples of adjacent normal tissue from a Korean patient with breast cancer. RNA-seq was performed to profile transcriptomes of breast cancer and normal samples. Differentially expressed genes were identified with an FDR-adjusted p-value cutoff of 0.05. Gene ontology analysis indicated that several pathways are associated with the onset or progression of breast cancer.

Experimental design, materials and methods

RNA-seq

One sample of invasive ductal carcinoma (luminal B subtype) and the corresponding adjacent normal breast tissue were biopsied from a Korean woman with informed consent. This study was approved by the institutional review board of Catholic Medical Center (approval no. UC17TISI0015). The tumor and adjacent normal tissues were divided into ten and three samples, respectively. Poly(A) RNA was purified from 1 µg total RNA from each sample, and cDNA was synthesized using SuperScript II (Invitrogen). Sequencing libraries were prepared using the TruSeq RNA Library preparation kit (Illumina) and sequenced on a HiSeq 2500 (Illumina).

RNA-seq analysis

Sequenced reads were trimmed using Trim Galore (version 0.4.2; https://www.bioinformatics.babraham.ac.uk/projects/trim_galore/) with Cutadapt (version 1.1.2) [1]. Trimmed reads were mapped to the human reference genome (hg38) using STAR (version 2.5.2b) [2]. PCR duplicates were removed from the mapped reads using Sambamba (version 0.6.5) [3]. The quality of the RNA-seq data was assessed using the transcript integrity number (TIN) score computed by RSeQC (version 2.6.4) (Table 1) [4]. The abundances of RefSeq genes were estimated using the Cuffnorm function of Cufflinks (version 2.2.1) (Supplementary Table 1) [5].
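The trimming, mapping, duplicate-removal and QC steps above can be sketched as a list of command lines. This is a minimal sketch and not the authors' actual invocation: it assumes single-end reads, a hypothetical input file named <sample>.fastq.gz, a pre-built STAR index directory and a RefSeq BED file, none of which are specified in the text. The function only builds the argument lists; it does not run anything. Abundance estimation with Cuffnorm is not covered.

```python
def build_commands(sample, star_index, refseq_bed, outdir="out", threads=4):
    """Return the per-sample command lines as argument lists (nothing is executed).

    All file names below are assumptions made for illustration.
    """
    fastq = f"{sample}.fastq.gz"                     # hypothetical input name
    trimmed = f"{outdir}/{sample}_trimmed.fq.gz"     # Trim Galore default output name
    prefix = f"{outdir}/{sample}."
    bam = f"{prefix}Aligned.sortedByCoord.out.bam"   # STAR name for sorted BAM output
    dedup = f"{outdir}/{sample}.dedup.bam"
    return [
        # 1. Adapter and quality trimming (Trim Galore calls Cutadapt internally).
        ["trim_galore", "--output_dir", outdir, fastq],
        # 2. Mapping to hg38 with STAR, writing a coordinate-sorted BAM.
        ["STAR", "--runThreadN", str(threads), "--genomeDir", star_index,
         "--readFilesIn", trimmed, "--readFilesCommand", "zcat",
         "--outSAMtype", "BAM", "SortedByCoordinate",
         "--outFileNamePrefix", prefix],
        # 3. Remove (not just mark) PCR duplicates.
        ["sambamba", "markdup", "-r", bam, dedup],
        # 4. Index the BAM, which RSeQC needs for random access.
        ["sambamba", "index", dedup],
        # 5. Transcript integrity number (TIN) per transcript plus a summary.
        ["tin.py", "-i", dedup, "-r", refseq_bed],
    ]


for cmd in build_commands("C0", "star_hg38", "hg38_refseq.bed"):
    print(" ".join(cmd))
```

Each inner list can be passed to subprocess.run(cmd, check=True) once the paths point at real files.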
Table 1

RNA quality was measured using the transcript integrity number (TIN) score.

Cancer  TIN score (median)  Normal  TIN score (median)
C0      79.6                N1      79.3
C1      80.1                N2      65.6
C2      79.7                N3      67.8
C3      59.6
C4      79.1
C5      79.8
C6      80.6
C7      78.1
C8      80.1
C9      80.1
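The TIN values in Table 1 can be screened programmatically. The sketch below uses the medians reported in the table; the cutoff of 60 is a hypothetical value chosen only for illustration, since the text names C3 as the low-quality sample without stating a numeric threshold.

```python
# Median TIN scores copied from Table 1.
TIN_MEDIAN = {
    "C0": 79.6, "C1": 80.1, "C2": 79.7, "C3": 59.6, "C4": 79.1,
    "C5": 79.8, "C6": 80.6, "C7": 78.1, "C8": 80.1, "C9": 80.1,
    "N1": 79.3, "N2": 65.6, "N3": 67.8,
}


def flag_low_tin(scores, cutoff):
    """Return the sample names whose median TIN falls below the cutoff."""
    return sorted(name for name, tin in scores.items() if tin < cutoff)


# Hypothetical cutoff (an assumption, not taken from the source).
low_quality = flag_low_tin(TIN_MEDIAN, cutoff=60.0)
lowest = min(TIN_MEDIAN, key=TIN_MEDIAN.get)
print(low_quality, lowest)
```

With this cutoff only C3 is flagged. N2 and N3 also score below the other samples, so a stricter cutoff would remove two of the three normal samples.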

Identification of differentially expressed genes

Differentially expressed genes (DEGs) between cancer and normal samples were identified using the Cuffdiff function of Cufflinks (version 2.2.1) [5]. DEGs were defined as genes with FDR-adjusted p-values <0.05. A total of 2456 up-regulated and 2601 down-regulated genes were identified in cancer samples compared to adjacent normal samples (Supplementary Table 2). When the low-quality RNA-seq sample (C3) was excluded from the DEG analysis, a total of 3199 up-regulated and 3422 down-regulated genes were identified as DEGs (Fig. 1 and Supplementary Table 3).
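The FDR cutoff described above can be illustrated with a small worked example. The sketch assumes the Benjamini-Hochberg procedure for the FDR adjustment and uses the sign of the log2 fold change (cancer versus normal) to split DEGs into up- and down-regulated sets. The gene names and values are hypothetical, and the code is not Cuffdiff's own implementation.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that monotonicity is enforced.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def split_degs(genes, cutoff=0.05):
    """genes: list of (name, log2_fold_change, raw_p). Returns (up, down)."""
    qvalues = benjamini_hochberg([p for _, _, p in genes])
    up, down = [], []
    for (name, log2fc, _), q in zip(genes, qvalues):
        if q < cutoff:
            (up if log2fc > 0 else down).append(name)
    return up, down


# Hypothetical toy input: (gene, log2 fold change, raw p-value).
toy = [("A", 2.0, 0.001), ("B", -1.5, 0.002), ("C", 0.3, 0.8), ("D", -2.0, 0.03)]
up, down = split_degs(toy)
print(up, down)
```

In the toy input, gene D has a raw p-value of 0.03 and an adjusted value of 0.04, so it still passes the 0.05 cutoff; gene C does not.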
Fig. 1

Comparison of differentially expressed genes. Venn diagrams show the number of common and unique DEGs between different DEG analyses. DEG analyses were performed with or without the C3 sample.
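The counts behind such a Venn diagram reduce to set operations on the two DEG lists. A minimal sketch follows, with hypothetical gene identifiers standing in for the contents of Supplementary Tables 2 and 3.

```python
def venn_counts(with_c3, without_c3):
    """Return the sizes of the three regions of a two-set Venn diagram."""
    a, b = set(with_c3), set(without_c3)
    return {
        "only_with_c3": len(a - b),
        "common": len(a & b),
        "only_without_c3": len(b - a),
    }


# Hypothetical gene identifiers for illustration only.
degs_with_c3 = ["G1", "G2", "G3", "G4"]
degs_without_c3 = ["G2", "G3", "G4", "G5", "G6"]
counts = venn_counts(degs_with_c3, degs_without_c3)
print(counts)
```

Up-regulated and down-regulated genes would be compared separately, one call per direction.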

Gene ontology analysis

Gene ontology (GO) analysis was performed to identify key pathways associated with the DEGs that were identified without the C3 sample. The top 100 up-regulated (or down-regulated) DEGs that were highly expressed (average FPKM > 10) were analyzed using Metascape (http://metascape.org) [6]. The GO analysis revealed that the majority of up-regulated genes were significantly associated with lymphocyte activation and that some down-regulated genes were involved in the PPAR signaling pathway (Fig. 2).
Fig. 2

Gene ontology analysis showing altered pathways in breast cancer tissue compared to adjacent normal tissue. (A) Pathways significantly (p-value <0.05) associated with up- and down-regulated genes are shown. (B) The heatmap shows relative expression levels of the genes that are involved in lymphocyte activation and PPAR signaling pathway.
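The gene selection that feeds the GO analysis can be sketched as a filter followed by a ranking. The text does not say how the "top" DEGs were ranked, so ranking by log2 fold change is an assumption here, as are the record fields and the toy values.

```python
def select_top(degs, direction, n=100, min_fpkm=10.0):
    """Pick up to n DEGs in one direction with average FPKM above min_fpkm.

    degs: list of dicts with keys 'gene', 'log2fc' and 'mean_fpkm'
    (hypothetical field names). Ranking by fold change is an assumption.
    """
    if direction not in ("up", "down"):
        raise ValueError("direction must be 'up' or 'down'")
    sign = 1 if direction == "up" else -1
    kept = [d for d in degs if sign * d["log2fc"] > 0 and d["mean_fpkm"] > min_fpkm]
    kept.sort(key=lambda d: sign * d["log2fc"], reverse=True)
    return [d["gene"] for d in kept[:n]]


# Hypothetical toy records.
toy_degs = [
    {"gene": "g1", "log2fc": 3.0, "mean_fpkm": 50.0},
    {"gene": "g2", "log2fc": 5.0, "mean_fpkm": 2.0},   # dropped: low expression
    {"gene": "g3", "log2fc": 1.0, "mean_fpkm": 20.0},
    {"gene": "g4", "log2fc": -4.0, "mean_fpkm": 30.0},
    {"gene": "g5", "log2fc": -2.0, "mean_fpkm": 15.0},
]
top_up = select_top(toy_degs, "up", n=1)
top_down = select_top(toy_degs, "down", n=2)
print(top_up, top_down)
```

The resulting gene lists are what would be submitted to Metascape, one list per direction.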

Specifications Table

Subject area: Biology
More specific subject area: NGS, Transcriptomics, Cancer biology
Type of data: Transcriptome data
How data was acquired: High-throughput sequencing using Illumina HiSeq 2500
Data format: Raw (fastq)
Experimental factors: Breast cancer (invasive ductal carcinoma; luminal B subtype) and adjacent normal tissues
Experimental features: Poly(A) RNA was purified from 1 µg total RNA from each sample, and cDNA was synthesized using SuperScript II (Invitrogen). Sequencing libraries were prepared using the TruSeq RNA Library preparation kit (Illumina).
Data source location: Seoul, Republic of Korea
Data accessibility: Raw data can be accessed at NCBI SRA (BioProject ID: PRJNA432903) (https://www.ncbi.nlm.nih.gov/bioproject/PRJNA432903).

1.  RSeQC: quality control of RNA-seq experiments.

Authors:  Liguo Wang; Shengqin Wang; Wei Li
Journal:  Bioinformatics       Date:  2012-06-27       Impact factor: 6.937

2.  Sambamba: fast processing of NGS alignment formats.

Authors:  Artem Tarasov; Albert J Vilella; Edwin Cuppen; Isaac J Nijman; Pjotr Prins
Journal:  Bioinformatics       Date:  2015-02-19       Impact factor: 6.937

3.  STAR: ultrafast universal RNA-seq aligner.

Authors:  Alexander Dobin; Carrie A Davis; Felix Schlesinger; Jorg Drenkow; Chris Zaleski; Sonali Jha; Philippe Batut; Mark Chaisson; Thomas R Gingeras
Journal:  Bioinformatics       Date:  2012-10-25       Impact factor: 6.937

4.  Differential analysis of gene regulation at transcript resolution with RNA-seq.

Authors:  Cole Trapnell; David G Hendrickson; Martin Sauvageau; Loyal Goff; John L Rinn; Lior Pachter
Journal:  Nat Biotechnol       Date:  2012-12-09       Impact factor: 54.908

5.  Meta- and Orthogonal Integration of Influenza "OMICs" Data Defines a Role for UBR4 in Virus Budding.

Authors:  Shashank Tripathi; Marie O Pohl; Yingyao Zhou; Ariel Rodriguez-Frandsen; Guojun Wang; David A Stein; Hong M Moulton; Paul DeJesus; Jianwei Che; Lubbertus C F Mulder; Emilio Yángüez; Dario Andenmatten; Lars Pache; Balaji Manicassamy; Randy A Albrecht; Maria G Gonzalez; Quy Nguyen; Abraham Brass; Stephen Elledge; Michael White; Sagi Shapira; Nir Hacohen; Alexander Karlas; Thomas F Meyer; Michael Shales; Andre Gatorano; Jeffrey R Johnson; Gwen Jang; Tasha Johnson; Erik Verschueren; Doug Sanders; Nevan Krogan; Megan Shaw; Renate König; Silke Stertz; Adolfo García-Sastre; Sumit K Chanda
Journal:  Cell Host Microbe       Date:  2015-12-09       Impact factor: 21.023
