Anna Díez-Villanueva, Rebeca Sanz-Pamplona, Xavier Solé, David Cordero, Marta Crous-Bou, Elisabet Guinó, Adriana Lopez-Doriga, Antoni Berenguer, Susanna Aussó, Laia Paré-Brunet, Mireia Obón-Santacana, Ferran Moratalla-Navarro, Ramon Salazar, Xavier Sanjuan, Cristina Santos, Sebastiano Biondo, Virginia Diez-Obrero, Ainhoa Garcia-Serrano, Maria Henar Alonso, Robert Carreras-Torres, Adria Closa, Víctor Moreno.
Abstract
Colonomics is a multi-omics dataset of 250 samples: 50 from healthy colon mucosa donors and 100 tumor/adjacent-normal pairs (200 samples) from colon cancer patients. For these samples, the Colonomics project provides genotyping, DNA methylation, gene expression, whole exome sequencing and microRNA (miRNA) expression data, plus copy number variation (CNV) data for the tumoral samples. Clinical data are available for all samples. The aims of the project were to explore and integrate these datasets to describe colon cancer at the molecular level, to compare normal and tumoral tissues, and to improve screening by finding biomarkers for the diagnosis and prognosis of colon cancer. The project has its own website, with four browsers that let users explore the Colonomics datasets. Since the generated data could be reused by the scientific community for exploratory or validation purposes, here we describe the omics datasets included in the Colonomics project as well as results from the integration of multi-omics layers.
Year: 2022 PMID: 36182938 PMCID: PMC9526730 DOI: 10.1038/s41597-022-01697-5
Source DB: PubMed Journal: Sci Data ISSN: 2052-4463 Impact factor: 8.501
Fig. 1 Scheme of Colonomics data. Number of samples that remained after quality control for each type of data. M is healthy colon mucosa, T is tumoral tissue and N is normal tissue adjacent to tumor.
Clinical characteristics of the 250 samples.
| Sample type | n | Female | Male | Age min | Age 1Q | Age median | Age mean | Age 3Q | Age max | Colon Site Left | Colon Site Right |
|---|---|---|---|---|---|---|---|---|---|---|---|
| M | 50 | 23 | 27 | 25 | 52 | 63 | 62.5 | 74 | 88 | 23 | 27 |
| N | 100 | 28 | 72 | 43 | 65 | 71.5 | 70.7 | 78 | 87 | 61 | 39 |
| T | 100 | 28 | 72 | 43 | 65 | 71.5 | 70.7 | 78 | 87 | 61 | 39 |
| All | 250 | 79 | 171 | 25 | 64 | 70 | 69.0 | 77 | 88 | 145 | 105 |
M is healthy colon mucosae, T is tumoral tissue and N is normal tissue adjacent to tumor.
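The "All" row of the clinical table can be cross-checked against the per-group rows. The following is a minimal sketch, not part of the original work: the dictionary layout and function names are illustrative assumptions, and the numbers are transcribed by hand from the table. It sums the count columns over M, N and T and computes a sample-weighted mean age. Because N and T come from the same 100 patients, the "All" row counts samples, so each cancer patient contributes twice.

```python
# Values transcribed from the clinical characteristics table.
# M = healthy colon mucosa, N = normal tissue adjacent to tumor, T = tumor.
groups = {
    "M": {"n": 50, "female": 23, "male": 27, "left": 23, "right": 27, "age_mean": 62.5},
    "N": {"n": 100, "female": 28, "male": 72, "left": 61, "right": 39, "age_mean": 70.7},
    "T": {"n": 100, "female": 28, "male": 72, "left": 61, "right": 39, "age_mean": 70.7},
}
reported_all = {"n": 250, "female": 79, "male": 171, "left": 145, "right": 105, "age_mean": 69.0}

COUNT_FIELDS = ("n", "female", "male", "left", "right")


def pooled_counts(groups):
    """Sum each count column over all sample groups."""
    return {field: sum(g[field] for g in groups.values()) for field in COUNT_FIELDS}


def pooled_mean_age(groups):
    """Mean age over all samples, weighting each group mean by its sample count."""
    total = sum(g["n"] for g in groups.values())
    return sum(g["n"] * g["age_mean"] for g in groups.values()) / total


def internally_consistent(group):
    """Sex and colon-site splits should each add up to the group size."""
    return (group["female"] + group["male"] == group["n"]
            and group["left"] + group["right"] == group["n"])


counts = pooled_counts(groups)
mean_age = pooled_mean_age(groups)
mismatches = {f: (counts[f], reported_all[f]) for f in COUNT_FIELDS if counts[f] != reported_all[f]}

print("pooled counts:", counts)
print("count mismatches vs. reported 'All' row:", mismatches)
print(f"pooled mean age: {mean_age:.2f} (reported: {reported_all['age_mean']})")
```

The pooled mean is computed from group means that are already rounded to one decimal, so it should only be expected to agree with the reported value to within about 0.1 years.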
Data accession summary table.
| Data | Data type | Format | Elements | N samples | Accession (repository) | Accession 2 (DOI) |
|---|---|---|---|---|---|---|
| Expression | raw data | CEL files | 49,534 probes | 246 | GEO | — |
| Expression | normalized data (log2 RMA) | txt files | 49,386 probes | 246 | GEO/UB-DD | 10.34810/DATA169 |
| Expression | summary by gene (PC1) | txt file | 20,070 genes | 246 | UB-DD | 10.34810/DATA169 |
| Methylation | raw data | idat files | 485,512 CpGs | 240 | GEO | — |
| Methylation | probes annotations | txt file | 430,086 CpGs | — | UB-DD | 10.34810/DATA169 |
| Methylation | normalized data (betas) | txt file | 430,086 CpGs | 240 | GEO/UB-DD | 10.34810/DATA169 |
| SNPs | raw data | CEL files | 1 M SNPs | 146 | EGA | — |
| CNV | segment data | txt file | 54,002 segments | 99 | UB-DD | 10.34810/DATA169 |
| miRNAs | raw data | fastq file | probes | 250 | EGA | — |
| miRNAs | count data | txt file | 2,641 miRNAs | 250 | UB-DD | 10.34810/DATA169 |
| Whole Exome | raw data | fastq files (2 x sample) | reads | 84 | EGA | — |
| Somatic Mutations | mutation data | txt file | 13,015 mutations | 42 | UB-DD | 10.34810/DATA169 |
| Clinical data | clinical data | txt file | covariates | 250 | UB-DD | 10.34810/DATA169 |
| Expression Predictive Models | predictive model | txt/db files | gene/SNP info | — | ZENODO | 10.5281/zenodo.6334768 |
| Methylation Predictive Models | predictive model | txt/db files | CpG/SNP info | — | ZENODO | 10.5281/zenodo.6334768 |
| miRNAs Predictive Models | predictive model | txt/db files | miRNA/SNP info | — | ZENODO | 10.5281/zenodo.6334768 |
| MultiAssayCLX.Rdata | MultiAssayExperiment R object | Rdata | 49,534 expression probes/430,086 methylation CpGs/2,641 miRNAs/38,905 CNV segments/Clinical data | — | UB-DD | 10.34810/DATA169 |
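To see at a glance which repository holds which data layer, the accession table can be encoded as a small catalog. The sketch below is illustrative only: the tuple layout and function names are assumptions, and the entries are transcribed by hand from the table. The one external fact it relies on is that a DOI resolves through the standard `https://doi.org/` prefix. It does not download anything.

```python
from collections import defaultdict

DOI_RESOLVER = "https://doi.org/"

# (data layer, data type, accession, DOI or None), transcribed from the table.
CATALOG = [
    ("Expression", "raw data", "GEO", None),
    ("Expression", "normalized data (log2 RMA)", "GEO/UB-DD", "10.34810/DATA169"),
    ("Expression", "summary by gene (PC1)", "UB-DD", "10.34810/DATA169"),
    ("Methylation", "raw data", "GEO", None),
    ("Methylation", "probes annotations", "UB-DD", "10.34810/DATA169"),
    ("Methylation", "normalized data (betas)", "GEO/UB-DD", "10.34810/DATA169"),
    ("SNPs", "raw data", "EGA", None),
    ("CNV", "segment data", "UB-DD", "10.34810/DATA169"),
    ("miRNAs", "raw data", "EGA", None),
    ("miRNAs", "count data", "UB-DD", "10.34810/DATA169"),
    ("Whole Exome", "raw data", "EGA", None),
    ("Somatic Mutations", "mutation data", "UB-DD", "10.34810/DATA169"),
    ("Clinical data", "clinical data", "UB-DD", "10.34810/DATA169"),
    ("Expression Predictive Models", "predictive model", "ZENODO", "10.5281/zenodo.6334768"),
    ("Methylation Predictive Models", "predictive model", "ZENODO", "10.5281/zenodo.6334768"),
    ("miRNAs Predictive Models", "predictive model", "ZENODO", "10.5281/zenodo.6334768"),
    ("MultiAssayCLX.Rdata", "MultiAssayExperiment R object", "UB-DD", "10.34810/DATA169"),
]


def by_repository(catalog):
    """Group 'layer: data type' labels by repository; 'GEO/UB-DD' counts for both."""
    index = defaultdict(list)
    for layer, kind, accession, _doi in catalog:
        for repo in accession.split("/"):
            index[repo].append(f"{layer}: {kind}")
    return dict(index)


def distinct_doi_urls(catalog):
    """Resolver URLs for every distinct DOI listed in the catalog, sorted."""
    return sorted({DOI_RESOLVER + doi for *_, doi in catalog if doi})


repos = by_repository(CATALOG)
for repo, items in repos.items():
    print(f"{repo}: {len(items)} entries")
for url in distinct_doi_urls(CATALOG):
    print(url)
```

Grouping this way makes it easy to see that the raw sequencing and genotyping data sit in EGA, while the processed tables and the combined R object share a single UB-DD DOI.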

| Metadata field | Value |
|---|---|
| Measurement(s) | Colon Gene expression • Colon DNA methylation • Colon Genotyping data • Colon Copy Number Variation • Colon miRNAs • Colon Whole Exome Sequencing |
| Technology Type(s) | Affymetrix Human Genome U219 Array Plate platform • Illumina Infinium HumanMethylation 450k BeadChip • Affymetrix Genome-Wide Human SNP 6.0 array • Small RNA Assay of the Agilent 2100 Bioanalyzer • Agilent kit Sure Select XT Human All Exon 50MB |
| Factor Type(s) | batch • background |
| Sample Characteristic - Organism | Homo sapiens |
| Sample Characteristic - Environment | colonic mucosa |
| Sample Characteristic - Location | Catalonia Autonomous Community, Spain |