Victor Renault, Jörg Tost, Fabien Pichon, Shu-Fang Wang-Renault, Eric Letouzé, Sandrine Imbeaud, Jessica Zucman-Rossi, Jean-François Deleuze, Alexandre How-Kit.
Abstract
MOTIVATION: Copy number variations (CNVs) are net gains or losses of partial or whole chromosomal regions. They differ from copy-neutral loss of heterozygosity (cn-LOH) events, which do not induce any net change in copy number and are often associated with uniparental disomy. Both phenomena have long been reported to be associated with disease, particularly cancer. Losses and gains of genomic regions are often correlated with lower and higher gene expression, respectively. Loss of heterozygosity (LOH) and cn-LOH are also common events in cancer and may be associated with the loss of a functional tumor suppressor gene. Identifying recurrent CNV and cn-LOH events can therefore be important, as they may highlight common biological components and give insights into the development or mechanisms of a disease. However, no currently available tool allows a comprehensive whole-genome visualization of recurrent CNVs and cn-LOH in groups of samples with absolute quantification of the aberrations, which leads to the loss of potentially important information.
Year: 2017 PMID: 29261730 PMCID: PMC5736239 DOI: 10.1371/journal.pone.0189334
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 3. Overview of the different steps handled by aCNViewer.
aCNViewer can process Affymetrix and Illumina SNP arrays as well as NGS data. For SNP arrays, log R ratio (LRR) and B allele frequency (BAF) files are obtained by processing the raw data with PennCNV for Affymetrix or with a threshold quantile normalization (tQN) for Illumina, and ASCAT is then used for CNV and cn-LOH detection. For NGS data, paired tumoral and non-tumoral whole exome/genome sequencing BAM data are converted into seqz format and processed by Sequenza for CNV detection. aCNViewer converts the CNV data into a CNV matrix with a user-defined window size, which is subsequently used to compute dendrograms and heatmaps. Quantitative stacked histograms can be generated using the same matrix or a matrix of segments at base resolution (default behaviour). Text files are also available through GISTIC [33], providing a robust statistical way to select recurrent CNVs.
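The conversion of CNV segments into a windowed matrix can be illustrated with a minimal Python sketch. This is not aCNViewer's own code: the function name `build_cnv_matrix`, the segment tuple layout, and the rule that each window takes the copy number covering most of its bases are all assumptions made for illustration.

```python
from collections import defaultdict


def build_cnv_matrix(segments, chrom_lengths, window_size=2_000_000):
    """Bin CNV segments into fixed-size genomic windows (illustrative only).

    segments: iterable of (sample, chrom, start, end, copy_number),
              with 1-based inclusive coordinates (assumed layout).
    chrom_lengths: dict mapping chromosome name to its length in bp.
    Returns (windows, matrix): windows is a list of (chrom, start, end) and
    matrix[sample] holds one copy number per window (None if uncovered).
    """
    windows = [
        (chrom, start, min(start + window_size - 1, length))
        for chrom, length in chrom_lengths.items()
        for start in range(1, length + 1, window_size)
    ]
    # covered[sample][window index][copy number] = number of overlapping bases
    covered = defaultdict(lambda: [defaultdict(int) for _ in windows])
    for sample, chrom, seg_start, seg_end, copy_number in segments:
        for i, (w_chrom, w_start, w_end) in enumerate(windows):
            if w_chrom != chrom:
                continue
            bases = min(seg_end, w_end) - max(seg_start, w_start) + 1
            if bases > 0:
                covered[sample][i][copy_number] += bases
    # Assumption: a window takes the copy number that covers most of its bases.
    matrix = {
        sample: [max(c, key=c.get) if c else None for c in per_window]
        for sample, per_window in covered.items()
    }
    return windows, matrix


if __name__ == "__main__":
    toy_segments = [
        ("T1", "chr1", 1, 3_500_000, 3),
        ("T1", "chr1", 3_500_001, 5_000_000, 1),
    ]
    windows, matrix = build_cnv_matrix(toy_segments, {"chr1": 5_000_000})
    print(matrix["T1"])  # [3, 3, 1]
```

The 2,000,000 bp default mirrors the default WINDOW_SIZE listed in the options table. The nested loop is quadratic and only suitable for toy data.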
Fig 1. Quantitative stacked histograms using 96 HCC samples on Affymetrix 500K Human Mapping Array data from [1].
A) Frequency of CNV and cn-LOH events along the genome. The left axis indicates the frequency of gains or losses among the 96 samples and the legend below indicates the number of copy number gains or losses from the reference baseline. The black line indicates the frequency of cn-LOH along the genome in negative ordinates. B) Frequency of homozygous/heterozygous CNVs along the genome. Copy-neutral events / gains and losses are respectively displayed in positive and negative ordinates.
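The quantities plotted in such a stacked histogram can be sketched as follows. This is a hedged illustration, not the tool's implementation: `stacked_frequencies` is a hypothetical helper, and it assumes that the "reference baseline" of each sample is its ploidy and that the input is a per-sample list of copy numbers per genomic window.

```python
from collections import Counter


def stacked_frequencies(matrix, ploidies):
    """Per-window frequency of each copy-number change across samples.

    matrix: dict mapping sample to a list of copy numbers, one per window
            (None marks a window without data).
    ploidies: dict mapping sample to its baseline copy number (assumption:
              the baseline is the sample ploidy).
    Returns one dict per window mapping delta (copy number minus baseline,
    never 0) to the fraction of all samples carrying that change.
    """
    samples = list(matrix)
    n_windows = len(matrix[samples[0]])
    frequencies = []
    for i in range(n_windows):
        counts = Counter()
        for sample in samples:
            copy_number = matrix[sample][i]
            if copy_number is None:
                continue
            delta = copy_number - ploidies[sample]
            if delta != 0:
                counts[delta] += 1
        frequencies.append({d: n / len(samples) for d, n in counts.items()})
    return frequencies


if __name__ == "__main__":
    toy_matrix = {"A": [2, 3, 1], "B": [2, 4, 1], "C": [2, 2, 2], "D": [4, 5, 4]}
    toy_ploidies = {"A": 2, "B": 2, "C": 2, "D": 4}
    for window_freq in stacked_frequencies(toy_matrix, toy_ploidies):
        print(window_freq)
```

Positive deltas would be stacked above the axis as gains and negative deltas below it as losses, one colour per delta, which is what keeps the number of copies gained or lost visible rather than collapsing everything into "gain" or "loss".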
Fig 4. Hierarchical clustering of HCCs from [1] according to BCLC staging and based on CNVs.
A) Dendrogram representation. B) Bi-dimensional heatmap. A 2Mb window length is used for computation. The chromosomes of each window are shown on the right and the BCLC staging of each tumor is given on top of the bi-dimensional heatmap.
Fig 2. Quantitative stacked histograms produced by aCNViewer showing the frequency of CNVs and cn-LOH along the genome in HCCs.
Quantitative stacked histograms generated using A) 96 freely available HCC Affymetrix 500K Human Mapping Array data [1], B) 243 HCC WES experiments from [2] and C) 317 pooled HCCs from both SNP and WES experiment data.
aCNViewer main options.
| Option (default value) | Description |
|---|---|
| --plotAll (1) | Specify whether all available plots should be generated (values are 0 or 1). |
| --refBuild REF_BUILD | The genome build used to generate the CNV segments (hg18 and hg19 are currently supported). For a custom build, please check the GitHub website. |
| -w WINDOW_SIZE (2000000) / -p PERCENT | WINDOW_SIZE defines the window length in bp used to cut the genome in order to generate a matrix of CNV events. Alternatively, PERCENT can be used instead of WINDOW_SIZE to set the window size as a percentage of chromosome length, where PERCENT is a floating-point number between 0 and 100. |
| -t TARGET_DIR | Set the path of the output folder. |
| -b BIN_DIR | Set the path of the folder containing all required binaries. For a detailed description of the structure, please refer to ° |
| -f FILE_NAME | Path to the CNV file in PennCNV/ASCAT format. Sequenza results can also be processed; in that case, the option --fileType Sequenza should be added and FILE_NAME should point to the folder containing the Sequenza results. |
| --ploidyFile FILE_NAME / --useCustomPloidies USE_CUSTOM_PLOIDIES (1) | FILE_NAME can either be a tab-delimited file with at least 2 columns, "sample" and "ploidy", or an integer, which sets the same ploidy for all samples. By default (USE_CUSTOM_PLOIDIES is 1), the ploidy is calculated from the CNV file grouped into windows of 10% of chromosomal length and is set to the most represented CNV value for each sample. ASCAT/Sequenza ploidies can be used by leaving FILE_NAME to null and setting USE_CUSTOM_PLOIDIES to 0. |
| --runGISTIC (0) | Specify whether to run GISTIC in order to have a statistical way to prioritize regions of interest (values are 0 or 1). |
| --smallMem SMALL_MEM (0) | If SMALL_MEM is 1, GISTIC runs in small memory mode and only requires about 10GB of RAM, versus 50GB otherwise, at the expense of a longer running time. |
| --rColorFile FILE_NAME | file * |
| --outputFormat FORMAT | Customize the output formats for the different types of available plots (histograms, heatmaps and dendrograms). The default value is hist:png(width = 4000,height = 1800,res = 300);hetHom:png(width = 4000,height = 1800,res = 300);dend:png(width = 4000,height = 2200,res = 300);heat:pdf(width = 10,height = 12). For more information, please refer to ° |
| --lohToPlot LOH_TO_PLOT (cn-LOH) | Select which LOH values are added to the histogram. The value should be one of "cn-LOH" for plotting cn-LOH only, "LOH" for LOH only, "both" for cn-LOH and LOH, or "none" to disable this feature. |
| --useFullResolutionForHist (1) | Specify whether to plot the histogram at full (base) resolution, i.e. without grouping CNVs into windows of a user-defined length. If 0, the resolution of the plot is given by either WINDOW_SIZE (option -w) or PERCENT (option -p). |
| --useRelativeCopyNbForClustering (0) | Indicate whether the CNV matrix used for the heatmap should contain relative copy number values or raw copy numbers. |
| --keepGenomicPosForHistogram (0) | If set to 1, the genomic windows are kept in their original positions instead of being clustered according to sample CNV patterns. |
| --sampleFile SAMPLE_FILE | A tab-delimited file that should contain a column named Sample with the name of each sample and at least one other column with a phenotypic/clinical feature. This file can contain a sample alias, which will be used as the official sample id if provided. This parameter can be used for dendrograms as well. |
| -G FEATURE_NAME | The name of the column of the phenotypic/clinical feature of interest in SAMPLE_FILE, if specified. If this parameter is omitted, one plot per feature defined in SAMPLE_FILE will be generated. This file can contain a sample alias, which will be used as the official sample id if provided. This parameter can be used for dendrograms as well. |
* an example can be found at https://github.com/FJD-CEPH/aCNViewer/blob/master/img/rColor.txt
° for more information, please check the github website: https://github.com/FJD-CEPH/aCNViewer
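The default ploidy rule described for the ploidy options (windows of 10% of chromosomal length, then the most represented CNV value per sample) can be sketched as below. This is an interpretation of the table text under stated assumptions, not the tool's actual code: `window_bounds` and `estimate_ploidy` are hypothetical names, and each window is assumed to take the copy number covering most of its bases.

```python
from collections import Counter


def window_bounds(chrom_length, percent):
    """Split a chromosome into windows sized as a percentage of its length."""
    size = max(1, round(chrom_length * percent / 100))
    return [
        (start, min(start + size - 1, chrom_length))
        for start in range(1, chrom_length + 1, size)
    ]


def estimate_ploidy(segments, chrom_lengths, percent=10.0):
    """Return the most represented per-window copy number for one sample.

    segments: list of (chrom, start, end, copy_number) for a single sample,
              1-based inclusive coordinates (assumed layout).
    chrom_lengths: dict mapping chromosome name to its length in bp.
    """
    votes = Counter()
    for chrom, length in chrom_lengths.items():
        for w_start, w_end in window_bounds(length, percent):
            bases = Counter()
            for s_chrom, s_start, s_end, copy_number in segments:
                if s_chrom != chrom:
                    continue
                overlap = min(s_end, w_end) - max(s_start, w_start) + 1
                if overlap > 0:
                    bases[copy_number] += overlap
            if bases:
                # Assumption: the window takes the copy number covering most bases.
                votes[bases.most_common(1)[0][0]] += 1
    return votes.most_common(1)[0][0] if votes else None


if __name__ == "__main__":
    toy_lengths = {"chr1": 1000, "chr2": 500}
    toy_segments = [
        ("chr1", 1, 700, 3),
        ("chr1", 701, 1000, 2),
        ("chr2", 1, 500, 2),
    ]
    print(estimate_ploidy(toy_segments, toy_lengths))  # 2
```

Because the windows are a percentage of each chromosome's length, every chromosome contributes the same number of windows in this sketch, so a long chromosome does not dominate the vote.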