Vineet Jha, Gulzar Singh, Shiva Kumar, Amol Sonawane, Abhay Jere, Krishanpal Anamika.
Abstract
BACKGROUND: Interpretation of large-scale data is very challenging, and there is currently a scarcity of web tools that support automated visualization of a variety of high-throughput genomics and transcriptomics data for a wide variety of model organisms, along with user-defined karyotypes. A circular plot provides a holistic visualization of high-throughput, large-scale data, but it is complex and challenging to generate because most of the available tools need informatics expertise to install and run. RESULT: We have developed CGDV (Circos for Genomics and Transcriptomics Data Visualization), a web tool based on Circos, for seamless and automated visualization of a variety of large-scale genomics and transcriptomics data. CGDV takes the output of analyzed genomics or transcriptomics data in different formats, such as VCF, BED, XLS, tab-delimited matrix text file, CNVnator raw output and gene fusion raw output, and plots a circular view of the sample data. CGDV takes care of generating the intermediate files required by Circos. CGDV is freely available at https://cgdv-upload.persistent.co.in/cgdv/.
Keywords: Circular diagram; Genomics and transcriptomics data visualization; Visualization; Web circos
Year: 2017 PMID: 29065857 PMCID: PMC5655900 DOI: 10.1186/s12864-017-4169-5
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Comparison of CGDV with other available web tools
| Features | CIVI | ClicO FS | Circos Table Viewer | CGDV |
|---|---|---|---|---|
| Automated | ✓ | x | ✓ | ✓ |
| Manage raw output of various NGS data analysis tools | x | x | x | ✓ |
| Easy access to data and results | ✓ | ✓ | ✓ | ✓ |
| Prepackaged karyotype for multiple model organisms | x | x | ✓ | ✓ |
| Seamless upload and visualization of various genomics and transcriptomics data | x | x | x | ✓ |
| Generic data format support | x | ✓ | ✓ | ✓ |
Fig. 1 Various circular figures generated by CGDV for genomics and transcriptomics data. a This figure represents amplifications (orange dots) and deletions (black dots) from the raw output of the CNVnator tool. The size of the circles represents the relative size of the duplications and deletions at each location. b This figure represents data from a BED file. Each point represents the value per coordinate from a given sample. The black line represents the mean value of the data. c This figure represents the output of analyzed ChIPSeq data. The heatmap in the inner track represents the fold-enrichment value of the peaks. The outer track is a histogram displaying tags with p-value. d This figure represents homologous regions in the genome from BLAST output in tabular format. e This figure represents gene fusion events from the output of STAR-Fusion and/or FusionInspector. The tracks are heatmaps representing the Jffpm value (outer track) and the Sffpm value (inner track). The links mark the positions of gene fusion events between chromosomes. f This figure represents data from a Variant Call Format (VCF) file, which is the output of tools such as GATK (https://software.broadinstitute.org/gatk/) and SAMTools (http://samtools.sourceforge.net/). The innermost track represents the depth of the variations, and the middle track represents SNPs and INDELs as black and red dots, respectively. g This figure represents gene/isoform expression FPKM values from Cuffdiff output. Each gene/isoform FPKM value is plotted against the various conditions as dots. h This figure represents numerical data from a tab-delimited matrix.
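As an illustration of the kind of intermediate file behind panel f, the sketch below turns VCF records into two Circos-style data tracks, one for the variants and one for their depth. It is a minimal sketch and not CGDV's actual code: the function name `vcf_to_circos_tracks`, the `hs` chromosome prefix, the use of the `DP` key in the INFO column, and the 0/1 track values are all assumptions made for the example. The only Circos convention relied on is the plain-text data line `chr start end value [options]`.

```python
def vcf_to_circos_tracks(lines, chrom_prefix="hs"):
    """Convert VCF text lines into two lists of Circos data rows.

    Hypothetical helper for illustration only.
    Returns (variants, depths), each a list of 'chr start end value' strings.
    """
    variants, depths = [], []
    for line in lines:
        # Skip blank lines, meta-information (##...) and the header (#CHROM...).
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alt, info = (
            fields[0], int(fields[1]), fields[3], fields[4], fields[7]
        )

        # Assumption: the karyotype names chromosomes as prefix + number (hs1, hs2, ...).
        bare = chrom[3:] if chrom.startswith("chr") else chrom
        name = chrom_prefix + bare

        # Treat a record as a SNP when REF and every ALT allele is one base long;
        # everything else is counted as an INDEL (symbolic alleles are not handled).
        is_snp = len(ref) == 1 and all(len(a) == 1 for a in alt.split(","))
        end = pos + len(ref) - 1

        # Black for SNPs and red for INDELs, as in the figure caption.
        value, color = (0, "black") if is_snp else (1, "red")
        variants.append(f"{name} {pos} {end} {value} color={color}")

        # Assumption: read depth is stored under DP in the INFO column.
        info_map = dict(kv.split("=", 1) for kv in info.split(";") if "=" in kv)
        if "DP" in info_map:
            depths.append(f"{name} {pos} {end} {int(info_map['DP'])}")
    return variants, depths
```

Each returned list can be written to its own text file and referenced from a scatter or histogram plot block in a Circos configuration.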
CGDV supported data types, corresponding file formats and description of the plot
| S.No. | Data type | File format | Circular plot |
|---|---|---|---|
| 1 | VCF | vcf version 4.1 | SNP and InDel with their sequencing depth |
| 2 | CNVnator output | raw output of CNVnator | Amplification and deletion with their size |
| 3 | ChIPSeq | raw output from MACS in XLS format | Peaks and tag density with their |
| 4 | Gene fusion output | raw output from FusionInspector | Links between genes that are fused together, with color intensity based on the number of reads supporting each fusion event |
| 5 | Cuffdiff output | raw output from Cuffdiff | FPKM values per gene/isoform |
| 6 | BED | Extended BED with up to 12 data columns | Expression values per genome coordinate |
| 7 | Matrix-links | Data in a matrix format | Links between data in the row and column |
| 8 | BLAST output | BLAST output data in a tabular format (BLAST run with the -m 8 option) | Links between homologous sequences |
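To make the last row of the table concrete, the sketch below converts BLAST tabular output into Circos link lines. It is an illustrative sketch rather than CGDV's implementation: the function name `blast_tabular_to_links` and the `min_identity` filter are hypothetical, and it assumes the query and subject identifiers already match the chromosome names in the karyotype. It relies on the standard 12-column tabular layout (query, subject, percent identity, alignment length, mismatches, gap openings, query start, query end, subject start, subject end, e-value, bit score) and on the Circos link format `chr1 start1 end1 chr2 start2 end2`.

```python
def blast_tabular_to_links(lines, min_identity=0.0):
    """Convert BLAST tabular (12-column) lines into Circos link rows.

    Hypothetical helper for illustration only.
    """
    links = []
    for line in lines:
        # Skip blank lines and comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # not a complete tabular record

        query, subject = fields[0], fields[1]
        if float(fields[2]) < min_identity:
            continue  # drop weak hits (the threshold is an assumption)

        # Reverse-strand hits report start > end; store each span as low, high.
        q_start, q_end = sorted((int(fields[6]), int(fields[7])))
        s_start, s_end = sorted((int(fields[8]), int(fields[9])))
        links.append(f"{query} {q_start} {q_end} {subject} {s_start} {s_end}")
    return links
```

Sorting the coordinates discards strand information. A variant that needs to show inversions would have to carry the orientation separately.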