RCircos: an R package for Circos 2D track plots.

Hongen Zhang, Paul Meltzer, Sean Davis.

Abstract

BACKGROUND: Circos is a Perl-based software package for visualizing similarities and differences of genome structure and positional relationships between genomic intervals. Running Circos requires extra data processing to prepare plot data files and configuration files from datasets, which limits its ability to integrate directly with other software tools such as R. The recently published Bioconductor package ggbio provides a function to display genomic data in a circular layout, but it is built on multiple other packages, which increases the complexity of its use and reduces its flexibility in integrating with other R pipelines.
RESULTS: We implemented an R package, RCircos, using only packages that come with the base R installation. The package supports Circos 2D data track plots such as scatter, line, histogram, heatmap, tile, connectors, links, and text labels. Each plot is implemented with a specific function, and the input data for all functions are data frames, which can be read from text files or generated by other R pipelines.
CONCLUSION: The RCircos package provides a simple and flexible way to make Circos 2D track plots with R and can be easily integrated into other R data processing and graphic manipulation pipelines for presenting large-scale multi-sample genomic research data. It can also serve as a base tool to generate complex Circos images.

Year:  2013        PMID: 23937229      PMCID: PMC3765848          DOI: 10.1186/1471-2105-14-244

Source DB:  PubMed          Journal:  BMC Bioinformatics        ISSN: 1471-2105            Impact factor:   3.169


Background

Circos is a Perl-based software package for visualizing similarities and differences of genome structure and positional relationships between genomic intervals [1]. Although many tools for genomic data visualization have been developed [2-5], Circos is commonly used by the genome research community to present large-scale multi-sample genomic research data (http://circos.ca/in_literature/scientific/). While Circos is powerful and flexible in displaying genomic data, it requires extra data processing to prepare plot data files and configuration files from datasets, which limits its ability to integrate directly with other software tools such as R, one of the most commonly used toolsets for processing and statistical analysis of genomic data. Recently, Yin et al. [6] published a Bioconductor package, ggbio, that includes a function to display genomic data in a circular layout and covers many of the basic Circos-like plots. The ggbio package is powerful and offers some integration with other Bioconductor packages, but it is somewhat complex and relies on multiple other packages, including high-level plotting packages. To make Circos 2D track plots simple and flexible, we implemented an R package, RCircos, that relies only on base R graphics and R data structures. With RCircos, Circos 2D track plots can be generated easily and the procedures can be integrated effectively with other R pipelines, including graphics output manipulation.

Implementation

All packages used to build RCircos are included in the base R installation (http://www.r-project.org/), and graphics functionality is implemented with base R graphics. No other package is required unless the input data are stored in a special data structure, such as GenomicRanges objects, and need to be processed separately.

To reduce the complexity of usage, all functions in RCircos take a simple data frame as input. The first three columns of the data frame hold genomic position data in the order chromosome name, start position, and end position, followed by one or more data columns. The exception is link data, which requires paired chromosome positions in each row. A data frame is passed directly to the plot function without further processing. Sample data are included in the package to show the input data formats and can be listed with data(package = "RCircos").

We follow the layout paradigm set forth by Circos and arrange data plots by tracks. The core track is the chromosome ideogram track with highlighting and labels. Data plot tracks can be placed inside or outside the chromosome ideogram track. A set of parameters controls the plot pattern, such as chromosome width, number of base pairs per chromosome unit, track height, and point type. These parameters are initialized prior to plotting but can be customized to meet the requirements of different plot types.

RCircos is designed such that each type of Circos 2D track plot is drawn with a separate, dedicated function call. To make RCircos more flexible in integrating with other R pipelines, we chose low-level R plot functions, including points(), lines(), polygon(), and text(), to implement the graphic plot functions of RCircos. All RCircos plots are drawn on an existing plot, which makes it easy to customize them with standard R plot functionality.
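The input convention and the low-level drawing approach described above can be illustrated with a small base R sketch. Everything in it is made up for illustration: the column names, the values, the chromosome lengths, and the helper position_to_xy() are hypothetical and are not part of RCircos, and the drawing code shows only the general idea of adding points to an existing plot, not the package's actual implementation.

```r
# Hypothetical track data: chromosome name, start, end, then one data column.
track_data <- data.frame(
  Chromosome = c("chr1", "chr1", "chr2", "chr2"),
  chromStart = c(1000000, 5000000, 2000000, 8000000),
  chromEnd   = c(1500000, 5500000, 2500000, 8500000),
  Value      = c(0.8, -1.2, 0.3, 1.5),
  stringsAsFactors = FALSE
)

# Hypothetical link data: each row holds a pair of genomic positions.
link_data <- data.frame(
  Chromosome   = "chr1", chromStart   = 1000000, chromEnd   = 1500000,
  Chromosome.1 = "chr2", chromStart.1 = 8000000, chromEnd.1 = 8500000,
  stringsAsFactors = FALSE
)

# Made-up chromosome lengths, used only to lay positions out on a circle.
chrom_lengths <- c(chr1 = 10000000, chr2 = 10000000)

# Map a genomic position to x/y coordinates on a circle of a given radius.
# Positions start at 12 o'clock and run clockwise.
position_to_xy <- function(chrom, pos, radius, lengths = chrom_lengths) {
  offsets <- c(0, head(cumsum(lengths), -1))
  names(offsets) <- names(lengths)
  fraction <- unname((offsets[chrom] + pos) / sum(lengths))
  angle <- pi / 2 - 2 * pi * fraction
  data.frame(x = radius * cos(angle), y = radius * sin(angle))
}

# Draw a circle outline, then add data points with low-level functions.
draw_sketch <- function() {
  plot.new()
  plot.window(xlim = c(-1.2, 1.2), ylim = c(-1.2, 1.2), asp = 1)
  theta <- seq(0, 2 * pi, length.out = 361)
  lines(cos(theta), sin(theta))
  mid <- (track_data$chromStart + track_data$chromEnd) / 2
  xy <- position_to_xy(track_data$Chromosome, mid, radius = 0.8)
  points(xy$x, xy$y, pch = 16,
         col = ifelse(track_data$Value > 0, "red", "blue"))
  invisible(xy)
}

pdf(NULL)              # null device, so the sketch runs without creating a file
xy <- draw_sketch()
invisible(dev.off())
```

Because the points are added to a plot that already exists, any further base graphics call (for example text() or legend()) could follow in the same way.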

Results and discussion

RCircos implements most Circos 2D track plots, including scatter, line, histogram, heatmap, tile, connector, link, and text label plots. We use the chromosome ideogram tables from the UCSC genome browser to generate chromosome ideogram images. Human, mouse, and rat ideograms are currently available in RCircos, and other species can be supported if the relevant ideogram table is provided in the same format as the cytoBandIdeo table in the UCSC genome browser [7]. A set of demos and a complete vignette are included in the package to show the RCircos plot procedures for each Circos 2D track plot type. Figure 1 was generated with built-in datasets and default parameters and shows the human chromosome ideogram track along with data tracks for connectors, gene labels, heatmap, scatter plot, line plot, histogram, tiles, and link lines.
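A plot with the same set of tracks might be produced along the following lines. This is a sketch and not the authors' original listing: it assumes the function and dataset names used in the RCircos vignette, the track numbers and data columns are illustrative choices, and argument names may differ between package versions. The code is skipped if RCircos is not installed.

```r
# Sketch only: names follow the RCircos vignette and may vary by version.
rcircos_available <- requireNamespace("RCircos", quietly = TRUE)
out_file <- file.path(tempdir(), "RCircos_tracks_sketch.pdf")

if (rcircos_available) {
  library(RCircos)

  # Load the bundled human ideogram table and the sample track datasets.
  data(UCSC.HG19.Human.CytoBandIdeogram)
  data(RCircos.Gene.Label.Data)
  data(RCircos.Heatmap.Data)
  data(RCircos.Scatter.Data)
  data(RCircos.Line.Data)
  data(RCircos.Histogram.Data)
  data(RCircos.Tile.Data)
  data(RCircos.Link.Data)

  # Positional arguments: ideogram table, chromosomes to exclude,
  # number of tracks inside, number of tracks outside the ideogram.
  RCircos.Set.Core.Components(UCSC.HG19.Human.CytoBandIdeogram, NULL, 10, 0)

  pdf(out_file, height = 8, width = 8)
  RCircos.Set.Plot.Area()
  RCircos.Chromosome.Ideogram.Plot()

  # One dedicated function call per track type.
  RCircos.Gene.Connector.Plot(RCircos.Gene.Label.Data, track.num = 1, side = "in")
  RCircos.Gene.Name.Plot(RCircos.Gene.Label.Data, name.col = 4,
                         track.num = 2, side = "in")
  RCircos.Heatmap.Plot(RCircos.Heatmap.Data, data.col = 6,
                       track.num = 5, side = "in")
  RCircos.Scatter.Plot(RCircos.Scatter.Data, data.col = 5,
                       track.num = 6, side = "in", by.fold = 1)
  RCircos.Line.Plot(RCircos.Line.Data, data.col = 5, track.num = 7, side = "in")
  RCircos.Histogram.Plot(RCircos.Histogram.Data, data.col = 4,
                         track.num = 8, side = "in")
  RCircos.Tile.Plot(RCircos.Tile.Data, track.num = 9, side = "in")
  RCircos.Link.Plot(RCircos.Link.Data, track.num = 11, by.chromosome = TRUE)
  invisible(dev.off())
} else {
  message("RCircos is not installed; skipping the plot.")
}
```

The order of calls mirrors the layout paradigm: core components and plot area first, then the ideogram, then one call per data track.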
Figure 1

RCircos image showing human chromosome ideogram with data tracks for connectors, gene labels, heatmap, scatter plot, line plot, histogram, tiles, and link lines.

Since we implemented the RCircos plots with base R graphics, combining RCircos with other R plot functions is straightforward. Figure 2 shows a heatmap generated with demo("R.Circos.Demo.Mouse.And.Rat"), using blue and red colors to compare gene expression between mouse and rat (GEO data accession number: GSE42081), together with link lines between the top 50 highly expressed genes in mouse and the same genes in rat. The legend and color key for the heatmap were added with the legend() and image() functions.
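How a legend and a color key can be layered onto an existing circular plot is sketched here with base R only. The circle stands in for an RCircos plot that would already be on the device; the labels, coordinates, palette size, and the helper add_legend_and_key() are hypothetical choices, not values taken from the package demo.

```r
# Blue-to-red palette for the color key (20 steps is an arbitrary choice).
key_colors <- colorRampPalette(c("blue", "white", "red"))(20)

add_legend_and_key <- function() {
  # Placeholder for a circular plot that is already on the device.
  plot.new()
  plot.window(xlim = c(-2, 2), ylim = c(-2, 2), asp = 1)
  theta <- seq(0, 2 * pi, length.out = 361)
  lines(cos(theta), sin(theta))
  title("Placeholder circular plot")

  # Legend with made-up labels, added with legend().
  legend("topright", legend = c("low", "high"),
         fill = c("blue", "red"), bty = "n")

  # Color key added with image(): one row of cells in the lower-right corner.
  key_x <- seq(0.8, 1.8, length.out = length(key_colors) + 1)  # cell boundaries
  key_y <- c(-1.9, -1.7)                                       # cell boundaries
  key_z <- matrix(seq_along(key_colors), ncol = 1)
  image(key_x, key_y, key_z, col = key_colors, add = TRUE)
  text(c(0.8, 1.8), -1.98, labels = c("low", "high"), cex = 0.7)
  invisible(key_z)
}

pdf(NULL)              # null device, so the sketch runs without creating a file
key <- add_legend_and_key()
invisible(dev.off())
```

The add = TRUE argument is what lets image() draw into the current plot instead of starting a new one.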
Figure 2

Combination of RCircos plot and other R graphics plot. Mouse and rat chromosome ideograms, heatmaps, and link lines are drawn with RCircos with two input datasets. Title, legend, and color key are added with function calls of R graphics package.

Conclusions

The RCircos package provides simple and flexible functionality to generate Circos 2D track plots with R and can be easily integrated into other R data processing and graphic manipulation pipelines to present large-scale multi-sample genomic research data.

Availability and requirements

The RCircos package and its source code can be downloaded and installed from the CRAN website (http://www.r-project.org) under the GPL (>= 2) license.

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

HZ designed and implemented the software package and wrote the manuscript. SD participated in the software design and drafted the manuscript. PM revised the manuscript critically. All authors read and approved the final manuscript.

References

1.  Circos: an information aesthetic for comparative genomics.

Authors:  Martin Krzywinski; Jacqueline Schein; Inanç Birol; Joseph Connors; Randy Gascoyne; Doug Horsman; Steven J Jones; Marco A Marra
Journal:  Genome Res       Date:  2009-06-18       Impact factor: 9.043

Review 2.  Visualizing genomes: techniques and challenges.

Authors:  Cydney B Nielsen; Michael Cantor; Inna Dubchak; David Gordon; Ting Wang
Journal:  Nat Methods       Date:  2010-02-25       Impact factor: 28.547

3.  Scribl: an HTML5 Canvas-based graphics library for visualizing genomic data over the web.

Authors:  Chase A Miller; Jon Anthony; Michelle M Meyer; Gabor Marth
Journal:  Bioinformatics       Date:  2012-11-19       Impact factor: 6.937

4.  MGAviewer: a desktop visualization tool for analysis of metagenomics alignment data.

Authors:  Zhengwei Zhu; Beifang Niu; Jing Chen; Sitao Wu; Shulei Sun; Weizhong Li
Journal:  Bioinformatics       Date:  2012-10-08       Impact factor: 6.937

5.  ggbio: an R package for extending the grammar of graphics for genomic data.

Authors:  Tengfei Yin; Dianne Cook; Michael Lawrence
Journal:  Genome Biol       Date:  2012-08-31       Impact factor: 13.583

6.  The UCSC genome browser and associated tools.

Authors:  Robert M Kuhn; David Haussler; W James Kent
Journal:  Brief Bioinform       Date:  2012-08-20       Impact factor: 11.622

Review 7.  Visualizing multidimensional cancer genomics data.

Authors:  Michael P Schroeder; Abel Gonzalez-Perez; Nuria Lopez-Bigas
Journal:  Genome Med       Date:  2013-01-31       Impact factor: 11.117
