
The CGView Server: a comparative genomics tool for circular genomes.

Jason R Grant, Paul Stothard.

Abstract

The CGView Server generates graphical maps of circular genomes that show sequence features, base composition plots, analysis results and sequence similarity plots. Sequences can be supplied in raw, FASTA, GenBank or EMBL format. Additional feature or analysis information can be submitted in the form of GFF (General Feature Format) files. The server uses BLAST to compare the primary sequence to up to three comparison genomes or sequence sets. The BLAST results and feature information are converted to a graphical map showing the entire sequence, or an expanded and more detailed view of a region of interest. Several options are included to control which types of features are displayed and how the features are drawn. The CGView Server can be used to visualize features associated with any bacterial, plasmid, chloroplast or mitochondrial genome, and can aid in the identification of conserved genome segments, instances of horizontal gene transfer, and differences in gene copy number. Because a collection of sequences can be used in place of a comparison genome, maps can also be used to visualize regions of a known genome covered by newly obtained sequence reads. The CGView Server can be accessed at http://stothard.afns.ualberta.ca/cgview_server/

Year:  2008        PMID: 18411202      PMCID: PMC2447734          DOI: 10.1093/nar/gkn179

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

Despite continual advances in sequence analysis and annotation programs, manual visualization of sequence characteristics remains an important part of understanding gene structure, function and evolution (1). For many fully sequenced genomes, web-based genome browsers offer graphical maps that are integrated with underlying databases of sequences, annotations and analyses (2–5). Genome browsers allow the simultaneous display of the genome sequence together with numerous annotation tracks, such as known genes, predicted genes, ESTs, mRNAs and contigs. In addition, genome browsers provide a window into comparative genomics by displaying similarity information, obtained using a variety of searching and alignment approaches. In cases where a particular genome sequence is not yet available online, comparisons can be performed using more specialized tools. For example, PipMaker (6) and ACT (7) can be used to visualize the similarity between user-supplied sequences, and offer more flexibility than genome browsers in terms of how sequences are compared. PipMaker is a web server that generates a percent identity plot (pip), which shows the position and percent identity of gap-free alignment segments. Feature information can be included in the graphical output, by supplying an optional features file. ACT (Artemis Comparison Tool) is a stand-alone Java program that can be used in conjunction with BLAST to compare two DNA sequences. When supplied with a BLAST results file (the user must perform the BLAST comparison separately), ACT connects regions of similarity between the sequences using coloured lines. These lines can reveal which segments of the genomes are conserved, and can highlight differences in genome organization, such as changes in gene order, or gene duplications. If GenBank or EMBL files are used as the input for ACT, the features described in the files are displayed along with the BLAST results. 
Although PipMaker and ACT can accept sequences from any source species, neither generates the circular maps that are popular for visualizing bacterial and organellar genomes. Several programs for creating circular maps are available, including CGView (8), GenomePlot (9), GenoMap (10) and the Microbial Genome Viewer (11). Here we describe the CGView Server, which represents our efforts to integrate many of the capabilities of PipMaker, ACT and BLAST with CGView. The CGView Server generates graphical maps that can be used to visualize sequence conservation in the context of sequence features, imported analysis results, open reading frames and base composition plots. Publication-quality customizable maps can be generated, showing the full sequence, or a more detailed view of a region of interest. Sample maps and data sets further illustrating applications of the CGView Server are available at http://stothard.afns.ualberta.ca/cgview_server/

PROGRAM DESCRIPTION

Data is submitted to the CGView Server via a simple web interface. The minimum information required to obtain a map is a DNA sequence and an email address. Four formats for the sequence are accepted: raw, FASTA, GenBank and EMBL. If either of the latter two formats is used, gene annotations in the file will appear on the map. An email address is required, since the map, which may take several minutes to generate, is returned as an email attachment. All fields in the submission form include a context-sensitive help icon, which can be used to access a description of the options available or the information required.
Additional feature information pertaining to the primary DNA sequence can be supplied in the form of a GFF (General Feature Format) file (http://www.sanger.ac.uk/Software/formats/GFF/). GFF is a format for describing genes and other features associated with nucleic acid and protein sequences. This ‘features’ file can be used to supply gene positions for inclusion on the map that are not given in the primary sequence file. If the GFF file contains single-letter COG functional categories in the ‘feature’ column, the CGView Server will colour the features according to COG category (12). Alternatively, the features can be coloured according to gene type (CDS, tRNA, rRNA or other). GFF files are available from several analysis programs, or they can be assembled manually in spreadsheet programs like Excel.
Quantitative measurements can be added to the map using a second ‘analysis’ GFF file. This file can be used to visualize scores or measurements arising from analysis programs, or from laboratory experiments.
In addition to the required primary DNA sequence, up to three comparison sequences can be provided. These can be in raw, FASTA or multi-FASTA format. The multi-FASTA format allows a collection of sequences to be used for a single comparison.
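As an illustration of the ‘features’ GFF file described above, the following Python sketch assembles a small tab-delimited file programmatically rather than in a spreadsheet. It assumes the standard nine-column GFF layout (seqname, source, feature, start, end, score, strand, frame, attribute); the exact columns the CGView Server requires are not stated here, and the sequence name, gene names and coordinates are hypothetical.

```python
import csv
import io


def make_gff_row(seqname, feature, start, end, strand, name,
                 source="manual", score=".", frame="."):
    """Return one GFF record as a list of nine column values.

    Assumes the standard nine-column GFF layout. Coordinates are 1-based
    and inclusive; features spanning the origin are not handled here.
    """
    if start < 1 or end < start:
        raise ValueError("GFF coordinates are 1-based with start <= end")
    if strand not in ("+", "-", "."):
        raise ValueError("strand must be '+', '-' or '.'")
    return [seqname, source, feature, str(start), str(end),
            str(score), strand, str(frame), name]


def write_gff(rows, handle):
    """Write the rows as tab-delimited text, one feature per line."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)


# Hypothetical features on a hypothetical plasmid. "J" and "L" are
# single-letter COG categories placed in the 'feature' column, as the
# text describes; "tRNA" is a gene type used instead of a COG category.
rows = [
    make_gff_row("plasmid1", "J", 120, 1010, "+", "geneA"),
    make_gff_row("plasmid1", "L", 1500, 2300, "-", "geneB"),
    make_gff_row("plasmid1", "tRNA", 2450, 2525, "+", "tRNA-Ala"),
]

buffer = io.StringIO()
write_gff(rows, buffer)
gff_text = buffer.getvalue()
print(gff_text)
```

An ‘analysis’ GFF file could be produced the same way by filling the score column with the measurement to be plotted.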
Potential collections for a multi-FASTA comparison include all the members of a protein family, or the set of proteins encoded by a particular bacterial genome. For each comparison sequence there is a set of options for specifying the search type and search parameters. These allow searches to be conducted at the DNA or protein level, and hits to be filtered based on significance (e-value), alignment length and percent identity.
The final section of the CGView Server interface provides options for controlling the display of features calculated directly from the primary sequence (GC content, GC skew, ORFs, start and stop codons), and for adjusting the organization and appearance of the map. For example, BLAST hits can be arranged according to the reading frame of the query (for tblastx and blastx searches). This capability can be useful for identifying which ORFs in an overlapping group are conserved. BLAST hits can also be drawn with partial opacity such that regions of the primary sequence producing multiple overlapping hits can easily be identified. Other options include the ability to draw a zoomed view of the map, feature labels, a feature legend and a title.
Data submitted to the CGView Server enters an analysis queue. A Perl program checks the queue periodically, and processes jobs sequentially. Processing begins with the formatdb program (included with BLAST), which is used to convert any comparison sequences into BLAST databases. The primary sequence, serving as the query, is first split into smaller sub-sequences of a user-defined size before calling standalone BLAST. The primary sequence file, BLAST results, GFF files and user options are passed to another Perl script, which builds an XML file for the CGView map-drawing program (8). CGView generates a PNG image, and the image and a description of the submitted files and settings are emailed to the user.
The maps generated by the CGView Server consist of concentric feature rings (Figure 1).
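The query-splitting and hit-filtering steps described above can be sketched in Python as follows. This is not the server's Perl code: it assumes non-overlapping sub-sequences, and it does not address hits that span a sub-sequence boundary or the origin, since the text does not say how those are handled. The `Hit` class, the function names and the threshold values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One BLAST hit, with coordinates relative to a query sub-sequence."""
    chunk_offset: int        # 0-based position of the sub-sequence in the full query
    q_start: int             # 1-based start within the sub-sequence
    q_end: int               # 1-based end within the sub-sequence
    alignment_length: int
    percent_identity: float
    evalue: float


def split_query(sequence, chunk_size):
    """Yield (offset, sub-sequence) pairs that cover the whole query."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    for offset in range(0, len(sequence), chunk_size):
        yield offset, sequence[offset:offset + chunk_size]


def to_full_coordinates(hit):
    """Map a hit's sub-sequence coordinates back onto the full query."""
    return hit.chunk_offset + hit.q_start, hit.chunk_offset + hit.q_end


def keep_hit(hit, max_evalue, min_length, min_identity):
    """Apply the three filters named in the text."""
    return (hit.evalue <= max_evalue
            and hit.alignment_length >= min_length
            and hit.percent_identity >= min_identity)


# A 20-base toy query split into sub-sequences of 8 bases.
chunks = list(split_query("ACGT" * 5, 8))
offsets = [offset for offset, _ in chunks]

# Hypothetical hits, as if parsed from BLAST output for two sub-sequences.
hits = [
    Hit(chunk_offset=8, q_start=2, q_end=7, alignment_length=6,
        percent_identity=95.0, evalue=1e-10),
    Hit(chunk_offset=16, q_start=1, q_end=4, alignment_length=4,
        percent_identity=60.0, evalue=0.5),
]

kept = [to_full_coordinates(h) for h in hits
        if keep_hit(h, max_evalue=1e-5, min_length=5, min_identity=80.0)]
print(offsets, kept)
```

Recording the offset of each sub-sequence is what lets a hit reported against a sub-sequence be drawn at the correct position on the full map.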
Depending on the selected settings, the concentric feature rings are used to display gene information read from the primary sequence file, features or analysis results from the GFF files, base composition plots, ORFs, start and stop codons, and BLAST results (Figure 2). Features are coloured according to type, and in some cases the height of the feature is adjusted to reflect its properties. BLAST hits, for example, are drawn with a height that is proportional to the percent identity of the hit. Similarly, score values are used to determine the height of features in the analysis GFF file. An optional legend can be used to identify all features based on colour. Labels can be drawn for features read from the primary sequence record or ‘features’ GFF file. A sequence ruler, drawn inside the innermost feature ring, allows the approximate positions of features to be determined.
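The base composition plots mentioned above (GC content and GC skew) are usually computed over windows along the sequence. The Python sketch below uses the conventional definitions, GC content = (G + C) / window length and GC skew = (G − C) / (G + C), and lets windows wrap around the origin because the sequence is circular. The window and step sizes, and the function name, are assumptions; the formula and window settings the server actually uses are not given here.

```python
def window_composition(sequence, window, step):
    """Return (midpoint, gc_content, gc_skew) tuples for a circular sequence.

    midpoint is the 1-based position of the window centre. Windows that
    run past the end of the sequence wrap around to the start.
    """
    seq = sequence.upper()
    n = len(seq)
    if n == 0 or not 0 < window <= n or step < 1:
        raise ValueError("need a non-empty sequence, 0 < window <= length, step >= 1")
    results = []
    for start in range(0, n, step):
        end = start + window
        if end <= n:
            chunk = seq[start:end]
        else:
            # Wrap around the origin of the circular sequence.
            chunk = seq[start:] + seq[:end - n]
        g = chunk.count("G")
        c = chunk.count("C")
        gc_content = (g + c) / window
        # Skew is undefined when the window has no G or C; report 0.0.
        gc_skew = (g - c) / (g + c) if (g + c) else 0.0
        midpoint = (start + window // 2) % n + 1
        results.append((midpoint, gc_content, gc_skew))
    return results


# Toy 10-base sequence, two non-overlapping windows of 5 bases.
profile = window_composition("GGGGCCAATT", window=5, step=5)
for midpoint, gc_content, gc_skew in profile:
    print(midpoint, gc_content, gc_skew)
```

On a map, each value would set the height of the plot at the window midpoint, in the same way that percent identity sets the height of a BLAST hit.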
Figure 1.

Sample output from the CGView Server. (A) Comparison of a mitochondrial genome with three other genomes using blastx. (B) Visualizing analysis scores for features of a plasmid. (C) Comparison of a bacterial genome with reads from a 454 sequencer using blastn. (D) Visualizing features, ORFs, start and stop codons of a bacterial genome and comparing the sequence with proteins encoded by three other bacteria.

Figure 2.

Example of a zoomed map produced by the CGView Server. A 40× zoomed view of the sequence depicted in Figure 1D, centered on base 110 000. The contents of the feature rings (starting with the outermost ring) are as follows. Ring 1: forward strand features read from the primary sequence GenBank file. Rings 2,3,4: forward strand ORFs in reading frames 3,2,1. Rings 5,6,7: forward strand start and stop codons in reading frames 3,2,1. Rings 8,9,10: reverse strand start and stop codons in reading frames 1,2,3. Rings 11,12,13: reverse strand ORFs in reading frames 1,2,3. Ring 14: reverse strand features read from the primary sequence GenBank file. Rings 15,16,17,18,19,20: BLAST hits obtained from blastx search of bacterial genome 1 proteins, in which the query was translated in reading frames 3,2,1,−1,−2,−3. Rings 21,22,23,24,25,26: BLAST hits obtained from blastx search of bacterial genome 2 proteins, in which the query was translated in reading frames 3,2,1,−1,−2,−3. Rings 27,28,29,30,31,32: BLAST hits obtained from blastx search of bacterial genome 3 proteins, in which the query was translated in reading frames 3,2,1,−1,−2,−3.

CONCLUSION

The CGView Server is a comparative genomics tool for circular genomes (plasmid, bacterial, mitochondrial and chloroplast) that allows sequence feature information to be visualized in the context of sequence analysis results and sequence similarity plots. The server seamlessly integrates several sequence analysis procedures and tools with the CGView genome visualization program. The server accepts a variety of commonly used data formats, and generates high-quality, fully labelled graphical maps. One drawback of the CGView Server compared to standalone tools like ACT is that the server returns static images. Although these images are suitable for publication, ACT may be more useful for in-depth exploration of sequences and BLAST results. To partially overcome the limitations of providing static images, the CGView Server includes an option for generating zoomed maps. Another limitation for some users may be the inability of the CGView Server to generate more conventional linear maps. The web-based Microbial Genome Viewer can be used to generate circular or linear maps, and may be more appropriate for some users. Despite these limitations, maps generated by the CGView Server can be used to aid in the identification of conserved or diverged genome segments, instances of horizontal gene transfer, and differences in gene copy number. Because a collection of sequences can be used in place of a comparison genome, maps can be used to identify sequences that are part of a particular family, or to visualize regions of a known genome covered by newly obtained sequence reads. Sample maps and data sets further illustrating applications of the CGView Server are available at http://stothard.afns.ualberta.ca/cgview_server/

REFERENCES (12 in total)

1.  PipMaker--a web server for aligning two genomic DNA sequences.

Authors:  S Schwartz; Z Zhang; K A Frazer; A Smit; C Riemer; J Bouck; R Gibbs; R Hardison; W Miller
Journal:  Genome Res       Date:  2000-04       Impact factor: 9.043

2.  The generic genome browser: a building block for a model organism system database.

Authors:  Lincoln D Stein; Christopher Mungall; ShengQiang Shu; Michael Caudy; Marco Mangone; Allen Day; Elizabeth Nickerson; Jason E Stajich; Todd W Harris; Adrian Arva; Suzanna Lewis
Journal:  Genome Res       Date:  2002-10       Impact factor: 9.043

3.  GenoMap, a circular genome data viewer.

Authors:  Naoki Sato; Shigeki Ehira
Journal:  Bioinformatics       Date:  2003-08-12       Impact factor: 6.937

4.  Visualization for genomics: the Microbial Genome Viewer.

Authors:  Robert Kerkhoven; Frank H J van Enckevort; Jos Boekhorst; Douwe Molenaar; Roland J Siezen
Journal:  Bioinformatics       Date:  2004-02-26       Impact factor: 6.937

5.  Circular genome visualization and exploration using CGView.

Authors:  Paul Stothard; David S Wishart
Journal:  Bioinformatics       Date:  2004-10-12       Impact factor: 6.937

6.  ACT: the Artemis Comparison Tool.

Authors:  Tim J Carver; Kim M Rutherford; Matthew Berriman; Marie-Adele Rajandream; Barclay G Barrell; Julian Parkhill
Journal:  Bioinformatics       Date:  2005-06-23       Impact factor: 6.937

Review 7.  Automated bacterial genome analysis and annotation.

Authors:  Paul Stothard; David S Wishart
Journal:  Curr Opin Microbiol       Date:  2006-08-22       Impact factor: 7.934

8.  The UCSC Genome Browser Database: 2008 update.

Authors:  D Karolchik; R M Kuhn; R Baertsch; G P Barber; H Clawson; M Diekhans; B Giardine; R A Harte; A S Hinrichs; F Hsu; K M Kober; W Miller; J S Pedersen; A Pohl; B J Raney; B Rhead; K R Rosenbloom; K E Smith; M Stanke; A Thakkapallayil; H Trumbower; T Wang; A S Zweig; D Haussler; W J Kent
Journal:  Nucleic Acids Res       Date:  2007-12-17       Impact factor: 16.971

9.  Database resources of the National Center for Biotechnology Information.

Authors:  David L Wheeler; Tanya Barrett; Dennis A Benson; Stephen H Bryant; Kathi Canese; Vyacheslav Chetvernin; Deanna M Church; Michael Dicuccio; Ron Edgar; Scott Federhen; Michael Feolo; Lewis Y Geer; Wolfgang Helmberg; Yuri Kapustin; Oleg Khovayko; David Landsman; David J Lipman; Thomas L Madden; Donna R Maglott; Vadim Miller; James Ostell; Kim D Pruitt; Gregory D Schuler; Martin Shumway; Edwin Sequeira; Steven T Sherry; Karl Sirotkin; Alexandre Souvorov; Grigory Starchenko; Roman L Tatusov; Tatiana A Tatusova; Lukas Wagner; Eugene Yaschenko
Journal:  Nucleic Acids Res       Date:  2007-11-27       Impact factor: 16.971

10.  The COG database: an updated version includes eukaryotes.

Authors:  Roman L Tatusov; Natalie D Fedorova; John D Jackson; Aviva R Jacobs; Boris Kiryutin; Eugene V Koonin; Dmitri M Krylov; Raja Mazumder; Sergei L Mekhedov; Anastasia N Nikolskaya; B Sridhar Rao; Sergei Smirnov; Alexander V Sverdlov; Sona Vasudevan; Yuri I Wolf; Jodie J Yin; Darren A Natale
Journal:  BMC Bioinformatics       Date:  2003-09-11       Impact factor: 3.169

