Yan Pantoja, Kenny Pinheiro, Allan Veras, Fabrício Araújo, Ailton Lopes de Sousa, Luis Carlos Guimarães, Artur Silva, Rommel T J Ramos.
Abstract
With the increased production of genomic data since the advent of next-generation sequencing (NGS), there has been a need to develop new bioinformatics tools and areas, such as comparative genomics. In comparative genomics, the genetic material of one organism is directly compared to that of another to better understand biological species. Moreover, the exponentially growing number of deposited prokaryote genomes has enabled the investigation of several genomic characteristics that are intrinsic to certain species. Thus, a new approach to comparative genomics, termed pan-genomics, was developed. In pan-genomics, various organisms of the same species or genus are compared. Currently, many tools can perform pan-genomic analyses, such as PGAP (Pan-Genome Analysis Pipeline), Panseq (Pan-Genome Sequence Analysis Program) and PGAT (Prokaryotic Genome Analysis Tool). Among these tools, PGAP was developed in the Perl scripting language; its reliance on UNIX terminals and its extensive parameterized command line can be a barrier for users without prior computational experience. Thus, the aim of this study was to develop a web application, known as PanWeb, that serves as a graphical interface for PGAP. In addition, using the output files of the PGAP pipeline, the application generates graphics with custom-developed scripts in the R programming language. PanWeb is freely available at http://www.computationalbiology.ufpa.br/panweb.
Year: 2017 PMID: 28542514 PMCID: PMC5443543 DOI: 10.1371/journal.pone.0178154
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Name and description of each graph.
| Graph | Description |
|---|---|
| Pan/core genome | Presents the pan-genome and core-genome profiles. |
| Vertical bar graph (blue) | Graph with the number of orthologous and paralogous genes shared among |
| Horizontal bar graph | Graph showing the number of unique genes in each strain, i.e., strain-specific genes. |
| Pie charts | Displays the proportion of homologous genes shared among |
| Phylogenetic tree graphs | Graphs are generated with three algorithms: neighbor-joining (NJ), UPGMA and maximum likelihood (ML). |
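
The quantities behind several of the graphs listed above (pan-genome, core genome and strain-specific genes) can be illustrated with plain set operations. The following is a minimal sketch, not PanWeb's or PGAP's actual code: the function name `summarize_clusters`, the strain labels and the cluster identifiers are all hypothetical, and the input is assumed to be a mapping from each strain to the set of gene clusters it contains.

```python
def summarize_clusters(strain_clusters):
    """Return (pan, core, unique) from a {strain: set_of_cluster_ids} mapping.

    pan    : clusters present in at least one strain
    core   : clusters present in every strain
    unique : {strain: clusters found only in that strain}
    """
    all_sets = list(strain_clusters.values())
    pan = set().union(*all_sets)
    core = set.intersection(*all_sets)
    unique = {}
    for strain, clusters in strain_clusters.items():
        # Everything seen in any other strain.
        others = set().union(
            *(c for s, c in strain_clusters.items() if s != strain)
        )
        unique[strain] = clusters - others
    return pan, core, unique


# Toy input (hypothetical strains and clusters, not data from the study).
toy = {
    "strainA": {"g1", "g2", "g3"},
    "strainB": {"g1", "g2", "g4"},
    "strainC": {"g1", "g5"},
}
pan, core, unique = summarize_clusters(toy)
print(len(pan), sorted(core), {s: sorted(u) for s, u in unique.items()})
```

In this toy input the pan-genome holds five clusters, the core genome holds only `g1`, and each strain has exactly one strain-specific cluster.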
Fig 1. Pan-genome and core-genome.
Graph representing the pan-genome (blue) and core-genome (red) of the 45 analyzed genomes. The graph also shows the α coefficient of Heaps' law obtained when the curve is fitted to the mean (yellow curve) or the median (green curve) of the boxplots.
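
As an illustration of the kind of curve fit mentioned in this caption, the sketch below fits a power law P(N) = κ·N^γ to pan-genome sizes by least squares in log-log space. It is a sketch under stated assumptions: this section does not say how PanWeb parameterizes Heaps' law or how α is derived, so the functional form, the function name `fit_power_law` and the synthetic data are all assumptions, and the exponent fitted here is not necessarily the α shown in the figure.

```python
import math


def fit_power_law(n_genomes, sizes):
    """Least-squares fit of sizes ~ kappa * N**gamma in log-log space."""
    xs = [math.log(n) for n in n_genomes]
    ys = [math.log(s) for s in sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    gamma = sxy / sxx                      # slope of the log-log line
    kappa = math.exp(mean_y - gamma * mean_x)  # intercept, back-transformed
    return kappa, gamma


# Synthetic pan-genome sizes generated from a known power law
# (kappa = 2000, gamma = 0.3); these are not values from the study.
n_genomes = list(range(1, 11))
sizes = [2000.0 * n ** 0.3 for n in n_genomes]

kappa, gamma = fit_power_law(n_genomes, sizes)
print(f"kappa = {kappa:.1f}, gamma = {gamma:.3f}")
```

With real data, the values passed as `sizes` would be the mean or the median pan-genome size observed for each number of genomes, matching the two fitted curves described in the caption.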
Fig 2. Phylogenetic trees based on the NJ algorithm.
Phylogenetic trees showing the evolutionary relationships among the 45 strains, built with the neighbor-joining (NJ) algorithm. The left tree is based on the gene distance matrix of the core gene clusters, and the right tree is based on indel variations in the core gene clusters.
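
To make the distance-matrix step concrete, here is a minimal sketch of the textbook neighbor-joining procedure applied to a small toy matrix. It is not the implementation used by PGAP or PanWeb; the function name `neighbor_joining`, the taxon labels and the distances are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def neighbor_joining(labels, dist):
    """Textbook neighbor-joining on a symmetric distance matrix.

    Returns (steps, last_edge) where each step is
    (node_a, branch_len_a, node_b, branch_len_b, new_node) and last_edge is
    (node_a, node_b, length) for the final two remaining nodes.
    """
    # Working copy of the matrix, keyed by node name.
    d = {a: {b: dist[i][j] for j, b in enumerate(labels)}
         for i, a in enumerate(labels)}
    nodes = list(labels)
    steps = []
    while len(nodes) > 2:
        n = len(nodes)
        # Total distance from each node to all others.
        r = {a: sum(d[a][b] for b in nodes if b != a) for a in nodes}
        # Choose the pair that minimizes the Q criterion.
        best = None
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                q = (n - 2) * d[a][b] - r[a] - r[b]
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best
        # Branch lengths from the joined pair to the new internal node.
        len_a = 0.5 * d[a][b] + (r[a] - r[b]) / (2 * (n - 2))
        len_b = d[a][b] - len_a
        new = f"({a},{b})"
        d[new] = {new: 0.0}
        for k in nodes:
            if k not in (a, b):
                d[new][k] = d[k][new] = 0.5 * (d[a][k] + d[b][k] - d[a][b])
        nodes = [k for k in nodes if k not in (a, b)] + [new]
        steps.append((a, len_a, b, len_b, new))
    a, b = nodes
    return steps, (a, b, d[a][b])


# Toy five-taxon distance matrix (hypothetical, not data from the study).
labels = ["a", "b", "c", "d", "e"]
dist = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
steps, last_edge = neighbor_joining(labels, dist)
for step in steps:
    print(step)
print(last_edge)
```

Ties in the Q criterion are broken here by keeping the first pair encountered, which is one arbitrary choice among several; production tools may resolve ties differently.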