Literature DB >> 30186253

PGAweb: A Web Server for Bacterial Pan-Genome Analysis.

Xinyu Chen¹, Yadong Zhang^2,3,4, Zhewen Zhang^2,3, Yongbing Zhao⁵, Chen Sun⁶, Ming Yang⁷, Jinyue Wang^2,3,4, Qian Liu^1,8, Baohua Zhang^1,8, Meili Chen^2,3, Jun Yu^3,4, Jiayan Wu³, Zhong Jin^1,8, Jingfa Xiao^2,3,4.

Abstract

An astronomical increase in microbial genome data in recent years has led to strong demand for bioinformatic tools for pan-genome analysis within and across species. Here, we present PGAweb, a user-friendly, web-based tool for bacterial pan-genome analysis, which is composed of two main pan-genome analysis modules, PGAP and PGAP-X. PGAweb provides key interactive and customizable functions that include orthologous clustering, pan-genome profiling, sequence variation and evolution analysis, and functional classification. PGAweb presents features of genomic structural dynamics and sequence diversity with different visualization methods that are helpful for intuitively understanding the dynamics and evolution of bacterial genomes. PGAweb has an intuitive interface with one-click setting of parameters and is freely available at http://PGAweb.vlcc.cn/.

Entities: Chemical Disease Species

Keywords: bacterial pan-genome analysis; genome dynamics visualization; genome structure interpretation; orthologous gene identification; web server

Year: 2018 PMID： 30186253 PMCID： PMC6110895 DOI： 10.3389/fmicb.2018.01910

Source DB: PubMed Journal: Front Microbiol ISSN： 1664-302X Impact factor: 5.640

Introduction

The advancement of sequencing technology and assembly tools has facilitated fast acquisition of genome sequences of various species over the past two decades. Tettelin et al. (2005) coined the concept of the pan-genome in. The pan-genome encompasses the entire repertoire of genes accessible to a studied phylogenetic clade or a given species, which is divided into the core genome (genes shared by all strains), dispensable genes (genes shared only by a subset of the strains), and strain-specific (unique) genes. Pan-genomics has far-reaching significance (Vernikos et al., 2015; Xiao et al., 2015), especially in bacterial research, and has been applied to many fields, such as species evolution (Lefebure and Stanhope, 2007; Rasko et al., 2008), investigation of pathogenic and drug resistance mechanisms (D’Auria et al., 2010; Hu et al., 2011; Zhang et al., 2011), vaccine development (Maione et al., 2005; Serruto et al., 2009), host and environment adaptability (Saenz et al., 2007; Sugawara et al., 2013; Tian et al., 2016), and population structures (Donati et al., 2010). To make bacterial pan-genome analysis convenient and efficient, several tools, online databases, and web servers have been developed over the past two decades (Vernikos et al., 2015; Xiao et al., 2015). PGAP (Zhao et al., 2012), GET_HOMOLOGUES (Contreras-Moreira and Vinuesa, 2013), ITEP (Benedict et al., 2014), PanGP (Zhao et al., 2014), Roary (Page et al., 2015), and panX (Ding et al., 2018) were introduced as pan-genome analysis tools for clustering orthologous genes, identifying single nucleotide polymorphisms (SNPs), constructing phylogenies, function-based searching or analysis, and gene curation. However, they all are command-line tools demanding basic programming skills and with lack of in-depth inference and visualization of results. PGAT (Brittnacher et al., 2011), as a web-based database, provides many useful functions, such as predicting protein families, identifying SNPs, and providing analysis tools for Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway and other genome annotations, but a limited number of species are integrated and users’ own data cannot be uploaded and analyzed. PanWeb (Pantoja et al., 2017), PANNOTATOR (Santos et al., 2013), and PanCGHweb (Bayjanov et al., 2010) are web servers for pan-genome analysis, but they all suffer from some flaws, such as limited functions or working only with microarray data. In order to perform bacterial pan-genome analysis more efficiently, we have developed a user-friendly online tool, PGAweb, based on our previously published software, PGAP (Zhao et al., 2012), PanGP (Zhao et al., 2014), and PGAP-X (Zhao et al., 2018). PGAP and PanGP have been reported as the well-used toolkits for pan-genome analysis (Vernikos et al., 2015; Xiao et al., 2015; Zekic et al., 2018). To enhance interpretation and visualization of genome dynamics, we have integrated our newly developed software PGAP-X into PGAweb (Zhao et al., 2018). Based on these three tools, PGAweb serves as a fast pan-genome analysis web server with some great improvements. In brief, PGAweb has a user-friendly, web-based interface, one-click input data submission, and smooth and efficient data analysis, not only providing diverse methods for identifying orthologous genes, but also presenting genomic diversity with different visualization methods and genomic dynamics information.

Methods and Implementation

PGAweb consists of two core modules, PGAP and PGAP-X. PGAP supports five main analytic functions including orthologous clustering, pan-genome profiling, sequence variation analysis, species phylogeny, and gene functional classification. PGAP-X performs four comparative genomics assignments: genome-wide sequence alignment, orthologous clustering, pan-genome profiling, and sequence variation analysis (Figure ). Compared with PGAP, PGAP-X introduces a novel algorithm for orthologous cluster identification, which can distinguish paralogs based on conserved genomic location from a genome alignment. PGAP-X also supports a visualization function for genome-wide sequence alignment, genome structure variation, and orthologous mapping based on sequence conservation. Users are able to choose one of the modules according to their research objectives. Organization of PGAweb. All analysis steps are grouped into three parts: data input, data analysis, and output-visualization (shown with different color codes and separated by dashed lines). The input data of each module will be checked. The filtered data will be used in the subsequent analysis. Green boxes report the analysis functions of each module. For the output, PGAweb provides one text file corresponding to a graph (genetic variation analysis in the PGAP module only provides tabular text).

PGAP Module

In this module, two different methods, MultiParanoid (MP) and GeneFamily (GF), are provided for orthologous clustering. In the MP method, first, orthologs between each pair of strains are identified by InParanoid (Remm et al., 2001; Ostlund et al., 2010); MP is then invoked to search for gene clusters among multiple strains based on the result of pairwise orthologous clusters from Inparanoid (Alexeyenko et al., 2006). With the GF method, all the protein sequences from the input data are first mixed together and an all vs. all sequence alignment is performed by the BLASTALL program; then, the MCL algorithm (Enright et al., 2002; Van Dongen, 2008) is directly invoked to cluster the filtered BLAST results. To detect the relationships between the numbers of genomes and pan-genome size or core gene clusters, pan-genome profiles are calculated based on orthologous clustering from all strains with the Heaps’ law or exponential model (Tettelin et al., 2005; Rasko et al., 2008). Genetic variation analysis and gene functional classification are also performed in this module. For each gene cluster, mutations, and indels are detected by protein sequence alignment of each cluster using MAFFT (Katoh and Toh, 2010) and then the aligned amino acid sequences are converted into corresponding codon-based nucleotide sequences. Functional classification analysis of gene clusters includes description of each cluster’s function and Clusters of Orthologous Groups (COG) classification (Tatusov et al., 2000) according to uploaded annotation files. The function description with highest frequency excluding “hypothetical protein” for all genes in each cluster is taken as the description of the cluster. For further elucidating genomic diversity, species evolution analysis is performed by two algorithms [Neighbor-joining and Unweighted Pair Group Method with Arithmetic Mean (UPGMA)] using PHYLIP (Felsenstein, 1989) based on two types of data: gene gain/loss matrix or SNPs in core gene clusters. The maximum-likelihood algorithm is only used in the analysis based on SNPs in core gene clusters.

PGAP-X Module

In this module, a new in-house algorithm is used for orthologous gene clustering (Zhao et al., 2018). Whole-genome sequences or pseudo-chromosome sequences are aligned by progressiveMauve (Darling et al., 2010). In visualizing alignment results, colored blocks are used to display homologous DNA fragments across strains, which highlight conserved regions, strain-specific blocks, and recombination events including insertions, inversions, and deletions on a large scale. The distribution of orthologous genes in each conserved block is displayed in different colors that represent degrees of conservation for each gene. A pan-genome profile is calculated based on the same methods as in the PGAP module. Genetic variation analysis among pairwise or multiple genomes is presented. In this part, two vital parameters, variation frequency (f), and variation number (n) in 1 kb regions, are used to filter out high-substitution regions. Regions with the following traits will be identified as high-substitution regions: (a) no fewer than n substitution sites in the region; (b) variation frequency in the region is more than f; (c) the interval between any two substitution sites is less than 1/f. The high-substitution regions are marked on the genome structure. For pairwise variation analysis, a reference genome and a query genome should be selected.

Implementation

The web server is compatible with many platforms, including Safari, Firefox, Opera, Chrome, and Internet Explorer, by using Bootstrap Framework[1] as the front-end program. It adopts HTML5 and CSS3 protocols and uses D3.js[2] and Echarts[3] for result interaction. The server back-end adopts an Express Framework[4] for Node.js MongoDB[5] is used for the storage of related information. Docker[6] is used as a container for packaging codes and software, which improves compatibility, portability, and safety. This website is open to all users, and login is not required.

Workflow of PGAweb

Step 1: Module Selection

The general workflow of PGAweb is shown in Figure . The first step is module selection for pan-genome analysis according to data type and research objectives. Two different pan-genome analysis modules are integrated in PGAweb. The PGAP module supports both draft and complete genomes and provides five main results, including orthologous clustering, pan-genome profiling, sequence variation analysis, species phylogeny, and gene functional classification. Most results generated by the PGAP module are text files. Detailed information can be found in these text files for in-depth pan-genome analysis. The PGAP-X module only supports analysis of complete genomes. The PGAP-X module’s capabilities include genome alignment, orthologous gene identification, genome variation detection, and pan-genome profile analysis. PGAP-X provides an in-house algorithm that can distinguish paralogous genes from orthologous genes based on conserved genomic location. The visualization function for genome structure conservation and gene distribution is integrated in the PGAP-X module. Step by step workflow of PGAweb. Step 1: Module selection. Step 2: Input data upload and parameter setting. (1) Data browse and upload, (2) method and function selection, (3) parameter setting, and (4) fill in e-mail address and submit job. Step 3: Results download.

Step 2: Input Data Upload and Parameter Setting

This step contains four procedures (Figure ). First, specific input data for pan-genome analysis should be selected and uploaded. The PGAP module requires three types of files for each strain: nucleotide and protein sequences as well as annotations. Four input formats are selectable: (1) NCBI new data format: .cds_from_genomic.fna.gz, protein.faa.gz, and .feature_table.txt.gz extensions. These files can be downloaded from ftp://ftp.ncbi.nlm.nih.gov/genomes/all/. (2) NCBI old data format: .ffn, .faa, and .ptt extensions. These files can be downloaded at ftp://ftp.ncbi.nlm.nih.gov/genomes/archive/old_genbank/. (3) PGAP raw data format: .nuc, .pep, and .function extensions. These data may come from the user’s own laboratory..nuc files store nucleotide sequences in FASTA format; .pep files store corresponding protein sequences in FASTA format; .function files store annotations with three columns including gene name, COG classification, and function description. (4) NCBI GenBank (full) format: .gb extension. In addition, each protein sequence should have its corresponding nucleotide sequence and annotation information, while the length of the nucleotide sequence minus 3 (excluded stop codon) must be equal to three times the corresponding protein length. All uploaded files are auto-transformed to corresponding PGAP raw data format and criterion checking is performed. The PGAP-X module needs a complete genome sequence and annotation information for each strain. The input files should have .fna and .ptt extensions. More detailed description is available at http://pgaweb.vlcc.cn/doc. The second procedure is the selection of the orthologous gene identification method and pan-genome analysis functions. In the PGAP module, either the GF method or the MP method can be chosen for orthologous gene identification. The GF method is fast, but less accurate compared with the MP method. The GF method is suitable choice for handling more than 50 genomes with consideration of calculation time. In the PGAP-X module, there is only one method for orthologous gene identification. After selection of gene identification method, users need to select the desired functions for pan-genome analysis. Homologous gene clustering for the PGAP module and genome alignment for the PGAP-X module is required. Third, parameters for selected functions need to be set. Finally, users may provide e-mail addresses for obtaining notification of finished results and click the submit button for running PGAweb.

Step 3: Output Description

The output of the PGAP module is composed of five parts: orthologous clustering, pan-genome profile, evolutionary analysis, sequence variation analysis, and functional classification. For orthologous clustering, the results include an intuitive figure showing statistical gene distribution by conservation and two downloadable tabular files (1.Gene_Distribution_By_Conservation.txt and 1.Orthologs_Cluster.txt). For the pan-genome profile, the relationships between the numbers of genomes and pan-genome size or core gene clusters are calculated. Whether a pan-genome is open or closed can be easily inferred from visual results. Pan-genome profiles provide indirect indications of lifestyle and opportunity for multiple mechanisms of DNA exchange with the global microbial gene pool of the specific species. Multiple algorithms for bacterial evolutionary analysis are provided to improve the accuracy of phylogenetic inference, especially for closely related species. All phylogenetic trees calculated by PHYLIP are output as .tree files, and the user can use visualization software to view the phylogenetic trees. Files with .outfile extension give users a simple view of the phylogenetic trees. For sequence variation analysis, information on positions with variation, amino acid types, nucleotide types, and variation types (indels, and synonymous and non-synonymous mutations) are presented in tabular file 3.CDS.variation.txt. All statistical results of functional classification of core genes, dispensable genes, and strain-specific genes are dynamically displayed and compared. This result is useful for speculation of strain-specific physiological function and metabolic mechanism. Moreover, the actual gene lists (5.Orthologs_Core_Cluster_COG_Distribution.list.txt, 5.Orthologs_Dispensable_Cluster_COG_Distribution.list.txt, and 5.Orthologs_specific_Cluster_COG_Distribution.list.txt) used in the functional classification can be downloaded for in-depth analyzes. Four parts of the pan-genome analysis results are showed in the PGAP-X output. Complete genome or pseudo-chromosome sequences are aligned in PGAP-X. In visualizing the alignment result, colored blocks are used to display homologous DNA fragments across strains, which highlight conserved regions, strain-specific blocks, and recombination events including insertions, inversions, and deletions on a large scale. The orthologous gene distribution in each conserved block is displayed in different colors that represent degrees of conservation for each gene. Genomic diversity of the bacterial population and differences in gene content are well depicted, and even horizontal gene transfer (HGT) or homologous recombination events are easily identified. Meanwhile, multiple-genome substitution and pairwise substitution regions are analyzed in PGAP-X. The high-substitution regions are marked on the genome structure with blue bars. For pairwise substitution region analysis, a reference strain must be selected; all sites of variation among paired genomes can be detected and reported in output text files but only high-substitution regions of selected strains are displayed. The influence of selective pressure on each genomic region or gene can be intuitively showed. Some highly variable regions may be involved in pathogenicity islands (PAIs) or related to strain adaptability. This may be used for SNP-based barcode design and gene selection for multilocus sequence typing (MLST). For the pan-genome profile, the result is the same as shown in the PGAP module.

Case Study

To evaluate the performance of PGAweb, we carried out a case study for Streptococcus suis. The two modules, PGAP and PGAP-X, were used with default parameters (GF method was selected to identify orthologous genes in PGAP). The PGAweb results are described as follows. Thirteen previously studied S. suis genomes (Zhang et al., 2011) were reanalyzed by using PGAweb. The genome sequence data can be downloaded easily from NCBI ftp[7] . S. suis is a gram-positive pathogen that usually causes zoonosis (Staats et al., 1997). Thirty-three serotypes are classified based on the composition of their capsular polysaccharides. These strains are classified into seven serotypes including serotype 1 (NC_017950), serotype 1/2 (NC_017619), serotype 2 (NC_009442, NC_009443, NC_017622, NC_012926, NC_017617, NC_012925, and NC_012924), serotype 3 (NC_015433), serotype 7 (NC_017620), serotype 9 (NC_017621), and serotype 14 (NC_017618). The analysis revealed that S. suis has an open pan-genome, whose pan-genome increases in number by 110 genes when a newly sequenced genome is added (Figures ). The phylogenetic results give us a clear idea of the phylogenetic relationship among different serotypes of the 13 S. suis strains. Taking one of the phylogenetic trees (UPGMA tree based on the presence or absence of orthologous genes) as an example (Figure ), the 13 strains are clearly grouped into two subgroups and all the serotype 2 strains are tightly grouped in the same clade. Serotype 1, 3, 7, and 9 strains could be assigned to a common clade. Serotype 14 and 1/2 strains were more closely related to the serotype 2 strains than to the other four serotypes. This result is consistent with earlier findings by Zhang et al. (2011). From genome structure conservation analysis (Figure ), the 89K PAI exists in three serotype 2 strains isolated in China (NC_009442, NC_009443, and NC_012924), while the other two Chinese serotype 2 strains (NC_017617 and NC_017622) harbored on different branches of the phylogenetic tree (Figure ) do not contain this PAI. This result confirms an earlier observation that the 89K PAI is prevalent and specific for Chinese serotype 2 virulent strains (Chen et al., 2007). The genes on the PAI are dispensable genes with lower-grade conservation (Figure ). This suggests that these genes located in a cluster might have been introduced through HGT or homologous recombination with other closely related species (Zhang et al., 2011). Results of the case study. (A) Pan-genome profiles of 13 Streptococcus suis strains. Pan- and core-genome size prediction with all combinations of studied strains. (B) New prediction happens when an additional genome is introduced. (C) Phylogenetic tree based on UPGMA built from 13 S. suis genomes. All strains are labeled with serotype and place of origin. All Chinese serotype two strains are marked in red, Vietnamese serotype two strains in yellow, and European serotype two strains in blue. (D) Alignment result of 13 S. suis genomes. The red arrow indicates Chinese serotype 2 S. suis strains. Red rectangles indicate the ∼89-kb PAI (candidate pathogenicity island) proposed by Chen et al. (2007). (E) Orthologous gene distribution of 13 S. suis genomes. Different colors represent degrees of conservation for each gene (PGAP-X).

Discussion

Since the term of microbial pan-genome was introduced in 2005, several stand-alone and web-based tools have been developed for pan-genome analyses (Supplementary Table ). However, stand-alone tools always require basic programming skills for installation and operation. We developed PGAweb for in-depth inference and visualization in bacterial pan-genome analysis. PGAweb has a user-friendly, web-based interface, one-click input data submission, and smooth and efficient data analysis. No installation, command lines, or informatics skills are necessary for using PGAweb, making it very efficient for users, especially experimental scientists. PGAweb provides various interactive and customizable functions including orthologous clustering, pan-genome profiling, sequence variation, evolutionary analysis, and functional classification. Different visualization methods are used for enhanced genomic structural dynamics and sequence diversity display. These are helpful for intuitively understanding the dynamics and evolution of bacterial genomes. According to the pan-genome concept, three genomic data-sets is the minimum number for pan-genome analysis. In the case of technical and resource limitations, PGAweb is only suitable for performing analyzes on the pan-genome of a moderate number of strains (no more than 50 strains). The key issue for large scale pan-genome analysis is the input data uploading with limited network speed. We estimate that the input data size will be exceed 500 Mb for 50 strains pan-genome analysis, which will cost users a very long time for input data uploading. Furthermore, many factors may influence the runtime of PGAweb, such as the number input data sets, genome sizes, selected method, etc. The running time grows exponentially with the increase in number of input data sets due to the computational complexity (Supplementary Figure ). The current version of PGAweb still has some shortcomings, which only supports the pan-genome analysis of bacterial and archaeal genomes so far. With the development of sequencing technology, the pan-genome concept has also been widely utilized to perform comparative genomic analysis in fungi (Dunn et al., 2012) and plants (Cao et al., 2011; Li et al., 2014; Golicz et al., 2016). In a future version, we plan to develop a new algorithm to support the analysis of eukaryotes, especially plants and animals, and integrate more efficient methods for large scale pan-genome analysis.

Conclusion

Here, we present a fast and freely available online server, PGAweb, with integration of our previously published software, PGAP, PanGP, and PGAP-X. In brief, PGAweb has a user-friendly web interface, one-click input data submission, and smooth and efficient data analysis, and provides not only diverse methods for identification of orthologous genes but also powerful tools to display genome dynamics and structure characteristics. Case study indicates that PGAweb can intuitively reflect the characteristics of bacterial pan-genomes such as an open or closed pan-genome. Orthologous clustering, gene distribution based on conservation, genome structure conservation, and genetic variation, as well as their visualization, are all helpful for biologists to study bacterial genome dynamics to identify pathogenic and drug-resistant genes and their sequence contexts.

Author Contributions

JX, ZJ, and JAW conceived and designed the web server. XC performed the server front-end and back-end. YAZ carried out the case studies. YOZ, CS, MY, JNW, QL, BZ, and MC contributed to the web server testing. YAZ, ZZ, and JX wrote the paper. JY polished the language and gave many constructive suggestions.

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

40 in total

Review 1. Genome-based approaches to develop vaccines against bacterial pathogens.

Authors: Davide Serruto; Laura Serino; Vega Masignani; Mariagrazia Pizza
Journal: Vaccine Date: 2009-02-05 Impact factor: 3.641

2. Identification of a universal Group B streptococcus vaccine by multiple genome screen.

Authors: Domenico Maione; Immaculada Margarit; Cira D Rinaudo; Vega Masignani; Marirosa Mora; Maria Scarselli; Hervé Tettelin; Cecilia Brettoni; Emilia T Iacobini; Roberto Rosini; Nunzio D'Agostino; Lisa Miorin; Scilla Buccato; Massimo Mariani; Giuliano Galli; Renzo Nogarotto; Vincenzo Nardi-Dei; Vincenzo Nardi Dei; Filipo Vegni; Claire Fraser; Giuseppe Mancuso; Giuseppe Teti; Lawrence C Madoff; Lawrence C Paoletti; Rino Rappuoli; Dennis L Kasper; John L Telford; Guido Grandi
Journal: Science Date: 2005-07-01 Impact factor: 47.728

3. Legionella pneumophila pangenome reveals strain-specific virulence factors.

Authors: Giuseppe D'Auria; Nuria Jiménez-Hernández; Francesc Peris-Bondia; Andrés Moya; Amparo Latorre
Journal: BMC Genomics Date: 2010-03-17 Impact factor: 3.969

4. Analysis of the Saccharomyces cerevisiae pan-genome reveals a pool of copy number variants distributed in diverse yeast strains from differing industrial environments.

Authors: Barbara Dunn; Chandra Richter; Daniel J Kvitek; Tom Pugh; Gavin Sherlock
Journal: Genome Res Date: 2012-02-27 Impact factor: 9.043

5. A glimpse of streptococcal toxic shock syndrome from comparative genomics of S. suis 2 Chinese isolates.

Authors: Chen Chen; Jiaqi Tang; Wei Dong; Changjun Wang; Youjun Feng; Jing Wang; Feng Zheng; Xiuzhen Pan; Di Liu; Ming Li; Yajun Song; Xinxing Zhu; Haibo Sun; Tao Feng; Zhaobiao Guo; Aiping Ju; Junchao Ge; Yaqing Dong; Wen Sun; Yongqiang Jiang; Jun Wang; Jinghua Yan; Huanming Yang; Xiaoning Wang; George F Gao; Ruifu Yang; Jian Wang; Jun Yu
Journal: PLoS One Date: 2007-03-21 Impact factor: 3.240

6. Roary: rapid large-scale prokaryote pan genome analysis.

Authors: Andrew J Page; Carla A Cummins; Martin Hunt; Vanessa K Wong; Sandra Reuter; Matthew T G Holden; Maria Fookes; Daniel Falush; Jacqueline A Keane; Julian Parkhill
Journal: Bioinformatics Date: 2015-07-20 Impact factor: 6.937

7. PGAP-X: extension on pan-genome analysis pipeline.

Authors: Yongbing Zhao; Chen Sun; Dongyu Zhao; Yadong Zhang; Yang You; Xinmiao Jia; Junhui Yang; Lingping Wang; Jinyue Wang; Haohuan Fu; Yu Kang; Fei Chen; Jun Yu; Jiayan Wu; Jingfa Xiao
Journal: BMC Genomics Date: 2018-01-19 Impact factor: 3.969

8. PanWeb: A web interface for pan-genomic analysis.

Authors: Yan Pantoja; Kenny Pinheiro; Allan Veras; Fabrício Araújo; Ailton Lopes de Sousa; Luis Carlos Guimarães; Artur Silva; Rommel T J Ramos
Journal: PLoS One Date: 2017-05-24 Impact factor: 3.240

9. InParanoid 7: new algorithms and tools for eukaryotic orthology analysis.

Authors: Gabriel Ostlund; Thomas Schmitt; Kristoffer Forslund; Tina Köstler; David N Messina; Sanjit Roopra; Oliver Frings; Erik L L Sonnhammer
Journal: Nucleic Acids Res Date: 2009-11-05 Impact factor: 16.971

10. Comparative Genomics Analysis of Streptomyces Species Reveals Their Adaptation to the Marine Environment and Their Diversity at the Genomic Level.

Authors: Xinpeng Tian; Zhewen Zhang; Tingting Yang; Meili Chen; Jie Li; Fei Chen; Jin Yang; Wenjie Li; Bing Zhang; Zhang Zhang; Jiayan Wu; Changsheng Zhang; Lijuan Long; Jingfa Xiao
Journal: Front Microbiol Date: 2016-06-27 Impact factor: 5.640

5 in total

1. Determining the Genetic Characteristics of Resistance and Virulence of the "Epidermidis Cluster Group" Through Pan-Genome Analysis.

Authors: Zhewei Sun; Danying Zhou; Xueya Zhang; Qiaoling Li; Hailong Lin; Wei Lu; Hongmao Liu; Junwan Lu; Xi Lin; Kewei Li; Teng Xu; Qiyu Bao; Hailin Zhang
Journal: Front Cell Infect Microbiol Date: 2020-06-12 Impact factor: 5.293

2. M1CR0B1AL1Z3R-a user-friendly web server for the analysis of large-scale microbial genomics data.

Authors: Oren Avram; Dana Rapoport; Shir Portugez; Tal Pupko
Journal: Nucleic Acids Res Date: 2019-07-02 Impact factor: 16.971

Review 3. Bioinformatics for Marine Products: An Overview of Resources, Bottlenecks, and Perspectives.

Authors: Luca Ambrosino; Michael Tangherlini; Chiara Colantuono; Alfonso Esposito; Mara Sangiovanni; Marco Miralto; Clementina Sansone; Maria Luisa Chiusano
Journal: Mar Drugs Date: 2019-10-11 Impact factor: 5.118

4. Root Exudates Alter the Expression of Diverse Metabolic, Transport, Regulatory, and Stress Response Genes in Rhizosphere Pseudomonas.

Authors: Olga V Mavrodi; Janiece R McWilliams; Jacob O Peter; Anna Berim; Karl A Hassan; Liam D H Elbourne; Melissa K LeTourneau; David R Gang; Ian T Paulsen; David M Weller; Linda S Thomashow; Alex S Flynt; Dmitri V Mavrodi
Journal: Front Microbiol Date: 2021-04-14 Impact factor: 5.640

5. PanExplorer: A web-based tool for exploratory analysis and visualization of bacterial pan-genomes.

Authors: Alexis Dereeper; Marilyne Summo; Damien F Meyer
Journal: Bioinformatics Date: 2022-08-02 Impact factor: 6.931

5 in total