Yucheng Shao, Xinyi He, Ewan M Harrison, Cui Tai, Hong-Yu Ou, Kumar Rajakumar, Zixin Deng.
Abstract
mGenomeSubtractor performs an mpiBLAST-based comparison of reference bacterial genomes against multiple user-selected genomes for investigation of strain variable accessory regions. With parallel computing architecture, mGenomeSubtractor is able to run rapid BLAST searches of the segmented reference genome against multiple subject genomes at the DNA or amino acid level within a minute. In addition to comparison of protein coding sequences, the highly flexible sliding window-based genome fragmentation approach offered can be used to identify short unique sequences within or between genes. mGenomeSubtractor provides powerful schematic outputs for exploration of identified core and accessory regions, including searches against databases of mobile genetic elements, virulence factors or bacterial essential genes, examination of G+C content and binucleotide distribution bias, and integrated primer design tools. mGenomeSubtractor also allows for the ready definition of species-specific gene pools based on available genomes. Pan-genomic arrays can be easily developed using the efficient oligonucleotide design tool. This simple high-throughput in silico 'subtractive hybridization' analytical tool will support the rapidly escalating number of comparative bacterial genomics studies aimed at defining genomic biomarkers of evolutionary lineage, phenotype, pathotype, environmental adaptation and/or disease-association of diverse bacterial species. mGenomeSubtractor is freely available to all users without any login requirement at: http://bioinfo-mml.sjtu.edu.cn/mGS/.
Year: 2010 PMID: 20435682 PMCID: PMC2896100 DOI: 10.1093/nar/gkq326
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
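The sliding window-based genome fragmentation mentioned in the abstract can be illustrated with a minimal sketch. The function name `fragment_genome` and the `window` and `step` defaults are hypothetical and chosen for illustration only; they are not taken from mGenomeSubtractor itself.

```python
def fragment_genome(sequence, window=500, step=250):
    """Split a sequence into overlapping fragments.

    Returns a list of (start, end, fragment) tuples using 0-based,
    end-exclusive coordinates. `window` and `step` are illustrative
    defaults, not values documented for mGenomeSubtractor.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    fragments = []
    n = len(sequence)
    start = 0
    while start < n:
        end = min(start + window, n)
        fragments.append((start, end, sequence[start:end]))
        if end == n:
            # The final window reached the end of the sequence.
            break
        start += step
    return fragments


if __name__ == "__main__":
    for start, end, frag in fragment_genome("ACGTACGTAC", window=4, step=2):
        print(start, end, frag)
```

Each fragment would then serve as a separate BLAST query, so a step smaller than the window gives overlapping queries that can localize short unique sequences within or between genes.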
Figure 1. In silico ‘subtractive hybridization’ based comparative genomics using mGenomeSubtractor with chromosomal sequences of Streptomyces coelicolor A3(2) (reference genome), S. avermitilis MA-4680 and S. lividans TK24 as the inputs. (A) Histogram of BLASTP-based Ha-values for all 7769 annotated chromosomal CDS in S. coelicolor A3(2) against all the annotated CDS in the two subject genomes, S. avermitilis MA-4680 (‘NC_003155’ in green) and S. lividans TK24 (‘upload_sub_genome_1’ in red). The Ha-value reflects the degree of similarity in terms of the length of match and the degree of identity at an amino acid level between the matching CDS in the subject genome and the query CDS examined. (B) An expanded view of the hypervariable region highlighted by a blue rectangle in (C) corresponding to the 17-kb SLP1-like island of S. coelicolor A3(2). The island boundaries are flanked with the Tyr tRNA gene (red arrow) and direct repeats (green triangles) as indicated. CDS are colour-coded based on their COG assignment. The S. coelicolor A3(2) unique SCO4231 codes for a putative Type IV restriction endonuclease that cleaves both Dcm methylated and Dnd phosphorothioated DNA. The two comparator genomes are shown below. Black bars indicate the extent of S. coelicolor A3(2) CDS within this region that are also present in the individual comparator genomes. A G+C profile of the selected S. coelicolor A3(2) region is shown topmost. (C) Chromosome map of S. coelicolor A3(2) with CDS colour-coded based on the number of comparator Streptomyces genomes identified as harbouring an amino acid sequence-conserved homologue. The core region spanning coordinates 1.5–6.4 Mb is highlighted with a light pink background while the 1.5 Mb left and 2.3 Mb right arms are backgrounded in sky blue (23). CDS shown in black (‘2’) are conserved across both the S. avermitilis MA-4680 and S. lividans TK24 comparator genomes, while at the other extreme those shown in white (‘0’) are unique to S. coelicolor A3(2). The strain-specific CDS were identified based on a Ha-value cutoff of <0.64.
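As a rough illustration of how a score combining match length and identity could be computed and thresholded, the sketch below assumes the homology score is the fractional identity multiplied by the ratio of match length to query length. That formula, the function names, and the capping at 1.0 are assumptions made here for illustration; the caption only states that the value reflects match length and identity, and gives <0.64 as the strain-specific cutoff.

```python
def h_value(identity, match_length, query_length):
    """Homology score in [0, 1] for the best hit of one query.

    ASSUMED formula: identity (as a fraction, 0 to 1) multiplied by
    match_length / query_length. Capping at 1.0 is a choice made in
    this sketch for gapped matches longer than the query.
    """
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    if not 0.0 <= identity <= 1.0:
        raise ValueError("identity must be a fraction between 0 and 1")
    return min(identity * (match_length / query_length), 1.0)


def is_strain_specific(h, cutoff=0.64):
    """True when the score falls below the cutoff (0.64 in Figure 1)."""
    return h < cutoff


if __name__ == "__main__":
    # Hypothetical best hit: 90% identity over 80 of 100 query residues.
    h = h_value(0.90, 80, 100)
    print(round(h, 2), is_strain_specific(h))
```

A query with no hit at all in a subject genome would naturally be given a score of 0 under this scheme, which places it in the strain-specific class.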
Figure 2. In silico ‘subtractive hybridization’ of Streptomyces lividans TK24 against its close relative S. coelicolor A3(2) using mGenomeSubtractor. (A) Circular map of S. lividans TK24 linear chromosome showing inferred contigs (ICs) and inferred gaps (IGs). The ICs (in blue) are produced by merging adjacent genes classified as ‘conserved’ with the BLASTN-based H-values >0.81, whereas IGs (in red) comprise contiguous CDS defined as ‘strain-specific’. (B) Table containing feature details of mGenomeSubtractor-predicted S. lividans TK24 ‘accessory’ genomic regions borne on identified IGs. The variable region covering IG46, IG47 and IG48 highlighted in pink exhibits near identity to matching regions of the 94 kb dnd-encoding SLG island originally identified in S. lividans 1326. IG164 highlighted in orange is a likely novel S. lividans-specific island. (C) A magnified view of the IG164 island. (D) Schematic of IG164 with individual CDS colour-coded by COG assignment. (E) PCR assays for detecting IG164 signals with the primers denoted as arrowheads in (D) targeting SSPG_06582 (green), SSPG_06599 (brown), and the left (purple) and right (yellow) IG164 boundaries. Primer3Plus (17) was integrated to facilitate design of PCR primers. DNA templates used were from S. lividans 1326 (top panel), S. coelicolor A3(2) (middle) and S. avermitilis MA-4680 (bottom), respectively.
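The merging of adjacent CDS into inferred contigs (ICs) and inferred gaps (IGs) described in panel (A) might be sketched as a run-length grouping over CDS in chromosomal order. This sketch assumes that a higher H-value indicates a closer match, so CDS scoring above 0.81 count as conserved, and it simplifies by treating every other CDS as belonging to an IG; the tool may well handle intermediate scores differently. The function name and input layout are hypothetical.

```python
from itertools import groupby


def infer_segments(cds_scores, conserved_cutoff=0.81):
    """Group consecutive CDS into IC and IG runs.

    cds_scores: list of (cds_id, h_value) tuples in chromosomal order.
    Returns a list of (kind, [cds_id, ...]) with kind "IC" or "IG".
    SIMPLIFICATION: any CDS not above the cutoff is placed in an IG.
    """
    labelled = [
        (cds_id, "IC" if h > conserved_cutoff else "IG")
        for cds_id, h in cds_scores
    ]
    segments = []
    # groupby merges only adjacent items that share the same label.
    for kind, run in groupby(labelled, key=lambda item: item[1]):
        segments.append((kind, [cds_id for cds_id, _ in run]))
    return segments


if __name__ == "__main__":
    # Hypothetical CDS identifiers and scores.
    demo = [("cds1", 0.95), ("cds2", 0.90), ("cds3", 0.10), ("cds4", 0.99)]
    for kind, ids in infer_segments(demo):
        print(kind, ids)
```

Runs labelled IG are the candidate accessory regions, which is where features such as flanking tRNA genes or atypical G+C content would then be examined.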