Comparative genomic and methylome analysis of non-virulent D74 and virulent Nagasaki Haemophilus parasuis isolates.

Tracy L Nicholson1, Brian W Brunelle1, Darrell O Bayles1, David P Alt1, Sarah M Shore1.   

Abstract

Haemophilus parasuis is a respiratory pathogen of swine and the etiological agent of Glässer's disease. H. parasuis isolates can exhibit different virulence capabilities ranging from lethal systemic disease to subclinical carriage. To identify genomic differences between phenotypically distinct strains, we obtained the closed whole-genome sequence annotation and genome-wide methylation patterns for the highly virulent Nagasaki strain and for the non-virulent D74 strain. Evaluation of the virulence-associated genes contained within the genomes of D74 and Nagasaki led to the discovery of a large number of toxin-antitoxin (TA) systems within both genomes. Five predicted hemolysins were identified as unique to Nagasaki and seven putative contact-dependent growth inhibition toxin proteins were identified only in strain D74. Assessment of all potential vtaA genes revealed thirteen present in the Nagasaki genome and three in the D74 genome. Subsequent evaluation of the predicted protein structure revealed that none of the D74 VtaA proteins contain a collagen triple helix repeat domain. Additionally, the predicted protein sequence for two D74 VtaA proteins is substantially longer than that of any predicted Nagasaki VtaA protein. Fifteen methylation sequence motifs were identified in D74 and fourteen methylation sequence motifs were identified in Nagasaki using SMRT sequencing analysis. Only one of the methylation sequence motifs was observed in both strains, which is indicative of the diversity between D74 and Nagasaki. Subsequent analysis also revealed diversity in the restriction-modification systems harbored by D74 and Nagasaki. The collective information reported in this study will aid in the development of vaccines and intervention strategies to decrease the prevalence and disease burden caused by H. parasuis.

Year:  2018        PMID: 30383795      PMCID: PMC6211672          DOI: 10.1371/journal.pone.0205700

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

Haemophilus parasuis is a small, Gram-negative, non-motile, pleomorphic, rod-shaped, nicotinamide adenine dinucleotide (NAD)-dependent bacterium of the Pasteurellaceae family [1, 2]. H. parasuis is a respiratory pathogen affecting swine and is the etiological agent of Glässer's disease, a systemic infection resulting in arthritis, polyserositis (inflammation of serous membranes), and meningitis [2-4]. Additionally, H. parasuis infections can lead to pneumonia without signs of systemic disease in swine [5-7]. The morbidity and mortality caused by H. parasuis are a significant source of economic loss to the swine industry worldwide. Serotyping based on the production of heat-stable antigens, in which capsular polysaccharide is presumed to be the dominant component of the serotyping antigen, is routinely used for isolate classification and epidemiological purposes as well as for guidance with regard to vaccination strategies. Fifteen serovars of H. parasuis have been defined; however, a substantial percentage of clinical isolates are identified as nontypeable (NT) using conventional indirect hemagglutination (IHA) methods [8, 9]. Progress to alleviate this problem has been made with the determination of the nucleotide sequence of the capsule locus from fifteen serovar reference isolates, which has been used to develop molecular serotyping methods [10-12]. H. parasuis isolates can exhibit different virulence capabilities ranging from lethal systemic disease to subclinical carriage. Numerous studies have focused on the identification of virulence factors that enable some isolates to cause systemic disease, distinguishing them from isolates that remain colonizers of the upper respiratory tract. Examples of potential virulence factors that have been evaluated to date include capsule production, outer membrane proteins (OMPs), trimeric autotransporters, and regulatory proteins QseC and OxyR [13-22].
Despite the advancement in our understanding of the pathogenic mechanisms used by H. parasuis, a direct link between specific virulence factors and the ability to cause systemic disease has not been demonstrated. Accordingly, virulence is thought to be multifactorial [2-4]. Efforts to directly link specific genes to disease outcomes have been hindered by several substantial complications. The most notable are difficulties in genetic modification of the chromosome due to low transformation efficiencies attributed to strain-specific restriction-modification barriers, and difficulties in consistently reproducing Glässer's disease in conventionally raised pigs due to confounding factors such as age, health status, differences in maternal antibody titers towards H. parasuis, and coinfection with other respiratory pathogens [23-26]. There are no effective approaches to eradicate H. parasuis from pig herds, and controlling outbreaks has proven difficult [2, 27]. Although vaccines have been developed, most are composed of bacterins, resulting in poor heterologous protection. Consequently, no broadly protective vaccines or intervention strategies exist [28-30]. The current treatment for H. parasuis is broad-spectrum antibiotics, which are expensive and are believed to increase the risk of resistant strain development [29, 31–33]. Additionally, with increased pressure to limit antibiotic use in agriculture, alternative approaches are desperately needed to reduce disease burden and economic losses caused by H. parasuis. In a previous effort to link genomic differences to disease outcome, draft genome sequence data was obtained for ten genetically distinct isolates along with the evaluation of virulence in Caesarean-derived, colostrum-deprived (CDCD) pigs [34]. These results demonstrated that strain D74 is a non-virulent colonizer of the upper respiratory tract, while in contrast, strain Nagasaki was highly virulent and capable of causing systemic disease [34].
Many genomic differences, including gene content and/or nucleotide variation, were identified that could account for the phenotypic difference between the strains [34]. Unfortunately, many genes or regions of interest within each strain were incomplete, preventing a reliable one-to-one assignment and subsequent comparison of any predicted protein structure. To definitively identify and characterize the genomic differences between the highly virulent Nagasaki strain and the non-virulent D74 strain, the goal of our report was to obtain the closed whole-genome sequence and genome-wide methylation patterns for these phenotypically distinct strains.

Materials and methods

Genome sequencing and annotation

H. parasuis strain Nagasaki is a serotype 5 reference strain and a multilocus sequence typing (MLST) type 24 strain. H. parasuis strain D74 is a serotype 9 reference strain and an MLST type 25 strain. Strains were cultured in Brain Heart Infusion (BHI) Broth (BD Biosciences, Sparks, MD) supplemented with 5% filtered heat-inactivated horse serum (GIBCO, Life Technologies, Grand Island, NY) and 0.01% (w/v) nicotinamide adenine dinucleotide (NAD) (Sigma-Aldrich, St. Louis, MO) at 37°C in 5% CO2 for 24 hours, and total genomic DNA was extracted using the High Pure PCR Template Preparation Kit (Roche Applied Science, Indianapolis, IN). Whole genome sequencing was performed using both the Pacific Biosciences (PacBio) and Illumina MiSeq platforms. Library preparation for PacBio sequencing was performed following the PacBio 10-kb insert library preparation protocol available online (http://www.pacb.com/wp-content/uploads/2015/09/Procedure-Checklist-10-kb-Template-Preparation-and-Sequencing.pdf). The 10 kb library for each strain was sequenced using the PacBio RSII platform with two SMRT cells for each isolate. Indexed libraries for the MiSeq protocol were generated with the Nextera XT DNA sample preparation and index kits (Illumina, San Diego, CA), pooled, and sequenced using the MiSeq v2 500-cycle reagent kit, yielding 2 x 250-bp paired-end reads (Illumina, San Diego, CA). Whole genome assemblies were generated using the PacBio smrtanalysis v. 2.3.0 (https://www.pacb.com/products-and-services/analytical-software/smrt-analysis/) and CANU v. 1.3 [35] software. The average PacBio coverage for the assembled genomes was 805x for Nagasaki and 1,284x for D74. Assembling the PacBio data for each strain resulted in a fully sequenced closed circular chromosome, which was subsequently oriented to start at the dnaA gene and trimmed by removing any overlapping sequence.
The genomes were then polished and error corrected using the Broad Institute’s Pilon v 1.18 [36] and Illumina data with 103x and 120x average coverage for Nagasaki and D74, respectively. The closed genome for each strain was then annotated using NCBI's Prokaryotic Genome Annotation Pipeline (PGAP) and additional curation was performed using the Prokka annotation software (version 1.12) [37] along with a Haemophilus-specific custom database. To compare putative protein sequences between Nagasaki and D74, the RAST prokaryotic genome annotation server [38] (http://rast.nmpdr.org/) was used to map annotated protein coding sequences (CDS) to functional subsystems and to perform a one-to-one BLASTP comparison between the strains. Genome organization was evaluated using the Artemis Comparison Tool [39] and Mauve [40].
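The trimming and re-orientation of a closed circular contig described above can be illustrated with a minimal sketch. It is not the procedure used by the assembly software; the toy contig, the `min_overlap` parameter, and the use of "ATG" as a stand-in for the start of dnaA are assumptions made only for illustration.

```python
def trim_overlap(seq: str, min_overlap: int = 3) -> str:
    """Drop a terminal repeat: if the last k bases repeat the first k bases, remove them."""
    for k in range(len(seq) // 2, min_overlap - 1, -1):
        if seq[:k] == seq[-k:]:
            return seq[:-k]
    return seq


def rotate_circular(seq: str, start: int) -> str:
    """Rotate a circular sequence so that 0-based index `start` becomes position 0."""
    if not 0 <= start < len(seq):
        raise ValueError("start must fall inside the sequence")
    return seq[start:] + seq[:start]


# Toy contig: the assembly ran past the origin, so "ACGT" appears at both ends.
contig = "ACGTTTGGCCATGAAACGT"
circular = trim_overlap(contig, min_overlap=4)

# Hypothetical anchor: "ATG" stands in for the first base of the dnaA gene.
oriented = rotate_circular(circular, circular.index("ATG"))
```

Rotation only changes the coordinate origin of a circular molecule, so the oriented sequence has the same length and base composition as the trimmed contig.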

Plasmid sequencing and annotation

Assembled PacBio data for pD74 did not initially result in a circularized plasmid sequence. The complete nucleotide sequence was subsequently determined using a primer walking strategy employing five separate plasmid preparations isolated from strain D74 using a Wizard Plus SV Minipreps DNA Purification System (Promega, Madison, WI) according to the manufacturer's protocol. The resulting circularized plasmid sequence was then polished and error corrected using the Broad Institute’s Pilon v 1.18 [36] and Illumina data with 104x average coverage for strain D74. The assembled pD74 sequence was then annotated using NCBI's Prokaryotic Genome Annotation Pipeline (PGAP). Further sequence analysis was carried out using BLASTN, BLASTX, and Tandem Repeats Finder [41].

IS and CRISPR analysis

Genomes were submitted to ISfinder (https://www-is.biotoul.fr/) [42] for the identification of bacterial IS elements using default parameters. Genomes were submitted to CRISPRFinder (http://crispr.i2bc.paris-saclay.fr/Server/) [43] for the identification of CRISPR elements using default parameters.

Phenotypic analysis

Phenotypic antibiotic resistance was determined using the broth microdilution method by Iowa State University Veterinary Diagnostic Laboratory following standard operating procedures. Each isolate was tested using the Trek BOPO6F plate (Thermo Fisher Scientific Inc., Oakwood Village, OH) and minimum inhibitory concentrations (MICs) were determined. MICs were evaluated in accordance with Clinical and Laboratory Standards Institute (CLSI) recommendations for resistance interpretations.

Genomic antimicrobial resistance (AMR) analysis

ResFinder 2.1 from the Center for Genomic Epidemiology (http://www.genomicepidemiology.org/) and the Comprehensive Antibiotic Resistance Database (CARD) (https://card.mcmaster.ca/home) were employed for AMR determinant identification. Genomes submitted to ResFinder 2.1 were evaluated for AMR determinants using starting parameters of a threshold ID of 90% and a minimum length of 60% and final parameters of a threshold ID of 30% and a minimum length of 20%. Genomes submitted to CARD were evaluated for AMR determinants using the criteria “default–perfect and strict hits only”.
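The effect of moving from the starting thresholds (90% identity, 60% minimum length) to the final thresholds (30% identity, 20% minimum length) can be sketched as a simple filter. The hit records, gene names, and the inclusive comparison below are hypothetical and are not taken from ResFinder itself.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    gene: str        # hypothetical determinant name
    identity: float  # percent identity to the reference gene
    coverage: float  # percent of the reference gene length covered


def filter_hits(hits, min_identity, min_coverage):
    """Keep hits that meet both thresholds (inclusive comparison is an assumption)."""
    return [h for h in hits if h.identity >= min_identity and h.coverage >= min_coverage]


# Invented example hits, for illustration only.
hits = [
    Hit("geneA", 99.1, 100.0),
    Hit("geneB", 45.0, 35.0),
    Hit("geneC", 25.0, 80.0),
]

strict = filter_hits(hits, min_identity=90.0, min_coverage=60.0)
relaxed = filter_hits(hits, min_identity=30.0, min_coverage=20.0)
```

Relaxing the thresholds can only add candidate hits, which is why the relaxed search serves as a check for divergent determinants that the strict search would miss.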

Capsular loci analysis

The nucleotide sequences of genes within the capsule loci of Nagasaki and D74 reported by Howell et al. [10] were obtained from NCBI. Alignments of individual gene and protein sequences, as well as calculation of percent identities, were performed using the Geneious Alignment tool in Geneious 10.1.3 (Biomatters Ltd., Auckland, New Zealand). Global alignment with free end gaps was used for both nucleotide and protein sequence alignments, followed by determination of the percent identity relative to Howell et al. [10]. Nucleotide and amino acid insertions are reported using HGVS nomenclature [44].

cdiA gene analysis

The nucleotide sequences of genes within the cdiA region of D74 were evaluated using BLASTX. Translated coding sequences were extracted and evaluated for the occurrence of any domain using the Pfam 31.0 database (https://pfam.xfam.org/) [45].

vtaA gene analysis

To ensure that all potential vtaA genes were identified in both Nagasaki and D74 genome annotations, the translated coding sequences were extracted and batch queried using the Pfam 31.0 database (https://pfam.xfam.org/) [45] for all occurrences of a YadA_anchor domain (PF03895) [15]. Geneious 10.1.3 (Biomatters Ltd., Auckland, New Zealand) was employed to evaluate and compare the genome location of vtaA genes in both Nagasaki and D74, including flanking upstream and downstream genes. The domain architecture and content of each vtaA gene from both Nagasaki and D74 was further evaluated for the occurrence of any domains using the Pfam 31.0 database (https://pfam.xfam.org/) [45]. BLASTN was employed to search sequences upstream of vtaA genes in both Nagasaki and D74 for the occurrence of the Nagasaki vtaA4 promoter sequence identified by Pina et al. [15].

Methylation analysis

Detection of modified bases (m6A, m4C, m5C) and clustering of modified sites to identify methylation associated motifs was performed using the RS_Modification_and_Motif_analysis.1 tool from the SMRT analysis package version 2.3.0. Briefly, raw reads were aligned to the complete genomes of D74 and Nagasaki and interpulse duration (IPD) ratios were measured for all pulses aligned to each position in the reference sequence (http://www.pacb.com/pdf/TN_Detecting_DNA_Base_Modifications.pdf) [46].
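As a rough illustration of the quantity being measured, an IPD ratio at a reference position can be thought of as the mean observed IPD divided by the IPD expected for unmodified DNA. The sketch below assumes the expected value is supplied by the caller; it does not reproduce the statistical model or the in silico control used by the SMRT analysis package, and the numbers are invented.

```python
from statistics import mean


def ipd_ratio(observed_ipds, expected_ipd):
    """Mean observed interpulse duration divided by the expected (control) value."""
    if not observed_ipds:
        raise ValueError("at least one observed IPD is required")
    if expected_ipd <= 0:
        raise ValueError("expected_ipd must be positive")
    return mean(observed_ipds) / expected_ipd


# Hypothetical IPD values (arbitrary units) for reads covering one position.
ratio = ipd_ratio([2.0, 4.0], expected_ipd=1.5)
```

A ratio well above 1 indicates that the polymerase paused longer than expected at that position, which is the kinetic signature used to call a modified base.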

SSR analysis

The nucleotide sequences of all putative RM genes identified in D74 and Nagasaki were evaluated for the occurrence of SSRs (tandem repeats or homopolymeric tracts consisting of five or more bases) within the coding region and in the region encompassing 150 bp upstream of the putative start codon using Geneious 10.1.3 (Biomatters Ltd., Auckland, New Zealand).
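The homopolymeric-tract part of this search can be sketched with a regular expression. This is an illustrative stand-in for the Geneious analysis, not a description of it; the function name, the `min_len` parameter, and the example sequence are assumptions, and tandem repeats with longer units are not covered.

```python
import re


def homopolymer_tracts(seq: str, min_len: int = 5):
    """Return (start, end, tract) for each run of one base at least min_len long.

    Coordinates are 0-based and half-open, as returned by the `re` module.
    """
    pattern = re.compile(rf"([ACGT])\1{{{min_len - 1},}}")
    return [(m.start(), m.end(), m.group(0)) for m in pattern.finditer(seq.upper())]


# Invented sequence: a 5-base A tract, a 6-base G tract, and a 4-base T tract.
tracts = homopolymer_tracts("ATGAAAAACCGGGGGGTTTT")
```

The 4-base T run is not reported because it falls below the five-base minimum stated in the text.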

Accession number(s)

The whole-genome sequences for these isolates were deposited in DDBJ/ENA/GenBank with the accession numbers CP018034 for the Nagasaki genome, CP018032 for the D74 genome, and CP018033 for the pD74 plasmid sequence. The sequence data, target sequences, and associated details for methylation enzymes used for analyses in this report have been deposited in the REBASE database (www.rebase.neb.com) [47]. RM-system and methylation motifs for both strains can be accessed via the index of the REBASE database (http://tools.neb.com/genomes/) or directly via this link: http://rebase.neb.com/cgi-bin/pacbioget?20940 for D74 or http://rebase.neb.com/cgi-bin/pacbioget?20939 for Nagasaki.

Results and discussion

Genome features of H. parasuis strains D74 and Nagasaki

The complete genome assembly and annotation of H. parasuis strain Nagasaki comprises a single circular chromosome 2,348,962 base pairs (bp) in length, with a total of 2,268 predicted protein coding sequences (CDSs) and a G+C content of 40.0% (Table 1). Of the 2,268 CDSs, 95 were predicted to be pseudogenes. The complete genome assembly and annotation of H. parasuis strain D74 comprises a single circular chromosome 2,467,568 bp in length, with a total of 2,252 predicted CDSs and a G+C content of 39.7% (Table 1). Similar to Nagasaki, 95 of the 2,252 CDSs were predicted to be pseudogenes. Despite the equivalent number, none of the predicted pseudogenes were similar between the strains (S1 Table and S2 Table). The difference in rRNA numbers between D74 and Nagasaki is consistent with other closed H. parasuis genomes available in GenBank (Table 1). The genomes of H. parasuis Nagasaki and D74 were assessed using PHASTER (http://phaster.ca/) [48] for the occurrence of phage regions. Nine phage regions, seven intact and two incomplete, were identified along the Nagasaki chromosome (Table 2). Six phage regions, one intact and five incomplete, were also identified along the D74 chromosome (Table 1). The phage regions identified are unique to each genome. The chromosomal location, including start and end nucleotide positions, length of phage region, and classification of phage regions for both strains are summarized in Table 2.
Table 1

General features of the genomes of H. parasuis strains D74 and Nagasaki.

| | D74 | Nagasaki |
|---|---|---|
| SequenceType (ST) | 9 | 5 |
| Chromosome Size (bp) | 2,467,568 | 2,348,962 |
| G + C Content (%) | 39.7% | 40.0% |
| Total CDSs | 2,252 | 2,268 |
| Pseudogenes^a | 95 | 95 |
| Functional CDSs | 2,157 | 2,173 |
| rRNA (16S-23S-5S) | 6-6-7 | 6-6-8 |
| tRNA | 57 | 60 |
| Phage Regions | 6 | 9 |
| Plasmid | 1 | 0 |

^a Encoding either an incomplete predicted protein sequence, a frameshift, or an internal stop codon.

Table 2

Phage regions identified in H. parasuis strains D74 and Nagasaki.

| Strain | Region # | Start^a | End^a | Length^b | Classification^c |
|---|---|---|---|---|---|
| Nagasaki | 1 | 75,146 | 95,579 | 20,434 | Incomplete |
| | 2 | 578,665 | 613,918 | 35,254 | Intact |
| | 3 | 930,933 | 982,262 | 51,330 | Intact |
| | 4 | 978,209 | 1,022,173 | 43,965 | Intact |
| | 5 | 1,146,622 | 1,187,091 | 40,470 | Intact |
| | 6 | 1,837,359 | 1,870,441 | 33,083 | Intact |
| | 7 | 1,890,250 | 1,901,023 | 10,774 | Incomplete |
| | 8 | 2,053,720 | 2,085,576 | 31,857 | Intact |
| | 9 | 2,113,966 | 2,153,632 | 39,667 | Intact |
| D74 | 1 | 1,021,542 | 1,030,153 | 8,612 | Incomplete |
| | 2 | 1,179,090 | 1,197,266 | 18,177 | Incomplete |
| | 3 | 1,186,455 | 1,216,786 | 30,332 | Incomplete |
| | 4 | 1,303,971 | 1,339,245 | 35,275 | Intact |
| | 5 | 1,390,803 | 1,407,651 | 16,840 | Incomplete |
| | 6 | 1,868,288 | 1,889,812 | 21,525 | Incomplete |

^a Basepair chromosomal location.

^b Length of region depicted in base pairs.

^c Classification determined by PHASTER analysis (http://phaster.ca/) [48].
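The relationship between the start, end, and length columns of Table 2 can be checked with a short sketch. It assumes the coordinates are 1-based and inclusive, which reproduces the reported lengths for the rows used here; the function names are illustrative. The same coordinates also show that some predicted regions overlap one another.

```python
def region_length(start: int, end: int) -> int:
    """Length of a region given 1-based inclusive coordinates (an assumption)."""
    return end - start + 1


def overlapping_pairs(regions):
    """Return (region_a, region_b, overlap_bp) for every pair of overlapping regions.

    `regions` is a list of (region_number, start, end) tuples.
    """
    ordered = sorted(regions, key=lambda r: r[1])
    pairs = []
    for i, (n1, s1, e1) in enumerate(ordered):
        for n2, s2, e2 in ordered[i + 1:]:
            if s2 > e1:
                break  # sorted by start, so no later region can overlap n1
            pairs.append((n1, n2, min(e1, e2) - s2 + 1))
    return pairs


# Coordinates copied from Table 2 (Nagasaki regions 1, 3, 4 and D74 regions 2, 3).
nagasaki = [(1, 75146, 95579), (3, 930933, 982262), (4, 978209, 1022173)]
d74 = [(2, 1179090, 1197266), (3, 1186455, 1216786)]

nagasaki_overlaps = overlapping_pairs(nagasaki)
d74_overlaps = overlapping_pairs(d74)
```

With these inputs, Nagasaki regions 3 and 4 and D74 regions 2 and 3 are each reported as an overlapping pair, so the summed region lengths would slightly overstate the total phage-associated sequence.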

An 11,595-bp circular plasmid was additionally identified in the complete genome assembly and annotation of H. parasuis strain D74. With the addition of the plasmid, the total genome size of strain D74, including both the circular chromosome and the plasmid, is 2,479,163 bp. The plasmid harbored by strain D74, designated as pD74, contains seven CDSs with predicted functions based on sequence homology, including parA, rec, and repB, which have predicted functions in plasmid replication. Additionally, pD74 harbors six CDSs of unknown function and one predicted pseudogene (Fig 1). These CDSs lacked a reciprocal match in Nagasaki. A region containing four copies of a 22-bp tandem repeat sequence was identified upstream of the repB CDS, which could potentially comprise the origin of replication (Fig 1). The sequence spanning from 3,328 to 1,243 bp comprises the entire 9,462 bp sequence of H. parasuis plasmid pHS-Rec (accession no. AY862436) with 99.2% sequence identity (Fig 1) [49]. The sequence spanning from 5,847 to 1,025 bp shares 99.4% sequence identity with Pasteurella trehalosi plasmid pCCK13698 (accession no. AM183225) (Fig 1) [50]. The pD74 sequence from 1,246 to 5,350 bp shares 73.9% sequence identity with a chromosomal region of Gallibacterium anatis strain UMN179 (accession no. CP002667) (Fig 1) [51].
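Because pD74 is circular, a span such as "3,328 to 1,243 bp" wraps past the coordinate origin. A minimal sketch of the arithmetic, assuming 1-based inclusive coordinates (the helper name is illustrative), is:

```python
PLASMID_LEN = 11_595        # pD74 length reported in the text
CHROMOSOME_LEN = 2_467_568  # D74 chromosome length from Table 1


def circular_span(start: int, end: int, total_len: int) -> int:
    """Length of a 1-based inclusive span on a circular molecule.

    When end < start the span is taken to wrap past the origin.
    """
    if not (1 <= start <= total_len and 1 <= end <= total_len):
        raise ValueError("coordinates must lie on the molecule")
    if end >= start:
        return end - start + 1
    return (total_len - start + 1) + end


total_d74 = CHROMOSOME_LEN + PLASMID_LEN
span_phs_rec = circular_span(3328, 1243, PLASMID_LEN)
span_pcck = circular_span(5847, 1025, PLASMID_LEN)
span_gallibacterium = circular_span(1246, 5350, PLASMID_LEN)
```

The span measured on pD74 need not equal the length of the matching sequence in the other replicon, presumably because the alignments contain gaps.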
Fig 1

Map of plasmid pD74.

Arrows indicate annotated CDSs; blue arrows represent CDSs with predicted functions based on sequence homology, pink arrows represent predicted CDSs of unknown function. The MFS transporter gene, labeled with a blue-grey arrow, is predicted to be a pseudogene. Dark grey box indicates region containing 4 copies of a 22-bp tandem repeat sequence identified using the Tandem Repeats Finder tool [41]. Arcs inside map indicate sequence with similarity to other sources; green arc represents sequence similar to H. parasuis plasmid pHS-Rec [49], purple arc represents sequence similar to a portion of the Pasteurella trehalosi plasmid pCCK13698 [50], and yellow arc represents sequence similar to a region from the genome sequence of Gallibacterium anatis strain UMN179 [51].


Comparison of the genome sequences of H. parasuis strains D74 and Nagasaki

A reciprocal or one-to-one BLASTP comparison of the protein coding sequences in Nagasaki and D74 identified 1,705 shared CDSs between the strains (Fig 2). Of the 2,173 functional CDSs in strain Nagasaki, 366 CDSs lacked a reciprocal match in D74 and were designated unique or Nagasaki-specific (Fig 2A). Conversely, 324 CDSs, out of the 2,157 functional CDSs in strain D74, lacked a reciprocal match in Nagasaki and were designated unique or D74-specific (Fig 2A). Comparison of the linear organization of the genomes of Nagasaki and D74 revealed many genome re-arrangements and inversions (Fig 2B). The alignment additionally resulted in 73 locally collinear blocks (LCBs), the largest of which is 520,607 bp (Fig 2B).
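The idea behind a reciprocal (bi-directional best hit) comparison can be sketched as follows. The gene identifiers and bit scores are invented, and the sketch does not describe how RAST implements the comparison internally.

```python
def best_hits(scores):
    """Map each query to its highest-scoring subject.

    `scores` maps (query, subject) pairs to a similarity score such as a bit score.
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Pairs (a, b) where b is the best hit of a and a is the best hit of b."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical scores: N* are Nagasaki proteins, D* are D74 proteins.
nagasaki_vs_d74 = {("N1", "D1"): 500, ("N1", "D2"): 200, ("N2", "D2"): 300, ("N3", "D1"): 100}
d74_vs_nagasaki = {("D1", "N1"): 500, ("D2", "N2"): 300, ("D2", "N1"): 200}

shared = reciprocal_best_hits(nagasaki_vs_d74, d74_vs_nagasaki)
unique_to_nagasaki = sorted({q for q, _ in nagasaki_vs_d74} - {a for a, _ in shared})
```

In this toy example N3 has only a uni-directional hit to D1, so it lacks a reciprocal match and would be counted as strain-specific.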
Fig 2

Comparison of the genome sequences of H. parasuis strains D74 and Nagasaki.

(A) Distribution of protein-encoding genes in H. parasuis strains Nagasaki and D74. Venn diagram demonstrating the unique 2,173 protein-encoding genes in Nagasaki (blue), the 2,157 protein-encoding genes in D74 (yellow), and the shared coding sequences defined as bi-directional best hits (center circle, green) via 1-to-1 reciprocal BLASTP comparison using RAST [38] (excluding frame-shifted genes). (B) Comparison of the linear organization of the H. parasuis strains Nagasaki and D74 chromosomes. Mauve [40] was used to compare the chromosomes of Nagasaki and D74. Locally collinear blocks (LCBs) representing regions of sequence that align in each genome are illustrated as colored rectangles connected by lines. Nagasaki was used as the reference sequence. In D74, LCBs placed above the center line are in the same orientation as in Nagasaki and LCBs placed below the center line are in the reverse orientation relative to Nagasaki. Blank sections are regions that did not align and are likely to contain unique or strain-specific sequence. Many of the larger blank regions not in LCBs contain loci predicted to encode products of phage origin.

The genomes of H. parasuis Nagasaki and D74 were assessed for the presence of insertion sequence (IS) elements using ISfinder (https://www-is.biotoul.fr/) [42]. IS elements found previously in other H. parasuis and Pasteurellaceae genomes were identified in both D74 and Nagasaki genomes and these IS elements contained the greatest similarity to sequences from the IS1595 family. Specifically, four ISHps3 and three ISHps4 IS elements were identified in D74, while one ISHps3 and four ISHps4 IS elements were identified in Nagasaki. In silico analysis revealed the occurrence of additional transposases annotated as belonging to other families, such as IS256 and IS110, in both D74 and Nagasaki genomes that were not identified by ISfinder analysis, given that it typically does not identify frameshifted or truncated IS units.
These transposases were annotated as pseudogenes and therefore are likely nonfunctional. Of the five complete IS units identified in Nagasaki, two are within regions of conserved sequence order and have homologs in D74. Three of the intact IS units identified in Nagasaki are located at the border of sequence rearrangement, with one located at the border of one of the phage regions. Six complete IS units were identified in D74, with two located at the border of rearranged regions. Of the four IS units that are present within regions of conserved sequence order, two have homologs in Nagasaki and two are small sequence insertions located within a conserved region in which the IS unit is the only sequence not conserved. Clustered regularly interspaced short palindromic repeats (CRISPR) and spacer sequences in the genomes of both D74 and Nagasaki were screened using CRISPRFinder (http://crispr.i2bc.paris-saclay.fr) [43]. No evidence for CRISPR sequences was found in either genome. Phenotypic antibiotic resistance was determined for H. parasuis D74 and Nagasaki and results are summarized in S3 Table. Nagasaki exhibited phenotypic resistance to clindamycin while D74 exhibited “intermediate” limited phenotypic resistance to clindamycin; no other resistance was identified. The genomes of H. parasuis Nagasaki and D74 were screened for the presence of acquired resistance genes and known chromosomal mutations conferring clindamycin resistance. No erm, lnu, or other known resistance determinants were identified in either genome.

Capsule loci in H. parasuis strains D74 and Nagasaki

Due to the importance of capsular polysaccharide in serotyping or strain classification, virulence, and vaccine related research areas, the CDSs located within the capsule loci of Nagasaki and D74 were compared to nucleotide sequences reported by Howell et al. [10]. Several nucleotide and amino acid differences were identified and are summarized in S4 Table and S5 Table. Eleven out of sixteen predicted proteins within the D74 capsule locus were found to contain amino acid differences compared to the sequences reported by Howell et al. [10]. The genes harboring the most noteworthy changes for D74 include wzx2, astA, gltL, and wzs (S4 Table). The predicted amino acid changes in AstA relative to the Howell et al. [10] sequence included a two amino acid insertion and are: N98_H99insHId, Q134R, S157A, V177I, V183I, I189V, S195N, and V2056M. In contrast, the predicted amino acid sequence for GltL is shorter than that reported in Howell et al. [10] due to a different predicted start codon. The predicted amino acid changes in GltL are M1_K8del and V9M. The predicted amino acid changes in Wzx2 relative to the Howell et al. [10] sequence are: V176A, I180V, A187V, S192N, R278H, P291S, and R373S (S4 Table). The predicted amino acid changes in Wzs relative to the Howell et al. [10] sequence are: E3K, L27I, V230A, V613A, and V622A (S4 Table). In H. parasuis Nagasaki, twelve out of fourteen predicted proteins within the capsule locus were found to contain amino acid differences compared to the sequences reported by Howell et al. [10]. The genes harboring the more notable changes for Nagasaki include funA, wcfQ, and wbgX (S5 Table). FunA is predicted to be longer than the sequence reported by Howell et al. [10] because a different predicted start codon adds 37 amino acids to the N-terminus; it also contains five amino acid changes within the shared region. The predicted amino acid changes in FunA are M1ext-37, A2V, A6V, D27V, S28N, and V138I (S5 Table).
The predicted amino acid changes in WcfQ relative to the Howell et al. [10] sequence are: E43K, I58V, S61T, I64T, F106S, D131G, F132S, N194K, Y262C, R269N, F270N, and L271F (S5 Table). WbgX is predicted to be shorter than the sequence reported by Howell et al. [10] due to a different predicted start codon. The predicted amino acid changes in WbgX are: M1del, K2M, E3N, F4Y, T177A, S343N, P344A, and A349I (S5 Table).
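The single-residue substitutions listed above follow a reference residue, position, and replacement residue pattern (for example, Q134R). A minimal sketch for reading such strings is shown below; it handles only simple one-letter substitutions and deliberately returns None for deletions, insertions, and extensions. The function name is illustrative.

```python
import re

# One-letter reference residue, 1-based position, one-letter replacement residue.
SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(change: str):
    """Return (reference, position, replacement) or None if not a simple substitution."""
    match = SUBSTITUTION.match(change)
    if match is None:
        return None
    ref, pos, alt = match.groups()
    return ref, int(pos), alt


# Changes reported for Wzs and GltL in the text.
wzs_changes = ["E3K", "L27I", "V230A", "V613A", "V622A"]
wzs_parsed = [parse_substitution(c) for c in wzs_changes]
gltl_parsed = [parse_substitution(c) for c in ["M1_K8del", "V9M"]]
```

Parsed this way, the changes can be sorted by position or grouped by residue type when comparing capsule proteins between annotations.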

Virulence-associated genes identified in H. parasuis strains D74 and Nagasaki

To identify factors that could support host colonization and/or virulence, the D74 and Nagasaki annotations were searched for CDSs encoding predicted functions in adhesion, hemolysis, secretion, toxin production, or other virulence-associated roles. Twenty-one CDSs encoding predicted adhesins were identified in D74, including five outer membrane genes (ompA, ompP1, ompP2, ompP5, and ompD15), two fimD fimbrial usher genes located adjacent to each other along the chromosome, pertactin family virulence factor aidA, filamentous hemagglutinin transporter fhaC, adhesin autotransporter bmaC, and two type IV pili genes (Table 3). Corresponding orthologs for pilT and bmaC were not found in Nagasaki. Gene leuC (A2U21_06460) in Nagasaki was the uni-directional best match to pilT (A2U20_03770) in D74 with 56% global nucleotide sequence identity, and gene aidA2 (A2U21_04065) in Nagasaki was the uni-directional best match to bmaC (A2U20_08910) in D74 with 51% global nucleotide sequence identity (Table 3). Eighteen CDSs encoding predicted adhesins were identified in Nagasaki, including five outer membrane genes (ompA, ompP1, ompP2, ompP5, and ompD15), one fimD fimbrial usher gene (chromosomally adjacent to a predicted pseudogene fimB), two pertactin family virulence factor genes aidA and aidA2, and two type IV pili (Table 4). In silico analysis of the region in D74 containing the two fimD genes compared to the fimD gene in Nagasaki indicated that the D74 fimD2 (A2U20_08925) aligns with the 5’ end of the Nagasaki fimD (A2U21_03995) and the D74 fimD (A2U20_08920) aligns with the 3’ end of the Nagasaki fimD gene. This suggests that the two fimD genes in D74 could have arisen from a frameshift within a single gene. The corresponding orthologs esiB (A2U20_01865) in D74 and esiB2 (A2U21_08055) in Nagasaki shared 51% sequence identity (Table 3 and Table 4).
Table 3

Predicted virulence-associated genes identified in H. parasuis D74.

| Group | D74 locus_tag | D74 Name | Product/Function | Hit (a) | Nagasaki locus_tag | % Identity (b) |
|---|---|---|---|---|---|---|
| Adhesin | A2U20_02695 | pilW | Type IV pilus biogenesis/stability protein PilW | bi | A2U21_08925 | 100 |
| | A2U20_03215 | pulG | Type II secretory pathway, pseudopilin PulG | bi | A2U21_08415 | 97 |
| | A2U20_03220 | pulJ | Type II secretory pathway, component PulJ | bi | A2U21_08410 | 96 |
| | A2U20_03425 | pilM | Type IV pilus biogenesis protein PilM | bi | A2U21_08215 | 98 |
| | A2U20_03430 | pilN | Type IV pilus biogenesis protein PilN | bi | A2U21_08210 | 98 |
| | A2U20_03445 | pilQ | Type IV pilus biogenesis PilQ | bi | A2U21_08195 | 94 |
| | A2U20_03490 | ompP5 | Outer membrane protein P5 | bi | A2U21_08150 | 92 |
| | A2U20_03710 | ompP1 | Outer membrane protein precursor P1 | bi | A2U21_07950 | 90 |
| | A2U20_03770 | pilT | pilT domain-containing protein | uni | A2U21_06460 | 56 |
| | A2U20_04380 | ompP2 | Outer membrane protein P2 precursor | bi | A2U21_07315 | 85 |
| | A2U20_04655 | ompD15 | Surface antigen (D15), outer membrane protein | bi | A2U21_07040 | 99 |
| | A2U20_04855 | ompA | Outer membrane protein A | bi | A2U21_06820 | 98 |
| | A2U20_06825 | fhaC | Filamentous hemagglutinin transporter protein FhaC | uni | A2U21_11135 | 52 |
| | A2U20_07210 | pilA | Type IV pilin PilA | bi | A2U21_10860 | 87 |
| | A2U20_07215 | pilB | Type IV fimbrial assembly ATPase PilB | bi | A2U21_10855 | 98 |
| | A2U20_07220 | pilC | Type IV fimbrial assembly protein PilC | bi | A2U21_10850 | 99 |
| | A2U20_07225 | pilD | Tfp pilus assembly pathway, fimbrial leader peptidase | bi | A2U21_10845 | 94 |
| | A2U20_08340 | aidA | Type V secretory pathway, adhesin AidA | bi | A2U21_04065 | 72 |
| | A2U20_08910 | bmaC | Adhesin BmaC autotransporter | uni | A2U21_04065 | 51 |
| | A2U20_08920 | fimD | Putative F17-like fimbrial usher | uni | A2U21_03995 | 98 |
| | A2U20_08925 | fimD2 | Putative F17-like fimbrial usher | bi | A2U21_03995 | 99 |
| Hemolysin | A2U20_02515 | prtC | Serralysin C | uni | A2U21_00470 | 58 |
| | A2U20_06885 | | Hemagglutinin/hemolysin-related protein | - | | |
| | A2U20_07380 | hlyD | Hemolysin secretion protein D | - | | |
| | A2U20_08300 | shlB | Hemolysin transporter protein ShlB | bi | A2U21_11135 | 53 |
| | A2U20_08375 | osmY | Osmotically-inducible protein OsmY; putative hemolysin | bi | A2U21_04100 | 100 |
| | A2U20_09890 | ahpA | Hemolysin regulation protein AhpA | bi | A2U21_02620 | 99 |
| | A2U20_10965 | prtB | Serralysin B, hemolysin-type calcium-binding region | uni | A2U21_00470 | 56 |
| | A2U20_10970 | prtB2 | Serralysin B, hemolysin-type calcium-binding region | bi | A2U21_00470 | 69 |
| Secretion | A2U20_01865 | esiB | Putative secretory immunoglobulin A-binding protein | bi | A2U21_08055 | 51 |
| | A2U20_09675 | | Putative periplasmic/secreted protein | bi | A2U21_02375 | 93 |
| Toxin | A2U20_00255 | ebgC | EbgC protein_Toxin-antitoxin biofilm protein TabA | bi | A2U21_00620 | 98 |
| | A2U20_01495 | higA | Antitoxin HigA_mRNA interferase antitoxin | uni | A2U21_00995 | 63 |
| | A2U20_01500 | hicA | Addiction module toxin HicA | uni | A2U21_01000 | 59 |
| | A2U20_01850 | vapC | VapC toxin family PIN domain ribonuclease | bi | A2U21_10000 | 78 |
| | A2U20_02115 | hipA | Serine/threonine-protein kinase toxin HipA | bi | A2U21_09500 | 95 |
| | A2U20_02705 | higA2 | Antitoxin HigA_mRNA interferase antitoxin | bi | A2U21_08915 | 93 |
| | A2U20_02710 | higB | mRNA interferase toxin HigB | bi | A2U21_08910 | 96 |
| | A2U20_02805 | tdeA | Putative toxin and drug export protein A | bi | A2U21_08785 | 97 |
| | A2U20_02840 | relE | Addiction module toxin RelE | bi | A2U21_03035 | 59 |
| | A2U20_03250 | mazF | Programmed cell death toxin MazF | uni | A2U21_06190 | 52 |
| | A2U20_03635 | hicA2 | Addiction module toxin HicA | bi | A2U21_02950 | 94 |
| | A2U20_04570 | higA3 | Antitoxin HigA_mRNA interferase antitoxin | uni | A2U21_00995 | 63 |
| | A2U20_04795 | pezT | Antitoxin/toxin system zeta toxin | uni | A2U21_03130 | 53 |
| | A2U20_04825 | cdtC | Cytolethal distending toxin subunit CdtC | bi | A2U21_06570 | 99 |
| | A2U20_04830 | cdtB | Cytolethal distending toxin subunit CdtB | uni | A2U21_06565 | 95 |
| | A2U20_04835 | cdtA | Cytolethal distending toxin subunit CdtA | bi | A2U21_06560 | 99 |
| | A2U20_04880 | higB2 | mRNA interferase toxin HigB | uni | A2U21_05325 | 55 |
| | A2U20_04885 | higA4 | Antitoxin HigA_mRNA interferase antitoxin | uni | A2U21_08915 | 56 |
| | A2U20_05030 | higA5 | Antitoxin HigA_mRNA interferase antitoxin | bi | A2U21_06130 | 98 |
| | A2U20_05035 | higB3 | mRNA interferase toxin HigB | bi | A2U21_06135 | 99 |
| | A2U20_05120 | cdtC2 | Cytolethal distending toxin subunit CdtC | uni | A2U21_06570 | 98 |
| | A2U20_05125 | cdtB2 | Cytolethal distending toxin subunit CdtB | bi | A2U21_06565 | 96 |
| | A2U20_05130 | cdtA2 | Cytolethal distending toxin subunit CdtA | uni | A2U21_06560 | 99 |
| | A2U20_05385 | higA6 | Antitoxin HigA_mRNA interferase antitoxin | bi | A2U21_06310 | 99 |
| | A2U20_05390 | higB4 | mRNA interferase toxin HigB | bi | A2U21_06305 | 100 |
| | A2U20_05455 | higB5 | mRNA interferase toxin HigB | bi | A2U21_06250 | 99 |
| | A2U20_05460 | higA7 | Antitoxin HigA_mRNA interferase antitoxin | bi | A2U21_06245 | 100 |
| | A2U20_05480 | cdiA | Contact-dependent growth inhibition (CDI) toxin | - | | |
| | A2U20_05555 | chpS | Antitoxin ChpS | bi | A2U21_06185 | 99 |
| | A2U20_05585 | tabA | Toxin-antitoxin biofilm protein TabA | bi | A2U21_06150 | 100 |
| | A2U20_05690 | hicA3 | Addiction module toxin HicA | bi | A2U21_04705 | 62 |
| | A2U20_06620 | higA8 | Antitoxin HigA_mRNA interferase antitoxin | bi | A2U21_05330 | 93 |
| | A2U20_06625 | higB6 | mRNA interferase toxin HigB | bi | A2U21_05325 | 99 |
| | A2U20_06830 | cdiA2 | Contact-dependent growth inhibition (CDI) toxin | - | | |
| | A2U20_06850 | cdiA3 | Contact-dependent growth inhibition (CDI) toxin | - | | |
| | A2U20_06865 | cdiA4 | Contact-dependent growth inhibition (CDI) toxin | - | | |
| | A2U20_06875 | cdiA5 | Contact-dependent growth inhibition (CDI) toxin | - | | |
| | A2U20_07065 | pezT2 | Antitoxin/toxin system zeta toxin | bi | A2U21_03130 | 75 |
| | A2U20_07375 | ltxB | Leukotoxin export ATP-binding protein LtxB | uni | A2U21_07020 | 52 |
| | A2U20_08285 | cdiA6 | Contact-dependent growth inhibition (CDI) toxin | - | | |
| | A2U20_08295 | cdiA7 | Contact-dependent growth inhibition (CDI) toxin | - | | |
| | A2U20_08515 | pasI | Persistence and stress-resistance antitoxin PasI | bi | A2U21_04275 | 98 |
| | A2U20_10085 | hicB | Antitoxin HicB | bi | A2U21_00970 | 100 |
| | A2U20_10110 | higA9 | Antitoxin HigA_mRNA interferase antitoxin | bi | A2U21_00995 | 98 |
| | A2U20_10115 | higB7 | mRNA interferase toxin HigB | bi | A2U21_01000 | 99 |
| | A2U20_10565 | hipA2 | Serine/threonine-protein kinase toxin HipA | bi | A2U21_00565 | 99 |
| Other | A2U20_00515 | nanH | Sialidase | bi | A2U21_11540 | 99 |
| | A2U20_01700 | espP | Putative extracellular serine protease | bi | A2U21_09655 | 75 |
| | A2U20_01715 | espP2 | Putative extracellular serine protease | bi | A2U21_09640 | 78 |
| | A2U20_03355 | vacJ | Putative VacJ lipoprotein | bi | A2U21_08285 | 97 |
| | A2U20_05150 | sirA | Regulator of disulfide bond formation | bi | A2U21_06540 | 99 |
| | A2U20_05155 | sirB | Invasion protein expression up-regulator SirB | bi | A2U21_06535 | 100 |
| | A2U20_07890 | sodA | Superoxide dismutase | bi | A2U21_03770 | 100 |
| | A2U20_08325 | sodC | Superoxide dismutase | bi | A2U21_04050 | 92 |

a The Hit column contains a '-' (no hit), 'uni' or 'bi' RAST server results from a one-to-one BLASTP comparison of the protein coding sequences in the Nagasaki genome using the D74 genome as the reference. “bi” represents a bidirectional best hit in which the reverse hit from the Nagasaki comparison genome to the D74 reference genome was also the best hit. “uni” indicates a uni-directional hit in which the reverse hit from the comparison genome to the reference genome was not also the best hit. “-” indicates no hit or match was found.

b Global pairwise nucleotide percent sequence identity.
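
The 'bi', 'uni', and '-' labels defined in footnote a can be illustrated with a minimal Python sketch. It assumes the best BLASTP hit in each direction has already been reduced to a dictionary. The function name `classify_hit` and the two toy dictionaries are hypothetical; the dictionaries hold only locus tags and relationships shown in Tables 3 and 4. This is not the RAST server's implementation.

```python
from typing import Dict, Optional


def classify_hit(query: str,
                 forward_best: Dict[str, str],
                 reverse_best: Dict[str, str]) -> str:
    """Label a reference-genome CDS as 'bi', 'uni', or '-'.

    forward_best maps a reference CDS to its best hit in the comparison genome.
    reverse_best maps a comparison CDS to its best hit in the reference genome.
    """
    target: Optional[str] = forward_best.get(query)
    if target is None:
        return "-"  # no hit in the comparison genome
    # Bidirectional only if the reverse search points back to the same CDS.
    return "bi" if reverse_best.get(target) == query else "uni"


# Toy best-hit maps (assumed inputs) built from rows of Tables 3 and 4.
forward_best = {
    "A2U20_08340": "A2U21_04065",  # D74 aidA -> Nagasaki aidA2
    "A2U20_08910": "A2U21_04065",  # D74 bmaC -> Nagasaki aidA2
}
reverse_best = {
    "A2U21_04065": "A2U20_08340",  # Nagasaki aidA2 -> D74 aidA
}

for tag in ("A2U20_08340", "A2U20_08910", "A2U20_06885"):
    print(tag, classify_hit(tag, forward_best, reverse_best))
```

In this toy input, aidA and bmaC both point to the same Nagasaki CDS, but only aidA is the reverse best hit, so bmaC is labeled 'uni', as in Table 3.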

Table 4

Predicted virulence-associated genes identified in H. parasuis Nagasaki.

| Group | Nagasaki locus_tag | Nagasaki Name | Product/Function | Hit (a) | D74 locus_tag | % Identity (b) |
|---|---|---|---|---|---|---|
| Adhesin | A2U21_02720 | aidA | Type V secretory pathway, adhesin AidA | - | | |
| | A2U21_03995 | fimD | Putative F17-like fimbrial usher | bi | A2U20_08925 | 99 |
| | A2U21_04065 | aidA2 | Type V secretory pathway, adhesin AidA | bi | A2U20_08340 | 72 |
| | A2U21_06820 | ompA | Outer membrane protein A precursor | bi | A2U20_04855 | 98 |
| | A2U21_07040 | ompD15 | Surface antigen (D15), outer membrane protein | bi | A2U20_04655 | 99 |
| | A2U21_07315 | ompP2 | Outer membrane protein P2 precursor | bi | A2U20_04380 | 85 |
| | A2U21_07950 | ompP1 | Outer membrane protein precursor P1 | bi | A2U20_03710 | 90 |
| | A2U21_08150 | ompP5 | putative outer membrane protein P5 | bi | A2U20_03490 | 92 |
| | A2U21_08195 | pilQ | Type IV pilus biogenesis protein PilQ | bi | A2U20_03445 | 94 |
| | A2U21_08210 | pilN | Type IV pilus biogenesis protein PilN | bi | A2U20_03430 | 98 |
| | A2U21_08215 | pilM | Type IV pilus biogenesis protein PilM | bi | A2U20_03425 | 98 |
| | A2U21_08410 | pulJ | Type II secretory pathway, component PulJ | bi | A2U20_03220 | 96 |
| | A2U21_08415 | pulG | Type II secretory pathway, pseudopilin PulG | bi | A2U20_03215 | 97 |
| | A2U21_08925 | pilW | Type IV pilus biogenesis/stability protein PilW | bi | A2U20_02695 | 100 |
| | A2U21_10845 | pilD | Tfp pilus assembly pathway, fimbrial leader peptidase | bi | A2U20_07225 | 94 |
| | A2U21_10850 | pilC | Type IV fimbrial assembly protein PilC | bi | A2U20_07220 | 99 |
| | A2U21_10855 | pilB | Type IV fimbrial assembly ATPase PilB | bi | A2U20_07215 | 98 |
| | A2U21_10860 | pilA | Type IV pilin PilA | bi | A2U20_07210 | 87 |
| Hemolysin | A2U21_00465 | | Putative hemagglutinin/hemolysin-related protein | - | | |
| | A2U21_00470 | prtB | Serralysin B, hemolysin-type calcium-binding region | bi | A2U20_10970 | 69 |
| | A2U21_02620 | ahpA | Hemolysin regulation protein AhpA | bi | A2U20_09890 | 99 |
| | A2U21_04100 | osmY | Osmotically-inducible protein OsmY_putative hemolysin | bi | A2U20_08375 | 100 |
| | A2U21_11130 | shlB1 | Hemolysin transporter protein ShlB | - | | |
| | A2U21_11135 | shlB2 | Hemolysin transporter protein ShlB | bi | A2U20_08300 | 53 |
| | A2U21_11140 | | putative hemolysin | - | | |
| | A2U21_11145 | | Putative hemolysin | - | | |
| | A2U21_11150 | hpmA | Hemolysin | - | | |
| Secretion | A2U21_02375 | | Periplasmic/secreted protein | bi | A2U20_09675 | 93 |
| | A2U21_08050 | esiB | Putative secretory immunoglobulin A-binding protein | - | | |
| | A2U21_08055 | esiB2 | Putative secretory immunoglobulin A-binding protein | bi | A2U20_01865 | 51 |
| Toxin | A2U21_00565 | hipA | Serine/threonine-protein kinase toxin HipA | bi | A2U20_10565 | 99 |
| | A2U21_00620 | ebgC | EbgC protein_Toxin-antitoxin biofilm protein TabA | bi | A2U20_00255 | 98 |
| | A2U21_00970 | hicB | Antitoxin HicB | bi | A2U20_10085 | 100 |
| | A2U21_00995 | higA | Antitoxin HigA_mRNA interferase antitoxin | bi | A2U20_10110 | 98 |
| | A2U21_01000 | higB | mRNA interferase toxin HigB | bi | A2U20_10115 | 99 |
| | A2U21_02950 | hicA | Addiction module toxin HicA | bi | A2U20_03635 | 94 |
| | A2U21_03035 | relE | Addiction module toxin RelE | bi | A2U20_02840 | 59 |
| | A2U21_03130 | pezT | Antitoxin/toxin system zeta toxin | bi | A2U20_07065 | 75 |
| | A2U21_03940 | | Toxin-antitoxin system, antitoxin component | - | | |
| | A2U21_04275 | pasI | Persistence and stress-resistance antitoxin PasI | bi | A2U20_08515 | 98 |
| | A2U21_04650 | higA2 | Antitoxin HigA_mRNA interferase antitoxin | uni | A2U20_01495 | 54 |
| | A2U21_04705 | hicA2 | Addiction module toxin HicA | bi | A2U20_05690 | 62 |
| | A2U21_04735 | hicA3 | Addiction module toxin HicA | uni | A2U20_03635 | 94 |
| | A2U21_04850 | relE | mRNA interferase toxin RelE | - | | |
| | A2U21_05325 | higB2 | mRNA interferase toxin HigB | bi | A2U20_06625 | 99 |
| | A2U21_05330 | higA3 | Antitoxin HigA_mRNA interferase antitoxin | bi | A2U20_06620 | 93 |
| | A2U21_05725 | relE2 | Addiction module toxin RelE | - | | |
| | A2U21_05930 | hicB2 | Antitoxin HicB | - | | |
| | A2U21_06130 | higA4 | Antitoxin HigA_mRNA interferase antitoxin | bi | A2U20_05030 | 98 |
| | A2U21_06135 | higB3 | mRNA interferase toxin HigB | bi | A2U20_05035 | 99 |
| | A2U21_06150 | tabA | Putative Toxin-antitoxin biofilm protein TabA | bi | A2U20_05585 | 100 |
| | A2U21_06185 | chpS | Antitoxin | bi | A2U20_05555 | 99 |
| | A2U21_06190 | pemK | Programmed cell death toxin PemK | bi | A2U20_05550 | 97 |
| | A2U21_06245 | higA5 | Antitoxin HigA_mRNA interferase antitoxin | bi | A2U20_05460 | 100 |
| | A2U21_06250 | higB4 | mRNA interferase toxin HigB | bi | A2U20_05455 | 99 |
| | A2U21_06305 | higB5 | mRNA interferase toxin HigB | bi | A2U20_05390 | 100 |
| | A2U21_06310 | higA6 | Antitoxin HigA_mRNA interferase antitoxin | bi | A2U20_05385 | 99 |
| | A2U21_06560 | cdtA | Toxin | bi | A2U20_04835 | 99 |
| | A2U21_06565 | cdtB | Cytolethal distending toxin subunit CdtB | bi | A2U20_05125 | 96 |
| | A2U21_06570 | cdtC | Toxin | bi | A2U20_04825 | 99 |
| | A2U21_06840 | cdtA2 | Toxin | uni | A2U20_04835 | 99 |
| | A2U21_06845 | cdtB2 | Cytolethal distending toxin subunit CdtB | uni | A2U20_04830 | 94 |
| | A2U21_06850 | cdtC2 | Toxin | uni | A2U20_04825 | 94 |
| | A2U21_08015 | relE3 | Addiction module toxin RelE | - | | |
| | A2U21_08020 | stbD | RelB/StbD replicon stabilization protein; antitoxin to RelE/StbE | - | | |
| | A2U21_08785 | tdeA | Putative membrane toxin/drug export protein A | bi | A2U20_02805 | 97 |
| | A2U21_08910 | higB6 | mRNA interferase toxin HigB | bi | A2U20_02710 | 96 |
| | A2U21_08915 | higA7 | Antitoxin HigA_mRNA interferase antitoxin | bi | A2U20_02705 | 93 |
| | A2U21_09500 | hipA2 | Toxin HipA | bi | A2U20_02115 | 95 |
| | A2U21_09900 | vapD | Virulence-associated protein D; endoribonuclease | - | | |
| | A2U21_10000 | vapC | VapC toxin family PIN domain ribonuclease | bi | A2U20_01850 | 78 |
| Other | A2U21_03770 | sodA | Superoxide dismutase | bi | A2U20_07890 | 100 |
| | A2U21_04050 | sodC | Superoxide dismutase | bi | A2U20_08325 | 92 |
| | A2U21_06535 | sirB | Invasion protein expression up-regulator SirB | bi | A2U20_05155 | 100 |
| | A2U21_06540 | sirA | Regulator of disulfide bond formation | bi | A2U20_05150 | 99 |
| | A2U21_08285 | vacJ | putative VacJ lipoprotein | bi | A2U20_03355 | 97 |
| | A2U21_09640 | espP | Putative serine protease | bi | A2U20_01715 | 78 |
| | A2U21_09655 | espP2 | Putative serine protease | bi | A2U20_01700 | 75 |
| | A2U21_11540 | nanH | Sialidase | bi | A2U20_00515 | 99 |

a The Hit column contains a '-' (no hit), 'uni' or 'bi' RAST server results from a one-to-one BLASTP comparison of the protein coding sequences in the Nagasaki genome using the D74 genome as the reference. “bi” represents a bidirectional best hit in which the reverse hit from the Nagasaki comparison genome to the D74 reference genome was also the best hit. “uni” indicates a uni-directional hit in which the reverse hit from the comparison genome to the reference genome was not also the best hit. “-” indicates no hit or match was found.

b Global pairwise nucleotide percent sequence identity.

Eight CDSs encoding predicted hemolysins were identified in D74 including two (locus tags A2U20_06885 and A2U20_07380) unique to strain D74 (Table 3). D74 harbors two putative Serralysin B genes, prtB and prtB2, and a putative Serralysin C gene, prtC. In contrast, only one putative Serralysin B gene, prtB, was identified in strain Nagasaki (Table 4). A notable size difference appears to exist between the proteins encoded by prtB, prtB2, and prtC in D74 (1,954, 1,703, and 1,911 amino acids, respectively) and the 910 amino acid putative Serralysin B protein encoded by prtB in Nagasaki. Nine CDSs encoding predicted hemolysins were identified in Nagasaki, five of which were identified as unique to Nagasaki. These include hpmA (A2U21_11150) and locus tags A2U21_00465, A2U21_11140, and A2U21_11145.
Nagasaki harbors two putative hemolysin transporter genes, shlB1 and shlB2 (chromosomally located next to each other), while D74 contains only one shlB gene (Tables 3 and 4). A notable size difference was also observed: the shlB gene in D74 encodes a putative 582 amino acid protein, compared to the putative 329 amino acid protein encoded by shlB in Nagasaki. Two CDSs predicted to function in secretion were identified in D74, A2U20_09675 and esiB, a gene encoding a putative secretory immunoglobulin A-binding protein, while three were identified in Nagasaki. These include A2U21_02375 and two genes encoding putative secretory immunoglobulin A-binding proteins, esiB and esiB2, chromosomally located next to each other. Thirty-nine CDSs encoding predicted toxins were identified in D74 and forty-six CDSs encoding predicted toxins were identified in Nagasaki. A noticeable difference between the CDSs encoding predicted toxins identified in D74 and Nagasaki was the seven cdiA genes encoding putative contact-dependent growth inhibition toxin A harbored only in strain D74 (Table 3). These cdiA genes harbored by D74 are discussed in more detail below. Both strains harbored a large number of toxin-antitoxin (TA) systems. TA systems are small genetic elements composed of two components, a stable protein toxin and its more labile antagonistic antitoxin, which can be a protein or non-coding RNA [52]. TA systems were originally identified as plasmid-borne loci that promote plasmid maintenance by killing daughter cells that lack the TA-encoding plasmid [52]. TA loci were subsequently discovered in numerous bacterial and archaeal chromosomes and provide several functions, such as stabilization of genomic regions, anti-addiction against similar plasmid-borne toxins, defense against phage infection, biofilm formation, control of the stress response, and bacterial persistence [52-54].
Six types of TA systems (types I to VI) have been described to date based on the type (either RNA or protein) and mode of action of the antitoxin [53]. The Type II TA system is highly abundant among prokaryotes and has been extensively studied [52, 54–56]. In type II TA systems, both toxin and antitoxin are small proteins encoded by genes in a bicistronic operon [52, 54–56]. The antitoxin blocks the toxicity of the toxin by forming a complex with it [52, 54–56]. Both D74 and Nagasaki contain several Type II TA families including relBE, mazEF, vapBC, and higBA. The higBA family was the most abundant, with 10 putative higBA loci identified in D74 and 6 putative higBA loci identified in Nagasaki, with a varying degree of similarity (Table 3 and Table 4). Eight orthologs of other virulence-associated proteins were identified in both strains (Table 3 and Table 4). A 75% sequence identity was observed between corresponding orthologs espP (A2U20_01700) in D74 and espP2 (A2U21_09655) in Nagasaki, and a 78% sequence identity was observed between corresponding orthologs espP2 (A2U20_01715) in D74 and espP (A2U21_09640) in Nagasaki (Table 3 and Table 4). Additionally, a notable size difference was observed for the putative proteins encoded by espP (A2U20_01700), 1,071 amino acids, and espP2 (A2U20_01715), 985 amino acids, in D74, compared to the putative proteins encoded by espP (A2U21_09640), 781 amino acids, and espP2 (A2U21_09655), 772 amino acids, in Nagasaki.
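
Because type II toxin and antitoxin genes sit together in a bicistronic operon, candidate higBA loci can be flagged by looking for neighbouring higA/higB annotations. The sketch below is a rough illustration only. It assumes that consecutive locus tags differing by 5 mark adjacent genes, a convention inferred from the numbering in Table 3 rather than stated by the authors. The helper names and the five-gene excerpt are hypothetical choices for the example. A real analysis would use genome coordinates and strand.

```python
from typing import List, Tuple

Gene = Tuple[str, str]  # (locus_tag, gene name)


def tag_number(locus_tag: str) -> int:
    """Numeric part of a locus tag, e.g. 'A2U20_02705' -> 2705."""
    return int(locus_tag.split("_")[1])


def family(name: str) -> str:
    """Strip a trailing copy number, e.g. 'higA5' -> 'higA'."""
    return name.rstrip("0123456789")


def adjacent_pairs(genes: List[Gene],
                   toxin: str = "higB",
                   antitoxin: str = "higA",
                   step: int = 5) -> List[Tuple[str, str]]:
    """Return neighbouring (left, right) gene names forming a toxin/antitoxin pair."""
    ordered = sorted(genes, key=lambda g: tag_number(g[0]))
    pairs = []
    for left, right in zip(ordered, ordered[1:]):
        # Assumption: a tag difference equal to `step` means the genes are adjacent.
        if tag_number(right[0]) - tag_number(left[0]) != step:
            continue
        if {family(left[1]), family(right[1])} == {toxin, antitoxin}:
            pairs.append((left[1], right[1]))
    return pairs


# Excerpt of D74 entries from Table 3 (not the complete list).
excerpt = [
    ("A2U20_02705", "higA2"),
    ("A2U20_02710", "higB"),
    ("A2U20_04570", "higA3"),
    ("A2U20_05030", "higA5"),
    ("A2U20_05035", "higB3"),
]
print(adjacent_pairs(excerpt))
```

On this excerpt, higA2/higB and higA5/higB3 are reported as neighbouring pairs, while higA3 has no adjacent higB partner in the excerpt.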

cdiA genes encoding putative contact-dependent growth inhibition proteins identified in H. parasuis D74

Contact-dependent growth inhibition (CDI) is a process used by Gram-negative bacteria to deliver diverse growth inhibiting nuclease toxins into the cytoplasm of neighboring cells upon cell-cell contact [57-61]. CDI is mediated by the CdiB and CdiA proteins, which are members of the TpsB and TpsA two-partner secretion (TPS) group of proteins [61-63]. CdiB facilitates secretion of the CdiA “exoprotein” onto the cell surface [61, 64]. CdiA then binds to specific outer-membrane receptors on susceptible bacteria and transfers its C-terminal toxin domain (CdiA-CT) into the target cell [58-61]. CDI+ bacteria also produce small immunity proteins (CdiI) that protect them from toxin delivered by neighboring cells of closely related species or sibling cells by binding to the CdiA-CT and neutralizing its toxin activity [58-61]. The putative cdiA genes in strain D74 are located in three regions with a cdiBAI organization with “orphan” cdiA genes located downstream, which is similar to the organization found in E. coli strains and other gamma-proteobacteria [61, 65]. Region one includes cdiA (A2U20_05480) and five upstream potential cdiI candidate genes (locus tags A2U20_05485-A2U20_05505) (Fig 3A). The second region encompasses fhaC (A2U20_06825) located upstream of cdiA2 (A2U20_06830); FhaC, consistent with other TpsB proteins, shares substantial homology with CdiB (Fig 3A) [66]. Nine potential cdiI candidate genes (locus tags A2U20_06835-A2U20_06845, A2U20_06855-A2U20_06860, A2U20_06870, and A2U20_06880-A2U20_06890) and cdiA3, cdiA4, and cdiA5 (locus tags A2U20_06850, A2U20_06865, A2U20_06875) are located downstream of cdiA2. Genes cdiA3, cdiA4, and cdiA5 were identified as “orphan” cdiA genes based on shared characteristics: they are smaller than full-length cdiA genes and resemble the 3’-ends of larger cdiA genes (Fig 3A) [65]. The third region contains the cdiA7 gene (A2U20_08295) with the shlB gene (A2U20_08300) located upstream (Fig 3).
Similar to FhaC and other TpsB proteins, ShlB shares substantial homology with CdiB [66]. A potential cdiI candidate (A2U20_08290), followed by two potential “orphan” cdiA genes, cdiA6 (A2U20_08285) and a predicted pseudogene fha (A2U20_08280), followed by two potential cdiI candidate genes (locus tags A2U20_08270 and A2U20_08265) are located downstream of cdiA7 (Fig 3A).
Fig 3

cdiA genes encoding putative contact-dependent growth inhibition proteins identified in H. parasuis D74.

(A) Organization of cdi loci in D74. The three genomic regions containing putative cdiA genes are depicted. Gene function was assigned based on the results of BLASTX searches. Each arrow represents a gene within the locus; direction of the arrow indicates orientation within the closed genome sequence. Dark blue arrows represent the cdiA genes, red arrows represent cdiB homologues, light blue arrows represent potential “orphan” cdiA genes, yellow arrows represent putative cdiI candidates, grey arrows represent genes predicted to have functions relating to horizontal gene transmission, and black arrows represent genes whose predicted function is unrelated to contact-dependent inhibition. Genomic regions are not shown to scale. (B) Domain architecture of the predicted CdiA proteins. Domain content of the CdiA proteins was determined using a Pfam database search. Grey boxes represent ESPR signal peptide domains (PF13018), green boxes represent Haemagg_act domains (PF05860), light blue boxes represent Fil_haemagg domains (PF05594), orange boxes represent Fil_haemagg_2 domains (PF13332), red boxes represent PT-VENN domains (PF04829), and purple boxes represent EndoU_bacteria domains (PF14436).

CdiA proteins share a number of characteristics in common with the TpsA family protein FHA [61, 67]. CdiA proteins typically contain an N-terminal region homologous to FHA containing the TPS domain required for interaction with the TpsB partner, CdiB or FhaC respectively, and haemagglutinin repeats that are predicted to form a β-helical structure [61, 67]. In addition, most cdiA homologues encode the VENN peptide motif, which delineates the beginning of C-terminal toxin domain as well as the conserved and variable regions [61, 67]. The predicted proteins encoded by the cdiA genes in D74 were evaluated for the presence of these domains.
Domains identified in CdiA include an ESPR signal peptide domain (PF13018) required for export and multiple haemagglutination activity domains, specifically one Haemagg_act domain (PF05860), seven Fil_haemagg domains (PF05594), and two Fil_haemagg_2 domains (PF13332) (Fig 3B). CdiA2 is similar to CdiA and contains an ESPR signal peptide domain (PF13018) required for export and multiple haemagglutination activity domains, including one Haemagg_act domain (PF05860), ten Fil_haemagg domains (PF05594), and two Fil_haemagg_2 domains (PF13332) (Fig 3B). CdiA3 and CdiA6 are similar to each other and contain a pre-toxin domain with a VENN motif that marks the beginning of the C-terminal toxin domain, similar to other previously reported CdiA proteins (Fig 3B). CdiA4 and CdiA5 are also similar to each other and both contain a Fil_haemagg_2 domain (PF13332) (Fig 3B). CdiA7 is the largest CdiA protein in D74 and contains an ESPR signal peptide domain (PF13018) required for export and multiple haemagglutination activity domains, including one Haemagg_act domain (PF05860), thirteen Fil_haemagg domains (PF05594), two Fil_haemagg_2 domains (PF13332), a pre-toxin VENN domain (PF04829), and an EndoU_bacteria nuclease domain (PF14436) (Fig 3B). Overall, cdiA3, cdiA4, cdiA5, and cdiA6 are similar to previously reported "orphan" cdiA genes as they encode much smaller proteins that lack a conserved export signal and other functional domains. The region containing the orphan cdiA genes was further evaluated and no additional cdi-related protein domains were found beyond those depicted in Fig 3B for CdiA3, CdiA4, CdiA5, and CdiA6. This indicates that the orphan cdiA genes are not the result of frameshift or indel mutations causing disruption of a larger intact cdiA gene(s). This genomic organization of a cdi locus containing one or more orphan cdiA genes downstream of full-length cdiA is common in many species of bacteria [65].
In contrast, CdiA, CdiA2, and CdiA7 are larger and contain most of the functional domains harbored by previously characterized CdiA-CT proteins; however, CdiA7 shares more similarity with other CdiA-CT proteins in that it additionally contains a PT-VENN motif and a recognizable nuclease domain at the C-terminus.
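
The distinction drawn above between larger CdiA proteins and "orphan" CdiA proteins can be summarized in a short Python sketch. The domain counts are transcribed from the preceding description. The classification rule (an ESPR export signal plus a Haemagg_act domain marks a full-length protein) and the function names are illustrative assumptions, not the criteria the authors applied.

```python
from typing import Dict

# Pfam domain counts per predicted CdiA protein, transcribed from the text.
CDIA_DOMAINS: Dict[str, Dict[str, int]] = {
    "CdiA":  {"ESPR": 1, "Haemagg_act": 1, "Fil_haemagg": 7, "Fil_haemagg_2": 2},
    "CdiA2": {"ESPR": 1, "Haemagg_act": 1, "Fil_haemagg": 10, "Fil_haemagg_2": 2},
    "CdiA3": {"PT-VENN": 1},
    "CdiA4": {"Fil_haemagg_2": 1},
    "CdiA5": {"Fil_haemagg_2": 1},
    "CdiA6": {"PT-VENN": 1},
    "CdiA7": {"ESPR": 1, "Haemagg_act": 1, "Fil_haemagg": 13,
              "Fil_haemagg_2": 2, "PT-VENN": 1, "EndoU_bacteria": 1},
}


def classify(domains: Dict[str, int]) -> str:
    """Hypothetical rule: export signal plus Haemagg_act means 'full-length'."""
    has_export = domains.get("ESPR", 0) > 0
    has_haemagg_act = domains.get("Haemagg_act", 0) > 0
    return "full-length" if has_export and has_haemagg_act else "orphan"


def has_toxin_tip(domains: Dict[str, int]) -> bool:
    """True when both a PT-VENN domain and an EndoU_bacteria domain are present."""
    return domains.get("PT-VENN", 0) > 0 and domains.get("EndoU_bacteria", 0) > 0


for name, domains in CDIA_DOMAINS.items():
    print(name, classify(domains), has_toxin_tip(domains))
```

Under this rule CdiA, CdiA2, and CdiA7 are labeled full-length, CdiA3 through CdiA6 are labeled orphan, and only CdiA7 carries both the PT-VENN and nuclease domains.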

vtaA family of trimeric autotransporter genes identified in H. parasuis strains D74 and Nagasaki

Pina et al. first identified thirteen proteins encoded by the vtaA family of trimeric autotransporter genes in strain Nagasaki based on the occurrence of a C-terminal YadA anchor domain, which defines this family of proteins [15]. That study also identified 17 homologues harbored by other H. parasuis strains with relatively conserved sequence within the passenger domain among vtaA homologues from pathogenic isolates and a high degree of divergence among non-virulent isolates [15]. The authors subsequently named these genes vtaA or virulence-associated trimeric autotransporter genes and classified the proteins encoded by vtaA genes into three groups based on sequence comparison of the C-terminal YadA anchor domain, with groups 1 and 2 being strongly associated with virulent H. parasuis isolates [15]. Previous studies have demonstrated that VtaA proteins are involved in virulence as well as being immunogenic, are produced during an infection, and are capable of conferring protection [68-70]. Recently, a comparison between the D74 and Nagasaki draft genomes identified only three vtaA genes harbored by D74 compared to thirteen harbored by Nagasaki [34]. Unfortunately, many vtaA genes identified within each strain were incomplete in the draft sequences, preventing a reliable one-to-one assignment of the vtaA-like ORFs to specific vtaA genes and subsequent evaluation of the predicted protein structure. To ensure identification of all potential vtaA genes, all protein coding sequences in Nagasaki and D74 were searched for the occurrence of a YadA anchor domain and no additional YadA anchor domain containing proteins were identified. The 13 vtaA genes identified in Nagasaki have been named according to their location along the chromosome and are listed in Table 5 along with their respective name and group originally assigned by Pina et al. [15]. 
Similarly, the 3 vtaA genes identified in D74 have been named according to their location along the chromosome and are also listed in Table 5. The predicted protein structure of all VtaA proteins for both Nagasaki and D74 was evaluated for known domain content (Fig 4). All Nagasaki VtaA proteins contain an N-terminal extended signal peptide or ESPR domain (PF13018) for Type V secretion, followed by 1–4 YadA head domains (PF05658) and 3–5 YadA stalk domains (PF05662). This region containing the head and stalk domains is referred to as the passenger domain region [71]. Following the passenger domain region, all Nagasaki VtaA proteins contain 2–8 collagen triple helix repeat domains (PF01391) followed by the C-terminal YadA anchor domain (PF03895) (Fig 4A). In contrast, none of the D74 VtaA proteins contain a collagen triple helix repeat domain (PF01391), and the predicted sizes of the VtaA_D1 (4054 AA) and VtaA_D2 (6778 AA) proteins are substantially larger than any predicted Nagasaki VtaA protein (Fig 4B). Additionally, VtaA_D3 differs from the VtaA_D1 and VtaA_D2 proteins. VtaA_D3 contains a tryptophan-ring motif domain or TAA-Trp-ring (PF15401) and does not contain an N-terminal ESPR domain (PF13018) (Fig 4B). The absence of the ESPR domain suggests that VtaA_D3 may not be exported across the inner membrane. When we compared our VtaA predicted protein sequences to those reported by Pina et al. [15], two differences were identified and the other 11 out of 13 sequences were found to be 100% identical. The two differences identified were an amino acid change in VtaA_N9 (S1191G) relative to the previously reported sequence and an 18 amino acid insertion (AGPTGPQGPAGPTGSQDP) after amino acid 737 of VtaA8 that is not present in our VtaA_N4 sequence.
The insertion is located within the third collagen repeat domain, and its presence or absence affects neither the prediction of this domain nor the total number of collagen repeat domains predicted for the two proteins.
Table 5

vtaA genes identified in H. parasuis Nagasaki.

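| Nagasaki locus_tag | Nagasaki Name | Name assigned by Pina et al. [15] | Group assigned by Pina et al. [15] | Hit (a) | D74 locus_tag | D74 Name | % Identity (b) |
|---|---|---|---|---|---|---|---|
| A2U21_00055 | vtaA_N1 | vtaA1 | 1 | - | | | 0 |
| A2U21_00360 | vtaA_N2 | vtaA10 | 2 | uni | A2U20_04585 | vtaA_D1 | 49.73 |
| A2U21_00905 | vtaA_N3 | vtaA4 | 1 | - | | | 0 |
| A2U21_03315 | vtaA_N4 | vtaA8 (c) | 1 | bi | A2U20_05995 | vtaA_D3 | 36.84 |
| A2U21_04400 | vtaA_N5 | vtaA9 | 1 | uni | A2U20_04585 | vtaA_D1 | 30.74 |
| A2U21_05305 | vtaA_N6 | vtaA6 | 1 | - | | | 0 |
| A2U21_05400 | vtaA_N7 | vtaA2 | 1 | - | | | 0 |
| A2U21_05640 | vtaA_N8 | vtaA11 | 2 | uni | A2U20_04585 | vtaA_D1 | 50.94 |
| A2U21_06175 | vtaA_N9 | vtaA3 (d) | 1 | - | | | 0 |
| A2U21_06350 | vtaA_N10 | vtaA12 | 1 | bi | A2U20_04585 | vtaA_D1 | 94.09 |
| A2U21_07110 | vtaA_N11 | vtaA13 | 1 | uni | A2U20_04585 | vtaA_D1 | 59.87 |
| A2U21_08125 | vtaA_N12 | vtaA5 | 3 | - | | | 0 |
| A2U21_09355 | vtaA_N13 | vtaA7 | 3 | - | | | 0 |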

a The Hit column contains a '-' (no hit), 'uni' or 'bi' RAST server results from a one-to-one BLASTP comparison of the protein coding sequences in the D74 genome using the Nagasaki genome as the reference. “bi” represents a bidirectional best hit in which the reverse hit from the D74 comparison genome to the Nagasaki reference genome was also the best hit. “uni” indicates a uni-directional hit in which the reverse hit from the comparison genome to the reference genome was not also the best hit. “-” indicates no hit or match was found.

b Local percent sequence identity.

c 18 amino acid insertion compared to the Pina et al. [15] sequence.

d Single amino acid difference at position 1191.

Fig 4

Domain architecture of the predicted VtaA proteins in H. parasuis strains D74 and Nagasaki.

(A) Domain architecture of the predicted VtaA proteins from Nagasaki. Assigned gene name designations are shown at left. Schematic depictions of the 13 Nagasaki VtaA proteins are shown. The domains were identified by a Pfam database search. ESPR signal peptides (PF13018) are shown in purple, YadA head domains (PF05658) in red, YadA stalk domains (PF05662) in blue, YadA anchor domains (PF03895) in green, collagen domains (PF01391) in orange, and TAA-Trp-ring domains (PF15401) in yellow. (B) Domain architecture of the predicted VtaA proteins from D74. Assigned gene name designations are shown at left. Schematic depictions of the Pfam domains, colored as in Panel A, are shown for the 3 VtaA proteins from D74.

Domain architecture of the predicted VtaA proteins in H. parasuis strains D74 and Nagasaki.

(A) Domain architecture of the predicted VtaA proteins from Nagasaki. Assigned gene name designations are shown at left. Schematic depictions of the 13 Nagasaki VtaA proteins are shown. The domains were identified by a Pfam database search. ESPR signal peptides (PF13018) are shown in purple, YadA head domains (PF05658) in red, YadA stalk domains (PF05662) in blue, YadA anchor domains (PF03895) in green, collagen domains (PF01391) in orange, and TAA-Trp-ring domains (PF15401) in yellow. (B) Domain architecture of the predicted VtaA proteins from D74. Assigned gene name designations are shown at left. Schematic depictions of the Pfam domains, colored as in Panel A, are shown for the 3 VtaA proteins from D74.

a The Hit column contains '-' (no hit), 'uni', or 'bi', taken from RAST server results of a one-to-one BLASTP comparison of the protein coding sequences in the D74 genome using the Nagasaki genome as the reference. “bi” represents a bidirectional best hit, in which the reverse hit from the D74 comparison genome to the Nagasaki reference genome was also the best hit. “uni” represents a unidirectional hit, in which the reverse hit from the comparison genome to the reference genome was not also the best hit. “-” indicates that no hit or match was found. b Local percent sequence identity. c 18 amino acid insertion compared to the Pina et al. [15] sequence. d Single amino acid difference at position 1191.

When the genome regions in Nagasaki and D74 were compared, the sequence and gene content upstream and downstream of the vtaA gene locations in Nagasaki were similar to the reciprocal locations in D74. No evidence of extensive genome rearrangement at these locations was observed. At D74 locations corresponding to a vtaA gene in Nagasaki, a different gene or genes were identified, which could have arisen from small insertion events. Further evaluation of the vtaA genes in both strains suggested that they are not in an operon configuration. Pina et al. identified an active promoter upstream of vtaA_N3 [15]. In silico analysis indicated that the upstream regions of all 13 Nagasaki vtaA genes contained promoter sequences highly similar to the vtaA_N3 promoter. A promoter sequence highly similar to that of vtaA_N3 was additionally observed upstream of vtaA_D2 in D74. This sequence conservation implies similar expression and/or regulation mechanisms among these genes.
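The '-', 'uni', and 'bi' hit categories defined in the footnotes above can be illustrated with a short sketch. It assumes the best BLASTP hit for each protein has already been computed in both directions and stored in plain dictionaries; the function name and the gene names in the usage note are hypothetical, and this is not the RAST server's implementation.

```python
def classify_hits(ref_to_cmp, cmp_to_ref, ref_genes):
    """Label each reference gene '-', 'uni', or 'bi'.

    ref_to_cmp: best hit in the comparison genome for each reference gene.
    cmp_to_ref: best hit in the reference genome for each comparison gene.
    (Both dictionaries are assumed inputs, e.g. parsed from BLASTP output.)
    """
    labels = {}
    for gene in ref_genes:
        hit = ref_to_cmp.get(gene)
        if hit is None:
            labels[gene] = "-"      # no hit or match found
        elif cmp_to_ref.get(hit) == gene:
            labels[gene] = "bi"     # reverse best hit points back to this gene
        else:
            labels[gene] = "uni"    # reverse best hit is a different gene
    return labels
```

For example, with hypothetical inputs `ref_to_cmp = {"n1": "d1", "n2": "d1"}` and `cmp_to_ref = {"d1": "n1"}`, the genes `n1`, `n2`, and `n3` would be labeled 'bi', 'uni', and '-' respectively.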

Methylation motifs and RM-systems in H. parasuis strains D74 and Nagasaki

In bacteria, the most common post-replicative modification of DNA is methylation by methyltransferase (MTase) enzymes, which results in three types of epigenetic markers: N6-methyladenine (m6A), N4-methylcytosine (m4C), and 5-methylcytosine (m5C) [47, 72]. DNA methylation serves several key roles in bacterial processes, including mismatch repair, the timing of DNA replication, protection against bacteriophages, and regulation of gene expression [73-78]. Analysis of the SMRT DNA sequencing kinetics was used to identify total base modifications in the genomes of H. parasuis D74 and Nagasaki, and the modified sequence motifs for each strain are summarized in Tables 6 and 7.
Table 6

Methylation motifs detected in H. parasuis D74.

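| Motif (a) | Modification Type | # Detected | # in Genome | % Detected | Mean Modification QV | Mean Motif Coverage | Partner Motif |
|---|---|---|---|---|---|---|---|
| AGCNNNNNGCT | m6A | 767 | 772 | 99.4% | 535.6 | 376.2 | AGCNNNNNGCT |
| GTANNNNNNTGG | m6A | 817 | 825 | 99.0% | 436.6 | 380.3 | CCANNNNNNTAC |
| CCANNNNNNTAC | m6A | 813 | 825 | 98.5% | 421.3 | 381.9 | GTANNNNNNTGG |
| AAGCTT | m6A | 623 | 630 | 98.9% | 442.4 | 384.0 | AAGCTT |
| GATC | m6A | 15,534 | 15,718 | 98.8% | 410.7 | 381.5 | GATC |
| GTAHNNNNNNCTTG | m6A | 217 | 220 | 98.6% | 420.1 | 378.9 | |
| CAAGNNNNNGNTAC | m6A | 55 | 57 | 96.5% | 397.1 | 348.8 | |
| AGCNNNNGGATC | m6A | 47 | 55 | 85.5% | 346.8 | 365.8 | |
| GCAGGVNNDG | m6A | 316 | 659 | 48.0% | 99.1 | 382.2 | |
| VAAGCTCKD | m6A | 207 | 447 | 46.3% | 131.9 | 391.1 | |
| AHBYAGYAD | m6A | 662 | 2,632 | 25.2% | 108.6 | 376.7 | |
| DDTGTNDNDG | modified_base (b) | 1,575 | 8,346 | 18.9% | 52.3 | 356.8 | |
| TNNNNNNH | modified_base (b) | 179,073 | 1,197,696 | 15.0% | 52.7 | 367.5 | |
| DTNVVNDDG | modified_base (b) | 9,920 | 78,834 | 12.6% | 48.6 | 366.4 | |
| AGNNNNNH | m6A | 15,854 | 200,853 | 7.9% | 112.6 | 371.8 | |

a Bold underlined bases indicate methylated base in motif sequence.

b Base modification not identified or recognized by software.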
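The motifs in the table are written with IUPAC ambiguity codes, and the "% Detected" column is the ratio of modified to total motif occurrences. The sketch below shows one way such degenerate motifs could be counted in a sequence and how the percentage follows from the two counts. The function names are hypothetical, the search covers only the strand supplied, and this is not the SMRT analysis software's own procedure.

```python
import re

# Standard IUPAC nucleotide ambiguity codes as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def motif_to_regex(motif):
    """Compile a degenerate motif; the lookahead keeps overlapping matches."""
    body = "".join(IUPAC[base] for base in motif.upper())
    return re.compile("(?=(" + body + "))")

def count_motif(sequence, motif):
    """Count occurrences of the motif on the given strand only."""
    return sum(1 for _ in motif_to_regex(motif).finditer(sequence.upper()))

def percent_detected(n_detected, n_in_genome):
    """Percentage of motif occurrences called as modified, to one decimal."""
    return round(100.0 * n_detected / n_in_genome, 1)
```

For instance, `percent_detected(767, 772)` reproduces the 99.4% listed for AGCNNNNNGCT. Counting both strands would additionally require searching the reverse complement of the sequence.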

Table 7

Methylation motifs detected in H. parasuis Nagasaki.

| Motif (a) | Modification Type | # Detected | # in Genome | % Detected | Mean Modification QV | Mean Motif Coverage | Partner Motif |
|---|---|---|---|---|---|---|---|
| AAGNNNNNCTT | m6A | 1,322 | 1,324 | 99.8% | 719.7 | 571.6 | AAGNNNNNCTT |
| AACNNNNNTGG | m6A | 1,089 | 1,092 | 99.7% | 676.2 | 592.7 | CCANNNNNGTT |
| CCANNNNNGTT | m6A | 1,088 | 1,092 | 99.6% | 700.0 | 622.7 | AACNNNNNTGG |
| GTANNNNNNNCTTG | m6A | 217 | 218 | 99.5% | 641.8 | 610.1 | CAAGNNNNNNNTAC |
| CAAGNNNNNNNTAC | m6A | 215 | 218 | 98.6% | 671.3 | 582.9 | GTANNNNNNNCTTG |
| GATC | m6A | 14,603 | 14,778 | 98.8% | 602.6 | 617.8 | GATC |
| AGGNNNNNCCT | m6A | 454 | 460 | 98.7% | 688.5 | 590.4 | AGGNNNNNCCT |
| AAGVNNNNCTT | m6A | 303 | 1,013 | 29.9% | 130.6 | 608.1 | |
| RAHDBAGYA | m6A | 721 | 2,881 | 25.0% | 107.0 | 582.3 | |
| TNNNNNNH | modified_base (b) | 181,179 | 1,123,132 | 16.1% | 58.7 | 579.9 | |
| DDTNVVNDDG | modified_base (b) | 9,439 | 61,834 | 15.3% | 54.8 | 582.8 | |
| TSNNKNNG | modified_base (b) | 7,871 | 64,478 | 12.2% | 52.9 | 587.1 | |
| AGDNNNNH | m6A | 11,510 | 134,477 | 8.6% | 178.5 | 591.9 | |
| BAGCNVNNH | m6A | 1,133 | 24,217 | 4.7% | 140.2 | 629.3 | |

a Bold underlined bases indicate methylated base in motif sequence.

b Base modification not identified or recognized by software.
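In the tables above, each listed "Partner Motif" is the reverse complement of its motif, and palindromic motifs such as GATC are their own partners. A minimal sketch of that relationship, assuming standard IUPAC complement rules (the function names are hypothetical):

```python
# IUPAC-aware complement table: R<->Y, K<->M, B<->V, D<->H; S, W, N unchanged.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")

def reverse_complement(motif):
    """Return the motif as read 5' to 3' on the opposite strand."""
    return motif.upper().translate(COMPLEMENT)[::-1]

def is_palindromic(motif):
    """True when the motif equals its own reverse complement."""
    return reverse_complement(motif) == motif.upper()
```

Applied to the Nagasaki motif AACNNNNNTGG, `reverse_complement` returns CCANNNNNGTT, which is the partner listed in Table 7.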

A total of 15 sequence motifs were identified in strain D74, and N6-methyladenine (m6A) was the most prevalent type of modification detected (Table 6). In strain Nagasaki, a total of 14 sequence motifs (recognition sites for methylation) were identified and, as in D74, m6A was the most prevalent type of modification detected (Table 7). This analysis revealed a surprising degree of diversity in the motifs observed between these closely related strains, given that the methylation motif 5’-GATC-3’ was the only motif observed in both D74 and Nagasaki (Table 6 and Table 7).

The genomes of H. parasuis Nagasaki and D74 were assessed using the Restriction Enzyme Database REBASE (www.rebase.neb.com) [47] to determine the putative MTases associated with each motif and to compare them with known modification systems. A total of 26 genes associated with restriction-modification (RM) systems were identified in H. parasuis D74, including 11 genes associated with Type I RM systems and 15 genes associated with Type II RM systems (Table 8). Genes associated with Type III RM systems were not identified in D74 (Table 8). Three of the recognition sequences predicted by REBASE corresponded to a specific motif detected by the SMRT sequencing analysis. REBASE analysis indicated that the putative Type I RM enzymes S.Hpa74III, Hpa74III, and M.Hpa74III were responsible for the 5’-GTA modification (Table 8). The putative Type II RM enzymes M.Hpa74I and M.Hpa74IP were indicated by REBASE to be responsible for the 5’-Am6AGCTT-3’ modification (Table 8). The putative Type II RM enzyme dam, or Hpa74II, was predicted to be responsible for the 5’-GAm6TC-3’ modification in D74 (Table 8). The remaining two motifs detected by the SMRT sequencing analysis did not correspond to a recognition sequence predicted by REBASE. Two putative Type II RM enzymes, Hpa74ORFHP and M.Hpa74ORFHP, were predicted by REBASE to recognize the sequence motif 5’-GGCC-3’, which was not a motif detected by the SMRT sequencing analysis. Three putative RM enzyme genes identified in D74 by REBASE analysis are predicted pseudogenes. These include S2.Hpa74ORFDP, Hpa74ORFJP, and M.Hpa74IP, which is associated with the Type II RM system indicated by REBASE to be responsible for the 5’-Am6AGCTT-3’ modification (Table 8). In contrast, none of the putative RM enzyme genes identified in Nagasaki by REBASE analysis are predicted pseudogenes.
Table 8

Putative H. parasuis D74 restriction modification systems.

| Type (a) | Gene (b) | D74 locus_tag | D74 Name | Predicted Recognition Sequence | REBASE Name |
|---|---|---|---|---|---|
| I | M | A2U20_05240 | hsdM | | M.Hpa74ORFFP |
| I | S | A2U20_05245 | hsdS | | S.Hpa74ORFFP |
| I | R | A2U20_05260 | hsdR2 | | Hpa74ORFFP |
| I | R | A2U20_10200 | hsdR3 | | Hpa74ORFIP |
| I | S | A2U20_10215 | hsdS2 | | S1.Hpa74ORFIP |
| I | S | A2U20_10220 | hypothetical protein CDS | | S2.Hpa74ORFIP |
| I | M | A2U20_10230 | hsdM2 | | M.Hpa74ORFIP |
| I | R | A2U20_10460 (c) | hsdR4 (c) | | Hpa74ORFJP |
| I | S | A2U20_11315 | hsdS3 | GTANNNNNNNCTTG | S.Hpa74III |
| I | R | A2U20_11320 | hsdR5 | GTANNNNNNNCTTG | Hpa74III |
| I | M | A2U20_11330 | hsdM3 | GTANNNNNNNCTTG | M.Hpa74III |
| II | M | A2U20_00375 | hindIIIM | AAGCTT | M.Hpa74I |
| II | R | A2U20_00380 (c) | hindIIIR (c) | AAGCTT | Hpa74IP |
| II | RM | A2U20_00575 | bcgIA | | Hpa74ORFBP |
| II | S | A2U20_00580 | bcgIB | | S.Hpa74ORFBP |
| II | S | A2U20_00800 | bcgIB | | S1.Hpa74ORFDP |
| II | S | A2U20_00805 (c) | bcgIB2 (c) | | S2.Hpa74ORFDP |
| II | M | A2U20_03405 | dam | GATC | Hpa74II |
| II | RM | A2U20_07465 | hypothetical protein CDS | | Hpa74ORFGP |
| II | R | A2U20_07700 | restriction endonuclease CDS | GGCC | Hpa74ORFHP |
| II | M | A2U20_07705 | haeIIIM | GGCC | M.Hpa74ORFHP |
| II | M | A2U20_10010 | yhdJ | | M.Hpa74ORFOP |
| II | M | A2U20_10475 | modification methylase CDS | | M.Hpa74ORFLP |
| II | R | A2U20_10480 | HNH endonuclease CDS | | Hpa74ORFLP |
| II | S | A2U20_10485 | bcgIB3 | | S.Hpa74ORFMP |
| II | RM | A2U20_10490 | bcgIA2 | | Hpa74ORFMP |

a Systems were designated Type I, II, or III based on REBASE analyses.

b Gene designations of methylase (M), restriction (R), fused restriction-modification (RM), or specificity (S), along with predicted recognition sequence and REBASE name, were determined using REBASE analysis.

c Predicted pseudogene.

Focusing on H. parasuis Nagasaki, 34 genes associated with restriction-modification systems were identified, including 14 genes associated with Type I, 15 genes associated with Type II, and 5 genes associated with Type III RM systems (Table 9). Only two of the recognition sequences predicted by REBASE corresponded to a specific motif detected by the SMRT sequencing analysis. REBASE analysis indicated that the putative Type I RM enzymes M.HpaNNII, HpaNNIIP, and S.HpaNNII were responsible for the 5’-GTA modification (Table 9). The putative Type II RM enzyme dam, or M.HpaNNI, was predicted to be responsible for the 5’-GAm6TC-3’ modification (Table 9). The remaining five motifs identified by the SMRT sequencing analysis represent as yet unknown recognition sequences.
Table 9

Putative H. parasuis Nagasaki restriction modification systems.

| Type (a) | Gene (b) | Nagasaki locus_tag | Nagasaki Name | Predicted Recognition Sequence (b) | REBASE Name (b) |
|---|---|---|---|---|---|
| I | M | A2U21_00095 | hsdM | GTANNNNNNNCTTG | M.HpaNNII |
| I | R | A2U21_00110 | hsdR | GTANNNNNNNCTTG | HpaNNIIP |
| I | S | A2U21_00115 | hsdS | GTANNNNNNNCTTG | S.HpaNNII |
| I | R | A2U21_01085 | hsdR2 | | HpaNNORFEP |
| I | S | A2U21_01100 | hsdS2 | | S1.HpaNNORFEP |
| I | S | A2U21_01105 | hypothetical protein CDS | | S2.HpaNNORFEP |
| I | M | A2U21_01115 | type I restriction-modification system subunit M CDS | | M.HpaNNORFEP |
| I | R | A2U21_04210 | hsdR4 | | HpaNNORFIP |
| I | S | A2U21_04220 | hsdS3 | | S1.HpaNNORFEP |
| I | S | A2U21_04225 | hsdS4 | | S2.HpaNNORFEP |
| I | M | A2U21_04230 | hsdM2 | | M.HpaNNORFEP |
| I | R | A2U21_06440 | hsdR5 | | HpaNNORFJP |
| I | S | A2U21_06445 | hsdS5 | | S.HpaNNORFJP |
| I | M | A2U21_06450 | hsdM3 | | M.HpaNNORFJP |
| II | S | A2U21_00855 | bcgIB | | S1.HpaNNORFDP |
| II | S | A2U21_00860 | bcgIB2 | | S2.HpaNNORFDP |
| II | RM | A2U21_00865 | bcgIA | | HpaNNORFDP |
| II | M | A2U21_03345 | hypothetical protein CDS | | M.HpaNNORFGP |
| II | M | A2U21_03850 | hhalM | GCGC | M.HpaNNORFHP |
| II | R | A2U21_03855 | type II RM endonuclease | GCGC | HpaNNORFHP |
| II | M | A2U21_06885 | hpaIIM | CCGG | M.HpaNNORFAP |
| II | M | A2U21_08235 | dam | GATC | M.HpaNNI |
| II | M | A2U21_09040 | bspRIM | CGCG | M.HpaNNORFLP |
| II | R | A2U21_09045 | hypothetical protein CDS | CGCG | HpaNNORFLP |
| II | M | A2U21_09985 | restriction endonuclease subunit M CDS | | M1.HpaNNORFMP |
| II | M | A2U21_09990 | restriction endonuclease CDS | | M2.HpaNNORFMP |
| II | RM | A2U21_09995 | restriction endonuclease CDS | | HpaNNORFMP |
| II | M | A2U21_10160 | hypothetical protein CDS | | M.HpaNNORFNP |
| II | M | A2U21_10520 | hypothetical protein CDS | BA | M.HpaNNORFOP |
| III | M | A2U21_01380 | restriction endonuclease subunit M CDS | GGAG | M1.HpaNNORFFP |
| III | M | A2U21_01385 | bamHIM | GGAG | M2.HpaNNORFFP |
| III | R | A2U21_01390 | type III restriction endonuclease subunit R CDS | | HpaNNORFFP |
| III | M | A2U21_11010 | site-specific DNA-methyltransferase CDS | | M.HpaNNORFPP |
| III | R | A2U21_11015 | restriction endonuclease CDS | | HpaNNORFPP |

a Systems were designated Type I, II, or III based on REBASE analyses.

b Gene designations of methylase (M), restriction (R), fused restriction-modification (RM), or specificity (S), along with predicted recognition sequence and REBASE name, were determined using REBASE analysis.

Expression of MTases can undergo phase variation by slipped-strand mispairing due to the presence of simple sequence repeats (SSRs), such as homopolymeric tracts [79-83]. All of the putative RM genes identified in D74 (Table 8) and Nagasaki (Table 9) were searched for the presence of SSRs within the coding region and in the region encompassing 150 bp upstream of the putative start codon. No SSRs were observed in the upstream region for five RM genes identified in D74 (A2U20_00805, A2U20_10480, A2U20_10215, A2U20_11315, and A2U20_10200), while six homopolymeric tracts consisting of five or more bases were observed in the upstream region of hsdR2 (A2U20_05260) (S6 Table). SSRs were observed within the coding region of all of the D74 RM genes, and the number of SSRs ranged from two (A2U20_00580 and A2U20_11330) to 38 (A2U20_07465) (S6 Table). No SSRs were observed in the upstream region for six RM genes identified in Nagasaki (A2U21_00115, A2U21_01085, A2U21_04210, A2U21_03345, A2U21_03850, and A2U21_06885), while six homopolymeric tracts consisting of five or more bases were observed in the upstream region of A2U21_09985 (S7 Table). SSRs were observed within the coding region of all of the Nagasaki RM genes, and the number of SSRs ranged from two (A2U21_10520) to 31 (A2U21_04210 and A2U21_00110) (S7 Table). While further studies are warranted, the occurrence of these homopolymeric tracts within these regions indicates the potential of these genes to undergo phase variation by slipped-strand mispairing.
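The search for homopolymeric tracts of five or more bases in the 150 bp upstream of a start codon can be sketched as follows. The tool actually used for the SSR search is not stated here, so this is only an illustrative approach; the function names are hypothetical, and the upstream helper handles forward-strand genes only.

```python
import re

def homopolymeric_tracts(sequence, min_length=5):
    """Return (start, base, length) for each run of a single base
    that is at least min_length long (0-based start positions)."""
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_length - 1))
    return [(m.start(), m.group(1), len(m.group(0)))
            for m in pattern.finditer(sequence.upper())]

def upstream_region(genome, start, length=150):
    """Bases immediately upstream of a forward-strand start codon.

    `start` is the 0-based index of the first base of the start codon;
    the region is truncated if the gene lies near the sequence start.
    """
    return genome[max(0, start - length):start]
```

A gene on the reverse strand would need the downstream forward-strand bases to be extracted and reverse complemented before the same tract search is applied.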

Conclusions

This report provides the closed whole-genome sequence annotation and genome-wide methylation patterns for the H. parasuis non-virulent D74 strain and for the highly virulent Nagasaki strain. This collective information will enable reliable one-to-one assignment of specific genes of interest and subsequent evaluation of predicted protein structures. Highlights of the information gained from this study include the sequence and annotation of a plasmid harbored by strain D74 that shares a high degree of similarity with other plasmids harbored by members of the Pasteurellaceae family, which could prove useful in future allelic replacement and/or functional genomic studies. Evaluation of the virulence-associated genes contained within the genomes of D74 and Nagasaki led to the discovery of a large number of TA systems, primarily Type II TA families, within both genomes. Five predicted hemolysins were identified as unique to Nagasaki, and seven putative contact-dependent growth inhibition toxin proteins were identified only in strain D74. Assessment of all potential vtaA genes revealed thirteen present in the Nagasaki genome and three in the D74 genome. Subsequent evaluation of the predicted protein structure revealed that none of the D74 VtaA proteins contain a collagen triple helix repeat domain and that the predicted protein sequences of two D74 VtaA proteins are much longer than that of any predicted Nagasaki VtaA protein. Fifteen methylation sequence motifs were identified in D74 and fourteen methylation sequence motifs were identified in Nagasaki using SMRT sequencing analysis. Only one of the methylation sequence motifs was observed in both strains, highlighting the diversity between D74 and Nagasaki. Subsequent analysis also revealed diversity in the restriction-modification systems harbored by D74 and Nagasaki.
Our hope is that the assembly and annotation of these genomes, coupled with the comparative genomic analyses reported in this study, will aid in the identification of genetic elements that underlie and influence phenotypic differences between these isolates. Together, this information can support future research and the development of vaccines with improved efficacy towards H. parasuis in swine to decrease the prevalence and disease burden caused by this pathogen.

D74 CDS list. (XLSX)

Nagasaki CDS list. (XLSX)

AMR MIC data. (XLSX)

D74 capsule genes. (DOCX)

Nagasaki capsule genes. (DOCX)

D74 RM gene SSRs. (XLSX)

Nagasaki RM gene SSRs. (XLSX)

D74 annotations. (GB)
