S1 gene-based phylogeny of infectious bronchitis virus: An attempt to harmonize virus classification.

Viviana Valastro¹, Edward C Holmes², Paul Britton³, Alice Fusaro⁴, Mark W Jackwood⁵, Giovanni Cattoli⁴, Isabella Monne⁴.

Abstract

Infectious bronchitis virus (IBV) is the causative agent of a highly contagious disease that results in severe economic losses to the global poultry industry. The virus exists in a wide variety of genetically distinct viral types, and both phylogenetic analysis and measures of pairwise similarity among nucleotide or amino acid sequences have been used to classify IBV strains. However, there is currently no consensus on the method by which IBV sequences should be compared, and heterogeneous genetic group designations that are inconsistent with phylogenetic history have been adopted, leading to the confusing coexistence of multiple genotyping schemes. Herein, we propose a simple and repeatable phylogeny-based classification system combined with an unambiguous and rational lineage nomenclature for the assignment of IBV strains. By using complete nucleotide sequences of the S1 gene we determined the phylogenetic structure of IBV, which in turn allowed us to define 6 genotypes that together comprise 32 distinct viral lineages and a number of inter-lineage recombinants. Because of extensive rate variation among IBVs, we suggest that the inference of phylogenetic relationships alone represents a more appropriate criterion for sequence classification than pairwise sequence comparisons. The adoption of an internationally accepted viral nomenclature is crucial for future studies of IBV epidemiology and evolution, and the classification scheme presented here can be updated and revised as novel S1 sequences become available.

Keywords: Classification; Evolution; IBV; Phylogeny

Substances:
Viral Envelope Proteins

Year: 2016 PMID: 26883378 PMCID: PMC7172980 DOI: 10.1016/j.meegid.2016.02.015

Source DB: PubMed Journal: Infect Genet Evol ISSN: 1567-1348 Impact factor: 3.342

Introduction

Infectious bronchitis virus (IBV) is the etiological agent of an acute and highly contagious disease that affects chickens of all ages and poses a major economic burden on the poultry industry. The virus exists in a wide range of antigenically and genetically distinct viral types, making the prevention and the control of this important pathogen both complex and challenging. Although the natural host of IBV is the chicken, the presence of IBV-like and other avian coronaviruses in both domestic and wild animals, including domestic fowl, partridge, geese, pigeon, guinea fowl, teal, duck and peafowl, has been reported (Cavanagh, 2007, Cavanagh, 2005). IBV is a single-stranded, positive-sense RNA virus of the family Coronaviridae, genus Gammacoronavirus (Cavanagh and Naqi, 2003; International Committee on Taxonomy of Viruses, http://www.ictvonline.org/virustaxonomy.asp). The viral genome comprises two untranslated regions (UTRs) at the 5′ and 3′ ends (Boursnell et al., 1987, Ziebuhr et al., 2000), two overlapping open reading frames (ORFs) encoding the polyproteins 1a and 1ab, and regions encoding the main structural proteins: spike (S), envelope (E), membrane (M) and nucleocapsid (N) (Spaan et al., 1988, Sutou et al., 1988). In addition, two accessory genes, ORF3 and ORF5, expressing proteins 3a and 3b and 5a and 5b, respectively, have been described (Casais et al., 2005, Hodgson et al., 2006, Lai and Cavanagh, 1997). The S protein (encoded by a gene of ~3462 nt), located on the surface of the viral membrane, is the major inducer of neutralizing antibodies (Cavanagh and Naqi, 1997, Winter et al., 2008) and is responsible for virus binding and entry to host cells (Cavanagh et al., 1986a, Koch et al., 1990, Niesters et al., 1987). It is post-translationally cleaved into the amino-terminal S1 (~535 amino acids) and the carboxyl-terminal S2 (~627 amino acids) subunits at a multi-basic cleavage site (Cavanagh et al., 1986b).
The observation that IB serotypes may differ by 20% to 25% at the genomic scale, and up to 50% of amino acids in the S1 protein (Cavanagh et al., 2005), has warranted considerable attention (Cavanagh and Gelb, 2008). Such variability may lead to important biological differences between strains and novel serotypic variants can emerge as the result of a limited number of amino acid changes in the spike protein. Nucleotide heterogeneity is most prevalent in the S1 portion of the S gene and largely contained within three different hypervariable regions (HVRs) (aa 38–67, 91–141 and 274–387) (Cavanagh et al., 1988, Moore et al., 1997). Accordingly, the analysis of the complete or partial S1 gene nucleotide sequence has been conventionally used to determine viral genetic types. Currently, more than 50 different antigenic and genetic types of IBV have been recognized, some with substantial economic impact on the livestock industry, and some others restricted to specific geographical areas (de Wit et al., 2011a, Jackwood, 2012). Effective surveillance is primarily based on the identification of the virus type causing disease (Jackwood and de Wit, 2013). A variety of methods have been developed to differentiate IBV strains. Systems that examine the antigenic or genetic features of an isolate result in the description of serotypes and genotypes, respectively, whereas methods that are focused on the immune response of chickens against challenge with an IBV strain lead to the designation of protectotypes (Lohr, 1988). Importantly, however, the genotype-, serotype- or protectotype-based approaches do not always group IBVs in the same way. In the absence of fast and appropriate biological assays for IBV classification, analyses of S1 sequence data are the most widely used means to assign IBV strains to groups, arbitrarily and confusingly defined as genetic types, genotypes, clades or clusters. 
Both phylogenetic analysis and measures of pairwise similarity between nucleotide and amino acid sequences have been used for this purpose. However, there is no agreement on the exact method by which sequences should be compared nor on the criteria used to distinguish viral genetic types. This is in part due to the rapid appearance of novel variants and a lack of consistency and uniformity in the nomenclature of the IBV genetic groups. For example, several genotyping studies have been performed on IBV within a specific geographic area without considering a more global context (see below). As a consequence, different clade designations, such as ‘Korean New Cluster II’ (Lim et al., 2012), ‘JP-IV’ (Mase et al., 2010) and ‘Chinese New Type’ (Li et al., 2013), have been assigned to describe closely related viruses. Further confusion arises because different regions of the S1 subunit have been used to infer phylogenetic trees, and which region is most informative is debated (Kingham et al., 2000, Lee et al., 2003, Li et al., 2012, Mo et al., 2013, Schikora et al., 2003, Wang and Huang, 2000). Although it is generally true that longer sequences are more informative, several laboratories use only a part of S1 that can include one or more HVRs. The study described here was performed with the aim of constructing a comprehensive, reliable and robust phylogenetic inference on a global scale as the basis for classifying IB viruses for epidemiological purposes. Due to its variability and biological function, the S1 gene is the region commonly sequenced as an ideal target in molecular assays to type IBV strains. Accordingly, we focused on the complete S1 gene. Using all publicly available S1 gene sequence data, our goal was to determine the genetic structure of IBV and to propose a rational and standardized nomenclature of the IBV genetic groups identified here, referred to as lineages.
In addition, we evaluated the ability of S1 fragments of different sizes to recapitulate the phylogeny and classification obtained from full-length S1 sequences.

Materials and methods

All available nucleotide sequences corresponding to the complete coding sequence of the S1 gene (~1620 bp) of IBV (n = 1652) were downloaded from GenBank (http://www.ncbi.nlm.nih.gov). Details on these sequences, including their genotype and serotype, were extracted from the GenBank annotations. Sequences shorter than 1440 bp and those of low quality, for example resulting in a nonsense and/or truncated S1 protein, or identical in both sequence and strain name were removed, resulting in a final data set of 1518 sequences. An alignment of the complete S1 gene was performed with a slow and iterative refinement method (FFT-NS-i) implemented in Mafft v.7.0 (http://mafft.cbrc.jp/alignment/software/; Katoh and Standley, 2013) and a maximum likelihood (ML) phylogenetic tree was estimated (see below). The initial ML tree revealed that some previously recognized IBV groups did not form monophyletic groups (Supplementary material Fig. S1; see Results), indicative of inter-lineage recombination events that are relatively frequent in IBV (Cavanagh et al., 1992b, Kottier et al., 1995, Lee and Jackwood, 2000). To confirm the occurrence of recombination, smaller sequence data sets comprising the suspected recombinant and the putative parental strains were analyzed using the RDP, Geneconv, Maxchi, BootScan, 3Seq and Chimaera methods available in the RDP package v.4 (Martin et al., 2010), applying default settings. The Simplot program v.3.5 was also used to define the locations of recombination break-points (Lole et al., 1999). We considered “true recombinants” to be those sequences identified by at least two methods (P < 1 × 10⁻¹⁰) and confirmed by significant phylogenetic incongruence among trees estimated on either side of the putative recombination break-points. All sequences with a history of recombination determined in this manner were removed from the original 1518 sequence data set used to identify ‘pure’ IBV lineages, and are instead described as recombinant IBV forms (see Results).
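The sequence-level exclusion criteria described above (length below 1440 bp, an in-frame stop codon, or a duplicate in both sequence and strain name) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the `Record` type, the function names and the assumption that each sequence starts in frame are hypothetical and introduced only for illustration.

```python
from typing import NamedTuple

STOP_CODONS = {"TAA", "TAG", "TGA"}
MIN_LENGTH = 1440  # length cut-off (bp) stated in the text


class Record(NamedTuple):
    """Hypothetical container for one downloaded S1 sequence."""
    name: str       # strain name
    sequence: str   # S1 coding sequence, assumed to start in frame


def has_premature_stop(seq: str) -> bool:
    """True if an in-frame stop codon occurs before the final codon."""
    seq = seq.upper()
    last_full = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, last_full, 3)]
    return any(codon in STOP_CODONS for codon in codons[:-1])


def quality_filter(records):
    """Drop short sequences, sequences with premature stops, and
    records identical in both strain name and sequence."""
    seen = set()
    kept = []
    for rec in records:
        key = (rec.name, rec.sequence.upper())
        if len(rec.sequence) < MIN_LENGTH:
            continue
        if has_premature_stop(rec.sequence):
            continue
        if key in seen:
            continue
        seen.add(key)
        kept.append(rec)
    return kept
```

Two records with the same sequence but different strain names are both retained, since the text removes only entries identical in both respects.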
In addition, a number of sequences were considered to be unreliable due to a lack of congruence between the strain description in the associated publication and the corresponding nucleotide sequences. This quality control step resulted in a final data set of 1286 full-length S1 sequences, which was used to determine the phylogenetic relationships among IBV strains and to classify them into well-established lineages. Evolutionary distances between lineages and genotypes were inferred using the complete S1 data set, with pairwise (p-distance) comparisons of nucleotide and amino acid sequences performed using the MEGA6 program (Tamura et al., 2013). To facilitate tree visualization we performed an additional phylogenetic analysis using a smaller subset of full-length S1 sequences (n = 199). This subset comprised, where available, 6 representative sequences of each IBV lineage identified in the final ‘cleansed’ data set described above. In addition, 26 strains recognized as unique variants because they did not group with any of the identified lineages were included. Detailed information on the selected isolates, along with their corresponding nucleotide sequences, is provided as Supplementary materials (Table S1). The same data set was also used to assess whether the lineages established using phylogenetic analysis of the complete data set were maintained when only a portion of the S1 gene was analyzed. To that end, two different phylogenetic trees were inferred using the two most commonly sequenced regions, corresponding to the coding sequences of HVRs1 and 2, located between nucleotide positions 112 and 423, and HVR3, located between positions 820 and 1161 of the S1 gene, respectively (according to the sequence M21883). Finally, two additional data sets were created to determine whether there was sufficient temporal structure in the data to undertake a molecular clock dating analysis.
The first data set consisted of 372 sequences sampled between 1956 and 2013 and randomly selected from the complete data collection, while the second data set represented a single large lineage (here named lineage GI-19 but originally designated as QX) of relatively closely related viruses sampled between 1993 and 2010. Specifically, all the GI-19 S1 gene full-length sequences collected before the administration of the homologous vaccine in the field (n = 354) were selected. To assess the extent of temporal structure in these data, a regression of root-to-tip genetic distances against date of sampling was performed using the Path-O-Gen program v.1.4 (http://tree.bio.ed.ac.uk/software/pathogen/) based on an input ML phylogenetic tree (see below). In all cases phylogenetic trees were inferred using the ML method available in PhyML (Guindon et al., 2010) and implemented in Geneious v.7.1.8 (Kearse et al., 2012), employing a combination of NNI and SPR branch swapping. Prior to phylogenetic analysis, all hypervariable and potentially poorly aligned regions were removed using Gblocks (Castresana, 2000). For this analysis, a less stringent procedure, allowing for gap positions within final blocks, was employed. In addition, the best-fit model of nucleotide substitution was inferred using JModeltest v.2.1.4 (Darriba et al., 2012). Accordingly, the General Time Reversible (GTR) model with a discrete gamma distribution (Γ) and allowing for invariant sites (I) was selected in all data analyses based on AICc. Nodal supports in the PhyML analyses were assessed using Shimodaira–Hasegawa (SH)-like branch supports (Anisimova and Gascuel, 2006, Guindon et al., 2010). To further assess the robustness of the phylogenetic tree, additional analyses of the small S1 gene sequence data set were performed using the Bayesian approach within MrBayes v.3.2 (Huelsenbeck and Ronquist, 2001), and the Neighbor-Joining method available in MEGA6 (Tamura et al., 2013).
In both these cases we employed the GTR + I + Γ substitution model, with nodal support values obtained by posterior probabilities and 1000 bootstrap replicates, respectively. Topological congruence between trees was compared through visual inspection for (i) ML trees obtained for the complete (n = 1286) and the small data sets (n = 199), (ii) the ML trees estimated for the full-length S1 sequences (∼ 1620 nt) and those corresponding to the HVRs1 and 2 (312 nt) and HVR3 (342 nt) regions, and (iii) the ML, NJ and Bayesian trees all run on the small data set.
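As an illustration of two quantities used in this section, the sketch below extracts the HVR fragments from the coordinates quoted in the text and computes an uncorrected p-distance between two aligned sequences. It is a simplified stand-in, not the MEGA6 implementation: the function names are hypothetical, the slicing assumes that the input is already in M21883 coordinates, and skipping sites with a gap or an N is an assumption made here.

```python
# 1-based, inclusive nucleotide coordinates quoted in the text
# (relative to the S1 gene of sequence M21883).
HVR12 = (112, 423)
HVR3 = (820, 1161)


def slice_region(seq: str, start: int, end: int) -> str:
    """Return the 1-based inclusive region [start, end] of a sequence.

    Assumes `seq` shares the reference coordinate system, i.e. no
    alignment gaps shift the positions (an assumption of this sketch).
    """
    return seq[start - 1:end]


def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected p-distance: differing sites / compared sites.

    Sites with a gap ('-') or 'N' in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared
```

With these coordinates the two fragments are 423 − 112 + 1 = 312 nt and 1161 − 820 + 1 = 342 nt long, matching the fragment lengths given above.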

Results

Overall IBV data set

To assess the phylogenetic relationships among the IBV variants and develop a harmonized system to define and name viral lineages, we analyzed all full-length S1 gene IBV sequences available on GenBank. These data comprised 1652 nucleotide sequences obtained from field samples and IBV vaccine strains collected worldwide between 1937 and 2013. After quality control, we inferred a ML phylogenetic tree on a total of 1518 sequences (Fig. S1) with the aim of obtaining a picture of the global genetic variability of this pathogen.

Recombination analysis

The topology of the preliminary ML tree showed evidence for recombination among IBV lineages. In particular, although defined previously, the so-called QX, 793B and Italy02 genetic groups, here referred to as the GI-19, -13 and -21 lineages (see below), no longer appeared as monophyletic groups (Fig. S1). We therefore performed additional analyses to determine whether recombination had occurred within the S1 gene and how this may have impacted the tree topology. This revealed a total of 213 recombinant viruses, which were removed from the data set to enable a more robust phylogenetic inference and identification of major viral lineages. For the purposes of classification, we propose that such recombinant viruses are simply referred to as combinations of the 32 IBV lineages defined below. Recombination has clearly been of importance in shaping the evolution of some IBV variants. In particular, 143 viruses sampled in China (n = 107) and Korea (n = 36) since the 1990s were found to descend from parental strains belonging to the QX and HN08 (here referred to as lineage GI-22) genetic groups. This recombination involves, among others, viruses originally described as clustering into Chinese genotype III (Liu et al., 2006b), also known as the ck/CH/LSC/95I-type or tl/CH/LDT3/03I-type (Han et al., 2011, Mo et al., 2013, Sun et al., 2011), and those previously assigned to the ck/CH/LHLJ/95I-type and BJ-type cluster (Han et al., 2011). In addition, the Korean nephropathogenic strains already known to be recombinants and originally designated as New Cluster I (Lim et al., 2012, Lim et al., 2011) also fell into the group derived from the recombination between QX and HN08. Multiple recombination break-points were detected within this group, with most located between nucleotides 550 and 652 and between nucleotides 934 and 1125 (according to the sequence AY561711).
In 44 viruses we found evidence of inter-lineage recombination between the 793B and the QX or HN08 clades, thereby supporting previous observations (Mo et al., 2013). Notably, all sequences possessed break-points located between nucleotide positions 665 and 709 (according to the sequence AY561711). These strains were collected in China from 2004 to 2012 and some were originally grouped by phylogenetic analysis with the 793B or QX genetic groups (Ji et al., 2011). With the exception of a few viruses, the remaining recombinant sequences do not share any common break-points or parental strains and were a mosaic of diverse parental lineages. However, taken together, these results reveal that the majority of recombination break-points are located in the intermediate region between the HVRs1 and 2 and the HVR3.

Classification of IB viruses

Our phylogenetic analysis of 1286 IB strains (Fig. 1) was used to derive a new and coherent classification scheme for IBV based on the S1 gene. Not only is this the most variable region within the IBV genome, containing abundant phylogenetic information, but it is also the major immunogenic component and the most commonly sequenced region of the IBV genome. Accordingly, 32 IBV lineages, each of which was defined by strongly supported nodes (> 0.98 SH-like test support values), were identified using our expansive S1 gene phylogeny. The designation of “lineage” was arbitrarily assigned to monophyletic groups of at least three viruses sampled from at least two different outbreaks. Strains that do not cluster into any lineages according to these subjective criteria are labeled as unique variants (UV) in the phylogenetic tree (n = 26). The lineages further fall into 6 well-supported (i.e. SH-like test support values of 1.0) and more genetically divergent groups, herein termed “genotypes”; 27 lineages cluster into genotype I (GI), which includes the majority of the IBV strains, whereas the remaining 5 genotypes contain one lineage each.
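The lineage criteria stated above can be written as a small check. The sketch below is only an illustration: it assumes that the caller has already extracted a monophyletic clade and its SH-like support from a tree, and the `Strain` type, the `outbreak_id` field and the function name are hypothetical. Treating the > 0.98 support value as a hard threshold is also an assumption, since the text reports it as a property of the lineages found rather than as a formal rule.

```python
from dataclasses import dataclass

MIN_VIRUSES = 3      # at least three viruses (from the text)
MIN_OUTBREAKS = 2    # sampled from at least two outbreaks (from the text)
MIN_SUPPORT = 0.98   # SH-like support reported for lineage nodes


@dataclass(frozen=True)
class Strain:
    """Hypothetical record for one virus in a clade."""
    name: str
    outbreak_id: str  # hypothetical identifier of the source outbreak


def qualifies_as_lineage(clade, sh_support: float) -> bool:
    """Check a monophyletic clade against the stated lineage criteria.

    Monophyly itself is assumed to have been established upstream.
    """
    n_viruses = len({s.name for s in clade})
    n_outbreaks = len({s.outbreak_id for s in clade})
    return (
        n_viruses >= MIN_VIRUSES
        and n_outbreaks >= MIN_OUTBREAKS
        and sh_support > MIN_SUPPORT
    )
```

A clade failing this check would correspond to what the text labels a unique variant (UV) rather than a lineage.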

Fig. 1

Phylogenetic tree of complete S1 nucleotide sequences (1456 nt) of 1286 IBVs. The phylogeny shows the evolutionary relationships among all IBV genotypes and lineages proposed here. Each lineage is color-coded and its corresponding designation is reported. Unique variants (UVs) are marked in black. The red box designates the 27 lineages within GI. SH-like branch supports are shown for key nodes. The scale bar represents the number of nucleotide substitutions per site, and the tree is mid-point rooted for clarity only.

The IBV lineages defined in this manner exhibit uncorrected pairwise distances of 13% and 14% for nucleotide and amino acid sequences, respectively. Similarly, viral genotypes differed at 30% of nucleotides and 31% of amino acids. Importantly, however, because natural virus evolution is unlikely to always produce discrete boundaries, these distance values should only be considered as “rules of thumb” rather than universally valid parameters. Thus, IBV classification should not be undertaken on pairwise distance comparisons alone, but requires input from phylogenetic data. To avoid confusion, IBV lineages were labeled using the abbreviation of the genotype in which they fall, followed by a consecutive number assigned according to the temporal order of the collection date of the first virus detected per lineage, here referred to as the prototype strain. More details on the prototype viruses are provided in Table 1. The same temporal scheme was used to assign consecutive Roman numerals to the different genotypes. For those viruses collected in the same year and belonging to different lineages within GI we have followed the temporal order of their GenBank sequence submissions. Accordingly, they are labeled GI-1 to GI-27; the oldest IBV in the current study falls into lineage 1, whereas lineage 27 represents the most recently identified cluster within GI.
Moreover, to simplify the possible future designation of additional genetic variants, we assigned the number ‘1’ to all lineages outside genotype I, even though a second IBV lineage has not yet been detected in any of these five genotypes. Accordingly, they are labeled GII-1, GIII-1, GIV-1, GV-1 and GVI-1. Of note, the lineage GI-24 consists of Indian IB viruses that so far have not been included in a scientific publication or for which a phylogenetic analysis has not yet been performed.
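The naming rule described above (genotype abbreviation with a Roman numeral, followed by a consecutive lineage number ordered by the collection date of the prototype strain, with ties broken by GenBank submission order) can be sketched as follows. The function names and the tuple layout are hypothetical, and the `submission_rank` values in any input are placeholders rather than real submission dates.

```python
# Roman numerals for the six genotypes defined in the text.
ROMAN = {1: "I", 2: "II", 3: "III", 4: "IV", 5: "V", 6: "VI"}


def lineage_label(genotype: int, lineage: int) -> str:
    """Format a lineage name such as 'GI-19' or 'GVI-1'."""
    return f"G{ROMAN[genotype]}-{lineage}"


def assign_labels(genotype: int, prototypes):
    """Number the lineages of one genotype in temporal order.

    `prototypes` is a list of (strain_name, collection_year,
    submission_rank) tuples; `submission_rank` is a hypothetical
    stand-in for the GenBank submission order used to break ties
    between prototypes collected in the same year.
    """
    ordered = sorted(prototypes, key=lambda p: (p[1], p[2]))
    return {
        name: lineage_label(genotype, number)
        for number, (name, _year, _rank) in enumerate(ordered, start=1)
    }
```

For example, `lineage_label(1, 19)` yields the string `GI-19`, the name used here for the group formerly known as QX.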

Table 1

Prototype strains and period of circulation of each lineage (data based on the complete S1 nucleotide sequences of the viruses included in the analysis).

Lineage	Period of circulation	Prototype strain	Country of origin	Collection date	GenBank acc. number
GI-1	1937–2013	Beaudette	USA	1937	M95169
GI-2	1954–2006	Holte	USA	1954	GU393336
GI-3	1960–2006	Gray	USA	1960	L14069
GI-4	1962–1998	Holte	USA	1962	L18988
GI-5	1962–2012	N1/62	Australia	1962	U29522
GI-6	1962–2010	VicS	Australia	1962	U29519
GI-7	1964–2012	TP/64	Taiwan	1964	AY606320
GI-8	1965–1967	L165	USA	1965	JQ964061
GI-9	1973–2011	ARK99	USA	1973	M99482
GI-10	1970s–2000s	B	New Zealand	1970s	AF151954
GI-11	1975–2009	UFMG/G	Brazil	1975	JX182775
GI-12	1978–2006	D3896	The Netherlands	1978	X52084
GI-13	1983–2013	Moroccan-G/83	Morocco	1983	EU914938
GI-14	1984–2006	B1648	Belgium	1984	X87238
GI-15	1986–2008	B4	Korea	1986	FJ807932
GI-16	1986–2011	IZO 28/86	Italy	1986	KJ941019
GI-17	1988–1999	CA/Machado/88	USA	1988	AF419315
GI-18	1993–1999	JP8127	Japan	1993	AY296744
GI-19	1993–2012	58HeN-93II	China	1993	KC577395
GI-20	1996–1999	Qu_mv	Canada	1996	AF349621
GI-21	1997–2005	Spain/97/314	Spain	1997	DQ064806
GI-22	1997–2011	40GDGZ-97I	China	1997	KC577382
GI-23	1998–2012	Variant 2	Israel	1998	AF093796
GI-24	1998–2013	V13	India	1998	KF757447
GI-25	2004–2013	CA/1737/04	USA	2004	EU925393
GI-26	2006–2007	NGA/B401/2006	Nigeria	2006	FN182243
GI-27	2008–2013	GA08	USA	2008	GU301925
GII-1	1979–1984	D1466	The Netherlands	1979	M21971
GIII-1	1988–2008	N1/88	Australia	1988	U29450
GIV-1	1992–2003	DE/072/92	USA	1992	U77298
GV-1	2002–2008	N4/02	Australia	2002	DQ059618
GVI-1	2007–2012	TC07-2	China	2007	GQ265948

The sequence details, where available, were added to the strain name in the format: GenBank accession number, strain name (as reported in the public database), country of origin and collection date.

Phylogenetic analysis of the small data set

To further assess the reliability of our classification scheme and to better display the IBV phylogeny, we performed an additional ML phylogenetic analysis on a smaller, sub-sampled data set (n = 199), representative of IBV variability in the field (Fig. 2). The trees inferred from the complete and the small data sets had very consistent topologies; all lineage-defining branches are distinct from each other and strongly supported (> 0.97 SH-like support values). To confirm these findings, we analyzed the smaller data set using different phylogenetic methods. Importantly, equivalent branching patterns were obtained using both NJ and Bayesian methods (Supplementary materials Figs. S2 and S3). Accordingly, we suggest that this smaller data set is used as a reference tool for future epidemiological and evolutionary studies of IBV. The nucleotide sequences of the reference data set are provided as Supplementary materials (Table S1).

Fig. 2

Phylogenetic tree of complete S1 nucleotide sequences. The phylogeny contains a total of 199 IBV strains, including 6 representative sequences of each lineage detected and 26 strains recognized as unique variants. Each lineage is color-coded and its corresponding designation is reported. Bars reporting the genotypes in which the lineages fall are shown. GenBank accession number, isolate number or name, country of origin and collection date is given for each strain. The designation “UV” indicates unique variants, here marked in black. A complete list of the 199 sequences used is provided in Table S1. SH-like branch supports are shown for key nodes. The scale bar represents the number of nucleotide substitutions per site, and the tree is mid-point rooted for clarity only.

Description of individual lineages

We used a geography-based system (see below) to describe the 32 IBV lineages reported here. Because of their wide geographic distribution, some lineages are clearly of importance. Among these, lineages GI-1 and -13 (previously named as the Mass and 793B types, respectively) are commonly found, partly reflecting the use of vaccines derived from them in the countries where they have been reported. In contrast, other lineages are confined only to specific countries, many of which are limited to Asia and North America. Africa and South America possess unique lineages as well as some of the European-origin types. Notably, geographically distinct wild-type lineages were identified in Australia and New Zealand, likely reflecting their spatial isolation.

Widely distributed lineages

The GI-1 lineage comprises the first IBV serotype identified and even today is one of the best known and most widely distributed genetic groups, likely due to the extensive use of a homologous vaccine derived from one of its strains. In our data set this group contains 189 viruses collected worldwide (with the exception of Oceania), which were previously assigned to the Massachusetts (also known as Mass or M41), the H120 and the Connecticut (Jungherr et al., 1956) types. The Mass serotype, of which the M41 is the representative strain, is mainly associated with respiratory disease (Cavanagh and Naqi, 1997). The GI-13 lineage is present in many parts of the world and in our study comprises 70 viruses, both vaccine and virulent field strains, previously assigned to the 793B type (also known as 4/91 and CR88) (Gough et al., 1992, Parsons et al., 1992, Picault et al., 1995). Notably, the so-called Israeli variant 1 viruses are members of this lineage (Callison et al., 2001, Gelb et al., 2005). The first known strain of CR88 serotype was isolated in France in 1985 (Picault et al., 1995), whereas the 793B strain emerged in the United Kingdom in 1991 and was originally described as a unique serotype responsible for severe respiratory syndromes (Callison et al., 2001, Cook et al., 1996). A retrospective study revealed a 96% sequence similarity between a strain isolated in Morocco in 1983, which is here referred to as the GI-13 lineage prototype strain, and the 793B variant, suggesting that this North African virus is the progenitor of the lineage (Jones et al., 2004). Recently, this genetic type was identified for the first time in Canada in outbreaks with predominantly respiratory disease and/or egg production problems (Martin et al., 2014). The largest number of IBV strains included in the present investigation comes from the GI-19 lineage that contains 546 viruses collected between 1993 and 2012. 
The GI-19 variant, the so-called QXIBV strain, was first detected in China in 1996 where it was associated predominantly with severe nephritis, ‘false layer’ syndrome and potentially proventriculitis (Wang et al., 1998). Since then, several QX-type strains have been identified in China, although most cases have been associated with renal pathology (Liu et al., 2006b). In Europe, numerous reports described QX-like strains following the Chinese index case (Abro et al., 2011, Beato et al., 2005, de Wit et al., 2011b, Gough et al., 2008, Monne et al., 2008, Valastro et al., 2010, Worthington et al., 2008). At the same time, the first QX-like strains were identified in Japan (Ariyoshi et al., 2010, Mase et al., 2004) and Korea (Lee et al., 2008). Thereafter, it was soon reported in such diverse localities as Russia, Africa and the Middle East (Amin et al., 2012, Bochkov et al., 2006, Toffan et al., 2011). Thus, all strains falling in the GI-19 lineage have been previously assigned to the QX clade, also called LX4 (Han et al., 2011, Li et al., 2013, Liu et al., 2009) and A2 (Ji et al., 2011, Li et al., 2010). Confusingly, the same genetic group has also been referred to as the Korean-II (K-II) (Lee et al., 2008) and Japanese-III (JP-III) clusters (Ariyoshi et al., 2010, Mase et al., 2004). Of note, a recently submitted sequence (KC577395) shows that the lineage had arisen in China by 1993. The GI-16 lineage contains 19 viruses collected between 1986 and 2011 in China, Taiwan and Italy, previously classified as the Q1 (also known as T3 and J2) or ck/CH/LDL/97I type (Liu et al., 2006b, Yu et al., 2001). The designation of Korean III genotype (K-III) was also used to describe Korean strains clustering with Chinese viruses of LDL-like type.
Notably, the classification into K-III was performed using phylogenetic analyses of partial S1 gene sequences (620–642 nt) (Lee et al., 2008), and to our knowledge no complete S1 nucleotide sequences of this genetic group are currently available. The GI-16 lineage has been associated with respiratory syndrome (Ababneh et al., 2012, Yu et al., 2001), severe drops in egg production (de Wit et al., 2012) and nephropathogenic disease (Huang et al., 2004, Toffan et al., 2013). Although it is known to have a more widespread geographic distribution, we only include sequences from three countries. After the first isolation of the Q1 strain in China between 1996 and 1998 (Yu et al., 2001), the lineage was reported in Taiwan in 2002 (Huang et al., 2004), in South America since 2009 (Marandino et al., 2015, Sesti et al., 2014), in some Middle Eastern countries (Ababneh et al., 2012) and in Italy (Toffan et al., 2013) in 2011, and in Colombia in 2012 (Jackwood, 2012). Of note, our phylogenetic analysis reveals that the GI-16 prototype strain is an Italian virus – IZO28/86 – isolated in 1986, approximately 10 years before the first identification of the Q1 strain in China.

Indigenous Asian lineages

In addition to those of European and American origin, we found 6 different lineages to be geographically strictly confined to Asia, of which one constitutes a different genotype (GVI-1). Thus, two distinct genotypes have been present and are probably still circulating in this continent. Most GI-7 lineage viruses were associated with nephropathogenic diseases in infected chickens (Huang et al., 2004). The lineage was detected in Taiwan and China and comprises a total of 43 isolates; the majority were isolated after 1988, with the exception of the TP/64 strain, which was isolated in Taiwan in 1964 from layers showing respiratory problems and a drop in egg production (Huang et al., 2004). Due to high nucleotide sequence similarity (90%), we propose the existence of a single group comprising strains previously assigned to two different genetic groups referred to as Taiwan-I (TW-I) and Taiwan-II (TW-II) (Liu et al., 2003, Wang and Tsai, 1996). GI-15 consists of 11 respiratory strains collected exclusively in Korea between 1986 and 2008 and previously placed into the genotype named Korean I (K-I) (Hong et al., 2012, Lee et al., 2010, Lee et al., 2008, Song et al., 1998). The GI-18 lineage comprises 3 Japanese and 2 Chinese viruses collected between 1993 and 1999, and contains both respiratory and nephropathogenic strains (Mase et al., 2004, Shieh et al., 2004). The lineage was designated as Japan I (JP-I) (Ariyoshi et al., 2010, Mase et al., 2004, Shieh et al., 2004), as it originally contained only Japanese wild type field strains. However, it is clear that the lineage is no longer confined to Japan. The GI-22 lineage is the only Chinese indigenous genetic type identified here. Since its first detection, it has been of direct relevance to the poultry industry, reflecting its occurrence and widespread distribution in China, as well as its virulence.
GI-22 includes 82 field viruses mainly of nephropathogenic nature collected from outbreaks in both broilers and layers flocks during 1997–2011. These local strains were initially assigned to the ck/CH/LSC/99I-type cluster following the inclusion of the Chinese IBV reference strain ck/CH/LSC/99I isolated in 1999 (Han et al., 2011, Liu et al., 2009, Liu et al., 2006b, Mo et al., 2013, Sun et al., 2011), although it is also known as HN08 (Ji et al., 2011, Li et al., 2013). Based on numerous epidemiological surveys conducted in China, the GI-22 lineage along with GI-19, appears to be the dominant viruses in the country (Han et al., 2011, Ji et al., 2011, Li et al., 2013, Li et al., 2010, Liu et al., 2009, Liu et al., 2006b, Ma et al., 2012, Mo et al., 2013, Sun et al., 2011). The GI-24 lineage contains IB viruses indigenous to India and, to date, no publications describe these strains (i.e. they are only recently reported as accession numbers) such that little epidemiological and clinical information is available. The lineage comprises of 24 viruses collected during the period 1998–2013. Of these, 11 have been assigned to a genotype named NPR by the submitting authors, while 12 others seem to be of nephropathogenic nature (according to the data reported in GenBank), with no data reported for one strain. The only published data on the circulation of local Indian variant was that of Bayry et al. (2005) who described the emergence in India of a unique nephropathogenic IBV classified as a novel genotype (isolate PDRC/Pune/Ind/1/99, AY091551). A BLAST search revealed that the PDRC/Pune/Ind/1/99 is 99% similar to the GI-24 prototype strain. However, there is currently insufficient data as to whether this strain can be included in GI-24. As noted above, GVI-1 represents a genetically distinct lineage present in Asia. 
It comprises 13 isolates collected in China and Korea between 2007 and 2012, which were originally grouped in the ‘Korean New Cluster II’ (Lim et al., 2012), also designated as Chinese New-Type (Li et al., 2013). The available data on the pathogenicity of these strains revealed them to be of respiratory nature (Li et al., 2010, Lim et al., 2012). Notably, viruses closely related to those included here were also sampled in Japan in 2009 and assigned to a group named JP-IV (Mase et al., 2010) by sequencing of the partial S1 gene (621 nt). To date, no JP-IV-like S1 complete sequences are available so we cannot determine whether the so-called “JP-IV strains” are included in GVI-1 or if they cluster into a separate lineage.

Indigenous North American lineages

A large number of lineages, falling into two well distinct genotypes (GI and GIV), have been reported as indigenous to North America (GI-8, -9, -17, -20, -25, -27 and GIV-1). However, only some of these – GI-9, GI-27 and GIV-1 – have been implicated in widespread disease disseminations and persistent virus infections (Jackwood, 2012). The GI-8 lineage includes one of the first IBV serotypes (SE-17) recognized to be different from the pre-existing IBV antigenic types. However, as this lineage was only detected for a brief period it is likely of limited importance. The variant was isolated in 1967 in Georgia from a chicken flock with acute respiratory distress and was designated as SE-17 (Hopkins, 1969). A retrospective study identified respiratory SE-17 IBVs to be present in USA since 1965 (Mondal et al., 2013). The GI-9 lineage contains vaccine and virulent field strains collected from 1973 to 2011, the majority from the USA (44/49). Herein, we report IBVs previously known to be of Arkansas (Ark) and Ark DPI-like type and strains referred to as California 99 type first detected in North Carolina in 1999 (Martin et al., 2001, Mondal and Cardona, 2004). These viruses are the causative agent of respiratory syndromes, observed in the field as well as under experimental conditions (Fields, 1973, Johnson et al., 1973, Martin et al., 2001, Mondal and Cardona, 2007). A total of 12 viruses sampled in Pennsylvania, California and Alabama from 1988 to 1999 fall within the GI-17 lineage. This includes strains associated with respiratory distress and renal pathologies, with one also implicated in reproductive pathology (Gelb et al., 2005, Moore et al., 1998, Ziegler et al., 2002). These strains were previously designated as California variants (CAV) (Hein et al., 1989, Moore et al., 1998). 
Among these, two viruses isolated in the late 1990s in Pennsylvania – PA/Wolgemuth/98 and PA/171/99 – were classified as being two unique genotypes, genetically similar but antigenically distinct from the CA/Machado/88 reference prototype strain (Ziegler et al., 2002). Although we only identified two strains as members of the lineage GI-20, it has been included in our classification because of its epidemiological relevance in Canada. The lineage has never been described outside of Eastern Canada, yet appeared to be the most common lineage circulating in the country between 2000 and 2013 (Martin et al., 2014). The Qu_mv prototype variant (AF349621) was isolated in Québec in 1996 from commercial broiler flocks displaying respiratory signs of disease (Ojkic and Binnington, 2002, Smati et al., 2002). Since then, its prevalence in the region has risen, also spreading to Nova Scotia and Ontario. Finally, we group 26 American indigenous viruses collected between 2004 and 2013 and associated with respiratory infection into the GI-25 (n = 9) and GI-27 (n = 17) lineages, which were previously designated as GA07 and GA08, respectively (Jackwood et al., 2007, Kulkarni and Resurreccion, 2010). GI-25 includes, among others, the prototype CA/1737/04 strain (Jackwood et al., 2007) along with the DMV/5642/06 (Wood et al., 2009) and GA/60,173/07 (Jackwood et al., 2007) variants. The GI-27 lineage contains the most recent IBV defining a lineage, being first identified in 2007. The variant, which became the predominant virus type at that time, was reported to be a novel genotype and designated as GA08 (Jackwood et al., 2010, Kulkarni and Resurreccion, 2010). Another cluster restricted to the USA is lineage 1 of GIV, which is also the only North American lineage belonging to a different genotype. This group contains both vaccine and field strains (n = 24) isolated between 1992 and 2003. 
Among these is the variant referred to as Delaware variant (DE or DE072), isolated in 1992 from commercial broiler chicks during severe respiratory disease and designated to be of a novel genotype and serotype compared to the others (Gelb et al., 1997, Mondal et al., 2001). In the same lineage are IBV strains previously designated as GA98 and described to be closely related to the DE variant, although of a different serotype (Lee et al., 2001). It has been suggested that the GA98 variant arose from immune selection caused by DE072 attenuated live vaccine introduced in the country in 1993 (Lee and Jackwood, 2001). In addition, viruses recovered in 2000 from layer flocks experiencing reduction in egg production also fell into this lineage (Mondal et al., 2001).

North American and Asian lineages

The GI-2, GI-3 and GI-4 lineages were first described in the USA between the 1950s and the 1960s and detected in Asia many years later. Notably, however, GI-2 and GI-4 were reported in the USA only during the 1950s–1960s and never again, while GI-3 was also reported in North America in the late 1990s (Gelb et al., 2005, Gelb et al., 2001) before being identified in Taiwan in 2006. Hence, old lineages may be sporadically re-detected. A total of 5 viruses cluster within the GI-2 lineage, which, among others, includes the serotypes known as Holte and Iowa 97 (Albassam et al., 1986, Hofstad, 1958) and two viruses sampled in China between 2004 and 2006 (Bing et al., 2007). The GI-4 lineage consists of 1 nephropathogenic strain, isolated in the USA in 1962 (Winterfield and Hitchner, 1962), whose S1 gene was entirely sequenced in 1994 (accession number L18988; Wang et al., 1994) and two additional viruses collected in China for which no published data are currently available. Of note, the same IB strain nomenclature has been used to identify two viruses that are genetically distant from each other and belong to two different lineages. Hence, both the GI-2 and GI-4 lineages include a virus called ‘Holte’ as prototype strain. The GI-3 lineage contains 7 viruses, comprising both respiratory and nephropathogenic strains. It was originally designated as JMK or the Gray serotype because of the appropriate reference strains (accession numbers L14070 and L14069, respectively). Although these two viruses are antigenically very similar (Cowen and Hitchner, 1975), their pathogenicity is different because the Gray variant can be nephropathogenic while the JMK virus is strictly respirotropic (Kwon and Jackwood, 1995, Thor et al., 2011, Winterfield et al., 1964, Winterfield and Hitchner, 1962).

Indigenous South American lineage

The GI-11 lineage is unique to South America and comprises a total of 13 Brazilian viruses collected between 1975 and 2009. However, novel IBV sequences, which were obtained from field samples from Argentina and Uruguay, have been recently submitted to GenBank (Marandino et al., 2015). By phylogenetic analysis of the complete S1 coding region, the authors included these strains in a genotype referred to as South America I (SAI), which also contains the GI-11 Brazilian viruses. A previous nomenclature based on partial S1 nucleotide sequences of local Brazilian field variants has also been adopted (Balestrin et al., 2014, Chacón et al., 2011, Fraga et al., 2013, Villarreal et al., 2010), and referred to as the Brazil (Villarreal et al., 2010) or BR-I (Chacón et al., 2011) genotypes. The partial Brazilian sequences show a high degree of nucleotide similarity with those of GI-11 (92–98%), such that it is unclear whether they represent the same genetic type. The GI-11 lineage has been associated with a variety of clinical conditions, ranging from respiratory disease, infertility, drop in egg production and egg quality (Chacón et al., 2011, Chacón et al., 2008, Montassier, 2010, Villarreal et al., 2007a) to enteric disorders (Villarreal et al., 2010, Villarreal et al., 2007b). It was recently demonstrated that the Brazilian variant causes predominantly respiratory and kidney diseases under experimental conditions (Chacón et al., 2014, de Wit et al., 2015). Interestingly, our phylogenetic analysis demonstrates that the indigenous GI-11 lineage has been circulating in the country since 1975, supporting the hypothesis of Montassier (2010) that this variant had already been present in the field since at least as early as 1988.

European IBV lineages

Two distinct lineages that fall in two different genotypes – GI-21 and GII-1 – were identified as unique to Europe. Notably, one of these has also been reported in Russia (Bochkov et al., 2006) and recently in Morocco (Fellahi et al., 2015). Within the GI-21 lineage we group 14 viruses sampled between 1997 and 2005 in Italy, the United Kingdom and Spain. The IB viral type of the lineage was originally isolated in Italy in 1999 and designated Italy02 (Bochkov et al., 2007). Thereafter, it was reported to be one of the most predominant genotypes in Spain (Dolz et al., 2009) and the third most frequent in Western Europe over 2002–2006 (Worthington et al., 2008). This variant has mainly been detected in broiler flocks that experienced respiratory signs, as well as adult birds, broiler breeders and layers, associated with drop in egg production (Worthington et al., 2004). It also appeared to induce renal disease in young chickens (Dolz et al., 2012). Although strains in this lineage are related to one of the major and widespread European wild types, a limited number of complete S1 nucleotide sequences are available for analysis. GII-1 lineage is the only group of European viruses that falls in a different genotype to all the other viruses which are classified here as GI. The lineage is comprised of only the Dutch isolates D1466 and V1397, showing a large evolutionary distance compared to the remaining IBV genotypes. The D1466 variant (also called D212) was detected for the first time in The Netherlands in the late 1970s, when it was recognized to have antigenic and molecular properties significantly different from known IBV strains (Adzhar et al., 1995, Davelaar et al., 1984, Kusters et al., 1989, Kusters et al., 1987). Historically, D1466 has never been responsible for major disease in flocks and hence may be of relatively low pathogenicity. However, an increase in virulence of this variant was recently observed. 
In particular, poor egg production in both layers and broiler breeders was reported between 2005 and 2006 in some countries of Western Europe (Worthington et al., 2008) and more recently in Poland (Domanska-Blicharz et al., 2012).

Indigenous African lineage

The GI-26 lineage represents a unique African cluster of viruses that were identified relatively recently. It contains 32 viruses isolated in Nigeria and Niger between 2006 and 2007, for which no obvious clinical signs were recorded. These local strains were previously grouped into a novel IBV genotype designated as IBADAN, referring to the name of the city (in Nigeria) where the variant was first detected, and were described to be genetically and antigenically clearly distinct from all other known IBV strains (Ducatez et al., 2009).

European and African lineages

Two IBV lineages – 12 and 14 – were found in some European countries as well as in Nigeria. Both fall into GI and were also reported in Russia (Bochkov et al., 2006). Strains previously classified as D207-like, D274-like or UK/6/82-like types fall into the GI-12 lineage. Here, we report 3 Dutch and 3 British strains isolated during 1978–1986 from broilers experiencing respiratory infection and from breeding flocks showing aberrant egg production (Cavanagh et al., 1992a, Cook and Huggins, 1986, Cook, 1984, Cook, 1983, Davelaar et al., 1984). In addition, 1 field strain from Russia and 2 from Nigeria (Ducatez et al., 2009), collected in 2002 and 2006, respectively, fall in this lineage. Although the circulation of this variant is well documented (Bochkov et al., 2006, Cavanagh et al., 1999, Cavanagh et al., 1992a, Cook, 1984, Davelaar et al., 1984, Meulemans et al., 2001, Monne et al., 2009, Valastro et al., 2014, Worthington et al., 2008), only a relatively small number of D274-like sequences are available for analysis. A GI-12 like strain was also identified in Egypt in 1989 (Abdel-Moneim et al., 2006), although the status of this virus is ambiguous as only partial S1 sequence (722 nt) is currently available. The GI-14 lineage comprises only two viruses collected in Belgium (B1648) (Meulemans et al., 1987) and Nigeria (NGA/324/2006) (Ducatez et al., 2009), although it merits classification due to its epidemiological relevance and pathogenicity. After its first identification in Belgium in 1984 (Meulemans et al., 1987), the variant was again reported in the country in 1993 (Meulemans et al., 2001) and later in Italy (Capua et al., 1999), Russia (Bochkov et al., 2006) and Slovenia (Krapez et al., 2011). No other complete S1 gene sequences are available. 
The viruses related to this variant were previously referred to as the B1648-like type and reported to be mostly nephropathogenic (Meulemans et al., 1987, Capua et al., 1999) and also associated with egg production problems (Capua et al., 1999). The variant was rarely detected in France and Germany between 2002 and 2006 (Worthington et al., 2008), and did not appear to be causing relevant illness in poultry flocks.

Indigenous Middle Eastern lineage

The GI-23 lineage represents the only wild-type cluster geographically confined to the Middle East. Strains belonging to this lineage have been detected since 1998 in Israel and are still circulating in the area (Ganapathy et al., 2015, Najafi et al., 2015). Some have become dominant in the majority of farms and are involved in respiratory and renal pathologies (El-Mahdy et al., 2012, Meir et al., 2004). However, the complete S1 sequence is only available for a limited number of viruses (n = 9). Some authors have previously designated these strains as Israeli Variant 2 to distinguish them from those clustering within Israeli Variant 1 (Abdel-Moneim et al., 2002, Callison et al., 2001, Mahmood et al., 2011, Meir et al., 2004). Alternatively, studies performed on the Egyptian isolates divided them into different genotypes on the basis of their HVR3 sequences; they were defined as Egyptian Variant 1, having as reference the strain Egypt/Beni-Suef/01 (Abdel-Moneim et al., 2002) and Egyptian Variant 2, which includes the viruses ck/Eg/BSU-2/2011 and ck/Eg/BSU-3/2011 (Abdel-Moneim et al., 2012). To date, no complete nucleotide sequences are available for the three Egyptian strains.

Indigenous Australian and New Zealand lineages

Likely due to their geographical isolation, Australia and New Zealand possess only unique indigenous variants. We found 5 distinct IBV lineages in these localities, 3 falling into GI (GI-5, -6 and -10) and 2 possessing large evolutionary distances between each other and compared to those found elsewhere. Hence, our classification into distinct genotypes designated as GIII-1 and GV-1. The GI-5 and GI-6 lineages contain both vaccine and field strains (13 and 17 viruses, respectively), mostly sampled in Australia. The only Chinese sequences included here are 4 field viruses (1 in GI-5 and 3 in GI-6) that presumably represent re-isolations of the vaccine strains JAAS and J9, which were from Australia and used in China to control IBV (Liu et al., 2006a). Hence, both these lineages may be geographically confined to Oceania. In addition, one strain sampled in New Zealand falls in GI-6. Among the strains clustering in GI-5 are the Armidale vaccine strain and the nephropathogenic N1/62, also known as T strain. Within the GI-6 lineage is the VicS/62 strain that was introduced as a vaccine into Australia in 1966 (Cumming, 1969). The strains within GI-5 and GI-6 were originally grouped as Australian subgroup I (Ignjatovic et al., 2006), which includes both respiratory and nephropathogenic strains (Sapats et al., 1996). The GI-10 lineage contains 6 New Zealand indigenous viruses; 3 were collected in the 1970s and the remainder in the 2000s (McFarlane and Verma, 2008). This IBV variant was first reported in the country in 1967 (Pohl, 1967), and ten years later 4 different strains designated as A, B, C and D were identified using virus neutralization tests (Lohr, 1977, Lohr, 1976). Finally, both the lineages falling into GIII and GV contain respiratory and indigenous Australian pathogens (4 and 7 strains, respectively). 
The GIII-1 lineage was first identified in 1988 (Ignjatovic and McWaters, 1991) and designated as Australian subgroup II (Sapats et al., 1996), whereas the GV-1 lineage was described approximately 14 years later and referred to as Australian subgroup III (Ignjatovic et al., 2006). Both appear to be genetically and antigenically different from the classical strains, here grouped into GI-5 and GI-6 (Ignjatovic et al., 1997, Mardani et al., 2010, Sapats et al., 1996).

Phylogenetic analysis of the HVRs

Since partial S1 gene sequences are often used to classify the IBV strains, we inferred two additional ML phylogenetic trees based on HVRs1 and 2 (312 nt) and HVR3 (342 nt) of the reference sub-sampled data set (n = 199). Strikingly, important topological inconsistencies were observed between the HVRs1 and 2 phylogeny and that inferred using the complete S1 gene. Specifically, although GIII, GV and GVI exhibit large evolutionary distances compared to the remaining lineages, they cluster within GI in marked contrast to what is seen in the complete S1 tree, while 4 lineages – GI-7, -14, -23, and -27 – do not form monophyletic groups (Fig. 3). In addition, while most groups were strongly supported in the SH test (> 0.96 SH-like), others were more weakly supported, such as GI-15 which only received 0.60 support, and GI-6 and GI-9, both of which received 0.80 SH-like support. Genetic typing based on HVR3 was similarly inconsistent with that obtained from the whole S1 gene (Fig. 4). In particular, 8 lineages (GI-5, -7, -10, -18, -22, -24, -25 and -27) are no longer monophyletic. Overall, these results indicate that both the genotypes and lineages identified using the HVRs are not representative of those obtained from the phylogenetic analysis of the whole S1 gene, so that only the latter should be used in IBV genetic classification.
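The notion of a lineage "no longer being monophyletic" in a partial-gene tree can be illustrated with a minimal sketch. It assumes rooted trees written as nested tuples and uses invented tip labels (A1, B1, and so on); neither the trees nor the helper names come from the study, and real analyses would read the ML trees with a phylogenetics library.

```python
# Toy illustration only: trees are nested tuples, tips are hypothetical strain labels.

def tip_set(node):
    """Return the set of tip labels found under a node (a tip is a plain string)."""
    if isinstance(node, str):
        return frozenset([node])
    tips = frozenset()
    for child in node:
        tips |= tip_set(child)
    return tips

def all_clades(node):
    """Yield the tip set of every clade (subtree) in a rooted tree."""
    yield tip_set(node)
    if not isinstance(node, str):
        for child in node:
            yield from all_clades(child)

def is_monophyletic(tree, lineage_tips):
    """A lineage is monophyletic if some clade contains exactly its tips and no others."""
    target = frozenset(lineage_tips)
    return any(clade == target for clade in all_clades(tree))

# Hypothetical trees for the same five strains inferred from two alignments.
full_s1_tree = ((("A1", "A2"), "A3"), ("B1", "B2"))   # lineage A forms one clade
hvr_tree = ((("A1", "B1"), "A2"), ("A3", "B2"))       # lineage A is split up
lineage_a = {"A1", "A2", "A3"}

print(is_monophyletic(full_s1_tree, lineage_a))  # True
print(is_monophyletic(hvr_tree, lineage_a))      # False
```

The check depends on the rooting, so both trees would need to be rooted in the same way (for example mid-point rooted, as in the figures) before comparing them.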

Fig. 3

Phylogenetic tree of partial S1 nucleotide sequences including HVRs1 and 2. The phylogeny contains a total of 199 IBV strains, including 6 representative sequences of each lineage detected and 26 strains recognized as unique variants. All strains belonging to the same lineage, assessed on the basis of the complete full-length sequences, are labeled with a unique color code as in Fig. 1, Fig. 2. The color-coded boxes reporting the lineage designations are only shown for those lineages correctly identified. GenBank accession number, isolate number or name, country of origin and collection date is given for each strain. The designation “UV” indicates unique variants, here marked in black. SH-like branch supports are shown for key nodes. The scale bar represents the number of nucleotide substitutions per site, and the tree is mid-point rooted for clarity only.

Fig. 4

Phylogenetic tree of partial S1 nucleotide sequences including HVR3. The phylogeny contains a total of 199 IBV strains, including 6 representative sequences of each lineage detected and 26 strains recognized as unique variants. All strains belonging to the same lineage, assessed on the basis of the complete full-length sequences, are labeled with a unique color code as in Fig. 1, Fig. 2. The color-coded boxes reporting the lineage designations are shown only for those lineages correctly identified. GenBank accession number, isolate number or name, country of origin and collection date is given for each strain. The designation “UV” indicates unique variants, here marked in black. SH-like branch supports are shown for key nodes. The scale bar represents the number of nucleotide substitutions per site, and the tree is mid-point rooted for clarity only.

Assessment of temporal structure

Finally, to determine whether there was sufficient temporal structure for molecular clock dating, we fitted a linear regression of root-to-tip genetic distance from the ML tree against the date (year) of collection for 372 randomly selected sequences from the entire data set. This revealed a weakly negative relationship between genetic distance and time (R-squared = −0.003; correlation coefficient = −0.181 under the best-fitting root). Such a clear lack of temporal structure means that molecular clock dating schemes based on ‘tip dating’ alone cannot proceed. An equivalent root-to-tip regression using the GI-19 lineage alone, which includes samples collected from 1993 to 2010 (n = 354), was conducted to determine whether this was also true of more closely related sequences. Similarly, the analysis revealed only weak temporal structure (R-squared = 0.159; correlation coefficient = 0.399).
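As a rough sketch of what a root-to-tip regression measures, the following fits an ordinary least-squares line of genetic distance against sampling year and reports the correlation coefficient. The function name and all data values are invented for illustration, and the sketch assumes the root-to-tip distances have already been extracted from a tree with a fixed root; it does not search for the best-fitting root as the analysis above did.

```python
from math import sqrt

def root_to_tip_regression(years, distances):
    """Least-squares fit of root-to-tip distance on sampling year.

    Returns (slope, intercept, r, r_squared). With real temporal signal the
    slope approximates the substitution rate per site per year.
    """
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    syy = sum((y - mean_y) ** 2 for y in distances)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, distances))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r, r * r

# Hypothetical data: five tips sampled five years apart.
years = [1990, 1995, 2000, 2005, 2010]
clock_like = [0.10, 0.12, 0.14, 0.16, 0.18]   # distance grows steadily with time
no_signal = [0.15, 0.11, 0.16, 0.10, 0.14]    # distance unrelated to sampling date

for label, dists in (("clock-like", clock_like), ("no signal", no_signal)):
    slope, intercept, r, r2 = root_to_tip_regression(years, dists)
    print(f"{label}: slope={slope:.5f} r={r:.3f} R2={r2:.3f}")
```

A correlation near zero or below zero, as in the second toy data set, is the pattern that rules out tip-dated molecular clock analysis.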

Discussion

Advances in molecular biology and bioinformatics analyses have impacted virus classification at all taxonomic levels. The International Committee on Taxonomy of Viruses (ICTV) has no guidelines for the classification of viruses below the species level. However, classification systems have been developed and widely used for a variety of avian pathogens, including Avian influenza (AI) (WHO/OIE/FAO H5N1 Evolution working group) and Newcastle disease (ND) viruses (Aldous et al., 2003, de Almeida et al., 2013), within which distinct “lineages” have been established through phylogenetic analysis and sequence similarities. Herein, we propose a similar framework for IBV. To date, no genetic characterization of IBV has included sequences from all the existing viral variants or adopted a unified system for naming the groups, such that no consensus on IBV classification has been reached. Indeed, the diversity of IBV genetic clustering and naming available at present is highly confusing. Hence, we have attempted to construct a comprehensive phylogenetic history of this virus and from this to derive a rational and harmonious scheme for the classification of IBV that we suggest should be used for future epidemiological and evolutionary studies. We have focused on the complete nucleotide sequence of the S1 gene as the basis for IBV lineage assignment. Not only it is the most variable region within the IBV genome, containing abundant phylogenetic information, but it encodes the major immunological determinants (Jackwood and de Wit, 2013) and it is used by many laboratories studying IBV. Hence, phylogenetic analysis of the S1 gene and an S1-based viral classification might provide data of direct epidemiological relevance for controlling IBV spread, particularly as field and vaccine strains share a high degree of S1 sequence identity (Gelb et al., 2005). 
Importantly, our classification was exclusively based on the topology of the phylogenetic tree, with strong statistical (SH-like) support values at each node defining monophyletic groups. Hence, IBV strain clustering was evaluated by a robust (maximum likelihood) phylogenetic method that is able to efficiently handle a large number of sequences, and combined with an efficient statistic – the SH-like test – that can rapidly estimate the support for individual groupings on the tree. That very similar tree topologies were estimated using different phylogenetic techniques not only suggests that they are robust, but that faster phylogenetic methods can be used if necessary. A more challenging issue is recombination, which undoubtedly has major implications for virus classification (Simmonds, 2015). Importantly, however, viral phylogenies based on a single gene (as here) have been previously used to establish viable classification schemes. Notable examples include members of genus Enterovirus (Mirand et al., 2006, Oberste et al., 1999), pestiviruses such as BVDV-1 (Deng et al., 2012, Vilcek et al., 2001) and BVDV-2 (Flores et al., 2002, Jenckel et al., 2014, Weber et al., 2015), circoviruses such as PCV2 (Franzo et al., 2015, Grau-Roma et al., 2008, Segalés et al., 2008), and lentiviruses such as FIV (Marçola et al., 2013, Sodora et al., 1994). In the case of IBV we propose that an effective classification scheme, particularly the designation of lineages and genotypes, should be based on clearly identifiable genetic groups (i.e. with recombinants removed) as these represent a robust phylogenetic backbone. A similar approach has been undertaken for PCV2 (Franzo et al., 2015). 
Hence, we contend that this is the most coherent and practical way for virus classification in the face of recombination, particularly as it is impractical to integrate multiple incongruent phylogenies and simplistic to think that such complex evolutionary histories will produce more rational classifications. Rather than being defined as unique variants in their own right, recombinants can then be referred to as combinations of these distinct lineages and genotypes, analogous to the definition of ‘circulating recombinant forms’ among HIV subtypes. However, it is evident that more experimental studies are needed to assess how recombination might impact viral fitness. In this respect, the relatively high number of recombinant viruses in our data (n = 213) is in part due to the presence of strains showing an identical recombinant structure and possessing a strong epidemiological link between each other. Hence, these should not be regarded as result of independent recombination events. As well as providing the first complete picture of IBV biodiversity, by determining the phylogenetic relationships between all described genetic groups we have provided a well-defined evolutionary history of IBV, which in turn results in a clear definition of viral genotypes and lineages. Accordingly, a total of 6 genotypes (GI-GVI) and 32 lineages were identified, with other potential groups present as unique variants (UVs) and which may become established should future viruses be sequenced. Some well-established lineages such as GI-1 and -13 have a broad geographic distribution, which is presumably associated with the use of vaccines derived from them. Therefore, the majority of the IBV strains included in these lineages might be vaccine and vaccine-like strains. The first vaccine to control the disease was developed in the USA in the 1950s using the van Roeckel M-41 strain (van Roeckel et al., 1942) that represents the parent strain of most of the Mass type vaccines used there. 
By the early 1960s IB had been diagnosed in The Netherlands, leading to the development of a Mass-based vaccine known as the H strain (Bijlenga et al., 2004). The resulting vaccines, H120 and H52, soon became widely used. Today, the Mass and H120 strains of the lineage GI-1 continue to be the most commonly administered attenuated-live vaccines. In contrast to the GI-1 vaccine strains, the 793B-like vaccines (GI-13), which were developed in Europe in the 1990s and used in many countries, have never been administered in North America. To date, the GI-13 lineage has not been detected in the USA, Oceania and many African and Latin American countries. The S1 gene phylogeny was also characterized by strong geographic structure, such that IBV strains are often clustered by place of sampling. In particular, with the exception of strains of the GI-1 lineage, IBVs in Europe differ from those found in the USA or Australia, and each geographic group can be distinguished at the phylogenetic scale. That most of the strains in the GI-9 lineage come from the USA might suggest that the pathogenic Ark variant is geographically confined to that country. However, there are unpublished reports recording the circulation of Ark-like strains in South America (Jackwood, 2012, Marandino et al., 2015). The Ark virus is one of the most commonly reported types able to cause widespread disease in the USA, against which an attenuated vaccine was developed. When it first emerged in Arkansas in 1973, it was described as genetically distinct from all the known IBV serotypes recognized at that time and was referred to as Ark99 (Fields, 1973, Johnson et al., 1973). During the 1980s, an attenuated vaccine derived from an Ark-type virus isolated in the Delmarva Peninsula (Ark DPI strain) (Gelb et al., 1983, Gelb et al., 1981) was extensively used in the USA and remains one of the most common vaccines administered to flocks in this country and also in the United Kingdom. 
In this respect, a previous epidemiological survey reported the identification of GI-9-like strains in Western Europe only in flocks that had received the commercial bivalent IBMM + Ark vaccine (Worthington et al., 2008). However, no European IBV sequences similar to the GI-9 lineage are available in the public database. A similar situation arises with the Chinese IBVs in GI-9 (n = 5). Among these, the Jilin strain (AY839144), which was previously reported to be 100% identical to the Ark DPI strain (Ammayappan et al., 2008), is currently used as a vaccine in China (Liu et al., 2006a). This suggests that the Chinese IBVs present in this lineage most likely represent re-isolations of the vaccine strain and not of the Ark field type. Although the widespread circulation of some specific lineages is probably attributable to the use of vaccination programs based on strains derived from these, this is likely not always the case. In particular, the spread of the nephropathogenic QX-like variant of the GI-19 lineage occurred long before its homologous vaccine was administered in the field. This Chinese lineage has generated considerable attention due to its ability to become endemic, causing major economic losses in the poultry industry worldwide, with the exception of the Americas and Oceania where it has never been detected. The origin of this lineage and the factors responsible for its distinctive distribution remain unclear (Bochkov et al., 2006, Gough et al., 2008). A role of wild birds has been hypothesized based on evidence that IBV may replicate in Anseriformes (Bochkov et al., 2006, Cavanagh, 2005). Importantly, the present study seems to counter the common assumption that the GI-16 lineage arose in China in 1996. In particular, our analysis provides evidence that the Italian IZO28/86 strain, isolated in 1986, belongs to GI-16 such that it constitutes the lineage prototype strain. 
This nephropathogenic virus was originally sampled in Italy about 10 years before the first identification of the Q1 strain in China, and its sequence has only recently been submitted to the public database. Additionally, the IZO28/86 sequence is closely related to strain 624/I (JQ901492), suggesting that they belong to the same lineage, which also includes the Q1-like strains. However, they have previously been classified as distinct genotypes. The 624/I virus was first reported as a novel variant in Italy in 1993 (Capua et al., 1994) during an outbreak of severe respiratory disease, although only a 350 nt region of S1 was sequenced (Capua et al., 1999). Thereafter, it has been sporadically detected in Italy, Russia (Bochkov et al., 2006) and Slovenia (Krapez et al., 2011). More recently, a longer 624/I nucleotide sequence (1043 nt in length) has been released (JQ901492), which clusters into the same monophyletic group. Based on these observations, it is plausible that both variants belong to GI-16, unless recombination has occurred in the C-terminal portion of the 624/I S1 sequence.

In the last two decades, a combination of phylogenetic clustering and patterns of sequence similarity in the S1 gene has conventionally been used to group IBV isolates into genetic clades, although a confusing variety of such clustering schemes currently exists. For instance, IBVs have been referred to as novel variants when their S1 nucleotide sequences are ≤ 75% similar to that of any other IBV type (Gelb et al., 2005, Kingham et al., 2000). However, because of rate variation between sequences, reflected here in the lack of temporal structure in the data, distance-based classification methods are susceptible to error. In particular, elevated evolutionary rates on the branches leading to individual clusters may result in high genetic distances between sister taxa even though they are closely related.
Thus, we suggest that phylogenetic relationships are a more appropriate measure of evolutionary history, and hence a better basis for a rational classification, than pairwise comparison of sequences. In addition, it is unrealistic to think that nature will create discrete groups of sequences that can consistently be recovered using genetic distances.

Most phylogenetic analyses of IBV have been based on the three hypervariable regions (HVRs) of the S1 gene. Some investigators have reported that genetic typing based on HVR1 of the S1 gene is inconsistent with the groupings based on the whole S1 gene (Li et al., 2012, Mo et al., 2013, Schikora et al., 2003), although others disagree (Lee et al., 2003, Wang and Huang, 2000). We clearly show here that the hypervariable fragments (HVRs 1 and 2, and HVR 3) do not consistently produce clusters equivalent to those found through phylogenetic analysis of the complete S1 gene. Therefore, the risk of misclassification is reduced by using a larger portion of the S1 gene, and sequencing only one of these regions might result in insufficient phylogenetic resolution. Hence, we strongly recommend that a phylogeny based on the complete S1 gene sequence be employed for future designations of novel IBV lineages or genotypes.
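The effect of rate variation on distance-based grouping can be illustrated with a minimal sketch. The three aligned sequences below are hypothetical toy data, not IBV sequences: by construction, `slow_sister` and `fast_sister` share a derived change at the last site and are therefore each other's closest relatives, while `fast_sister` carries additional private changes to mimic an elevated evolutionary rate. The function names and the 0.25 cut-off are assumptions chosen for illustration only.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap ('-') or an 'N' are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def pairwise_distances(alignment):
    """Return {(name_x, name_y): p-distance} for every pair of sequences."""
    return {
        (x, y): p_distance(alignment[x], alignment[y])
        for x, y in combinations(sorted(alignment), 2)
    }


# Hypothetical toy alignment (20 sites), invented for illustration.
toy_alignment = {
    "slow_sister":   "ACGTACGTACGTACGTACGA",  # shared derived change at last site
    "fast_sister":   "ACGTTGCATGGTACGTACGA",  # same shared change + 6 private changes
    "other_lineage": "GTGTACGTACGTACGTACGT",  # 2 private changes, lacks shared change
}

THRESHOLD = 0.25  # assumed distance cut-off, for illustration only

for pair, dist in pairwise_distances(toy_alignment).items():
    verdict = "same group" if dist <= THRESHOLD else "different groups"
    print(pair, round(dist, 2), verdict)
```

In this toy case the distance between the two sister sequences (0.30) exceeds the cut-off, whereas the distance between `slow_sister` and `other_lineage` (0.15) does not, so a threshold rule would separate the true sisters and group `slow_sister` with a more distant relative. A tree-based criterion that uses the shared derived change would not make this error.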

Conclusions

Given the rapid evolution of IBV and the use of mass vaccination strategies to control the disease worldwide, additional genetic variants will likely be discovered in the future. As heterogeneous genetic group designations that are inconsistent with phylogenetic classification have been widely used to date, it is essential to employ a practical standard nomenclature and a well-supported system to identify these novel variants. Herein, we propose a simple and repeatable S1 phylogeny-based classification system combined with an unambiguous lineage nomenclature for the future assignment of IBV strains. Following the suggestions proposed here, at least three complete S1 sequences of viral samples collected from at least two different outbreaks should be available for the identification of a new viral lineage, and genotypes and lineages should be referred to according to the current numerical system. In addition, and in a similar manner to the convention for avian influenza virus (AIV), we encourage the use of a uniform and informative system for naming IBV isolates, which should include at least the name of the strain, country of origin and date of collection (Cavanagh, 2001). Clearly, the adoption of an internationally accepted nomenclature and a common system to coherently designate viruses is central to efficient communication on the evolution and emergence of epidemiologically important IBV variants.

The following are the supplementary data related to this article.

Table S1

Nucleotide sequences of the 199 IBV strains selected as reference dataset.

Fig. S1

ML phylogenetic tree of 1518 complete S1 nucleotide sequences. Recombinant sequences and those considered to be unreliable are labeled in red and in green, respectively (and were removed from the main analysis). The colored boxes designate the GI-19, -13 and -21 lineages that no longer appeared as monophyletic groups. SH-like branch supports are shown for key nodes. The scale bar represents the number of nucleotide substitutions per site, and the tree is mid-point rooted for clarity only.

Fig. S2

Neighbor-Joining phylogenetic tree of complete S1 gene nucleotide sequences. A total of 199 IBV strains, including 6 representative sequences of each lineage detected and 26 unique variants, were analyzed. Each lineage is color-coded and its corresponding designation is reported. Bars indicating the genotypes in which the lineages fall are shown. The designation “UV” indicates unique variants, here marked in black. The red box designates the 27 lineages within GI. GenBank accession number, isolate number or name, country of origin and collection date are given for each strain. Bootstrap support values are shown for key nodes. The scale bar represents the number of substitutions per site, and the tree is mid-point rooted for clarity only.

Fig. S3

Bayesian (MrBayes) phylogenetic tree of complete S1 nucleotide sequences. A total of 199 IBV strains, including 6 representative sequences of each lineage detected and 26 unique variants, were analyzed. Each lineage is color-coded and its corresponding designation is reported. Bars indicating the genotypes in which the lineages fall are shown. The designation “UV” indicates unique variants, here marked in black. The red box designates the 27 lineages within GI. GenBank accession number, isolate number or name, country of origin and collection date are given for each strain. Posterior probabilities are shown for key nodes. The scale bar represents the number of substitutions per site, and the tree is mid-point rooted for clarity only.

170 in total

1. Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis.

Authors: J Castresana
Journal: Mol Biol Evol Date: 2000-04 Impact factor: 16.240

2. Molecular characterization of infectious bronchitis virus isolates foreign to the United States and comparison with United States isolates.

Authors: S A Callison; M W Jackwood; D A Hilt
Journal: Avian Dis Date: 2001 Apr-Jun Impact factor: 1.577

3. Increased level of protection of respiratory tract and kidney by combining different infectious bronchitis virus vaccines against challenge with nephropathogenic Brazilian genotype subcluster 4 strains.

Authors: J J De Wit; P Brandao; C A Torres; R Koopman; L Y Villarreal
Journal: Avian Pathol Date: 2015-10 Impact factor: 3.378

4. Isolation of a new serotype of infectious bronchitis-like virus from chickens in England.

Authors: J K Cook
Journal: Vet Rec Date: 1983-01-29 Impact factor: 2.695

5. High prevalence of bovine viral diarrhea virus 1 in Chinese swine herds.

Authors: Yu Deng; Chun-Qing Sun; San-Jie Cao; Tao Lin; Shi-Shan Yuan; Hong-Biao Zhang; Shao-Lun Zhai; Lv Huang; Tong-Ling Shan; Hao Zheng; Xin-Tian Wen; Guang-Zhi Tong
Journal: Vet Microbiol Date: 2012-04-26 Impact factor: 3.293

6. Pathogenicity and molecular characteristics of infectious bronchitis virus (IBV) strains isolated from broilers showing diarrhoea and respiratory disease.

Authors: J L Chacón; M S Assayag; L Revolledo; C S Astolfi-Ferreira; M P Vejarano; R C Jones; A J Piantino Ferreira
Journal: Br Poult Sci Date: 2014 Impact factor: 2.095

7. A 'novel' infectious bronchitis strain infecting broiler chickens in Italy.

Authors: I Capua; R E Gough; M Mancini; C Casaccia; C Weiss
Journal: Zentralbl Veterinarmed B Date: 1994-04

8. Sequence analysis of the S1 glycoprotein of infectious bronchitis viruses: identification of a novel genotypic group in Australia.

Authors: S I Sapats; F Ashton; P J Wright; J Ignjatovic
Journal: J Gen Virol Date: 1996-03 Impact factor: 3.891

Review 9. The molecular biology of coronaviruses.

Authors: M M Lai; D Cavanagh
Journal: Adv Virus Res Date: 1997 Impact factor: 9.937

10. Sequence analysis of infectious bronchitis virus isolates from the 1960s in the United States.

Authors: Shankar Mondal; Yung-Fu Chang; Udeni Balasuriya
Journal: Arch Virol Date: 2012-10-11 Impact factor: 2.574
