Literature DB >> 17054987

Mosaic structure of human coronavirus NL63, one thousand years of evolution.

Krzysztof Pyrc¹, Ronald Dijkman, Lea Deng, Maarten F Jebbink, Howard A Ross, Ben Berkhout, Lia van der Hoek.

Abstract

Before the SARS outbreak only two human coronaviruses (HCoV) were known: HCoV-OC43 and HCoV-229E. With the discovery of SARS-CoV in 2003, a third family member was identified. Soon thereafter, we described the fourth human coronavirus (HCoV-NL63), a virus that has spread worldwide and is associated with croup in children. We report here the complete genome sequence of two HCoV-NL63 clinical isolates, designated Amsterdam 57 and Amsterdam 496. The genomes are 27,538 and 27,550 nucleotides long, respectively, and share the same genome organization. We identified two variable regions, one within the 1a and one within the S gene, whereas the 1b and N genes were most conserved. Phylogenetic analysis revealed that HCoV-NL63 genomes have a mosaic structure with multiple recombination sites. Additionally, employing three different algorithms, we assessed the evolutionary rate for the S gene of group Ib coronaviruses to be approximately 3 x 10(-4) substitutions per site per year. Using this evolutionary rate we determined that HCoV-NL63 diverged in the 11th century from its closest relative HCoV-229E.

Entities: Chemical Disease Gene Species

Mesh：

Year: 2006 PMID： 17054987 PMCID： PMC7094706 DOI： 10.1016/j.jmb.2006.09.074

Source DB: PubMed Journal: J Mol Biol ISSN： 0022-2836 Impact factor: 5.469

Introduction

Coronaviruses, a genus of the Coronaviridae family, are enveloped viruses with a large plus-strand RNA genome. The genomic RNA is 27−32 kb in size, capped and polyadenylated. Coronaviruses have been identified in bats, mice, rats, chickens, turkeys, swine, dogs, cats, rabbits, horses, cattle and humans and cause highly prevalent diseases such as respiratory, enteric, cardiovascular and neurological disorders.1., 2. Originally, coronaviruses were classified on the basis of antigenic cross-reactivity in three antigenic groups. When coronavirus genome sequence data began to accumulate, the original antigenic groups were converted into genetic groups based on similarity of the nucleotide sequences. The coronaviruses possess a characteristic genome composition. The 5′ two-thirds of a coronavirus genome encodes two polyproteins (1a and 1ab) that contain all proteins necessary for RNA replication. The 3′ one-third of a coronavirus genome encodes several structural proteins such as spike (S), envelope (E), membrane (M) and nucleocapsid (N) proteins that, among other functions, participate in the budding process and are incorporated into the virus particle. Additional accessory protein genes are located in the 3′ part of the genome in a coronavirus species-specific position. HCoV-NL63, a recently discovered member of the Coronaviridae family,4., 5., 6. has spread worldwide, is observed most frequently in the winter season and is associated with acute respiratory illness and croup in young children, elderly and immunocompromised patients.7., 8., 9., 10. A recent report suggested that HCoV-NL63 is one of the pathogens underlying Kawasaki disease, although other studies could not confirm this association.12., 13., 14. HCoV-NL63 belongs to the group I coronaviruses according to phylogenetic analyses. The highest similarity is observed with HCoV-229E and porcine epidemic diarrhoea virus (PEDV), 65% and 61%, respectively. Phylogenetic analysis based on gene 1a sequences indicates the presence of diverse HCoV-NL63 strains with distinct molecular markers. The increasing number of HCoV-NL63 sequences from several locations provides further evidence for this genetic diversity and confirms the presence of two main genetic clusters.10., 15., 16. However, drawing conclusions based on phylogenetic analysis of a single gene sequence and sometimes even a partial gene sequence requires caution as the true phylogeny can only be demonstrated by analyzing complete genome sequences. Full genome sequences of field isolates were, however, not available. Therefore, we sequenced the complete genomes of two HCoV-NL63 field isolates (Amsterdam 57 and Amsterdam 496) and genome fragments of 21 additional field isolates. Here, we present evidence for the in-field recombination of HCoV-NL63. Furthermore, we characterized the molecular variability of HCoV-NL63 isolates to have insight into the evolution of the virus. We observed high variability at certain genome regions and a molecular clock analysis revealed that the virus has been present in the human population for centuries.

Results

HCoV-NL63 isolates

We sequenced the complete genomes of two HCoV-NL63 isolates (Amsterdam 57 and Amsterdam 496) directly from patient material. Isolate 496 was obtained from the throat swab of an eight-month old boy (February 2003) and isolate 57 was amplified from the bronchoalveolar lavage of a 57-year old woman (December 2002), both suffering from acute respiratory illness. Both isolates displayed the same basic genome structure as previously described for HCoV-NL63. Furthermore, we partially sequenced additional isolates directly from patient material to analyze the variability of HCoV-NL63 (Table 1 ). The regions were: 1a gene nt 3004–3888 (3K) and nt 5815–6280 (6K), S gene nt 20,497–21,003 (21K), ORF3 gene nt 24,521–25,206 (25K), and N gene nt 26,136–27,166 (26K). In some cases only a few regions were amplified because of the low virus load in some patient samples.

Table 1

Clinical isolates of HCoV-NL63

Isolate name	Sampling date	Viral load in patient sample (copies/ml)	Fragments sequenced
3	n.k.	5.66 × 10⁵	3K
27	n.k.	2.23 × 10⁶	3K; 21K
42	n.k.	n.d.	26K
57	08.01.2004	2.00 × 10⁸	Full genome
63c	n.k.	n.d.	26K
72	31.12.2002	196305	3K; 6K; 21K; 25K; 26K
111	12.01.2004	n.d.	21K
120	13.01.2004	n.d.	21K
173	n.k.	n.d.	3K
202	31.03.2003	n.d.	21K
212	01.02.2003	n.d.	21K; 25K
223	08.01.2003	n.d.	21K; 25K; 26K
242	10.02.2003	n.d.	25K; 26K
246	13.01.2003	1.80 × 10⁴	6K; 21K; 25K; 26K
248	16.01.2003	1.98 × 10⁶	3K; 6K; 25K; 26K
251	07.01.2003	8.34 × 10⁷	25K
466	04.02.2003	5.36 × 10⁵	3K; 6K; 21K; 25K; 26K
496	25.02.2003	1.70 × 10⁶	Full genome
705	n.k.	3.30 × 10⁶	3K; 21K
744	n.k.	7.20 × 10⁶	3K
791	n.k.	3.70 × 10⁶	21K
857	10.12.2003	3.50 × 10⁷	3K; 6K; 21K; 25K; 26K
890	n.k.	7.39 × 10⁵	3K

n.k., not known. n.d., not determined.

Clinical isolates of HCoV-NL63 n.k., not known. n.d., not determined.

Genetic variability along the genome

Pair-wise sequence alignments of isolates Amsterdam 1, NL, 57 and 496 demonstrate an overall genome similarity of 99.0% between the HCoV-NL63 strains. We plotted the frequency of polymorphic nucleotides along the genome to visualize variable sites (Figure 1(a)) and identified two hypervariable regions, one in the 5′ part of the 1a gene encoding nsp1-nsp3 (nt 170–5000) and in the 5′ part of the spike gene (nt 20,300–22,000). The latter region encompasses the S1 region that contains a unique insert in HCoV-NL63 when compared to its closest relative HCoV-229E.

Figure 1

Molecular variability of HCoV-NL63 along the genome. (a) Frequency of polymorphic sites at the nucleotide level among four isolates of HCoV-NL63. (b)Frequency of polymorphic sites on the synonymous and non-synonymous positions among four isolates of HCoV-NL63. The analysis was done with a 500 nt non-overlapping window. Within the variable region 1–5000 nt in the 1a gene, we identified 126 variable sites among the four full genome isolates (55 non-synonymous substitutions), which resulted in 51 variable amino acid (aa) positions. In this region, we also identified an in-frame deletion of 15 nt in isolate 496 and NL (corresponding to nt 3321–3335 of Amsterdam 1). To determine the prevalence of this deletion in the virus population, we analyzed partial sequences of the 1a gene (region 3K) of additional HCoV-NL63 isolates (Table 1) and found it in three more patients (003, 890 and 248) while in eight patients no deletion was observed. The second variable region encompasses the S1 part of the S gene. Out of 175 polymorphic nucleotides (56 non-synonymous substitutions) leading to 51 aa substitutions, 119 are located in the first 1200 nt (46 non-synonymous substitutions) of the spike gene leading to 41 aa changes. Furthermore, we observed a 3 nt deletion within the S gene (corresponding to nt 20798 and 20800 of Amsterdam 1) in isolates 496, 57 and NL. Sequencing of 12 additional patient samples identified no additional variants with this 3 nt deletion. The analysis of synonymous/non-synonymous substitutions along the genome indicates that synonymous substitutions are generally in excess over non-synonymous substitutions (Figure 1(b)). To determine if the high variability in the 3K and 21K regions is driven by positive selection we analyzed these regions with PAML software, which provides maximum likelihood estimates of the extent of positive selection. Likelihood ratio tests were used to assess whether a model, which included positive selection, was significantly better than one that did not. When positive selection was indicated, empirical Bayes' methods were used to identify which individual sites were under positive selection. According to the PAML analysis the 3K and 21K regions showed no significant sign of positive selection. We analyzed the most conserved genome regions to identify suitable targets for the development of a PCR-based diagnostic assay that can detect all HCoV-NL63 isolates. The 1b polyprotein gene is highly conserved, with 25 variable nucleotides and only one aa substitution in the region nt 12401–15195 among the four isolates. This region encodes the RNA-dependent RNA polymerase (RdRp). The second most conserved region is the N protein and we confirmed the homogeneity in nine additional patient isolates. Of the 24 variable positions scored in a 1031 nt region, only five were non-silent and resulted in 4 aa changes. Although it was previously mentioned that the ORF3 gene is highly variable in strain NL, we observed a low heterogeneity in Amsterdam 1, 57 and 496. Also among 11 patient isolates, only ten polymorphic nucleotides were observed (one non-synonymous substitution), resulting in only 1 aa change in the patient isolates. A 3 nt insertion and an additional 1 aa change were observed only in the cultured NL isolate.

HCoV-NL63 recombination

We analyzed the full genome sequence of the four HCoV-NL63 isolates for possible recombination events. As the explorative bootscan analysis was not suggestive due to the low number of highly similar sequences and a stochastic noise that could not be distinguished from the real signal, we decided to analyze only the regions showing high number of informative sites. We analyzed the partial sequences of five regions (3K, 6K, 21K, 25K, and 26K) for 57, 496, NL and Amsterdam 1 isolates and nine additional patient isolates. The phylogenetic analysis confirms that in the 3K region the isolates Amsterdam 1 and 57 do cluster together in one subgroup while NL and 496 are located in the second (Figure 2, Figure 3 ). Phylogenetic analysis of the 6K region of several isolates reveals that the clustering pattern changes, with isolates NL, 496 and Amsterdam 1 in one subgroup and isolate 57 being an outgroup (Figure 2, Figure 3). In the 21K region the analysis shows that Amsterdam 1 is a single representative of one cluster, while NL, 496 and 57 are tightly clustering in the second group (Figure 2, Figure 3). Therefore, there is clear evidence that HCoV-NL63 isolates are mosaics with multiple recombinations along the genome.

Figure 2

Figure 3

Discordance in clustering of different isolates of HCoV-NL63 at regions 3K, 6K and 21K. Three alignments of only variable sites subtracted from the original sequence with DnaSP 4.0 software, shows the discordance in clustering at different regions of the genome. Groups A and B were created arbitrarily to show the discordance. Group A was defined as the group that contains the Amsterdam 1 isolate.

Discordance in phylogenetic clustering of different isolates of HCoV-NL63 at regions 3K, 6K and 21K. Phylogenetic trees were constructed as described in Materials and Methods using an UPGMA algorithm. The scale bar unit represents a 0.002 substitution per site. The trees were rooted with the sequences of it's closest relative: for 3K and 6K the HCoV-229E and for the 21K region the PEDV. Four completely sequenced isolates are marked with colored boxes, illustrating the discordance in clustering in different regions of the genome. Discordance in clustering of different isolates of HCoV-NL63 at regions 3K, 6K and 21K. Three alignments of only variable sites subtracted from the original sequence with DnaSP 4.0 software, shows the discordance in clustering at different regions of the genome. Groups A and B were created arbitrarily to show the discordance. Group A was defined as the group that contains the Amsterdam 1 isolate. Because in several regions the sequence was highly conserved and the analysis did not show signs of the presence of two genetic clusters it was very difficult to identify the exact location of the recombination spots. The S gene does contain enough informative sites, and we identified two spots of recombination: one between positions nt 21,072 and 21,161 and the second between positions nt 21,662 and 21,884 in the Amsterdam 1 genome (Figure 4 ).

Figure 4

Identification of recombination sites in the S gene. The alignment includes only subtracted variable sites. These variable sites were subtracted with DnaSP 4.0 software. The change of color represents the alternation of genetic clustering between isolates. The numbers at the top represent the beginning of the S gene (nt 20,472 in the Amsterdam 1 isolate), coordinates of recombination spots inside the S gene (nt 21,061–21,072 and 21,575–21,576 in the Amsterdam 1 isolate) and the 3′ terminus of the S gene (nt 24,542 in the Amsterdam 1 isolate).

Interspecies recombination

We also analyzed whether recombination within the coronavirus family can be identified. The analysis was performed with the SimPlot software by plotting the similarity between different members of the Coronaviridae family as well as by scanning the genome with the bootscan tool. Along the genome the similarity between HCoV-NL63 and HCoV-229E is the highest, except for one part of the M gene. The similarity graph shows that the 3′ region of the M gene has a higher nucleotide similarity to PEDV than to HCoV-229E (Figure 5(a)). Additionally, the bootscan analysis suggests recombination between an ancestral HCoV-NL63 strain and PEDV in that region (Figure 5(b)). To rule out that the observed effect is the result of convergent evolution we analyzed the synonymous substitution pattern between HCoV-NL63 and HCoV-229E. It has been described that the synonymous substitution rate is increased at genome regions, which originated from another species and may be used as a marker of gene transfer from another species.18., 19. Indeed a rise in the synonymous substitution rate in the 3′ region of the M gene was observed (Figure 5(c)).

Figure 5

Signs of interspecies recombination in the M gene. Similarity plot (a) and bootscan analysis with the Kimura (two-parameter) distance model, neighbor-joining tree model and 500 bootstrap replicates (b) of the HCoV-NL63 M protein. (c) Analysis of the substitution pattern on the synonymous and non-synonymous level between the M gene of HCoV-NL63 and HCoV-229E. The window used was 40 nucleotides with a step of one nucleotide.

Molecular clock analysis

Based on the nucleotide sequence coding for the S protein (nt 20,649–22,269), a maximum-likelihood phylogenetic tree was constructed for HCoV-NL63 and several HCoV-229E strains for which the date of isolation was known (three isolates from year 1967, five isolates from year 1999 and six isolates from year 2000). Based on these sequence data, the evolutionary rate of HCoV-229E was calculated by Bayesian coalescent approach, serial ML estimate21., 22. and sUPGMA approaches. The evolutionary rates estimated with these three approaches were of very similar magnitude and were 3.28 × 10− 4 (95% confidence interval, 1.72 × 10− 4 to 5.00 × 10− 4), 6.17 × 10− 4 and 2.82 × 10− 4 (95% confidence interval, 1.36 × 10− 4–4.42 × 10− 4) substitutions per site per year, respectively. Assuming a constant evolutionary rate in time and between the branches for HCoV-NL63 and HCoV229E, the time to the most recent common ancestor (TMRCA) of HCoV-NL63 and HCoV-229E was dated by the Bayesian coalescent approach around the year 1053 (95% highest posterior density interval, year 966 to 1142). This estimate was highly consistent under different demographic models, including an exponential growth (TMRCA around year 1105 (95% confidence interval, 1017 to 1188)), and expansion growth (TMRCA around year 1124 (95% confidence interval, 1038 to 1206)). A likelihood ratio test indicated that the molecular clock hypothesis could not be rejected (P = 0.05). We also attempted to date back the split of two HCoV-NL63 lineages, but due to several recombination spots in the spike gene region that we sequenced (see also Figure 4) this analysis was not possible. The 3K and 6K fragments could not be used because we only know the substitution rate for the 20,649–22,269 region in the HCoV-NL63 genome.

Discussion

Homologous recombination is well known for several RNA viruses, including coronaviruses.24., 25. A “copy-choice” mechanism has been proposed, in which the RdRp, together with the nascent RNA strand, dissociates from the original template and re-associates at the same position on another template subsequently recommencing RNA synthesis. Recombination of coronavirus genomes has been observed in vitro in cell culture,24., 25., 26. in experimentally infected animals, and in embryonated eggs. In the case of infectious bronchitis virus, there is evidence for homologous recombination occurring in the field.29., 30., 31., 32. We present evidence that recombination has occurred during the evolution of HCoV-NL63 and that viral isolates possess a mosaic genome structure. Recombination was discovered by full genome sequence analysis of HCoV-NL63 variants from clinical samples. Analysis of several genome regions showed discordance in the phylogenetic clustering along the genome, a clear sign of recombination. Because the majority of informative sites are located at non-coding positions one can exclude that related genome sequences were the result of convergent evolution due to positive selection. HCoV-NL63 is the causative agent of up to 10% of all respiratory illnesses.4., 7., 8., 9., 10., 15., 33., 34., 35., 36. This high prevalence obviously increases the possibility of a recombination event through co-infection with another human or zoonotically transmitted animal coronavirus. Thus, recombination might enable highly pathogenic recombinant virus variants to arise. There is some evidence for recombination between PEDV and an ancestral HCoV-NL63 strain. Whereas HCoV-NL63 is mostly similar to HCoV-229E, a part of the HCoV-NL63 M gene shows the highest similarity to PEDV, suggesting a possible interspecies recombination event. The M protein is a key player in virus assembly and budding, thereby interacting with other structural viral proteins. The M protein of coronaviruses has been shown to span the virion membrane three or four times. The Nexo-Cendo topology is adopted by most coronaviral M proteins, but it has been proposed that the M protein of TGEV is present in the viral envelope in two topological states, Nexo-Cendo and Nexo-Cexo. The sequence analysis with the TMHMM v 2.0 prediction software† suggests a similar scenario for HCoV-NL63 (data not shown). Also the M protein of PEDV, but notably not HCoV-229E, shares these characteristics, which suggests that this domain of the M gene is obtained by interspecies recombination. Coronaviruses are well equipped to adapt rapidly to changing ecological niches by the high substitution rate of their RNA genome. The average substitution rate for this family was estimated to be 10− 4 substitutions per year per site.38., 39. HCoV-NL63 is a member of the group I coronaviruses, with highest similarity to HCoV-229E and PEDV. These three species cluster with a recently described bat coronavirus (BatCoV, strain 61) in subgroup Ib. Our efforts to establish the substitution rate of HCoV-NL63 failed, as there is not enough sequence data available from isolates of past years. For this reason we decided to calculate the substitution rate for HCoV-229E, using partial sequences of the S gene from different dates. Based on this substitution rate, we dated the divergence time of HCoV-229E and HCoV-NL63 to the 11th century. The reliability of molecular dating is dependent on the validity of the molecular clock hypothesis, which assumes that the substitution rate is roughly constant. A maximum likelihood test confirmed that the molecular clock hypothesis is suitable for the coronavirus data set investigated here. We propose that around 900 years ago the HCoV-229E and HCoV-NL63 viruses started to evolve from a common ancestor into the direction of separate species. For HCoV-OC43 it has been estimated that it emerged at the end of the 19th century or the beginning of the 20th century, implying that it has only been around for 100 years.41., 42. The SARS-CoV was introduced in humans in 2002, and for HCoV-HKU1, this has not been investigated. The heterogeneity of HCoV-HKU1, which appears in two separate genotypes and a third genotype that is a recombinant of these two genotypes,43., 44. suggests that this virus was not recently introduced into the human population, similar to the situation of HCoV-NL63. The divergence of HCoV-NL63 and HCoV-229E was followed by a separation of HCoV-NL63 into two lineages, of which we suspect that this occurred at geographically distinct locations. Subsequently the two lineages recombined during co-infection, illustrated by the fact that all four full genome sequences available thus far display a mosaic genome organization. The variability of coronaviruses has not been studied thoroughly. Although there are several reports concerning SARS-CoV, the relatively short time that this virus was present in the human population makes a long term study impossible. The first variable region of HCoV-NL63 encodes the three proteins nsp1–nsp3. The biological function of nsp1 and nsp2 proteins is thought to be linked to virus replication.45., 46. The nsp3 protein is a coronaviral papain-like proteinase (PLpro) that is expected to be a multifunctional protein with several domains that mediate various enzymatic activities.47., 48. Thus, the high variability of this region might influence the viral replication and interaction with cellular proteins. The second variable region is located in the 5′ part of the S gene. This region contains 24% of all polymorphic nucleotides, whereas it encompasses only ∼ 4% of the genome. The coronavirus S protein is an important determinant for the host cell specificity and tissue tropism, which is largely determined by the distribution of its receptor. Recently, Hofmann et al. reported that HCoV-NL63 uses the angiotensin converting enzyme 2 (ACE2) molecule as a receptor. The interaction of NL63-S with ACE2 is surprising, as HCoV-NL63 is closely related to HCoV-229E, which uses CD13 as a receptor. Furthermore, NL63-S shares no appreciable amino acid identity with SARS-CoV-S, which does use ACE2. The amino acid sequence of the CD13-binding site in 229E-S is 57% conserved in NL63-S, whereas the alignment of the ACE2-binding site of SARS-CoV-S with NL63-S reveals only 14% aa identity. These data suggest that NL63-S and SARS-CoV-S might have evolved different strategies to interact with ACE2. The receptor binding domain of NL63-S protein resides in the S1 region. Thus, the variability that we observed may alter the ACE2-binding properties of the S protein or, alternatively, the binding to a co-receptor. Besides the two hypervariable regions, we also identified regions with a remarkably low substitution rate. The 1b gene is extremely conserved, most prominently in the region encoding RdRp (nt 12416–15195). Analysis of the ORF3 gene shows high conservation of this gene, unlike what has been reported for HCoV-NL63. This suggests a vital function of the ORF3 protein during natural infection. Further investigations on the ORF3 protein function is needed to determine its real biological relevance. The observation of recombination within the HCoV-NL63 group indicates that two lineages, identified in previous reports in the 1a gene, cannot be treated as separate lineages. Characterization and typing of currently circulating strains should be performed with at least two assays, based on sequences derived from hypervariable regions 1–6000 nt and 20,000–21,000. On the contrary, a sensitive diagnostic assay for detection of HCoV-NL63 should be designed in the regions with highest stability such as the 1b or N gene.

Materials and Methods

Patient isolates

HCoV-NL63 positive patients were identified by a diagnostic nested RT-PCR as described or by a real time PCR that was performed with primers NF (5′ GCGTGTTCCTACCAGAGAGGA 3′) and NR (5′ GCTGTGGAAAACCTTTGGCA 3′), and HCoV-NL63 was detected with probe NP (5′ FAM–ATGTTATTCAGTGCTTTG GTCCTCGTGAT–TAMRA 3′) as described. A total of 23 NL63-positive patients were identified within the Academic Medical Center, Amsterdam. Two HCoV-NL63 isolates were selected for full genome sequencing (57 and 496) and several genome fragments were sequenced for the other 21 isolates (Table 1). Sampling dates and patient characteristics are summarized in Table 1. Sequencing was performed on an ABI 3700 machine (Perkin-Elmer Applied Biosystems) using the BigDye terminator cycle sequencing kit (version 1.1). Chromatogram sequence files were inspected and assembled with CodonCode 1.4 and further corrected manually. Several genomic regions were amplified using the following primers. For the 1a gene: sense 5′ GGTCACTATGTAGTTTATGATG 3′ and sense 5′ GGATTTTTCA TAACCACTTAC 3′; antisense 5′ CTTTTGATAACGGTCACTATG 3′ and antisense 5′ CTCA TTACATAAAACATCAAACGG 3′. For the S gene: sense 5′ GGTTGTTGTTACGCAATAAT GGTCGT 3′; antisense 5′ ACACGGCCATTATGTGTGGT 3′. For ORF3: sense 5′ ATTGTT TAACTTCATCAATGC 3′; antisense 5′ CCATAAAATGGAATTGAGGACAATAC 3′. For N: sense 5′ CTCTCAGGAGGGTGTTTTGTCAGAAAG 3′; antisense 5′ ATAATAAACATTCA ACTGGAATTA C 3′.

RT-PCR and full genome sequencing

The cDNA used for sequencing was generated with MMLV-RT, 1 μg of random hexamer DNA primers, in 10 mM Tris (pH 8.3), 50 mM KCl, 0.1% (v/v) Triton-X100, 6 mM MgCl2 and 50 μM dNTPs at 37 °C for 90 min. The cDNA was converted into double-stranded DNA in a standard PCR reaction with 1.25 units of Taq polymerase (Perkin-Elmer) per reaction and appropriate primers. Full genome sequencing of the two field isolates of HCoV-NL63 was performed with single round RT-PCR as described above, with a set of overlapping PCR products (average size 700 bp) encompassing the entire genome. Primers were designed in regions that are conserved between the Amsterdam 1 and NL isolates of HCoV-NL63. Primer sequences used for full genome sequencing are available on request. The 5′ and 3′-terminal sequence were determined by 5′ RACE (Invitrogen) and 3′ RACE as described. Each PCR fragment was sequenced on both strands and the virus isolates were amplified and sequenced on separate dates to prevent sample contamination. Each experiment contained negative extraction controls. Sequencing was performed as described above.

Sequence analysis

Multiple sequence alignments were prepared with ClustalX 1.83 and manually edited in BioEdit. Phylogenetic analyses were conducted using MEGA, version 3.1. Bootscan and similarity graphs were prepared with SimPlot 2.5 software. Analysis of HCoV-NL63 variability and synonymous and non-synonymous substitutions was done with DnaSP 4.0 software. Positive selection was analyzed with PAML 3.14 software. The analysis is according to the codon-based evolution models (one-ratio, neutral and selection models) developed by Nielsen and Yang, which allows the ratio of synonymous and non-synonymous substitution rates to vary among amino acid positions. This method uses dS and dN to denote the rates of synonymous and non-synonymous substitutions, respectively. Their ratio reflects the selection intensity at the amino acid level. Evolutionary rates were estimated using three approaches: Bayesian inference in BEAST, version 1.2 (kindly made available by A. J. Drummond and A. Rambaut, University of Oxford‡ ) and serial ML estimate and sUPGMA with PEBBLE 1.0. Divergence times were estimated using Bayesian inference in BEAST, version 1.2‡. The Markov chain Monte Carlo (MCMC) length was 108 with a sample frequency of 103 and effective sample size of 9 × 104. The burn-in was 107. Convergence to stationarity was investigated using the Tracer 1.3 MCMC trace analysis tool. The molecular clock hypothesis was tested by the likelihood ratio test.

Nucleotide sequence accession numbers

The sequence of HCoV-NL63 isolate Amsterdam 57 and Amsterdam 496 described here were deposited in GenBank under accession numbers: DQ445911 and DQ445912, respectively. The partial sequences of patient isolates were deposited in GenBank under accession numbers DQ462752-DQ462792DQ462752DQ462753DQ462754DQ462755DQ462756DQ462757DQ462758DQ462759DQ462760DQ462761DQ462762DQ462763DQ462764DQ462765DQ462766DQ462767DQ462768DQ462769DQ462770DQ462771DQ462772DQ462773DQ462774DQ462775DQ462776DQ462777DQ462778DQ462779DQ462780DQ462781DQ462782DQ462783DQ462784DQ462785DQ462786DQ462787DQ462788DQ462789DQ462790DQ462791DQ462792. The GenBank accession number of HCoV-NL63 isolate Amsterdam-1 is NC_005831, isolate NL is AY518894, HCoV-229E is AF304460, and PEDV (strain CV777) is AF353511. The GenBank accession numbers of the 6K region of isolates 072, 246, 248 and 466 are AY567494, AY567489, AY567493 and AY567488, respectively.

50 in total

1. The inference of stepwise changes in substitution rates using serial sequence samples.

Authors: A Drummond; R Forsberg; A G Rodrigo
Journal: Mol Biol Evol Date: 2001-07 Impact factor: 16.240

2. Comparative analysis of 22 coronavirus HKU1 genomes reveals a novel genotype and evidence of natural recombination in coronavirus HKU1.

Authors: Patrick C Y Woo; Susanna K P Lau; Cyril C Y Yip; Yi Huang; Hoi-Wah Tsoi; Kwok-Hung Chan; Kwok-Yung Yuen
Journal: J Virol Date: 2006-07 Impact factor: 5.103

3. Complete genomic sequence of human coronavirus OC43: molecular clock analysis suggests a relatively recent zoonotic coronavirus transmission event.

Authors: Leen Vijgen; Els Keyaerts; Elien Moës; Inge Thoelen; Elke Wollants; Philippe Lemey; Anne-Mieke Vandamme; Marc Van Ranst
Journal: J Virol Date: 2005-02 Impact factor: 5.103

4. The Coronaviridae now comprises two genera, coronavirus and torovirus: report of the Coronaviridae Study Group.

Authors: D Cavanagh; D A Brian; M A Brinton; L Enjuanes; K V Holmes; M C Horzinek; M M Lai; H Laude; P G Plagemann; S G Siddell
Journal: Adv Exp Med Biol Date: 1993 Impact factor: 2.622

5. Human coronavirus NL63 associated with lower respiratory tract symptoms in early life.

Authors: Laurent Kaiser; Nicolas Regamey; Hanna Roiha; Christelle Deffernez; Urs Frey
Journal: Pediatr Infect Dis J Date: 2005-11 Impact factor: 2.129

6. High-frequency RNA recombination of murine coronaviruses.

Authors: S Makino; J G Keck; S A Stohlman; M M Lai
Journal: J Virol Date: 1986-03 Impact factor: 5.103

7. Estimating mutation parameters, population history and genealogy simultaneously from temporally spaced sequence data.

Authors: Alexei J Drummond; Geoff K Nicholls; Allen G Rodrigo; Wiremu Solomon
Journal: Genetics Date: 2002-07 Impact factor: 4.562

8. New human coronavirus, HCoV-NL63, associated with severe lower respiratory tract disease in Australia.

Authors: Katherine E Arden; Michael D Nissen; Theo P Sloots; Ian M Mackay
Journal: J Med Virol Date: 2005-03 Impact factor: 2.327

9. Detection of human coronavirus NL63 in young children with bronchiolitis.

Authors: Takashi Ebihara; Rika Endo; Xiaoming Ma; Nobuhisa Ishiguro; Hideaki Kikuta
Journal: J Med Virol Date: 2005-03 Impact factor: 2.327

10. Genome structure and transcriptional regulation of human coronavirus NL63.

Authors: Krzysztof Pyrc; Maarten F Jebbink; Ben Berkhout; Lia van der Hoek
Journal: Virol J Date: 2004-11-17 Impact factor: 4.099

93 in total

Review 1. The novel human coronaviruses NL63 and HKU1.

Authors: Krzysztof Pyrc; Ben Berkhout; Lia van der Hoek
Journal: J Virol Date: 2006-11-01 Impact factor: 5.103

Review 2. Current understanding of middle east respiratory syndrome coronavirus infection in human and animal models.

Authors: Yanqun Wang; Jing Sun; Airu Zhu; Jingxian Zhao; Jincun Zhao
Journal: J Thorac Dis Date: 2018-07 Impact factor: 2.895

3. Recent transmission of a novel alphacoronavirus, bat coronavirus HKU10, from Leschenault's rousettes to pomona leaf-nosed bats: first evidence of interspecies transmission of coronavirus between bats of different suborders.

Authors: Susanna K P Lau; Kenneth S M Li; Alan K L Tsang; Chung-Tong Shek; Ming Wang; Garnet K Y Choi; Rongtong Guo; Beatrice H L Wong; Rosana W S Poon; Carol S F Lam; Sylvia Y H Wang; Rachel Y Y Fan; Kwok-Hung Chan; Bo-Jian Zheng; Patrick C Y Woo; Kwok-Yung Yuen
Journal: J Virol Date: 2012-08-29 Impact factor: 5.103

4. Surveillance of Bat Coronaviruses in Kenya Identifies Relatives of Human Coronaviruses NL63 and 229E and Their Recombination History.

Authors: Ying Tao; Mang Shi; Christina Chommanard; Krista Queen; Jing Zhang; Wanda Markotter; Ivan V Kuzmin; Edward C Holmes; Suxiang Tong
Journal: J Virol Date: 2017-02-14 Impact factor: 5.103

Review 5. Recombination, reservoirs, and the modular spike: mechanisms of coronavirus cross-species transmission.

Authors: Rachel L Graham; Ralph S Baric
Journal: J Virol Date: 2009-11-11 Impact factor: 5.103

6. Isolation and characterization of current human coronavirus strains in primary human epithelial cell cultures reveal differences in target cell tropism.

Authors: Ronald Dijkman; Maarten F Jebbink; Sylvie M Koekkoek; Martin Deijs; Hulda R Jónsdóttir; Richard Molenkamp; Margareta Ieven; Herman Goossens; Volker Thiel; Lia van der Hoek
Journal: J Virol Date: 2013-02-20 Impact factor: 5.103

7. Human coronavirus NL63 open reading frame 3 encodes a virion-incorporated N-glycosylated membrane protein.

Authors: Marcel A Müller; Lia van der Hoek; Daniel Voss; Oliver Bader; Dörte Lehmann; Axel R Schulz; Stephan Kallies; Tasnim Suliman; Burtram C Fielding; Christian Drosten; Matthias Niedrig
Journal: Virol J Date: 2010-01-15 Impact factor: 4.099

8. Understanding Human Coronavirus HCoV-NL63.

Authors: Sahar Abdul-Rasool; Burtram C Fielding
Journal: Open Virol J Date: 2010-05-25

9. Distant relatives of severe acute respiratory syndrome coronavirus and close relatives of human coronavirus 229E in bats, Ghana.

Authors: Susanne Pfefferle; Samuel Oppong; Jan Felix Drexler; Florian Gloza-Rausch; Anne Ipsen; Antje Seebens; Marcel A Müller; Augustina Annan; Peter Vallo; Yaw Adu-Sarkodie; Thomas F Kruppa; Christian Drosten
Journal: Emerg Infect Dis Date: 2009-09 Impact factor: 6.883

Review 10. Coronaviruses post-SARS: update on replication and pathogenesis.

Authors: Stanley Perlman; Jason Netland
Journal: Nat Rev Microbiol Date: 2009-06 Impact factor: 60.633