Literature DB >> 32925939

Population genomics of louping ill virus provide new insights into the evolution of tick-borne flaviviruses.

Jordan J Clark^1,2, Janice Gilray², Richard J Orton¹, Margaret Baird¹, Gavin Wilkie¹, Ana da Silva Filipe¹, Nicholas Johnson^3,4, Colin J McInnes², Alain Kohl¹, Roman Biek⁵.

Abstract

The emergence and spread of tick-borne arboviruses pose an increased challenge to human and animal health. In Europe this is demonstrated by the increasingly wide distribution of tick-borne encephalitis virus (TBEV, Flavivirus, Flaviviridae), which has recently been found in the United Kingdom (UK). However, much less is known about other tick-borne flaviviruses (TBFV), such as the closely related louping ill virus (LIV), an animal pathogen which is endemic to the UK and Ireland, but which has been detected in other parts of Europe including Scandinavia and Russia. The emergence and potential spatial overlap of these viruses necessitates improved understanding of LIV genomic diversity, geographic spread and evolutionary history. We sequenced a virus archive composed of 22 LIV isolates which had been sampled throughout the UK over a period of over 80 years. Combining this dataset with published virus sequences, we detected no sign of recombination and found low diversity and limited evidence for positive selection in the LIV genome. Phylogenetic analysis provided evidence of geographic clustering as well as long-distance movement, including movement events that appear recent. However, despite genomic data and an 80-year time span, we found that the data contained insufficient temporal signal to reliably estimate a molecular clock rate for LIV. Additional analyses revealed that this also applied to TBEV, albeit to a lesser extent, pointing to a general problem with phylogenetic dating for TBFV. The 22 LIV genomes generated during this study provide a more reliable LIV phylogeny, improving our knowledge of the evolution of tick-borne flaviviruses. Our inability to estimate a molecular clock rate for both LIV and TBEV suggests that temporal calibration of tick-borne flavivirus evolution should be interpreted with caution and highlight a unique aspect of these viruses which may be explained by their reliance on tick vectors.

Entities: Chemical

Mesh：

Year: 2020 PMID： 32925939 PMCID： PMC7515184 DOI： 10.1371/journal.pntd.0008133

Source DB: PubMed Journal: PLoS Negl Trop Dis ISSN： 1935-2727

Introduction

The tick-borne encephalitis sub-complex of the genus Flavivirus (family Flaviviridae) is composed of several closely related arboviruses, all of which possess a single-stranded, positive sense, RNA genome of ~12kb [1]. Members of this subcomplex are found across the northern hemisphere [2,3] and include several zoonotic viruses, such as tick-borne encephalitis virus (TBEV), of which there are three predominant subtypes: European (TBEV-Eu), Siberian (TBEV-Sib) and Far-Eastern (TBEV-FE). TBEV frequently infects humans as incidental hosts, resulting in febrile illness and often fatal encephalitis [4]. The case fatality rate (CFR) of TBEV varies according to subtype, with TBEV-Eu and TBEV-Sib exhibiting a CFR of 1–2%, while the highly pathogenic TBEV-FE subtype exhibits a CFR of 20–40% [5]. In contrast, louping ill virus (LIV), which is closely related to TBEV, is predominantly associated with disease in ruminants and birds, whereas human cases are rare [6,7]. LIV is endemic to the British Isles [8]and spread by the hard tick Ixodes ricinus. First isolated in the early 1930s, LIV was identified as the causative agent of louping ill disease, which can cause mortalities in sheep (Ovis aries) and, more significantly, red grouse (Lagopus lagopus scotica) [9,10]. LIV is of significant economic concern to sheep farmers and the game estates where grouse are maintained for commercial shooting purposes. The threat posed by LIV is further aggravated by the movement of tick populations into new regions and higher altitudes [11-13], and a reduction in tick inter-stadial development time due to climate change [14]. While LIV is predominantly found within the UK and Republic of Ireland, the virus has also been reported in Southern Norway [15], the Danish island of Bornholm [16], and Far-Eastern Russia [17]. However, the biological mechanisms responsible for this disjointed geographic distribution are not well understood. Whereas the genomic diversity of TBEV is well characterised, evolutionary studies of LIV have been limited to the analysis of single genes [18], which can lead to inconsistent phylogenetic inference [19,20]. At present, only four complete LIV genomes are available, which has prevented a meaningful analysis of LIV genomic diversity and evolution and their comparison with other flaviviruses. Whilst recombination has previously been shown to contribute to dengue virus [21,22] and Japanese encephalitis virus evolution [23], recombination in the tick-borne flaviviruses (TBFV) is controversial. Two studies have reported a recombination event between LIV and TBEV-Eu [20,24]; however, subsequent work has raised doubts about these findings [25]. Although the geographic ranges of the two viruses are largely distinct [26], creating limited opportunities for interaction, recent detection of TBEV in Britain [27-29], where LIV is endemic, has added new urgency to investigate the possibility of recombination between the two viruses. Phylogenies are essential for determining the epidemiological and evolutionary history of viruses, and, by calibrating trees with a molecular clock, to place their history into a temporal context [30]. Previous phylogenetic studies have reported clock rates and time calibrated trees for both TBEV and LIV; however, only in the case of TBEV were estimates based on full genomes [20,31,32]. In contrast, rates for LIV were derived from glycoprotein sequences, resulting in considerable uncertainty with respect to phylogenetic relationships and divergence times [18]. Moreover, previous studies in both viruses implicitly assumed that time-stamped sequence data sets contained sufficient evolutionary signal to estimate clocks reliably. Recent work on other RNA viruses has shown that this assumption is not always met [33], warranting a more careful approach to phylogenetic dating in TBFV. Here, we address these knowledge gaps based on a novel dataset of 22 LIV genomes derived from UK isolates sampled over the past century. Combining these data with published genome sequences for LIV (Fig 1) and related viruses we sought to i) conduct the first systematic test for recombination and positive selection in LIV, ii) clarify the phylogeography of LIV and its phylogenetic relationship to other TBEV, and iii) re-assess the evolutionary rate and divergence time estimates of LIV and TBEV.

Fig 1

Map of the UK showing the sampling locations of the 22 LIV isolates generated in this study.

Isolates which have been sequenced during this study are coloured according to their country of isolation, with Scotland shown as blue, England as red, Ireland as green, and Wales as pink. Isolates whose sequence has been downloaded from GenBank are shown as black circles. Strains Primorye-185-91 and LEIV-7435Tur were isolated in the Russian Far-East and Turkmenistan and are therefore not shown on the map. Further details of the 26 LIV genomes utilised in this study are also presented.

Map of the UK showing the sampling locations of the 22 LIV isolates generated in this study.

Methods

Virus isolates

The isolation of the virus isolates sequenced in this study, with the exception of ENG_PEN6_2009, and WA_AB2_2010, is described elsewhere [18]. ENG_Dog_2015 was isolated from the brain of an infected dog and has been previously described [34]. All viruses with the exception of ENG_PEN6_2009, and WA_AB2_2010 were passaged once in BHK21 cells (which were derived from cell stocks present at the Moredun Research Institute). Isolates ENG_A_1980, ENG_DEV1_1983, ENG_DEV2_1989, IRE_IRE3_1968, and SCO_LOCH2_1993, did not sequence well after passage in BHK21 cells, therefore they were grown in the cell line CPT-Tert [35]. The CPT-Terts are an ovine derived cell line which were kindly provided by Dr. David Griffiths (Moredun Research Institute). All cells were maintained in DMEM supplemented with 10% FCS, 1% HEPES, 1000 units/ml penicillin and 1 mg/ml streptomycin in T25 cell culture flasks. Cells were infected at a MOI of 0.1 and incubated at 37°C with a 5% CO2 atmosphere for 36–48 hours at which point CPE became obvious.

RNA extraction & sequencing

RNA extraction of cells in 25 cm2 flasks was carried out using TRIzol reagent (Thermo Fisher Scientific). Cell culture media was removed, cells washed twice with PBS, and 1 ml TRIzol reagent added to the cell monolayer. Samples ENG_PEN6_2009 and WA_AB2_2010 were obtained from sheep brain tissue which was taken from suspected LIV cases and homogenised in TRIzol reagent (Thermo Fisher Scientific). RNA extraction was carried out as per manufacturer’s instructions. RNA libraries were prepared for sequencing using the TruSeq RNA kit (Illumina), following the manufacturer’s protocol. Briefly, sample preparation utilized 400 ng of sample RNA, measured using a Qubit 3.0 fluorometer (Invitrogen). RNA was chemically fragmented and first strand cDNA generated using SuperScript II (Thermo Fisher Scientific) as per manufacturer’s instructions. Following the ligation of index sequences, samples were cleaned using AMPure XP beads (Beckman Coulter) and washed twice in 70% (v/v) ethanol. Following resuspension PCR was employed to enrich for the cDNA fragments which had adapter sequences ligated to their ends as per manufacturer’s instructions. The thermal cycler conditions employed were as follows: 98°C for 30 sec followed by 15 cycles of 98°C for 10 sec, 60°C for 30 sec, 72°C for 30 sec, and a final extension of 72°C for 5 minutes. Following PCR, libraries were cleaned with AMPure XP beads as described before. All libraries were pooled together at equimolar concentrations. The molarity of the libraries was calculated based on mass concentration using a Qubit 3.0 fluorometer (Invitrogen) and the size of the fragments using an Agilent 2200 TapeStation (Agilent) as per manufacturer’s instructions. Pooled libraries were then sequenced on an Illumina MiSeq (Illumina). This sequencing strategy utilises paired end reads of 250bp. The number of reads obtained for each genome and the genome coverage of each genome is displayed in S1 Table.

Sequence assembly

Prior to bioinformatic analysis, read quality was assessed using FASTQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/). The raw FASTQ sequence reads have been submitted to the Europe Nucleotide Archive (ENA) under accession number PRJEB38554. Adapter sequences were removed, and quality filtered using trim_galore (https://www.bioinformatics.babraham.ac.uk/projects/trim_galore/) utilising a quality threshold of Q25 and a minimum read length of 75. Reads were also filtered for low complexity and duplicates using prinseq [36]. Filtered reads were subsequently mapped onto a LIV complete genome sequence downloaded from GenBank (LIV 369/T2, accession number: Y07863) using alignment software (BWA-MEM, [37]). The assembled data was parsed using DiversiTools (http://josephhughes.github.io/btctools/) to determine the frequency of nucleotides at each site and to construct a consensus sequence; consensus is defined as the most dominant base at each genome position. To validate the consensus sequences generated via read alignment to the reference genome, all sequences from the first twelve samples we sequenced were also de novo assembled using SPAdes [38]. Reference aligned sequences and de novo assembled sequences were found to be identical, therefore all subsequent genomes were generated via reference alignment. The accession numbers corresponding to the sequences generated during this study are shown in S2 Table.

Sequence alignment & analysis

Multiple sequence alignment was carried out using the MUSCLE program within the Geneious software package [39,40]. Eight iterations were utilised within MUSCLE. Distance tables were generated within Geneious to determine the pairwise identity of sequences. The sequences utilised for this analysis are shown in S2 Table.

Recombination screening

Alignments were utilised for recombination analysis using recombination detection program 4 (RDP4) [41]. Within RDP4 the: RDP, geneconv, Bootscan, MaxChi, Chimaera, Siscan and 3Seq methods were utilised [42-48]. Standard parameters were employed for screening. Recombination events with a p-value of <0.05 and which were identified by more than three methods were considered significant.

Selection analysis

Selection analysis was carried out using the HYPHY software via the online selection tool Datamonkey (http://www.datamonkey.org/) [49,50]. Tests for positive selection were carried out using the fixed effects likelihood (FEL) analysis, and Mixed Effects Model of Evolution (MEME) analysis [50,51]. Genome-wide selection was performed using an alignment of 26 LIV coding sequences (CDS) however, single gene alignments were also analysed. For this analysis, a p-value of ≤ 0.05 was considered significant. HYPHY was also utilised to estimate a phylogenetic tree using the dataset of 26 LIV genomes and the aBS-REL (adaptive branch-site random effects likelihood) model [49,50,52]. The resulting tree was downloaded from the Datamonkey website and the total branch length compared with the total branch length of the ML tree generated using PHYML using the ape package within the R software package (R Core Team, https://www.r-project.org/, http://ape-package.ird.fr/).

Phylogenetic analysis

The most suitable substitution model for the LIV dataset investigated using the program JModelTest2 (https://github.com/ddarriba/jmodeltest2 [53]). The most suitable model with the lowest BIC score was found to be the GTR+G+I model. Bayesian phylogenetic trees were generated using the program MrBayes within the Geneious software suite [40,54]. The GTR+G+I model was employed with 4 gamma categories and an MCMC chain length of 1 million with 4 heated chains. 10% of these were discarded as burn-in and a consensus tree generated. Omsk haemorrhagic fever virus (accession number: NC_005062) was included as an outgroup. Maximum likelihood [ML] trees were utilised for root-to-tip divergence analysis of the LIV dataset and were generated using the program PHYML within the Geneious software suite [40,55]. Trees were estimated using the GTR substitution model with 1000 parametric bootstrap replicates. Root-to-tip divergence analysis to test for the presence of a molecular clock in the LIV phylogeny was carried out using the program TempEst utilising the best-fitting root option (http://tree.bio.ed.ac.uk/software/tempest/)[56]. The rate of evolution of LIV was estimated using the Bayesian evolutionary analysis sampling trees (BEAST) software package (http://beast.community/, version 1.8.4) [57]. To determine the most appropriate molecular clock model and coalescent model for the LIV dataset, marginal likelihood estimation (MLE) using path-sampling (PS) and stepping-stone (SS) was carried out [57-59]. A chain length of 50,000,000 was utilised with 100 steps of chain length 1,000,000 for the PS or SS MLE. A model with a lognormal relaxed clock and a Bayesian skygrid tree prior was found to have the lowest MLE and was therefore deemed most suitable [60,61]. BEAST analysis was carried out using the GTR+G+I substitution model, the uncorrelated lognormal relaxed clock, the Bayesian skygrid coalescent model and a chain length of 50,000,000. The BEAGLE library was employed when running BEAST to improve the speed of runs [62]. The resulting log files were analysed using the program Tracer (https://github.com/beast-dev/tracer/releases/tag/v1.7.1, version 1.7.0, [63]) to ensure that the effective sample size (ESS) of each parameter in the complete BEAST run was above 200. Final maximum clade credibility (MCC) trees were generated using the program TreeAnnotator (version 1.8.4) with burn-in specified as 10% of states. To determine the validity of the clock estimate, BEAST analysis was repeated using a null model composed of the original sequence data with dates randomised. This analysis was repeated 20 times and the clock rate estimates compared to the estimates produced utilising the heterochronus dataset using the program TRACER. A TBEV-Eu tree was also generated in BEAST using an alignment of 36 TBEV-Eu sequences sampled across Europe and Russia. A chain length of 50,000,000 and the Bayesian skygrid coalescent model was employed. As with the LIV dataset, the clock-rate estimate was investigated using 20 date randomised datasets. All final phylogenetic trees were visualised using FigTree (version 1.4.3, http://tree.bio.ed.ac.uk/software/figtree/).

Results

No evidence for recombination in LIV

New complete genome sequences of 22 LIV isolates were obtained and aligned with 4 LIV genomes available from GenBank (Fig 1), as well as 36 TBEV-Eu genomes, four TBEV-Sib genomes, 21 TBEV-FE genomes and single genomes of Spanish sheep encephalitis virus (SSEV), Spanish goat encephalitis virus (SGEV), Greek goat encephalitis virus (GGEV), and Turkish sheep encephalitis virus (TSEV) (total of 91 TBFV genome sequences; S1 Table). Incorporation of the published LIV 369(T2) sequence (GenBank accession number Y07863, hereafter referred to as LIV 369/T2_Y07863) in this alignment resulted in the detection of a recombination signal located between nucleotides 5924 and 6129 of the LIV genome (Fig 2A). However, our dataset of 22 newly generated LIV genomic sequences included a re-sequencing of the original LIV 392(T2) isolate (hereafter referred to as SCO_369/T2_1963), which we found to differ from that published in GenBank. When the published sequence was replaced with the newly generated sequence, the recombination signal was no longer detected. Aligning the recombinant region containing both the published and re-sequenced LIV 369(T2) isolate, reveals that bases 5924 and 6129 of the LIV genome are shared between the GenBank derived LIV 369/T2_Y07863 sequence and several TBEV-Eu strains, whilst in the re-sequenced SCO_369/T2_1963 genome, this region is highly similar to all other LIV isolates (Fig 2B). This suggests that that the recombination signal detected in the published LIV 369/T2_Y07863 genome is an artefact. Available data for LIV therefore provide no evidence for recombination with other TBFV.

Fig 2

A) Recombination detection program (RDP) analysis of the alignment of LIV, LIV-like, TBEV-Eu, TBEV-Sib, and TBEV-FE sequences. A total of 91 genome sequences were included in the analysis. The blue line represented pairwise identity in the GenBank derived LIV 369/T2_Y07863 isolate whilst the purple line represented the pairwise identity of TBEV-Eu strain Neudoerfl. The recombination signal located to the area of intersection, located between nucleotides 5924 and 6129 of the LIV genome. The yellow line represents the LIV isolate IRE_IRE2_1971 which was identified as a potential recombination event however this was not statistically supported and was therefore rejected. B) A simplified alignment of a portion of the suspected recombination region, from nucleotides 5932–6010 for ease of comparison between LIV and TBEV. This is derived from a shortened alignment of 67 genomes which excludes the TBEV-Sib and TBEV-FE genomes. These TBEV sub-types did not contribute to the observed artefactual recombination signal and are omitted here for clarity. TBEV-Eu, LIV-like and LIV sequences are indicated by vertical bars, placed to the left of the sequence names and coloured yellow, green and blue respectively. The GenBank derived LIV 369/T2_Y07863 isolate, and the re-sequenced SCO_369/T2_1963 genome generated during this study are highlighted. Note that the LIV 369/T2_Y07863 sequence is most similar to the TBEV-Eu sequences, whereas the SCO_ 369/T2_1963 sequence is most similar to the LIV sequences.

Genome-wide sequence comparison and selection analysis of LIV

The 26 LIV isolates shared a mean nucleotide identity of ~96% across the entire genome (range: 92.1%–99.9%; Table 1) and a mean amino acid identity of 98.6 (range: 96.9%-99.97%). Average nucleotide identity for individual genes was also ~96%, with NS2A being the most variable (88.4%-91.3% nucleotide identity, 91.3–100% amino acid identity). The amino acid identity for all genes was around ~98%, with NS2B and NS3 most conserved (99.1% amino acid identity). The genetic diversity present within the LIV dataset is similar to that found for TBEV-Eu, with the exception of the 3’ UTR which is approximately 96% identical between the LIV isolates, but is highly variable between TBEV-Eu isolates, sharing only around 78% identity (Table 1). Analysis of selection patterns was carried out on the coding region of all 26 LIV genomes (Table 2), using the fixed effects likelihood (FEL) model and the mixed effects model of evolution (MEME). FEL assumes that selection pressures acting on each codon remain constant throughout the phylogeny [64] while MEME allows for positive selection to be episodic and thus to only apply to some branches [51].

Table 1

Distances table showing the pairwise genetic and amino acid distances of an alignment of 26 LIV genomes, an alignment of 36 TBEV-Eu genomes, and an alignment combining both datasets.

Mean distances are shown with the range highlighted in brackets. The alignments were generated, and distance tables generated using MUSCLE within the Geneious software package. As some sequences contained gaps at the 5’ and 3’ genomic termini, only complete genomes were utilised for comparison of the full genome sequence, and the 5’ and 3’ UTRs. For LIV 21 sequences were utilised for the 5’ UTR, 22 for the 3’ UTR and 21 for the full genome sequence. For TBEV-Eu 26 sequences were utilised for the 5’ UTR, 20 for the 3’UTR, 22 for the full genome nucleotide sequence and 36 genomes for the amino acid sequence.

		Identity (%)				Identity (%)				Identity (%)
		Nucleotide	Amino acid			Nucleotide	Amino acid			Nucleotide	Amino acid
LIV	Genome	96.8 (92.3–99.9)	98.6 (96.9–99.97)	TBEV-Eu	Genome	96.5 (94.3–99.9)	99.1 (98.5–99.97)	LIV/TBEV-Eu	Genome	91.3 (84.9–99.9)	96.96 (94.3–99.97)
	5’UTR	95.4 (83.5–100)	-		5’UTR	97.6 (93.1–100)	-		5’UTR	92.0 (82.6–100)	-
	C	96.9 (91.7–100)	96.9 (89.3–100)		C	98.1 (95.8–100)	98.6 (96.4–100)		C	93.0 (83.9–100)	94.0 (84.8–100)
	prM/M	96.7 (91.5–100)	98.9 (97–100)		prM/M	96.9 (93.8–100)	98.5 (94.7–100)		prM/M	93.7 (87.5–100)	96.1 (89.3–100)
	E	96.6 (91.3–100)	98.6 (96.4–100)		E	97.9 (96.6–99.9)	99.5 (98.2–100)		E	92.2 (86.1–100)	96.6 (92.5–100)
	NS1	96.6 (91.8–100)	98.5 (96.3–100)		NS1	98.2 (96.5–100)	99.1 (97.2–100)		NS1	92.4 (85.6–100)	97.2 (93.8–100)
	NS2A	96.0 (88.4–91.3)	97.1 (91.3–100)		NS2A	97.0 (94.3–99.9)	97.6 (94.8–100)		NS2A	90.3 (82.3–100)	94 (87.4–100)
	NS2B	96.6 (90.3–100)	99.1 (96.2–100)		NS2B	98.1 (95.7–100)	99.4 (96.9–100)		NS2B	92.5 (84.2–100)	99.4 (97–100)
	NS3	96.6 (91.8–100)	99.1 (97.7–100)		NS3	97.9 (96.9–100)	99.5 (98.2–100)		NS3	92.7 (86.6–100)	97.9 (95.2–100)
	NS4A	96.3 (91.9–100)	98.9 (96.6–100)		NS4A	97.3 (95.1–100)	99.1 (96–100)		NS4A	91.5 (83.7–100)	97.5 (93.3–100)
	NS4B	96.6 (92.2–100)	98.7 (96–100)		NS4B	97.4 (95.4–100)	98.7 (96.8–100)		NS4B	92.1 (85.3–100)	96.2 (92.5–100)
	NS5	96.9 (92.7–100)	98.9 (97.5–100)		NS5	97.8 (96.7–100)	99.2 (98.1–100)		NS5	93.0 (87.4–100)	97.6 (95.4–100)
	3’UTR	96.2 (91.4–99.8)	-		3’UTR	77.8 (51.4–100)	-		3’UTR	77.5 (51.4–100)	-

Table 2

Alignment of codon sites which were found to be subjected to positive selection.

	Gene	E		NS2A	NS3	NS5
	Model	MEME	MEME	FEL	FEL/MEME	MEME	MEME	FEL/MEME
	Codon site	88	308	100	96	434	522	699
	p-value	0.040	0.051	0.043	0.021/0.035	0.031	0.012	0.005/0.014
	Consensus	I	D	G	T	A	K	A
LIV isolates	IRE_IRE2_1971	.	.	.	.	R	.	T
	WA_AB2_2010	A	.	.	.	H	.	.
	WA_I_1980	A	.	S	.	H	.	V
	SCO_31_1931	.	.	.	.	H	.	T
	Penrith	.	.	.	.	.	.	.
	ENG_PRES1_1991	.	.	.	.	.	.	.
	ENG_PEN6_2009	.	.	.	.	.	.	.
	ENG_PEN3_1983	.	.	C	I	.	.	.
	ENG_PEN4_1983	.	.	.	.	.	.	.
	IRE_IRE3_1968	.	.	.	.	.	.	V
	SCO_369/T2_1963	.	.	.	.	.	.	.
	SCO_G_1979	.	.	.	.	.	.	.
	ENG_Dog_2015	.	.	.	M	.	.	.
	SCO_INV14_1992	.	.	.	.	.	S	.
	SCO_INV6_1986	.	E	.	M	.	.	.
	LEIV-7435Tur	.	.	.	M	.	.	T
	Primorye-185-91	.	.	.	M	.	.	T
	SCO_LOCH6_1993	.	.	.	.	.	.	.
	SCO_LOCH2_1993	.	.	.	I	.	.	.
	SCO_K_1980	.	.	.	I	.	.	V
	LIV 3/1	.	.	.	V	.	.	.
	SCO_INV1_1983	.	E	S	M	.	.	.
	ENG_DEV4_1995	.	E	S	I	.	.	.
	ENG_A_1980	.	E	S	I	.	.	.
	ENG_DEV1_1983	.	E	S	I	.	.	.
	ENG_DEV2_1989	.	E	S	I	.	.	.

Distances table showing the pairwise genetic and amino acid distances of an alignment of 26 LIV genomes, an alignment of 36 TBEV-Eu genomes, and an alignment combining both datasets.

Alignment of codon sites which were found to be subjected to positive selection.

Codon sites are numbered individually for each gene analysed. FEL denotes sites which have been identified using FEL, MEME denotes sites which have been identified by MEME, and FEL/MEME denotes sites identified by both FEL and MEME analysis. The statistical support for the selected site is given as a p-value, only sites with p < 0.05 are reported. The consensus amino acid is shown for each site and represents the most common amino acid found at that site. Sites which match the consensus are shown as dots, whilst divergent sites display the amino acid which does not match the consensus. Within E, two sites were identified as being subjected to episodic diversifying selection as shown by MEME analysis (Table 2). Of these sites, codon 308 within E encoded a charged aspartic acid residue in all isolates except SCO_INV6_1986, SCO_INV1_1983, ENG_DEV4_1995, ENG_A_1980, and ENG_DEV1_1983, which instead possess a similarly charged glutamic acid. In addition, codon site 100 within NS2A was identified as being under positive selection. Most isolates possessed a glycine residue at this site, however the clade of isolates sampled from southern England (ENG_DEV4_1995, ENG_A_1980, ENG_DEV1_1983, and ENG_DEV2_1989) encodes a serine, as does the Scottish isolate SCO_INV_1983 and the Welsh isolate WA_AB2_2010, whilst the northern English isolate ENG_PEN3_1983 exhibits a cysteine residue. Codon 96 within NS3 was identified by both MEME and FEL and encoded a threonine residue in the majority of sequences, whilst several isolates throughout the tree, sampled decades apart, from geographically distinct areas, exhibited a methionine, an isoleucine, or a valine. Codon 434 of the NS5 gene was found to encode a histidine in the Welsh isolates, an arginine in IRE_IRE2_1971, a histidine in the oldest sampled Scottish isolate SCO_31_1931 and an alanine in all other isolates (Table 2). As the Welsh isolates, and strains IRE_IRE2_1971 and SCO_31_1931 represent the most ancestral LIV isolates, codon site 434 may therefore have evolved from a histidine to an alanine as LIV spread throughout Britain. SGEV exhibits a histidine at codon 434, further indicating that histidine is the ancestral codon. Codon 699 within NS5 was also identified as being subjected to positive selection with most isolates encoding an alanine, but with several isolates sampled from different years and locations throughout the tree encoding a valine or threonine. Additionally, FEL identified 482 sites which were under purifying selection (Fig 3). The dN/dS ratio across the entire CDS was found to be 0.0745, indicative of strong purifying selection. Of the individual LIV genes, NS2B appeared to be under the strongest purifying selection (ω = 0.0342), whilst capsid (C) exhibited the weakest (ω = 0.241).

Fig 3

A) Gene by gene selection analysis of the 26 LIV genomes included in this study. dN/dS values are shown for each of the LIV genes. A dN/dS value of <1 is indicative of purifying selection. The number of sites under positive and negative selection was calculated using FEL and MEME analysis in the HYPHY software package as part of the DataMonkey web server. B) Graph of dN/dS values for each of the LIV genes.

Phylogenetic analysis of LIV and closely related TBFV

A Bayesian phylogenetic tree of TBFV genomes placed all LIV isolates into a single monophyletic clade, with SGEV being the closest relative, followed by SSEV and, more distantly TBEV-Eu (Fig 4A). Previous phylogenetic analyses using LIV E gene sequences exhibited only a few nodes with bootstrap values >75% [18], however in our analysis the nodes representing splits between TBFV species all received 100% posterior support as did most splits within LIV, demonstrating the increased phylogenetic certainty which can be achieved using whole genome data (Fig 4B). Within the LIV clade, the Welsh strains WA_I_1980 and WA_AB2_2010 and the Irish strain IRE_IRE2_1971 formed a sister clade to all remaining strains, including all isolates from England and Scotland (86% consensus support).

Fig 4

Bayesian consensus trees generated using MrBayes utilising the 22 LIV genomes generated during this study, four LIV genomes available on Genbank, and eight genomes of closely related BBFs.

Bayesian consensus trees generated using MrBayes utilising the 22 LIV genomes generated during this study, four LIV genomes available on Genbank, and eight genomes of closely related BBFs.

A) The Bayesian consensus tree with the LIV clade collapsed, highlighting the relationship of LIV with the other TBFVs included in the analysis. B) The LIV clade alone with SGEV included as an outgroup. The % consensus support value of all nodes was 100, except for those specified. The scale bar represents the number of substitutions per site. The geographic area of isolation of the LIV isolates in B) is denoted by colour. Trees were generated using MrBayes (version 3.2.6, [54]) within the Geneious software suite (version 7.1.9 [40]). The GTR+G+I substitution model with four gamma categories was utilised, as this was found to suit the dataset best using JModelTest [53]. The trees were generated using four heated Markov chain Monte Carlo (MCMC) chains, with a chain length of 1,000,000. Consensus trees were generated using 10% burn-in and a support threshold of 50% and visualised using FigTree (version 1.4.3). The tree contained evidence of geographic clustering, even for samples taken decades apart. In addition to the two Welsh strains (1980 vs 2009), this was true for English strains sampled between 1983 and 2009 in the borders region at Penrith (ENG_PEN3_1983, ENG_PEN4_1983, ENG_PEN6_2010) which clustered in a clade that fell close to other strains from the same area, ENG_PRES1_1991 and Penrith. The four isolates sampled from sheep between 1980 and 1995 from Southern England (ENG_DEV1_1983, ENG_DEV2_1989, ENG_DEV4_1995, ENG_A_1980) also formed a separate clade. There was somewhat less evidence of geographic clustering between the Scottish LIV isolates though most of them fell into a single clade that also contained isolates from other areas. One clade was formed between three isolates sampled from grouse between 1980 and 1993 in the Lochindorb region of Northern Scotland (SCO_LOCH2_1993, SCO_LOCH6_1993, SCO_K_1980). Conversely, three isolates sample near Inverness in the Scottish Highlands did not form a single clade and instead separated into multiple lineages. There were also several cases in which geographically distant isolates grouped together. Specifically, a 1968 Irish isolate (IRE_IRE3_1968), a 2015 isolate from Devon (ENG_dog_2015) and two LIV strains from Far-Eastern Russia (1991) and Turkmenistan (1985), all fell within the larger Scottish clade. This indicates repeated long-distance movement of LIV between Scotland and other countries.

Molecular clock analysis reveals a weak temporal signal

Extracting the genetic divergence from the root for all LIV isolates, based on the Bayesian phylogeny, and regressing these values against the sampling dates using TempEst [56] revealed a positive relationship (S1 Fig). This confirmed that the data contain a clock signal, justifying molecular clock analysis in BEAST. The topology of the maximum clade credibility (MCC) tree estimated in BEAST was identical to that of the MrBayes tree (Fig 5). As the presence of strong purifying selection can result in an underestimation of branch lengths, a tree was generated using the adaptive branch-site random effects likelihood (aBS-REL) model [52]. Comparing the total tree length to that of a tree generated under a general time reversible (GTR+G+I) substitution model, revealed no major difference (aBS-REL: 0.38, GTR+G+I: 0.32). This indicates that the branch lengths of the LIV phylogeny were not underestimated as a consequence of purifying selection. To confirm whether a molecular clock rate could reliably be estimated from the LIV data, estimation in BEAST was repeated twenty times with datasets in which sampling dates were randomised. The HPD of the clock rates estimated by all twenty of the date randomised BEAST runs overlapped with the mean clock rate estimated from of the original data, indicating that the clock signal in data set was weak and potentially spurious (Fig 6A). The initial BEAST analysis yielded an evolutionary rate of 1.9x10-5 substitutions/site/year (subst/site/yr; 95% highest posterior density interval (HPD): 5.7x10-6–3.9x10-5 subst/site/yr) and a TMRCA of 3077 years before present (ybp, 95% HPD: 909–5616). Notwithstanding the weak clock signal, this calibration would suggest that most splits in the LIV phylogeny took place hundreds of years ago, with very few nodes having estimate ages within the past century (Fig 5). However, due to potential unreliability of the clock rate, any divergence time inferences should be treated with caution. To test whether the weak molecular clock in the LIV dataset was characteristic for TBFV in general, we estimated a molecular clock and performed date randomisation tests for TBEV-Eu, a close relative of LIV. The clock rate obtained was 3.3x10-5 subs/site/yr (95% HPD: 2x10-5-5x10-5 substitutions/site/year) which is consistent with previous estimates based on sub genomic datasets [20,32]. Rates estimated for two of the twenty date-randomised runs overlapped with the rate estimated from the observed dataset (Fig 6B).

Fig 5

Time-scaled maximum clade credibility (MCC) tree of 26 LIV genomes collected between 1931 and 2015 estimated in BEAST.

Nodes with posterior probability values of 1.0 are displayed as diamonds whilst values < 1 are shown. The 95% HPD of the node age is shown for selected nodes. However, due to the data set failing a date randomisation test, node age estimates should be considered more uncertain than indicated here. Isolate names are colour coded according to geographic isolation. The dates of isolation of all isolates are included in their tip names.

Fig 6

Plots produced using Tracer which shows the estimated clock rate of the A) LIV and B) TBEV-Eu datasets.

The rate estimated using the correctly assigned dates are shown in blue while the estimated substitution rate of these datasets using twenty BEAST runs with randomised tip isolation dates are shown in grey. The dotted line highlights runs in which the 95% HPD of the runs overlap.

Time-scaled maximum clade credibility (MCC) tree of 26 LIV genomes collected between 1931 and 2015 estimated in BEAST.

Plots produced using Tracer which shows the estimated clock rate of the A) LIV and B) TBEV-Eu datasets.

Discussion

This study represents the largest evolutionary analysis of LIV to date using a representative set of full genomes sampled across the species’ range. Relative to previous analyses, which had been limited to single alignments, this provided new insights into LIV recombination, selection, epidemiology and evolutionary history. No detectable recombination signal was found between LIV and any of the TBFV included. A recombination event between LIV reference strain LIV 369/T2_Y07863 (based on a published genome sequence [1]) and the TBEV-Eu reference strain Neudoerfl had previously been suggested [20,24]. A subsequent study obtained the LIV 369/T2 strain from the European virus archive and, upon re-sequencing the entire genome, found that it differed from the published LIV 369/T2_Y07863 sequence and did not exhibit the previously reported recombination signal [25]. The LIV 369/T2 isolate sequenced in this study was derived from the original virus isolated at the Moredun Research Institute in 1963. We did not find evidence of recombination between LIV and any of the TBFV genomes, thus supporting the findings of Norberg et al. (2013), and their suggestion that the detected recombination signal may be due either to a sequencing artefact or due to an artificial recombination event which took place in a laboratory setting. We therefore conclude that no evidence of past genetic exchange between LIV and TBEV can be detected. Regardless, recombination is still theoretically possible in areas such as the UK, Norway, and Russia where these viruses now co-circulate due to increasing TBEV range. Such recombination could result in novel TBFV phenotypes of potential public health concern. While LIV only occasionally infects humans, TBEV is an important human pathogen, therefore a potential LIV/TBEV-Eu recombinant may exhibit altered pathogenicity and present a public health risk. Previous studies have shown that 3’UTR length modulates virulence in some TBEV subtypes [65,66], therefore a LIV-TBEV-Eu recombinant possessing the shorter LIV 3’UTR may possess increased virulence. While it has been demonstrated elsewhere that the deletion of the variable region of TBEV-Eu does not result in increased replication in a cell culture setting [67], this study did not investigate potential differences in pathogenicity using animal models. TBEV-FE exhibits increased virulence compared to TBEV-Eu and TBEV-Sib [5], therefore a recombinant LIV/TBEV-FE virus may also exhibit enhanced pathogenicity. The hypothetical phenotypes of a LIV/TBEV recombinant require confirmation by pathogenesis studies using virus produced by a reverse genetic system. While there is currently no published LIV reverse genetics system, available TBEV reverse genetics approaches could be adapted to investigate the phenotypes of LIV/TBEV recombinants. Consistent with patterns in other flaviviruses [20,68-72], including TBEV [20,70], the genomic ORF for LIV is subjected to strong purifying selection (dN/dS = 0.0745). Such strong selective constraints are thought to be due to the reliance of the virus on the infection of both vertebrate and arthropod hosts during its life-cycle [71,73]. Only four of the LIV genes (E, NS2A, NS3 and NS5) exhibited any evidence of positive selection. Within E, two sites were identified as being subjected to episodic diversifying selection as shown by MEME analysis (Table 2). Of these sites, codon 308 within E has previously been demonstrated to contribute to monoclonal antibody escape and reduced neurovirulence in mice [74], indicating that substitutions at this site may confer resistance to neutralising antibodies. The functional consequences of the other positively selected sites we have identified are unclear and warrant further study. The selection analysis we have undertaken may inform future vaccine design, as potential vaccine candidates should avoid the inclusion of antigen-determining sites which are subjected to positive selection, as this may influence antibody escape [75,76]. In silico analysis of the dataset may identify putative B and T cell epitopes which could be verified by further in vivo studies [77-79], however these analyses were beyond the scope of this study. Genetic diversity present within the LIV dataset was low, with the 26 LIV isolates sharing ~96% mean nucleotide identity and ~98% mean amino acid identity (Table 1). Of the LIV genes, NS2A and C were most divergent, consistent with the weakest purifying selection. Compared with a dataset of TBEV-Eu genomes, LIV appears to exhibit less genetic variability due to the variability present within the TBEV-Eu 3’UTR. The 3’UTR of the combined dataset of LIV and TBEV-Eu genomes shared a mean identity of 77.5% with the most divergent isolate pair exhibiting 51.4% identity. The difference between the 3’UTR of LIV and TBEV is striking and may contribute to the differences in phenotype noted between these viruses. The 3’UTR is responsible for cyclisation during replication but also gives rise to the subgenomic flavivirus RNA (sfRNA) which has been shown to function as a type I interferon antagonist during mosquito-borne flavivirus infection and is involved in pathogenesis [80-82]. In light of the emergence of TBEV in the UK, our dataset could support the development of tools for differential diagnosis between LIV and TBEV. The published LIV and TBEV qPCR methodologies likely cannot distinguish between these closely related viruses [83-85], therefore alternative method employing species specific primers, possibly targeting the 3’UTR, could be developed using our dataset. Bayesian phylogenetic analysis of the LIV dataset revealed that LIV evolution has been influenced by a mixture of both localised and long-distance transmission events. Within the tree, distinct sub-clades are present where isolates sampled from similar geographic locations group together. This is particularly evident for the isolates collected in in northern England, northern Scotland, southern England, and Wales. This level of spatial clustering is indicative of localised transmission, where infected animals or ticks are transported over short distances. LIV persistence is also clear in areas such as Lochindorb, Devon, and Penrith, where sub-clades are composed of isolates sampled from the same geographic location decades apart. This implies LIV persistence in these areas and the presence of suitable hosts such as sheep [86,87], mountain hares (Lepus timidus) [88,89], red/roe deer (Cervus elaphus/ Capreolus capreolus) and red grouse [90,91]. Three distinct LIV transmission cycles have been postulated in the UK which rely upon the presence of sheep alone, a mixture of red grouse, mountain hares and deer, and other combinations of these hosts [92,93]. Whilst sheep and grouse support sufficient viremia to infect feeding ticks, mountain hares have been shown to facilitate non-viraemic transmission wherein infected ticks can infect naïve ticks which co-feed on the same animal [89,94]. While deer are not susceptible to LIV and do not support non-viraemic transmission, they are able to carry significant tick burdens and are responsible for amplifying tick populations which, in turn, contributes to LIV persistence [93,95]. One of these proposed transmission cycles must be at work to maintain the virus in areas where LIV genetic lineages remain detectable over multiple decades. There are also several cases where closely related isolates were sampled from geographically distant areas. One such example is the clustering of the Irish isolate IRE_IRE3_1968 with the western Scottish strains. Whilst this had been previously suggested based on E gene sequence data [18], our phylogenetic placement of IRE_IRE3_1968 based on whole genome data was supported by 100% node support. This provides more confidence in the interpretation that LIV has been potentially (re-)introduced to Ireland from western Scotland, however, it is difficult to draw conclusions as only two Irish LIV sequences are present in the dataset. Additionally, the isolate obtained from a recent canine case in Devon [34], ENG_DOG_2015, appears to be more distantly related to other isolates sampled from the same area and instead clusters between the strains isolated from Western Scotland and SCO_INV1_1983, a strain isolated from Inverness-shire in northern Scotland. Similarly, SCO_INV2_1983 shares a common ancestor with the isolates sampled in Devon and does not cluster with the two other LIV isolates, SCO_INV6_1986, and SCO_INV14_1992, which were also isolated in the Inverness area. This indicates movement of LIV between northern Scotland and southern England. The most extreme cases of long-distance dispersal in our data are two non-British LIV strains which have been previously isolated in Turkmenistan and the Russian Far East. The cluster containing these two viruses is nested within the UK isolates, indicating that they are direct descendants of the latter, which is puzzling given the large distances involved. While the long-distance movement of LIV within the UK may be explained by sheep trade, with transported animals harbouring infected ticks, the movement of LIV from the UK to Asia and Russia is more difficult to explain. It has been theorised previously that, similar to Powassan virus (POWV), LIV was introduced to Primorsky Krai in the Russian Far East via animal trade following World War I or II [17,96]. While it is not possible to accurately estimate the divergence time of the Russian and Turkmenistan LIV isolates using present data, trade between Vladivostok city, the primary trade hub of Primorsky Krai, and the rest of the world started in the 1860s [96]. Trade with the Russian Far-East was also bolstered by the construction of the trans-Siberian railway which began in 1891 and was completed in 1916 [97]. It can therefore be postulated that if LIV was introduced to Primorsky Krai via trade with the UK, it most likely would have been from the 1860s onwards. Alternatively, LIV may have been introduced to Turkmenistan and Russia via the transport of ticks by migratory birds. The transport of tick-borne pathogens between countries by birds has been well documented [98-100], as is the long-distance dispersal of avian influenza viruses by migratory waterfowl [101-103]. However, there are no direct migratory links between central/northeast Eurasia and the UK [104] and in order to transport LIV, birds must reside in environments where LIV is prevalent, such as upland moors, long enough for infected ticks to attach. Moreover, it is implausible that ticks would remain attached to the host over travel distances of several thousand kilometres. Given that LIV has also been detected in Norway and Denmark, it is possible that these countries represent stop-over points where UK strains of LIV have become established locally, and from where they can be distributed further. However, without further investigation these hypotheses are merely speculative. The hosts which contribute to LIV transmission outside the UK, with the exception of sheep, are currently unknown. As an arbovirus, the range of LIV relies on the presence of its tick vector and susceptible hosts. Unlike TBEV, which is associated with woodland areas where its rodent reservoir hosts are present [105-107], LIV distribution in the UK appears to be predominantly associated with the presence of sheep, red grouse, deer, and mountain hares. As some of these species, or close relatives, are present in Denmark, Norway, and Far Eastern Russia, host communities similar to those in the UK could contribute to LIV transmission in these foci. Interestingly, the Russian isolates were derived from Ixodes persulcatus [17], which demonstrates that LIV can be spread by tick species other than its European vector, I. ricinus. As these vectors and the main transmission hosts of LIV are distributed across much of Eurasia, it is entirely possible that LIV is present, but remains undetected, in many countries outside of the UK. Molecular clock analyses indicate that even genomic data from isolates collected over multiple decades might not allow the reliable calibration of a molecular clock for the TBFV. Using date-randomisation, we obtained rate estimates in BEAST that were generally lower but statistically indistinguishable from the observed rate based on the original data. In the meantime, clock rates and divergence time estimates for LIV must be considered potentially unreliable. BEAST analysis produced a molecular clock rate of 1.9x10-5 substitutions/site/year (95% HPD: 5.7x10-6–3.9x10-5 substitutions/site/year). This clock rate is about an order of magnitude slower than the estimated molecular clock rate of TBEV-Eu [20,31,32] and for LIV based on E gene data only [18] but similar to the rate estimated for Far-Eastern POWV strains [72,96,108]. The estimated TMRCA of the LIV sequences was 3077 years ago (95% HPD: 909–5616), an order of magnitude older than the TMRCA estimated previous based on E gene sequences [18]. Our results suggest that the molecular clock signal in LIV is weak and that larger datasets will be required to achieve more reliable temporal calibration. Future studies should focus on the acquisition of more LIV genomes from within the UK in addition to foci in other countries where LIV has been shown to circulate. Longitudinal sampling of LIV over a number of years from isolated foci, such as some of the Scottish Isles, where it should be possible to sample the same viral lineages through time, would also be useful for the calibration of a molecular clock. This strategy has previously been utilised to investigate the rate of TBEV-Eu evolution [20]. Repeating similar analyses for a dataset of TBEV-Eu genomes revealed that the weak clock signal applied in this case as well, though to a lesser degree. Combined with the increasing recognition that evolutionary rates of viruses are dependent on the time scale of sampling [33], this argues that current estimates of origins and divergence times both within and among TBFV species should be treated with a great degree of caution. Whilst passage in cell culture could obscure the relationship between genetic distances and sampling dates, this seems unlikely in our case, as the isolates were passaged a maximum of four times; likely too few passages to incur sufficient mutations to confound the clock analysis. Furthermore, strains which exhibit genetic variability dissonant with their year of isolation should be identifiable as outliers in TempEst [56], and no such outliers were apparent. Likewise, the clinical isolates ENG_Dog_2015, WA_AB2_2010 and ENG_PEN6_2009, which were not passaged in cell culture, did not appear as outliers relative to strains that had been passaged more often. The absence of a strong molecular clock signal in these viruses is therefore more likely to be related to aspects of their biology. Possible factors include the reliance on the tick vector, with potentially long intervals between transmission events, and the alternation between vertebrate host and vector. It is unclear how the rates of viral replication and evolution within ticks, which spent the majority of their life at ambient temperature, compare to those within endotherm mammalian hosts. However, it has previously been reported that TBEV is more prevalent in engorged ticks removed from humans and animals than in wild questing ticks [109-113]. Furthermore, it has been demonstrated that the viral load of a TBEV-Eu strain increased 500-fold after feeding in in vitro infected ticks, indicating that blood feeding may affect the viral replication rate within the tick vector [114]. This apparent variation between the replication rate within vertebrate host and vector species may contribute to the weak temporal signal present in the LIV and TBEV-Eu datasets, though it wouldn’t explain the apparent difference in their evolutionary rates. In summary, we have investigated the evolution of LIV using a dataset of 26 LIV genomes, 22 of which were newly sequenced, which were isolated from across the UK in a time period spanning eight decades. Utilising this dataset in addition to 65 genomes from closely related TBFV we did not find evidence of recombination contributing to LIV evolution. While this has been previously thought to be due to the geographic isolation of LIV compared to the other members of the TBFV sub-family [115,116], the identification of LIV foci in Norway, Denmark and Russia represent clear opportunities for recombination to take place. We have confirmed that, like other arboviruses, LIV is predominantly subjected to strong purifying selection however we have identified several sites within the LIV genome which are positively selected for, including one site in the E gene which has previously been implicated in neutralising antibody-escape [74]. Phylogenetic analysis of the LIV genomes clarified the evolutionary relationships of the LIV isolates and indicate that LIV is spread by both localised transmission events within the UK and long-distance dispersals between geographically distant parts of the UK and between the UK and Russia/Turkmenistan. We found that we were unable to accurately estimate the LIV molecular clock, despite utilising a dataset of 22 LIV genomes samples over a period of almost 80 years. This problem also extended to TBEV-Eu and likely also applies to other TBFV (e.g. POWV; [72,108]). The ability to study the evolution of tick-borne viruses is pivotal in our preparedness against these globally emerging pathogens, particularly as evidence for their introduction into new geographic areas continues to be found [27-29]. Our population genomic study of LIV lays an important foundation for further work and highlights the limitations of estimating viral molecular clock rates in TBFV based on current data.

LIV root-to-tip plot of the genetic distance against sampling time of the LIV phylogeny.

Genetic divergence was based on a ML tree generated using the dataset of 26 LIV genomes. The genetic distances from the root to the tips of the ML tree are plotted against the year of isolation. The regression line is plotted according to the best fitting root which minimises the sum of the squared residuals from the regression line. The x-intercept of the regression line represents the TMRCA, while the gradient of the line represents the clock rate. The correlation coefficient (R2) estimates the dispersion of the residuals around the regression line. The plot was generated using TempEst [56]. (TIFF) Click here for additional data file.

List of virus sequences utilised for phylogenetic analysis.

(TIFF) Click here for additional data file.

LIV sequencing statistics.

(TIFF) Click here for additional data file. 7 Apr 2020 Dear Dr Clark, Thank you very much for submitting your manuscript "Population genomics of louping ill virus provide new insights into the evolution of tick-borne flaviviruses" for consideration at PLOS Neglected Tropical Diseases. As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. In light of the reviews (below this email), we would like to invite the resubmission of a significantly-revised version that takes into account the reviewers' comments. Three experts in the field have reviewed your manuscript. You will notice that there is a broad range of opinions and feedback provided by the reviewers. Please evaluate their assessments carefully when you consider how best to revise your manuscript. We cannot make any decision about publication until we have seen the revised manuscript and your response to the reviewers' comments. Your revised manuscript is also likely to be sent to reviewers for further evaluation. When you are ready to resubmit, please upload the following: [1] A letter containing a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out. [2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file). Important additional instructions are given below your reviewer comments. Please prepare and submit your revised manuscript within 60 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email. Please note that revised manuscripts received after the 60-day due date may require evaluation and peer review similar to newly submitted manuscripts. Thank you again for your submission. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments. Sincerely, Michael R Holbrook, PhD Associate Editor PLOS Neglected Tropical Diseases Ann Powers Deputy Editor PLOS Neglected Tropical Diseases *********************** Three experts in the field have reviewed your manuscript. You will notice that there is a broad range of opinions and feedback provided by the reviewers. Please evaluate their assessments carefully when you consider how best to revise your manuscript. Reviewer's Responses to Questions Key Review Criteria Required for Acceptance? As you describe the new analyses required for acceptance, please consider the following: Methods -Are the objectives of the study clearly articulated with a clear testable hypothesis stated? -Is the study design appropriate to address the stated objectives? -Is the population clearly described and appropriate for the hypothesis being tested? -Is the sample size sufficient to ensure adequate power to address the hypothesis being tested? -Were correct statistical analysis used to support conclusions? -Are there concerns about ethical or regulatory requirements being met? Reviewer #1: The authors have used appropriate methods Reviewer #2: The methods are in general really well done, one minor caveat, with the selection analysis. For selection analysis you should use IFEL as well. In general, I would treat any that are only found in one method with a great deal of suspicion. Especially as they have very different P-values. Only those found in all three methods would be considered statistically significant. Reviewer #3: Major comments: The objectives of this study are clearly stated in the introduction of the manuscript and consist in using a set of 22 new complete genome sequences of Louping ill virus to (re) inspect evolutionary aspects such as recombination, positive selection and evolutionary rates, describe the spatial dynamics of LIV and its phylogenetic relationships with the rest of the Tick-born flavivirus group. The study design is appropriate overall but does not go deep enough into the analysis in a number of cases. The positive selection analysis is limited to the detection of statistically significant signal with very little downstream investigation, and the phylogeographic perspective on the phylogeny does not involve phylogeographic inference or any type of historic data that could corroborate the authors observations. Minor comments: Sequence assembly: The authors describe that they have been assembling the LIV genomes by mapping to a reference (Y07863) which they later find to be potentially inaccurate. Have they tried re-mapping to a different reference as a control? Also, when the mapped and reference sequences are divergent enough, the reference can bias mapping and consensus determination. Have the authors tried de novo assembly to control for the the accuracy of the consensus they determined using reference mapping? De novo assembly can be problematic when to few reads are available but for those samples with a sufficient number of reads this could probably be done. Recombination analysis: In the recombination screening section, the methods are described but there is no information on the criteria for considering a recombination event as "true". Could you please specify these criteria? For example "a p-value < 0.05 with more than 3 methods was considered as a robust recombination signal". Please also provide the results obtained for the novel sequence (e.g. a graph as a supplementary data). -------------------- Results -Does the analysis presented match the analysis plan? -Are the results clearly and completely presented? -Are the figures (Tables, Images) of sufficient quality for clarity? Reviewer #1: Results are relevant and clearly presented Reviewer #2: I have a few minor issues with the tables, in their presentation otherwise all the results are well presented. Table 1, identities are not equal throughout the whole data set, please make clear in the legend that these are the ranges and not the same for each virus. Table 2 – legends do not reflect the table, please correction Me is not used M is. Reviewer #3: The analysis presented matches the analysis plan. However, the authors do not go deep enough into the analysis in some cases. Investigating further the sites with evidence of positive evolution or digging for historical data corroborating the authors' observations regarding the phylogeography of LIV would be most valuable for the study. On the spatial dynamics results, several of the observations made by the authors rely on a small number of sequences and are not necessary in line with the data present in the tree. Please, re-write that section more cautiously considering that you do not have a very important sampling over time and space. Major comments: Selection analysis: The selection analysis is limited to identifying statistically significant signal using to different methods with no further investigation. The results are fairly descriptive with no insight on possible functional evolution. For sites with statistically significant signal of positive selection it would be very valuable to go deeper into the analysis. A few examples of the questions that would be worth asking: Are mutations a the site occurring several times in the phylogeny? Is it only in present in recent sequences? Is it occurring in a conserved area of the genome? Is there literature on this specific mutation/site/region of the genome? When you have a deep enough coverage do you have population mixture at this site? For example, the mutation in NS3, occurs in different parts of the tree, at different points in time, occurs in a region with low dN/dS... Phylogeny: Overall, I do not agree with a number of the conclusions of the authors on the phylogenetics/spatial dynamics section: l356-357: "The tree contained repeated evidence of geographic clustering, even for samples taken decades apart." There are several exceptions to this statement, the sequences from Ireland do not cluster together, the sequences from southern England do not cluster all together (ENG_Dog_2015 is not clustering with the rest of the sequences A, DEV1/2/4), the sequences from northern Scotland do not all cluster together either (INV1 is separate form INV6 and INV14). l359-362: "SCO_LIV_31 sampled in 1931 from the Scottish borders, which grouped with English strains sampled more recently in the borders region (Penrith, ENG_PRES1_991, ENG_PEN3_1983, ENG_PEN4_1983,ENG_PEN6_2010)." SCO_LIV_31 is still quite distant from the English strains mentioned above. l362-363: "All other Scottish strains (sampled between 1962 and 1993) formed a separate clade" That clade also includes sequences from the south of England (ENG_Dog_2015) and from Ireland (IRE3), which make up to a third of the clade. One important statement that can be made from this tree (and that is made by the authors) is that the sequences from eastern Russia and Turkmenistan probably have an origin in the UK/Ireland. This is a very interesting point to discuss in terms of historical data that may support or contradict this observation. Molecular clock analysis: As it is, the order of the paragraph is surprising to me because the authors use theevolutionary rate they infer to evaluate tMRCAs for the whole species and ancestral nodes before testing (aBSREL model and randomisation) the robustness of their results. Mentioning the tMRCA (if at all) should be done way more cautiously knowing that the evolutionary rate that was used for inference is not reliable. It is not clear to me why, since you show that your estimate may not be reliable since there is an overlap with the estimates obtained using sequences with randomized dates, you repeat the analysis with the CDS without using a date-randomisation control. Minor comments: l352-353 "increased phylogenetic certainty resulting from whole genome data" Please illustrate your point (e.g. with a comparison) stating for which species and in which studies the split was uncertain. -------------------- Conclusions -Are the conclusions supported by the data presented? -Are the limitations of analysis clearly described? -Do the authors discuss how these data can be helpful to advance our understanding of the topic under study? -Is public health relevance addressed? Reviewer #1: Conclusion are within bounds in relation to the results Reviewer #2: The discussion is robust, it could have been a little more expansive on the 3'UTR which seems to be the most intriguing difference between LIV and TBEV, but that is my opinion and does not need to be changed. Reviewer #3: Some of the conclusion are in line with the data, others are not. The predominance of localised transmission in the dispersal dynamics is not clear based on the trees that are presented by the authors and rather suggest a mixture between localised and long-distance transmission. The authors also discuss the data based on molecular clock estimates before mentioning that their estimates are not reliable, which is misleading for the reader. One of the limitations of this study is the small number of sequences with regard to the analysis. The LIV dataset is currently too small, with no longitudinal sampling to accurately describe the local dynamics of the spread of LIV. Specific comments: l457-458: "Such recombination could result in novel TBFV phenotypes of potential public health concern." Please develop here. Is there experimental or field evidence of specific lineages/species being more virulent, transmissible (or other phenotype relevant to public health) than others? "Our findings could support the development, if necessary, of tools for differential diagnosis and vaccine design." Same here, please develop on how and why. For example, you could discuss the utility of the information on diversity and purifying selection analysis for vaccine and drug development. l472-473: "Codon site 100 within NS2A was identified as being under positive selection as was codon 96 within NS3." l480-481:"Sites 522 and 699 within NS5 were also identified as being subjected to positive selection." Here you're simply stating your results, please develop on this statements or remove. l479-480: "SGEV exhibits a histidine at codon 434, further indicating that histidine is the ancestral codon." Please develop on the possible functional consequences of histidine being the ancestral codon, otherwise the relevance of this information is not clear. l496-498:"Bayesian phylogenetic analysis of the LIV dataset revealed that the Irish and Welsh isolates tended to cluster separately from the English and Scottish isolates, with the exception of IRE_IRE3_1968." Since there are only two Irish sequences, only the Welsh isolates cluster separately then. l496-519: I don't agree with most of the paragraph, there is possibly a degree of local transmission but there are also movements between Ireland and the UK, North and South and given the current sampling and data it is not possible conclude that one type of transmission is predominant over the other. On the introduction from western Scotland to Ireland please be more cautious since there are only three sequences sampled several years apart. "from scotland to england" Here, there is no basis for assuming any directionality in the movements between England and Scotland with certainty. line 520-529: Here it would be very insightful to have historical data on exchanges between the UK and Eastern Russia/Turkmenistan to further inform that part of the tree and possibly clarify the timing of the introduction. l530-535: Here I do not agree with the structure of the paragraph. The authors should start by making clear that they identified that the estimates are not reliable and then discuss the results in order to identify how the estimates could be improved or corroborated. l574-576:"This apparent variation between the replication rate within vertebrate host and vector species may contribute to the weak temporal signal present in the LIV and TBEV-Eu datasets" Yes, and please discuss the other limitations that could explain the lack of temporal signal here. l581: "recombination has not contributed to LIV evolution" You could be more cautious here because this is true, given the currently sampled viral diversity. 582: "geographic isolation of LIV" I'm not so sure about that. 590: "predominantly by localised transmission events with few long-distance 590 dispersals." Again, I do not agree with this conclusion. -------------------- Editorial and Data Presentation Modifications? Use this section for editorial suggestions as well as relatively minor modifications of existing data that would enhance clarity. If the only modifications needed are minor and/or editorial, you may wish to recommend “Minor Revision” or “Accept”. Reviewer #1: (No Response) Reviewer #2: Line 303, need to amend as this is the wrong tense I would also not focus as much on the recombination, I think allowing this idea to fade away would be a really good thing, and so I would not focus on it other than to mention that you have eliminated the possibility due to previous sequencing errors. Reviewer #3: In the virus isolates section, I do not seem to find any description for isolation/sampling of strains ENG_PEN6_2009 and WA_AB2_2010. Please introduce abbreviations (e.g. TBEV-Eu, TBEV-Sib, TBEV-FE) l288-291: "The genetic diversity present within the LIV dataset is similar to that found for TBEV-Eu, with the exception of the 3’ UTR which is approximately 96% identical between the LIV isolates, but is highly variable between TBEV-Eu isolates (Table 1)." Please provide the %id for the 3'UTR region in the TBEV-Eu group to illustrate your point and ease reading. Some dates are not matching between Fig 1, and the trees. Is WA_AB2 from 2009 (tree-Fig5) or 2010 (table-Fig1)? Please provide access to raw sequencing data. -------------------- Summary and General Comments Use this section to provide overall comments, discuss strengths/weaknesses of the study, novelty, significance, general execution and scholarship. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. If requesting major revision, please articulate the new experiments that are needed. Reviewer #1: Review of manuscript PNTD-D-20-00207: "Population genomics of louping ill virus provide new insights into the evolution of tick-borne flaviviruses" by Jordan J. Clark et al. The authors have studied the spread and evolution of a virus which, before this study, has not received a lot of attention despite its relevance to animal health. They also present data on molecular dating in relation to other tick-borne flaviviruses. The results are likely of interest to a broader community of readers. Below are my comments to their manuscript. I hope they will help to clarify certain parts and improve their manuscript. In general I only have minor comments. I think the manuscript is well-written and clear in most aspects. It is difficult to have everything in one manuscript, but what I find to be missing is a discussion around the ecology, host-association and geographical distribution of LIV. For example, given that it seems to be fairly geographically restricted (although it is found at very geographically different locations), it would be very interesting to hear the authors ideas as for what is needed for LIV to become established in a particular area? Why is it relatively prevalent in the UK but (seemingly) not elsewhere? What habitat and ecological circumstances are necessary for LIV to become established in one area? Is it simply likely to be undersamples elsewhere? Although the focus is clearly the evolutionary history of LIV/TBFV, I think a paragraph or so regarding the tick-virus-reservoir host-ecology relationship would complement the manuscript quite well. Minor comments: -r71: "...resulting in febrile illness and often fatal encephalitis (4)..." "Often" is relative of course. But I would say that it makes more sense to mention e.g. incidence / case fatality rate. -r141: "..).RNA.." Add space. -r153-160: I would probably use e.g. "sequence libraries" or similar instead of "samples" to distinguish between RNA samples and libraries. -r214: "suite(37,52)" add space -r283 and more: Check usage of range. "92.1% – 99.9%" vs. "92.1%–99.9%" Reviewer #2: This paper is a discussion of the evolution of LIV and highlights the similarities and differences from TBEV. The paper is coherent and has some excellent points. I have a few minor comments that I feel need to be addressed but the paper has been well designed and the analyses are in general robust. Reviewer #3: In this study the authors sequence complete genomes from a set of 22 Louping ill virus isolates and use them to (re)inspect evolutionary aspects such as recombination, positive selection and evolutionary rates at the level of the species. Using RDP, they identify no evidence of recombination within the LIV species and find additional support for the absence of recombination between the isolate LIV 369/T2_Y07863 and sequences from the TBEV Eu group. Using two different maximum likelihood approaches they find evidence of positive selection within the LIV species at seven sites and show broad purifying selection with low dN/dS ratios observed in most regions of the genome. The authors use a combination of ML and Bayesian tree-building methods to infer the phylogenetic relationships between LIV isolates and between LIV species and the rest of the TBFV group. They use these trees, in combination with space and time information attached to the sequences to identify both localised and long-distance transmission dynamics of LIV in the UK and Ireland. Finally, using Bayesian inference they infer the evolutionary rate within the species and identify that their estimate is not reliable, using a control dataset with randomised dates. This study is robust in terms of methodology but has several limitations in terms of analysis. The selection analysis dos not go deep enough so that this part of the results is only descriptive with no insight into possible functional evolution within LIV. The conclusions of the authors on the spatial spread of the virus do not reflect what is shown in the trees, identifying a predominantly local transmission while the trees rather point at a mixture of local and long-distance transmission with no dynamic being clearly predominant. Also, given the sampling, it is difficult to confirm local transmission, longitudinal sampling would be ideal here. Finally, the authors should make clear in their results and in the discussion section that the rate estimates they discuss are not reliable. In the current form, they start by discussing the results before mentioning that the evolutionary rates are not robust. -------------------- PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #1: No Reviewer #2: No Reviewer #3: No Figure Files: While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org. Data Requirements: Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript. Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms etc.. For an example see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5. Reproducibility: To enhance the reproducibility of your results, PLOS recommends that you deposit laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see https://journals.plos.org/plosntds/s/submission-guidelines#loc-methods 5 Jun 2020 Submitted filename: Responses to reviewers.docx Click here for additional data file. 17 Jul 2020 Dear Dr Clark, Thank you very much for submitting your manuscript "Population genomics of louping ill virus provide new insights into the evolution of tick-borne flaviviruses" for consideration at PLOS Neglected Tropical Diseases. As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. The reviewers appreciated the attention to an important topic. Based on the reviews, we are likely to accept this manuscript for publication, providing that you modify the manuscript according to the review recommendations. Our reviewers have re-evaluated your submission and found that the concerns raised after the first review were generally adequately addressed. One reviewer did have some minor comments that they felt would improve the presentation of your study. Once these comments are dealt with, we should be in good shape to move forward with your submission. Please prepare and submit your revised manuscript within 30 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email. When you are ready to resubmit, please upload the following: [1] A letter containing a detailed list of your responses to all review comments, and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out [2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file). Important additional instructions are given below your reviewer comments. Thank you again for your submission to our journal. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments. Sincerely, Michael R Holbrook, PhD Associate Editor PLOS Neglected Tropical Diseases Ann Powers Deputy Editor PLOS Neglected Tropical Diseases *********************** Our reviewers have re-evaluated your submission and found that the concerns raised after the first review were generally adequately addressed. One reviewer did have some minor comments that they felt would improve the presentation of your study. Once these comments are dealt with, we should be in good shape to move forward with your submission. Reviewer's Responses to Questions Key Review Criteria Required for Acceptance? As you describe the new analyses required for acceptance, please consider the following: Methods -Are the objectives of the study clearly articulated with a clear testable hypothesis stated? -Is the study design appropriate to address the stated objectives? -Is the population clearly described and appropriate for the hypothesis being tested? -Is the sample size sufficient to ensure adequate power to address the hypothesis being tested? -Were correct statistical analysis used to support conclusions? -Are there concerns about ethical or regulatory requirements being met? Reviewer #1: The methods are well-executed, -described and -motivated. Reviewer #2: All my comments were adequately addressed. Reviewer #3: The authors answered all the comments regarding the method section and corrected the manuscript wherever it was necessary. Minor comments: (1) Whenever de novo assembly is possible I would recommend always controlling the consensus sequence by comparing what is obtained with de novo assembly versus alignment to a reference. (2) In Figure 2.A, please specify in the legend which sequence corresponds to the yellow line. -------------------- Results -Does the analysis presented match the analysis plan? -Are the results clearly and completely presented? -Are the figures (Tables, Images) of sufficient quality for clarity? Reviewer #1: The results and related figures/tables are clearly presented and match the aim of the manuscript. Reviewer #2: All my comments were adequately addressed. Reviewer #3: The authors answered all the comments regarding the results section and in most cases, corrected the manuscript wherever it was necessary. I still have some minor comments on that section: line 350: "MEME denotes sites which have been identified by MEME" Please correct the legend so it correctly describes what is in the table. In my understanding it should be MEME rather than ME. On the Selection analysis: The authors provide a better description of the mutations identified in the selection analysis. I would recommend moving some of the information to the results section (e.g. which strains carry which amino acid at a given position) and in the discussion, have a documented and well-argued paragraph on why the readers should care (or not) about some of the mutations. For now that part of the discussion looks a little too much like a catalogue. On the Phylogenetic analysis: The authors have improved their conclusions on the spatial dynamics of LIV but some inaccuracies remain: line 573-575: "This is particularly evident for the isolates collected in northern England, northern Scotland, Wales and Southern England." As mentioned in my previous comment, not all sequences from Northern Scotland cluster together (INV1 is separate form INV6 and INV14). The authors should specify this point in the narrative, as they do for the isolate ENG_DOG_2015. General comment on paragraph 571-593: While discussing mechanisms of localised transmission, rather than just considering small distance movements, the authors should also discuss possible mechanisms for persistence in a population over the years because this would also be an important part of the phenomenon. lines 591-608: On strains from Eastern Russia and Turkmenistan: I completely agree with the authors, if there is no or very little historical data, the authors should not speculate too much. However they can cautiously lay out several possible scenarios specifying that they are only hypotheses. line 594: Recent is very relative, I recommend avoiding using such qualifier, or specifying in which context it can be considered "recent". lines 606-608: The last sentence is a little too speculative. I recommend making a general statement stating that based on the (scarce) historical data available, introduction of LIV into these regions might have occurred via trade routes. The authors should avoid specifying a point of entry, because there is no proof for that. Important point, the tree suggests that the lineages in Eastern Russia and Turkmenistan stemmed from the UK region, pinning down the geographical origin of the introduction is not possible given the sampling and the approach used here. On the minor comment: line 365-370: The amended version is better but I don't think this is about resolution. For this analysis, genomic sequences were more informative than single gene sequences and yielded more robust results. Table 2 legend line 350: "ME denotes sites which have been identified by MEME" Based on what is shown in the table, it should be MEME -------------------- Conclusions -Are the conclusions supported by the data presented? -Are the limitations of analysis clearly described? -Do the authors discuss how these data can be helpful to advance our understanding of the topic under study? -Is public health relevance addressed? Reviewer #1: Conclusions are within bounds of what can be inferred from the data and addressed properly. Reviewer #2: All my comments were adequately addressed Reviewer #3: The authors answered all the comments regarding the discussion section and in most cases, corrected the manuscript wherever it was necessary. I still have some minor comments on that section: On paragraph 484-499: When writing that there are no reverse genetics systems available for LIV the authors could however mention that reverse genetics systems developped for TBEV could most likely be harnessed for studying the phenotype of LIV/TBEV recombinants. lines 514-516: "the clade of isolates sampled from ... encode a serine" In my understanding it should be "encodes". line 581 in original manuscript: "recombination has not contributed to LIV evolution" and lines 685-687 in the amended version. The amended version is not clear enough on that point. I suggest replacing "we have confirmed" with a more cautious formulation "we did not find evidence of recombination contributing to LIV evolution" On the comment of the authors on the "geographic isolation of LIV". I get the point however, there is evidence that LIV is present outside the UK and might have been for some time, as indicated by the identification of LIV in samples from Eastern Russia and Turkmenistan. -------------------- Editorial and Data Presentation Modifications? Use this section for editorial suggestions as well as relatively minor modifications of existing data that would enhance clarity. If the only modifications needed are minor and/or editorial, you may wish to recommend “Minor Revision” or “Accept”. Reviewer #1: (No Response) Reviewer #2: No changes required. Reviewer #3: The authors corrected the manuscript wherever it was necessary. On last edit I recommend is increasing the resolution of the figures. -------------------- Summary and General Comments Use this section to provide overall comments, discuss strengths/weaknesses of the study, novelty, significance, general execution and scholarship. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. If requesting major revision, please articulate the new experiments that are needed. Reviewer #1: The authors have addressed all my previous comments. I have nothing further to add. Reviewer #2: No changes required Reviewer #3: The authors answered all the comments from reviewers #1, #2, and #3 and made most of the corrections that were recommended. The manuscript as greatly improved so that only a few minor modifications are now necessary for it to be ready for publication. -------------------- PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #1: No Reviewer #2: No Reviewer #3: No Figure Files: While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org. Data Requirements: Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript. Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms etc.. For an example see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5. Reproducibility: To enhance the reproducibility of your results, PLOS recommends that you deposit laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see http://journals.plos.org/plosntds/s/submission-guidelines#loc-materials-and-methods 30 Jul 2020 Submitted filename: Response to reviewers 2.docx Click here for additional data file. 7 Aug 2020 Dear Dr Clark, We are pleased to inform you that your manuscript 'Population genomics of louping ill virus provide new insights into the evolution of tick-borne flaviviruses' has been provisionally accepted for publication in PLOS Neglected Tropical Diseases. Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests. Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated. IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript. Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS. Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Neglected Tropical Diseases. Best regards, Michael R Holbrook, PhD Associate Editor PLOS Neglected Tropical Diseases Ann Powers Deputy Editor PLOS Neglected Tropical Diseases *********************************************************** 8 Sep 2020 Dear Dr Clark, We are delighted to inform you that your manuscript, "Population genomics of louping ill virus provide new insights into the evolution of tick-borne flaviviruses," has been formally accepted for publication in PLOS Neglected Tropical Diseases. We have now passed your article onto the PLOS Production Department who will complete the rest of the publication process. All authors will receive a confirmation email upon publication. The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any scientific or type-setting errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript. Note: Proofs for Front Matter articles (Editorial, Viewpoint, Symposium, Review, etc...) are generated on a different schedule and may not be made available as quickly. Soon after your final files are uploaded, the early version of your manuscript will be published online unless you opted out of this process. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers. Thank you again for supporting open-access publishing; we are looking forward to publishing your work in PLOS Neglected Tropical Diseases. Best regards, Shaden Kamhawi co-Editor-in-Chief PLOS Neglected Tropical Diseases Paul Brindley co-Editor-in-Chief PLOS Neglected Tropical Diseases

108 in total

1. [Molecular evolution of the tick-borne encephalitis and Powassan viruses].

Authors: E L Subbotina; V B Loktev
Journal: Mol Biol (Mosk) Date: 2012 Jan-Feb

2. A modified bootscan algorithm for automated identification of recombinant sequences and recombination breakpoints.

Authors: D P Martin; D Posada; K A Crandall; C Williamson
Journal: AIDS Res Hum Retroviruses Date: 2005-01 Impact factor: 2.205

3. Detection of Louping ill virus in clinical specimens from mammals and birds using TaqMan RT-PCR.

Authors: L Marriott; K Willoughby; F Chianini; M P Dagleish; S Scholes; A C Robinson; E A Gould; P F Nettleton
Journal: J Virol Methods Date: 2006-06-30 Impact factor: 2.014

4. Variable region of the 3' UTR is a critical virulence factor in the Far-Eastern subtype of tick-borne encephalitis virus in a mouse model.

Authors: Mizuki Sakai; Kentaro Yoshii; Yuji Sunden; Kana Yokozawa; Minato Hirano; Hiroaki Kariwa
Journal: J Gen Virol Date: 2014-01-06 Impact factor: 3.891

5. TBE incidence versus virus prevalence and increased prevalence of the TBE virus in Ixodes ricinus removed from humans.

Authors: Jochen Süss; Christine Klaus; Roland Diller; Christina Schrader; Nikolaus Wohanka; Ulrich Abel
Journal: Int J Med Microbiol Date: 2006-02-21 Impact factor: 3.473

6. Nucleotide substitutions in dengue virus serotypes from Asian and American countries: insights into intracodon recombination and purifying selection.

Authors: Susanta K Behura; David W Severson
Journal: BMC Microbiol Date: 2013-02-14 Impact factor: 3.605

7. Climate change influences on the potential geographic distribution of the disease vector tick Ixodes ricinus.

Authors: Abdelghafar A Alkishe; A Townsend Peterson; Abdallah M Samy
Journal: PLoS One Date: 2017-12-05 Impact factor: 3.240

8. Detection of new endemic focus of tick-borne encephalitis virus (TBEV), Hampshire/Dorset border, England, September 2019.

Authors: Maya Holding; Stuart D Dowall; Jolyon M Medlock; Daniel P Carter; Liz McGinley; Mollie Curran-French; Steven T Pullan; John Chamberlain; Kayleigh M Hansford; Matthew Baylis; Richard Vipond; Roger Hewson
Journal: Euro Surveill Date: 2019-11

9. Fast and accurate short read alignment with Burrows-Wheeler transform.

Authors: Heng Li; Richard Durbin
Journal: Bioinformatics Date: 2009-05-18 Impact factor: 6.937

10. H5N1 avian influenza re-emergence of Lake Qinghai: phylogenetic and antigenic analyses of the newly isolated viruses and roles of migratory birds in virus circulation.

Authors: Guihua Wang; Dawei Zhan; Laixing Li; Fumin Lei; Bohua Liu; Di Liu; Haixia Xiao; Youjun Feng; Jing Li; Baoan Yang; Zuohua Yin; Xiaohui Song; Xiaojia Zhu; Yanlong Cong; Juan Pu; Jian Wang; Jinhua Liu; George F Gao; Qingyu Zhu
Journal: J Gen Virol Date: 2008-03 Impact factor: 3.891

5 in total

Review 1. Potential Mechanisms of Transmission of Tick-Borne Viruses at the Virus-Tick Interface.

Authors: Mahvish Maqbool; Muhammad Sohail Sajid; Muhammad Saqib; Faisal Rasheed Anjum; Muhammad Haleem Tayyab; Hafiz Muhammad Rizwan; Muhammad Imran Rashid; Imaad Rashid; Asif Iqbal; Rao Muhammad Siddique; Asim Shamim; Muhammad Adeel Hassan; Farhan Ahmad Atif; Abdul Razzaq; Muhammad Zeeshan; Kashif Hussain; Rana Hamid Ali Nisar; Akasha Tanveer; Sahar Younas; Kashif Kamran; Sajjad Ur Rahman
Journal: Front Microbiol Date: 2022-05-05 Impact factor: 6.064

2. Alternation between taxonomically divergent hosts is not the major determinant of flavivirus evolution.

Authors: Chiara Pontremoli; Diego Forni; Mario Clerici; Rachele Cagliani; Manuela Sironi
Journal: Virus Evol Date: 2021-04-21

3. Evidence of circulation of Orthobunyaviruses in diverse mosquito species in Kwale County, Kenya.

Authors: Hellen Koka; Joel Lutomiah; Solomon Langat; Edith Koskei; Albert Nyunja; James Mutisya; Francis Mulwa; Samuel Owaka; Victor Ofula; Samson Konongoi; Fredrick Eyase; Rosemary Sang
Journal: Virol J Date: 2021-10-12 Impact factor: 4.099

4. Metagenome-Assembled Viral Genomes Analysis Reveals Diversity and Infectivity of the RNA Virome of Gerbillinae Species.

Authors: Han Du; Lijuan Zhang; Xinqiang Zhang; Fengze Yun; Yuhao Chang; Awaguli Tuersun; Kamila Aisaiti; Zhenghai Ma
Journal: Viruses Date: 2022-02-09 Impact factor: 5.048

5. Phylogeography and Re-Evaluation of Evolutionary Rate of Powassan Virus Using Complete Genome Data.

Authors: Artem N Bondaryuk; Tatiana E Peretolchina; Elena V Romanova; Anzhelika V Yudinceva; Evgeny I Andaev; Yurij S Bukin
Journal: Biology (Basel) Date: 2021-12-06

5 in total