Temporal trends in the discovery of human viruses.

Mark E J Woolhouse, Richard Howey, Eleanor Gaunt, Liam Reilly, Margo Chase-Topping, Nick Savill.

Abstract

On average, more than two new species of human virus are reported every year. We constructed the cumulative species discovery curve for human viruses going back to 1901. We fitted a statistical model to these data; the shape of the curve strongly suggests that the process of virus discovery is far from complete. We generated a 95% credible interval for the pool of as yet undiscovered virus species of 38-562. We extrapolated the curve and generated an estimate of 10-40 new species to be discovered by 2020. Although we cannot predict the level of health threat that these new viruses will present, we conclude that novel virus species must be anticipated in public health planning. More systematic virus discovery programmes, covering both humans and potential animal reservoirs of human viruses, should be considered.

Year:  2008        PMID: 18505720      PMCID: PMC2475551          DOI: 10.1098/rspb.2008.0294

Source DB:  PubMed          Journal:  Proc Biol Sci        ISSN: 0962-8452            Impact factor:   5.349


1. Introduction

Despite long-standing interest in global biodiversity (May 1988), only recently has the diversity of human pathogens been catalogued (Taylor ). Approximately 1400 pathogen species are currently recognized (Woolhouse & Gaunt 2007). Fewer than 200 of these are viruses, but novel virus species are being reported in humans at a rate of over two per year, much faster than for other kinds of pathogen (Woolhouse & Gaunt 2007). Novel viruses are a major public health concern, whether causing disease on the massive scale of HIV/AIDS, more transient events such as the SARS epidemic or potential future threats such as pandemic influenza. An analysis of temporal patterns of virus discovery is therefore of considerable interest. Our analysis is based on the rate of accumulation of new human virus species: the ‘discovery curve’. Discovery curves have previously been used to estimate the total diversity of various plant and animal taxa (Dove & Cribb 2006; Bebber ). However, to our knowledge, discovery curves have not previously been compiled for any category of human pathogen. Having compiled the discovery curve, we develop a simple statistical model, which we use to estimate the size of the pool of human virus species, N, and the expected rate of discovery of new species to 2020.

2. Material and methods

A standard method for estimating numbers of species is to extrapolate the cumulative species discovery curve (Bebber ). We gathered data for this curve by systematically searching the primary literature for first reports of human infection with each of the currently recognized virus species, using species as defined by the International Committee on Taxonomy of Viruses (ICTV; http://www.ictvonline.org/). We note that the set of viruses we are interested in (those that can infect humans) is a small subset of the total (over 1500 species according to ICTV) and, as is discussed below, not a closed set because many of these viruses can also infect other hosts (Taylor ). We regard this as analogous to constructing species discovery curves for any subdivision of geographical range or habitat. As we demonstrate below, this approach yields an excellent description of the discovery curve. We used piecewise linear regression to test for changes in the slope of the discovery curve. The results suggested upswings in 1930 (95% CI, 1929–1933) and 1954 (1953–1956). We therefore restricted detailed analysis to the period 1954–2006. We modelled discovery since 1954 assuming a total number of species available to be discovered (the species pool) of N virus species, each discovered in any given year with probability p. The model was fitted to the data and assessed using Markov chain Monte Carlo (MCMC)-based Bayesian inference, generating distributions and credible intervals for the parameters. The model defines the expected number of discovered viruses in year t as
$$\lambda_t = N p (1-p)^{t-1} \tag{2.1}$$
where year t=1 corresponds to 1954. The binomial distribution B(N, p) can be accurately approximated by a Poisson distribution with parameter Np for the range of values of N and p of interest. We considered fitting a distribution for values of p; however, provided the individual values of p are low, there is minimal improvement in model fit.
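As a rough illustration of the discovery model described above, the sketch below evaluates the expected yearly discoveries, N p (1-p)^(t-1), and their running total. The numerical values are assumptions made for illustration only: p = 0.015 is the best estimate reported in the Results, whereas N = 261 is not quoted in the text and is obtained here by adding the 144 entries in table 1 dated 1954 or later to the best estimate of 117 undiscovered species. The function names are hypothetical.

```python
def expected_discoveries(N, p, t):
    """Expected number of species discovered in model year t (t = 1 is 1954)."""
    return N * p * (1 - p) ** (t - 1)


def expected_cumulative(N, p, k):
    """Expected total discovered over model years 1..k (geometric sum in closed form)."""
    return N * (1 - (1 - p) ** k)


# Illustrative values only (assumptions, see the paragraph above).
N_ILLUSTRATIVE = 261
P_BEST = 0.015

if __name__ == "__main__":
    for year in (1954, 1980, 2006):
        t = year - 1954 + 1  # convert calendar year to model year
        print(year, round(expected_discoveries(N_ILLUSTRATIVE, P_BEST, t), 2))
    print("expected total, 1954-2006:",
          round(expected_cumulative(N_ILLUSTRATIVE, P_BEST, 53), 1))
```

Under these assumed values the expected yearly count declines slowly over the 53 model years, and the expected total for 1954–2006 comes out close to the number of table entries for that period.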
Thus, for a set of model parameters, the likelihood of observing data X={x_t}, the number of viruses discovered in each of years 1 to k, is given by
$$L(X \mid N, p) = \prod_{t=1}^{k} \frac{\lambda_t^{x_t}\, e^{-\lambda_t}}{x_t!}$$
where $\lambda_t = N p (1-p)^{t-1}$ is the expected number of discoveries in year t. Parameter distributions for N and p were calculated by MCMC simulation using a standard Metropolis algorithm with flat prior information. It was necessary to compute a correlation matrix to define a joint proposal since N and p are closely correlated. We monitored convergence using two chains. Once they had converged, we had a burn-in period of 10^5 samples. We compared the model with the observed data by calculating the mean, trend in the mean and variance for the number of virus species discovered per year (based on five million simulations using best-fit parameter values). The model was extrapolated to year 2020 by calculating the expected number of viruses discovered using the best-fit model. The 95% posterior prediction intervals were calculated using two million model simulations taking into account parameter uncertainty (as given by data from 1954 to 2006) and natural model simulation stochasticity. As a validation exercise, the model was also fitted to the curve for accumulated virus families from 1954 using the same methods, except that the Poisson approximation no longer holds, so a binomial distribution was used. A family (based on current ICTV classifications) was added to the total when the first post-1954 species was allocated to that family. We tested the assumption that species can be randomly assigned to families (weighted by the size of the families) by noting the number of years in which 0, 1, 2, etc. virus families were discovered. This was done one million times to obtain a distribution for comparison with the observed values.
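The fitting step can be sketched as follows, under several assumptions. The yearly counts are tallied here from table 1 (a transcription made for this sketch, worth re-checking against the table). The sampler is a plain random-walk Metropolis with independent Gaussian proposals, a single chain and no burn-in handling, which is simpler than the correlated joint proposal and two-chain convergence check described above. Starting values, step sizes and all function names are hypothetical.

```python
import math
import random

# Species first reported per year, 1954..2006, tallied from table 1
# (assumption: transcribed for this sketch; verify against the table).
COUNTS_1954_2006 = [
    3, 1, 4, 5, 4, 3, 2, 4, 2, 4,   # 1954-1963
    3, 4, 4, 6, 3, 0, 4, 5, 5, 6,   # 1964-1973
    1, 6, 2, 3, 2, 0, 2, 0, 2, 4,   # 1974-1983
    2, 1, 6, 3, 2, 2, 5, 1, 1, 1,   # 1984-1993
    4, 5, 4, 1, 1, 2, 1, 2, 0, 1,   # 1994-2003
    1, 4, 0,                        # 2004-2006
]


def log_likelihood(N, p, counts):
    """Poisson log-likelihood with yearly mean N p (1-p)^(t-1)."""
    if N <= 0 or not 0 < p < 1:
        return -math.inf  # outside the support of the flat priors
    total = 0.0
    for t, x in enumerate(counts, start=1):
        lam = N * p * (1 - p) ** (t - 1)
        if lam <= 0:
            return -math.inf  # guards against numerical underflow
        total += x * math.log(lam) - lam - math.lgamma(x + 1)
    return total


def metropolis(counts, n_steps, start=(250.0, 0.015), step=(15.0, 0.002), seed=1):
    """Random-walk Metropolis over (N, p) with flat priors (simplified sketch)."""
    rng = random.Random(seed)
    N, p = start
    current = log_likelihood(N, p, counts)
    samples = []
    for _ in range(n_steps):
        N_new = N + rng.gauss(0.0, step[0])
        p_new = p + rng.gauss(0.0, step[1])
        proposed = log_likelihood(N_new, p_new, counts)
        # Accept with probability min(1, likelihood ratio); flat priors cancel.
        if rng.random() < math.exp(min(0.0, proposed - current)):
            N, p, current = N_new, p_new, proposed
        samples.append((N, p))
    return samples


if __name__ == "__main__":
    chain = metropolis(COUNTS_1954_2006, 5000)
    print("last sample (N, p):", chain[-1])
```

Because N and p are closely correlated, independent proposals of this kind are likely to mix slowly; a joint proposal informed by the parameter correlation, as described in the text, would be the natural refinement.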

3. Results

From a comprehensive search of the primary literature, we found 188 virus species that have been reported to infect humans, going back to yellow fever virus in 1901 (table 1). Since then, the number of human virus species discovered in any given year has ranged from zero to six. As is typical (Bebber ), the cumulative species discovery curve increases slowly initially and then more rapidly (figure 1). Piecewise linear regression suggests no further upswings since 1954, roughly corresponding to the advent of tissue culture techniques for virus detection (figure 1).
Table 1

List of viruses ordered by year of first report (a) of human infection.

year | species | family
1901 | Yellow fever virus | flavi
1903 | Rabies virus | rhabdo
1907 | Dengue virus | flavi
1907 | Human papillomavirus | papilloma
1907 | Molluscum contagiosum virus | pox
1907 | Variola virus | pox
1909 | Poliovirus | picorna
1911 | Measles virus | paramyxo
1919 | Human herpesvirus 3 | herpes
1921 | Human herpesvirus 1 | herpes
1931 | Rift Valley fever virus | bunya
1933 | Influenza A virus | orthomyxo
1933 | Lymphocytic choriomeningitis virus | arena
1933 | St Louis encephalitis virus | flavi
1934 | Cercopithecine herpes virus 1 | herpes
1934 | Japanese encephalitis virus | flavi
1934 | Louping ill virus | flavi
1934 | Mumps virus | paramyxo
1934 | Orf virus | pox
1937 | Tick-borne encephalitis virus | flavi
1938 | Cowpox virus | pox
1938 | Eastern equine encephalitis virus | toga
1938 | Rubella virus | toga
1938 | Venezuelan equine encephalitis virus | toga
1938 | Western equine encephalitis virus | toga
1940 | Influenza B virus | orthomyxo
1940 | West Nile virus | flavi
1941 | Bwamba virus | bunya
1943 | Newcastle disease virus | paramyxo
1944 | Sandfly fever Naples virus | bunya
1944 | Sandfly fever Sicilian virus | bunya
1946 | Colorado tick fever virus | reo
1947 | Omsk haemorrhagic fever virus | flavi
1948 | Encephalomyocarditis virus | picorna
1948 | Human enterovirus C | picorna
1949 | Human enterovirus A | picorna
1949 | Human enterovirus B | picorna
1950 | Influenza C virus | orthomyxo
1950 | Vesicular stomatitis virus | rhabdo
1951 | Bunyamwera virus | bunya
1952 | California encephalitis virus | bunya
1952 | Murray Valley encephalitis virus | flavi
1952 | Ntaya virus | flavi
1953 | Human rhinovirus A | picorna
1954 | Human adenovirus B | adeno
1954 | Human adenovirus C | adeno
1954 | Human adenovirus E | adeno
1955 | Human adenovirus D | adeno
1956 | Chikungunya virus | toga
1956 | Human herpesvirus 5 | herpes
1956 | Human parainfluenza virus 2 | paramyxo
1956 | Ilheus virus | flavi
1957 | Human adenovirus A | adeno
1957 | Human respiratory syncytial virus | paramyxo
1957 | Kyasanur forest disease virus | flavi
1957 | Mayaro virus | toga
1957 | Wesselsbron virus | flavi
1958 | Human parainfluenza virus 1 | paramyxo
1958 | Human parainfluenza virus 3 | paramyxo
1958 | Human parechovirus | picorna
1958 | Junin virus | arena
1959 | Banzi virus | flavi
1959 | Guaroa virus | bunya
1959 | Powassan virus | flavi
1960 | Human parainfluenza virus 4 | paramyxo
1960 | Human rhinovirus B | picorna
1961 | Caraparu virus | bunya
1961 | Catu virus | bunya
1961 | O'nyong-nyong virus | toga
1961 | Oropouche virus | bunya
1962 | Rio Bravo virus | flavi
1962 | Sindbis virus | toga
1963 | Equine rhinitis virus A | picorna
1963 | Great Island virus | reo
1963 | Pseudocowpox virus | pox
1963 | Yaba monkey tumour virus | pox
1964 | Human herpesvirus 4 | herpes
1964 | Machupo virus | arena
1964 | Zika virus | flavi
1965 | Chagres virus | bunya
1965 | Foot and mouth disease virus | picorna
1965 | Tanapox virus | pox
1965 | Wyeomyia virus | bunya
1966 | Changuinola virus | reo
1966 | Human coronavirus 229E | corona
1966 | Quaranfil virus | unassigned
1966 | Saimiriine herpesvirus 1 | herpes
1967 | Chandipura virus | rhabdo
1967 | Crimean-Congo haemorrhagic fever virus | bunya
1967 | Human coronavirus OC43 | corona
1967 | Human enterovirus D | picorna
1967 | Piry virus | rhabdo
1967 | Tacaiuma virus | bunya
1968 | Human herpesvirus 2 | herpes
1968 | Marburg virus | filo
1968 | Tataguine virus | bunya
1970 | Everglades virus | toga
1970 | Hepatitis B virus | hepadna
1970 | Lassa virus | arena
1970 | Punta Toro virus | bunya
1971 | Aroa virus | flavi
1971 | BK virus | polyoma
1971 | Duvenhage virus | rhabdo
1971 | JC virus | polyoma
1971 | Vaccinia virus | pox
1972 | Bovine papular stomatitis virus | pox
1972 | Mokola virus | rhabdo
1972 | Monkeypox virus | pox
1972 | Norwalk virus | calici
1972 | Ross River virus | toga
1973 | Bangui virus | bunya
1973 | Dugbe virus | bunya
1973 | Hepatitis A virus | picorna
1973 | Kotonkan virus | rhabdo
1973 | Rotavirus A | reo
1973 | Tamdy virus | bunya
1974 | Getah virus | toga
1975 | B19 virus | parvo
1975 | Bhanja virus | bunya
1975 | Human astrovirus | astro
1975 | Lebombo virus | reo
1975 | Shuni virus | bunya
1975 | Thogoto virus | orthomyxo
1976 | Orungo virus | reo
1976 | Wanowrie virus | bunya
1977 | Hepatitis delta virus | unassigned
1977 | Sudan Ebola virus | filo
1977 | Zaire Ebola virus | filo
1978 | Hantaan virus | bunya
1978 | Issyk-Kul virus | bunya
1980 | Human T-lymphotropic virus 1 | retro
1980 | Puumala virus | bunya
1982 | Human T-lymphotropic virus 2 | retro
1982 | Seoul virus | bunya
1983 | Candiru virus | bunya
1983 | Hepatitis E virus | unassigned
1983 | Human adenovirus F | adeno
1983 | Human immunodeficiency virus 1 | retro
1984 | Human torovirus | corona
1984 | Rotavirus B | reo
1985 | Borna disease virus | borna
1986 | European bat lyssavirus 2 | rhabdo
1986 | Human herpesvirus 6 | herpes
1986 | Human immunodeficiency virus 2 | retro
1986 | Kasokero virus | bunya
1986 | Kokobera virus | flavi
1986 | Rotavirus C | reo
1987 | Dhori virus | orthomyxo
1987 | Sealpox virus | pox
1987 | Suid herpesvirus 1 | herpes
1988 | Barmah Forest virus | toga
1988 | Picobirnavirus | birna
1989 | European bat lyssavirus 1 | rhabdo
1989 | Hepatitis C virus | flavi
1990 | Banna virus | reo
1990 | Gan Gan virus | bunya
1990 | Reston Ebola virus | filo
1990 | Semliki Forest virus | toga
1990 | Trubanaman virus | bunya
1991 | Guanarito virus | arena
1992 | Dobrava-Belgrade virus | bunya
1993 | Sin Nombre virus | bunya
1994 | Hendra virus | paramyxo
1994 | Human herpesvirus 7 | herpes
1994 | Human herpesvirus 8 | herpes
1994 | Sabia virus | arena
1995 | Bayou virus | bunya
1995 | Black Creek Canal virus | bunya
1995 | Cote d'Ivoire Ebola virus | filo
1995 | Hepatitis G virus | flavi
1995 | New York virus | bunya
1996 | Andes virus | bunya
1996 | Australian bat lyssavirus | rhabdo
1996 | Juquitiba virus | bunya
1996 | Usutu virus | flavi
1997 | Laguna Negra virus | bunya
1998 | Menangle virus | paramyxo
1999 | Nipah virus | paramyxo
1999 | Torque teno virus | circo
2000 | Whitewater Arroyo virus | arena
2001 | Baboon cytomegalovirus | herpes
2001 | Human metapneumovirus | paramyxo
2003 | SARS coronavirus | corona
2004 | Human coronavirus NL63 | corona
2005 | Human bocavirus | parvo
2005 | Human coronavirus HKU1 | corona
2005 | Human T-lymphotropic virus 3 | retro
2005 | Human T-lymphotropic virus 4 | retro

(a) Full details of sources available from authors on request.

Figure 1

The discovery curve for human virus species. Cumulative number of species reported to infect humans (black circles and line). Statistically significant upward breakpoints are shown (vertical lines). Best-fit curve (solid line) and lower and upper 95% posterior prediction intervals (dashed lines) for extrapolation to 2020.

We confirmed that our model reproduced the observed slight downward trend in the rate of discovery since 1954 (figure 1) and the observed variance in the data from 1954 to 2006 (figure 2). The distribution of the number of virus species discovered per year shows slight overdispersion (mean=2.69; variance=3.07; variance-to-mean ratio greater than 1) which falls within the predicted range (mean=2.70 with 95% credible interval 2.41–3.00; variance=3.03 with interval 1.99–4.49). Together, these results support our choice of model, even though we do not explicitly consider heterogeneity in the probability of discovering a given species in any one year (p) or temporal variation in sampling effort, detection techniques and reporting.
Figure 2

Approximate probability density of variance in simulated data from 1954 to 2006 for the best-fit model. Arrow shows observed value.
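The slight overdispersion noted above can be explored with a short calculation. For independent Poisson counts whose means change from year to year, the expected sample variance (n-1 denominator) equals the average of the yearly means plus the sample variance of those means, so a declining discovery rate alone pushes the variance-to-mean ratio above 1. The sketch below applies this identity to the model with assumed illustrative values: p = 0.015 from the Results and N = 261, which is not quoted in the text (144 table entries from 1954 onwards plus 117 estimated undiscovered species). The function name is hypothetical, and this is an observation about the model under those assumptions rather than a calculation reported by the authors.

```python
def expected_mean_and_variance(N, p, k):
    """Expected per-year mean and sample variance under independent Poisson years.

    Yearly means follow N p (1-p)^(t-1) for t = 1..k.
    """
    lam = [N * p * (1 - p) ** (t - 1) for t in range(1, k + 1)]
    mean_lam = sum(lam) / k
    # Spread of the yearly means around their average (n-1 denominator).
    between_year = sum((value - mean_lam) ** 2 for value in lam) / (k - 1)
    # Poisson noise contributes mean_lam; the trend contributes between_year.
    return mean_lam, mean_lam + between_year


if __name__ == "__main__":
    mean, variance = expected_mean_and_variance(261, 0.015, 53)
    print("expected mean per year:", round(mean, 2))
    print("expected sample variance:", round(variance, 2))
    print("variance-to-mean ratio:", round(variance / mean, 2))
```

With these assumed values both quantities fall inside the credible intervals quoted in the text for the simulated mean and variance.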

Noting that p and N are highly correlated (figure 3), our best estimate for p is 0.015 (95% credible interval, 0.004–0.026) with 117 (38–562) so far undiscovered virus species. Extrapolating the discovery curve, allowing for parameter uncertainty and stochastic discovery, we obtain a best estimate of 22 new species (10–40) by 2020 (figure 1).
Figure 3

Approximate probability density function of parameter p and N generated by MCMC methods (see main text for details).
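The point estimate of new species by 2020 can be checked with simple arithmetic: if U species remain undiscovered and each is found in any given year with probability p, the expected number found within m years is U(1-(1-p)^m). The sketch below uses the best estimates quoted above (U = 117, p = 0.015) and assumes m = 14 discovery years, that is 2007 to 2020 inclusive; the function name is hypothetical. It reproduces only the central estimate; the interval of 10–40 depends on parameter uncertainty and stochastic simulation, which this sketch does not attempt.

```python
def expected_new_species(undiscovered, p, years):
    """Expected number of currently undiscovered species found within `years` years."""
    return undiscovered * (1 - (1 - p) ** years)


if __name__ == "__main__":
    # Assumption: 14 discovery years, 2007 through 2020 inclusive.
    estimate = expected_new_species(117, 0.015, 14)
    print("expected new species by 2020:", round(estimate, 1))
```

Under these assumptions the result rounds to 22, in line with the best estimate reported in the text.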

Data on the cumulative discovery of new virus families are also reproducible (figure 4). The predicted distribution of the number of virus families discovered per year (assuming random allocation of species to families) compares favourably with the observed distribution (figure 5). This provides further support for the appropriateness of our model.
Figure 4

Accumulation of virus families associated with species discovered after 1954 (black circles and line). Best-fit curve (solid line) and lower and upper 95% posterior prediction intervals (dashed lines) extrapolated to 2020. Fitted parameter values are N=25 (95% credible intervals 24–37) and p=0.056 (0.027–0.089).

Figure 5

Frequency distribution for the number per year of virus families associated with species discovered from 1954 to 2006, generated by reassigning the discovered viruses to families, repeated 10^6 times. Expected number with 95% credible intervals (bars) and data (black circles).

4. Discussion

We conclude that it is extremely probable that new human viruses will continue to be discovered in the immediate future; we are not yet close to the end of the virus discovery curve. As a direct result of this, it is not possible to estimate the size of the species pool for human viruses with precision. However, in contrast to the negative assessment by Bebber of the use of incomplete species accumulation curves, we consider that the upper and lower limits to our estimate of the size of the species pool are of interest and also have practical implications. Current trends are consistent with a pool of at least 38 undiscovered species that will be reported at an average rate of at least approximately one per year to 2020. In this context, it is worth noting that three new species were reported in 2007: two polyomaviruses, KI and WU, and a reovirus, Melaka (Allander ; Chua ; Gaynor ). Other viruses may have been reported but not yet classified. In practice, future rates of discovery will, of course, be affected by any major advances in virus detection technology or by any major shifts (upwards or downwards) in the effort expended on virus discovery programmes. Tissue culture was regarded as the ‘gold standard’ for virus detection up until a few years ago when molecular methods came to the fore (Storch 2007), although there has not been a detectable increase in discovery rates as a result. Indeed, it is striking that there have been no dramatic changes in the pattern of virus discovery for over 50 years; extrapolations from our data should therefore provide a useful benchmark for probable future discovery rates. The upper limit for N is finite but large; we cannot rule out hundreds of novel human viruses to be reported in the future. There are two (not mutually exclusive) possible explanations for such a high level of diversity.
First, it could reflect the largely unknown extant diversity of viruses in the non-human animal reservoirs that constitute the major source of emerging human pathogens (Taylor ; Woolhouse & Gaunt 2007). The majority of human viruses are known to be capable of infecting non-human hosts (almost exclusively mammals and birds), and the animal origin of many apparently novel human viruses (e.g. HIV1 and HIV2, SARS CoV, Nipah virus) has been frequently remarked upon (Morse 1995; Woolhouse & Gowtage-Sequeria 2005; Wolfe ); indeed, recently discovered viruses are even more likely to be associated with a non-human reservoir (Woolhouse & Gaunt 2007). All these observations are consistent with the idea that a significant fraction of viruses discovered in the last few decades is ecological ‘spillover’ from animal populations rather than newly evolved specialist human viruses. We have very limited knowledge of the diversity of viruses present in most mammal and bird species (with most attention having been paid to viruses of domestic animals; Cleaveland ), so it is unclear for how long this process might continue. An alternative explanation for a large pool of human viruses is that this reflects a high rate of evolution (within a reservoir population) of truly novel species capable of infecting humans. This hypothesis is difficult to test directly without much more comprehensive sequence data from both human and non-human virus populations. We note that the finite upper limit for the current estimate of N does not necessarily imply that the process of virus discovery is not open-ended (as a result of the evolution of new species) since there could be a low background rate of virus evolution, which will remain once extant diversity has been fully revealed. 
The balance between revealing extant diversity and the continual evolution of new species could be explored using a more complex model than equation (2.1); however, the available data are insufficient to yield useful estimates of the additional parameters required. Although we cannot know in advance how big a threat they will pose, novel human viruses must be anticipated in public health planning and surveillance programmes for emerging infectious diseases (King ; Jones ). However, current approaches to virus discovery are largely passive, usually relying on investigation of reports of human disease with unfamiliar clinical symptoms and uncertain aetiology. Recently, there have been calls for more active discovery programmes for viruses and other pathogens involving ‘systematic sampling and phylogeographic analysis of related pathogens in diverse animal species’ (Wolfe ). We consider that such calls are supported by the results reported here.
References (14 in total; 10 listed)

Review 1.  Species accumulation curves and their applications in parasite ecology.

Authors:  Alistair D M Dove; Thomas H Cribb
Journal:  Trends Parasitol       Date:  2006-10-10

2.  How many species are there on Earth?

Authors:  R M May
Journal:  Science       Date:  1988-09-16       Impact factor: 47.728

3.  Diseases of humans and their domestic mammals: pathogen characteristics, host range and the risk of emergence.

Authors:  S Cleaveland; M K Laurenson; L H Taylor
Journal:  Philos Trans R Soc Lond B Biol Sci       Date:  2001-07-29       Impact factor: 6.237

Review 4.  Factors in the emergence of infectious diseases.

Authors:  S S Morse
Journal:  Emerg Infect Dis       Date:  1995 Jan-Mar       Impact factor: 6.883

5.  Identification of a third human polyomavirus.

Authors:  Tobias Allander; Kalle Andreasson; Shawon Gupta; Annelie Bjerkner; Gordana Bogdanovic; Mats A A Persson; Tina Dalianis; Torbjörn Ramqvist; Björn Andersson
Journal:  J Virol       Date:  2007-02-07       Impact factor: 5.103

6.  A previously unknown reovirus of bat origin is associated with an acute respiratory disease in humans.

Authors:  Kaw Bing Chua; Gary Crameri; Alex Hyatt; Meng Yu; Mohd Rosli Tompang; Juliana Rosli; Jennifer McEachern; Sandra Crameri; Verasingam Kumarasamy; Bryan T Eaton; Lin-Fa Wang
Journal:  Proc Natl Acad Sci U S A       Date:  2007-06-25       Impact factor: 11.205

Review 7.  Ecological origins of novel human pathogens.

Authors:  Mark Woolhouse; Eleanor Gaunt
Journal:  Crit Rev Microbiol       Date:  2007       Impact factor: 7.624

8.  Host range and emerging and reemerging pathogens.

Authors:  Mark E J Woolhouse; Sonya Gowtage-Sequeria
Journal:  Emerg Infect Dis       Date:  2005-12       Impact factor: 6.883

9.  Identification of a novel polyomavirus from patients with acute respiratory tract infections.

Authors:  Anne M Gaynor; Michael D Nissen; David M Whiley; Ian M Mackay; Stephen B Lambert; Guang Wu; Daniel C Brennan; Gregory A Storch; Theo P Sloots; David Wang
Journal:  PLoS Pathog       Date:  2007-05-04       Impact factor: 6.823

Review 10.  Origins of major human infectious diseases.

Authors:  Nathan D Wolfe; Claire Panosian Dunavan; Jared Diamond
Journal:  Nature       Date:  2007-05-17       Impact factor: 49.962
