Genomic contextualisation of ancient DNA molecular data from an Argentinian fifth pandemic Vibrio cholerae infection.

Matthew J Dorman1,2, Nicholas R Thomson1,3, Josefina Campos4.   

Keywords:  VCR; Vibrio cholerae; aDNA; ancient DNA; cholera; fifth pandemic

Year:  2021        PMID: 34128784      PMCID: PMC8461468          DOI: 10.1099/mgen.0.000580

Source DB:  PubMed          Journal:  Microb Genom        ISSN: 2057-5858



Data Summary

The authors confirm that all supporting data, code and protocols have been provided within the article or through supplementary data files. No whole-genome sequencing data were generated in this study. Accession numbers for the publicly available sequences collated and/or generated in a previous study [1] and used in this analysis are listed in Table S2 (available in the online version of this article) (https://doi.org/10.6084/m9.figshare.14384225.v1). The collated blastn results used to draw the conclusions in this paper are provided in Table S1 (https://doi.org/10.6084/m9.figshare.14384225.v1). Other metadata, including genome accession numbers and the serogroup for each isolate (simplified to O1 or non-O1/O139), were taken from the supplementary data of Dorman et al. [1] and are provided in Table S2 (https://doi.org/10.6084/m9.figshare.14384225.v1). Raw blastn output files, annotated genome assemblies originally collated and used in [1], the VCR query sequence transcribed from Ramirez and colleagues' article [2], and the tree in Fig. 1(b) are provided in a Figshare repository linked to this study: https://dx.doi.org/10.6084/m9.figshare.13636577.
Fig. 1.

Non-pandemic V. cholerae from Argentina and the Americas harbour the VCR alleles most similar to that amplified from aDNA dating from the fifth pandemic. (a) A V. cholerae phylogenetic tree [1] containing 380 diverse V. cholerae sequences, as well as an outgroup of three species, on which the tree is rooted. Ten non-O1, non-toxigenic and non-pandemic V. cholerae from the Americas harbour VCR alleles most similar to that amplified from the La Zanja CE1 individual (99.13 % nucleotide identity) [2]. All of these are distantly related to the Classical and 7PET pandemic lineages. Bar: substitutions per variable site. (b) Two VCR alleles exist amongst the ten isolates highlighted in (a), each differing from the CE1 amplicon by a single nucleotide substitution, at position 32 or position 43. Sample name colours correspond to the VCR allele identified, as in (a). Leaves in (a) were coloured manually (Adobe Illustrator CC v23.1.1).

The remainder of the data and metadata presented have been published previously under a CC-BY open access licence [1, 3]. The phylogenetic tree presented in Fig. 1(a), and the gene presence/absence matrix used to classify isolates as toxigenic, are available from the Figshare repository linked to Dorman et al. [1]: https://doi.org/10.6084/m9.figshare.11310131.
At least seven cholera pandemics have been documented since the 1800s [4-6]. The first six of these are believed to have been caused by V. cholerae serogroup O1 of the classical biotype, whereas V. cholerae serogroup O1 biotype El Tor is the aetiological agent of the ongoing seventh pandemic (1961–present) [5]. Although much has been learned about the sixth and seventh pandemics from preserved and contemporaneous collections of bacterial cultures, almost nothing is known about the bacteriology and molecular biology of earlier pandemics.
We therefore read with great interest the recent paper by Ramirez and colleagues, in which they present what is believed to be the first genetic and molecular evidence of V. cholerae from the fifth cholera pandemic in Argentina (1886–1887 AD) [2]. In their palaeopathological study, Ramirez et al. extracted ancient DNA (aDNA) from sediment taken from the pelvic abdominal cavities of four putative cholera victims from the La Zanja archaeological site in Córdoba, Argentina [2]. Procedures were designed to minimise environmental contamination [2]. They amplified a fragment of the V. cholerae genome (the V. cholerae repetitive DNA sequence, VCR) from two of these aDNA extracts, but were unable to amplify the ctxA, ctxB or tcpA genes, which are associated with toxigenic, epidemic V. cholerae, from any of the four specimens studied. The authors successfully sequenced the VCR amplicon from one of these samples and compared it to V. cholerae genome sequences available in GenBank, including partial sequences of two V. cholerae isolates from Argentina [2]. The two genomes containing the VCR sequence most similar to that found in the fifth pandemic Argentinian sample, here dubbed the CE1 allele, were those of Sa5Y and SA3G, non-O1 V. cholerae isolated in California in 2004 [7].
Together with collaborators, we recently completed a genomic study of the seventh cholera pandemic in Argentina, alongside a simultaneous analysis of non-pandemic V. cholerae from the country [1]. For this project, we sequenced 490 Argentinian V. cholerae, isolated from 1992 onwards, including 65 non-pandemic isolates. These genome sequences were not included in the analysis of Ramirez et al. We speculated that analysing additional genomes from the Americas, and specifically from Argentina, might shed further light on the distribution of this fifth pandemic VCR allele amongst V. cholerae. Accordingly, we interrogated the collection of diverse non-pandemic V. cholerae genome assemblies used in our study, a total of 383 genomes (Fig. 1a).
Perhaps unsurprisingly, we could not find a perfect match (100 % nucleotide identity) to the VCR sequence reported by Ramirez et al. in any of the genomes in our Argentinian dataset (Table S1). However, ten genomes did contain VCR alleles that differed from the CE1 allele by 1 nt (e-value 2.32×10⁻⁵³; 115/115 aligned nucleotides, 1 nt mismatch, 99.13 % identity, bitscore 203) (Fig. 1a). All of these were isolated in the Americas; eight of the ten are from Argentina, and were isolated in Jujuy, Salta and Formosa provinces, as well as Ciudad Autónoma de Buenos Aires, between 1992 and 2010 [1]. Four of these Argentinian isolates are of clinical origin, and the remainder are environmental isolates [1]. The remaining two genomes are from elsewhere in the Americas: isolate HE-16 from Haiti [8] and isolate SIO from California [9].
Two VCR alleles were identified, each differing from the CE1 sequence by 1 nt at one of two different positions (Fig. 1b). The first of these sequences (allele 1, in 7/10 genomes; Fig. 1) was identical to that found in Sa5Y and SA3G by Ramirez et al. [2]. The second sequence (allele 2) was found in HE-16 and two Argentinian genomes (Fig. 1). Notably, all ten of these isolates are non-O1 V. cholerae, are distantly related to pandemic lineages including the Classical lineage, and are non-toxigenic [1] (Fig. 1, Table S2), as the CE1 sample was predicted to be [2]. Although caution must be taken not to over-interpret these data, particularly in the absence of a complete genome sequence from this archaeological sample, our genomic observations are consistent with the conclusions made by Ramirez and colleagues: namely, that the individual from whom VCR was amplified and successfully sequenced is likely to have been infected with non-O1 and non-toxigenic V. cholerae.
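The single-nucleotide comparison described above can be illustrated with a minimal Python sketch. It assumes two ungapped sequences of equal length; the function name `compare_alleles` and the toy sequences are hypothetical and are not the real VCR alleles, which are available in the Figshare repository. The final lines reproduce the reported identity from the reported counts (115 aligned nucleotides, 1 mismatch).

```python
def compare_alleles(query: str, subject: str) -> tuple[list[int], float]:
    """Return 1-based mismatch positions and percent identity for two
    equal-length, ungapped sequences (hypothetical helper)."""
    if len(query) != len(subject):
        raise ValueError("sequences must be the same length (ungapped alignment)")
    if not query:
        raise ValueError("sequences must not be empty")
    pairs = zip(query.upper(), subject.upper())
    mismatches = [pos for pos, (q, s) in enumerate(pairs, start=1) if q != s]
    identity = 100.0 * (len(query) - len(mismatches)) / len(query)
    return mismatches, identity


# Toy sequences for illustration only; these are NOT the CE1 or isolate alleles.
toy_query = "ACGTACGTAC"
toy_subject = "ACGTACCTAC"
positions, identity = compare_alleles(toy_query, toy_subject)
print(positions, identity)  # [7] 90.0

# Identity implied by the reported alignment: 115 aligned nt, 1 mismatch.
reported_identity = 100.0 * (115 - 1) / 115
print(f"{reported_identity:.2f} %")  # 99.13 %
```

Applied to the real sequences, the mismatch list for each of the two alleles would be expected to contain a single position, 32 or 43, as shown in Fig. 1(b).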
This further supports the hypothesis that this Argentinian infection during the fifth pandemic was caused by non-O1 bacteria that are local either to Argentina or to Latin America more generally, rather than being linked to the globally distributed Classical lineage that is believed to have caused all historical cholera pandemics prior to the ongoing seventh pandemic [4, 10]. Based on the large number of non-pandemic V. cholerae genomes available to us [1, 11], and because VCR alleles differing from CE1 at two or more positions simultaneously were broadly distributed across V. cholerae (Table S1), we speculate that the CE1 VCR allele might be an ancestral form of at least one of the VCR sequences found in contemporary non-pandemic V. cholerae local to the Americas. However, a whole genome sequence from this fifth pandemic bacterium would be required to establish whether this sequence is ancestral.
Beyond the curious nature of this archaeological finding, there is an important subtlety to this observation: although we may still lack molecular or genomic evidence that the Classical lineage caused the fifth cholera pandemic in Argentina and elsewhere, these data suggest that non-pandemic bacteria may have been associated with sporadic infections during the 1880s, just as we have found for local lineages of non-pandemic V. cholerae in Argentina and Latin America in the present day [1, 11]. While we cannot use this single archaeological sample to draw general conclusions about the fifth cholera pandemic per se, it will only be through such investigations that inroads will be made into understanding these historical events. Not only have Ramirez and colleagues presented the first molecular evidence of a V. cholerae infection from the fifth pandemic in Latin America, their work is also the latest in a recent surge of interest in using the unique aspects of cholera pandemics in Latin America to understand cholera and V. cholerae more generally [1, 11, 12].
The precisely defined periods in which pandemic cholera occurred and was introduced to Latin America [4] make this an ideal setting for researching the history of this disease and its epidemiology [1, 11]. This also re-emphasises the importance of aDNA research to studies of historical pandemics [10]. Continued work in this area has the potential to reconstruct the history of previous cholera pandemics, and obtaining partial or whole bacterial genome sequences from aDNA will enable more comprehensive phylogenetic research into these questions.
Cholera is a disease which has been well documented throughout history, due in part to it being highly transmissible and causing explosive epidemics. However, there is a paucity of molecular information about V. cholerae pre-dating the turn of the twentieth century. The analysis of ancient DNA (aDNA) is an increasingly common approach by which the histories of bacterial infections can be reconstructed. Ramirez and colleagues recently presented the first aDNA evidence for a V. cholerae infection dating from the late 1880s in Argentina; surprisingly, their data suggested infection with a non-toxigenic bacterium. Here, we use a collection of non-pandemic Argentinian V. cholerae genomes to show that the genome fragment amplified by Ramirez et al. is most similar to non-pandemic V. cholerae from the Americas. Our results indicate that the individual described by Ramirez and colleagues is likely to have been infected with non-pandemic V. cholerae local to the Americas. This suggests that non-pandemic V. cholerae may have caused sporadic infections in Latin America for hundreds of years. This hints at untapped reserves of information about historical cholera pandemics in Latin America, and emphasises the importance of aDNA research for deriving further insights in this area.

Methods

BLAST analysis and phylogenetics

The 115 nt VCR sequence reported by Ramirez et al. [2] was transcribed and used as a query with which to search assembled genome sequences described in Dorman et al. [1] using blastn [13] (all of the annotated assemblies used in that study have been uploaded to the Figshare repository supporting this article in GFF3 format). Results were filtered and sorted (cut-offs for inclusion: aligned length ≥100 nt; ordered by bitscore), and are provided in Table S1. The most similar results were defined in line with the results of Ramirez and colleagues [2]: e-value 2.32×10⁻⁵³; 115/115 aligned nucleotides, 1 nt mismatch, 99.13 % identity. The sequences of each result were extracted from the genome assemblies. Two VCR alleles were identified which satisfied these criteria, due to variation at one of two independent nucleotides relative to the reference query. Therefore, both sequences were used along with the best match from the N16961 reference genome [14] to calculate a maximum-likelihood phylogeny under the GTR model using Seaview v4.6.1 and PhyML v3.0 [15, 16], for illustrative purposes. Default settings for maximum-likelihood calculations using nucleotide sequence inputs were used in Seaview v4.6.1. The phylogenetic tree presented in Fig. 1(a) has been published previously under a CC-BY 4.0 licence and was re-used verbatim in this study [1, 3].
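The filtering and sorting step described above might be sketched as follows. This is an illustration under stated assumptions, not the authors' script: it assumes the blastn results are in the standard 12-column tabular format (`-outfmt 6`), which the article does not specify, and it assumes that "ordered by bitscore" means highest bitscore first. The function name, the query and contig identifiers, and two of the three example rows are hypothetical; only the values for `contig_A` are the ones reported in the text.

```python
import csv
import io

# Standard column order of BLAST+ tabular output (-outfmt 6).
BLAST6_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def filter_and_sort_hits(handle, min_aligned_length=100):
    """Keep hits with aligned length >= min_aligned_length, best bitscore first."""
    reader = csv.DictReader(handle, fieldnames=BLAST6_COLUMNS, delimiter="\t")
    hits = []
    for row in reader:
        hit = {
            "sseqid": row["sseqid"],
            "pident": float(row["pident"]),
            "length": int(row["length"]),
            "mismatch": int(row["mismatch"]),
            "evalue": float(row["evalue"]),
            "bitscore": float(row["bitscore"]),
        }
        if hit["length"] >= min_aligned_length:
            hits.append(hit)
    # Assumption: "ordered by bitscore" is taken to mean descending order.
    return sorted(hits, key=lambda h: h["bitscore"], reverse=True)


# Hypothetical input. contig_A carries the values reported in the text;
# contig_B and contig_C are invented to exercise the sort and the cut-off.
EXAMPLE = "\n".join([
    "\t".join(["VCR_CE1", "contig_B", "95.652", "115", "5", "0",
               "1", "115", "500", "614", "1e-45", "180"]),
    "\t".join(["VCR_CE1", "contig_C", "100.000", "80", "0", "0",
               "1", "80", "10", "89", "1e-30", "120"]),
    "\t".join(["VCR_CE1", "contig_A", "99.130", "115", "1", "0",
               "1", "115", "200", "314", "2.32e-53", "203"]),
])

hits = filter_and_sort_hits(io.StringIO(EXAMPLE))
print([h["sseqid"] for h in hits])  # ['contig_A', 'contig_B']
```

The 80 nt hit is dropped despite its 100 % identity, which shows why a minimum aligned length is applied before ranking: a short perfect match is not comparable to a full-length alignment of the 115 nt query.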

Data visualisation

Phylogenetic trees were visualised alongside metadata and sequence alignments using the iTOL web service [17]. Isolates were classified as toxigenic on the basis of harbouring both ctxA and ctxB, as determined from the gene presence/absence matrix in Dorman et al. [1, 3]. The figure presented in the paper was edited manually using Adobe Illustrator CC v23.1.1.
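The toxigenicity rule stated above (both ctxA and ctxB present) can be expressed as a short Python sketch. The in-memory dictionary layout, the function name and the isolate names are assumptions made for illustration; the actual presence/absence matrix is the one published with Dorman et al. [1] and its file format is not described here.

```python
def classify_toxigenic(presence_absence):
    """Map each isolate to True only if both ctxA and ctxB are present.

    `presence_absence` is a hypothetical layout: isolate -> {gene: bool}.
    Genes missing from an isolate's record are treated as absent.
    """
    return {
        isolate: bool(genes.get("ctxA", False)) and bool(genes.get("ctxB", False))
        for isolate, genes in presence_absence.items()
    }


# Toy matrix with invented isolate names, for illustration only.
toy_matrix = {
    "isolate_1": {"ctxA": True, "ctxB": True, "tcpA": True},
    "isolate_2": {"ctxA": True, "ctxB": False},
    "isolate_3": {"tcpA": True},
}
print(classify_toxigenic(toy_matrix))
# {'isolate_1': True, 'isolate_2': False, 'isolate_3': False}
```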
References (16 in total)

1.  Cholera.

Authors:  R Pollitzer; S Swaroop; W Burrows
Journal:  Monogr Ser World Health Organ       Date:  1959

2.  Historical background of cholera in the Americas.

Authors:  A Llopis; J Halbrohr
Journal:  Epidemiol Bull       Date:  1991

3.  Detection of Vibrio cholerae aDNA in human burials from the fifth cholera pandemic in Argentina (1886-1887 AD).

Authors:  Darío Alejandro Ramirez; Héctor Alex Saka; Rodrigo Nores
Journal:  Int J Paleopathol       Date:  2021-01-13

4.  SeaView version 4: A multiplatform graphical user interface for sequence alignment and phylogenetic tree building.

Authors:  Manolo Gouy; Stéphane Guindon; Olivier Gascuel
Journal:  Mol Biol Evol       Date:  2009-10-23

5.  Cholera dynamics: lessons from an epidemic.

Authors:  Deepak Balasubramanian; Sebastian Murcia; C Brandon Ogbunugafor; Ronnie Gavilan; Salvador Almagro-Moreno
Journal:  J Med Microbiol       Date:  2021-02

6.  The seventh pandemic of cholera.

Authors:  B Cvjetanovic; D Barua
Journal:  Nature       Date:  1972-09-15

7.  A glimpse into the expanded genome content of Vibrio cholerae through identification of genes present in environmental strains.

Authors:  Alexandra Purdy; Forest Rohwer; Rob Edwards; Farooq Azam; Douglas H Bartlett
Journal:  J Bacteriol       Date:  2005-05

8.  Genomic diversity of 2010 Haitian cholera outbreak strains.

Authors:  Nur A Hasan; Seon Young Choi; Mark Eppinger; Philip W Clark; Arlene Chen; Munirul Alam; Bradd J Haley; Elisa Taviani; Erin Hine; Qi Su; Luke J Tallon; Joseph B Prosper; Keziah Furth; M M Hoq; Huai Li; Claire M Fraser-Liggett; Alejandro Cravioto; Anwar Huq; Jacques Ravel; Thomas A Cebula; Rita R Colwell
Journal:  Proc Natl Acad Sci U S A       Date:  2012-06-18

9.  Genomic and phenotypic diversity of coastal Vibrio cholerae strains is linked to environmental factors.

Authors:  Daniel P Keymer; Michael C Miller; Gary K Schoolnik; Alexandria B Boehm
Journal:  Appl Environ Microbiol       Date:  2007-04-20

10.  Interactive tree of life (iTOL) v3: an online tool for the display and annotation of phylogenetic and other trees.

Authors:  Ivica Letunic; Peer Bork
Journal:  Nucleic Acids Res       Date:  2016-04-19
