
AmtDB: a database of ancient human mitochondrial genomes.

Edvard Ehler, Jirí Novotný, Anna Juras, Maciej Chylenski, Ondrej Moravcík, Jan Paces.

Abstract

Ancient mitochondrial DNA is used for tracing past human demographic events due to its population-level variability. The number of published ancient mitochondrial genomes has increased in recent years, alongside the development of high-throughput sequencing and capture enrichment methods. Here, we present AmtDB, the first database of ancient human mitochondrial genomes. The release version contains 1107 hand-curated ancient samples, freely accessible for download, together with individual descriptors including geographic location, radiocarbon dating, and archaeological culture affiliation. The database also features an interactive map for visualizing sample locations. AmtDB is a key platform for ancient population genetic studies and is available at https://amtdb.org.

Year:  2019        PMID: 30247677      PMCID: PMC6324066          DOI: 10.1093/nar/gky843

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

Ancient DNA (aDNA) is genetic material obtained from ancient specimens which, unlike modern DNA, has undergone fragmentation and post-mortem damage caused mainly by environmental factors (1). Ancient DNA studies conducted over the last 30 years have confirmed that, when appropriate procedures are maintained, genetic material can be recovered from ancient specimens. Until recently, the majority of human aDNA studies focused on mitochondrial DNA (mtDNA), because mtDNA is present in cells in a higher copy number than the nuclear genome and is therefore often the only genetic marker that can be recovered from poorly preserved samples. Due to its maternal inheritance, high mutation rate, absence of recombination and population-level variability, it is a useful tool for reconstructing past demographic events (2). Despite the long-standing interest in ancient mtDNA, it is only in the past few years that a large number of complete mt genomes have become available, alongside the development of high-throughput sequencing, often combined with capture enrichment methods. Mitochondrial DNA, often as part of nuclear genome studies, has been used to reconstruct demographic events that took place in the pre-LGM (Last Glacial Maximum) and post-LGM eras in Europe (3,4), and to trace demographic changes that shaped the mtDNA variation of past and modern populations (5–13), including the influence of the Neolithization process (14–23) and Steppe migrations (24–26). Moreover, mtDNA has been used in several kinship studies as a molecular marker that can exclude direct maternal kinship between ancient individuals (27–31).
Although modern mtDNA databases are currently available, e.g. EMPOP (32), MITOMAP (33), HmtDB (34), and mtDB (35), there is no database dedicated specifically to ancient mt genomes. One database concentrated primarily on ancient DNA is the Online Ancient Genome Repository (OAGR; https://www.oagr.org.au). However, OAGR primarily hosts samples generated (or collaborated on) by the Australian Centre for Ancient DNA, University of Adelaide, and includes both human SNP marker data and microbiome data. AmtDB fills this gap by consistently mapping published aDNA samples from different sources and providing the associated metadata in a standard, uniform, easily downloadable and usable way, together with the mt genome sequences and links to other resources. While our primary focus is ancient mtDNA, the metadata itself can easily be used in ancient genomic, archaeological or anthropological studies.

DATABASE OVERVIEW AND FUNCTIONALITY

The AmtDB database, as of its initial version v1.000, contains 1107 samples. For 887 of these samples we provide the full mt sequences in FASTA format. For all samples, we offer metadata in the form of additional descriptors. Although we use custom scripts for semi-automated data retrieval, all provided data are hand-curated and checked. Authors of aDNA studies usually provide mtDNA sequences in one of three ways, or in any combination thereof:
1.  As complete mtDNA sequences deposited in the GenBank database (https://www.ncbi.nlm.nih.gov/genbank/) (labeled as fasta).
2.  As results from high-throughput sequencing, in the form of SAM/BAM files deposited in an appropriate database, i.e. the European Nucleotide Archive (https://www.ebi.ac.uk/ena) or the Sequence Read Archive (https://www.ncbi.nlm.nih.gov/sra) (labeled as bam).
3.  In haplotype format, i.e. a list of positions that differ from rCRS (36) or RSRS (37) (labeled as reconstructed).
When a FASTA sequence is available from GenBank, we provide this sequence in the database. Otherwise, we preferably reconstruct the mt sequence from the provided SAM/BAM files, merging multiple files per individual with SAMtools (39); where such files are not available, we reconstruct it from the haplotype using the Haplosearch tool (https://haplosearch.com) (38). The bioinformatics pipeline for BAM-based reconstruction includes mapping the merged reads as single-end reads against the rCRS with BWA (40) and collapsing duplicate reads with identical start and end coordinates using the FilterUniqueSAMCons.py script (41). Consensus sequences are built using the ANGSD toolkit (42). For these samples we also display the average sequencing depth (coverage). Users can filter all samples according to the mt sequence source discussed above (fasta, bam, reconstructed). More details about the mtDNA reconstruction pipeline can be found in our previous publication (15) and in the AmtDB documentation (https://amtdb.org/help).
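The haplotype-based reconstruction mentioned above can be illustrated with a minimal Python sketch. It is not the Haplosearch implementation: the function name `apply_substitutions` is hypothetical, only simple substitutions such as `73G` or `A73G` are handled, and indel notation is rejected. The reference used here is a toy string, whereas a real run would use the full rCRS sequence.

```python
import re
from typing import Sequence

# Hypothetical helper, not part of AmtDB or Haplosearch. It accepts only
# simple substitutions ("73G" or "A73G"); indel notation raises ValueError.
SUBSTITUTION = re.compile(r"^[ACGT]?(\d+)([ACGT])$")


def apply_substitutions(reference: str, haplotype: Sequence[str]) -> str:
    """Return a copy of `reference` with each substitution applied.

    Positions are 1-based, as in haplotype lists given relative to a
    reference sequence.
    """
    sequence = list(reference.upper())
    for variant in haplotype:
        match = SUBSTITUTION.match(variant.strip().upper())
        if match is None:
            raise ValueError(f"unsupported variant notation: {variant!r}")
        position, base = int(match.group(1)), match.group(2)
        if not 1 <= position <= len(sequence):
            raise ValueError(f"position out of range: {variant!r}")
        sequence[position - 1] = base
    return "".join(sequence)


# Toy reference for illustration only (assumption, not real mtDNA data).
toy_reference = "GATCACAGGT"
print(apply_substitutions(toy_reference, ["3C", "A7G"]))  # GACCACGGGT
```

A production tool would also need to handle insertions, deletions and heteroplasmy codes, which is why this sketch fails loudly on anything it does not recognize instead of silently skipping it.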
Besides the mt sequences, AmtDB contains additional information about the samples, the metadata. Samples can be selected and browsed based on primary ID and alternative ID(s), several geographic location descriptors, latitude and longitude in decimal degrees, archaeological site, or a group of archaeological cultural background descriptors. We also supply a comment column, which may contain additional information about the sample, usually information about relationships, uncertainties, or important notes that do not fit into any other category and might be valuable for researchers. Biological variables include sex, mt haplogroup, Y chromosomal haplogroup, and Y chromosomal haplotype. For sample age information, we use calibrated BCE or CE ((Before) Common Era) dates wherever possible. For radiocarbon-dated samples, we provide the precise minimum and maximum values of the 95.4% probability interval for the calibrated (B)CE date, the uncalibrated BP (Before Present) age, and the radiocarbon laboratory and sample code. For samples that are not directly 14C dated, but for which other samples from the same layer are, we provide the calibrated (B)CE age of the layer. For samples that were dated only according to the material culture associated with the sample, we use the uncalibrated (B)CE age. Our database search engine allows filtering for 14C-dated samples only. For each sample we also provide the publication reference, a DOI-based reference link and a link to the sample (mt) sequence repository.
The focal point of our simple, clear and user-friendly interface with advanced search options (Figure 1A) is the visualization of the filtered samples on an interactive world map (Figure 1B). Samples on the map can be clustered together by their distance, and smaller clusters are created when the map is zoomed in. A tooltip with sample links appears when a cluster is right-clicked. Maps are available in several graphical overlays (political, physical, satellite or blind map), and are ready for download together with all provided sequences and metadata, without registration.
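Because the metadata can be downloaded, the kind of search shown in Figure 1A can in principle also be reproduced offline. The Python sketch below illustrates this on a small made-up table. All column names (`country`, `sex`, `year_from`, `year_to`, `sequence_source`, `avg_coverage`, `c14_dated`) and all rows are assumptions for illustration and may not match the headers or value encodings of the actual AmtDB export. Years BCE are encoded as negative integers, and the culture filter is omitted for brevity.

```python
import csv
import io

# Toy data with hypothetical column names; not taken from AmtDB.
SAMPLE_CSV = """id,country,sex,year_from,year_to,sequence_source,avg_coverage,c14_dated
S1,Spain,F,-3500,-3300,bam,25.4,yes
S2,France,U,-4000,-3800,bam,12.0,yes
S3,Spain,M,-3000,-2800,fasta,,no
S4,Germany,M,-3200,-3000,bam,40.1,yes
S5,France,M,-4500,-4300,bam,8.2,yes
"""


def matches(row: dict) -> bool:
    """Apply a filter similar to Figure 1A (negative years mean BCE)."""
    coverage = float(row["avg_coverage"] or 0)  # empty cell counts as 0
    return (
        row["country"] in {"France", "Spain"}
        and row["sex"] in {"F", "M"}           # known sex only
        and int(row["year_from"]) >= -4600     # not older than 4600 BCE
        and int(row["year_to"]) <= -2200       # not younger than 2200 BCE
        and row["sequence_source"] == "bam"    # reconstructed from BAM files
        and coverage >= 10                     # average coverage at least 10
        and row["c14_dated"] == "yes"          # radiocarbon dated
    )


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
selected = [row["id"] for row in rows if matches(row)]
print(selected)  # ['S1']
```

Keeping the criteria in a single predicate makes it easy to adjust one condition at a time, much as the web interface allows individual filters to be toggled.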
Figure 1.

AmtDB (https://amtdb.org) advanced search overview. (A) The database was filtered for Neolithic and Copper Age samples from France and Spain. These samples also have the following attributes: known sex (‘F’ or ‘M’), age between 4600 and 2200 BCE, complete mtDNA available, mt sequence reconstructed (sequence source) from BAM files with an average coverage of at least 10, and radiocarbon dated. (B) Visualization of the search results on a map, showing unclustered and clustered samples (clustering can be toggled on and off). A tooltip with sample links can be displayed for each sample or cluster (here shown for a four-sample cluster from the Arroyal I site in Burgos, Spain).

CONCLUSION

The database is currently in its initial operational capability phase, v1.000, and will receive 2–3 major updates per year, concentrating on adding more published samples. We believe that researchers of ancient human populations will find AmtDB useful since, to the best of our knowledge, there is no comparable database in terms of usability and data content.

DATA AVAILABILITY

The Ancient human mitochondrial genomes database can be found at https://amtdb.org.
  40 in total

1.  Reanalysis and revision of the Cambridge reference sequence for human mitochondrial DNA.

Authors:  R M Andrews; I Kubacka; P F Chinnery; R N Lightowlers; D M Turnbull; N Howell
Journal:  Nat Genet       Date:  1999-10       Impact factor: 38.330

2.  Illumina sequencing library preparation for highly multiplexed target capture and sequencing.

Authors:  Matthias Meyer; Martin Kircher
Journal:  Cold Spring Harb Protoc       Date:  2010-06

3.  HaploSearch: a tool for haplotype-sequence two-way transformation.

Authors:  Rosa Fregel; Sergio Delgado
Journal:  Mitochondrion       Date:  2010-11-06       Impact factor: 4.160

4.  EMPOP--a forensic mtDNA database.

Authors:  Walther Parson; Arne Dür
Journal:  Forensic Sci Int Genet       Date:  2007-03-07       Impact factor: 4.882

5.  Using phylochronology to reveal cryptic population histories: review and synthesis of 29 ancient DNA studies.

Authors:  Uma Ramakrishnan; Elizabeth A Hadly
Journal:  Mol Ecol       Date:  2009-04       Impact factor: 6.185

6.  The Sequence Alignment/Map format and SAMtools.

Authors:  Heng Li; Bob Handsaker; Alec Wysoker; Tim Fennell; Jue Ruan; Nils Homer; Gabor Marth; Goncalo Abecasis; Richard Durbin
Journal:  Bioinformatics       Date:  2009-06-08       Impact factor: 6.937

7.  Ancient DNA, Strontium isotopes, and osteological analyses shed light on social and kinship organization of the Later Stone Age.

Authors:  Wolfgang Haak; Guido Brandt; Hylke N de Jong; Christian Meyer; Robert Ganslmeier; Volker Heyd; Chris Hawkesworth; Alistair W G Pike; Harald Meller; Kurt W Alt
Journal:  Proc Natl Acad Sci U S A       Date:  2008-11-17       Impact factor: 11.205

8.  Ancient DNA from European early neolithic farmers reveals their near eastern affinities.

Authors:  Wolfgang Haak; Oleg Balanovsky; Juan J Sanchez; Sergey Koshel; Valery Zaporozhchenko; Christina J Adler; Clio S I Der Sarkissian; Guido Brandt; Carolin Schwarz; Nicole Nicklisch; Veit Dresely; Barbara Fritsch; Elena Balanovska; Richard Villems; Harald Meller; Kurt W Alt; Alan Cooper
Journal:  PLoS Biol       Date:  2010-11-09       Impact factor: 8.029

9.  mtDB: Human Mitochondrial Genome Database, a resource for population genetics and medical sciences.

Authors:  Max Ingman; Ulf Gyllensten
Journal:  Nucleic Acids Res       Date:  2006-01-01       Impact factor: 16.971

10.  Fast and accurate short read alignment with Burrows-Wheeler transform.

Authors:  Heng Li; Richard Durbin
Journal:  Bioinformatics       Date:  2009-05-18       Impact factor: 6.937
