Inspecting close maternal relatedness: Towards better mtDNA population samples in forensic databases.

Martin Bodner, Jodi A Irwin, Michael D Coble, Walther Parson.

Abstract

Reliable data are crucial for all research fields applying mitochondrial DNA (mtDNA) as a genetic marker. Quality control measures have been introduced to ensure the highest standards in sequence data generation, validation and a posteriori inspection. A phylogenetic alignment strategy has been widely accepted as a prerequisite for data comparability and database searches, for forensic applications, for reconstructions of human migrations and for correct interpretation of mtDNA mutations in medical genetics. There is continuing effort to increase the number of worldwide population samples in order to contribute to a better understanding of human mtDNA variation. This has often led to the analysis of convenience samples collected for other purposes, which might not meet the quality requirement of random sampling for mtDNA data sets. Here, we introduce an additional quality control measure that deals with one aspect of this limitation: by combining autosomal short tandem repeat (STR) markers with mtDNA information, it helps to avoid the bias introduced by related individuals included in the same (small) sample. By STR analysis of individuals sharing their mitochondrial haplotype, pedigree construction and subsequent software-assisted calculation of likelihood ratios based on the allele frequencies found in the population, closely maternally related individuals can be identified and excluded. We also discuss scenarios that allow related individuals in the same set. An ideal population sample would be representative of its population: this new approach represents another contribution towards this goal.

Year: 2010 PMID: 21067986 PMCID: PMC3135241 DOI: 10.1016/j.fsigen.2010.10.001

Source DB: PubMed Journal: Forensic Sci Int Genet ISSN: 1872-4973 Impact factor: 4.882

Introduction

An understanding of global mtDNA variation is important for a plethora of applications that take advantage of mtDNA, including phylogeographic studies, as well as population, medical and forensic genetics. Many areas outside West Eurasia are still largely underrepresented from the standpoint of high quality mtDNA sequence data, although the number of regions investigated is steadily increasing as a result of ongoing collaborative efforts. The highest quality standards in mtDNA sequence data generation, validation, transfer, processing, recording, storage, deposition, availability and reporting have largely been derived from forensics and are now well accepted requirements for mtDNA database reliability and publication [1-6]. Additional quality control measures that include a standardized alignment approach relative to the reference based on phylogenetics [1,7] and a posteriori tools that employ phylogenetic analysis to facilitate the inspection of (novel) single nucleotide polymorphisms (SNPs) and indels [2,8,9] have been introduced to further improve the quality of global mtDNA data sets. Yet, despite these extensive quality requirements, an aspect of mtDNA population databasing that has been addressed only coarsely in forensic literature [1,10] is the thorough investigation of close maternal relatedness of sample donors among “random” samples from a given population. Here, we present our considerations on how to identify samples from related donors by inspecting autosomal STR markers.

Convenience samples as a common source of maternally related donors

An appropriate sample set should mirror the true situation, in terms of being cross-sectional and randomly representative of its population. The standard sample (n = 200–400) is often inadequately small, and thus any lineage present therein tends to be overrepresented compared to those not sampled [10-12]. This inherent frequency bias should not be increased by samples from closely maternally related donors, especially for very rare haplotypes. The main reason for the presence of such samples in any particular set is the use of “convenience samples” [1,11], which were collected for purposes other than establishing an mtDNA database. The donors do not necessarily constitute a random sample, and since this may bias the haplotype frequencies, relatedness has to be inspected. Genetic studies can be heavily distorted by unknown relationships [13]. Convenience samples often represent the only opportunity to obtain genetic information on certain populations. The samples can be used, as long as their limitations are understood and addressed. However, they should not be employed for populations from which samples can easily be acquired. Throughout this manuscript, the description “closely maternally related” refers to constellations typically encountered in forensic paternity casework, i.e. mother–children and siblings, because the majority of samples derive from either those cases or cases involving the interrogation of families, where such samples are also likely to be found.
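To make the size of this bias concrete, the following minimal sketch compares the naive counting estimate of a rare haplotype's frequency with and without close relatives in the sample. The numbers (a sample of n = 200 containing a mother and her two children) are illustrative assumptions, not data from this study, and the function name is hypothetical.

```python
def haplotype_frequency(count: int, sample_size: int) -> float:
    """Naive counting estimate: observations of a haplotype / sample size."""
    if sample_size <= 0 or not 0 <= count <= sample_size:
        raise ValueError("count must lie between 0 and a positive sample size")
    return count / sample_size


# Assumed scenario: a rare haplotype carried by a mother and her two
# children, all three included in a convenience sample of n = 200.
n = 200
with_relatives = haplotype_frequency(3, n)

# After excluding two of the three relatives, one observation remains
# in a sample of 198.
after_exclusion = haplotype_frequency(1, n - 2)

# Factor by which the relatives inflate the estimated frequency.
inflation = with_relatives / after_exclusion
print(f"{with_relatives:.4f} vs {after_exclusion:.4f} (x{inflation:.2f})")
```

In this assumed scenario the estimate is inflated almost threefold. The same two extra observations would barely move the estimate for a haplotype that is already observed dozens of times.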

Laboratory workflow for the inspection of maternal relatedness

Fig. 1 depicts the practical workflow we propose for the identification of mtDNA samples that derive from closely maternally related donors. Sample calculations are shown in the supplementary file.

Fig. 1

Workflow for the identification of mtDNA samples deriving from closely maternally related donors (starting top left). All steps are detailed in the text.

Selecting samples with identical mtDNA haplotypes

Samples that reveal identical mtDNA haplotypes in their greatest common range should be subjected to the analysis of maternal relatedness, especially when rare haplotypes are encountered. Screening for consecutiveness in sample numbering or shared rare polymorphisms can give an indication of relatedness, but is not the only means of inspection for identical haplotypes. Haplotypes that only differ in length and/or point heteroplasmies should be considered identical at that step. Such differences may appear among tissues or even within one tissue from the same donor or among samples from maternal relatives, and can result from analytical or detection conditions [14-19]. Haplotypes that differ by at least one full SNP or indel outside the known length variant regions (around positions 16193, 309, 455 and 573) constitute distinct lineages and therefore should remain in the database even though their donors might be closely maternally related.
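The selection rule can be sketched as a pairwise comparison. The sketch below is an illustration under stated assumptions, not the authors' implementation: haplotypes are assumed to be lists of difference strings such as "263G" or "309.1C", already restricted to the greatest common sequence range. Only insertions at the four named positions are treated as length variants, and any IUPAC ambiguity code is treated as a point heteroplasmy compatible with every state. All function names and sample identifiers are hypothetical.

```python
import re
from itertools import combinations

# Positions around which length variants are ignored (from the text).
LENGTH_VARIANT_POSITIONS = {16193, 309, 455, 573}
# IUPAC ambiguity codes, used here as a marker of point heteroplasmy.
AMBIGUITY_CODES = set("RYSWKMBDHVN")
_VARIANT = re.compile(r"^(\d+)(\.\d+)?(\S+)$")


def parse_variant(variant):
    """Split e.g. '309.1C' into (309, '.1', 'C'); insertion is None for SNPs."""
    match = _VARIANT.match(variant)
    if match is None:
        raise ValueError(f"unrecognized variant: {variant!r}")
    position, insertion, state = match.groups()
    return int(position), insertion, state


def comparable_core(haplotype):
    """Map (position, insertion) -> state, dropping ignored length variants."""
    core = {}
    for variant in haplotype:
        position, insertion, state = parse_variant(variant)
        if insertion is not None and position in LENGTH_VARIANT_POSITIONS:
            continue  # simplification: only insertions are treated as length variants
        core[(position, insertion)] = state
    return core


def is_point_heteroplasmy(state):
    return state is not None and len(state) == 1 and state.upper() in AMBIGUITY_CODES


def identical_for_screening(hap_a, hap_b):
    """True if the haplotypes differ only by length variants or point heteroplasmy."""
    core_a, core_b = comparable_core(hap_a), comparable_core(hap_b)
    for key in core_a.keys() | core_b.keys():
        state_a, state_b = core_a.get(key), core_b.get(key)
        if state_a == state_b:
            continue
        if is_point_heteroplasmy(state_a) or is_point_heteroplasmy(state_b):
            continue
        return False  # a full SNP or indel difference: distinct lineages
    return True


samples = {
    "S01": ["263G", "315.1C", "16519C"],
    "S02": ["263G", "309.1C", "315.1C", "16519C"],  # extra length variant at 309
    "S03": ["263G", "315.1C", "16093Y", "16519C"],  # point heteroplasmy at 16093
    "S04": ["73G", "263G", "315.1C", "16519C"],     # full SNP difference at 73
}
candidate_pairs = [
    (a, b)
    for a, b in combinations(sorted(samples), 2)
    if identical_for_screening(samples[a], samples[b])
]
print(candidate_pairs)
```

A pairwise test is used instead of grouping by a normalized key because compatibility through heteroplasmy is not transitive. Only the pairs returned here would proceed to STR typing.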

STR typing

Further analyses of the samples under consideration are performed by typing autosomal STRs. The selection of a validated STR set with high discrimination power is recommended. If two or more male individuals are included, the analysis of Y-chromosomal markers can be useful. Autosomal STRs are well suited to pinpointing identical samples. Samples that do not share any STR alleles can justifiably stay in the data set. In the remaining cases, statistical calculations are necessary to distinguish between allele sharing by descent vs. coincidence (or because of distant shared ancestry).
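This triage step can be sketched as follows. The profile layout (a dictionary mapping each locus to a pair of alleles), the genotypes and the three category labels are assumptions made for illustration; mutation and typing errors are ignored.

```python
def classify_pair(profile_a, profile_b):
    """Sort a pair of STR profiles into one of three triage categories.

    'identical'          every shared locus shows the same genotype
    'no_shared_alleles'  no allele is shared at any locus
    'needs_lr'           some sharing; a likelihood ratio is required
    """
    loci = profile_a.keys() & profile_b.keys()
    if not loci:
        raise ValueError("profiles have no locus in common")
    if all(sorted(profile_a[l]) == sorted(profile_b[l]) for l in loci):
        return "identical"
    sharing_loci = [l for l in loci if set(profile_a[l]) & set(profile_b[l])]
    if not sharing_loci:
        return "no_shared_alleles"
    return "needs_lr"


# Illustrative genotypes only (not real casework data).
donor_a = {"D3S1358": (15, 16), "vWA": (17, 18), "FGA": (21, 24), "TH01": (6, 9.3)}
donor_b = {"D3S1358": (16, 15), "vWA": (18, 17), "FGA": (24, 21), "TH01": (9.3, 6)}
donor_c = {"D3S1358": (14, 17), "vWA": (14, 16), "FGA": (20, 22), "TH01": (7, 8)}
donor_d = {"D3S1358": (15, 17), "vWA": (17, 17), "FGA": (21, 22), "TH01": (6, 7)}

print(classify_pair(donor_a, donor_b))  # same genotypes, allele order differs
print(classify_pair(donor_a, donor_c))
print(classify_pair(donor_a, donor_d))
```

Only pairs in the third category need the pedigree likelihood calculations described next.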

Constructing pedigrees and calculating their likelihoods

It is sensible to avoid impractically high numbers of relationship alternatives [20] by limiting the possible pedigrees to those relevant for the scenario, according to the proportion of STR alleles shared between the samples under discussion along with hints from non-DNA information. In order to determine the pedigree that reflects the true relationship, the pedigrees’ probabilities have to be calculated. We used “Familias”, a freeware program that computes probabilities and likelihoods for paternity and identification cases [21,22]. For the calculations, the STR allele frequencies in the specific (sub-)population are required. If they are not available, frequencies from neighboring or similar populations may be helpful, assuming genetic homogeneity. Mean value calculations will pinpoint outliers. Repeated calculation using different databases is another practical solution to check the robustness of conclusions [13].

Calculating likelihood ratios and decision making

A specified relationship A is then compared against the alternative B in a likelihood ratio (LR) indicating whether the given genetic data are more likely if pedigree A is true vs. if pedigree B is true (e.g. sibs vs. unrelated; mother and offspring vs. unrelated). The LR cutoffs for excluding one pedigree in favor of another can be applied as is practice in forensic paternity casework [23] (Table 1). Typing additional STR systems may be meaningful in those cases which prove inconclusive when only the core STR loci are typed.
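For the simple pairwise pedigrees considered here, the LR can be illustrated with the standard identity-by-descent (IBD) formulation. This is a minimal sketch and not the Familias algorithm, which handles general pedigrees: it assumes independent loci, Hardy-Weinberg proportions, no mutation and no subpopulation correction. The locus name, allele frequencies and function names are hypothetical.

```python
from math import prod

# Probabilities of sharing 0, 1 or 2 alleles identical by descent.
IBD = {
    "unrelated": (1.0, 0.0, 0.0),
    "parent_child": (0.0, 1.0, 0.0),
    "full_sibs": (0.25, 0.5, 0.25),
}


def genotype_prob(genotype, freq):
    """Hardy-Weinberg probability of an unordered genotype."""
    a, b = genotype
    return freq[a] ** 2 if a == b else 2 * freq[a] * freq[b]


def prob_given_one_ibd(g1, g2, freq):
    """P(g2 | g1, exactly one allele IBD): one allele of g2 copies g1."""
    c, d = g2
    total = 0.0
    for shared in g1:  # each allele of g1 is the IBD allele with probability 1/2
        if c == d:
            total += 0.5 * (freq[c] if shared == c else 0.0)
        else:
            total += 0.5 * ((freq[d] if shared == c else 0.0)
                            + (freq[c] if shared == d else 0.0))
    return total


def locus_likelihood(g1, g2, freq, relationship):
    """Joint probability of both genotypes under the given relationship."""
    k0, k1, k2 = IBD[relationship]
    same = 1.0 if sorted(g1) == sorted(g2) else 0.0
    return genotype_prob(g1, freq) * (
        k0 * genotype_prob(g2, freq)
        + k1 * prob_given_one_ibd(g1, g2, freq)
        + k2 * same
    )


def likelihood_ratio(profile_a, profile_b, freqs, rel_a, rel_b="unrelated"):
    """LR = P(data | pedigree A) / P(data | pedigree B), multiplied over loci."""
    loci = profile_a.keys() & profile_b.keys() & freqs.keys()
    num = prod(locus_likelihood(profile_a[l], profile_b[l], freqs[l], rel_a) for l in loci)
    den = prod(locus_likelihood(profile_a[l], profile_b[l], freqs[l], rel_b) for l in loci)
    if den == 0:
        raise ZeroDivisionError("data are impossible under pedigree B")
    return num / den


# Hypothetical single-locus example with made-up allele frequencies.
freqs = {"LOCUS_1": {12: 0.1, 13: 0.3, 14: 0.6}}
x = {"LOCUS_1": (12, 13)}
y = {"LOCUS_1": (12, 13)}
lr_parent_child = likelihood_ratio(x, y, freqs, "parent_child")
lr_full_sibs = likelihood_ratio(x, y, freqs, "full_sibs")
print(round(lr_parent_child, 3), round(lr_full_sibs, 3))
```

Rerunning `likelihood_ratio` with frequency tables from different reference populations is one way to apply the robustness check mentioned above.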

Table 1

Likelihood ratio cutoff values.

Likelihood ratio (pedigree A/pedigree B)	Support for pedigree A
1–10	Limited
10–100	Moderate
100–1000	Strong
1000 and more	Very strong
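
A small helper can translate a computed LR into the verbal categories of Table 1. Because the tabulated ranges share their boundary values (10, 100 and 1000), this sketch assumes that each boundary belongs to the higher category. Values below 1 are not covered by the table and return None. The function name is hypothetical.

```python
def support_for_pedigree_a(lr):
    """Map a likelihood ratio (pedigree A / pedigree B) to Table 1 wording."""
    if lr < 0:
        raise ValueError("a likelihood ratio cannot be negative")
    if lr < 1:
        return None  # not covered by Table 1; the data favor pedigree B
    if lr < 10:
        return "Limited"
    if lr < 100:
        return "Moderate"
    if lr < 1000:
        return "Strong"
    return "Very strong"


for value in (0.5, 5, 10, 250, 1000):
    print(value, support_for_pedigree_a(value))
```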

With the strategy presented, the decision whether samples that share their mtDNA haplotype are excluded from a data set because of a close maternal relationship rests on the scientific grounds of the LR. If a close maternal relationship is found between two or more samples, we propose maintaining in the data set the sample which produced the highest sequence quality or for which more DNA is available (if further analyses are intended). A notification of the kinship analyses performed and their results will be returned to the sample set provider. The exclusion of samples may not be valid for other analyses, where maternal lines of descent are not relevant.

Discussion

Limitations of the approach to identify close maternal relatives

When testing samples from deficiency case pedigrees with a limited number of markers and no additional family members available, a clear result indicating maternal relatedness will only be obtained if the donors are as close as mother–child, siblings or monozygotic twins. Even with an expanded set of markers, there is a high rate of misclassification for half-sibs, unless the mother's profile is included [24]. The technical and computational limitations of the analyses that are currently feasible also define where to draw the line for unacceptably close relatives. However, these limitations in detection are restricted to rare events, as the convenience samples used for mtDNA databasing very often derive from paternity casework or from a sampling strategy that involves families, where mainly close relations are relevant. These can be clearly identified with our methodology. To resolve cases of more distant maternal relationships, which are much less frequent in convenience samples, other approaches integrating large amounts of autosomal SNP data exist [13]. The presented strategy targets the evaluation of those shared haplotypes that are rare or yet unobserved, as their overrepresentation can be relevant in forensic applications. For common haplotypes (e.g. the control region haplotype 263G 315.1C 16519C in European populations), the problem of closely related samples is less relevant as the haplotype frequency is relatively high anyway. These samples will by chance also share frequent STR alleles to varying extents: “false positive” results with possibly high LRs could arise. An adaptation of LR cutoff values, after more data from known relatives have been evaluated, may be meaningful in such cases. However, the removal of single samples from closely related donors has only a minor impact on the frequency values here compared to rare haplotypes.
Further, related samples might be missed if the true STR allele frequencies in the actual (sub-)population differ from those used for the calculations or if inappropriate STR systems have been chosen. Finally, limited quantity and/or quality of nuclear DNA can impede the generation of STR profiles.

Non-genetic information aids the inspection of maternal relatedness

In this manuscript, we have assumed that the (convenience) population sample, containing any degree of maternal relatedness, and additional information are already defined. The (more comprehensive) collection of metadata at sampling is a meaningful contribution to better mtDNA sample quality, prior to laboratory analyses, since this non-genetic (or “prior”) information can increase the power of relationship inference [13]. Expanding the sample collection questionnaire to cover the full pedigree (maternal and paternal ancestors and offspring) supports all subsequent studies by helping to exclude or confirm pedigrees. Storage of the non-genetic data by a single institution that only makes it available on specific request could comply with the regulations on the collection of personal information that exist in various institutions [4].

Different types of samples and the definition of a population

In a data set representing one certain settlement, tribe, island, or similar – and likewise, highly endogamous and isolated populations – overrepresentation of several lineages is not necessarily a result of inappropriate sampling (cf. [11]). Related people are part of the limited mtDNA pool of a restricted entity, and reducing the abundance to one sample per lineage would not reflect the true frequencies. This clearly demonstrates that the type and definition of a population influence its level of random relatedness (and which samples have to be excluded from a representative data set). For the sake of the discussion here, we have presumed that these complex issues [1] have already been carefully addressed and the population samples under consideration, and the populations they represent, have been previously defined. These topics are imperfectly understood and clearly warrant further discussion beyond the scope of this manuscript.

Conclusions

The presence of maternally related donors in a “random” population sample has so far not been as thoroughly addressed in quality control as other aspects of mtDNA analysis and databasing. The simple practical approach presented here helps to detect the “clear and easy” cases of close maternal kinship between donors in a sample set: following the procedure described, these samples can be identified and subsequently excluded. If adopted, this additional tool will contribute towards better random mtDNA population samples representative of their population, for the benefit of all research applying mtDNA as a genetic marker.

22 in total

1. Detecting errors in mtDNA data by phylogenetic analysis.

Authors: H J Bandelt; P Lahermo; M Richards; V Macaulay
Journal: Int J Legal Med Date: 2001-10 Impact factor: 2.686

2. Different methods to determine length heteroplasmy within the mitochondrial control region.

Authors: Sabine Lutz-Bonengel; Timo Sänger; Stefan Pollak; Reinhard Szibor
Journal: Int J Legal Med Date: 2004-10 Impact factor: 2.686

3. Results of a collaborative study of the EDNAP group regarding mitochondrial DNA heteroplasmy and segregation in hair shafts.

Authors: G Tully; S M Barritt; K Bender; E Brignon; C Capelli; N Dimo-Simonin; C Eichmann; C M Ernst; C Lambert; M V Lareu; B Ludes; B Mevag; W Parson; H Pfeiffer; A Salas; P M Schneider; E Staalstrom
Journal: Forensic Sci Int Date: 2004-02-10 Impact factor: 2.395

4. Discrimination of half-siblings when maternal genotypes are known.

Authors: Lianne R Mayor; David J Balding
Journal: Forensic Sci Int Date: 2005-09-08 Impact factor: 2.395

Review 5. Scientific standards for studies in forensic genetics.

Authors: Peter M Schneider
Journal: Forensic Sci Int Date: 2006-07-27 Impact factor: 2.395

6. Consistent treatment of length variants in the human mtDNA control region: a reappraisal.

Authors: H-J Bandelt; W Parson
Journal: Int J Legal Med Date: 2007-03-09 Impact factor: 2.686

7. EMPOP--a forensic mtDNA database.

Authors: Walther Parson; Arne Dür
Journal: Forensic Sci Int Genet Date: 2007-03-07 Impact factor: 4.882

8. Validation of software for calculating the likelihood ratio for parentage and kinship.

Authors: J Drábek
Journal: Forensic Sci Int Genet Date: 2008-12-24 Impact factor: 4.882

9. Publication of population data for forensic purposes.

Authors: Angel Carracedo; John M Butler; Leonor Gusmão; Walther Parson; Lutz Roewer; Peter M Schneider
Journal: Forensic Sci Int Genet Date: 2010-02-21 Impact factor: 4.882

10. Sequencing strategy for the whole mitochondrial genome resulting in high quality sequences.

Authors: Liane Fendt; Bettina Zimmermann; Martin Daniaux; Walther Parson
Journal: BMC Genomics Date: 2009-03-30 Impact factor: 3.969
