
Comparison of the effectiveness of microsatellites and SNP panels for genetic identification, traceability and assessment of parentage in an inbred Angus herd.

María E Fernández, Daniel E Goszczynski, Juan P Lirón, Egle E Villegas-Castagnasso, Mónica H Carino, María V Ripoli, Andrés Rogberg-Muñoz, Diego M Posik, Pilar Peral-García, Guillermo Giovambattista.

Abstract

During the last decade, microsatellites (short tandem repeats or STRs) have been successfully used for animal genetic identification, traceability and paternity, although in recent years single nucleotide polymorphisms (SNPs) have been increasingly used for this purpose. An efficient SNP identification system requires a marker set with enough power to identify individuals and their parents. Genetic diagnostics generally include the analysis of related animals. In this work, the degree of information provided by SNPs for a consanguineous herd of cattle was compared with that provided by STRs. Thirty-six closely related Angus cattle were genotyped for 18 STRs and 116 SNPs. The cumulative SNP exclusion power (Q) for paternity and the sample matching probability (MP) yielded values greater than 0.9998 and 4.32E−42, respectively. Generally, 2-3 SNPs per STR were needed to obtain an equivalent Q value. The MP showed that 24 SNPs were equivalent to the ISAG (International Society for Animal Genetics) minimal recommended set of 12 STRs (MP ∼ 10⁻¹¹). These results provide valuable genetic data that support the consensus SNP panel for bovine genetic identification developed by the Parentage Recording Working Group of ICAR (International Committee for Animal Recording).


Keywords: bovine; exclusion probability; genetic identification; microsatellite; single nucleotide polymorphism

Year: 2013 PMID: 23885200 PMCID: PMC3715284 DOI: 10.1590/S1415-47572013000200008

Source DB: PubMed Journal: Genet Mol Biol ISSN: 1415-4757 Impact factor: 1.771

Introduction

DNA markers are becoming increasingly important in animal breeding and have been successfully used in bovine identification, in parentage testing and to establish relationships between two or more individuals (Glowatzki-Mullis ; Heyen ; Williams ; Heaton ). These markers have also been used to trace meat through the entire food chain (Arana ) because of the reliable and accurate traceability they provide based on matching genetic marker profiles (Dalvit ); the use of such markers has the potential to improve the rate of genetic progress (Van Eenennaam ). Microsatellites or short tandem repeats (STRs) have been the genetic markers of choice for more than two decades. Despite being highly polymorphic, informative and interspersed throughout the entire genome (Baumung ; Tian ), the results obtained with STRs by different laboratories are not always comparable because of inconsistencies in allele size calling and errors in size determination. Furthermore, STRs are time consuming for trained personnel to analyze, even with the use of appropriate software or other automated methods for allele analysis (Vignal ). Recent advances in high-throughput DNA sequencing, computer software and bioinformatics have made the use of SNPs more popular (Heaton ). Although in terms of genetic information a biallelic marker may be considered a step backwards, SNPs have some promising advantages, including greater abundance (Heaton ), genetic stability in mammals (Markovtsova ; Nielsen, 2000; Thomson ), simpler nomenclature and suitability for automated analysis and data interpretation (Wang ; Lindblad-Toh ). Furthermore, SNPs have been successfully used in the discovery of quantitative trait loci (QTL) and the association of genes with specific productive traits (Chen and Abecasis 2007; Wollstein ) and in the identification of individuals and breeds (Negrini ).
A prerequisite for the development of efficient SNP-based identification systems is the description of a minimal set with sufficient power to uniquely identify individuals and their parents in a variety of popular breeds and cross-bred populations (Heaton ), even though the information content in a SNP set may vary significantly between populations (Krawczak, 1999). Previous studies designed strategies to sample the entire genetic diversity in beef cattle or purebred populations, and simulated populations with purebred gene frequencies have been used to estimate the resolution and sensitivity of these methods in identifying individuals and in parental analysis (Table S7) (Heaton ; Werner ; López Herráez ; Van Eenennaam ; Baruch and Weller, 2008; Karniol ; Allen ; Hara ,b). Most of the routine work done in livestock genetic laboratories includes the analysis of closely related animals (herdbook registry, half-sibs, etc.). Since high consanguinity is common in commercial ranches, additional markers are required to maintain the accuracy of the analysis (Pollak, 2005). To address this problem, Anderson and Garza (2005) calculated the discriminatory power of SNPs in large scale parentage studies by considering the occurrence of related individuals among the members of putative mother-father-offspring trios. More recently, Fisher used simulated and empirical data to evaluate the effectiveness of SNPs and STRs for parentage matching based on different degrees of relatedness. Recently, the Parentage Recording Working Group of the ICAR (International Committee for Animal Recording) developed a cattle consensus panel of 99 SNPs, and a final ring test to certify laboratories around the world is underway.
Considering this scenario, and the fact that there is considerably more experience in the use of microsatellites than SNPs (in terms of laboratory and statistical methods for analysis), the aim of this work was to compare the amount of information provided by microsatellites and SNPs within a consanguineous Angus herd.

Materials and Methods

Sample and DNA extraction

The study was done using 36 consanguineous Angus calves from a herd in Buenos Aires Province. This herd belongs to a typical commercial farm that produces, selects and sells bulls to breeding farms. The samples analyzed included half-sibs from six bulls that shared a grandfather and were obtained from the nucleus herd (consanguinity ∼0.2). Figure S1 provides a schematic diagram of the breeding system used. DNA was extracted from blood using NucleoSpin Blood purification kits (Macherey-Nagel, Düren, Germany), according to the manufacturer’s instructions.

Genotyping

DNA genotyping was done with microsatellites and SNPs. The microsatellite markers used were BM1818, BM1824, BM2113, BRR, CSRM60, CSSM66, ETH3, ETH10, ETH225, HAUT27, HEL1, INRA023, RM067, SPS115, TGLA53, TGLA122, TGLA126, and TGLA227. These 18 STRs belong to the standard FAO panel (Van de Goor ) and/or to the standardized recommended list of the International Society for Animal Genetics (ISAG). A self-developed kit was used for PCR and the fragments were identified in an automatic MegaBACE 1000 DNA sequencer (GE Healthcare, USA). Allele sizes were standardized to the ISAG nomenclature. For SNP genotyping, 116 parentage SNPs from the Illumina BovineHD BeadChip were used (the list of SNPs is detailed in the Supplementary Material). This set comprised all SNPs included in the consensus panel for cattle identification developed by the Parentage Recording Working Group of ICAR (International Committee for Animal Recording). Genotypes with auto-calling < 85% were excluded from the analyses despite the fact that they were highly curated; 30 duplicates were included in the chip used. SNP genotyping was done using the genotyping services of GeneSeek Inc. (Lincoln, NE, USA).

Statistical analysis

Allele frequencies were determined by direct counting. ARLEQUIN 3.5 software (Schneider ) was used to estimate the levels of genetic variability through allelic diversity (na; total number of alleles, average number of alleles and number of alleles per locus) and the unbiased expected (he) and observed heterozygosity (ho) for each locus and all loci. Hardy-Weinberg equilibrium (HWE) was estimated by FIS using the exact test implemented in GENEPOP 4 (Rousset and Raymond, 1997; Rousset, 2007). The FIS index was also used to estimate the degree of molecular consanguinity instead of pedigree consanguinity or kinship because the entire matrilineage was unavailable. The match probability (MP) and exclusion power (Q) were estimated for cases involving two known parents, one known parent, missing parents and individual identification based on one (Q1) and two (Q2) marker exclusion criteria. These parameters were calculated for each marker and for the whole set as described by Weir (1996), using algorithms programmed with Visual Basic and implemented in Excel software (available upon request from the corresponding author).
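The per-locus diversity statistics named above can be illustrated with a minimal sketch. It assumes the textbook estimators (allele frequencies by direct counting, the sample-size-corrected expected heterozygosity 2n/(2n − 1) · (1 − Σp²), and FIS = 1 − ho/he); it is not the ARLEQUIN or GENEPOP implementation, and the function name `locus_summary` and the toy genotypes are hypothetical.

```python
from collections import Counter

def locus_summary(genotypes):
    """Summarize one locus from a list of (allele_a, allele_b) tuples.

    Missing calls are assumed to have been dropped before calling this.
    """
    n = len(genotypes)                      # number of genotyped animals
    total = 2 * n                           # number of allele copies
    counts = Counter(a for g in genotypes for a in g)
    freqs = {a: c / total for a, c in counts.items()}   # direct counting

    # Observed heterozygosity: share of animals carrying two different alleles.
    ho = sum(1 for a, b in genotypes if a != b) / n
    # Unbiased expected heterozygosity (small-sample correction 2n / (2n - 1)).
    he = (total / (total - 1)) * (1 - sum(p * p for p in freqs.values()))
    # Simple FIS estimate; GENEPOP uses a different (Weir and Cockerham) estimator.
    fis = 1 - ho / he if he > 0 else float("nan")
    return {"n_alleles": len(freqs), "freqs": freqs, "ho": ho, "he": he, "fis": fis}

# Hypothetical biallelic locus typed in 36 animals with balanced alleles.
toy = [("A", "A")] * 10 + [("A", "G")] * 16 + [("G", "G")] * 10
summary = locus_summary(toy)
print(summary)
```

With 36 animals and two equally frequent alleles, the corrected he is 36/71, about 0.507, which is the upper limit a biallelic marker can reach at this sample size.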

Results

Thirty-six related animals were genotyped for 18 STRs and 116 SNPs. The animals belonged to a farm that uses artificial insemination (AI) and a natural multi-sire mating system. The exclusion of data with an auto-calling < 85% resulted in 4144 genotypes (32 missing data), with an average of 35.72 successful genotypes (range: 34–36) per locus. All of the SNPs analyzed were polymorphic (na = 2), while the STRs had a mean na of 5.22 ± 1.35 (mean ± SD; range: 3–8) (Table 1). The minor allele frequency (MAF) was > 0.05 in 114 of the 116 SNP markers, the exceptions being the SNPs ARS-USMARC-Parent-EF034087-no-rs and ARS-USMARC-Parent-AY842472-rs29001941. The SNP he values ranged from 0.028 to 0.507, with an average value of 0.417 (Table 1). For STRs, the he values ranged from 0.255 to 0.816, with an average of 0.640 (Tables S1 and S2). In total, 133 HWE tests were done (115 for SNPs and 18 for STRs), nine of which (five for SNPs and four for STRs) showed significant deviations (p < 0.05) from theoretical proportions (Tables S1 and S2). The allele frequencies for SNPs and STRs are available from the corresponding author upon request.
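The genotype counts reported above can be checked with a few lines of arithmetic. The sketch assumes that the 4144 figure refers to SNP calls only, which is an inference from 116 × 36 − 32 = 4144 rather than something the text states; the variable names are illustrative.

```python
# Figures taken from the text.
n_animals = 36
n_snps = 116
missing = 32

# Assumption: the reported genotype total covers SNP calls only.
possible = n_animals * n_snps          # 4176 possible SNP genotypes
called = possible - missing            # 4144 successful genotypes
mean_per_locus = called / n_snps       # average successful genotypes per SNP
call_rate = called / possible          # overall proportion of successful calls

print(called, round(mean_per_locus, 2), round(call_rate, 4))
```

Under that assumption the totals agree with the reported 4144 genotypes and the mean of 35.72 per locus.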

Table 1

Average number of alleles (na), unbiased expected heterozygosity (he), standard deviation of na and he, range of na and he among loci and FIS estimated for the SNP and STR sets of markers in inbred Angus cattle.

Marker type	na (range)	he (range)	FIS p value
SNP	2 ± 0* (2)	0.417 ± 0.0098 (0.028–0.507)	< 0.001
STR	5.22 ± 1.35 (3–8)	0.640 ± 0.015 (0.255–0.816)	< 0.001

* Mean ± SD.

Q was estimated for each SNP marker for the most common cases of genetic identification (two known parents, one known parent, missing parents and matching samples), while MP was calculated only for matching samples (Tables S3 and S4). As shown in Figure S2, the distribution of the number of SNPs based on their individual Q values yielded a logarithmic curve. In the case of matching samples, more than 50% of the SNPs had Q values > 0.60. When the genotypes of both parents were known, more than 50% of the SNPs had a Q value ≥ 0.17, while in the worst scenario (one known parent) this value was ≥ 0.10. In addition, Q was estimated for each whole set of markers by considering one and two mismatch criteria. The corresponding Q1 and Q2 values were > 0.999991 and > 0.9998 for SNPs and > 0.994 and > 0.957 for STRs, respectively; the MP values were 2.45E−42 and 3.0E−12 for SNPs and STRs, respectively (Table 2). Figures 1 and 2 and Tables S5 and S6 show the cumulative Q1, Q2 and MP values for all of the cases studied. These results show that it is necessary to analyze between eight (matching samples scenario) and 55 (one known parent) SNPs to achieve a Q1 ≥ 0.999 [or cumulative non-exclusion power (1 - Q) = 1.0E−4]. On the other hand, for STRs, three and more than 18 markers, respectively, are necessary. When using the Q2 ≥ 0.999 criterion, 10 (matching samples) and 79 (one known parent) SNPs are needed, whereas for STRs five and > 18, respectively, are required. Finally, in the population studied here, 24 SNPs or 11 STRs were necessary to obtain an MP of approximately 10⁻¹¹.
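How per-locus values combine into cumulative ones can be sketched as follows. The sketch assumes independent loci and Hardy-Weinberg genotype proportions, which may differ from the estimators the authors implemented after Weir (1996), so it is not expected to reproduce the tabulated values exactly. All function names and the idealized panel are hypothetical, and only the one-mismatch criterion is covered.

```python
from math import prod

def snp_match_probability(p):
    """Chance that two random animals share a genotype at a biallelic locus,
    assuming Hardy-Weinberg proportions (sum of squared genotype frequencies)."""
    q = 1 - p
    return p**4 + (2 * p * q) ** 2 + q**4

def cumulative_match_probability(freqs):
    """Product of per-locus match probabilities (loci assumed independent)."""
    return prod(snp_match_probability(p) for p in freqs)

def cumulative_exclusion(per_locus_q):
    """Combined exclusion power under the one-mismatch criterion:
    Q = 1 - product of the per-locus non-exclusion probabilities."""
    return 1 - prod(1 - q for q in per_locus_q)

def markers_needed(per_locus_q, target):
    """Number of markers, taken from most to least informative, needed for
    the cumulative Q to reach `target`; None if the panel never gets there."""
    non_exclusion = 1.0
    for k, q in enumerate(sorted(per_locus_q, reverse=True), start=1):
        non_exclusion *= 1 - q
        if 1 - non_exclusion >= target:
            return k
    return None

# Idealized panel: every SNP has both alleles at frequency 0.5.
ideal_q = [1 - snp_match_probability(0.5)] * 20   # 0.625 per SNP
print(markers_needed(ideal_q, 0.999))             # markers for Q >= 0.999
print(cumulative_match_probability([0.5] * 24))   # MP for 24 ideal SNPs
```

For this idealized panel the sample-matching target of Q ≥ 0.999 is reached with eight SNPs, the same count the text reports for the matching samples scenario.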

Table 2

Non-exclusion power (1 - Q) estimated for the whole set of SNPs and STRs considering one (Q1) and two (Q2) mismatch criteria for the cases of two known parents, one known parent, missing parents and matching samples. MP - match probability calculated for matching samples.

Locus type	N	Both parents, 1 - Q₁	Both parents, 1 - Q₂	One parent, 1 - Q₁	One parent, 1 - Q₂	Missing parent, 1 - Q₁	Missing parent, 1 - Q₂	Matching samples, 1 - Q₁	Matching samples, 1 - Q₂	MP
SNPs	116	1.4E-09	3.2E-08	1.6E-05	2.1E-04	4.0E-15	1.6E-13	< 4.1E-15	< 4.1E-15	2.4E-42
STRs	18	6.0E-05	9.0E-04	5.9E-03	4.2E-02	1.0E-08	3.0E-06	3.0E-14	3.0E-12	2.6E-14

Figure 1

Cumulative exclusion power (Q) calculated for SNPs considering (A) one (Q1) mismatch criterion and (B) two mismatch criteria (Q2) for cases of two known parents, one known parent, missing parents and matching samples. Markers are listed based on decreasing expected heterozygosity (he).

Figure 2

Cumulative exclusion power (Q) calculated for STRs considering (A) one (Q1) mismatch criterion and (B) two mismatch criteria (Q2) for cases of two known parents, one known parent, missing parents and matching samples. Markers are listed based on decreasing expected heterozygosity (he).

The minimum number of markers recommended by the ISAG for bovine genetic identification is 12 STRs. In our work, around 24 SNPs were necessary to achieve an MP (1.78E−11) equivalent to the standard marker set, and 31 SNPs (MP = 1.87E−14) were equivalent to the 18 STR set (Tables S5 and S6). For paternity testing, and when the two parents were known, 37 SNPs were needed for a Q value similar to the standard marker set. The resolution of more complex cases requires the use of additional markers. In these situations, such as one known parent or missing parents, around 39 and 49 SNPs are required, respectively, to obtain the same Q values as the 18 STRs (Figure 3).

Figure 3

Comparison of the cumulative exclusion power (Q) curves calculated for SNPs and STRs considering two mismatch criteria (Q2) for cases of two known parents and matching samples. Markers are listed based on decreasing expected heterozygosity (he).

Discussion

Unrelated animal sampling has been successfully used to determine breed genetic profiles in phylogeographic studies and to estimate general theoretical Q and MP values for DNA identification (traceability, parentage analysis, etc.). Several studies have evaluated and compared the Q and/or MP values obtained for STR and SNP sets (Table S7). Most of them used only representative (unrelated) purebred samples to determine the entire genetic diversity. For example, Heaton analyzed three composite bovine beef groups to identify SNPs useful for animal identification and paternity testing. Werner selected unrelated bulls belonging to three dairy or dual-purpose pure breeds to identify SNPs and estimate their respective allelic frequencies. López Herráez genotyped Galloway animals from different farms and used STRs and SNPs to compare the Q values in the identification of individuals and parental analysis. More recently, Karniol evaluated the statistical power of the 25-plex assay in traceability (identity control) and parentage testing by genotyping unrelated animals from six cattle breeds. These common approaches do not take into account population structure and consanguinity. Furthermore, most of the routine genotyping of livestock done in genetic laboratories consists of the analysis of highly related pedigree animals rather than unrelated animals from beef breeding or dairy farms. In this framework, a marker set should have enough exclusion power to resolve any possible situation, including cases of paternity with multi-putative consanguineous sires. In view of this scenario, and considering that there is generally much more experience in the use of STRs compared with SNPs, in this work we examined the amount of information obtained with SNP and STR markers for paternity testing and genetic identification within a consanguineous commercial Angus herd. 
Almost all of the SNPs examined were polymorphic, with a mean MAF of 0.328, while more than 50% of the SNPs had a high Q value because both alleles had balanced gene frequencies. These findings were not unexpected given that SNPs from the Illumina BovineHD BeadChip were validated in Angus breeds and showed a high rate of polymorphic loci (573,437 out of 770,000). Comparison of the mean MAF values for the parentage subset of 116 SNPs showed that our inbred population gave a result similar to that of the Red Angus (MAF = 0.327) and Angus (MAF = 0.346) samples used to validate the chips (ftp.illumina.com). These values ranked in the upper third of the distribution among 29 breeds (MAF = 0.135 to 0.395), as reported by the manufacturer. The average MAF of the parentage subset was greater than those reported for the entire SNP panel (0.13–0.27), perhaps because this subset had been carefully selected and highly curated for this purpose. The comparison of the two types of markers showed that, in the case of matching samples, two SNPs were necessary to provide the same statistical power as one STR (five STRs and 10 SNPs for a Q2 ≥ 0.999). In the parentage analysis, 2.55 SNPs had a Q value equivalent to one STR when both parents were known and the two-mismatch exclusion criterion (Q2) was used. In this case, 18 STRs and 46 SNPs were required to reach a Q2 ≥ 0.999. The SNP/STR ratios obtained here were similar to those reported by others using unrelated animals. For example, Werner observed that 37 SNPs provided the same power as a typical, commonly used microsatellite set, whereas Weller reported a ratio of 2–2.25 (25 SNPs were equivalent to 11 microsatellites with five alleles) using simulated data.
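The link between balanced allele frequencies and high per-SNP power, and the SNP/STR ratios quoted above, can be illustrated with a short sketch. It assumes Hardy-Weinberg proportions and uses the sample-matching power 1 − MP as the per-locus measure; the function name `identity_power` is hypothetical and the calculation is not the authors' spreadsheet.

```python
def identity_power(maf):
    """Per-SNP sample-matching power (1 - match probability) for a biallelic
    locus with minor allele frequency `maf`, assuming Hardy-Weinberg proportions."""
    p, q = maf, 1 - maf
    return 1 - (p**4 + (2 * p * q) ** 2 + q**4)

# Power rises as the two alleles become more balanced and peaks at MAF = 0.5.
for maf in (0.05, 0.20, 0.328, 0.50):
    print(f"MAF {maf:.3f}: Q = {identity_power(maf):.3f}")

# SNP/STR ratios implied by the marker counts quoted in the text.
ratio_matching = 10 / 5     # matching samples, Q2 >= 0.999
ratio_parentage = 46 / 18   # both parents known, Q2 >= 0.999 (about 2.56; the text quotes 2.55)
print(ratio_matching, round(ratio_parentage, 2))
```

Under these assumptions a single SNP cannot exceed a matching power of 0.625, which fits the observation that more than half of the panel had individual Q values above 0.60.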
More recently, Fisher , based on an analysis of simulated data and data from a test Jersey herd, indicated that 40 SNPs (with a mean MAF of 0.35, similar to that observed here) would be at least as effective for parentage matching as the 14 STR panel currently used for parentage testing in New Zealand dairy animals. With regard to the MP, our results agreed with previously published data in that 25 SNPs were equivalent to 11–12 STRs (MP ∼10⁻¹¹) (Table S7), sufficient to resolve simple cases of genetic identification. However, in routine work, more markers (17–18) are usually needed to resolve complicated cases such as parentage analysis with one known parent and multiple, closely related putative sires. As shown in Table S7, an MP value of 10⁻¹³ to 10⁻¹⁵ can be obtained by analyzing 17–18 STRs in a purebred breed, whereas 29–34 SNPs were required to reach an equivalent MP in our inbred Angus population. Interestingly, by using 12 and 18 STRs we achieved MP values of 10⁻¹¹ and 10⁻¹⁴, similar to those obtained with 24 and 31 SNPs, respectively. Recently, Baldo showed that in beef traceability ∼25% more microsatellite markers were needed to identify consanguineous animals vs. unrelated animals. In contrast, our results show that, in this same context, the number of SNPs needed to provide the same Q in consanguineous samples and in the Illumina reference samples would be similar. The difference between these two studies can be explained by the fact that biallelic SNP markers are less affected by consanguinity than multiallelic STRs. In this sense, consanguinity affects the number of alleles first and then gene diversity, thereby easily purging rare STR alleles. In conclusion, our results show that approximately twice as many SNP markers were needed to provide the same effectiveness as STRs for genetic identification and parentage analysis in a consanguineous Angus herd.
This ratio is similar to previously reported values and provides evidence that biallelic SNPs are apparently less affected by consanguinity and population structure than STRs. International collaborations by the ISAG and ICAR have sought to select and validate SNPs that can be used in a standard panel for genetic identification in cattle. The results described here provide genetic information that supports the consensus SNP panel developed by the Parentage Recording Working Group of ICAR.

References (10 of 32 shown)

1. Detection and characterization of SNPs useful for identity control and parentage testing in major European dairy breeds.

Authors: F A O Werner; G Durstewitz; F A Habermann; G Thaller; W Krämer; S Kollers; J Buitkamp; M Georges; G Brem; J Mosner; R Fries
Journal: Anim Genet Date: 2004-02 Impact factor: 3.169

2. Development of novel SNP system for individual and pedigree control in a Japanese Black cattle population using whole-genome genotyping assay.

Authors: Kazuhiro Hara; Yukari Kon; Shinji Sasazaki; Fumio Mukai; Hideyuki Mannen
Journal: Anim Sci J Date: 2010-08-01 Impact factor: 1.749

3. Effect of consanguinity on Argentinean Angus beef DNA traceability.

Authors: A Baldo; A Rogberg-Muñoz; A Prando; A S Mello Cesar; J P Lirón; N Sorarrain; P Ramelli; D M Posik; E Pofcher; M V Ripoli; E Beretta; P Peral-García; R Vaca; P Mariani; G Giovambattista
Journal: Meat Sci Date: 2010-03-21 Impact factor: 5.209

4. The number of single nucleotide polymorphisms and on-farm data required for whole-herd parentage testing in dairy cattle herds.

Authors: P J Fisher; B Malthus; M C Walker; G Corbett; R J Spelman
Journal: J Dairy Sci Date: 2009-01 Impact factor: 4.034

5. Use of bovine single nucleotide polymorphism markers to verify sample tracking in beef processing.

Authors: Michael P Heaton; James E Keen; Michael L Clawson; Gregory P Harhay; Nathan Bauer; Craig Shultz; Benedict T Green; Lisa Durso; Carol G Chitko-McKown; William W Laegreid
Journal: J Am Vet Med Assoc Date: 2005-04-15 Impact factor: 1.936

6. DNA-based paternity analysis and genetic evaluation in a large, commercial cattle ranch setting.

Authors: A L Van Eenennaam; R L Weaber; D J Drake; M C T Penedo; R L Quaas; D J Garrick; E J Pollak
Journal: J Anim Sci Date: 2007-09-18 Impact factor: 3.159

7. Microsatellite-based parentage control in cattle.

Authors: M L Glowatzki-Mullis; C Gaillard; G Wigger; R Fries
Journal: Anim Genet Date: 1995-02 Impact factor: 3.169

8. Development of a 25-plex SNP assay for traceability in cattle.

Authors: B Karniol; A Shirak; E Baruch; C Singrün; A Tal; A Cahana; M Kam; Y Skalski; G Brem; J I Weller; M Ron; E Seroussi
Journal: Anim Genet Date: 2009-03-09 Impact factor: 3.169

9. A review on SNP and other types of molecular markers and their use in animal genetics.

Authors: Alain Vignal; Denis Milan; Magali SanCristobal; André Eggen
Journal: Genet Sel Evol Date: 2002 May-Jun Impact factor: 4.297

10. A proposal for standardization in forensic equine DNA typing: allele nomenclature for 17 equine-specific STR loci.

Authors: L H P van de Goor; H Panneman; W A van Haeringen
Journal: Anim Genet Date: 2009-10-11 Impact factor: 3.169
