Viktoria Remer1, Elif Bozlak1,2, Sabine Felkel1,2,3, Lara Radovic1,2, Doris Rigler1, Gertrud Grilz-Seger1, Monika Stefaniuk-Szmukier4,5, Monika Bugno-Poniewierska5, Samantha Brooks6, Donald C Miller7, Douglas F Antczak7, Raheleh Sadeghi7,8, Gus Cothran9, Rytis Juras9, Anas M Khanshour9,10, Stefan Rieder11, Maria C Penedo12, Gudrun Waiditschka1, Liliya Kalinkova13, Valery V Kalashnikov13, Alexander M Zaitsev13, Saria Almarzook14, Monika Reißmann14, Gudrun A Brockmann14, Gottfried Brem1, Barbara Wallner1.
Abstract
The Y chromosome is a valuable genetic marker for studying the origin and influence of paternal lineages in populations. In this study, we conducted Y-chromosomal lineage tracing in Arabian horses. First, we resolved a Y haplotype phylogeny based on next-generation sequencing data of 157 males from several breeds. Y-chromosomal haplotypes specific for Arabian horses were inferred by genotyping a collection of 145 males representing most Arabian sire lines that are active around the globe. These lines formed three discrete haplogroups, and the same haplogroups were detected in Arabian populations native to the Middle East. The Arabian haplotypes were clearly distinct from those detected in Akhal Tekes, Turkoman horses, and the progeny of two Thoroughbred foundation sires. However, a haplotype introduced into the English Thoroughbred by the stallion Byerley Turk (1680) was shared among Arabians, Turkomans, and Akhal Tekes, which opens a discussion about the historic connections between Oriental horse types. Furthermore, we genetically traced Arabian sire line breeding in the Western World over the past 200 years. This confirmed a strong selection for relatively few male lineages and uncovered incongruences with written pedigree records. Overall, we demonstrate how fine-scaled Y-analysis contributes to a better understanding of the historical development of horse breeds.
Keywords: Arabian horse; Y chromosome; foundation sire; genotyping; haplotype; horse breeding; male genealogy; paternal lineage tracing; pedigree
Year: 2022 PMID: 35205275 PMCID: PMC8871751 DOI: 10.3390/genes13020229
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.141
Glossary.
| Term | Definition |
|---|---|
| allele | variant or alternative form of the DNA sequence at a given locus |
| clade | a branch on a phylogenetic tree that is formed from individuals of common descent; synonymous with ‘monophyletic group’ |
| crown haplogroup | very recently expanded horse MSY HG, predominant in modern breeds |
| founder/foundation sire | in this article this term is used to mean the earliest recorded paternal ancestor of an established sire line |
| genetic marker | a genetic marker is a DNA sequence with a known physical location on a chromosome that is polymorphic and thus informative for differentiating individuals |
| genotype | alleles possessed by an individual at one locus |
| haplotype (HT) | a set of DNA polymorphisms that are inherited together due to linkage; a Y-chromosomal haplotype is characterized by the allelic state at several markers |
| haplotyping | the laboratory process of determining the haplotype of an individual via the genotyping of appropriate markers |
| haplogroup (HG) | a monophyletic group of MSY HTs |
| indel | an insertion or deletion of bases in the genome that occurs at a specific genomic position |
| key variant | in this article, this term denotes the marker selected for haplotyping among markers with tautological readout |
| locus | a location in the genome; refers to any sequence or genomic region, including non-coding regions |
| mitochondrial DNA (mtDNA) | circular, double-stranded DNA located in the matrix of a mitochondrion; the mtDNA is inherited uniparentally from mother to offspring |
| modern breed | in this article, this term is used for breeds that were developed or created within the westernized industry of horse breeding, spanning the recent 200 years; here it applies to Arabians, Thoroughbreds, Warmbloods, Central European and British Coldbloods, Central European and British Ponies, Baroque Type, Iberian, and New World breeds |
| most recent common ancestor (MRCA) | the sequence status from which all HT variations in a clade descend |
| male-specific region of the Y chromosome (MSY) | the non-recombining region of the Y chromosome which is inherited as a single linkage group from father to sons |
| patriline | the line of descent traced through the paternal side of the pedigree; synonymous with ‘male-tail line’ |
| parsimony | a principle for inferring phylogenetic trees under which the preferred tree is the one that requires the fewest evolutionary changes |
| pedigree | the record of descent of a horse |
| phylogeny | a branching tree showing the evolutionary relationships among a set of DNA sequences |
| sequence assembly | recreation of the original genome from the sequenced reads |
| short tandem repeats (STRs) | a tract of repetitive DNA in which short DNA motifs (ranging in length from one to six or more base pairs) are repeated; synonymous with ‘microsatellites’ |
| single copy Y (scY) regions | well-explored regions of the MSY that are screenable with short-read data |
| single nucleotide variant (SNV) | a variant of a single nucleotide that occurs at a specific genomic position |
| sire line | members of a sire line descend from the same foundation sire in their patriline (male-tail line); hence the breeding influence of a foundation sire is apparent from the sire line distribution range |
| subline HT | those HTs which emerged recently, within the pedigree supported timeframe, from a new mutation and are therefore unique markers for a particular stallion |
| target enriched sequencing | NGS of specific (desired) regions of the genome |
| variant calling | identification of variation, mostly SNVs and Indels, present in sequence data by comparing NGS data from individuals with a reference sequence |
| Y chromosome | one of the two sex chromosomes; determines male sex in horses |
Figure A1. Target-enriched sequencing depth.
Versions of the NGS tools used in data analysis.
| Tool | Reference | Version |
|---|---|---|
| AdapterRemoval | [ | 2.3.1 |
| ReadTools | [ | 0.2.1.r_716422a3 |
| bwa | [ | 0.7.17 |
| samtools | [ | 1.10 |
| GenomeAnalysisTK | [ | 3.7 |
| bedtools | [ | 2.27.1 |
| IGV | [ | 2.5.3 |
| freebayes | [ | 1.3.2-46-g2c1e395 |
| Network | [ | 10.2 |
Figure A2. NGS data analysis pipeline. Two NGS data sets were merged to produce the final structure: mappings of 118 WGS males to LipY764 from a previous study and newly generated TES data of 39 males. The 77 novel variants ascertained in the TES dataset were merged with 2199 previously defined variants resulting in a total of 2267 variants. Those variants were genotyped in the mapping files, missing positions were imputed, and HTs were constructed by concatenating the polymorphic sites. HTs were visualized in a network format. A total of 118 variants were selected for genotyping. A detailed description of the workflow, including programs and parameters, is provided in the Materials and Methods section.
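The haplotype-construction step in this caption (concatenating the allelic states at polymorphic sites) can be illustrated with a minimal Python sketch. The sample names, alleles, and helper functions below are hypothetical and are not taken from the study; the sketch skips the imputation step and simply ignores missing calls, coded here as "N", when deciding whether a site is polymorphic.

```python
from collections import Counter

# Hypothetical genotype calls: sample -> allele at each variant site ("N" = missing).
genotypes = {
    "stallion_1": ["A", "C", "G", "T"],
    "stallion_2": ["A", "C", "G", "T"],
    "stallion_3": ["A", "T", "G", "T"],
    "stallion_4": ["A", "T", "G", "C"],
}


def polymorphic_sites(calls):
    """Return the indices of sites where more than one allele is observed."""
    n_sites = len(next(iter(calls.values())))
    keep = []
    for i in range(n_sites):
        observed = {alleles[i] for alleles in calls.values() if alleles[i] != "N"}
        if len(observed) > 1:
            keep.append(i)
    return keep


def build_haplotypes(calls):
    """Concatenate each sample's alleles at the polymorphic sites into one string."""
    sites = polymorphic_sites(calls)
    return {sample: "".join(alleles[i] for i in sites)
            for sample, alleles in calls.items()}


haplotypes = build_haplotypes(genotypes)
# Haplotype frequencies, e.g. for scaling circle sizes in a network plot.
frequencies = Counter(haplotypes.values())
```

In this toy data only the second and fourth sites vary, so each haplotype is a two-character string, and samples sharing a string share a haplotype.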
Explanation of HG/HT nomenclature. HG/HT-determining variants are provided in Figure 2 and Table S3.
| Major Crown Clades | Haplogroups | Subhaplogroups and Haplotypes Detectable with Genotyping | Comments Regarding Nomenclature |
|---|---|---|---|
| Clade A was first described in an Arabian horse ‘A’ | |||
| Ad-b | A draft: british | ||
| Ad-h | A draft: heavy | ||
| Am | A marchador | ||
| Ao-aA | A original: arabian Autochthonous | ||
| Ao-aA1a*, Ao-aA1a1, Ao-aA1a2, Ao-aA1a2a, Ao-aA1a3, Ao-aA1a4, Ao-aA1a5, Ao-aA1b, Ao-aA2, Ao-aA3 | |||
| Ao-aD | A original: arabian Duelmener | ||
| Ao-aD1, Ao-aD2 | |||
| Ao-aM | A original: arabian Marwari | ||
| Ao-n | A original: noriker | ||
| Clade H was first described in a Spanish (H) Sorraia horse | |||
| Hc | H china | ||
| Hs | H spanish | ||
| Hs-a | H spanish: autochthonous | ||
| Hs-b | H spanish: barb | ||
| Clade T was first described in Thoroughbreds (T) | |||
| Ta | T arabian | ||
| Ta* | |||
| Ta-s | T arabian: shagya | ||
| Ta-b | T arabian: bairactar | ||
| Ta-bA | |||
| Tb-d | T(horough)bred: darley | ||
| Tb-dM | T(horough)bred: darley Mambrino | ||
| Tb-dW | T(horough)bred: darley Whalebone | ||
| Tb-dW1, Tb-dW2, Tb-dW3, Tb-dW4 | |||
| Tb-o | T(horough)bred: other | ||
| Tb-oB | T(horough)bred: other Byerley/Godolphin | ||
| Tb-oB1*, Tb-oB1a, Tb-oB1b, Tb-oB1c, Tb-oB2, Tb-oB3*, Tb-oB3a*, Tb-oB3a1, Tb-oB3b*, Tb-oB3b1*, Tb-oB3b1a, Tb-oB3b1b, Tb-oB3b1c, Tb-oB4 | |||
| Tb-oL | T(horough)bred: other Lipizzan | ||
| Tk | T kladruber | ||
| Tu | T ubiquitous | ||
| Non-crown Haplogroups | |||
| I | Icelandic horse | ||
| N | Northern Europe | ||
| P | Przewalski | ||
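The labels in this table can be handled programmatically, as in the following rough Python sketch. It assumes that the first letter of a crown label names its major clade (A, H, or T) and that a more specific label extends the label of its parent group, with a trailing asterisk ignored. This prefix rule is an assumption read off the table, not a rule stated by the authors, and both function names are hypothetical.

```python
CROWN_CLADES = ("A", "H", "T")       # major crown clades listed in the table
NON_CROWN_HGS = ("I", "N", "P")      # non-crown haplogroups listed in the table


def major_group(label: str) -> str:
    """Classify a HG/HT label by its leading letter (assumed convention)."""
    if not label:
        raise ValueError("empty label")
    first = label[0]
    if first in CROWN_CLADES:
        return f"crown clade {first}"
    if label in NON_CROWN_HGS:
        return f"non-crown haplogroup {label}"
    raise ValueError(f"unrecognized label: {label}")


def is_nested_in(label: str, parent: str) -> bool:
    """Assumed prefix rule: a label belongs to a parent group if it extends
    the parent's label; trailing asterisks are ignored for the comparison."""
    return label.rstrip("*").startswith(parent.rstrip("*"))
```

For example, `is_nested_in("Tb-oB3b1a", "Tb-oB3b1*")` is true under this rule, while `is_nested_in("Ta-b", "Tb-o")` is false.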
Figure 1. NGS HT Network. The MSY HT network from 118 WGS sequenced males and 39 TES males, based on a total of 1639 variants (281 crown, 1358 outside the crown). HTs are indicated as circles, with circle size being proportional to frequency. HT-IDs and sample information are provided in Table S1. HTs first described in this study based on TES data are shown in bold. Variants are indicated on branches and underlined when selected for genotyping. The position of the crown MRCA is marked with a cross. The 14 crown HGs are indicated in the outer circle, with the breeds listed beside them. Blue HTs were detected in Arabians, and light blue HTs were detected in a horse that traced back to an imported Arabian in the paternal lineage. The signatures of the three founders of the English Thoroughbred (Tb-oB1, Tb-oB31, and Tb-d) are marked with red lines.
Y chromosome haplotypes in occidental Arabian horse lines. Full information about the 145 males sampled (81 registered Arabians and 67 from other breeds), including the tail-male line, is given in Table S5. See Figure 2 for genetic relationships among the haplotypes.
| Foundation Sire ¹ | Imported | Line Represented | Registered Arabians | Breed | Y-Chromosomal Haplotype | Remarks |
|---|---|---|---|---|---|---|
| | 1885 Hungary | Piolun, 1934, Poland | Austria(1), Russia(5) | Trak(2) | Ao-aA1a* | |
| | India, | Jussuf I, 1962, Shagya Arabian | | ShA(2) | Ao-aA1a* | |
| | Bahrain | Dhahmaan Alawwal, 1938, Bahrain | Austria(2), Poland(1) | | Ao-aA1a* | |
| | 1840 Germany | Rex II 372, 1941, | | FH(4) | Ao-aA1a* | |
| | Egyptian | Mahruss II, 1893, Egypt | Germany(1) | | Ao-aA1a* | |
| | 1900 Poland | Aquinor, 1951, Poland | Poland(3) | | Ao-aA1a* | |
| | Egyptian | Ansata Ibn Halima, 1958, Egypt | Egypt(1), Qatar(3), Poland(2), | | Ao-aA1a1 | |
| | Egyptian | Akhtal, 1968, Egypt | Qatar(2) | | Ao-aA1a1 | |
| | 1814 Austria | Siglavy Monterosa, 1907, | | Lip(7), Kl(1) | Ao-aA1a2 | |
| | 1907 Poland | Negatiw, 1945, Russia | Austria(1), Iran(2), Poland(2), | | Ao-aA1a3 | |
| | 1902 Hungary | Siglavy Bagdady VI, 1949, Babolna | Germany(1) | ShA(1), PA(1) | Ao-aA1a5 | |
| | 1852 Hungary | Dahoman XVI, 1904, | | ShA(1) | Ao-aA3 | |
| | Egyptian Foundation horse | Anter, 1946, Egypt | Germany(1) | | Ao-aD2 | |
| | 1876 Poland | Enwer Bey, 1923, Poland | Poland(2) | | Ao-aD2 | |
| | 1923 Great Britain | Bey Shah, 1976, USA | Qatar(1), Poland(2) | | Ao-aD2 | |
| | 1931 Poland | Bask, 1956, Poland | Poland(2), Qatar(1) | PA(1) | Ao-aD2 | |
| | 1897 Hungary | Habdan XI, 1954, | | ShA(1) | Ao-aD2 | |
| | 1852 Lipizza | Gazal VII, 1944, Shagya Arabian | | ShA(6) | Ta* | |
| | 1885 Hungary | O’Bajan VII-4 530, 1936, | | ShA(2) | Ta* | |
| | 1836 Hungary | Shagya IV, 1875, | | ShA(3) | Ta-s | |
| | 1902 Hungary | Mersuch IV, 1936, Shagya Arabian | | ShA(1) | Ta-b | |
| | 1817 Germany | Arax, 1952, Poland | Poland(1), Qatar(1), Russia(1) | | Ta-b | |
| | 1902 Poland | Saludo, 1954, Spain | Qatar(1) | PA(1) | Ta-b | |
| | 1908 Spain | Tabal, 1952, Spain | Germany(1) | | T2* | |
| | 1931 Poland | Comet, 1953, Poland | Germany(1), Poland(1) | | Tb-oB1* | |
| | 1909 France | Baroud II, 1927, France | Qatar(3), Russia(1) | | Tb-oB1* | |
1 Name of Foundation Sire, Strain, Bedouin tribe, or breeder (country). 2 The number of horses sampled is given in parentheses. 3 The number of horses sampled is given in parentheses. Abbreviations: Trakehner (Trak), Shagya Arabian (ShA), Fredriksborg Horse (FH), Knabstrupper (Ks), Warmblood (Wb), Lewitzer (Le), Lipizzaner (Lip), Kladruber (Kl), Partbred Arabian (PA), Pintabian (Pi). 4 Incongruence between tail-male line documentation and HT.
Figure 2. MSY HTs in Arabian sire lines. Simplified crown HT network based on 118 variants. Genotyping results from 145 males are shown in blue circles with size proportional to frequency. Details on samples are given in Table 1 and Table S5 (samples are indicated in Column J; foundation sires are shown in Column AI). Foundation sires are shown for each HT, with the number of samples for each line in parentheses. HTs/foundation sires that are only active in breeds other than Arabians are given in light blue. Genealogies of the English Thoroughbred founders are outlined with red crosses and the Thoroughbred-specific subhaplogroups with the branches in red.
Figure 3. Genealogical cases. (a) Paternal genealogies of 15 genotyped male horses after Siglavy, 1810. (b) Paternal genealogies of eight genotyped male horses after Ilderim db. Dotted lines indicate that at least one generation is omitted. Abbreviations of horse breeds other than Arabian: L = Lipizzaner, ShA = Shagya Arabian, Trak = Trakehner, AA = Anglo Arabian. The number of genotyped horses and HTs is listed at the bottom (dark HTs were detected in Arabians, light blue HTs in other breeds). The complete tail-male line reconstruction is provided in Table S5.
Figure 4. MSY haplotypes in globally active Arabian lines, Middle Eastern Arabians, and other breeds. Haplogroup (bold) and haplotype distribution in breeds or breed groups in absolute numbers (N = total number). Thoroughbred HG/HTs are marked in red, Arabian in blue, and Akhal Teke/Turkoman in yellow. The Tb-oB1* subhaplogroup was detected in Thoroughbreds, Akhal Teke/Turkoman, and in a small subset of Arabians, but the ancestry of horses carrying this haplogroup remains unresolved.