Literature DB >> 33196682

Selection signatures in two oldest Russian native cattle breeds revealed using high-density single nucleotide polymorphism analysis.

Natalia Anatolievna Zinovieva1, Arsen Vladimirovich Dotsev1, Alexander Alexandrovich Sermyagin1, Tatiana Evgenievna Deniskova1, Alexandra Sergeevna Abdelmanova1, Veronika Ruslanovna Kharzinova1, Johann Sölkner2, Henry Reyer3, Klaus Wimmers3, Gottfried Brem1,4.   

Abstract

Native cattle breeds can carry specific signatures of selection reflecting their adaptation to the local environmental conditions and response to the breeding strategy used. In this study, we comprehensively analysed high-density single nucleotide polymorphism (SNP) genotypes to characterise the population structure and detect the selection signatures in Russian native Yaroslavl and Kholmogor dairy cattle breeds, which have been little influenced by introgression with transboundary breeds. Fifty-six samples of pedigree-recorded purebred animals, originating from different breeding farms and representing different sire lines, of the two studied breeds were genotyped using a genome-wide bovine genotyping array (Bovine HD BeadChip). Three statistical analyses-calculation of fixation index (FST) for each SNP for the comparison of the pairs of breeds, hapFLK analysis, and estimation of the runs of homozygosity (ROH) islands shared in more than 50% of animals-were combined for detecting the selection signatures in the genome of the studied cattle breeds. We confirmed nine and six known regions under putative selection in the genomes of Yaroslavl and Kholmogor cattle, respectively; the flanking positions of most of these regions were elucidated. Only two of the selected regions (localised on BTA 14 at 24.4-25.1 Mbp and on BTA 16 at 42.5-43.5 Mb) overlapped in Yaroslavl, Kholmogor and Holstein breeds. In addition, we detected three novel selection sweeps in the genome of Yaroslavl (BTA 4 at 4.74-5.36 Mbp, BTA 15 at 17.80-18.77 Mbp, and BTA 17 at 45.59-45.61 Mbp) and Kholmogor breeds (BTA 12 at 82.40-81.69 Mbp, BTA 15 at 16.04-16.62 Mbp, and BTA 18 at 0.19-1.46 Mbp) by using at least two of the above-mentioned methods. We expanded the list of candidate genes associated with the selected genomic regions and performed their functional annotation. We discussed the possible involvement of the identified candidate genes in artificial selection in connection with the origin and development of the breeds. Our findings on the Yaroslavl and Kholmogor breeds obtained using high-density SNP genotyping and three different statistical methods allowed the detection of novel putative genomic regions and candidate genes that might be under selection. These results might be useful for the sustainable development and conservation of these two oldest Russian native cattle breeds.

Entities:  

Mesh:

Year:  2020        PMID: 33196682      PMCID: PMC7668599          DOI: 10.1371/journal.pone.0242200

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

National livestock genetic resources are the most important carriers of the unique forms of variability; they are necessary to ensure the sustainability of local agricultural production systems [1]. In Russia, over a long period, various cattle breeds have formed, which are characterised by good adaptation to the local environmental conditions and better suitability for the development of geographic production systems [2]. Among the Black-and-White cattle breeds, Kholmogor and Yaroslavl dairy cattle have remarkable importance for animal husbandry in the Central and Northern regions of European Russia. Both breeds originated from small-stature Great Russian Cattle (height at withers is 105–110 cm, body weight up to 325–390 kg [3, 4]. The first records of the presence of cattle on the islands in the upper reaches of the Northern Dvina River near the Kholmogory town in Northern Russia date back to the sixteenth century (the breed was named after this town) [5]. Over a long history, the good pastures and hayfields in the Kholmogor breeding zone have contributed to the development of the tall cattle breed well suited for dairy production. In the 18th–19th centuries, the Kholmogor cattle were well-known in Russia and abroad owing to their high productivity and excellent quality of dairy products. Cattle breeding began to develop in the Yaroslavl region (near Moscow), based on which the breed is named, in the seventeenth and eighteenth centuries [5]. The Yaroslavl cattle were selected for adaptability, high feed efficiency and good reproductive ability even in poor forage conditions during winter. Moreover, this breed was in great demand among peasants and small cattle holders [4]. Most researchers suggested that foreign breeds, including Holsteins contributed little in the development of the Yaroslavl and Kholmogor cattle populations [6, 7], which was subsequently confirmed by molecular genetics studies [8-10]. The studies of the historical specimens of Yaroslavl and Kholmogor cattle, dated from the end of 19th to the first half of 20th century using microsatellites, indicated that Russian breeds and Holsteins were developed from different ancestral populations [11]. In the 1960s, the number of Kholmogor and Yaroslavl cattle was close to million individuals (993 and 951 thousand, respectively) each; in contrast, in 2015, their census population size remarkably decreased to 222.9 and 50.5 thousand heads, respectively [12]. During the last decade, along with the remarkable decrease in population size, crossbreeding has been actively used in the remaining purebred animals, which could lead to the loss of genetic distinctness. Several molecular genetic studies have attempted to characterise the demographic history and population structure of the modern populations of Yaroslavl and Kholmogor breeds and their relationship to other taurine breeds [9, 10, 13, 14], but their results were, in some cases, controversial. By using microsatellite-based clustering, Li and Kantanen [13] suggested the composite origin of these breeds. Whole-genome studies performed using medium-density SNP chips showed that Yaroslavl and Kholmogor cattle are valuable national genetic resources, which have been little influenced by introgression with non-Russian breeds [9, 10]. The maintenance of a visible part of historical components in the modern populations both of Yaroslavl and Kholmogor breeds was confirmed by the study of historical specimens [11]. Further elucidation of the genetic architecture of these two breeds and more precise identification of genomic regions that are affected by putative selection may be achieved by using more powerful molecular genetic tools such as bovine high-density SNP BeadChip (Illumina Inc., USA) which contains 777,962 SNPs. Several studies have explored the genome diversity and adaptation of cattle breeds by using high-density DNA arrays [15-17]. However, the patterns of genetic variation within Russian cattle breeds have not yet been assessed using this powerful genetic tool. Identification of the genome regions exhibiting evidence of being subjected to selective pressure during the breeding process is of particular interest [18]. Detailed analysis of these ‘signatures of selection’ can help to elucidate the breed history and to identify genes responsible for the adaptive and economically significant traits in livestock populations [19]. Studies have highlighted the importance of utilising multiple methodologies for detecting regions of the genome that have been the target of selection [20], including fixation index (FST), hapFLK analysis and identification of runs of homozygosity (ROH) islands. FST quantifies the differences in allele frequencies between populations [21, 22] and is routinely used for identifying highly differentiated alleles [23, 24]. The hapFLK statistics is a haplotype-based method for the detection of positive selection in multiple population data [25]. ROH islands are genomic regions with high homozygosity around the selected locus that might harbour targets of positive selection [26-28]. Several studies have focused on identifying signatures of selection in transboundary and local cattle populations by using high-throughput SNP genotypes and FST [29-31], hapFLK, and ROH statistics [32]. However, only a limited number of studies have thus far attempted to identify signatures of selection in Russian cattle breeds, and they were based on the medium-density SNP genotypes [33, 34]. In this study, we performed the comprehensive analysis of high-density SNP genotypes to characterise the population structure and detect the signatures of selection in two old Russian cattle breeds (Yaroslavl and Kholmogor), by using Holsteins as the reference. The three methods (FST, hapFLK, and ROH) were implemented to detect the genome regions that could be under putative selection. We identified the regions revealing signatures of selection in each of the two breeds, annotated the genes localised within or close to the selected regions, and performed a functional study of these genes. Our results might be useful for the genetic improvement of Yaroslavl and Kholmogor cattle breeds and developing programs for the conservation of their biodiversity.

Materials and methods

Ethics statement

This study does not involve any endangered or protected species. Samples were derived in part from the Bioresource Collection of the L.K. Ernst Federal Science Center for Animal Husbandry, supported by the Russian Ministry of Science and Higher Education. Blood samples (5 ml of whole blood) were collected for routine diagnostic purposes by trained personnel under strict veterinary rules. Sperm samples were provided by the artificial insemination (AI) stations according to specific scientific collaboration agreements. The genotyping data supporting the results of this study are deposited in the Dryad Digital Repository (https://doi.org/10.5061/dryad.vt4b8gtq9).

Sampling and DNA extraction

The pedigree records were analysed to identify the purebred animals of Yaroslavl (YRSL) and Kholmogor (KHLM) breeds (S1 Fig). Only individuals having purebred ancestors on the both sides of pedigrees at least in three generations were selected for the analyses. To cover more breed variability, we sampled the animals originating from different breeding farms and representing different sire lines for each of the two breeds. Blood or semen samples were collected from 56 animals, including 31 samples of Yaroslavl breed and 25 samples of Kholmogor breed. DNA was extracted using Nexttec columns (Nexttec Biotechnology GmbH, Germany), according to manufacturer’s instructions. The concentrations of dsDNA solutions were measured using a Qubit 3.0 fluorometer (Thermo Fisher Scientific, Wilmington, DE, USA). The purity of DNA samples was checked by determining the OD ratio by using NanoDrop-2000 (Thermo Fisher Scientific, Wilmington, DE, USA). In addition, the DNA quality was checked using 1% agarose gel electrophoresis.

SNP genotyping and quality control

The Illumina Bovine HD BeadChip (Illumina, San Diego, CA, USA) was used for genotyping. High-density SNP genotypes for European Holsteins (n = 25) obtained from Bahbahani et al. [16, 35] were included in the dataset as the reference. R software [36] was used to create input files. Valid genotypes for each SNP were determined by setting the cut-off of 0.5 for the GenCall (GC) and GenTrain (GT) scores [37]. PLINK 1.9 software [38] was used to perform SNP quality control. All animals were subjected to filtering for genotyping efficiency (—mind 0.1). The SNPs genotyped in less than 90% of the samples (—geno 0.1) and those located on sex chromosomes were excluded from the analysis. The final data set used for analysis included 650,134 autosomal SNPs. Additional filters for linkage disequilibrium (LD) and minor allele frequency (MAF) were used to calculate several statistical parameters (see below). The positions of SNPs were assigned according Bos taurus genome assembly UMD 3.1.1 (https://www.ncbi.nlm.nih.gov/assembly/GCF_000003055.6).

Genetic diversity

Within-population genetic diversity was evaluated using 133,685 SNPs after LD-based pruning by using PLINK 1.9 [38]. One locus from each SNP pair where the LD (r) exceeded 0.5 within 50 SNP windows was removed (—indep-pairwise 50 5 0.5 flag, where 50 is the size of the sliding window, 5 is the number of SNPs shifted in each step, and 0.5 is the r threshold). We calculated the observed heterozygosity (H), unbiased expected heterozygosity (H) [39], rarefied allelic richness (A), and inbreeding coefficient (F) based on the unbiased expected heterozygosity by using R package “diveRsity” [40]. The genomic inbreeding coefficient based on ROH (F) was computed as the ratio of the sum of the length of all ROHs per animal to the total autosomal SNP coverage (2.51 Gb; for ROH estimation, see section ‘Runs of homozygosity’ below).

Effective population size

Trends of effective population size were estimated from LD by using the SNeP tool, as described by Barbato et al. [41]. We applied default parameters, except for sample size correction, mutation occurrence (α = 2.2) [42], and recombination rate between a pair of genetic markers, according to Sved and Feldman [43]. To determine the rate of N changes in the 50 most recent generations, we determined the N changing ratio (NC) by calculating the slope of each segment that links a pair of neighbouring N estimates and normalising the values by using the median of the most recent 50 N estimates. Only SNPs with an MAF of >0.05 were used to estimate the effective population size.

Runs of homozygosity

A window-free method for consecutive SNP-based detection (consecutive runs method [44] implemented in R package “detectRUNS” [45] was used for the estimation of ROHs. We allowed one SNP with missing genotype and up to one possible heterozygous genotype in one run to avoid the underestimation of the number of ROHs that were longer than 8 Mb [46]. The minimum ROH length was 1000 kb. For minimising false-positive results, we calculated the minimum number of SNPs (l) as was initially proposed by Lencz et al. [47] and then by Purfield et al. [48]: ns = the number of genotyped SNPs per individual; ni = the number of genotyped individuals; α = the percentage of false-positive ROHs (set to 0.05 in our study), and het = the mean heterozygosity across all SNPs. In our case, the minimum number of SNPs was 31. The ROH number was estimated for each individual and then distributed between the following ROH length classes: (1, 2), (2, 4), (4, 8), (8, 16), and >16. To estimate the proportion of genome covered by different ROH segments, we calculated the sum of ROHs for different minimum lengths (>1, >2, >4, >8, or >16 Mb). The number and length of ROHs were averaged per animal within each breed.

Principal component analysis, Neighbor-Net, and admixture

To characterize the relationship within and between populations at the genomic level and to assess the individuals with regard to their assignment to the corresponding breed, we performed principal component analysis (PCA), Neighbor-Net clustering, and model-based clustering (admixture). PLINK v1.9 software was used to perform PCA, and R package “ggplot2” was used to visualise the results [49]. A Neighbor-Net tree was constructed on the basis of pairwise identical-by-state (IBS) distances in SplitsTree4 [50]. We employed Admixture v1.3 [51] for genetic admixture and R package “pophelper” [52] for plotting the results. The number of assumed ancestral populations (K) was estimated on the basis of the lowest cross-validation (CV) error compared to other K values by using a standard admixture cross-validation procedure [53].

Selection signature analysis

We used three different statistics for detecting the signatures of selection in the genome of the studied cattle populations: calculation of the fixation index (F) for each SNP for comparison of the pairs of breeds, hapFLK analysis, and estimation of the ROH islands, which were overlapped among different animals within each breed. We calculated pairwise genetic differentiation (F) [53] between all SNPs in pairs of the studied cattle breeds (KHLM/HLST, YRSL/HLST, and KHLM/YRSL) by using PLINK 1.9. We applied a low threshold for MAF of less than 5% (—maf 0.05), because the filtering of SNPs based on MAF may affect the probability of identifying alleles related to selection [33]. The data set used for F analysis included 567,206 autosomal SNPs. The top 0.1% F values were used to represent the selection signature, as was previously considered by Kijas et al. [18] and Zhao [24]. The hapFLK method considers the haplotype structure of the population [25]. We used hapFLK 1.4 program [25] to detect the signatures of selection through haplotype differentiation among the studied populations. The number of haplotype clusters per chromosome was determined in fastPHASE by using cross-validation-based estimation and was set at 25 [54]. For detailed analyses, we selected the hapFLK regions containing at least one SNP with p-value threshold of 0.001 (-log10(p) > 3). The hapFLK regions carrying at least one SNP with p-value threshold of 0.00001 (-log10(p) > 5) were defined as the strongest hapFLK regions. To limit the number of false positives, we calculated q-values by using R package qvalue [55]. We applied a q-value threshold of 0.05 (corresponds to a false discovery rate of 5%). We used R package “detectRUNS” [45] for ROH estimation. We selected the overlapping ROH segments with the minimal ROH length of 0.3 Mb shared by more than 50% of the samples as an indicator of possible ROH islands in the genome. Special attention was paid to the ROH islands shared by more than 70% of the samples, which were defined as the strongest ROH islands. The results obtained by the three different methods (F, hapFLK, and ROH) were compared, and genomic regions selected by at least two different methods were identified. Moreover, we compared our results with the data previously obtained by Yurchenko et al. [34] by using de-correlated composite of multiple signals (DCMS) statistics [56] for the presence of the overlapping genomic regions. Besides, we compared the identified genomic regions with the meta-assembly of selection signature in cattle [57].

Identification of genes and gene function within selected regions of the genome

The F statistic windows of 0.4 Mb (0.2 Mb upstream and 0.2 Mb downstream of the top 0.1% SNPs) were investigated to select overlapping gene segments. For hapFLK and ROH analyses, the genes were selected if they were localised entirely or in part within the identified genomic regions. Gene identification was performed using Bos_taurus_UMD_3.1.1 genome assembly (). The selected genes were considered as candidate genes. In addition, genes were compared with quantitative trait locus (QTL) regions included in the Cattle QTL database () [58]. Phenotypes that are known to be associated with the identified candidate genes were inferred from the most recent literature, by integrating PubMed database of the National Center of Biotechnological Information (). The information for the annotation of some specific genes was sourced from Gene Cards ().

Results

The estimation of the average heterozygosity within each breed indicated significant genetic diversities among the studied breeds (Table 1).
Table 1

Summary statistics for the studied breeds calculated on the basis of the high-density SNP genotypes.

BreedAcronymnHOUHEUFIS [CI 95%]FROHARNE5
YaroslavlYRSL310.354±0.00010.349±0.0001-0.013 [-0.014; -0.012]0.103±0.0061.967±0.0001111
KholmogorKHLM250.370±0.00010.361±0.0001-0.022 [-0.023; -0.021]0.059±0.0031.977±0.0001130
HolsteinHOL250.358±0.00010.354±0.0001-0.01 [-0.011; -0.009]0.108±0.0061.966±0.000195

Note: n, the number of individuals; HO, observed heterozygosity; UHE, unbiased expected heterozygosity; UFIS, the inbreeding coefficient; FROH, inbreeding coefficient, calculated on the basis of ROH; AR, allelic richness; NE5, effective population size 5 generations ago.

Note: n, the number of individuals; HO, observed heterozygosity; UHE, unbiased expected heterozygosity; UFIS, the inbreeding coefficient; FROH, inbreeding coefficient, calculated on the basis of ROH; AR, allelic richness; NE5, effective population size 5 generations ago. The average level of UHE for the Yaroslavl breed was lower, whereas that for the Kholmogor breed was higher than that of Holsteins. The calculation of UFIS indicated heterozygote excess in all populations beyond the expected number of heterozygotes. The UFIS values were similar for Yaroslavl and Holstein breeds, whereas the Kholmogor breed had higher heterozygote excess. The estimation of inbreeding coefficients based on FROH showed a similar pattern. That is, the FROH values for Yaroslavl and Holstein breeds were significantly higher than those of the Kholmogor breed (0.103 and 0.108 vs 0.059, p < 0.001). Higher FROH values are associated with increased degree of inbreeding. We did not observe differences in allele richness (AR) between Yaroslavl and Holstein breeds; in contrast the Kholmogor breed had higher AR values. The Yaroslavl and Kholmogor breeds had higher effective population size (NE5 = 111 and 130, respectively) than that of Holsteins (NE5 = 95) (Table 1). The NE values for Kholmogor breed declined over time, whereas the lowest NE value was observed for Yaroslavl breed 5 generations ago (NE = 111) with the subsequent increase to 118 at 3 generation ago (Fig 1A, S2 Fig).
Fig 1

The effective population size (N) across generations from approximately 50 generations ago based on linkage disequilibrium (LD) calculations (a) and NE slope (b). Note: Breed: YRSL, Yaroslavl; KHLM, Kholmogor; HOL, Holsteins.

The effective population size (N) across generations from approximately 50 generations ago based on linkage disequilibrium (LD) calculations (a) and NE slope (b). Note: Breed: YRSL, Yaroslavl; KHLM, Kholmogor; HOL, Holsteins. All the three studied breeds had a similar pattern of the NE slope changes up to 7 generations ago. We observed 3 common peaks in NE changes characterised by accelerated decline of the effective population size 23, 15, and 7 generations ago. The most recent peak in NE for Kholmogor breed was found 4 generations ago (Fig 1B).

Estimates of ROHs and FROH

The ROH segments were found in all the studied breeds, with an average number of 71.84 ± 3.27 segments in Yaroslavl breed and 58.96 ± 2.38 segments in Kholmogor breed and mean ROH length of 3.59 and 2.50 Mb, respectively. In Holsteins, the values of the above-mentioned estimations were 74.60 ± 2.28 segments and 3.62 Mb, respectively. Short segments (ROH1–2 Mb), caused by common ancestors around 25 to 50 generations ago, were the most distributed through the genome and accounted for 38.73% in Yaroslavl breed and 57.26% in Kholmogor breed of all ROHs identified, although the proportion of the genome covered by ROHs of this length class was relatively small (15.0% and 31.1%, respectively). In Holsteins, 44.97% of ROHs were characterised by the shortest ROH segments, which covered 17.2% of the genome (Fig 2, S1 Table).
Fig 2

Descriptive statistics of runs of homozygosity (ROH) by ROH length class.

Note: Breed: YRSL, Yaroslavl; KHLM, Kholmogor; HOL, Holsteins. (A). Number of ROHs by ROH length class: axis X, ROH classes (1–2 Mb, 2–4 Mb, 4–8 Mb, 8–16 Mb, and >16 Mb); axis Y, mean number of ROHs; (B). Length of ROHs by ROH length class: axis X, ROH classes (>1 Mb, >2 Mb, >4 Mb, >8 Mb, and >16 Mb); axis Y, mean length of ROHs.

Descriptive statistics of runs of homozygosity (ROH) by ROH length class.

Note: Breed: YRSL, Yaroslavl; KHLM, Kholmogor; HOL, Holsteins. (A). Number of ROHs by ROH length class: axis X, ROH classes (1–2 Mb, 2–4 Mb, 4–8 Mb, 8–16 Mb, and >16 Mb); axis Y, mean number of ROHs; (B). Length of ROHs by ROH length class: axis X, ROH classes (>1 Mb, >2 Mb, >4 Mb, >8 Mb, and >16 Mb); axis Y, mean length of ROHs. The proportion of ROH segments with the greatest length (>16 Mb), typically caused by inbreeding to a very recent ancestor, as in parent–offspring, half-sib mating, or first cousin mating, was extremely low in the Yaroslavl and Holstein breeds (0.65% and 0.10% of the total number of ROHs, respectively). The genome coverage by the longest ROHs was the lowest and averaged 9.6% in the Yaroslavl breed. We did not find long ROH segments (>16 Mb) in the Kholmogor breed.

Breed relationship and admixture

The results of PCA revealed well-separated clusters corresponding to each of the three breeds. The first component, which explained 7.02% of the genetic variability, split the Yaroslavl breed from the Kholmogor breed and Holsteins, suggesting an early divergence of Yaroslavl from old Friesians; in contrast, the Kholmogor breed was separated from Holsteins by the second component, which was responsible for 5.03% of diversity (S3A Fig). A Neighbor-Net tree based on pairwise IBS distances showed breed-specific distribution of individuals between three branches, which joined individuals of the same breed (S3B Fig). To identify ancestry and admixture level among the studied breeds, we performed admixture analysis for the number of clusters (K) from 1 to 5 (S3C Fig). At K = 2, the Yaroslavl breed was clearly divided from Holsteins, confirming its earlier divergence. At K = 3 (corresponding to the minimal value of CV error; S3D Fig), each of the three breeds formed their own clusters. Only slight level of admixture was observed in the several animals of Yaroslavl breed, whereas the Kholmogor breed was characterised by slightly greater level of admixture.

FST statistics

We observed variation in genetic differentiation between breeds, based on FST through the genome (Fig 3A–3C).
Fig 3

Genomic distribution of FST values estimated between pairs of breeds.

(A). Yaroslavl and Kholmogor breeds. (B). Yaroslavl and Holstein breeds. (C). Kholmogor and Holstein breeds. Note: Axis X, cattle autosomes (the breadth of autosomes corresponds to their length); axis Y, FST values. SNPs were plotted relative to their positions within each autosome. The horizontal line on each plot indicates the threshold, which was estimated as the top 0.1% for FST values. The SNPs with FST beyond the cut-off value were considered to be under selection pressure.

Genomic distribution of FST values estimated between pairs of breeds.

(A). Yaroslavl and Kholmogor breeds. (B). Yaroslavl and Holstein breeds. (C). Kholmogor and Holstein breeds. Note: Axis X, cattle autosomes (the breadth of autosomes corresponds to their length); axis Y, FST values. SNPs were plotted relative to their positions within each autosome. The horizontal line on each plot indicates the threshold, which was estimated as the top 0.1% for FST values. The SNPs with FST beyond the cut-off value were considered to be under selection pressure. The average pairwise FST values between breeds were 0.090 for Yaroslavl and Kholmogor breeds, 0.107 for Yaroslavl and Holstein breeds, and 0.081 for Kholmogor and Holstein breeds. We identified 561, 564, and 565 SNPs with FST beyond the cut-off value for the above-mentioned pairs of breeds, respectively, and thus considered them to be under selection pressure (S2 Table). The selected SNPs included single SNPs as well as SNP blocks consisting of several neighbour SNPs. The greatest number of top 0.1% SNPs for YRSL/HOL comparison was obtained on chromosomes 2, 16, 6, 11, 1, 26, 5, 20, 17, and 18 (74, 45, 39, 37, 33, 33, 30, 30, 28, and 24 SNPs, respectively), and the lowest was obtained on chromosomes 25, 22, 9, 27, and 21 (5, 3, 2, 1, and 0 SNPs, respectively). The KHLM/HOL comparison revealed the greatest number of top 0.1% SNPs on chromosomes 20, 8, 29, 2, 7, 5, 16, 1, 10, and 6 (60, 59, 48. 46, 34, 30, 30, 27, 26, and 25 SNPs), whereas the lowest were on chromosomes 25, 22, and 28 (5, 3, and 0 SNPs, respectively). Although the distribution of selected top 0.1% SNPs was breed specific in most cases, we identified a small number of SNPs which were under selection pressure in two pairs of breeds, including 37 SNPs for YRSL/KHLM and YRSL/HOL, 28 SNPs for YRSL/HOL and KHLM/HOL, and 10 SNPs for KHLM/HOL and YRSL/KHLM (Fig 4, S3 Table).
Fig 4

Venn diagram illustrating the distribution of the top 0.1% of SNPs by FST value between pairs of breeds.

Note: Breed: YRSL, Yaroslavl; KHLM, Kholmogor; HOL, Holsteins.

Venn diagram illustrating the distribution of the top 0.1% of SNPs by FST value between pairs of breeds.

Note: Breed: YRSL, Yaroslavl; KHLM, Kholmogor; HOL, Holsteins.

The hapFLK analysis

The hapFLK analysis revealed 24 putative regions affected by selection, which were distributed among 16 autosomes (S4 Table, Fig 5).
Fig 5

Genome-wide scan for the signature of selection based on the hapFLK statistics.

Note: Each dot corresponds to a single SNP; axis X, genomic coordinates by chromosome; axis Y, statistical significance (−log10 P-values); the red and blue lines indicate the thresholds of significance: 10−5 and 10−3, respectively; the figures above and below the plot present the magnified plots of the chromosome areas containing the hapFLK regions.

Genome-wide scan for the signature of selection based on the hapFLK statistics.

Note: Each dot corresponds to a single SNP; axis X, genomic coordinates by chromosome; axis Y, statistical significance (−log10 P-values); the red and blue lines indicate the thresholds of significance: 10−5 and 10−3, respectively; the figures above and below the plot present the magnified plots of the chromosome areas containing the hapFLK regions. Thirteen hapFLK regions were identified in the Yaroslavl breed, 6 in the Kholmogor breed, and 12 in Holsteins. In most cases, the chromosomal distribution of these regions was breed-specific. The regions identified by the hapFLK analysis were observed on 9 autosomes (BTA4, BTA5, BTA6, BTA11, BTA13, BTA14, BTA15, BTA16, and BTA22) in Yaroslavl breed and on 6 autosomes (BTA4, BTA6, BTA12, BTA15, BTA18, and BTA23) in Kholmogor breed. For comparison, the hapFLK regions were detected on nine autosomes (BTA1, BTA3, BTA7, BTA8, BTA11, BTA13, BTA14, BTA22, and BTA23) in Holsteins. Both Yaroslavl and Kholmogor breeds had hapFLK regions overlapping with Holsteins (4 and 1 region, respectively) and with each other (2 regions). The sizes of the putative regions to be under selection pressure ranged from 0.336 to 2.974 Mb in Yaroslavl breed, from 0.284 to 1.266 in Kholmogor breed, and from 0.080 to 2.974 in Holsteins.

Identification of ROH islands

Overlapping ROH islands observed in more than 50% of samples within each breed were found across the genome in all the studied breeds (S5 Table, S4 Fig). We identified 17 and 20 ROH islands which were distributed among 12 chromosomes in the genome of Kholmogor and Yaroslavl cattle breeds, respectively. In most cases, the genomic distribution of ROH islands was specific for each breed both in length and localisation across chromosomes. The lengths of ROHs varied from 304.8 to 1276.9 kb in Kholmogor breed and from 297.2 to 1640.6 kb in Yaroslavl breed and were averaged 638.1 ± 59.4 and 691.0 ± 71.6 kb, respectively. For comparison, in Holsteins, 29 ROH islands were identified, which were distributed among 16 autosomes. The mean ROH length was 735.6 ± 85.8 kb with variation from 304.8 to 2741.3 kb. We identified several overlapping ROHs, which were common for two or three breeds. The ROHs located on BTA3 (positions from 9,205,782 to 9,515,045), BTA6 (positions 5,252,711 to 6,502,610), and BTA14 (positions 24,419,295 to 25,066,322) were common for all the three studied breeds. The ROH regions on BTA7 (positions 52,077,420 to 52,572,307), BTA13 (positions 50,231,261 to 50,724,998), BTA16 (positions 42,691,909 to 43,535,617), and BTA29 (positions 38,947,760 to 39,252,602) were common for Kholmogor and Holstein breeds. The common ROH islands on BTA12 (positions 72,509,035 to 73,492,233), BTA16 (positions 43,804,119 to 44,269,629), and BTA21 (positions 2,082,072 to 2,482,464) were observed in the genome of Yaroslavl and Holstein breeds. Among the described ROH islands, we identified 6 strongest ROHs (present in more than 70% of samples) in the genome of Yaroslavl breed on BTA5 (positions 24,567,855 to 24,934,918), BTA6 (positions 5,252,711 to 6,522,730), BTA7 (positions 20,531,857 to 21,146,198), BTA16 (positions 39,698,386 to 40,620,308 and 42,507,253 to 43,049,359), BTA17 (35,494,057 to 36,118,075), and BTA21 (positions 2,082,072 to 2,504,481) as well as 3 strongest ROHs in the genome of Kholmogor breed on BTA7 (positions 51,178,057 to 51,555,664), BTA14 (positions 24,562,756 to 24,874,608), and BTA16 (positions 42,691,909 to 43,472,069). One strongest ROH region common for Kholmogor and Holstein breeds and one region common for Yaroslavl and Holstein breeds was found on BTA14 and BTA16 (S5 Table). Comparison of the genomic localisation of the regions identified in our study by using FST, hapFLK, and ROH analyses, as well as in the previous study performed by Yurchenko et al. [34] by using DCMS statistic [56] showed the presence of several genomic regions under putative selection in the studied breeds identified at least by two different methods (Table 2).
Table 2

Genomic regions under putative selection in the two old Russian cattle breeds compared to those in Holsteins identified at least by two different methods.

BTABreedGenomic regions (Mbp) under selection identified by different methods
FST 1hapFLK1ROH501DCMS2
1HOL32.45–32.5132.45–32.53
YRSL65.31–65.6165.25–65.69
HOL84.20–84.2383.68–84.5583.86–84.48
2HOL101.86–102.34101.68–102.16101.84–102.00
3HOL8.97–9.729.39–9.53
HOL53.89–54.25 54.26–54.3353.81–54.80
4YRSL5.01–5.114.74–5.36
5YRSL23.76–24.4024.48–25.2623.95–26.34
6KHLM60.34–60.7060.05–60.7460.30–60.76
YRSL71.46–71.5571.38–72.0471.24–71.96
7YRSL10.22–10.6910.07–10.29
YRSL20.53–21.1920.63–20.98
HOL43.01–43.6143.43–43.57
HOL46.96–47.9747.38–47.74
HOL52.08–52.6051.57–52.42
8HOL59.67–60.3259.81–60.21
HOL107.80 – 108.58107.44–109.46a107.62–108.84107.76–108.68
9KHLM64.53–65.1364.86–65.11
12KHLM81.45–81.5881.40–81.69
13HOL, YRSL5.18–5.674.97–6.825.28–6.40
14KHLM1.54–2.091.70–1.89
HOL21.1221.10–21.58
HOL, YRSL, KHLM24.42–25.0724.43–25.10
HOL28.58–28.6928.34–29.17
15KHLM16.13, 16.1416.04–16.62
YRSL18.03, 18.2517.80–18.77
16YRSL41.91–42.4842.48–43.1342.43–43.37
HOL42.56–43.5442.79–43.50
KHLM42.69–43.5442.87–43.17
17YRSL45.6045.59–45.61
18KHLM0.85, 0.880.19–1.46
KHLM14.04–14.9814.12–14.99
24YRSL41.7441.76–41.8941.67–42.18
26HOL19.53–20.2819.19–20.29
HOL21.20–22.9122.13–23.0121.53–22.95
29HOL38.67–38.92 38.95–39.2538.41–39.33

Note: BTA, Bos taurus autosome; Breed: YRSL, Yaroslavl; KHLM, Kholmogor; HOL, Holsteins; genomic regions: start and end positions (Mbp) according to Bos_taurus_UMD_3.1.1 genome assembly (https://www.ncbi.nlm.nih.gov/assembly/GCF_000003055.6); methods used for the identification of the signature of selection: F, top 0.1% SNPs by FST value at pair-wise breed comparison; hapFLK, regions identified by hapFLK analysis; ROH50, ROH segments distributed in more than 50% of animals within each of the studied breed

1present study

2Yurchenko et al. [34]; pairs of breeds, used for FST calculations

aKHLM/HOL

bYRSL/HOL

cYRSL/KHLM

dweak selection signal was also observed in Kholmogor breed.

Note: BTA, Bos taurus autosome; Breed: YRSL, Yaroslavl; KHLM, Kholmogor; HOL, Holsteins; genomic regions: start and end positions (Mbp) according to Bos_taurus_UMD_3.1.1 genome assembly (https://www.ncbi.nlm.nih.gov/assembly/GCF_000003055.6); methods used for the identification of the signature of selection: F, top 0.1% SNPs by FST value at pair-wise breed comparison; hapFLK, regions identified by hapFLK analysis; ROH50, ROH segments distributed in more than 50% of animals within each of the studied breed 1present study 2Yurchenko et al. [34]; pairs of breeds, used for FST calculations aKHLM/HOL bYRSL/HOL cYRSL/KHLM dweak selection signal was also observed in Kholmogor breed. Twelve such regions were detected in Yaroslavl breed and nine in Kholmogor breed compared to 17 regions in Holsteins.

Identification of candidate genes within selected regions of the genome

We identified candidate genes within which the SNPs selected by FST analysis were localised (S2 Table), as well as the genes entirely or in part localised within the detected hapFLK regions and ROH islands (S6 Table). For more thorough analysis, we selected the genes, localised within genomic regions, which revealed the signature of selection identified by hapFLK and/or carried the strongest ROHs (ROH islands observed in more than 70% of animals) or were detected at least by two different methods (Table 3).
Table 3

Genes localised within the genomic regions affected by putative selection in Yaroslavl and Kholmogor cattle breeds.

BTARegion (Mb)BreedMethodsGenes
165.3 – 65.7YRSLROH50, DCMS1GSK3B, GPR156
44.7 – 5.4YRSLFST, hapFLKGrb10, DDC*
524.5 – 25.3YRSLhapFLK, ROH, DCMSPLXNC1, CCDC41, TMCC3, NDUFA12, FGD6, VEZT, MIR331
65.2 – 6.5YRSLROHMAD2 , MAD2L1, MGC134093
660.1 – 60.8KHLMFST, hapFLK, DCMSWDR19, KLB*, RPL9*, LIAS*, UGDH*, UBE2K, PDS5A
671.4 – 72.1YRSLFST, hapFLK, DCMSPDGFRA*, KIT
710.2 – 10.7YRSLROH, DCMS
720.5 – 21.2YRSLROH, DCMSTICAM1, FEM1A, MIR7, DPP9, TNFAIP8L1, SEMA6B, PLIN5, LRG1, PLIN4, HDGF2UBX, N6, CHAF1A, SH3GL1, STAP2, MPND, FSD1, TMIGD2, SHD, CCDC94, EBI3, ANKRD24, SIRT6, CREB3L3, MARP2K2
751.2 – 51.6KHLMROHNME5, BRD8, MIR2459, GFRA3, KIF20A, CDC23, FAM53C, CDC25C, REEP2, EGR1, ETF1, HSPA9
964.5 – 65.1KHLMROH, DCMSSYNCRIP, NT5E
1281.4 – 81.7KHLMFST, hapFLKNALCN
135.0 – 6.8YRSLaFST, hapFLK, DCMSBTBD3*
141.5 – 2.1KHLMROH, DCMSLRRC24, LRRC14, RECQL4, MFSD3, GPT, PPP1R16A, FOXH1, CYHR1, NFKBIL2, SLC39A4, CPSF1, ADCK5, GPR172B, FBXL6, SCRT1, DGAT1, HSF1, BOP1, SCXB, HEATR7A, MAF1, SHARPIN, CYC1, GPAA1, EXOSC4, OPLAH, GRINA, PARP10, MIR23089
1424.4 – 25.1YRSLaROH, DCMSXKR4, TMEM68, TGS1, LYN, RPS20, MOS, PLAG1, CHCHD7
1424.4 – 25.1KHLMaROH, DCMS
1516.0 – 16.6KHLMFST, hapFLKPIWIL4, FUT4*, MGC137061*, CWF19L2, GUCY1A2
17.8 – 18.8YRSLFST, hapFLKRAB39, CUL5*, ACAT1*, NPAT, ATM, KDELC2*, EXPH5, DDX10
1639.7 – 40.6YRSLROHFMO1, FMO4, BAT2L2, MYOC, VAMP4, METTL13, MIR214, MIR199A1
42.5 – 43.1YRSLahapFLK, ROH, DCMSVPS13D, TNFRSF1B, TNFRSF8, MIIP, MFN2, PLOD1, KIAA2013, NPPB, NPPA, CLCN6, MTHFR, AGTRAP, MAD2L2, FBXO6, FBXO44, FBXO2, PTCHD2
42.7 – 43.5KHLMaROH, DCMSNPPB, NPPA, CLCN6, MTHFR, AGTRAP, MAD2L2, FBXO6, FBXO44, FBXO2, PTCHD2, UBIAD1, MTOR, ANGPTL7, EXOSC10, SRM, MASP2, TARDBP
1735.5 – 36.1YRSLROHIL2, ADAD1
180.2 – 1.5KHLMFST, hapFLKUQCRFS1*, VSTM2B, VAC14
14.0 – 15.0KHLMROH, DCMSCDT1, APRT, GALNS, TRAPPC2L, CBFA2T3, ACSF3, CDH15, MGC157263, ANKRD11, SPG7, RPL13, CPNE7, DPEP1, CHMP1A, CDK10, SPATA2L, ZNF276, FANCA, TCF25, SPIRE2, MC1R, TUBB3, MIR220D, DEF8, CENPBD1, DBNDD1, GAS8, SHCBP1
212.0 – 2.5YRSLROHUBE3A
2441.7 – 41.9YRSLFST, ROH, DCMSNDUFV2*

Note: BTA, Bos taurus autosome; region, start and end positions of the genomic region affected by putative selection (Mbp) according to Bos_taurus_UMD_3.1.1 genome assembly (https://www.ncbi.nlm.nih.gov/assembly/GCF_000003055.6); breed: YRSL, Yaroslavl; KHLM, Kholmogor; methods: F, top 0.1% SNPs by FST value at pair-wise breed comparison; hapFLK, regions identified by hapFLK analysis; ROH, ROH segments distributed in more than 50% of animals within each of the studied breed (genes localised within ROH segments identified in more than 70% of animals are shown by bold)

1the signature of selection identified previously by Yurchenko et al. [34] based on de-correlated composite of multiple signals (DCMS)

agenomic regions, which are common with Holsteins

*causative SNPs, detected by FST analyses were localized within 0,4 Mb windows; functionally annotated genes are underlined.

Note: BTA, Bos taurus autosome; region, start and end positions of the genomic region affected by putative selection (Mbp) according to Bos_taurus_UMD_3.1.1 genome assembly (https://www.ncbi.nlm.nih.gov/assembly/GCF_000003055.6); breed: YRSL, Yaroslavl; KHLM, Kholmogor; methods: F, top 0.1% SNPs by FST value at pair-wise breed comparison; hapFLK, regions identified by hapFLK analysis; ROH, ROH segments distributed in more than 50% of animals within each of the studied breed (genes localised within ROH segments identified in more than 70% of animals are shown by bold) 1the signature of selection identified previously by Yurchenko et al. [34] based on de-correlated composite of multiple signals (DCMS) agenomic regions, which are common with Holsteins *causative SNPs, detected by FST analyses were localized within 0,4 Mb windows; functionally annotated genes are underlined. Functional annotation of the identified candidate genes showed the presence of genes associated with growth (localised on BTA 4, BTA 6, BTA 7, BTA 14, BTA 15, BTA 16, BTA 17, BTA 18, and BTA 21), carcass-related traits (BTA 6, BTA 7, BTA 15, and BTA 18), feed efficiency (BTA 14, BTA 15, and BTA 18), milk production (BTA 5, BTA 6, BTA 14, and BTA 16), and reproduction (BTA 1, BTA 4, BTA 6, BTA 7, BTA 9, BTA 15, BTA 16, BTA 17, and BTA 18), involved in the metabolic process (BTA 6, BTA 7, BTA 15, BTA 16, and BTA 24), and responsible for immunity (BTA 5, BTA 6, BTA 7, BTA 15, BTA 16, BTA 17, and BTA 18) and adaptation (BTA 18; S7 Table).

Discussion

Genetic diversity, effective population size, and breed relationship

To our knowledge, this is the first comprehensive genome-wide study of putative signatures of selection in the genomes of Yaroslavl and Kholmogor breeds performed on the basis of the analysis of high-density SNP genotypes (650,134 autosomal SNPs) generated using Bovine SNP HD BeadChip (Illumina, USA). In our recent studies, we showed that these two oldest Russian native breeds have been little influenced by introgression with transboundary breeds and maintained their historical genomic components [9, 11]; thus, they may carry specific signatures of selection, reflecting their adaptation to the local environmental conditions and response to the breeding strategy used. Adaptation to the changing environment is increasingly important [59]; thus, native cattle breeds might become valuable sources of genetic variability, which will be necessary for developing sustainable animal production systems in the future. Considering the relatively small sample size, we were very thoughtful when choosing animals for research. The selected individuals represented different sire lines and different breeding farms, and unveiled the low amount of Holstein ancestral components (S3 Fig). Besides, the small sample sizes appear to be sufficient to obtain the reliable results using high-density SNP markers [56]. We found that 9.36% of variability was attributed to genetic differences between breeds (FST = 0.094), and the remaining 90.64% was the result of allelic variation within breeds. We observed slightly higher level of genetic variability in Yaroslavl (UHE = 0.349, AR = 1.967) and Kholmogor (UHE = 0.361, AR = 1.977) breeds (Table 1) than was previously detected using medium-density DNA BeadChip [9, 10]. Variability was significantly lower in the Yaroslavl breed than in the Kholmogor breed (p < 0.001), probably because of more drastic decline in the census population size. During the last 60 years, the number of Yaroslavl cattle has decreased by practically 20 times compared to that of Kholmogor breed, which decreased by 4.5 times [12]. Our data agreed with the findings of Xu et al. [15], who performed the genotyping of three commercial taurine breeds by using high-density DNA array and observed similar level of genetic variability (HE = 0.337–0.356). In contrast, Stronen et al. [17] detected lower genetic variability in taurine breeds (HE = 0.226 and 0.298); this was probably because of the small population size of the studied native breeds. We observed slight excess of heterozygotes in both the Russian breeds (UFIS = -0.022 and -0.013 for Kholmogor and Yaroslavl breeds, respectively), which agrees with the findings obtained using DNA chips of lower density [9, 10]. The detected inbreeding coefficient in Holsteins (UFIS = -0.010) was similar or lower than one which was discovered in recent reports (FIS = −0.004 [60], FIS = 0.0526 [15]), which might reflect the differences in sample origins and population sizes. The NE showed an overall decline for the studied breeds over generations (Fig 1A, S2 Fig). The most recent NE values observed for Yaroslavl (NE5 = 111) and Kholmogor (NE5 = 130) cattle were higher than those in Holsteins (NE5 = 95) and exceeded the critical threshold of NE = 100 estimated for long-term viability of discrete livestock breeds [61]. We observed four periods of accelerated decline in NE during the last 45 generations (Fig 1B). The more ancient periods were observed between 28 and 23 as well as 18 and 15 generations ago, which probably reflect the World wars crisis. The subsequent acceleration was detected between 8 and 7 generations ago, which was most likely the result of the implementation of artificial insemination, which is associated with the decrease in bull number. The recent accelerated decline in NE was found between 5 and 4 generations ago for Kholmogor breed and between 6 and 5 generations ago for Yaroslavl breed, when the NE decreased from 130 to 115 and 113 to 111, respectively. The possible explanation for this could be the large decrease in the population size of the purebred cattle from the mid-1990s to the mid-2000s owing to the general decline of industry in the first decade of the post-Soviet period. We observed visible differences in the autozygosity level among the studied breeds. The average ROH length (258.3 Mb) in Yaroslavl breed was similar to that in Holsteins (270.1 Mb) and exceeded that in the Kholmogor breed (147.6 Mb). The higher total ROH length in Yaroslavl breed might be attributed to the higher inbreeding owing to the lower population size. Among the different length classes, short ROH segments (1–2 Mb) were prevalent (Fig 2, S1 Table), which agreed with the findings obtained in different cattle populations [48, 62, 63]. The results of the PCA plot, Neighbor-Net analysis, and admixture clustering (S3 Fig) suggest a low contribution of Holstein ancestors (probably Holland cattle) in the formation of the architecture of the Yaroslavl and Kholmogor breeds, which has been proposed by several researchers [5, 9–11].

Identification of genomic regions under putative selection in the Kholmogor and Yaroslavl breeds

We applied three different statistical methods (selection of top 0.1% SNPs by FST value for pair-wise breed comparison, identification of ROH islands shared by more than 50% of animals within each breed, and hapFLK analysis) to identify the genomic regions and genes that are affected by selection (S2 and S6 Tables). Considering a possible impact of genetic drift or recent inbreeding on ROHs [64], we selected the ROH islands to be under selection pressure, if they were confirmed by at least one more method or were observed in more than 70% of animals. Results obtained using multiple analytical approaches typically have higher informative power, furthermore, methods based on different approaches may also complement each other [20, 65]. Fifteen of seventeen regions, identified in Holsteins were overlapped with genomic regions, which were described by Yurchenko et al. [34] and eleven were overlapped with regions, which were validated by meta-assembly of selection signatures (S8 Table) [57]. We confirmed nine regions under putative selection in the genome of Yaroslavl cattle and six regions in the genome of Kholmogor cattle, which were previously identified by Yurchenko et al. [34] by using DCMS analysis [56] of medium-density SNP genotypes; in this study, the flanking positions of most of these regions were expanded (Table 3). In addition, we detected three new putative genomic regions affected by selection by using at least two different methods in the genome of Yaroslavl (BTA 4 at 4.74–5.36 Mbp, BTA 15 at 17.80–18.77 Mbp, and BTA 17 at 45.59–45.61 Mbp) and Kholmogor (BTA 12 at 82.40–81.69 Mbp, BTA 15 at 16.04–16.62 Mbp, and BTA 18 at 0.19–1.46 Mbp) breeds. Only two of the selected regions (localised on BTA 14 at 24.4–25.1 Mbp and BTA 16 at 42.5–43.5 Mb) overlapped in Yaroslavl, Kholmogor, and Holstein breeds. An additional overlapping region with Holsteins (localised on BTA 13 at 5.0–6.8 Mbp) was found in the Yaroslavl breed. Seventeen of twenty-one regions, found in the Yaroslavl and Kholmogor breeds were overlapped with the signature of selections, which were identified in other European breeds by meta-assembly [57], while two genomic regions in each breed were novel (S8 Table).

Functional annotation of candidate genes localised within the selected regions

We performed functional annotation of genes localised within the putative regions carrying signals of selection in the genome of the studied breeds. Along with genes, which were previously described for Yaroslavl and Kholmogor cattle [34], we expanded the list of candidate genes by elucidating the known genomic regions and detecting additional selection sweeps in the genome of the studied breeds (Table 3, S6 Table). We found two regions overlapping with the Holstein genomic region under selection pressure in both Yaroslavl and Kholmogor breeds (Table 3, S6 Table). One of these regions is localized on BTA 14 and contains the XKR4, TMEM68, TGS1, LYN, RPS20, MOS, PLAG1, and CHCHD7 genes which are well-known candidates for growth, carcass-related traits [66-68], and feed intake [69-71]. The PLAG1 gene (or PLAG1-CHCHD7 region) is a strong candidate responsible for stature; body size, including height [72]; and weight in many cattle breeds [73-78]. Fink et al. [79] found a strong association of the PLAG1 genotype with milk fat and protein yield. Utsunomiya et al. [80] provided an insightful account of the effect of PLAG1 in the history of domesticated cattle. The polymorphic SNP BovineHD1400007259, located within the intronic region of the PLAG1 gene, is considered as a causal mutation responsible for stature [81-83]. This mutation was segregated in both Yaroslavl (pA = 0.367, qC = 0.633; HO = 0.323) and Kholmogor (pA = 0.140, qC = 0.860; HO = 0.200) breeds, which could have resulted from selection in both breeds for animals of the larger size. Selection pressure on genes, involved in growth regulation is confirmed by identification of additional notable candidate genes with known effect on body size and/or carcass traits, including Grb10 on BTA 4 [84-86], PDGFRA, MAD2L and UBE2K on BTA 6 [87-90], SIRT6 on BTA 7 [91-93], and ACAT1, KDELC2 on BTA 15 [94-96] in Yaroslavl breed, and EGR1, HSPA9 on BTA 7 [97, 98], and CDH15, DPEP1, GAS8, GALNS, and MC1R on BTA 18 [30, 99–102] in Kholmogor breed. The second overlapping genomic region under positive selection in Yaroslavl, Kholmogor and Holstein breeds containing TNFRSF1B, MFN2, PLOD1, KIAA2013, NPPB, NPPA, CLCN6, MTHFR, AGTRAP, FBXO2, MTOR, ANGPTL7, EXOSC10, and MASP2 genes was found on BTA16 (Table 3, S6 Table). The selected candidate genes are involved in different biological processes in cattle, including milk production—MTHFR [103] and mTOR genes [104, 105]; reproduction—PLOD1 [106], NPPA and NPPB [107], EXOSC10 genes [108]; immunity—TNFRSF1B [109] and MASP2 genes [110, 111]; and metabolism—MFN2 gene [112]. Additional large set of candidate genes (LRRC24, LRRC14, RECQL4, HEATR7A, MFSD3, GPT, PPP1R16A, FOXH1, CYHR1, SLC39A4, CPSF1, ADCK5, FBXL6, BOP1, SCRT1, DGAT1, GPAA1, EXOSC4, PARP10, HSF1, OPLAH, and GRINA), associated with milk production traits in Kholmogor cattle was found on BTA 14. These genes were shown to be involved in regulation of milk fatty acid composition [113-116], milk yield [117, 118], milk fat yield, and milk fat percentage [119-123]. In contrast to Kholmogor cattle, we identified only two candidate genes on BTA 5, which were associated with milk production traits in Yaroslavl cattle, including TMCC3 and DDX10 [124, 125]. The possible explanation for the presence of numerous milk trait genes affected by selection in Kholmogor cattle could be that one of the breeding targets for this breed from the middle of the 19th century was to supply the Petersburg and Moscow citizens with high-quality dairy products [5]. This stimulated the selection of high-producing cows with excellent milk quality, suitable for butter and cheese production, and thus led to the alterations in the allele frequencies of genes which are involved in the regulation of milk production and composition. We identified a few more candidate genes associated with reproduction traits, including GSK3B on BTA 1 [126], ADAD1 on BTA 17 [127, 128], and UBE3B on BTA 21 [129, 130], in the Yaroslavl breed. These genes are associated mainly with female fertility traits, that can reflect the unique capacity of cows of Yaroslavl breed to produce healthy calves even with severe weight losses because of poor feeding in winter [4]. The genes detected in Kholmogor breed, including WDR19 on BTA 6 [131], NME5 on BTA 7 [132, 133], NT5E on BTA 9 [134], PIWIL4 on BTA 15 [135], TUBB3 and UQCRFS1 on BTA 18 [136, 137], are mainly related to male fertility. Since the Kholmogor breed had become the breed of choice for the growing market of Petersburg and Moscow in the 19th century, the use of bulls with increased fertility was preferred. We detected selection pressure on genes involved in lipid metabolism in Yaroslavl cattle, including PLIN4, PLIN5, and CREB3L3 on BTA 7 [138-141], MFN2 on BTA 16 [112], and NDUFV2 on BTA 24 [142]. This observation is probably associated with the capacity of cattle population of Central Russia to accumulate sufficient subcutaneous fat during the pasture period that helped them to survive under poor nutrition conditions in winter and to maintain pregnancy [4]. In Yaroslavl breed, we found a putative signature of selection around the KIT gene on BTA 6, which is a functional candidate for coat coloration in cattle, including white spotting patterns in the Holstein [143] and Fleckvieh cattle breeds [144]. According to the QTL database [58], the region on BTA 6 between 71.4 and 72.1 Mbp, where the KIT gene is localised, contains the genes associated with eye area pigmentation in Fleckvieh cattle [145]. The black pigmentation of the eye area is one of the well-known distinguishing features of Yaroslavl cattle (S1 Fig). In summary, our study provides deeper insight into the genomic architecture of Yaroslavl and Kholmogor cattle breeds and allowed the identification of the genomic regions and genes that were affected by selection during the century-long history of breed formation. Our research results will be useful for the sustainable development and conservation of these two oldest Russian native cattle breeds.

The photos of the pure-bred Yaroslavl and Kholmogor bulls.

(TIF) Click here for additional data file.

The effective population size (N) across generations from approximately 500 generations ago based on linkage disequilibrium (LD) calculations.

Note: Breeds: YRSL, Yaroslavl; KHLM, Kholmogor; HOL, Holsteins. (TIF) Click here for additional data file. (A). Principal Component Analysis (PCA) plot showing the distribution of the studied cattle breeds in the dimensions of two coordinates. (B). A Neighbor-Net tree constructed on the basis of pairwise nucleotide genetic distances among individuals of the three studied breeds. (C). Admixture plot represents the cluster structure of the studied breeds for the number of clusters (K) 2 and 3. Different colours correspond to different ancestral populations. (D). Cross-validation (CV) error for the different number of clusters. Note: Axis X, first principal component (PC1); Axis Y, second principal component (PC2). The part of total genetic variability, which can be explained by each of the two components are indicated in curly parentheses. Breed: YRSL, Yaroslavl; KHLM, Kholmogor; HOL, Holsteins. (TIF) Click here for additional data file.

Distribution of ROHs within chromosomes.

(a). BTA5, (b). BTA6, (c). BTA7, (d). BTA8, (e). BTA14, (f). BTA16, (g). BTA17, (h). BTA21. Note: Axis X, chromosomal positions (Mbps); Axis Y, the ratio of animals carrying ROHs (in percent); YRSL, Yaroslavl; KHLM, Kholmogor; HOL, Holsteins. (TIF) Click here for additional data file.

Mean number and length of ROHs of different length classes.

Note: Class: ROH length class (1–2 Mb, 2–4 Mb, 4–8 Mb, 8–16 Mb, and >16 Mb); breed: YRSL, Yaroslavl; KHLM, Kholmogor; HOL, Holsteins; n, number of animals; mean number of ROH: values, mean values for each length class and breed; sd, standard deviation; se, standard error; ci, confidence interval; %, ratio of the number of ROHs of certain length class in the common number of ROHs within each breed; mean length of ROH: values, mean values for each length class and breed; sd, standard deviation; se, standard error; ci, confidence interval; %, ratio of the length of ROHs of certain length class in the common length of ROHs within each breed. (XLSX) Click here for additional data file.

Top 0.1% SNPs by F values during pair-wise comparison of breeds.

Note: BTA, Bos taurus autosome; breed: YRSL, Yaroslavl; KHLM, Kholmogor; HOL, Holsteins; compared pairs of breeds indicated by different colour filling: KHLM/HOL, blue filling; YRSL/HOL, green filling; KHLM/YRSL, pink filling; SNP, SNPs selected in top 0.1% for each pair of breeds; POS, position of SNPs according to Bos_taurus_UMD_3.1.1 genome assembly (https://www.ncbi.nlm.nih.gov/assembly/GCF_000003055.6); NMISS, number of missing SNPs; FST, FST value calculated for each SNP during pair-wise breed comparison; genes, genes within which or close (within window from 0.2 Mbp up-stream to 0.2 Mbp down-stream of the causal SNP) to which is localised the causal SNP; the candidate genes within which the causal SNPs are localised are indicted in bold. (XLSX) Click here for additional data file.

Top 0.1% SNPs by F value, which are common for two pairs of breeds, during pair-wise breed comparison: YRSL/KHLM and YRSL/HOL, YRSL/HOL and KHLM/HOL, as well as KHLM/HOL and YRSL/KHLM.

Note: breed: HOL, Holsteins; KHLM, Kholmogor; YRSL, Yaroslavl. (XLSX) Click here for additional data file.

Putative selective sweeps identified in the hapFLK-based analysis.

Note: BTA, Bos taurus autosome with the identified hapFLK regions; Breed: YRSL, Yaroslavl; KHLM, Kholmogor; HOL, Holsteins; Start and End positions, the positions of the first and last SNPs in the identified hapFLK region according to Bos_taurus_UMD_3.1.1 genome assembly (https://www.ncbi.nlm.nih.gov/assembly/GCF_000003055.6); Length, length of hapFLK regions (Mbp); No. of SNPs, the number of SNPs within localised hapFLK regions; the hapFLK regions are sorted by chromosome and then by start position on chromosome. (XLSX) Click here for additional data file.

Genomic regions in Yaroslavl and Kholmogor cattle breeds under putative selection compared to those in Holsteins revealed by ROH analysis.

Note: BTA, Bos taurus autosome with the overlapping ROH shared by more than 50% of the animals (the strongest ROH islands shared by more than 70% of animals are shown in bold); Breed: YRSL, Yaroslavl (indicated by green color); KHLM, Kholmogor (indicated by blue color); HOL, Holsteins (indicated by pink color); No. of SNPs, number of SNPs within an ROH island; Start and End positions, the positions of the first and last SNPs in an ROH island according to Bos_taurus_UMD_3.1.1 genome assembly (https://www.ncbi.nlm.nih.gov/assembly/GCF_000003055.6); Length, length of ROH island (Mb); Percent of animals, the percent of animals carrying the ROH island. The ROH islands are sorted by chromosome and then by start position on chromosome. (XLSX) Click here for additional data file.

Genomic regions and genes affected by putative selection identified using HapFLK and ROH analyses, and overlapping regions identified by F calculations in Yaroslavl and Kholmogor cattle breeds.

Note: BTA, Bos taurus autosome; Start_SNP and End_SNP, the names of the start and end SNPs on Bovine HD BeadChip (Illumina Inc., USA) flanking the genomic regions under putative selection; breed: YRSL, Yaroslavl; KHLM, Kholmogor; HOL, Holsteins; methods: F, top 0.1% SNPs by F value during pair-wise breed comparison; hapFLK, regions identified by hapFLK analysis; ROH_50% and ROH_70%, ROH segments distributed in more than 50% and more than 70% of animals, respectively, within each of the studied breed (the ROH segments identified in more than 70% animals are shown in bold); nSNP, number of SNPs localised within the identified genomic region; Start position and End position, start and end positions of the genomic region affected by putative selection (Mbp) according to Bos_taurus_UMD_3.1.1 genome assembly (https://www.ncbi.nlm.nih.gov/assembly/GCF_000003055.6); length, the length of the identified genomic regions under putative selection (Mbp); No. of genes, number of genes localised within identified genomic regions; genes, the list of genes localised within identified genomic regions (genes localised within ROH segments identified in more than 70% of animals are shown in bold). (XLSX) Click here for additional data file.

Selected genes under putative selection in Kholmogor and Yaroslavl cattle breeds.

Note: 1Genes: GSK3B—glycogen synthase kinase 3 beta; Grb10—growth factor receptor-bound protein 10; PLXNC1—plexin C1; TMCC3—transmembrane and coiled-coil domain family 3; FGD6—FYVE, RhoGEF, and pH domain containing 6; MAD2L1—mitotic arrest deficient 2 like 1; WDR19—WD repeat domain 19; KLB—klotho beta; RPL9—ribosomal protein L9; LIAS—lipoic acid synthase; UGDH—UDP-glucose 6-dehydrogenase; UBE2K—ubiquitin-conjugating enzyme E2 K; PDS5A—PDS5 cohesin-associated factor A; PDGFRA—platelet-derived growth factor receptor, alpha polypeptide; KIT—V-kit Hardy–Zuckerman 4 feline sarcoma viral oncogene homolog; TICAM1—toll like receptor adaptor molecule 1; PLIN5perilipin 5; PLIN4—perilipin 4; SIRT6—sirtuin 6; CREB3L3—cAMP-responsive element-binding protein 3 like 3; NME5—NME/NM23 family member 5; EGR1—early growth response 1; HSPA9—heat shock protein family A (Hsp70) member 9; NT5E—ecto-5′-nucleotidase; LRRC24—leucine-rich repeat-containing protein 24; LRRC14—leucine-rich repeat-containing protein 14; RECQL4—RecQ like helicase 4; MFSD3—major facilitator superfamily domain containing 3; GPT—glutamic-pyruvic transaminase; PPP1R16A—protein phosphatase 1 regulatory subunit 16A; FOXH1—forkhead box H1; CYHR1—cysteine and histidine rich 1; SLC39A4—solute carrier family 39 member 4; CPSF1—cleavage and polyadenylation specific factor 1; ADCK5—aarF domain-containing kinase 5; FBXL6—F-Box and leucine rich repeat protein 6; SCRT1—Scratch family transcriptional repressor 1; DGAT1—diacylglycerol acyltransferase 1; HSF1—Heat shock factor 1; BOP1BOP1 ribosomal biogenesis factor; HEATR7A—maestro heat like repeat family member 1; GPAA1—glycosylphosphatidylinositol anchor attachment 1; EXOSC4—exosome component 4; OPLAH—5-oxoprolinase, ATP-hydrolysing; GRINA—glutamate ionotropic receptor NMDA type subunit-associated protein 1; PARP10—poly(ADP-ribose) polymerase family member 10; XKR4—XK, Kell blood group complex subunit-related family, member 4; TMEM68—transmembrane protein 8B; TGS1—trimethylguanosine synthase 1; LYNLYN proto-oncogene, Src family tyrosine kinase; RPS20—ribosomal protein S20; MOSMOS proto-oncogene, serine/threonine kinase; PLAG1—pleomorphic adenoma gene 1; CHCHD7—coiled-coil-helix-coiled-coil-helix domain containing 7; PIWIL4—piwi like RNA-mediated gene silencing 4; GUCY1A2—guanylate cyclase 1 soluble subunit alpha 2; ACAT1—acetyl-CoA acetyltransferase 1; KDELC2—KDEL (Lys-Asp-Glu-Leu) containing 2; EXPH5—exophilin 5; DDX10—DEAD-box helicase 10; NPPB—natriuretic peptide B; NPPA—natriuretic peptide A; CLCN6—chloride voltage-gated channel 6; MTHFR—methylenetetrahydrofolate reductase; AGTRAP—angiotensin II receptor associated protein; FBXO2—F-box protein 2; TNFRSF1B—tumour necrosis factor receptor superfamily, member 1B; MFN2mitofusin 2; PLOD1—procollagen-lysine, 2-oxoglutarate 5-dioxygenase 1; mTOR—mammalian target of rapamycin; ANGPTL7—angiopoietin like 7; EXOSC10—exosome component 10; MASP2—mannose-binding lectin-associated serine protease 2; IL2—interleukin 2; ADAD1—adenosine deaminase domain containing 1; UQCRFS1—ubiquinol-cytochrome c reductase, Rieske iron–sulphur polypeptide 1; GALNS—galactosamine(N-acetyl)-6-sulfatesulfatase; CBFA2T3—CBFA2/RUNX1 partner transcriptional co-repressor 3; CDH15—cadherin15, type1, M-cadherin; SPG7SPG7 matrix AAA peptidase subunit, paraplegin; CPNE7—copine 7; DPEP1—dipeptidase 1; CHMP1A—charged multivesicular body protein 1A; CDK10—cyclin dependent kinase 10; SPATA2L—spermatogenesis associated 2 like; FANCA—Fanconi anaemia, complementation group A; TCF25—transcription factor 25; SPIRE2—spire type actin nucleation factor 2; TUBB3—tubulin beta 3 class III; GAS8—growth arrest specific 8; MC1Rmelanocortin 1 receptor; UBE3A—ubiquitin protein ligase E3A; NDUFV2—NADH dehydrogenase flavoprotein 2; breed: YRSL—Yaroslavl, KHLM—Kholmogor; BTABos taurus autosome; positions—start and end positions of a gene (bp) according to Bos_taurus_UMD_3.1.1 genome assembly (https://www.ncbi.nlm.nih.gov/assembly/GCF_000003055.6). (XLSX) Click here for additional data file.

Comparison of the identified genomic regions under putative selection with those identified by meta-assembly of selection signatures.

Note: BTA, Bos taurus autosome; Breed: YRSL, Yaroslavl; KHLM, Kholmogor; HOL, Holsteins; genomic regions: start and end positions (Mbp) according to Bos_taurus_UMD_3.1.1 genome assembly (https://www.ncbi.nlm.nih.gov/assembly/GCF_000003055.6); methods used for the identification of the signature of selection: F, top 0.1% SNPs by FST value at pair-wise breed comparison; hapFLK, regions identified by hapFLK analysis; ROH50, ROH segments distributed in more than 50% of animals within each of the studied breed; 1present study; 2Yurchenko et al. [34]; 3meta-assembly of selection signatures, performed by Randhawa et al. [58]: overlapped MSS region (Mb)–overlapped genomic regions, which were validated by meta-selection scores; pairs of breeds, used for FST calculations: aKHLM/HOL, bYRSL/HOL, cYRSL/KHLM; dweak selection signal was also observed in Kholmogor breed. (XLSX) Click here for additional data file. 28 Aug 2020 PONE-D-20-15794 Selection signatures in two oldest Russian native cattle breeds revealed using high-density single nucleotide polymorphism analysis PLOS ONE Dear Dr. Zinovieva, Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process. As you can see both of the reviewers found the study interesting, however raised concerns about many of the things. Therefore, please make a point-by-point reply addressing all the concerns and revise the manuscript accordingly. Please submit your revised manuscript by Oct 11 2020 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file. Please include the following items when submitting your revised manuscript: A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'. A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'. An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'. If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter. If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols We look forward to receiving your revised manuscript. Kind regards, Gyaneshwer Chaubey, Ph.D. Academic Editor PLOS ONE Journal Requirements: When submitting your revision, we need you to address these additional requirements. 1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf 2. We note that you are reporting an analysis of a microarray, next-generation sequencing, or deep sequencing data set. PLOS requires that authors comply with field-specific standards for preparation, recording, and deposition of data in repositories appropriate to their field. Please upload these data to a stable, public repository (such as ArrayExpress, Gene Expression Omnibus (GEO), DNA Data Bank of Japan (DDBJ), NCBI GenBank, NCBI Sequence Read Archive, or EMBL Nucleotide Sequence Database (ENA)). In your revised cover letter, please provide the relevant accession numbers that may be used to access these data. For a full list of recommended repositories, see http://journals.plos.org/plosone/s/data-availability#loc-omics or http://journals.plos.org/plosone/s/data-availability#loc-sequencing. 3. In your Methods section, please state the volume of the blood samples collected for use in your study. [Note: HTML markup is below. Please do not edit.] Reviewers' comments: Reviewer's Responses to Questions Comments to the Author 1. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented. Reviewer #1: Partly Reviewer #2: Yes ********** 2. Has the statistical analysis been performed appropriately and rigorously? Reviewer #1: Yes Reviewer #2: Yes ********** 3. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified. Reviewer #1: Yes Reviewer #2: Yes ********** 4. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here. Reviewer #1: Yes Reviewer #2: Yes ********** 5. Review Comments to the Author Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters) Reviewer #1: This manuscript present analysis of selection signature in two Russian breeds using bovine SNP chip panel. The analysis and results presented are detailed. My main concern is on the very small samples size (31 and 25) used in the study presented. These samples may not be enough to represent the haplotypes present in these populations, hence may not be repeatable if another sample from these population are tested. Most of the estimates are based on underlying LD which will require large samples size for reliable estimates of LD. It should be established and discussed if the samples were enough. The manuscript is too long and will require restructuring. Most of the results (tables and graphs) related to the individual three tests can go into supplementary. The key selection signature obtained from combining the tests could be included. E.g. Table 2, Table 3, Table 4 can be moved to the supplementary. Table 5 (or 4) can stay in the main manuscript. The main manuscript can focus on the main novel regions identified. The length of the manuscript can be easily reduced to one third. Minor: Abstract could be more informative – including specific results : e.g. regions discovered. Line 23: any cattle breed may carry selection signature – why Russian native only. – may rephrase. Line 28: “in which most of the ancestral genomic components are retained. – how this was established? Line: 46 : “genomic regions and candidate genes that might be under putative selection” – I guess it should be putative regions and candidate genes. ROH may be because of random drift or recent inbreeding and may not be specific to the selection signatures. – Line 254: these signature could be compared with the meta-assembly of selection signature in cattle ( https://doi.org/10.1371/journal.pone.0153013 ) Line 309: should be “Estimates of ROHs and Froh Reviewer #2: This is a manuscript in which the authors prepared a small but adequate amount of genotypic data on two Russian breeds of cattle to look for signatures of selection. Similar works (citations 31, 32) have been published before on a wider range of Russian breeds, but the tool used in those studies had a sever-fold lower density of SNP. The main reason to study rare or dying breeds is to try and find genetic variation that is useful to understand the biology of cattle for use in future generations as humanity moves to a more monoculture approach to livestock farming. It would be very much appreciated if the authors would move the historic information and description of the breed type (dairy or beef or dual-purpose) to the introduction of the article. This provides the entire justification as to why the work has value and if the experiment will yield results of interest. Otherwise one has to go to google to see and find out about these breeds, one which looks like an old time Holstein and the other looks like a black-whiteface dairy animal (both horned). In the introduction - Lines 93-109 are not necessary. Be succint and summarize in a few sentences why the 3 method approach is better. In the methods - its not quite clear. Were genes nearby but outside the zones of selection still considered as candidates. Genes beyond the detectable boundaries of the selection signatures should not be in the positional candidate genes list. Population size - suggest moving lines 293-295 into the subsequent section where it belongs. The authors were given HD data from a set of Holsteins cited as Bahbahani et al.; however, this citation is not appropriate since this data was contributed to that paper as well. The original source of the Holstein data should be cited, as this is important to provide context as to what animals are in this cohort and how they were selected. If they are the animals from the Hapmap, then they would represent a diverse set of the most highly used AI bulls in USA history. Not really a reflection of European Holsteins or even Holsteins from Holland (at least 5 generations removed). Were the marker heterozygosity calculations for Holsteins comparable to previously published works (i.e. Hapmap)? The results is full of a lot of lists to try and comprehensively describe all the results. Comparing highest and lowest chromosomes really doesnt matter in the big picture. What matters are the regions in common between breeds. At this point in the article - one is wondering why they would have regions in common with Holstein, since we dont know if the Russian breeds are dairy or beef animals. It would have been more intriguing to see another continental (Fleckvieh/Hereford) or island breed and also a Mongolian or N. Chinese breed of cattle and maybe a zebu breed for contrast in the PCA. Is the result of nearly equal number of Fst SNPs between all 3 breeds surprising - considering the Holstein are under more intense selective pressure than the Russian breeds? The other 2 methods suggest this hypothesis is true (more ROH and FLKhap in Holsteins. The SNP in the intron of PLAG1 still has not been proven as the causative mutation, its purely speculative. There is another paper (Yuri Tani et al.) that disputes this with very strong evidence. This should be toned down in citation. Overall, a nice piece of thesis work; but the paper should not be presented like a thesis. The discussion is way too long, and should be trimmed down to highlight only the most important findings. The paper should be halved in length. The PCA and admixture should be moved to supplemental data, since they dont give very much insight (only 3 breeds in comparison has its drawbacks for interpretation). ********** 6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #1: No Reviewer #2: No [NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.] While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step. 11 Oct 2020 Response to the Editor: We deposited our Dataset files in the Dryad Digital Repository. Zinovieva, Natalia et al. (2020), Selection signatures in two oldest Russian native cattle breeds revealed using high-density single nucleotide polymorphism analysis, Dryad, Dataset, https://doi.org/10.5061/dryad.vt4b8gtq9. The Dataset will be free available upon the manuscript will be accessed for publication. The editors and reviewers could access the Dataset files using following URL: https://datadryad.org/stash/share/UDZHQNBg-m05QtB3jtPLze6UTRRXMXd5xGUmBDhjF3I We indicated the volume of collected blood samples – 5 ml of whole blood (L143). Responses to the Reviewer №1: 1. This manuscript present analysis of selection signature in two Russian breeds using bovine SNP chip panel. The analysis and results presented are detailed. My main concern is on the very small samples size (31 and 25) used in the study presented. These samples may not be enough to represent the haplotypes present in these populations, hence may not be repeatable if another sample from these population are tested. Most of the estimates are based on underlying LD which will require large samples size for reliable estimates of LD. It should be established and discussed if the samples were enough. Response: We agreed with the Reviewer. Considering the relatively small sample size, we were very thoughtful when choosing animals for research. The selected individuals represented different sire lines and different breeding farms, and revealed the low amount of Holstein ancestral components (S3 Figure). We specified it in Materials and methods section (L152-L155) and discussed in the Discussion section (L511 - L515). 2. The manuscript is too long and will require restructuring. Most of the results (tables and graphs) related to the individual three tests can go into supplementary. The key selection signature obtained from combining the tests could be included. E.g. Table 2, Table 3, Table 4 can be moved to the supplementary. Table 5 (or 4) can stay in the main manuscript. The main manuscript can focus on the main novel regions identified. The length of the manuscript can be easily reduced to one third. Response: We restructured the manuscript according to the suggestions of the Reviewer. We moved the PCA, individual three and Admixture results (Fig. 3) as well as the distribution of the ROH islands (Fig. 7) to the Supplemental data (S3 Fig and S4 Fig. in the revised manuscript). We moved the Table 2 and 3 (S4 and S5 Tables in the revised manuscript, respectively) to the supplemental data. We significantly shortened the discussion section, focusing on the most important observations. We described in details only the genomic regions, which were common for three breeds and provided only a brief analysis of other breed-specific regions. We tried to find the relationships between functions of identified candidate genes and the history of breeds. The length of the manuscript was reduced from 52 sites to 34 sites (excluding references). 3. Abstract could be more informative – including specific results: e.g. regions discovered. Response: We made the Abstract section more specific and identified genomic regions were pointed out. 4. Line 23: any cattle breed may carry selection signature – why Russian native only. – may rephrase. Response: We rephrased the sentence. 5. Line 28: “in which most of the ancestral genomic components are retained. – how this was established? Response: In our previous study of nine Russian cattle breeds (Sermyagin A.A. et al., GSE, 2018) based on the medium-density SNP BeadChips we identified the group of Russian cattle breeds including the Yaroslavl and Kholmogor breeds, which have been the least influenced by introgression with non-Russian breeds. In our last study of the historical specimens of Kholmogor and Yaroslavl cattle, dated from the end of 19th to the first half of 20th century using microsatellites (Abdelmanova A.S. et al., Genes, 2020) we confirmed the maintenance of historical components in modern populations of above-mentioned breeds. We added the paper of Abdelmanova et. al. to the references (ref. 11). Considering that these results belong to our previous studies, we rephrased the sentence in the Abstract section and made additional explanation in the Introduction section (L80 - L83). 6. Line: 46: “genomic regions and candidate genes that might be under putative selection” – I guess it should be putative regions and candidate genes. Response: We corrected the sentence. 7. ROH may be because of random drift or recent inbreeding and may not be specific to the selection signatures. – Response: We agree with the Reviewer. Considering that genetic drift or recent inbreeding may have an impact on ROHs (Kim ES et al., 2013, ref. 65), we selected the ROH islands to be under selection pressure, if they were distributed in more than 50% of animals and were confirmed by at least two methods or if the identified ROH islands were observed in more than 70% of animals. We added this information to the Discussion section (L575 – 578). 8. Line 254: these signature could be compared with the meta-assembly of selection signature in cattle (https://doi.org/10.1371/journal.pone.0153013 ) Response: We performed the comparison of the identified genomic regions with the meta-assembly of selection signature in cattle. We summarized the results in supplementary (S7 Table) and discussed it in the main text of the Manuscript (L577 – L578, L591- L594) 9. Line 309: should be “Estimates of ROHs and Froh Response: We corrected the title of the paragraph. Responses to the Reviewer №2: This is a manuscript in which the authors prepared a small but adequate amount of genotypic data on two Russian breeds of cattle to look for signatures of selection. Similar works (citations 31, 32) have been published before on a wider range of Russian breeds, but the tool used in those studies had a sever-fold lower density of SNP. Response: Thank you very much for the positive evaluation of our manuscript. 1. The main reason to study rare or dying breeds is to try and find genetic variation that is useful to understand the biology of cattle for use in future generations as humanity moves to a more monoculture approach to livestock farming. It would be very much appreciated if the authors would move the historic information and description of the breed type (dairy or beef or dual-purpose) to the introduction of the article. This provides the entire justification as to why the work has value and if the experiment will yield results of interest. Otherwise, one has to go to google to see and find out about these breeds, one which looks like an old time Holstein and the other looks like a black-white face dairy animal (both horned). Response: We expanded the description of the studied cattle breeds in the Introduction section by specifying the breed type (dairy) and adding the relevant historical information (L60 – L64, L67 – L71, L73 – L76, L80 – L83). We added a citation of our recent study of the historical specimens of Kholmogor and Yaroslavl cattle, dated from the end of 19th to the first half of 20th century using microsatellites (Abdelmanova et al., genes, 2020, ref. 11), confirming their earlier differentiation from Holstein ancestors. 2. In the introduction - Lines 93-109 are not necessary. Be succinct and summarize in a few sentences why the 3 method approach is better. Response: We shortened the paragraph (L115 - L121) 3. In the methods - its not quite clear. Were genes nearby but outside the zones of selection still considered as candidates. Genes beyond the detectable boundaries of the selection signatures should not be in the positional candidate genes list. Response: We sincerely appreciate this valuable comment. The windows of 0.4 Mb (0.2 Mb upstream and 0.2 Mb downstream of the top 0.1 % SNPs) were selected only for SNPs, which were identified by FST statistics. We provided the full list of genes, which were entirely or partly localized within selected windows (see S2 Table) with the aim to compare the genes with those, which were identified by hapFLK and ROH analysis. We consider the genes as positional candidate genes only if the causal SNPs were localized within genes. These genes were marked in bold in S2 Table. For hapFLK and ROH analyses, the genes were selected only if they were localized entirely or in part within the identified genomic regions. Table 4 includes only the genes, if they were localized within genomic regions identified by hapFLK and ROH analysis. The results of FST statistics were used only if they were overlapped with hapFLK and ROH analysis. In Table 4, we marked by asterisks the genes, which were localized within genomic region identified by hapFLK and ROH analysis, and thus could be considered as positional candidate genes and albeit were not overlapped with causal SNPs, selected by FST analysis, but were localized within 0,4 Mb windows of such SNPs. We have carefully checked the entire text and removed any inaccuracies in the use of the term positional candidate genes. 4. Population size - suggest moving lines 293-295 into the subsequent section where it belongs. Response: In accordance with the Reviewer’s suggestion, we moved the sentence concerning Ne in the subsequent section (L307 – L308). 5. The authors were given HD data from a set of Holsteins cited as Bahbahani et al.; however, this citation is not appropriate since this data was contributed to that paper as well. The original source of the Holstein data should be cited, as this is important to provide context as to what animals are in this cohort and how they were selected. If they are the animals from the Hapmap, then they would represent a diverse set of the most highly used AI bulls in USA history. Not really a reflection of European Holsteins or even Holsteins from Holland (at least 5 generations removed). Response: We checked thoroughly the paper of Bahbahani et al. (2017). According to the Materials and Methods section, the authors independently conducted genotyping of Holsteins using HD BeadChips. The row genotypes were deposited in Dryad Digital Repository as SNP genotypes data from Illumina BovineHD Genotyping BeadChip of the examined cattle populations under URL https://doi.org/10.5061/dryad.38jp6. The entire sample of Holstein breed, genotyped by Bahbahani et al. included 63 animals of both sexes (bulls and cows). We performed the kinship analysis and avoided the close-related animals (parent-offspring and full- or half-sibling) and selected for our study 25 non-related individuals of Holstein breed of both sexes. We added in the Manuscript the reference on the source of SNP data for Holsteins (ref. 35) 6. Were the marker heterozygosity calculations for Holsteins comparable to previously published works (i.e. Hapmap)? Response: We added in the Discussion section (L533 – L536) the comparison of inbreeding coefficient, detected in our study, with the one which were found in other recent reports, including HapMap (Huson HJ et al., ref. 61). 7. The results are full of a lot of lists to try and comprehensively describe all the results. Comparing highest and lowest chromosomes really doesnt matter in the big picture. What matters are the regions in common between breeds. At this point in the article - one is wondering why they would have regions in common with Holstein, since we dont know if the Russian breeds are dairy or beef animals. Response: According to the Reviewer’s suggestion, we expanded the description of studied breeds in the Introduction section, specifying the breed type (dairy) and short history of breeds (L60 – L64, L67 – L71, L73 – L76, L80 – L83). We added a citation of our recent study of the historical specimens of Kholmogor and Yaroslavl cattle, dated from the end of 19th to the first half of 20th century using microsatellites (Abdelmanova et al., genes, 2020, ref. 11), confirming their earlier differentiation from Holstein ancestors. We rewrote and significantly shortened the Discussion section. We provided the detail description of two genomic regions on BTA14 and BTA16, which are common for three breeds (Yaroslavl, Kholmogor and Holsteins). The other identified breed-specific genomic regions were only briefly described. We tried to find the possible explanations of relationships between functions of the identified candidate genes and the history of the studied breeds and provided these in the Discussion section. 8. It would have been more intriguing to see another continental (Fleckvieh/Hereford) or island breed and also a Mongolian or N. Chinese breed of cattle and maybe a zebu breed for contrast in the PCA. Response: Thank you for your suggestion. In our previous study of nine Russian cattle breeds, using medium-density SNP BeadChip (Sermyagin A.A. et al., GSE, 2018, https://gsejournal.biomedcentral.com/track/pdf/10.1186/s12711-018-0408-8) we have considered the Russian cattle breeds in context of European and worldwide breeds. Because the high-density SNP genotypes are still available for a limited number of breeds, we did not consider the studied breeds in comparison with the world breeds. 9. Is the result of nearly equal number of Fst SNPs between all 3 breeds surprising - considering the Holstein are under more intense selective pressure than the Russian breeds? The other 2 methods suggest this hypothesis is true (more ROH and FLKhap in Holsteins. Response: The top 0.1% SNP for FST values were considered to be under selection pressure. The differences between pairs of breeds were low, because the number of polymorphic SNPs, which were selected for the analysis for the studied pairs of breeds, was similar. 10. The SNP in the intron of PLAG1 still has not been proven as the causative mutation, its purely speculative. There is another paper (Yuri Tani et al.) that disputes this with very strong evidence. This should be toned down in citation. Response: We thoroughly analyzed the original papers describing the effect of PLAG1 genes on stature. We found an additional recently published report confirming the effect of causative SNP on stature in Chinese cattle breeds (Hou et al., Animal Genetics, 2020). We included this citation in the Manuscript (ref. 83). Unfortunately, we did not find another paper of Yuri Tani Utsunomiya et al. concerning PLAG1. Could the Reviewer, please, provide us with the reference? According to the Reviewer’s suggestion, we rewrote this part of the Discussion section and toned down the description of possible effect of PLAG1 in relationship with stature characteristics of studied breeds (L612 – L617). 11. Overall, a nice piece of thesis work; but the paper should not be presented like a thesis. The discussion is way too long, and should be trimmed down to highlight only the most important findings. The paper should be halved in length. The PCA and admixture should be moved to supplemental data, since they dont give very much insight (only 3 breeds in comparison has its drawbacks for interpretation). Response: Thank you very much for positive evaluation of our manuscript. We tried to take into account all the comments and made necessary corrections to the text of the manuscript. We significantly shortened the discussion section, highlighting the most important observations. We described in details only the genomic regions, which were common for three breeds and provided only a brief analysis of other breed-specific regions. We tried to find the relationships between gene functions of identified candidate genes and the history of breeds. We moved the PCA and Admixture results (Fig. 3), as well as the Distribution of ROH islands (Fig. 7) to the Supplemental data (Fig. S3 and S4 in the revised manuscript, respectively). According to the suggestion of the Reviewer 1, we moved to the supplemental data the Table 2 and 3 (S4 and S5 Tables in the revised manuscript, respectively). Submitted filename: Response to Reviewers.doc Click here for additional data file. 29 Oct 2020 Selection signatures in two oldest Russian native cattle breeds revealed using high-density single nucleotide polymorphism analysis PONE-D-20-15794R1 Dear Dr. Zinovieva, We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements. Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication. An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org. If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org. Kind regards, Gyaneshwer Chaubey, Ph.D. Academic Editor PLOS ONE Additional Editor Comments (optional): Reviewers' comments: 4 Nov 2020 PONE-D-20-15794R1 Selection signatures in two oldest Russian native cattle breeds revealed using high-density single nucleotide polymorphism analysis Dear Dr. Zinovieva: I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department. If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org. If we can help with anything else, please email us at plosone@plos.org. Thank you for submitting your work to PLOS ONE and supporting open access. Kind regards, PLOS ONE Editorial Office Staff on behalf of Dr. Gyaneshwer Chaubey Academic Editor PLOS ONE
  126 in total

1.  Adaptation to climate change--exploring the potential of locally adapted breeds.

Authors:  Irene Hoffmann
Journal:  Animal       Date:  2013-06       Impact factor: 3.240

2.  Variants modulating the expression of a chromosome domain encompassing PLAG1 influence bovine stature.

Authors:  Latifa Karim; Haruko Takeda; Li Lin; Tom Druet; Juan A C Arias; Denis Baurain; Nadine Cambisano; Stephen R Davis; Frédéric Farnir; Bernard Grisart; Bevin L Harris; Mike D Keehan; Mathew D Littlejohn; Richard J Spelman; Michel Georges; Wouter Coppieters
Journal:  Nat Genet       Date:  2011-04-24       Impact factor: 38.330

3.  Correlation and probability methods for one and two loci.

Authors:  J A Sved; M W Feldman
Journal:  Theor Popul Biol       Date:  1973-03       Impact factor: 1.570

4.  Evidence for pleiotropism and recent selection in the PLAG1 region in Australian Beef cattle.

Authors:  M R S Fortes; K Kemper; S Sasazaki; A Reverter; J E Pryce; W Barendse; R Bunch; R McCulloch; B Harrison; S Bolormaa; Y D Zhang; R J Hawken; M E Goddard; S A Lehnert
Journal:  Anim Genet       Date:  2013-08-05       Impact factor: 3.169

Review 5.  Genetics in geographically structured populations: defining, estimating and interpreting F(ST).

Authors:  Kent E Holsinger; Bruce S Weir
Journal:  Nat Rev Genet       Date:  2009-09       Impact factor: 53.242

6.  Bivariate genome-wide association analysis of the growth and intake components of feed efficiency.

Authors:  Nick V L Serão; Dianelys González-Peña; Jonathan E Beever; Germán A Bollero; Bruce R Southey; Daniel B Faulkner; Sandra L Rodriguez-Zas
Journal:  PLoS One       Date:  2013-10-29       Impact factor: 3.240

7.  A Genetic Investigation of Island Jersey Cattle, the Foundation of the Jersey Breed: Comparing Population Structure and Selection to Guernsey, Holstein, and United States Jersey Cattle.

Authors:  Heather J Huson; Tad S Sonstegard; James Godfrey; David Hambrook; Cari Wolfe; George Wiggans; Harvey Blackburn; Curtis P VanTassell
Journal:  Front Genet       Date:  2020-04-17       Impact factor: 4.599

8.  Discovery of single nucleotide polymorphisms in candidate genes associated with fertility and production traits in Holstein cattle.

Authors:  Sarah D Cochran; John B Cole; Daniel J Null; Peter J Hansen
Journal:  BMC Genet       Date:  2013-06-07       Impact factor: 2.797

9.  A Meta-Assembly of Selection Signatures in Cattle.

Authors:  Imtiaz A S Randhawa; Mehar S Khatkar; Peter C Thomson; Herman W Raadsma
Journal:  PLoS One       Date:  2016-04-05       Impact factor: 3.240

10.  Assessing genetic architecture and signatures of selection of dual purpose Gir cattle populations using genomic information.

Authors:  Amanda Marchi Maiorano; Daniela Lino Lourenco; Shogo Tsuruta; Alejandra Maria Toro Ospina; Nedenia Bonvino Stafuzza; Yutaka Masuda; Anibal Eugenio Vercesi Filho; Joslaine Noely Dos Santos Goncalves Cyrillo; Rogério Abdallah Curi; Josineudson Augusto Ii de Vasconcelos Silva
Journal:  PLoS One       Date:  2018-08-02       Impact factor: 3.240

View more
  3 in total

1.  Selection and Drift: A Comparison between Historic and Recent Dutch Friesian Cattle and Recent Holstein Friesian Using WGS Data.

Authors:  Ina Hulsegge; Kor Oldenbroek; Aniek Bouwman; Roel Veerkamp; Jack Windig
Journal:  Animals (Basel)       Date:  2022-01-29       Impact factor: 2.752

2.  Genomic Diversity Profiling and Breed-Specific Evolutionary Signatures of Selection in Arunachali Yak.

Authors:  Aneet Kour; Saket Kumar Niranjan; Mohan Malayaperumal; Utsav Surati; Martina Pukhrambam; Jayakumar Sivalingam; Amod Kumar; Mihir Sarkar
Journal:  Genes (Basel)       Date:  2022-01-28       Impact factor: 4.096

3.  Genomic inbreeding and runs of homozygosity analysis of indigenous cattle populations in southern China.

Authors:  Yuqiang Liu; Guoyao Zhao; Xiaojue Lin; Jiahao Zhang; Guanyu Hou; Luepei Zhang; Dewu Liu; Yaokun Li; Junya Li; Lingyang Xu
Journal:  PLoS One       Date:  2022-08-25       Impact factor: 3.752

  3 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.