
Analysis of genetic characteristics of pig breeds using information on single nucleotide polymorphisms.

Sang-Min Lee1, Jae-Don Oh1, Kyung-Do Park1, Kyoung-Tag Do2.   

Abstract

OBJECTIVE: This study was undertaken to investigate the genetic characteristics of Berkshire (BS), Landrace (LR), and Yorkshire (YS) pig breeds raised in the Great Grandparents pig farms using the single nucleotide polymorphisms (SNP) information.
METHODS: A total of 25,921 common SNP genotype markers in three pig breeds were used to estimate the expected heterozygosity (HE), polymorphism information content, F-statistics (FST), linkage disequilibrium (LD) and effective population size (Ne).
RESULTS: The chromosome-wise FST estimates in the BS, LR, and YS populations ranged from 0 to 0.36, and the average FST was 0.07±0.06, indicating some degree of genetic segregation among the breeds. The average LD (r2) for the BS, LR, and YS breeds was approximately 0.41. The average Ne over the last five generations was 19.9 (BS), 31.4 (LR), and 34.1 (YS). The Ne of the BS, LR, and YS breeds decreased at a consistent rate from 50 to 10 generations ago and declined faster over the last 10 generations, which may be evidence of intensive selection in pigs in the recent past.
CONCLUSION: To develop customized chips for the genomic selection of various breeds, it is important to select and utilize SNP based on the genetic characteristics of each breed. Because the efficiency of genetic improvement in breeding pigs increases sharply with population size, it is important to increase the number of test units, and it is desirable to establish a pig improvement network that enlarges the improvement unit by genetically connecting breeding pig farms.

Keywords:  Effective Population Size; F-statistics; Heterozygosity; Linkage Disequilibrium; Polymorphism Information Content

Year:  2018        PMID: 30145872      PMCID: PMC6409452          DOI: 10.5713/ajas.18.0304

Source DB:  PubMed          Journal:  Asian-Australas J Anim Sci        ISSN: 1011-2367            Impact factor:   2.509


INTRODUCTION

Investigating genetic architecture is the first important step towards genomic selection for the improvement of pig breeds. Genetic information on breeding pigs is now accumulating. Once a reference population is fully established, genomic selection becomes possible and can increase selection accuracy by combining genomic and phenotypic data with pedigree information [1]. Comprehensive information on genetic diversity and introgression is essential both for improving national breeding programs and for designing conservation programs. In the past, genetic diversity in pigs was mostly reported using microsatellite markers [2,3] and mtDNA [4]. However, single nucleotide polymorphisms (SNP) have advantages over microsatellites and mtDNA: they represent the major source of genetic variation, show low mutation rates, and are associated with complex heritable traits [5]. With the advent of next-generation sequencing technology, information on thousands of SNP is now readily available [6]. Among the various high-density SNP panels, the Illumina Porcine 60k BeadChip allows a more precise and comprehensive genome-wide investigation of genetic diversity and of the degree of admixture among pig breeds [7-9]. Linkage disequilibrium (LD) within a population can help to determine the relationships among SNP that affect economic traits, to map quantitative trait loci (QTL), and to select tagging SNP. Additionally, the LD between SNP at a given physical distance can be used to estimate the effective population size and to characterize genetic diversity [10]. Furthermore, the QTL governing economic traits in pigs have been studied frequently, primarily through genome-wide association studies using single marker regression [11-13].
This experiment was conducted to investigate the genetic characteristics and effective population sizes of Berkshire, Landrace, and Yorkshire pig breeds raised in the great grandparent (GGP) farms using the SNP information.

MATERIALS AND METHODS

Description of single nucleotide polymorphism data

A total of 3,710 pigs consisting of Berkshire (1,615), Landrace (1,041), and Yorkshire (1,054) animals were genotyped using the Porcine SNP 60k chip, and 61,565 SNP were collected. To ensure the quality of the genotypic data, the following were excluded from the analysis: SNP on the sex chromosomes, SNP without chromosome information, SNP with a missing rate higher than 10%, SNP without polymorphism (all homozygous or all heterozygous), SNP with a minor allele frequency below 1%, SNP with a Hardy-Weinberg disequilibrium chi-squared value above 23.93 (p<10⁻⁶), and animals with an SNP missing rate higher than 10%. In the Berkshire, Landrace, and Yorkshire breeds, 30, 3, and 19 pigs, respectively, had an SNP missing rate higher than 10%. Therefore, the numbers of pigs (SNP) remaining after quality control in Berkshire, Landrace, and Yorkshire were 1,585 (38,962), 1,038 (26,392), and 1,035 (40,783), respectively. In this study, only the 25,921 SNP common to the three breeds were used for the analyses.
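The per-SNP quality-control rules above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names and the genotype-count inputs are assumptions made for illustration, and only the thresholds (10% missing rate, 1% minor allele frequency, chi-squared value of 23.93) come from the text. The helper `chi2_pvalue_1df` shows that a chi-squared value of 23.93 with 1 degree of freedom corresponds to a p-value of about 10⁻⁶.

```python
import math


def hwe_chi2(n_aa, n_ab, n_bb):
    """Chi-squared statistic (1 df) for departure from Hardy-Weinberg proportions."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Skip classes with zero expectation (monomorphic SNP) to avoid dividing by zero.
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


def chi2_pvalue_1df(x):
    """Upper-tail p-value of a chi-squared statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def passes_qc(n_aa, n_ab, n_bb, n_missing,
              max_missing=0.10, min_maf=0.01, max_hwe_chi2=23.93):
    """Apply the per-SNP filters described in the text to genotype counts."""
    called = n_aa + n_ab + n_bb
    if called == 0:
        return False
    if n_missing / (called + n_missing) > max_missing:
        return False  # too many missing genotypes
    p = (2 * n_aa + n_ab) / (2 * called)
    if min(p, 1.0 - p) < min_maf:
        return False  # monomorphic or rare minor allele
    if n_ab == called:
        return False  # every animal heterozygous
    return hwe_chi2(n_aa, n_ab, n_bb) <= max_hwe_chi2
```

For example, `passes_qc(25, 50, 25, 0)` accepts a SNP in exact Hardy-Weinberg proportions, whereas `passes_qc(100, 0, 0, 0)` rejects a monomorphic one.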

Statistical models

Expected heterozygosity

The expected heterozygosity (HE) of a locus is defined as the probability that an individual in the population is heterozygous at that locus:

$$H_E = 1 - \sum_{i=1}^{n} p_i^2$$

where $p_i$ is the frequency of the ith allele of the n alleles [14].
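As a small illustration of this definition, the following Python sketch computes the expected heterozygosity from a list of allele frequencies, using the usual form of one minus the sum of squared allele frequencies. The function name is hypothetical and the input frequencies are made-up examples, not data from this study.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity: 1 - sum(p_i^2) over the allele frequencies."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in freqs)


# A biallelic SNP with equal allele frequencies gives the maximum value, 0.5.
h_balanced = expected_heterozygosity([0.5, 0.5])
# A SNP with a rare allele (10%) is far less informative.
h_rare = expected_heterozygosity([0.9, 0.1])
```

For a biallelic SNP the value cannot exceed 0.5, which is why chip-based averages such as those reported here sit well below that bound.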

Polymorphism information content

The polymorphism information content (PIC) refers to the value of a marker for detecting polymorphism within a population, depending on the number of detectable alleles and the distribution of their frequencies [15]:

$$PIC = 1 - \sum_{i=1}^{n} p_i^2 - \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} 2 p_i^2 p_j^2$$

where n is the total number of alleles, and $p_i$ and $p_j$ are the frequencies of the ith and jth alleles in the population, respectively [16]. The PIC is defined as the probability that the marker genotype of a given offspring will allow deduction, in the absence of crossing over, of which of the two marker alleles of the affected parent it received [17].
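A minimal Python sketch of the PIC calculation follows, assuming the commonly used form: expected heterozygosity minus the sum over allele pairs of 2·p_i²·p_j². The function name is hypothetical and the frequencies are illustrative only.

```python
from itertools import combinations


def pic(freqs):
    """Polymorphism information content from allele frequencies."""
    heterozygosity = 1.0 - sum(p * p for p in freqs)
    # Correction term: 2 * p_i^2 * p_j^2 summed over all unordered allele pairs.
    cross = sum(2.0 * (pi * pj) ** 2 for pi, pj in combinations(freqs, 2))
    return heterozygosity - cross


pic_balanced = pic([0.5, 0.5])  # 0.5 - 2 * 0.25^2 = 0.375
pic_rare = pic([0.9, 0.1])      # 0.18 - 2 * 0.09^2 = 0.1638
```

Because the correction term is never negative, PIC can never exceed the expected heterozygosity of the same marker, and for a biallelic SNP it is bounded by 0.375.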

F-statistics (FIS,FST,FIT)

The F-statistics were used to compare the genetic characteristics among the breeds:

$$1 - F = (1 - f)(1 - \theta)$$

where F is the correlation of genes within individuals, θ is the correlation of genes of different individuals in the same population, and f is the correlation of genes within individuals within populations. These parameters are related to Wright's F-statistics [18] as F = FIT, θ = FST, and f = FIS. FIT represents the degree of genetic fixation of a breed, FIS represents the degree of inbreeding of individuals in a population, and FST indicates the degree of genetic segregation of the populations. By using F-statistics, paired t-tests with all SNP were performed among the breeds.

Linkage disequilibrium

The amount of LD between two loci can be measured by the standardized D (D′) [19] or by r2 [20]. However, D′ depends on the frequencies of the individual alleles and can be overestimated when the population size or an allele frequency is small, whereas r2 is less dependent on allele frequencies; therefore, LD was estimated using r2. It was expressed as the square (r2) of the correlation coefficient between SNP pairs, calculated between each allele at locus A and each allele at locus B [21]:

$$r^2 = \frac{D^2}{P_A P_a P_B P_b}$$

where $D = P_{AB} - P_A P_B$; $P_A$, $P_a$, $P_B$, and $P_b$ are the frequencies of alleles A, a, B, and b, respectively; and $P_{AB}$ is the frequency of the haplotype AB. Haplotypes within haploblocks were obtained using the expectation maximization (EM) algorithm, similar to the partition/ligation method [22].
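The r2 measure described above can be sketched in a few lines of Python. The sketch assumes the standard definition, in which D is the AB haplotype frequency minus the product of the A and B allele frequencies, and r2 is D squared divided by the product of the four allele frequencies. The function name and the example frequencies are hypothetical.

```python
def ld_r2(p_ab, p_a, p_b):
    """r^2 between two biallelic loci.

    p_ab: frequency of haplotype AB
    p_a:  frequency of allele A at the first locus
    p_b:  frequency of allele B at the second locus
    """
    d = p_ab - p_a * p_b
    denom = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    return d * d / denom


r2_independent = ld_r2(0.25, 0.5, 0.5)  # haplotypes at equilibrium, D = 0
r2_complete = ld_r2(0.5, 0.5, 0.5)      # only AB and ab haplotypes exist
```

The two examples mark the extremes of the scale: r2 is 0 when the loci are independent and 1 when the allele at one locus fully predicts the allele at the other.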
When both loci are homozygous, or when the genotype at one of the two loci is homozygous, the haplotype frequencies can be counted directly. When both loci are heterozygous (double heterozygotes), however, DNA chip analysis cannot distinguish the coupling (A1B1/A2B2) from the repulsion (A1B2/A2B1) phase. Therefore, the EM algorithm [22], which finds maximum likelihood estimates of the parameters of a probability model that depends on unobserved latent variables, was used to calculate the conditional probabilities of coupling (A1B1/A2B2) and repulsion (A1B2/A2B1), and the LD value was estimated by iterating until the change became less than 10⁻⁵ [23].

Effective population size

The effective population size was determined based on a simple expectation from the amount of LD over a given chromosome segment. Since LD breaks down more rapidly over the generations for loci that are further apart, LD at large distances reflects Ne in recent generations [24]:

$$N_t = \frac{1}{4c}\left(\frac{1}{\bar{r}_c^2} - 1\right)$$

where $N_t$ is the effective population size t generations ago, c is the recombination distance between the SNP in Morgans, c = 1/(2t), and $\bar{r}_c^2$ is the mean value of r2 for markers that are c Morgans apart. It was assumed that 1 Mb of physical distance corresponds to 1 cM of genetic distance.
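The phase ambiguity of double heterozygotes can be illustrated with a minimal EM sketch for two biallelic SNP. This is a textbook-style illustration, not the software used in the study: the function name, the 3×3 genotype-count layout, and the starting values are assumptions, while the convergence threshold of 10⁻⁵ follows the text.

```python
def em_haplotype_freqs(counts, tol=1e-5, max_iter=1000):
    """Estimate two-locus haplotype frequencies by EM.

    counts[i][j] is the number of animals carrying i copies of allele A1
    and j copies of allele B1 (i, j in 0..2). Only counts[1][1], the
    double heterozygotes, have an unknown phase.
    """
    n_hap = 2 * sum(sum(row) for row in counts)
    # Haplotypes that can be counted directly from unambiguous genotypes.
    base = {
        "A1B1": 2 * counts[2][2] + counts[2][1] + counts[1][2],
        "A1B2": 2 * counts[2][0] + counts[2][1] + counts[1][0],
        "A2B1": 2 * counts[0][2] + counts[1][2] + counts[0][1],
        "A2B2": 2 * counts[0][0] + counts[1][0] + counts[0][1],
    }
    n_dh = counts[1][1]
    freqs = {hap: 0.25 for hap in base}  # uninformative starting point
    for _ in range(max_iter):
        # E-step: probability that a double heterozygote is in coupling phase.
        coupling = freqs["A1B1"] * freqs["A2B2"]
        repulsion = freqs["A1B2"] * freqs["A2B1"]
        total = coupling + repulsion
        w = coupling / total if total > 0 else 0.5
        # M-step: add the expected haplotype counts of double heterozygotes.
        new = {
            "A1B1": (base["A1B1"] + w * n_dh) / n_hap,
            "A2B2": (base["A2B2"] + w * n_dh) / n_hap,
            "A1B2": (base["A1B2"] + (1.0 - w) * n_dh) / n_hap,
            "A2B1": (base["A2B1"] + (1.0 - w) * n_dh) / n_hap,
        }
        change = max(abs(new[hap] - freqs[hap]) for hap in new)
        freqs = new
        if change < tol:
            break
    return freqs
```

With `counts = [[10, 0, 0], [0, 20, 0], [0, 0, 10]]`, the unambiguous animals carry only A1B1 and A2B2 haplotypes, so the iterations assign nearly all double heterozygotes to the coupling phase.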

RESULTS AND DISCUSSION

Expected heterozygosity

Figure 1 illustrates the chromosome-wise distribution of expected heterozygosity (HE) of the SNP in the studied pig breeds. All three breeds showed a similar trend in the HE estimates. The average HE in Berkshire, Landrace, and Yorkshire was 0.33±0.15, 0.36±0.14, and 0.36±0.14, respectively; it was lowest in Berkshire and the same in Landrace and Yorkshire. Ai et al [25], who studied the genetic diversity of 18 pig breeds using the 60K SNP chip, reported a similar expected heterozygosity (0.38) for Landrace and Large White, which agrees with the results of this study. The average HE was highest in Sus scrofa chromosome 6 (SSC6; 0.36±0.15) in Berkshire, SSC18 (0.38±0.12) in Landrace, and SSC14 (0.38±0.13) and SSC16 (0.38±0.13) in Yorkshire. In contrast, the lowest HE estimates were found in SSC15 (0.29±0.16), SSC10 (0.34±0.13), and SSC1 (0.34±0.14) in Berkshire, Landrace, and Yorkshire pigs, respectively (Figure 1).
Figure 1

Chromosome-wise distribution of single nucleotide polymorphism heterozygosity (dots) and their average (solid line) in Berkshire (a), Landrace (b) and Yorkshire (c).

Polymorphism information contents

The PIC estimates, obtained using the HE values, represent the polymorphism information at each locus [16]. The average PIC in the Berkshire, Landrace, and Yorkshire breeds was 0.26±0.11, 0.28±0.10, and 0.29±0.10, respectively. Across the chromosomes, the PIC estimates for the SNP were highest in SSC6 (0.28±0.10) in Berkshire, SSC18 (0.30±0.08) in Landrace, and SSC14 and SSC16 (0.30±0.09) in Yorkshire. The lowest PIC values were observed for SSC15 (0.23±0.12) in Berkshire, SSC10 (0.27±0.09) in Landrace, and SSC1 (0.27±0.10) in Yorkshire. Overall, the PIC estimates were lower than the average HE estimates (Table 1).
Table 1

Chromosome-wise polymorphism information content (PIC) in Berkshire (BS), Landrace (LR), and Yorkshire (YS) pigs

SSC        BS           LR           YS
1          0.27±0.14    0.27±0.10    0.27±0.10
2          0.25±0.11    0.28±0.10    0.28±0.09
3          0.26±0.11    0.29±0.09    0.29±0.09
4          0.26±0.11    0.29±0.09    0.28±0.10
5          0.25±0.11    0.27±0.11    0.29±0.09
6          0.28±0.10    0.29±0.09    0.29±0.09
7          0.27±0.11    0.29±0.09    0.29±0.10
8          0.28±0.11    0.28±0.10    0.29±0.09
9          0.26±0.11    0.28±0.10    0.29±0.09
10         0.25±0.11    0.27±0.09    0.28±0.10
11         0.26±0.11    0.29±0.09    0.28±0.09
12         0.26±0.11    0.28±0.10    0.28±0.10
13         0.27±0.10    0.29±0.09    0.28±0.10
14         0.26±0.11    0.28±0.10    0.30±0.09
15         0.23±0.12    0.29±0.09    0.28±0.10
16         0.26±0.10    0.28±0.10    0.30±0.09
17         0.26±0.10    0.28±0.10    0.29±0.10
18         0.27±0.11    0.30±0.08    0.29±0.09
Overall    0.26±0.11    0.28±0.10    0.29±0.10

SSC, Sus scrofa chromosome.

Pairwise t-test

Using the HE and PIC estimates in each breed, pairwise t-tests were performed between the breeds. For HE, there was no significant difference at the 0.05 level in SSC1 and SSC8 between Berkshire and Landrace, in SSC6 and SSC8 between Berkshire and Yorkshire, and in SSC2, SSC3, SSC8, SSC10, SSC12, SSC15, and SSC17 between Landrace and Yorkshire. For PIC, there was no significant difference at the 0.05 level in SSC1 and SSC8 between Berkshire and Landrace, in SSC6 and SSC8 between Berkshire and Yorkshire, and in SSC1, SSC2, SSC3, SSC8, SSC10, SSC12, and SSC17 between Landrace and Yorkshire (Table 2).
Table 2

Chromosome-wise mean differences1) for heterozygosity (H) and polymorphism information content (PIC) among Berkshire (BS), Landrace (LR), and Yorkshire (YS) breeds

           HE (|D|)                                  PIC (|D|)
SSC        BS vs LR     BS vs YS     LR vs YS       BS vs LR     BS vs YS     LR vs YS
1          0.0049 NS    0.0119**     0.0069**       0.0041 NS    0.0078**     0.0036 NS
2          0.0487**     0.0487**     0.001 NS       0.0349**     0.0362**     0.0013 NS
3          0.0366**     0.0438**     0.0073 NS      0.0265**     0.0319**     0.0054 NS
4          0.0383**     0.0258**     0.0125**       0.0285**     0.0196**     0.0089**
5          0.0304**     0.0523**     0.0219**       0.0209**     0.0376**     0.0167**
6          0.0141**     0.0029 NS    0.0113*        0.0106**     0.0031 NS    0.0075*
7          0.0321**     0.0243**     0.0078*        0.0238**     0.0181**     0.0057*
8          0.0013 NS    0.0071 NS    0.0083 NS      0.0001 NS    0.0068 NS    0.0067 NS
9          0.0381**     0.0515**     0.0134**       0.0274**     0.0366**     0.0092**
10         0.0273**     0.0380**     0.0107 NS      0.0218**     0.0281**     0.0063 NS
11         0.0450**     0.0332**     0.0118*        0.0335**     0.0251**     0.0084*
12         0.0282**     0.0187*      0.0094 NS      0.0208**     0.0144**     0.0065 NS
13         0.0190**     0.0102*      0.0088*        0.0133**     0.0069**     0.0064*
14         0.0245**     0.0478**     0.0233**       0.0169**     0.0335**     0.0166**
15         0.0754**     0.0656**     0.0098 NS      0.0553**     0.0484**     0.0069
16         0.0219**     0.0600**     0.0382**       0.0143**     0.0414**     0.0271**
17         0.0288**     0.0515**     0.0227**       0.0198**     0.0342**     0.0144**
18         0.0398**     0.0306**     0.0092 NS      0.0298**     0.0229**     0.0068 NS
Overall    0.0289**     0.0327**     0.0038**       0.0211**     0.0236**     0.0144**

SSC, Sus scrofa chromosome; NS, not significant.

1) Differences were inferred based on pairwise t-tests.

* p<0.05; ** p<0.01.

However, the pairwise t-tests using all SNP revealed significant differences (p<0.01) in the average HE and PIC among the Berkshire, Landrace, and Yorkshire breeds (Table 2). Edea et al [26] reported that the HE estimate was lowest in Berkshire (0.31±0.17) and highest in Landrace (0.42±0.22), with Yorkshire at 0.35±0.17. The results of this study were consistent with that report, and the estimates of expected heterozygosity showed the same pattern (Berkshire, 0.327±0.017; Landrace, 0.363±0.012; and Yorkshire, 0.361±0.011).

F-statistics

To investigate differences in genetic characteristics, F-statistics were estimated among the Berkshire, Landrace, and Yorkshire populations. The FST estimates by chromosome among breeds ranged from 0 to 0.36, and the chromosome-wise distributions of FST for the SNP are shown in Figure 2. A previous study reported FST values of 0.22 for Berkshire vs Landrace, 0.24 for Berkshire vs Yorkshire, and 0.20 for Landrace vs Yorkshire [26]. As the FST value increased, the number of SNP decreased, and the same trend was observed in all chromosomes.
Figure 2

Chromosome-wise FST estimates of single nucleotide polymorphisms in Berkshire, Landrace and Yorkshire.

The number of SNP with an FST value among breeds of less than 0.05 was 12,008 (46.3%), the number with an FST value between 0.05 and 0.2 was 12,901 (49.8%), and the number with an FST value of more than 0.2 was 1,012 (3.9%). The average FST across all chromosomes was 0.07±0.06. This result indicates that some genetic segregation has occurred among the breeds.
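To make the per-SNP FST classes concrete, the following Python sketch computes a simplified, heterozygosity-based FST for a biallelic SNP, (HT − HS)/HT with equal weight per breed, and assigns it to one of the three classes used above. This estimator is a swapped-in simplification chosen for illustration and is not necessarily the estimator used in the study; the function names and allele frequencies are hypothetical.

```python
def fst_biallelic(freqs):
    """Simplified FST for one biallelic SNP.

    freqs: frequency of the same allele in each population (equal weights).
    """
    p_bar = sum(freqs) / len(freqs)
    h_total = 2.0 * p_bar * (1.0 - p_bar)  # heterozygosity of the pooled population
    if h_total == 0:
        return 0.0                         # allele fixed everywhere
    h_within = sum(2.0 * p * (1.0 - p) for p in freqs) / len(freqs)
    return (h_total - h_within) / h_total


def fst_class(fst):
    """Assign an FST value to the three classes reported in the text."""
    if fst < 0.05:
        return "<0.05"
    if fst <= 0.2:
        return "0.05-0.2"
    return ">0.2"


fst_same = fst_biallelic([0.5, 0.5, 0.5])    # identical frequencies in 3 breeds
fst_divergent = fst_biallelic([0.1, 0.9])    # strongly diverged frequencies
```

Identical allele frequencies give an FST of 0, while the strongly diverged example gives 0.64 and falls into the highest class.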

Linkage disequilibrium

The average physical distance between adjacent SNP pairs by chromosome was largest in SSC6 (126.59 kb) and smallest in SSC14 (66.73 kb), and the overall average distance was 94.09 kb (Table 3). A total of 22,571,445 SNP pairs were used to estimate LD (r2). The average r2 between adjacent SNP was 0.411, 0.408, and 0.413 in Berkshire, Landrace, and Yorkshire, respectively. Similar estimates of 0.36, 0.39, 0.44, and 0.46 were reported for Landrace, Yorkshire, Hampshire, and Duroc, respectively, in the USA [27]. However, Uimari and Tapio [28] reported estimates of 0.47 (Yorkshire) and 0.49 (Landrace) in Finland, which were higher than ours. Across the chromosomes, the r2 between adjacent markers was highest in SSC1 in Berkshire (0.47), SSC14 in Landrace (0.49), and SSC1, SSC13, and SSC14 (0.47) in Yorkshire (Table 4).
Table 3

Chromosome-wise number of SNP, SNP pairs and average distance between adjacent marker pairs (ADAM, kb) among three pig breeds

SSC        No. of SNP    No. of SNP pairs    ADAM (kb)
1          3,233         5,224,528           97.26
2          1,728         1,492,128           93.73
3          1,218         741,153             117.26
4          1,996         1,991,010           71.86
5          1,126         633,375             98.44
6          1,243         771,903             126.59
7          1,994         1,987,021           67.41
8          1,252         783,126             117.72
9          1,656         1,370,340           92.56
10         827           341,551             93.91
11         885           391,170             98.55
12         651           211,575             97.39
13         1,999         1,997,001           108.92
14         2,302         2,648,451           66.73
15         1,414         998,991             111.24
16         872           379,756             99.54
17         929           431,056             74.66
18         596           177,310             100.73
Overall    25,921        22,571,445          94.09

SNP, single nucleotide polymorphism; ADAM, average distance between adjacent marker; SSC, Sus scrofa chromosome.
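The pair counts in Table 3 are consistent with taking every within-chromosome pair of SNP, that is n(n−1)/2 pairs for a chromosome with n SNP. The short Python check below reproduces the table totals from the per-chromosome SNP counts; the variable and function names are chosen here for illustration.

```python
# SNP counts per autosome (SSC1 to SSC18), copied from Table 3.
SNP_COUNTS = [3233, 1728, 1218, 1996, 1126, 1243, 1994, 1252, 1656,
              827, 885, 651, 1999, 2302, 1414, 872, 929, 596]


def n_pairs(n_snp):
    """Number of unordered SNP pairs on one chromosome: n(n-1)/2."""
    return n_snp * (n_snp - 1) // 2


pairs_per_chr = [n_pairs(n) for n in SNP_COUNTS]
total_snp = sum(SNP_COUNTS)       # 25921, the number of common SNP
total_pairs = sum(pairs_per_chr)  # 22571445, the number of pairs used for LD
```

Because pairs are formed only within chromosomes, the total is far smaller than the number of pairs among all 25,921 SNP taken together.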

Table 4

Mean linkage disequilibrium (r2) estimates for adjacent (ADJ) and all pairs (ALL) of SNP in Berkshire (BS), Landrace (LR), and Yorkshire (YS)

           BS              LR              YS
SSC        All     ADJ     All     ADJ     All     ADJ
1          0.05    0.47    0.02    0.46    0.02    0.47
2          0.04    0.40    0.03    0.39    0.03    0.39
3          0.04    0.38    0.03    0.39    0.03    0.34
4          0.03    0.42    0.02    0.41    0.02    0.43
5          0.03    0.35    0.02    0.35    0.02    0.35
6          0.03    0.38    0.03    0.40    0.03    0.43
7          0.03    0.40    0.03    0.39    0.02    0.38
8          0.04    0.39    0.02    0.37    0.03    0.40
9          0.04    0.36    0.03    0.38    0.03    0.42
10         0.03    0.34    0.02    0.37    0.02    0.32
11         0.04    0.39    0.03    0.35    0.02    0.33
12         0.04    0.41    0.03    0.37    0.03    0.36
13         0.04    0.41    0.04    0.41    0.04    0.47
14         0.05    0.46    0.04    0.49    0.03    0.47
15         0.03    0.44    0.02    0.43    0.02    0.43
16         0.04    0.39    0.03    0.39    0.04    0.39
17         0.04    0.41    0.03    0.37    0.03    0.40
18         0.05    0.43    0.04    0.38    0.03    0.36
Overall    0.04    0.41    0.03    0.41    0.03    0.41

SSC, Sus scrofa chromosome.

The values of r2 decreased with increasing distance between SNP pairs (Figure 3), and the most rapid decline was observed over the first 2 Mb. Beyond that, r2 decreased more slowly with increasing distance and remained constant after 5 Mb [28]. In each breed, the pattern and magnitude of the LD decline over distances of less than 10 Mb were similar.
Figure 3

Changes in linkage disequilibrium estimates (r2) between single nucleotide polymorphism markers within 10 mega base (Mb) pair distance.

Effective population size

When the LD (r2) between SNP located at close physical distances is low, it can be inferred that genetic recombination at that locus occurred a long time ago. Similarly, when the r2 between SNP located at far physical distances is high, genetic recombination at that locus occurred recently. The extent of genetic recombination can be estimated from the population size, while the Ne across the generations can be estimated from r2 [10,29]. The Ne of Berkshire, Landrace, and Yorkshire over the last 1 to 5 generations was estimated to be 19.87, 31.41, and 34.09 animals, respectively (Figure 4). A previous study reported that the Ne of Landrace and Yorkshire in Finland was approximately 80 and 55 animals, respectively [28].
Figure 4

Changes in past effective population size (Ne) in Berkshire (a), Landrace (b) and Yorkshire (c).
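The link between marker distance, generations in the past, and Ne can be sketched in Python. The sketch assumes the widely used approximation E(r2) ≈ 1/(4·Ne·c + 1), rearranged to Ne = (1/(4c))·(1/r2 − 1), with t = 1/(2c) generations ago and 1 Mb treated as 1 cM as stated in the Methods. The function names and the example r2 value are hypothetical, and no sample-size correction is applied.

```python
def mb_to_morgan(distance_mb):
    """Convert physical distance to Morgans, assuming 1 Mb = 1 cM = 0.01 Morgan."""
    return distance_mb / 100.0


def generations_ago(c_morgan):
    """Generation in the past reflected by markers c Morgans apart: t = 1/(2c)."""
    return 1.0 / (2.0 * c_morgan)


def effective_population_size(mean_r2, c_morgan):
    """Ne t generations ago from the mean r2 of markers c Morgans apart."""
    return (1.0 / (4.0 * c_morgan)) * (1.0 / mean_r2 - 1.0)


c = mb_to_morgan(10.0)                    # markers 10 Mb apart -> 0.1 Morgan
t = generations_ago(c)                    # about 5 generations ago
ne = effective_population_size(0.10, c)   # 22.5 for an illustrative mean r2 of 0.10
```

Under these assumptions, marker pairs about 10 Mb apart inform Ne about five generations ago, while pairs about 1 Mb apart inform Ne about 50 generations ago, which is why long-range LD is used for the recent generations.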

The effective population sizes were small compared with those of countries with advanced pig industries, because the domestic GGP farms are relatively small in scale. Additionally, closed herds have been maintained and inbreeding mating systems have been applied. In Berkshire, the past Ne changed noticeably from 50 to 5 generations ago, from 97.7 to 50, with a gradual increase in the rate of decline per generation (0.8% to 9.7%). Similarly, Ne declines were also observed in Landrace (100.2 to 50) and Yorkshire (102.3 to 34.1) pigs, with somewhat similar rates of decline. The Ne of Berkshire, Landrace, and Yorkshire decreased at a constant slope from 50 to 10 generations ago, with a sharp decrease over the most recent 10 generations. Similar results were reported by Uimari and Tapio [28]. These results suggest that intensive artificial selection has taken place over the most recent 10 generations (Figure 4).

CONCLUSION

To develop customized chips for the genomic selection of various breeds, it is important to select and utilize SNP based on the genetic characteristics of each breed. Because the efficiency of genetic improvement in breeding pigs increases sharply with population size, it is important to increase the number of test units, and it is desirable to establish a pig improvement network that enlarges the improvement unit by genetically connecting breeding pig farms.