Whole genome SNPs discovery in Nero Siciliano pig.

Enrico D'Alessandro1, Domenico Giosa2, Irene Sapienza1, Letterio Giuffrè1, Riccardo Aiese Cigliano3, Orazio Romeo2,4, Alessandro Zumbo1.   

Abstract

Autochthonous pig breeds represent an important genetic reserve to be utilized mainly for the production of typical products. To explore the genetic variability of one such breed, here we present for the first time whole genome sequencing data and SNPs discovered in a male domestic Nero Siciliano pig compared to the latest pig reference genome (Sscrofa11.1). A total of 346.8 million paired reads were generated by sequencing. After quality control, 99.03% of the reads were mapped to the reference genome, and over 11 million variants were detected. Additionally, we evaluated sequence diversity in 21 fitness-related loci selected based on their biological function and/or their proximity to relevant QTLs. We focused on genes that have been related to environmental adaptation and reproductive traits in previous studies of local breeds. A total of 6,747 variants were identified, corresponding to a rate of 1 variant every ~276 bases. Among these variants, 1,132 were novel to the dbSNP151 database. This study represents a first step in the genetic characterization of the Nero Siciliano pig and also provides a platform for future comparative studies between this and other swine breeds.

Year:  2019        PMID: 31188930      PMCID: PMC6905442          DOI: 10.1590/1678-4685-GMB-2018-0169

Source DB:  PubMed          Journal:  Genet Mol Biol        ISSN: 1415-4757            Impact factor:   1.771


Preserving genetic variability that is used, or potentially usable, for the production of food and non-food raw materials, or that is related to social, cultural, and economic aspects, is a challenge of fundamental importance at a global level. Although the difficulty of conserving biodiversity in species of zootechnical interest is not a recent concern, the need to preserve genetic variability within breeds has increased in recent years and is important in production systems (Kuec ). Animal genetic resources and their management systems are an integral part of ecosystems and productive landscapes in Italy, especially in Sicily. More than ever, the role of livestock is to provide humans with sufficient food that is protein-rich, safe and healthy, and of high nutritional and organoleptic value. Local pig breeds can supply raw materials that are particularly suitable for typical processed products. The higher economic value of typical productions compared to conventional commercial products, together with the growing consumer preference for quality food, could support plans for livestock biodiversity conservation (Herrero-Medrano ; Wilkinson ). In this respect, governments, institutional breeding organizations, private breeders and market demand play a crucial role in protecting and adding value to local breeds (Tapio ; Ollivier, 2009). Although interest in local pig breeds has increased significantly in recent years, only a few of them have been included in whole-genome sequencing projects (Esteve-Codina , 2013; Herrero-Medrano ). Knowledge of the genetic background of these local breeds is very important, as many of them have unique characteristics that could help address the challenges related to climate change, the growth of the world population, and food security and nutrition, as highlighted in the Domestic Animal Diversity Information System (DAD-IS) of FAO.
The Nero Siciliano pig is an autochthonous genetic type of the rural areas of Sicily (Italy). It lives in the woods of the Nebrodi and Madonie mountains and is reared in extensive and semi-extensive systems, making good use of pasture and other natural plant resources following the traditional practices of this area. It is resistant to disease and has great potential for adaptation to difficult environments, as it is highly capable of rooting and procuring food. Its “Register of Native Breeds” was established in 2001 and now contains about 14,000 animals, of which ~5,000 are sows (ANAS - Italian Pig Breeders Association, 2018), from over 128 farms. The meat obtained from these pigs is sold at a higher price than that of commercial pigs, and in 2005 a request was made to allow labelling fresh Nero Siciliano meat with the Protected Denomination of Origin (D’Alessandro ). Black pigs are rustic, disease-resistant animals that live well in harsh conditions, but they run a high risk of losing their original traits because there is no real plan for genetic selection or for setting up appropriate breeding systems and controls. The genetic variability of the Nero Siciliano pig has been assessed with various genetic markers in several studies on the molecular characterization of its genetic structure and on the analysis of coat colour genes (MC1R and KIT) to evaluate their usefulness for breed traceability (Russo ; Fontanesi ; Guastella ).
All the procedures used in this research were in compliance with the European guidelines for the care and use of animals in research (Directive 2010/63/EU). A blood sample from a male Nero Siciliano pig was used for DNA extraction. The individual was chosen for this study as one of the most representative boars of this breed, registered in the “Register of native breed” (ANAS; ID:163347).
The leukocyte fraction recovered from the fresh whole blood sample was used for total genomic DNA (gDNA) extraction using the Wizard® Genomic DNA Purification Kit (Promega Corporation, Italy), following the manufacturer’s instructions. DNA was quantified with a Qubit 2.0 Fluorometer and the Qubit dsDNA HS Assay Kit (Thermo Fisher, Italy). DNA quality was assessed with a Nanophotometer P-330 (Implen GmbH) and by visual inspection after agarose gel electrophoresis (1% agarose in TAE 1X buffer). A PCR-free library was prepared with the TruSeq DNA kit (insert size 350 bp) using 1 μg of gDNA and following the protocol provided by Illumina. Paired-end sequencing was carried out on a HiSeq X platform (Illumina). The raw reads were checked with the FastQC program and cleaned with Trimmomatic v. 0.36 (Bolger ) to remove adapters and low-quality sequences (Phred score < 30). Good-quality reads were mapped against the Sus scrofa reference genome (version 11.1; GenBank: GCA_000003025.6) with BWA (version 0.7.12-r1039) (Li and Durbin, 2009), and mapping quality was evaluated using Qualimap2 (Okonechnikov ). Single nucleotide polymorphisms (SNPs), short insertions and deletions (INDELs), and structural variants (SVs) were analysed using the SUPERW and PINDEL pipelines (Ye ; Sanseverino ). The resulting variants were further filtered using the following parameters: QUAL (phred-scaled quality score of the called variant) ≥ 30, DP (number of high-quality bases for the called variant) ≥ 10, AD (allele depth) ≥ 10, and removal of all called variants that showed the same genotype as the reference. Putative effects of SNPs were evaluated using SnpEff software v4_3m_core (Cingolani ). We further focused on 21 fitness-related gene sequences (Table 1) obtained using samtools (Li ) and bcftools (Li, 2011).
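The filtering thresholds described above can be illustrated with a minimal Python sketch. This is not the SUPERW pipeline itself: the `Variant` record, the function names, and the reading of AD as the depth of the alternative allele are assumptions made only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Variant:
    """Hypothetical, simplified variant record (not a full VCF parser)."""
    chrom: str
    pos: int
    qual: float      # phred-scaled quality of the call (QUAL)
    depth: int       # high-quality bases covering the site (DP)
    alt_depth: int   # depth of the alternative allele (assumed meaning of AD)
    genotype: str    # e.g. "0/1", "1/1", "0/0"


def passes_filters(v: Variant, min_qual: float = 30.0,
                   min_dp: int = 10, min_ad: int = 10) -> bool:
    """Apply QUAL >= 30, DP >= 10, AD >= 10 and drop reference genotypes."""
    if v.genotype in ("0/0", "0|0"):
        return False  # same genotype as the reference
    return v.qual >= min_qual and v.depth >= min_dp and v.alt_depth >= min_ad


def filter_variants(variants: List[Variant]) -> List[Variant]:
    """Keep only the variants that satisfy every threshold."""
    return [v for v in variants if passes_filters(v)]
```

In practice the same thresholds would normally be applied directly to the VCF with a dedicated tool; the sketch only makes the logic of the four criteria explicit.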
Table 1

List of 21 fitness related genes investigated in this study. The table shows the chromosome, gene symbol, gene function or putative gene association, starting and ending coordinates, reference.

| Chr | Gene symbol | Gene function or putative association with QTLs | Start | End | Reference |
|---|---|---|---|---|---|
| 1 | ESR1 | total newborn, newborn alive | 14217032 | 14604906 | Rothschild et al., 1996 |
| 1 | VPS13A | maintenance of thermostatic status, blood circulation | 230069339 | 230331343 | Groenen, 2016 |
| 1 | NR6A1 | body size | 265320597 | 265570941 | Groenen, 2016 |
| 2 | FSHB | total newborn, newborn alive | 30395769 | 30399282 | Zhao et al., 1998 |
| 3 | EIF2AK3 | gene overlaps with QTLs for osteochondrosis score and feet and leg conformation | 57423894 | 57506247 | Laenoi et al., 2011 |
| 3 | AZGP1 | adaptation to environment | 7867521 | 7874857 | Beeckmann et al., 2003; Ma et al., 2009; Harmegnies et al., 2006 |
| 4 | PLAG1 | body size | 75646585 | 75696718 | Rubin et al., 2012 |
| 6 | IL12RB2 | immune related gene | 145210251 | 145292399 | Koch et al., 2012; Herrero-Medrano et al., 2014 |
| 6 | FUT1 | resistance to disease | 54077431 | 54080475 | Meijerink et al., 1997, 2000; Bao et al., 2012; Zhang et al., 2015; Fernández et al., 2017 |
| 8 | GNRHR | ovulation rate | 65470206 | 65488900 | Jiang et al., 2001 |
| 8 | LCORL | body size | 12806878 | 12969370 | Rubin et al., 2012 |
| 9 | AHR | litter size | 86511866 | 86555950 | Bosse et al., 2014 |
| 12 | PPP1R1B | candidate genes affecting behaviour | 22681244 | 22690978 | Groenen, 2016 |
| 13 | STAB1 | immune related gene, defence against bacterial infection | 34630448 | 34659371 | Herrero-Medrano et al., 2014; Kzhyshkowska, 2010 |
| 13 | GPR149 | potential effect on fertility, prolificacy | 94356371 | 94419917 | Choi et al., 2015 |
| 13 | CLDN1 | potential effect on fertility | 127714857 | 127730628 | Choi et al., 2015 |
| 14 | RBP4 | total newborn, newborn alive | 105037360 | 105044552 | Rothschild et al., 2000 |
| 14 | JMJD1C | potential effect on fertility | 66640845 | 66966911 | Choi et al., 2015 |
| 15 | DCAF17 | maintenance of thermostatic status, hair growth | 77564629 | 77603913 | Groenen et al., 2016 |
| 16 | PRLR | involved in several reproductive traits, including litter size | 20637568 | 20655881 | Vincent et al., 1998; van Rens et al., 2003; Tomás et al., 2006 |
| 18 | TAS2R40 | adaptation to specific dietary repertoires and environment | 7024418 | 7027197 | Dong et al., 2009; Fischer et al., 2005; Ribani et al., 2017 |
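As a small worked example, the start and end coordinates in Table 1 can be turned into gene lengths. The sketch below assumes the coordinates are 1-based and inclusive, so that length = end − start + 1; under that assumption the values agree with the lengths listed in Table 2. The dictionary and function names are illustrative only, and just three of the 21 genes are shown.

```python
# Start/end coordinates copied from Table 1 (three genes shown for brevity).
GENE_COORDS = {
    "ESR1": (14217032, 14604906),
    "FUT1": (54077431, 54080475),
    "PRLR": (20637568, 20655881),
}


def gene_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive genomic interval (assumed convention)."""
    if end < start:
        raise ValueError("end coordinate must not precede start")
    return end - start + 1


# Gene symbol -> length in bp.
lengths = {gene: gene_length(*coords) for gene, coords in GENE_COORDS.items()}
```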
Subsequently, the resulting high-impact mutations were aligned and manually inspected with MEGA7, using the reference genomic sequences and the related transcript sequences retrieved from GenBank, in order to evaluate the putative functional role of the variants on the respective protein sequences. Variants called by SUPERW and PINDEL were compared with bedtools intersect, and duplicates were removed from the PINDEL output. To detect novel SNPs, SnpSift (Cingolani ) was run against the dbSNP151 database (ftp.ncbi.nih.gov/snp/organisms/pig_9823/), and all resulting novel SNPs were manually examined and confirmed.
To explore the genetic resources of this breed, here we present for the first time the whole genome sequencing analysis of a male domestic Nero Siciliano pig, as well as a comparison with the most recent pig reference genome (Sscrofa 11.1), released by the International Swine Genome Sequencing Consortium and improved in annotation and assembly by Warr . In particular, we focused our attention on 21 genes that were selected according to their function and/or their association with specific traits (Table 1). These genes were chosen because they affect phenotypes related to rusticity, adaptability to poor management and feeding conditions, and strong resistance to disease, which are some of the most distinctive features of autochthonous breeds, especially the Nero Siciliano pig.
In this study, a total of 346.8 million raw paired reads were produced by Illumina HiSeq X sequencing. After quality filtering and trimming, ~344.3 million (99.29%) high-quality reads were mapped to the S. scrofa reference genome, with a mean coverage of 39.5X. A total of 11,253,945 genetic variants were detected by SUPERW. Of these, ~82% were SNPs, whereas ~12% and ~5% were short insertions and deletions, respectively.
Moreover, more than 58% of the detected SNPs (6,555,556 variants) were heterozygous, while the remaining 42% were homozygous for the alternative allele. The overall observed frequency was 1 variant every 222 bases, with a SNP mutation rate of 1/269 bp. However, we cannot confirm that all DNA mutations detected in this study segregate in the Nero Siciliano breed, as only one sample was considered. SnpEff analysis showed that most of the variants were located in non-coding regions of the genome, such as introns and intergenic regions (Figure 1a). Among coding variants, approximately 36% were missense, 0.4% nonsense, and 63.6% silent mutations; the missense/silent ratio was 0.5617 and the Ts/Tv (transition/transversion) ratio was 2.3956. The Ts/Tv ratio was similar to that of other pig genomes (Kang ), while the observed SNP mutation rate was slightly higher than that reported by Jungerius .
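For readers unfamiliar with the Ts/Tv statistic, the following Python sketch shows how such a ratio can be computed from a list of biallelic SNPs. It is not the SnpEff implementation; the function names and the toy input are assumptions used only to illustrate the definition (transitions are A↔G and C↔T, all other substitutions are transversions).

```python
from typing import Iterable, Tuple

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def is_transition(ref: str, alt: str) -> bool:
    """True for purine<->purine or pyrimidine<->pyrimidine substitutions."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    return {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES


def ts_tv_ratio(snps: Iterable[Tuple[str, str]]) -> float:
    """Ratio of transitions to transversions over (ref, alt) pairs."""
    ts = tv = 0
    for ref, alt in snps:
        if is_transition(ref, alt):
            ts += 1
        else:
            tv += 1
    if tv == 0:
        raise ZeroDivisionError("no transversions in input")
    return ts / tv


# Toy input: three transitions and one transversion.
toy_snps = [("A", "G"), ("C", "T"), ("G", "A"), ("A", "C")]
toy_ratio = ts_tv_ratio(toy_snps)
```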
Figure 1

SNPs and short INDELs detected in this study in (a) the whole genome and (b) the fitness-related genes, and their location based on genomic annotation. The y-axis represents the percentage of variants.

Among the structural variants identified by PINDEL, we observed a total of 808,486 insertions, 452,926 deletions, 196,971 replacements, 2,383 tandem duplications, and 1,029 inversions. Of these, 586,686 were heterozygous, whereas 875,109 were in alternative homozygosity. Using the panel of fitness-related genes selected in this study, we identified a total of 6,747 SNPs and short INDELs (Figure 1b), that were classified according to Cingolani in 7 “high”, 35 “moderate”, 54 “low impact” and 6,651 as modifiers (Table 2). This resulted in a mutation rate of 1 per ~276 bases; for further details see the supplementary material Tables S1, S2, and S3. Among the total variants identified, 1,132 were novel, consisting of 476 heterozygous and 656 in alternative homozygosity form.
Table 2

SNPs, short INDELs, and structural variants detected in the 21 fitness-related genes examined in this study.

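| Gene symbol | Length (bp) | High impact | Low impact | Moderate impact | Modifier | Total | % Variants/length | Structural variants |
|---|---|---|---|---|---|---|---|---|
| ESR1 | 387875 | 0 | 14 | 0 | 2891 | 2905 | 0.749 | 102 |
| VPS13A | 262005 | 1 | 1 | 1 | 370 | 373 | 0.142 | 36 |
| NR6A1 | 250345 | 0 | 0 | 1 | 9 | 10 | 0.004 | 13 |
| FSHB | 3514 | 0 | 0 | 0 | 0 | 0 | 0.000 | 0 |
| EIF2AK3 | 82354 | 0 | 1 | 0 | 410 | 411 | 0.499 | 24 |
| AZGP1 | 7337 | 2 | 4 | 7 | 66 | 79 | 1.077 | 2 |
| PLAG1 | 50134 | 0 | 0 | 0 | 3 | 3 | 0.006 | 1 |
| IL12RB2 | 82149 | 0 | 4 | 4 | 318 | 326 | 0.397 | 11 |
| FUT1 | 3045 | 2 | 0 | 0 | 3 | 5 | 0.164 | 1 |
| GNRHR | 18695 | 0 | 0 | 0 | 25 | 25 | 0.134 | 3 |
| LCORL | 162493 | 1 | 4 | 1 | 422 | 428 | 0.263 | 30 |
| AHR | 44085 | 0 | 8 | 9 | 494 | 511 | 1.159 | 20 |
| PPP1R1B | 9735 | 0 | 0 | 0 | 13 | 13 | 0.134 | 2 |
| STAB1 | 28924 | 0 | 1 | 3 | 9 | 13 | 0.045 | 1 |
| GPR149 | 63547 | 0 | 9 | 4 | 370 | 383 | 0.603 | 10 |
| CLDN1 | 15772 | 0 | 0 | 0 | 8 | 8 | 0.051 | 0 |
| RBP4 | 7193 | 0 | 1 | 0 | 56 | 57 | 0.792 | 2 |
| JMJD1C | 326067 | 0 | 5 | 0 | 762 | 767 | 0.235 | 62 |
| DCAF17 | 39285 | 0 | 1 | 2 | 178 | 181 | 0.461 | 10 |
| PRLR | 18314 | 1 | 1 | 3 | 240 | 245 | 1.338 | 15 |
| TAS2R40 | 2780 | 0 | 0 | 0 | 4 | 4 | 0.144 | 0 |
| TOTAL | 1865648 | 7 | 54 | 35 | 6651 | 6747 | 0.362 | 345 |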
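The "% Variants/length" column and the rate of roughly one variant per 276 bases both follow from the totals in Table 2. The short Python sketch below reproduces that arithmetic; the function names are illustrative and the numbers are copied from the table.

```python
from typing import Optional


def percent_variants(n_variants: int, length_bp: int) -> float:
    """Variants per base expressed as a percentage, rounded to 3 decimals."""
    return round(100.0 * n_variants / length_bp, 3)


def bases_per_variant(n_variants: int, length_bp: int) -> Optional[float]:
    """Average spacing between variants; None when no variant was found."""
    if n_variants == 0:
        return None
    return length_bp / n_variants


# Totals for the 21-gene panel (Table 2).
panel_pct = percent_variants(6747, 1865648)          # percentage of positions
panel_spacing = bases_per_variant(6747, 1865648)     # about 276.5 bases

# Single-gene examples from the same table.
prlr_pct = percent_variants(245, 18314)
fshb_spacing = bases_per_variant(0, 3514)            # no variants in FSHB
```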
The seven high-impact mutations, all in the alternative homozygous state, affected five of the 21 examined genes: VPS13A (Vacuolar protein sorting 13 homolog A), AZGP1 (Alpha-2-glycoprotein 1, zinc-binding), LCORL (Ligand-dependent nuclear receptor corepressor-like protein), FUT1 (Fucosyltransferase 1), and PRLR (Prolactin Receptor). These variants consisted of one SNP and six nucleotide insertions. Four of the insertions were gain-of-function mutations that restored the reading frames of the VPS13A, AZGP1, FUT1 and PRLR genes, as evidenced by comparative analysis with the reference genome and its transcripts. The remaining two insertions produced a premature stop codon in AZGP1 and the loss of the stop codon in LCORL, whereas the single SNP detected was a missense mutation (ACG→GCG; Thr103→Ala103) affecting the FUT1 gene. Five of these seven high-impact mutations were novel to the dbSNP database (see Table S1). Among the structural variations affecting the subset of fitness-related genes, we observed 101 replacements (RPL), 132 insertions, and 112 deletions. Of these, 203 were heterozygous and 142 were in the alternative homozygous state. Figure 2 shows the gene-wide distribution of all detected mutations, including the related sequencing coverage, for all genes investigated.
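The effect of the FUT1 SNP on the codon can be checked with a few lines of Python. The sketch below builds the standard genetic code and classifies a single-codon change; it is only an illustration of why ACG→GCG is a missense change (Thr→Ala), not a substitute for SnpEff annotation, and the function names are assumptions.

```python
# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def classify_codon_change(ref_codon: str, alt_codon: str):
    """Return (ref_aa, alt_aa, effect) for a single-codon substitution."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        effect = "synonymous"
    elif alt_aa == "*":
        effect = "nonsense"      # gain of a stop codon
    elif ref_aa == "*":
        effect = "stop_lost"     # loss of the stop codon
    else:
        effect = "missense"
    return ref_aa, alt_aa, effect


# Codon change reported for FUT1 (amino acids in one-letter code: T = Thr, A = Ala).
fut1_change = classify_codon_change("ACG", "GCG")
```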
Figure 2

Variants detected in 21 fitness-related genes. From outside to inside, rings show: all SNPs and INDELs (blue circles), HIGH impact (red square), MODERATE impact (orange circles), LOW impact (green triangles), novel SNPs and short INDELs (yellow circles), SV INDEL (blue triangles), SV RPL (purple circles), reads coverage (black lines).

SNP discovery analysis of the 21 fitness-related genes showed a mutation rate consistent with that of the whole genome data. We focused on high-impact mutations that may affect the gene product. The VPS13A gene plays a role in the maintenance of thermostatic conditions during thermal stress and is involved in blood circulation (Groenen, 2016). We found a novel nucleotide insertion (G, genome position: 230125827) that causes a frameshift restoring the VPS13A reading frame. AZGP1 is a putative candidate gene for adaptation to the environment. A mutation in this gene, which overlaps QTLs for the number of vertebrae (Beeckmann ; Harmegnies ), abdominal fat, and ear shape and size (Ma ), was identified in Mangalica, Cinta Senese, and one European wild boar, but not in commercial pigs (Herrero-Medrano ). Furthermore, AZGP1 is correlated with lipid mobilization and is considered a candidate gene for body weight regulation and obesity in humans (Mracek ). We found two novel frameshift mutations in this gene, both nucleotide insertions (genome positions: 7874326 and 7874521 on chromosome 3), which could affect its function, with possible effects on fat deposition. The LCORL gene overlaps a QTL involved in morphological modifications that occurred during domestication, namely elongation of the back and an increased number of vertebrae (Rubin ). This gene is considered a candidate gene for body size. A known LCORL frameshift mutation (rs791023757; genome position: 12829718) was detected in this study and resulted in the loss of the stop codon. Unfortunately, no phenotypes have been associated with this mutation so far, as evidenced by the lack of information in the dbSNP database. We also found two high-impact variants in the FUT1 gene. This gene encodes a membrane protein involved in the synthesis of a precursor of a blood group antigen.
Previous studies showed that polymorphisms in this gene are associated with the adhesion and colonization capacity of F18 fimbriated Escherichia coli on the intestinal mucosa (Bao ; Zhang ). The toxins produced by this microorganism cause post-weaning diarrhea in piglets (Luo ; Zhang ). We identified a missense mutation at position 54079560 of chromosome 6 (FUT1 gene) that results in an amino acid change at position 103 (Thr→Ala) of the protein. This SNP, already recorded in the dbSNP database (rs335979375), was associated with E. coli F18-resistant or susceptible genotypes (Meijerink , 2000). The second identified variant was a G insertion (genome position: 54079637), but further studies will be needed to validate these findings and the role of these mutations in the Nero Siciliano breed. The PRLR gene encodes a receptor for prolactin and is considered a strong candidate gene for various traits affecting directly (ovulation rate) or indirectly (ovarian weight, uterine length and number of teats) litter size and general reproductive performance in pigs (Vincent ; van Rens ; Tomás ). In the PRLR gene we detected a G insertion at position 20642378 (chromosome 16), but its contribution to phenotypic variation remains to be elucidated. The Nero Siciliano pig is not a well-characterised breed, and this study represents a first step in its genetic characterization, although further research on the whole population reared in Sicily is needed to confirm the observed genetic variation and to integrate our data. In fact, all genetic changes detected in this study are only differences from the reference genome used and are therefore not indicative of the presence of mutated loci in the breed. Since publication of the Sus scrofa reference genome (Warr ), several re-sequencing projects have been undertaken, but few have focused on local breeds.
In this study we report, for the first time, the sequencing and variant calling analysis of a single boar of Nero Siciliano pig, with the aim of starting to acquire useful information on its genetic background that could be crucial to understand new genetic selection concepts for creating new sustainable pork chains based on local pig breeds. Therefore, the importance of preserving local breeds as a source of genomic diversity for further improvements of commercial pigs represents an added value in typical local productions. However, currently, in Italy the information regarding local pigs is strongly limited and therefore further sequencing studies will be essential for detecting the extent of genetic diversity occurring in Nero Siciliano pig. The data sets supporting the results of this article are included within the article and its additional files. The raw reads used for the genome-wide analysis have been deposited in the NCBI Sequence Read Archive (SRA) under the following accession number: SRX3406507.
References

1.  A genome scan for quantitative trait loci affecting three ear traits in a White Duroc x Chinese Erhualian resource population.

Authors:  J Ma; W Qi; D Ren; Y Duan; R Qiao; Y Guo; Z Yang; L Li; D Milan; J Ren; L Huang
Journal:  Anim Genet       Date:  2009-03-18       Impact factor: 3.169

2.  Partial short-read sequencing of a highly inbred Iberian pig and genomics inference thereof.

Authors:  A Esteve-Codina; R Kofler; H Himmelbauer; L Ferretti; A P Vivancos; M A M Groenen; J M Folch; M C Rodríguez; M Pérez-Enciso
Journal:  Heredity (Edinb)       Date:  2011-03-16       Impact factor: 3.821

3.  Analysis of polymorphisms in the FUT1 and TAP1 genes and their influence on immune performance in Pudong White pigs.

Authors:  Y Zhang; M Wang; X Q Yu; C R Ye; J G Zhu
Journal:  Genet Mol Res       Date:  2015-12-17

4.  Evolution of bitter taste receptors in humans and apes.

Authors:  Anne Fischer; Yoav Gilad; Orna Man; Svante Pääbo
Journal:  Mol Biol Evol       Date:  2004-10-20       Impact factor: 16.240

5.  Dissecting structural and nucleotide genome-wide variation in inbred Iberian pigs.

Authors:  Anna Esteve-Codina; Yogesh Paudel; Luca Ferretti; Emanuele Raineri; Hendrik-Jan Megens; Luis Silió; María C Rodríguez; Martein A M Groenen; Sebastian E Ramos-Onsins; Miguel Pérez-Enciso
Journal:  BMC Genomics       Date:  2013-03-05       Impact factor: 3.969

6.  Molecular characterization and genetic structure of the Nero Siciliano pig breed.

Authors:  Anna Maria Guastella; Andrea Criscione; Donata Marletta; Antonio Zuccaro; Luigi Chies; Salvatore Bordonaro
Journal:  Genet Mol Biol       Date:  2010-12-01       Impact factor: 1.771

7.  Qualimap 2: advanced multi-sample quality control for high-throughput sequencing data.

Authors:  Konstantin Okonechnikov; Ana Conesa; Fernando García-Alcalde
Journal:  Bioinformatics       Date:  2015-10-01       Impact factor: 6.937

8.  Fast and accurate short read alignment with Burrows-Wheeler transform.

Authors:  Heng Li; Richard Durbin
Journal:  Bioinformatics       Date:  2009-05-18       Impact factor: 6.937

9.  Whole-genome sequence analysis reveals differences in population management and selection of European low-input pig breeds.

Authors:  Juan Manuel Herrero-Medrano; Hendrik-Jan Megens; Martien A M Groenen; Mirte Bosse; Miguel Pérez-Enciso; Richard P M A Crooijmans
Journal:  BMC Genomics       Date:  2014-07-16       Impact factor: 3.969

10.  Trimmomatic: a flexible trimmer for Illumina sequence data.

Authors:  Anthony M Bolger; Marc Lohse; Bjoern Usadel
Journal:  Bioinformatics       Date:  2014-04-01       Impact factor: 6.937
