Martyna Bieniek-Kobuszewska, Grzegorz Panasiewicz, Aleksandra Lipka, Marta Majewska, Bozena Szafranska.
Abstract
This is a pioneering study of single nucleotide polymorphisms (SNPs) within the entire promoter region (1204 bp) of the dominant pPAG2-L subfamily in the pig. The pPAG2-L promoter was sequenced and examined using genomic deoxyribonucleic acid (gDNA) templates from crossbreed pigs (Landrace x Large White) and compared with two bacterial artificial chromosome (BAC) clones containing gDNA of the Duroc breed (as positive controls). Our analysis of the pPAG2-L promoter identified 31 SNPs and one InDel mutation in crossbreed pigs. Among the 42 SNPs identified in the two BAC clones, 24 had not previously been detected in crossbreed pigs. The sequence alignment of the pPAG2-L promoter, performed with Lasagne-Search 2.0, Cluster Buster and MatInspector software, revealed a total of 28 transcription factor binding sites (TFBS), as well as 10 TFBS (AP-1, CCAAT, CHOP:C, FOXP1, LSF, MRF-2, Myc, NF1, NF-Y, TGIF) with SNPs within their core sequences. Notably, one TFBS (NF1) was unique to the pPAG2 promoter sequence containing the SNPs g.-1100G>A (R) and g.-1101T>C (Y), represented by the GA and TC genotypes (genotype frequency p = 0.12). Our broad-based novel database thus provides an SNP PAG2-L pattern for modern genotyping of female and male progenitors. This pattern is required for further studies of potential correlations between guiding SNP genotypes of the pPAG2-L subfamily and the most economically important reproductive traits, which are properly documented on each farm, in sows of many breeds.
Keywords: PAG; Polymorphism; Promoter; SNP; TFBS
Year: 2016 PMID: 27709373 PMCID: PMC5071369 DOI: 10.1007/s10142-016-0522-z
Source DB: PubMed Journal: Funct Integr Genomics ISSN: 1438-793X Impact factor: 3.410
Fig. 1 Schematic localization of the SNPs in the promoter sequence (1204 bp upstream from ATG) of the pPAG2-L subfamily examined in the crossbreed pigs. This figure includes the transcription start site (+1, +9, −29); potential binding sites for transcription factors - Ets, GATA, STAT (boxed); TATA-box (TATATAA); unique tandem repeats (double underlined); the occurrence of SNP (p.−168 g > c*) in the coding sequence for GATA
Genotype frequencies (p) of the identified SNPs in the promoter of the pPAG2-L gene subfamily in crossbreed pigs
| SNP locus (IUPAC code)a | SNP locus acc. DNASISb | Genotype | Frequency (p) | Genotype | Frequency (p) | Genotype | Frequency (p) |
|---|---|---|---|---|---|---|---|
| F1 region | | | | | | | |
| 1) | | CC | 0.45 | CT | 0.55 | TT | 0.0 |
| 2) | | CC | 0.45 | CG | 0.55 | GG | 0.0 |
| 3) | | GG | 1.0 | –/– | 0.0 | –/– | 0.0 |
| 4) | | AA | 0.57 | AT | 0.43 | TT | 0.0 |
| 5) | | CC | 0.0 | CT | 1.0 | TT | 0.0 |
| 6) | | TT | 0.56 | TG | 0.44 | GG | 0.0 |
| 7) | | TT | 0.14 | GT | 0.86 | GG | 0.0 |
| 8) | | GG | 0.14 | CG | 0.86 | CC | 0.0 |
| 9) | | GG | 0.17 | CG | 0.83 | CC | 0.0 |
| 10) | | CC | 0.0 | CG | 1.0 | GG | 0.0 |
| 11) | | TT | 0.0 | CT | 1.0 | CC | 0.0 |
| 12) | | AA | 0.2 | AG | 0.2 | GG | 0.6 |
| 13) | | CC | 0.0 | AC | 1.0 | AA | 0.0 |
| F2 region | | | | | | | |
| 14) | | CC | 0.2 | AC | 0.4 | AA | 0.4 |
| 15) | | GG | 0.71 | AG | 0.29 | AA | 0.0 |
| 16) | | AA | 0.71 | AT | 0.29 | TT | 0.0 |
| 17) | | AA | 0.71 | AC | 0.29 | CC | 0.0 |
| 18) | | AA | 0.0 | AG | 1.0 | GG | 0.0 |
| 19) | | AA | 0.0 | AG | 1.0 | GG | 0.0 |
| 20) | | GG | 0.0 | AG | 1.0 | AA | 0.0 |
| 21) | | AA | 0.12 | AC | 0.82 | CC | 0.06 |
| 22) | | CC | 0.76 | CG | 0.24 | GG | 0.0 |
| 23) | | TT | 0.76 | GT | 0.24 | GG | 0.0 |
| 24) | | TT | 0.76 | CT | 0.24 | CC | 0.0 |
| 25) | | TT | 0.76 | AT | 0.24 | AA | 0.0 |
| 26) | | GG | 0.76 | GT | 0.24 | TT | 0.0 |
| 27) | | TT | 0.0 | AT | 1.0 | AA | 0.0 |
| 28) | | CC | 0.82 | CG | 0.18 | GG | 0.0 |
| 29) | | TT | 0.82 | CT | 0.18 | CC | 0.0 |
| 30) | | AA | 0.82 | AG | 0.18 | GG | 0.0 |
| 31) | | GG | 0.88 | AG | 0.12 | GG | 0.0 |
| 32) | | TT | 0.88 | CT | 0.12 | CC | 0.0 |

a Numbering of SNPs submitted in dbSNP/NCBI database
b Numbering according to promoter sequence U39198, GenBank (Szafranska et al. 2001)
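Allele frequencies follow directly from the genotype frequencies listed above: a homozygote contributes its full frequency to one allele, and a heterozygote contributes half to each of its two alleles. The following is a minimal sketch under the assumption of a biallelic SNP with two-letter genotypes; the function name `allele_frequencies` is illustrative and is not part of the study's methods. The example uses the first row of the table (CC 0.45, CT 0.55, TT 0.0).

```python
from collections import defaultdict


def allele_frequencies(genotype_freqs):
    """Return allele frequencies from a {genotype: frequency} mapping.

    Each two-letter genotype gives half of its frequency to each of its
    two letters, so a homozygote adds its full frequency to one allele.
    """
    total = sum(genotype_freqs.values())
    if abs(total - 1.0) > 1e-6:
        raise ValueError(f"genotype frequencies sum to {total}, expected 1.0")
    alleles = defaultdict(float)
    for genotype, freq in genotype_freqs.items():
        if len(genotype) != 2:
            raise ValueError(f"expected a two-letter genotype, got {genotype!r}")
        for allele in genotype:
            alleles[allele] += freq / 2
    return dict(alleles)


# First row of the table: CC 0.45, CT 0.55, TT 0.0
row1 = {"CC": 0.45, "CT": 0.55, "TT": 0.0}
freqs = allele_frequencies(row1)
print({allele: round(f, 3) for allele, f in freqs.items()})
```

For this row the C allele comes out at 0.45 + 0.55/2 = 0.725 and the T allele at 0.275. Rows containing placeholder entries such as –/– would need separate handling, which this sketch deliberately rejects.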
Fig. 2 Chromatograms of identified SNPs (arrows) in homozygotes and heterozygotes within the promoter sequence region (from g.-91C>T to g.-1029A>G) of the pPAG2-L subfamily in crossbreed pigs. [For an interpretation of the references to color in this figure legend, the reader is referred to the web version of this article]
SNP identification in promoter sequence of the pPAG2-L gene subfamily within BAC clones: CH242-60C13 and CH242-294016
| SNP locus (IUPAC code)b | SNP locus acc. DNASISc | BAC clones (Duroc): CH242-60C13d, CH242-294016e | | | | Crossbreeds |
|---|---|---|---|---|---|---|
| | | Amplicon 1 | Amplicon 2 | Amplicon 3 | Amplicon 4 | |
| F1 region | | | | | | |
| 1) | | GG | –/– | GG | –/– | GG |
| 2) | | TT | –/– | TT | –/– | CC |
| 3) | | AG | –/– | AG | –/– | |
| 4) | | AC | –/– | AC | –/– | |
| F2 region | | | | | | |
| 5) | | GG | TT | GG | TT | GG |
| 6) | | AA | GG | AA | GG | AA |
| 7) | | AC | CC | AC | CC | |
| 8) | | GG | AA | GG | AA | GG |
| 9) | | GG | CC | GG | CC | GG |
| 10) | | GG | AA | GG | AA | GG |
| 11) | | CC | AA | CC | AA | CC |
| 12) | | GG | AA | GG | AA | GG |
| 13) | | TT | GG | TT | GG | TT |
| 14) | | CC | TT | CC | TT | CC |
| 15) | | GG | AA | GG | AA | GG |
| 16) | | AA | TT | AA | TT | AA |
| 17) | | AA | CC | AA | CC | AA |
| 18) | | AA | DelA | AA | DelA | AA |
| 19) | | GG | DelG | GG | DelG | GG |
| 20) | | AC | CC | AC | CC | |
| 21) | | GG | AA | GG | AA | GG |
| 22) | | GG | DelG | GG | DelG | GG |
| 23) | | GG | DelG | GG | DelG | GG |
| 24) | | GG | DelG | GG | DelG | GG |
| 25) | | GG | TT | GG | TT | GG |
| 26) | | TT | GG | TT | GG | TT |
| 27) | | CC | GG | CC | GG | |
| 28) | | TT | GG | TT | GG | |
| 29) | | TT | CC | TT | CC | |
| 30) | | TT | AA | TT | AA | |
| 31) | | GG | TT | GG | TT | |
| 32) | | TT | CC | TT | CC | |
| 33) | | CC | GG | CC | GG | |
| 34) | | CT | CC | CT | CC | |
| 35) | | AG | GG | AG | GG | |
| 36) | | CC | AA | CC | AA | CC |
| 37) | | CC | TT | CC | TT | CC |
| 38) | | GG | AA | GG | AA | GG |
| 39) | | TT | –/– | TT | –/– | CC |
| 40) | | AA | –/– | AA | –/– | GG |
| 41) | | GG | –/– | GG | –/– | TT |
| 42) | CC | AA | –/– | AA | –/– | CC |

SNPs occurring in both Duroc and crossbreed pigs are underlined
a SNPs not detected in crossbreed pigs are italicized
b Numbering of SNPs submitted in dbSNP/NCBI database
c Numbering according to promoter sequence U39198, GenBank (Szafranska et al. 2001)
d BAC clone (CH242-294016) containing pPAG3 gene sequences
e BAC clone (CH242-60C13) containing pPAG3 and pPAG6 gene sequences
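The SNP loci in these tables are annotated with IUPAC codes, where a single letter stands for a pair of alternative bases (for example R for G/A and Y for T/C, as used in the abstract). A short sketch of that mapping for the two-letter genotypes in the table is given below; the helper name `iupac_code` is hypothetical, and entries such as –/– or DelG are outside its scope.

```python
# Standard IUPAC ambiguity codes for two-base combinations.
IUPAC_TWO_BASE = {
    frozenset("AG"): "R",
    frozenset("CT"): "Y",
    frozenset("AC"): "M",
    frozenset("GT"): "K",
    frozenset("CG"): "S",
    frozenset("AT"): "W",
}


def iupac_code(genotype):
    """Map a two-letter genotype (e.g. 'AG') to one IUPAC nucleotide code."""
    genotype = genotype.upper()
    if len(genotype) != 2 or not set(genotype) <= set("ACGT"):
        raise ValueError(f"not a two-letter A/C/G/T genotype: {genotype!r}")
    bases = frozenset(genotype)
    if len(bases) == 1:
        # Homozygote: the code is the base itself.
        return genotype[0]
    return IUPAC_TWO_BASE[bases]


# Heterozygous and homozygous genotypes as they appear in the table.
codes = [iupac_code(g) for g in ("AG", "CT", "AC", "GG")]
print(codes)
```

Using `frozenset` keys makes the lookup independent of allele order, so GA and AG both resolve to R.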
Predicted transcription factor (TF) binding sites, in the context of novel SNPs [in brackets], identified in silico in the promoter of the pPAG2-L gene with the use of Lasagne-Search 2.0, Cluster Buster, and MatInspector
| TF motif | Localization | Strand | Score | TF sequence |
|---|---|---|---|---|
| Lasagne-Search 2.0 | | | | |
| AP-2rep | 23–29 | − | 8.14 | |
| | 93–103 | + | 13.05 | |
| MyoD | 188–199 | + | 11.99 | ACA |
| | 275–297 | − | 13.2 | ATG |
| HNF-1 | 404–420 | − | 14.2 | A |
| FAC1 | 574–587 | − | 10.21 | ATCC |
| AREB6 | 667–678 | − | 11.89 | TGA |
| | 995–1005 | − | 18.64 | CAG |
| GATA-2 | 1051–1060 | − | 11.32 | CCT |
| GATA-2 | 1070–1079 | − | 11.32 | CCT |
| STAT5A | 1158–1165 | − | 8.85 | GAG |
| Cluster Buster | | | | |
| GATA | 11–23 | − | 6.87 | GACA |
| CCAAT | 37–52 | + | 6.16 | CTTGA |
| Ets | 47–57 | + | 8.18 | GGAA |
| AP-1 | 52–62 | − | 4.49 | TG |
| Myc | 57–66 | − | 7.12 | AC |
| | 92–107 | + | 4.28 | TTTGA |
| | 97–106 | + | 4.58 | CC |
| | 101–115 | + | 5.26 | [T/C][A/G]TGGCTGGAACCTC |
| | 102–116 | − | 5.85 | GGAGGTTCCAGCCA[T/C] |
| NF-1 | 128–145 | + | 4.65 | |
| AP-1 | 129–139 | + | 4.63 | CT |
| Sp1 | 133–145 | − | 7.06 | TGG |
| | 173–187 | + | 5.45 | [A/G]CAGG[T/C]TGAATCCAG |
| LSF | 174–188 | − | 8.86 | GCTGGATTCAACCTG |
| | 100–117 | + | 5.25 | A[C/T][ |
| | 177–187 | + | 4.12 | GC |
| NF-1 | 812–829 | − | 7.85 | TC |
| SRF | 838–850 | + | 5.33 | AA |
| TATA | 840–854 | + | 4.45 | C |
| SRF | 848–860 | + | 5.9 | TG |
| NF-1 | 854–871 | + | 5.03 | GC |
| Myf | 864–875 | − | 4.46 | AAATAAGTGCTG |
| Mef-2 | 868–879 | + | 7.84 | AC |
| GATA | 872–884 | − | 6.1 | ATGT |
| GATA | 1050–1062 | − | 6.94 | TCCT |
| GATA | 1069–1081 | − | 8.43 | ACCT |
| MatInspector | | | | |
| MZF1.01 | 18–28 | − | 1.0 | GT |
| | 265–281 | + | 1.0 | ATTTGA[A/C] |
| SMAD3.01 | 293–303 | − | 1.0 | GAA |
| FOXP1 ES.01 | 568–584 | + | 1.0 | TGAATA |
| FOXP1 ES.01 | 573–589 | − | 1.0 | CATCCA |
| TBX20.01 | 797–819 | − | 1.0 | CACTTTGTG |
| GATA.01 | 1069–1081 | − | 1.0 | ACCT |
| | 1106–1122 | + | 1.0 | A[C/G]AGG[C/T]A |

a Underlined motifs are the TFBS with SNPs in the core sequence
b Bolded letters are the core nucleotides
c Bolded and underlined motifs are the new TFBS that appeared as a result of SNP occurrence
d Underlined and italic motifs are the TFBS with SNP outside the core sequence
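The bracket notation in the TF sequence column, such as `[T/C][A/G]TGGCTGGAACCTC`, encodes the alternative alleles at each SNP position, and sites on the − strand are reported as the reverse complement of the + strand. The sketch below expands such a motif into its concrete allelic variants and reverse-complements a sequence so overlapping + and − strand entries can be compared. The function names are illustrative assumptions and do not reflect how Lasagne-Search, Cluster Buster, or MatInspector work internally.

```python
import itertools
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")
MOTIF_PATTERN = r"(?:\[[ACGT](?:/[ACGT])+\]|[ACGT])+"
TOKEN_PATTERN = r"\[([ACGT](?:/[ACGT])+)\]|([ACGT])"


def parse_motif(motif):
    """Split a motif like '[T/C][A/G]TGG' into per-position allele tuples."""
    if not re.fullmatch(MOTIF_PATTERN, motif):
        raise ValueError(f"unsupported motif notation: {motif!r}")
    tokens = re.findall(TOKEN_PATTERN, motif)
    return [tuple(alts.split("/")) if alts else (base,) for alts, base in tokens]


def expand_variants(motif):
    """List every concrete sequence encoded by a bracketed SNP motif."""
    return ["".join(combo) for combo in itertools.product(*parse_motif(motif))]


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# + strand site at 101-115 and - strand site at 102-116 from the table.
plus_variants = expand_variants("[T/C][A/G]TGGCTGGAACCTC")
minus_variants = expand_variants("GGAGGTTCCAGCCA[T/C]")
print(plus_variants)
print([reverse_complement(v) for v in minus_variants])
```

Reverse-complementing the − strand variants gives sequences on the + strand that can be checked against the + strand variants over the 14 positions the two sites share (102–115).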
Fig. 3 Promoter sequence (1204 bp with ATG) of the pPAG2 gene (U39198; GenBank) containing identified SNPs (acc. IUPAC code; bolded line) and TFBS predicted with the use of Lasagne-Search 2.0 (dashed line), Cluster Buster (solid line), and MatInspector (boxed)