Muhammad Ahsan, Xidan Li, Andreas E Lundberg, Marcin Kierczak, Paul B Siegel, Orjan Carlborg, Stefan Marklund.
Abstract
Mapping of chromosomal regions harboring genetic polymorphisms that regulate complex traits is usually followed by a search for the causative mutations underlying the observed effects. This is often a challenging task even after fine mapping, as millions of base pairs including many genes will typically need to be investigated. Thus to trace the causative mutation(s) there is a great need for efficient bioinformatic strategies. Here, we searched for genes and mutations regulating growth in the Virginia chicken lines - an experimental population comprising two lines that have been divergently selected for body weight at 56 days for more than 50 generations. Several quantitative trait loci (QTL) have been mapped in an F2 intercross between the lines, and the regions have subsequently been replicated and fine mapped using an Advanced Intercross Line. We have further analyzed the QTL regions where the largest genetic divergence between the High-Weight selected (HWS) and Low-Weight selected (LWS) lines was observed. Such regions, covering about 37% of the actual QTL regions, were identified by comparing the allele frequencies of the HWS and LWS lines using both individual 60K SNP chip genotyping of birds and analysis of read proportions from genome resequencing of DNA pools. Based on a combination of criteria including significance of the QTL, allele frequency difference of identified mutations between the selected lines, gene information on relevance for growth, and the predicted functional effects of identified mutations we propose here a subset of candidate mutations of highest priority for further evaluation in functional studies. The candidate mutations were identified within the GCG, IGFBP2, GRB14, CRIM1, FGF16, VEGFR-2, ALG11, EDN1, SNX6, and BIRC7 genes. 
We believe that the proposed method of combining different types of genomic information increases the probability that the genes underlying the observed QTL effects are represented among the candidate mutations identified.
Keywords: QTL; SNP; candidate genes; functional prediction; genetic divergence; growth; resequencing
Year: 2013 PMID: 24204379 PMCID: PMC3817360 DOI: 10.3389/fgene.2013.00226
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
Fine-mapped growth QTL regions with significance according to Besnier et al. (2011).
| GGA | QTL | Region name | Start (Mbp) | End (Mbp) | Size (Mbp) |
|---|---|---|---|---|---|
| 1 | Growth1 | C1G1 | 169.6 | 181.0 | 11.4 |
| 2 | Growth2 | C2G2 | 47.9 | 65.4 | 17.5 |
| 3 | Growth4 | C3G4 | 24.0 | 68.0 | 43.9 |
| 4 | Growth6 | C4G6 | 1.3 | 13.5 | 12.1 |
| 5 | Growth8 | C5G8 | 33.6 | 39.0 | 5.3 |
| 7 | Growth9 | C7G9 | 10.9 | 35.4 | 24.5 |
| 20 | Growth12 | C20G12 | 7.1 | 13.8 | 6.7 |
| Total | | | | | 121.4 |
GGA: Gallus gallus autosome;
QTL names as in Besnier et al. (2011);
Coordinates based on the Chicken (Gallus gallus) assembly v 2.1/galGal3
Candidate regions selected based on QTL data and allele frequency differences between the lines inferred from SNP chip genotyping and FSV computation from resequencing. The selected percentages of the QTL regions significant with model B are given (Besnier et al., 2011).
| Region name | Start (Mbp) | End (Mbp) | Size (Mbp) | QTL support (Mbp) | Ensembl genes |
|---|---|---|---|---|---|
| C1G1 | 169.6 | 175.0 | 5.4 | 5.4 | 97 |
| C2G2 | 59.7 | 65.4 | 5.7 | 2.1 | 52 |
| C3G4 | 24.1 | 35.8 | 11.7 | 10.3 | 142 |
| C4G6 | 10.6 | 12.9 | 2.3 | 0.0 | 62 |
| C5G8 | 34.2 | 36.8 | 2.6 | 0.0 | 20 |
| C5G8 | 38.2 | 39.0 | 0.8 | 0.0 | 16 |
| C7G9 | 20.4 | 35.4 | 15.0 | 4.3 | 209 |
| C20G12 | 8.3 | 9.5 | 1.2 | 1.2 | 38 |
| Total | | | 44.7 | 23.3 | 636 |
Coordinates based on the Chicken (Gallus gallus) assembly v 2.1/galGal3;
Size of the selected regions significant with QTL model B (Besnier et al., 2011);
Number of Ensembl genes in the initial list in the selected regions
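
The abstract states that the selected regions cover about 37% of the QTL regions. As a quick arithmetic check, the sketch below recomputes the totals from the two tables above. The variable names are illustrative only, and the values are copied from the tables rather than from any external source.

```python
# Sizes (Mbp) of the fine-mapped QTL regions, copied from the first table.
qtl_sizes_mbp = {
    "C1G1": 11.4, "C2G2": 17.5, "C3G4": 43.9, "C4G6": 12.1,
    "C5G8": 5.3, "C7G9": 24.5, "C20G12": 6.7,
}

# Selected candidate segments from the second table:
# (region, size in Mbp, QTL support in Mbp, number of Ensembl genes).
# C5G8 contributes two separate segments.
selected_segments = [
    ("C1G1", 5.4, 5.4, 97),
    ("C2G2", 5.7, 2.1, 52),
    ("C3G4", 11.7, 10.3, 142),
    ("C4G6", 2.3, 0.0, 62),
    ("C5G8", 2.6, 0.0, 20),
    ("C5G8", 0.8, 0.0, 16),
    ("C7G9", 15.0, 4.3, 209),
    ("C20G12", 1.2, 1.2, 38),
]

total_qtl = sum(qtl_sizes_mbp.values())
total_selected = sum(size for _, size, _, _ in selected_segments)
total_support = sum(support for _, _, support, _ in selected_segments)
total_genes = sum(genes for _, _, _, genes in selected_segments)

# Share of the fine-mapped QTL sequence retained as candidate segments.
fraction_selected = total_selected / total_qtl

print(f"QTL total: {total_qtl:.1f} Mbp")
print(f"Selected:  {total_selected:.1f} Mbp ({fraction_selected:.1%})")
print(f"QTL support: {total_support:.1f} Mbp, Ensembl genes: {total_genes}")
```

The recomputed totals agree with the Total rows of both tables, and the selected fraction comes out at roughly 37%, as in the abstract.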
The variant effect predictor summary of SNPs in selected candidate segments of the QTL regions (according to Table ).
| Location within gene | C1G1 | C2G2 | C3G4 | C4G6 | C5G8 | C7G9 | C20G12 | Total |
|---|---|---|---|---|---|---|---|---|
| 3Prime UTR | 200 | 93 | 200 | 153 | 73 | 348 | 75 | 1142 |
| 3Prime UTR, Splice site | 1 | 1 | 2 | |||||
| 5Prime UTR | 22 | 9 | 44 | 20 | 3 | 50 | 28 | 176 |
| 5Prime UTR, Splice site | 1 | 4 | 5 | |||||
| Coding unknown | 1 | 1 | 2 | |||||
| Downstream | 6118 | 2636 | 5318 | 2373 | 1395 | 7930 | 1384 | 27154 |
| Essential splice site | 2 | 3 | 6 | 1 | 1 | 4 | 3 | 20 |
| Non-synonymous coding | 215 | 82 | 255 | 92 | 60 | 470 | 80 | 1254 |
| Non-synonymous coding, Splice site | 6 | 4 | 8 | 5 | 3 | 17 | 1 | 44 |
| splice site, Intronic | 78 | 37 | 133 | 33 | 24 | 191 | 31 | 527 |
| Stop gained | 5 | 7 | 3 | 2 | 10 | 27 | ||
| Stop gained, Non-synonymous coding | 1 | 1 | ||||||
| Synonymous coding | 350 | 208 | 543 | 165 | 99 | 1113 | 159 | 2637 |
| Synonymous coding, splice site | 9 | 9 | 12 | 5 | 6 | 20 | 12 | 73 |
| Upstream | 5506 | 2626 | 5755 | 2570 | 1200 | 8312 | 1479 | 27488 |
| Within mature miRNA | 1 | 1 | ||||||
| Within non-coding gene | 4 | 2 | 1 | 3 | 12 | 4 | 26 | |
| Within non-coding gene, splice site | 1 | 1 | ||||||
| Total | 12516 | 5708 | 12284 | 5422 | 2870 | 18484 | 3256 | 60540 |
Candidate mutations identified in the evaluated QTL regions.
| Region | SNP (bp) | Gene | SNP location | No of AA | Qual | AFD | PC Score | EC Score | PE Score |
|---|---|---|---|---|---|---|---|---|---|
| C1G1 | 174634021 | Asparagine-linked glycosylation 11 homolog (ALG11) | CpG island, upstream | 7; 10 | 72 | 0.97 | N/A | N/A | N/A |
| C2G2 | 63823523 | Endothelin 1 (EDN1) | CpG island, upstream | 3; 13 | 53 | 0.95 | N/A | N/A | N/A |
| C3G4 | 33678270 | Cysteine rich transmembrane BMP regulator 1 (CRIM1) | Protein code, NS K/I | 10; 19 | 182 | 0.97 | 0.67 | 0.63 | 0.42 |
| C4G6 | 12044024 | Similar to receptor tyrosine kinase (VEGFR-2) | CpG island, upstream | 4; 8 | 82 | 0.97 | N/A | N/A | N/A |
| C4G6 | 12902414 | Fibroblast growth factor 16 (FGF16) | CpG island, downstream | 8; 16 | 175 | 0.95 | N/A | N/A | N/A |
| C5G8 | 38316301 | Sorting nexin 6 (SNX6) | CpG island, upstream | 8; 14 | 142 | 0.97 | N/A | N/A | N/A |
| C7G9 | 21686625 | Growth factor receptor-bound protein 14 (GRB14) | CpG island, downstream | 3; 12 | 52 | 0.97 | N/A | N/A | N/A |
| C7G9 | 22711910 | Glucagon (GCG) | CpG island, downstream | 3; 9 | 46 | 0.87 | N/A | N/A | N/A |
| C7G9 | 24802616 | Insulin-like growth factor binding protein 2 (IGFBP2) | Protein code, synonymous, CpG island | 4; 8 | 69 | 0.95 | N/A | N/A | N/A |
| C20G12 | 8715398 | Baculoviral IAP repeat-containing 7 (BIRC7) | Protein code, NS I/V | 5; 8 | 65 | 0.97 | 0.29 | 0.14 | 0.04 |
Coordinates based on the Chicken (Gallus gallus) assembly v 2.1/galGal3;
Location of the SNP in gene and also amino acid substitution in case of non-synonymous (NS) SNP;
Total number of reads in both lines representing the alternate allele (AA) versus the total depth coverage across the SNP position;
The Phred-scaled probability that a REF/ALT polymorphism exists at this site given the sequencing data. Because the Phred scale is -10 * log10(1 - p), a value of 10 indicates a 1 in 10 chance of error, while a value of 100 indicates a 1 in 10^10 chance;
Allele frequency difference between the chicken lines as estimated using the GigaBayes software given that a total of 19 individuals from each line were included in the pools;
Physico-chemical score of amino acid substitution calculated using PASE (Li et al., 2013);
Evolutionary conservation score of amino acid substitution calculated using PASE (Li et al., 2013);
Combined score of PC and EC of amino acid substitution calculated using PASE (Li et al., 2013).
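
To make the Qual column easier to read, the following minimal sketch converts between a Phred-scaled quality Q = -10 * log10(1 - p) and the corresponding error probability 1 - p. The function names are hypothetical and are not taken from the authors' pipeline. The example Qual values are copied from the candidate-mutation table and keyed by region and SNP position.

```python
import math


def phred_to_error_prob(qual: float) -> float:
    """Return the error probability (1 - p) implied by a Phred-scaled quality."""
    return 10 ** (-qual / 10)


def error_prob_to_phred(p_error: float) -> float:
    """Return the Phred-scaled quality for a given error probability."""
    return -10 * math.log10(p_error)


# Lowest, intermediate and highest Qual values from the table above.
example_quals = {
    "C7G9:22711910": 46,
    "C1G1:174634021": 72,
    "C3G4:33678270": 182,
}

for site, qual in example_quals.items():
    print(f"{site}: Qual={qual}, P(error)={phred_to_error_prob(qual):.3g}")
```

Under this scale every increase of 10 in Qual corresponds to a tenfold lower error probability, so even the lowest value in the table (46) implies an error probability below 1 in 10,000.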