D S Fleming1, J E Koltes1,2, E R Fritz-Waters1, M F Rothschild1, C J Schmidt3, C M Ashwell4, M E Persia5, J M Reecy1, S J Lamont6.
Abstract
BACKGROUND: Analyses of sequence variants of two distinct and highly inbred chicken lines allowed characterization of genomic variation that may be associated with phenotypic differences between breeds. These lines were the Leghorn, the major contributing breed to commercial white-egg production lines, and the Fayoumi, representative of an outbred indigenous and robust breed. Unique within- and between-line genetic diversity was used to define the genetic differences of the two breeds through the use of variant discovery and functional annotation.
Keywords: Genomic diversity; Resequencing; Single nucleotide variant
Year: 2016 PMID: 27760519 PMCID: PMC5070165 DOI: 10.1186/s12864-016-3147-7
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Variants discovered by breed type
| | Fayoumi | Leghorn | Fayoumi vs. Leghornized referenceᵃ |
|---|---|---|---|
| Depth of coverage | ~24× | ~22× | ~24× |
| Assembly coverage | 84.5 | 93.7 | 84.5 |
| Total variants | 4,462,467 | 4,605,732 | 3,792,327 |
| Previously uncharacterized variants | 3,223,583 | 3,287,720 | 3,791,430 |
| Homozygosity | 99.9482 % | 99.9736 % | 99.9330 % |
| Ts/Tvᵇ (all variants) | 2.371 | 2.365 | 2.286 |
| Change rate | 1 change every 235 bases | 1 change every 227 bases | 1 change every 276 bases |
Total reads were computed using GATK DepthOfCoverage. Assembly coverage and sequence coverage were calculated using Samtools and GATK DepthOfCoverage. All other data were calculated using SnpEff. All data are pre-filter. ᵃ Fayoumi vs. Leghorn was compared using SNV data only; indels were excluded, along with chromosomes Z and W. ᵇ Ts/Tv is the ratio of transitions to transversions within each population
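The Ts/Tv row above is a ratio of transitions (purine to purine, or pyrimidine to pyrimidine) to transversions (purine to pyrimidine, or the reverse). The following is a minimal sketch of that calculation from (reference, alternate) allele pairs. The function name and the toy input are hypothetical; the values in the table were produced by SnpEff, not by this code.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def ts_tv_ratio(snvs):
    """Return transitions / transversions for an iterable of (ref, alt) pairs.

    Hypothetical helper: pairs that are not single A/C/G/T substitutions
    (indels, N, identical alleles) are skipped.
    """
    transitions = 0
    transversions = 0
    for ref, alt in snvs:
        ref, alt = ref.upper(), alt.upper()
        if ref == alt or ref not in BASES or alt not in BASES:
            continue
        pair = {ref, alt}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1   # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if transversions == 0:
        raise ValueError("no transversions observed; ratio is undefined")
    return transitions / transversions


# Toy data (not from the study): 3 transitions and 2 transversions.
toy_snvs = [("A", "G"), ("C", "T"), ("G", "A"), ("A", "C"), ("T", "G")]
toy_ratio = ts_tv_ratio(toy_snvs)
print(toy_ratio)
```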
Comparison of variant changes for each line
| Type | Total | Homozygous | Heterozygous |
|---|---|---|---|
| Fayoumi vs. RJFᵇ | |||
| SNV | 4,146,394 | 3,638,803 | 507,591 |
| INSᶜ | 180,752 | 158,256 | 22,496 |
| DELᵈ | 135,321 | 125,514 | 9,807 |
| TOTAL | 4,462,467 | 3,922,573 | 539,894 |
| Leghorn vs. RJF | |||
| SNV | 4,271,399 | 4,010,609 | 260,790 |
| INS | 189,494 | 177,474 | 12,020 |
| DEL | 144,839 | 142,181 | 2,658 |
| TOTAL | 4,605,732 | 4,330,264 | 275,468 |
| Fayoumi vs. Leghorn referenceᵃ | |||
| SNV | 3,792,327 | 3,094,177 | 698,150 |
| TOTAL | 3,792,327 | 3,094,177 | 698,150 |
Fayoumi and Leghorn vs. RJF, and Fayoumi vs. the Leghornized reference genome. ᵃ The Fayoumi vs. Leghornized reference genome analysis was done on SNVs only
ᵇ RJF: Red Jungle Fowl, ᶜ INS: insertion variants, ᵈ DEL: deletion variants
Variant annotations and counts by effect type for each line
| Effect Type | Fayoumi | Leghorn | Chi-Square Statistic |
|---|---|---|---|
| Codon_Change_Plus_Codon_Deletion | 16 | 24 | |
| Codon_Change_Plus_Codon_Insertion | 23 | 34 | |
| Codon_Deletion | 40 | 53 | |
| Codon_Insertion | 46 | 63 | |
| Downstream | 401,163 | 440,064 | |
| Exon | 418 | 473 | |
| Frame_Shift | 384 | 504 | |
| Intergenic | 2,344,623 | 2,430,279 | |
| Intron | 2,205,047 | 2,264,238 | |
| Non_Synonymous_Coding | 14,335 | 16,924 | |
| Non_Synonymous_Start | 5 | 2 | |
| Splice_Site_Acceptor | 229 | 236 | |
| Splice_Site_Donor | 196 | 251 | |
| Start_Gained | 1,015 | 1,185 | |
| Start_Lost | 35 | 41 | |
| Stop_Gained | 107 | 121 | |
| Stop_Lost | 16 | 18 | |
| Synonymous_Coding | 37,502 | 41,860 | |
| Synonymous_Stop | 9 | 12 | |
| Upstream | 397,941 | 438,052 | |
| Utr_3_Prime | 48,430 | 52,433 | |
| Utr_5_Prime | 6,445 | 7,802 | |
The table shows variant annotations and counts for the Fayoumi and Leghorn populations vs. RJF by effect type. The “effect type” is the sequence ontology term for the variant's consequence, for example that the variant hits an intron or causes a frameshift. A Pearson’s chi-square goodness-of-fit test was used for comparison (P < 0.01)
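The caption names a Pearson's chi-square goodness-of-fit test at P < 0.01 but does not state the null hypothesis. The sketch below assumes, as an illustration only, that the expected count for each effect type is an equal split between the two lines (1 degree of freedom, critical value 6.635 at α = 0.01). The function name is hypothetical, and the example counts are taken from the table above.

```python
# Chi-square critical value for 1 degree of freedom at alpha = 0.01.
CRITICAL_1DF_ALPHA_01 = 6.635


def chi_square_equal_split(fayoumi_count, leghorn_count):
    """Pearson goodness-of-fit statistic against an equal-split expectation.

    Assumption (not stated in the source): under the null hypothesis both
    lines carry the same number of variants of a given effect type.
    """
    expected = (fayoumi_count + leghorn_count) / 2
    return (
        (fayoumi_count - expected) ** 2 + (leghorn_count - expected) ** 2
    ) / expected


# Counts copied from the effect-type table (Fayoumi, Leghorn).
example_counts = {
    "Frame_Shift": (384, 504),
    "Splice_Site_Acceptor": (229, 236),
    "Stop_Gained": (107, 121),
}

for effect, (fayoumi, leghorn) in example_counts.items():
    statistic = chi_square_equal_split(fayoumi, leghorn)
    exceeds = statistic > CRITICAL_1DF_ALPHA_01
    print(f"{effect}: chi2 = {statistic:.3f}, exceeds critical value: {exceeds}")
```

Under this assumed null, a large difference such as the frameshift counts exceeds the critical value, while near-equal counts such as the splice-site acceptors do not.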
Variant totals by mutation type
| Line | Mutation | Count | Percent |
|---|---|---|---|
| Fayoumi | Missense | 14,389 | 27.6 % |
| | Nonsense | 106 | 0.2 % |
| | Silent | 37,512 | 72.1 % |
| Leghorn | Missense | 16,982 | 28.7 % |
| | Nonsense | 118 | 0.2 % |
| | Silent | 41,873 | 71.0 % |
The missense/silent ratio is 0.3836 for the Fayoumi population and 0.4056 for the Leghorn population
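The missense/silent ratios quoted above follow directly from the counts in the table. A short sketch of the arithmetic, using the table's counts; the dictionary layout and function name are hypothetical:

```python
# Counts copied from the "Variant totals by mutation type" table.
mutation_counts = {
    "Fayoumi": {"missense": 14_389, "nonsense": 106, "silent": 37_512},
    "Leghorn": {"missense": 16_982, "nonsense": 118, "silent": 41_873},
}


def missense_silent_ratio(counts):
    """Ratio of missense to silent variant counts for one line."""
    return counts["missense"] / counts["silent"]


ratios = {
    line: round(missense_silent_ratio(counts), 4)
    for line, counts in mutation_counts.items()
}
print(ratios)
```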
Classification of SNVs used for validation
| Class | Fayoumi SNVs | Leghorn SNVs | Common SNVs | Total | Classification description |
|---|---|---|---|---|---|
| A | 5 | 0 | 0 | 5 | Segregating in population only |
| B | 14 | 0 | 0 | 14 | Fayoumi and Leghorn different (one segregating, one not segregating), and segregating in controls |
| C | 2 | 0 | 0 | 2 | Fayoumi, Leghorn, and controls mix of homozygous for reference or alternate allele, but no heterozygotes |
| D | 4 | 10 | 0 | 14 | Failed |
| E | 11 | 24 | 2 | 37 | Evidence of duplication |
| F | 1 | 2 | 0 | 3 | Only Homozygous (Fayoumi and Leghorn homozygous for different alleles, segregating in controls) |
| Total | 37 | 36 | 2 | 75 | |
| Pass rate = (75 − 14)/75 = 81.3 % | | | | | |
The table shows the results of wet-lab validation of 100 uncharacterized SNVs. The validation data were used to inform the additional filtering steps (strict filter) applied in the downstream analysis of within-line variation
Overrepresented gene ontology terms for moderate-impactᵇ, line-specific variants in Fayoumi and Leghorn lines
| Fayoumi GO Term | Count | Benjamini-Hochberg corrected p-valueᵃ |
|---|---|---|
| | 22 | 1.10E-03 |
| Ribonucleotide binding | 154 | 1.70E-03 |
| Purine ribonucleotide binding | 154 | 1.70E-03 |
| Fibronectin, type III | 22 | 8.10E-03 |
| | 8 | 1.70E-02 |
| Nucleotide binding | 177 | 2.10E-02 |
| Protein kinase activity | 61 | 2.60E-02 |
| Leghorn GO Term | Count | Benjamini-Hochberg corrected p-valueᵃ |
| ECM-receptor interaction | 25 | 9.60E-05 |
| Extracellular matrix | 41 | 1.90E-04 |
| Metal ion binding | 270 | 2.30E-03 |
| Proteinaceous extracellular matrix | 37 | 1.30E-03 |
| Extracellular region | 94 | 2.30E-03 |
| Cell division and chromosome partitioning | 26 | 2.60E-02 |
| Calcium ion binding | 77 | 3.90E-02 |
| Aminophospholipid transporter activity | 7 | 4.30E-02 |
| Phospholipid-translocating ATPase activity | 7 | 4.30E-02 |
Over-represented GO terms for moderate-impact variants (fixed/segregating) unique to the inbred Fayoumi and Leghorn lines. ᵃ Benjamini-Hochberg corrected p-value cut-off α = 0.05. ᵇ Moderate-impact variants: non_synonymous_coding, codon_change, codon_insertion, codon_change_plus_codon_insertion, codon_deletion, codon_change_plus_codon_deletion, utr_5_deleted, utr_3_deleted
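The footnote refers to Benjamini-Hochberg corrected p-values with a cut-off of α = 0.05. The sketch below shows the standard Benjamini-Hochberg adjustment in plain Python. The raw p-values are hypothetical and the function name is illustrative; the values in the table come from the enrichment tool's own correction, not from this code.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


ALPHA = 0.05

# Hypothetical raw p-values for five GO terms (not from the study).
raw_pvalues = [0.001, 0.008, 0.039, 0.041, 0.20]
adjusted_pvalues = benjamini_hochberg(raw_pvalues)
significant = [p <= ALPHA for p in adjusted_pvalues]
print(adjusted_pvalues)
print(significant)
```

In this toy input the third and fourth terms have raw p-values below 0.05 but adjusted values just above it, which is the kind of term the corrected cut-off excludes.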
Gene ontology terms from DAVID for variant regions with greatest difference (F = 1)
| GO Terms | Count | Benjamini corrected p-value |
|---|---|---|
| Nucleoside binding | 630 | 6.90E-13 |
| Purine nucleoside binding | 626 | 4.30E-13 |
| Adenyl nucleotide binding | 622 | 3.50E-13 |
| Nucleotide binding | 869 | 9.10E-13 |
| Purine nucleotide binding | 745 | 2.90E-12 |
| Adenyl ribonucleotide binding | 585 | 1.20E-11 |
| ATP binding | 581 | 1.70E-11 |
| Ribonucleotide binding | 708 | 5.20E-11 |
| Purine ribonucleotide binding | 708 | 5.20E-11 |
| Protein kinase activity | 263 | 1.60E-07 |
| Protein amino acid phosphorylation | 273 | 7.30E-05 |
| ATP-binding | 243 | 1.50E-05 |
| Nucleoside-triphosphatase regulator activity | 129 | 1.80E-05 |
| GTPase regulator activity | 125 | 3.10E-05 |
| Protein serine/threonine kinase activity | 168 | 3.70E-05 |
| Extracellular ligand-gated ion channel activity | 49 | 4.00E-05 |
| Nucleotide-binding | 300 | 1.00E-04 |
| Phosphorus metabolic process | 361 | 9.20E-04 |
| Phosphate metabolic process | 361 | 9.20E-04 |
| Enzyme activator activity | 78 | 2.90E-04 |
| Nucleotide phosphate-binding region: ATP | 103 | 2.50E-02 |
| GTPase activator activity | 63 | 1.10E-03 |
| Identical protein binding | 112 | 1.50E-03 |
| Ligand-gated ion channel activity | 64 | 1.50E-03 |
| | 64 | 1.50E-03 |
Functional categories from DAVID representing the genes that had F values of 1. GO terms from DAVID are based on F values of 1 for the comparison of variant position between populations. Benjamini-corrected p-value cut-off α = 0.05
Gene ontology terms from REViGO for variant regions with greatest difference (F = 1)
| Description | Frequency | Uniqueness |
|---|---|---|
| Immune system process | 0.86 % | 0.99 |
| Cellular protein modification process | 2.99 % | 0.83 |
| Behavior | 0.09 % | 0.92 |
| Metabolic process | 78.07 % | 1 |
| Cellular process | 70.74 % | 1 |
| Cellular component organization | 4.20 % | 0.95 |
| Sexual reproduction | 0.08 % | 0.99 |
| Biological adhesion | 2.09 % | 0.99 |
| Signaling | 5.13 % | 0.99 |
| Multicellular organismal process | 1.33 % | 0.99 |
| Developmental process | 1.67 % | 0.99 |
| Growth | 0.14 % | 0.99 |
| Locomotion | 3.09 % | 0.99 |
| Single-organism process | 25.74 % | 1 |
| Single-multicellular organism process | 1.30 % | 0.8 |
| Positive regulation of biological process | 0.84 % | 0.76 |
| Anatomical structure development | 1.38 % | 0.89 |
| Response to stimulus | 10.51 % | 0.99 |
| Localization | 17.22 % | 1 |
| Multi-organism process | 4.65 % | 0.99 |
REViGO visualization showing the most unique GO terms represented by the F list of genes for the comparison of the population structures of the Fayoumi and Leghorn. The list reveals terms such as immune system processes and sexual reproduction that represent the traits for which each breed is characterized
Genomic annotations and count of variants for Fayoumi vs. Leghorn referenceᵃ
| Type (Fayoumi vs. Leghorn referenceᵃ) | Count |
|---|---|
| DOWNSTREAM | 161,833 |
| EXON | 300 |
| INTERGENIC | 872,171 |
| INTRON | 812,038 |
| NONE | 2,121,836 |
| NON_SYNONYMOUS_CODING | 26,500 |
| NON_SYNONYMOUS_START | 16 |
| SPLICE_SITE_ACCEPTOR | 602 |
| SPLICE_SITE_DONOR | 634 |
| START_GAINED | 507 |
| START_LOST | 31 |
| STOP_GAINED | 1,511 |
| STOP_LOST | 53 |
| SYNONYMOUS_CODING | 21,093 |
| SYNONYMOUS_START | 21 |
| SYNONYMOUS_STOP | 27 |
| UPSTREAM | 158,646 |
| UTR_3_PRIME | 23,377 |
| UTR_5_PRIME | 3,428 |
Counts by region are based on SNVs only. ᵃ Alternate reference genome
Fig. 1 Chromosome 16 variants per 10 kilobases (kb) in Fayoumi/Leghorn vs. RJF and Fayoumi vs. Leghornized reference. The shape of the graph shows the amount of variability still present on chromosome 16 despite high levels of homozygosity for each population vs. the reference genome and for the Fayoumi vs. Leghorn alternate reference. The MHC regions are highlighted to show differences in variation possibly related to the difference in pathogen resistance between the two populations. The y-axis represents variant counts, the x-axis represents position, and the dashed lines show peak heights for the first 250 kb of the chromosome. The Fayoumi vs. Leghorn alternate reference comparison is based on SNVs only
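The figure plots variant counts per 10 kb window along chromosome 16. A minimal sketch of that kind of binning is shown below, assuming 1-based variant positions as in VCF files. The function name and the toy positions are hypothetical and are not taken from the study's data.

```python
from collections import Counter

WINDOW_SIZE = 10_000  # 10 kb windows, as in the figure


def variants_per_window(positions, window=WINDOW_SIZE):
    """Count variants in consecutive fixed-size windows.

    Assumes 1-based positions. Returns a dict mapping each window's
    1-based start coordinate to the number of variants it contains;
    windows with no variants are omitted.
    """
    counts = Counter((position - 1) // window for position in positions)
    return {
        index * window + 1: count
        for index, count in sorted(counts.items())
    }


# Toy positions (not from the study).
toy_positions = [1, 9_999, 10_000, 10_001, 25_000, 25_500]
window_counts = variants_per_window(toy_positions)
print(window_counts)
```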