Wen-Lang Fan, Chen Siang Ng, Chih-Feng Chen, Mei-Yeh Jade Lu, Yu-Hsiang Chen, Chia-Jung Liu, Siao-Man Wu, Chih-Kuan Chen, Jiun-Jie Chen, Chi-Tang Mao, Yu-Ting Lai, Wen-Sui Lo, Wei-Hua Chang, Wen-Hsiung Li.
Abstract
Domestic chickens are excellent models for investigating the genetic basis of phenotypic diversity, as numerous phenotypic changes in physiology, morphology, and behavior in chickens have been artificially selected. Genomic studies are required to characterize genome-wide patterns of DNA variation and thereby dissect the genetic basis of phenotypic traits. We sequenced the genomes of the Silkie and the Taiwanese native chicken L2 at ∼23- and 25-fold average coverage depth, respectively, using Illumina sequencing. The reads were mapped onto the chicken reference genome (including 5.1% Ns) to 92.32% genome coverage for the two breeds. Using a stringent filter, we identified ∼7.6 million single-nucleotide polymorphisms (SNPs) and 8,839 copy number variations (CNVs) in the mapped regions; 42% of the SNPs have not been found in other chickens before. Among the 68,906 SNPs annotated in the chicken sequence assembly, 27,852 were nonsynonymous SNPs located in 13,537 genes. We also identified hundreds of shared and divergent structural and copy number variants in intronic and intergenic regions and in coding regions in the two breeds. Functional enrichments of the identified genetic variants are discussed. Radical nsSNP-containing immunity genes were enriched in the QTL regions associated with some economic traits for both breeds. Moreover, genetic changes involved in selective sweeps were detected. From the selective sweeps identified in our two breeds, several genes associated with growth, appetite, and metabolic regulation were identified. Our study provides a framework for genetic and genomic research of domestic chickens and facilitates the use of the domestic chicken as an avian model for genomic, biomedical, and evolutionary studies.
Keywords: CNVs; chicken; genetic variation; single nucleotide polymorphism; whole genome resequencing
Year: 2013 PMID: 23814129 PMCID: PMC3730349 DOI: 10.1093/gbe/evt097
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Statistics of Genetic Variants in the Silkie and L2 Genomes
| Variant category | Silkie-RJF: Total | Silkie-RJF: Specific | Silkie-RJF: Novel | L2-RJF: Total | L2-RJF: Specific | L2-RJF: Novel | Silkie-L2: Shared | Silkie-L2: Different |
|---|---|---|---|---|---|---|---|---|
| All SNPs | ||||||||
| Intergenic | 2,828,057 | 1,322,844 | 1,317,281 | 2,695,771 | 1,190,558 | 1,166,362 | 1,505,213 | 2,513,402 |
| Intergenic (Upstream w/5-kb) | 291,237 | 133,325 | 136,942 | 285,544 | 127,632 | 127,049 | 157,912 | 260,957 |
| Intergenic (Downstream w/5-kb) | 263,964 | 121,474 | 123,511 | 256,170 | 113,680 | 112,260 | 142,490 | 235,154 |
| Intergenic (Up/Down w/5-kb) | 75,295 | 33,842 | 36,228 | 75,621 | 34,168 | 35,152 | 41,453 | 68,010 |
| Genic | ||||||||
| Intronic | 2,463,943 | 1,153,339 | 1,096,560 | 2,364,973 | 1,054,369 | 974,496 | 1,310,604 | 2,207,708 |
| ncRNA | 590 | 304 | 320 | 532 | 245 | 280 | 280 | 549 |
| 5′ UTR | 3,827 | 1,718 | 1,947 | 4,110 | 2,001 | 2,051 | 2,109 | 3,719 |
| 3′ UTR | 24,239 | 11,262 | 10,838 | 23,649 | 10,672 | 9,917 | 12,977 | 21,934 |
| 5′/3′ UTR | 65 | 27 | 38 | 64 | 26 | 31 | 38 | 53 |
| Exonic splice site | 1,395 | 577 | 594 | 1,416 | 598 | 557 | 818 | 1,175 |
| Intronic splice site | 917 | 390 | 609 | 935 | 408 | 618 | 527 | 798 |
| Exonic | 67,503 | 29,404 | 25,994 | 67,619 | 29,520 | 24,420 | 38,099 | 58,924 |
| Synonymous | 48,112 | 20,828 | 16,660 | 48,078 | 20,794 | 15,474 | 27,284 | 41,622 |
| Stop-gained | 232 | 114 | 116 | 228 | 110 | 103 | 118 | 224 |
| Stop-loss | 23 | 8 | 14 | 21 | 6 | 15 | 15 | 14 |
| Nonframeshift indel | 324 | 174 | 324 | 342 | 192 | 342 | 150 | 366 |
| Frameshift indel | 948 | 385 | 948 | 986 | 423 | 986 | 563 | 808 |
| Nonsynonymous | 19,259 | 8,472 | 8,527 | 19,380 | 8,593 | 8,057 | 10,787 | 17,065 |
a. RJF, red jungle fowl.
b. Specific: variants that exist in only one breed.
c. Novel: variants not previously present in the SNPdb.
d. Shared: total variants shared by the two breeds.
e. Different: total variants that differ between the two breeds.
f. Intergenic: variants in intergenic regions.
g. Upstream: variants that overlap the 5-kb region upstream of a transcription start site.
h. Downstream: variants that overlap the 5-kb region downstream of a transcription end site.
i. Up/Down: variants located in both downstream and upstream regions (possibly of two different genes).
j. Intronic: variants that overlap introns.
k. ncRNA: variants that overlap a transcript without coding annotation in the gene definition. This does not mean that the RNA is never translated; it merely means that the gene annotation system did not give a coding sequence annotation.
l. 5′ UTR: variants that overlap 5′ untranslated regions.
m. 3′ UTR: variants that overlap 3′ untranslated regions.
n. 5′/3′ UTR: variants located in both 5′ UTR and 3′ UTR regions (possibly of two different genes).
o. Exonic splice site: variants within exons but close to exon/intron boundaries.
p. Intronic splice site: variants in the 2 bp of introns that are close to exons.
q. Exonic: only coding exonic portions, not UTR portions.
r. Synonymous: single-nucleotide changes that do not cause amino acid changes.
s. Stop-gained: nonsynonymous SNVs, frameshift indels, nonframeshift indels, or block substitutions that lead to the immediate creation of a stop codon at the variant site. For frameshift mutations, a stop codon downstream of the variant was not considered “stop-gained.”
t. Stop-loss: nonsynonymous SNVs, frameshift indels, nonframeshift indels, or block substitutions that lead to the immediate elimination of a stop codon at the variant site.
u. Nonframeshift indel: indels of three or multiples of three nucleotides that do not cause frameshift changes in the protein-coding sequence.
v. Frameshift indel: indels of one or more nucleotides that cause frameshift changes in the protein-coding sequence.
x. Nonsynonymous: single-nucleotide changes that cause amino acid changes.
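The footnote definitions suggest two simple identities that the counts in this table should satisfy: for each breed, Total = Specific + Shared, and Different = Silkie Specific + L2 Specific. These identities are an inference from the definitions, not something the authors state, so the following Python sketch is only a sanity check on the numbers as transcribed here. The row selection and the names `ROWS` and `inconsistent_rows` are illustrative.

```python
# Counts copied from the table above, one tuple per row:
# (Silkie total, Silkie specific, L2 total, L2 specific, shared, different)
ROWS = {
    "Intergenic": (2828057, 1322844, 2695771, 1190558, 1505213, 2513402),
    "Intronic": (2463943, 1153339, 2364973, 1054369, 1310604, 2207708),
    "ncRNA": (590, 304, 532, 245, 280, 549),
    "Exonic": (67503, 29404, 67619, 29520, 38099, 58924),
    "Nonsynonymous": (19259, 8472, 19380, 8593, 10787, 17065),
}


def inconsistent_rows(rows):
    """Return the names of rows that violate the assumed identities."""
    bad = []
    for name, (s_total, s_spec, l_total, l_spec, shared, different) in rows.items():
        ok = (
            s_total == s_spec + shared          # Silkie: Total = Specific + Shared
            and l_total == l_spec + shared      # L2: Total = Specific + Shared
            and different == s_spec + l_spec    # Different = sum of breed-specific counts
        )
        if not ok:
            bad.append(name)
    return bad


print(inconsistent_rows(ROWS))  # ['ncRNA']
```

Under these assumed identities, the large categories add up exactly, while the ncRNA row as transcribed here does not (304 + 280 = 584, not 590). Whether that reflects the original table or the extraction cannot be determined from this page.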
Figure: Densities of SNPs and indels on individual chromosomes detected between the Silkie and L2 genomes.
Figure: Annotation of SNPs and indels and distribution of SNPs. Predicted functional consequences of SNPs and indels of the Silkie (A) and L2 (B).
Figure: Distribution of indel lengths in (A) the Silkie and (B) the L2.
Over-represented GO Categories Associated with Chicken CNVs
| Term | Description | P value | Fold Enrichment | FDR |
|---|---|---|---|---|
| GO:0003700 | Transcription factor activity | 1.48E-06 | 7.33 | 0.0015 |
| GO:0043565 | Sequence-specific DNA binding | 2.42E-06 | 8.57 | 0.0024 |
| GO:0006355 | Regulation of transcription, DNA-dependent | 5.32E-06 | 5.77 | 0.0065 |
| GO:0051252 | Regulation of RNA metabolic process | 6.08E-06 | 5.68 | 0.0074 |
| GO:0030528 | Transcription regulator activity | 2.75E-05 | 5.15 | 0.0272 |
Note.—FDR, false discovery rate.
Figure: Summary of resequencing data of the Silkie and L2 genomes. Distribution of SNPs, indels, and structural variants (≥50 bp) in the Silkie and L2 genomes, relative to the reference red jungle fowl genome. The circular map shows the genomic distributions of the different classes of variation for the Silkie and L2 genomes at a resolution of 1-Mb bins. Chromosomes are shown in different colors in the outermost circle. The heatmaps from the outermost to the innermost circle show the abundance of four types of structural variants (length ≥50 bp): deletions, insertions, tandem duplications, and inversions. The purple and blue colors in the CNV (≥5 kb) ring represent gain and loss of copy number for the Silkie genome relative to the L2 genome. For SNPs in the exonic regions, red stands for homozygous SNPs and blue for heterozygous SNPs.
Figure: CNVs overlapping with the EDN3 gene region. (A) Log2 ratio plot of the EDN3 (Q3MU75_CHICK; ENSGALT00000039060) gene region. Each point shows the log2 ratio of the number of mapped Silkie reads to the number of mapped L2 reads. Points are colored based on the log10 P value calculated by the CNV-seq software. (B) The CNVR containing the EDN3 gene on chromosome 20 as visualized using the UCSC Genome Browser.
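The quantity plotted in panel A is a per-window log2 ratio of Silkie to L2 read counts. A minimal Python sketch of that calculation follows. The normalization by total mapped reads is an assumption added here so that libraries of different sizes are comparable; the caption does not describe it, and this is not the CNV-seq implementation. The function name `log2_ratio` and all counts are hypothetical.

```python
import math


def log2_ratio(test_reads, ref_reads, test_total, ref_total):
    """Library-size-normalized log2 ratio of test to reference reads in one window.

    Returns None when either window count is zero, because the ratio is
    undefined (or infinite) there.
    """
    if test_reads <= 0 or ref_reads <= 0:
        return None
    # Equivalent to (test_reads / test_total) / (ref_reads / ref_total).
    return math.log2((test_reads * ref_total) / (ref_reads * test_total))


# Hypothetical per-window read counts (Silkie, L2); not data from the study.
windows = [(120, 60), (80, 80), (30, 120)]
silkie_total, l2_total = 1_000_000, 1_000_000  # hypothetical library sizes

ratios = [log2_ratio(s, l, silkie_total, l2_total) for s, l in windows]
print(ratios)  # [1.0, 0.0, -2.0]
```

With equal library sizes, a window holding twice as many Silkie reads as L2 reads gives +1 (a candidate gain in Silkie relative to L2), and a window with a quarter as many gives −2 (a candidate loss).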
Top Annotation Clusters of Loss-of-Function Mutations Identified by the DAVID Functional Annotation Clustering Tool
| Breeds | Mutation Types | Representative Annotation Terms | Enrichment Score |
|---|---|---|---|
| Silkie | Stop-gained | Muscle contraction | 1.87 |
| Silkie | Stop-gained | Response to radiation | 1.30 |
| Silkie | Stop-gained | Adult behavior | 1.27 |
| Silkie | Stop-gained | Cell death | 1.15 |
| Silkie | Stop-gained | Regulation of programmed cell death | 1.05 |
| Silkie | Frameshift | Nucleotide binding | 1.33 |
| Silkie | Frameshift | Modification-dependent protein catabolic process | 1.05 |
| L2 | Stop-gained | Endopeptidase inhibitor activity | 1.04 |
| L2 | Frameshift | Histone-lysine | 1.28 |
| L2 | Frameshift | Amino acid transmembrane transporter activity | 1.26 |
| L2 | Large-effect nsSNP | Nitrogen compound biosynthetic process | 2.65 |
| L2 | Large-effect nsSNP | Cofactor binding | 1.47 |
| L2 | Large-effect nsSNP | Vitamin binding | 1.01 |
Note.—The genes were analyzed by the Functional Annotation Clustering Tool. The top annotation clusters that have group enrichment scores >1 are listed. The representative biology terms associated with the top annotation clusters are manually summarized.
Figure: Distributions of H values. The horizontal dashed lines indicate the threshold at H = −1.
Top Annotation Clusters of Genes Identified in the Putative Selective Sweep Regions by the DAVID Functional Annotation Clustering Tool
| Representative Annotation Terms | Enrichment Score |
|---|---|
| GTPase regulator activity | 1.75 |
| Glycoprotein biosynthetic process | 1.70 |
| Nucleoside binding | 1.43 |
| Regulation of generation of precursor metabolites and energy | 1.40 |
| Regulation of cell adhesion | 1.34 |
| Purinergic nucleotide receptor activity, G-protein coupled | 1.25 |
| Microtubule binding | 1.23 |
| Stem cell development | 1.16 |
| Cell motion | 1.05 |
Note.—The genes were analyzed by the Functional Annotation Clustering Tool. The top annotation clusters that have group enrichment scores >1 are listed. The representative biology terms associated with the top annotation clusters are manually summarized.
Identified nsSNPs Reportedly Associated with Phenotypes in the Other Breeds of Chicken
| Gene | Ensembl ID | Chr | Breed | Mutation | Association | Reference |
|---|---|---|---|---|---|---|
| | ENS-25999 | 1 | L2 | Radical nsSNP | Morbidity, early mortality, viral shedding, and cytokine responses | |
| | ENS-26793 | 3 | L2 | Radical nsSNP | Production traits | |
| | ENS-15634 | 9 | B | Radical nsSNP | Eggshell quality traits | |
| | ENS-21020 | 10 | S | Stop gained | Egg-laying traits | |
| | ENS-38569 | 10 | L2 | Radical nsSNP | Tibial dyschondroplasia | |
| | ENS-09129 | 19 | L2 | Radical nsSNP | Performance | |
| | ENS-35206 | 19 | B | Radical nsSNP | Feather damage | |
| | ENS-01590 | 21 | B | Radical nsSNP | Performance traits | |
| | ENS-04765 | 21 | B | Radical nsSNP | Experimental Pasteurella multocida infection | |
| | ENS-24882 | Z | B | Radical nsSNP | Fat deposition | |
a. Ensembl ID: only the last five digits of the Ensembl chicken gene annotation are shown.
b. Breed: S, Silkie; L2, Taiwanese L2 chicken; B, both breeds (S and L2).
Statistics of Structural Variants in Silkie and L2
| SV Type | Silkie | L2 |
|---|---|---|
| Large deletion (≥50 bps) | ||
| Downstream | 149 | 138 |
| Exonic | 54 | 60 |
| Intergenic | 6501 | 5446 |
| Intronic | 4791 | 4271 |
| ncRNA_exonic | 3 | 5 |
| Splicing | 13 | 14 |
| Upstream | 114 | 145 |
| Upstream; downstream | 6 | 10 |
| UTR3 | 36 | 26 |
| UTR5 | 2 | 2 |
| Large insertion (≥50 bps) | ||
| Downstream | 3 | 2 |
| Exonic | 1 | 0 |
| Intergenic | 145 | 168 |
| Intronic | 104 | 113 |
| Upstream | 4 | 3 |
| Tandem duplication (≥50 bps) | ||
| Downstream | 4 | 5 |
| Exonic | 2 | 4 |
| Exonic; splicing | 0 | 1 |
| Intergenic | 97 | 123 |
| Intronic | 29 | 48 |
| Splicing | 0 | 1 |
| Upstream | 2 | 6 |
| Inversion (≥50 bps) | ||
| Downstream | 0 | 2 |
| Exonic | 10 | 12 |
| Intergenic | 56 | 48 |
| Intronic | 30 | 31 |
| Upstream | 1 | 0 |
| Splicing | 0 | 1 |