Literature DB >> 23814129

Genome-wide patterns of genetic variation in two domestic chickens.

Wen-Lang Fan1, Chen Siang Ng, Chih-Feng Chen, Mei-Yeh Jade Lu, Yu-Hsiang Chen, Chia-Jung Liu, Siao-Man Wu, Chih-Kuan Chen, Jiun-Jie Chen, Chi-Tang Mao, Yu-Ting Lai, Wen-Sui Lo, Wei-Hua Chang, Wen-Hsiung Li.   

Abstract

Domestic chickens are excellent models for investigating the genetic basis of phenotypic diversity, as numerous phenotypic changes in physiology, morphology, and behavior in chickens have been artificially selected. Genomic study is required to study genome-wide patterns of DNA variation for dissecting the genetic basis of phenotypic traits. We sequenced the genomes of the Silkie and the Taiwanese native chicken L2 at ∼23- and 25-fold average coverage depth, respectively, using Illumina sequencing. The reads were mapped onto the chicken reference genome (including 5.1% Ns) to 92.32% genome coverage for the two breeds. Using a stringent filter, we identified ∼7.6 million single-nucleotide polymorphisms (SNPs) and 8,839 copy number variations (CNVs) in the mapped regions; 42% of the SNPs have not found in other chickens before. Among the 68,906 SNPs annotated in the chicken sequence assembly, 27,852 were nonsynonymous SNPs located in 13,537 genes. We also identified hundreds of shared and divergent structural and copy number variants in intronic and intergenic regions and in coding regions in the two breeds. Functional enrichments of identified genetic variants were discussed. Radical nsSNP-containing immunity genes were enriched in the QTL regions associated with some economic traits for both breeds. Moreover, genetic changes involved in selective sweeps were detected. From the selective sweeps identified in our two breeds, several genes associated with growth, appetite, and metabolic regulation were identified. Our study provides a framework for genetic and genomic research of domestic chickens and facilitates the domestic chicken as an avian model for genomic, biomedical, and evolutionary studies.

Entities:  

Keywords:  CNVs; chicken; genetic variation; single nucleotide polymorphism; whole genome resequencing

Mesh:

Year:  2013        PMID: 23814129      PMCID: PMC3730349          DOI: 10.1093/gbe/evt097

Source DB:  PubMed          Journal:  Genome Biol Evol        ISSN: 1759-6653            Impact factor:   3.416


Introduction

The domestic chicken (Gallus gallus domesticus) was the first bird domesticated and has been deeply integrated into the human culture for more than 8,000 years (Dohner 2001; Price 2002). Chickens and their eggs have been the most important supply of animal protein for human populations. All chicken breeds are considered to share a common ancestor in the red jungle fowl (G. gallus gallus), which is still wildly distributed in parts of India and Southeast Asia (Crawford 1990; Miao et al. 2013), although some recent genetic evidence suggested that several other jungle fowl species (grey junglefowl [G. sonnerati], Geylon junglefowl [G. lafayettei], and green junglefowl [G. varius]) might also have contributed to the ancestral genetic background of domestic chickens (Nishibori et al. 2005; Eriksson et al. 2008; Sawai et al. 2010). Charles Darwin recognized that phenotypic variations in domesticated plants and animals arise from selective breeding, giving credence to evolution in natural populations by natural selection. Domesticated animals have played a significant role in introducing the ideas of evolution, particularly natural selection, in Darwin’s On the Origin of Species (Darwin 1859). Darwin further expanded his ideas in the book The Variation of Animals and Plants under Domestication (1868), using traits exaggerated by artificial selection to demonstrate the potential power of natural and sexual selection (Darwin 1868). The vast diversity of phenotypes among breeds that has been created by selective breeding of domesticated animals can compare to those observed among wild species in nature. In addition to agricultural applications, domesticated animals have also contributed to basic and medical biology, as they provide excellent opportunities for unraveling the genetic and molecular basis of phenotypic variations (Andersson 2001; Andersson and Georges 2004). The chicken is the most variable bird (Somes 1988), because there are hundreds of chicken breeds, integrating various mutations affecting body size, reproduction, growth rate, posture, color, feather structure and distribution, comb shape, and behavior. Domestic animals have some advantages over traditional model organisms. Extreme phenotypes that may not be found in the wild can be maintained by artificial selection for aesthetic or economic demands. Mutations of biologically important traits have a greater chance of appearing and of being selected due to a large population size and higher longevity in domestic animals, providing us an exceptional opportunity to identify novel functions for specific genes. The identification of genetic variants controlling morphology in the chicken population has helped us understand the genetics and genomics of both simple and complex traits of birds (Wright et al. 2009; Dorshorst et al. 2011; Mou et al. 2011; Imsland et al. 2012; Johnsson et al. 2012; Ng et al. 2012; Shinomiya et al. 2012; Wang et al. 2012a). Among the chicken breeds, the Silkie chicken exhibits several phenotypic variations that are not commonly seen in other chicken breeds. These unique features include elongated feathers on the crown of the head, fluffy plumage, dark blue flesh, viscera, and bones, blue earlobes, feathered legs and feet, and five toes on each foot. Moreover, they have several color variants. Most of these unique traits are controlled by single Mendelian genes. These mutant alleles include crest (Cr) and muff (Mb), polydactyly (Po), ptilopody (Pt), hyperpigmentation (Fm), and silkiness or hookless (h). Genomic regions associated with morphological traits of the Silkie have been published (Dorshorst et al. 2010), giving us an opportunity to perform fine mapping for the causative genes and mutations. Furthermore, the International Chicken Polymorphism Map Consortium has described 2.8 million single-nucleotide polymorphisms (SNPs) among broiler, layer, and Silkie (Wong et al. 2004; Wang et al. 2005), but the coverage was at one-quarter for each of these domestic chicken lines, which is too low for finding causative mutations. Another breed used in the present study is the Taiwan country chicken (TCC) L2 breed. This breed originated from a single population and were subjected to selection for ∼30 years for egg production and body weight/comb size (Chao and Lee 2001). This local breed represents both a heritage and a reservoir of genetic variability that deserves to be explored and properly managed. Advances in sequencing technologies allow whole genomes to be sequenced more economically and efficiently than ever before, providing an excellent opportunity to discover numerous genetic polymorphisms in a genome and for quantitative trait locus (QTL) analysis and marker-assisted selection. Therefore, we conducted a whole-genome analysis to examine the genetic and genomic features of the Silkie and L2 chickens, providing detailed genetic information of these chicken breeds. We identified genetic variants, including SNPs, insertion/deletion polymorphisms (indels), structural variations (SVs), and copy number variations (CNVs) in these two chicken genomes. Genetic variants were annotated and their potential impacts on gene structures and functions were examined. The genomic sequences were also explored for mutations associated with QTLs in chickens. Furthermore, we analyzed biological processes and molecular functions enriched for genetic variants. The genome resources presented here are useful for breeding and comparative genomics in chicken and related species.

Materials and Methods

Chickens

We obtained genomic DNA samples from the following two chickens: a male Silkie and a male TCC L2 that were raised at the National Chung Hsing University, Taichung, Taiwan.

DNA Library Construction and Sequencing

Genomic DNAs were extracted from peripheral venous blood using the QIAGEN - Gentra Puregene Cell Kit (Qiagen, Venlo, Netherlands). The purified DNAs were assessed for purity and quality by NanoDrop (Thermo Fisher Scientific, Waltham, MA, USA), Qubit (Invitrogen Corp., Carlsbad, CA, USA), and gel electrophoresis. High quality genomic DNAs were then selected for paired-end (PE) library preparation with the method adopted from manufacturer’s protocol (Illumina Inc., San Diego, CA, USA). The genomic DNA was sonicated to the 200–600 bp range with a major peak at 400 bps, followed by end repair, A-tailing, and adaptor ligation. The ligation reaction was then separated on 2% low-range agarose gel (Bio-Rad, Hercules, CA, USA), and four fractions were made at 100-bp intervals on the gel. The DNAs were purified from gel slices using Qiaquick Gel Extraction kit (Qiagen, Venlo, Netherlands) and amplified by 12 cycles of polymerase chain reaction (PCR) using the reagents provided in the Illumina kit. The PCR products were purified using Ampure beads (Beckman Coulter Inc., Brea, CA, USA) and the resulting libraries were determined for quantification and size profiling using Qubit and BioAnalyzer 1000 (Agilent Technologies, Santa Clara, CA, USA). PE sequencing was performed on Illumina GA IIx in the High Throughput Sequencing Core Facility, Biodiversity Research Center, Academia Sinica, Taiwan. High-quality reads (pass-filter rate 78–92%) of PE 2 × 120 nt were obtained from five lanes for each chicken breed.

Public Data Used

The chicken reference genome, together with annotation of genes and repeats, was downloaded from the Ensembl Genome Browser (http://www.ensembl.org/), which has the same sequence as the NCBI build 2.1. The chicken SNP database (dbSNP, http://www.ncbi.nlm.nih.gov/projects/SNP/website) were used to compare the putative SNPs we found. The Chicken QTLdb (http://www.animalgenome.org/cgi-bin/QTLdb/GG/index) were used to locate candidate genes. These data were retrieved in March 2011.

Short Reads Alignment

For read alignment and consensus assembly, we used BWA ver. 0.5.8 (http://biobwa.sourceforge.net/) (Li and Durbin 2010). The following parameters were used: maximum edit distance (maxDiff) = 0.04, maximum number of gap opens (maxGapO) = 1, maximum number of gap extensions (maxGapE) = −1 (disabling long gaps), disallow a long deletion within bp (nDelTail) = 10, disallow an indel within bp (nIndelEnd) = 5, take the first subsequence as seed (seedLen) = 32, maximum edit distance in the seed (maxSeedDiff) = 2, number of threads (nThrds) = 8, mismatch penalty (misMsc) = 3, gap open penalty (gapOsc) = 11, gap extension penalty (gapEsc) = 4. The reads mapped to multiple chromosomal positions and unmapped reads were discarded. We only used reads mapped to a unique position on the reference chicken genome for SNP calling.

SNP Calling

To call SNPs, we first filtered out reads with mapping quality score <20. Then, SAMtools (Li et al. 2009) were used and additional filters were applied as follows: minimum read depth = 3, minimum read depth calling the SNP = 2, and a 20% cutoff of percent aligned reads calling the SNP per total mapped reads at the SNP sites. These identified SNPs were also filtered with more stringent parameters (i.e., minimum depth = 4, minimum SNP = 2, and 20% or higher aligned reads calling the SNP; and minimum depth = 5, minimum SNP = 2, and 20% or higher aligned reads calling the SNP). We distinguished heterozygous and homozygous SNPs using an 80% cutoff of percent aligned reads calling the SNP. BWA was also used to estimate the sequence read depth, which influences the coverage and accuracy of SNP calling. After SNP calling, the SNPs were annotated using the Ensembl gene sets (17,934 genes; available from the Ensembl BioMart site [http://www.ensembl.org/biomart/]). The SNPs and indels in gene regions were annotated using the custom software and the ANNOVAR annotation tool (Wang et al. 2010a). Functional annotations of these loci were compared with the complete genome using annotations from the Database for Annotation, Visualization, and Discovery (DAVID) (Huang et al. 2007, 2009), which uses fuzzy clustering to group genes into functionally related classes based on the similarity of their annotations.

SNP and Small Indel Validation

PCR primers to validate SNPs and small indels were designed using PRIMER3 software (http://frodo.wi.mit.edu/primer3) and were used to amplify 200–1000-bp fragments that were positioned within each selected SNP and indel region according to the reference genome sequence of the red jungle fowl. The optimal primer length was set at 20 bp and all other PRIMER 3 defaults were used. PCR was performed on the Silkie and L2 genomic DNA using TaKaRa Ex Taq® DNA Polymerase (Takara Bio Inc., Shiga, Japan). The PCR reaction was done in 20 μl containing 5 ng of genomic DNA, 2.5 mM Mg2+, 0.5 mM dNTPs, 0.2 μM of each primer, 2 μl of 10× PCR buffer, and 0.5 U of Taq DNA polymerase and was conducted using 35 thermal cycles: one cycle of pre-incubation at 94 °C for 3 min and 94 °C for 30 s, 55–60 °C for 30 s, and 72 °C for 2 min, 72 °C for 7 min at the end of the final cycle. The PCR products were run on 1–2% agarose gels containing ethidium bromide and visualized using a UV transilluminator. The approximate sizes of the PCR products were estimated by running molecular weight markers (GeneRuler 100 bp DNA Ladder; Thermo Fisher Scientific, Waltham, MA, USA) on each gel. All PCR products were sequenced directly after treatment with exonuclease I and calf intestinal alkaline phosphatase (New England BioLabs, Ipswich, MA, USA) by standard methods. Each PCR product was sequenced by the BigDye terminator cycle sequencing kit and sequencer, Applied Biosystems ABI3700 (Applied Biosystems, Santa Clara, CA, USA).

In Silico Analysis of Large-Effect Nonsynonymous Variants

A large-effect nsSNP is defined as a homozygous radical nsSNP in a protein domain, which is computationally predicted by InterProScan on the Ensembl peptide (Quevillon et al. 2005). Radical nsSNPs are computationally predicted by SIFT (http://sift.jcvi.org), which is a sequence homology-based tool that distinguishes intolerant from tolerant amino acid changes and predicts whether an amino acid substitution in a protein will be likely to produce a phenotypic effect (Ng and Henikoff 2003; Kumar et al. 2009). A file containing genomic coordinates and base pair substitutions of all identified novel nonsynonymous variants was used as an input for the SIFT genome tool.

Structure Variance Identification

To call insertions and deletions, we used Pindel v. 0.2.4 p to search for reads where one end is mapped on the genome and the other end can be mapped with high confidence in two (split) portions (Ye et al. 2009). The following parameters were used: the maximum size of structural variations to be detected (max_range_index = 5) = 32,368, the expected fraction of sequencing errors (sequencing_error_rate) = 0.03, the fraction of mismatches (maximum_allowed_mismatch_rate) = 0.05, the minimum alignment length of read (min_num_matched_bases) = 30, the minimum mapping quality of the reads Pindel uses as anchor (anchor_quality) = 20.

CNV Identification

Putative CNVs were identified using the CNV-seq, which examines the mapped reads from two individual chickens and reports regions that exhibit significant read depth differences (Xie and Tammi 2009). The following parameters were used as the default value (log2 threshold = 0.6, P value = 0.001 and minimum windows required = 4) to generate a list of CNVs from the best-hit files. A custom script was used to identify overlapping genes between Ensembl and CNVs.

CNV Validation

For the qPCR analysis, primers were designed using PRIMER3 software (http://frodo.wi.mit.edu/primer3) and were used to amplify 150–250-bp fragments that were positioned within each selected copy number variation region (CNVR) locus (Rozen and Skaletsky 2000). The primers for EDN3 are 5′-GCTGCAGGCAGGTTGGAACT-3′ and 5′-GCAGAGGCTTCCCAGGATCA-3′. The primers for the propionylcoenzyme A carboxylase gene (PCCA, chr1) and thyroid hormone-inducible hepatic protein gene (THRSP, chr1) are used as references as described (Wang et al. 2010b). The qPCR reaction was done in 10 μl containing 6 ng of genomic DNA, 0.2 μM of each primer, and was conducted using 45 thermal cycles: one cycle of pre-incubation at 95 °C for 3 min, 40 cycles of gain (95 °C for 10 s, 60 °C for 20 s, and 72 °C for 1 s), a dissociation stage (95 °C for 5 s, 65 °C for 1 min, and 97 °C for continuous) for melting curve analysis, and a final stage for cooling (40 °C for 10 s). The SYBR green-based qPCR assays were performed on a Roche LightCycler® 480 Instrument II with a 96-well block (Roche Applied Science, Penzberg, Germany). All test samples for each qPCR were assayed in duplicates. All the data were analyzed by the HTC1 software (Roche Applied Science, Penzberg, Germany). The 2−ΔΔCt method was used to calculate the copy numbers (Livak and Schmittgen 2001).

Selective-Sweep Analysis

Identified SNPs were used to detect signatures of selection in 10-kb sliding windows with a step size of 0.5 kb for the two genome sequences. At each detected SNP position, H values were calculated using the formula N = total SNPs in the window; N = total heterozygous SNPs in the window; N = total homozygous identical SNPs in the two domestic line and different from the red jungle fowl in the window. We defined selective sweeps by the criterion of the H of the window ≤−1. The formula and the criterion allow us to omit windows containing very few SNPs and require the identical homozygous SNPs to be in large proportions while effectively detecting a local reduction of heterozygosity.

Data Availability

The full data sets have been submitted to NCBI Sequence Read Archive (SRA) under accession nos. SRX286765, SRX286766, SRX286773, SRX286776, SRX286777, SRX286779-SRX286781, SRX286798, and SRX286799. Bioproject: PRJNA202483. The validated SNPs have been submitted to dbSNP (http://www.ncbi.nlm.nih.gov/projects/SNP/) and the accession nos. are provided in supplementary table S5, Supplementary Material online. The whole set of SNPs and indels will be provided upon request.

Results

Data Production and Short Read Alignment

Genomic libraries were gel size-selected and deep sequencing was carried out on Illumina GA IIx with PE 2 × 120 nt read length. The PE libraries were prepared by gel size selection and the averaged peak sizes of the inserts were 210, 321, 441, 539, and 539 bp for the five Silkie libraries and 202, 312, and 442 bp for the three L2 libraries (supplementary table S1, Supplementary Material online); the pass filtering rate ranged from 85 to 92%. We collected 250 and 319 billion reads for Silkie and L2, respectively (∼30 and 38 gigabases [Gb] of sequence; supplementary table S1, Supplementary Material online). Read mapping to the chicken reference sequence (WUGSC 2.1/galGal 3) was conducted using BWA (Li and Durbin 2010), and 85.70% and 74.83% of the input reads were mapped to unique positions on the chicken reference genome for Silkie and L2, respectively. The total read coverage of the chicken reference genome was 93.22% (including 5.1% Ns, supplementary fig. S1 and S2, supplementary table S2, Supplementary Material online). The reference genome sequence was covered at an average depth of 23.05-fold for Silkie and 24.80-fold for L2. The alignments between the uniquely mapped reads and the reference genome were used to categorize genetic variation, including SNPs, indels, CNVs, and SVs.

SNP and Indel Identification

Mapping of the sequencing reads of Silkie and L2 to the genome of the red jungle fowl revealed ∼7.6 million SNPs for the two breeds (table 1), similar to previous findings (Wong et al. 2004; Wang et al. 2005). In total, 6,021,032 and 5,776,404 variants including 5,385,458 and 5,142,622 SNPs and 635,574 and 633,782 indels (1–73 bp) were found for Silkie and L2, respectively (fig. 1). Comparison of the putative SNPs with the chicken SNP database (dbSNP, http://www.ncbi.nlm.nih.gov/projects/SNP/) revealed 3.28 and 3.33 million sites in “known SNPs” for Silkie and L2, respectively. The remaining 2.12 and 1.82 million SNPs for Silkie and L2 were at positions not previously identified as polymorphic sites. For the genes defined by mRNA transcripts, 11% of the testable SNPs are radical as predicted by SIFT (http://sift.jcvi.org) (Ng and Henikoff 2003; Kumar et al. 2009). As in previous studies, the SNPs are not uniformly distributed on the chromosomes (fig. 1). The genome sequences of other chicken lines (Rubin et al. 2010) had a 44.5-fold genome coverage using ABI SOLiD reads and revealed approximately 7 million known SNPs, in close agreement with our results from Illumina reads of the Silkie and L2 genomes at a lower coverage.
Table 1

Statistics of Genetic Variants in the Silkie and L2 Genomes

Silkie-RJFa
L2-RJFa
Silkie-L2
TotalSpecificbNovelcTotalSpecificbNovelcShareddDifferente
All SNPs
    Intergenicf2,828,0571,322,8441,317,2812,695,7711,190,5581,166,3621,505,2132,513,402
    Intergenic (Upstream w/5-kb)g291,237133,325136,942285,544127,632127,049157,912260,957
    Intergenic (Downstream w/5-kb)h263,964121,474123,511256,170113,680112,260142,490235,154
    Intergenic (Up/Down w/5-kb)i75,29533,84236,22875,62134,16835,15241,45368,010
Genic
    Intronicj2,463,9431,153,3391,096,5602,364,9731,054,369974,4961,310,6042,207,708
    ncRNAk590304320532245280280549
    5′ UTRl3,8271,7181,9474,1102,0012,0512,1093,719
    3′ UTRm24,23911,26210,83823,64910,6729,91712,97721,934
    5′/3′ UTRn6527386426313853
    Exonic splice siteo1,3955775941,4165985578181,175
    Intronic splice sitep917390609935408618527798
    Exonicq67,50329,40425,99467,61929,52024,42038,09958,924
    Synonymousr48,11220,82816,66048,07820,79415,47427,28441,622
    Stop-gaineds232114116228110103118224
    Stop-losst23814216151514
    Nonframeshift indelu324174324342192342150366
    Frameshift indelv948385948986423986563808
    Nonsynonymousx19,2598,4728,52719,3808,5938,05710,78717,065

aRJF, red jungle fowl.

bVariants only exist in one breed.

cVariants previously do not exist in the SNPdb.

dTotal variants shared by two breeds.

eTotal variants different between two breeds.

fVariants in intergenic regions.

gVariants overlap 5-kb regions upstream of transcription start site.

hVariants overlap 5-kb regions downtream of transcription end site.

iVariants located in both downstream and upstream regions (possibly for two different genes).

jVariants overlap introns.

kVariant overlaps a transcript without coding annotation in the gene definition. It does not mean that the RNA will never be translated and merely means that the gene annotation system did not give a coding sequence annotation.

lVariants overlap 5′ untranslated regions.

mVariants overlap 3′ untranslated regions.

nVariants located in both 5′ UTR and 3′ UTR regions (possibly for two different genes).

oVariants within exons but close to exon/intron boundaries.

pThe 2-bp in introns that are close to exons.

qOnly coding exonic portions, but not UTR portions.

rSingle nucleotide changes that do not cause amino acid changes.

sNonsynonymous SNVs, frameshift indels, nonframeshift indels, or block substitutions that lead to the immediate creation of stop codon at the variant site. For frameshift mutations, the stop codon downstream of the variant was not be considered as “stop-gained.”

tNonsynonymous SNVs, frameshift indels, nonframeshift indels, or block substitutions that lead to the immediate elimination of stop codon at the variant sites.

uIndels of three or multiples of three nucleotides that do not cause frameshift changes in protein coding sequence.

vIndels of one or more nucleotides that cause frameshift changes in protein coding sequence.

xSingle nucleotide changes that cause amino acid changes.

F

Densities of SNPs and indels on individual chromosomes detected between the Silkie and L2 genomes.

Densities of SNPs and indels on individual chromosomes detected between the Silkie and L2 genomes. Statistics of Genetic Variants in the Silkie and L2 Genomes aRJF, red jungle fowl. bVariants only exist in one breed. cVariants previously do not exist in the SNPdb. dTotal variants shared by two breeds. eTotal variants different between two breeds. fVariants in intergenic regions. gVariants overlap 5-kb regions upstream of transcription start site. hVariants overlap 5-kb regions downtream of transcription end site. iVariants located in both downstream and upstream regions (possibly for two different genes). jVariants overlap introns. kVariant overlaps a transcript without coding annotation in the gene definition. It does not mean that the RNA will never be translated and merely means that the gene annotation system did not give a coding sequence annotation. lVariants overlap 5′ untranslated regions. mVariants overlap 3′ untranslated regions. nVariants located in both 5′ UTR and 3′ UTR regions (possibly for two different genes). oVariants within exons but close to exon/intron boundaries. pThe 2-bp in introns that are close to exons. qOnly coding exonic portions, but not UTR portions. rSingle nucleotide changes that do not cause amino acid changes. sNonsynonymous SNVs, frameshift indels, nonframeshift indels, or block substitutions that lead to the immediate creation of stop codon at the variant site. For frameshift mutations, the stop codon downstream of the variant was not be considered as “stop-gained.” tNonsynonymous SNVs, frameshift indels, nonframeshift indels, or block substitutions that lead to the immediate elimination of stop codon at the variant sites. uIndels of three or multiples of three nucleotides that do not cause frameshift changes in protein coding sequence. vIndels of one or more nucleotides that cause frameshift changes in protein coding sequence. xSingle nucleotide changes that cause amino acid changes. Among the identified SNPs of Silkie, 55% were homozygous and 45% were heterozygous, while the SNPs of L2 showed a slightly higher proportion of heterozygous sites (57%). Among the SNPs of Silkie, 2,949,783 (55%) sites were located in intergenic regions and 142,142 (2.64%) were within the 1-kb gene-flanking regions (fig. 2). The corresponding values for the SNPs of L2 were 2,804,238 (55%) and 141,508 (2.75%).
F

Annotation of SNPs and indels and distribution of SNPs. Predicted functional consequences of SNPs and indels of the Silkie (A) and L2 (B).

Annotation of SNPs and indels and distribution of SNPs. Predicted functional consequences of SNPs and indels of the Silkie (A) and L2 (B). Among the indels of Silkie, 349,210 (55%) were in intergenic regions, 3,625 (0.57%) were within the 1-kb flanking regions of genes, and 264,954 (42%) were in genic regions (fig. 2). The corresponding numbers for the indels of L2 were 348,983 (55%), 3,812 (0.60%), and 262,235 (41%). The genic indels of Silkie included 1,292 coding, 263,662 intronic, and 3,625 UTR sites, while those of L2 included 1,347 coding, 348,983 intronic, and 3,812 UTR sites. There were 342,259 and 346,192 homozygous indels, and 293,762 and 287,946 heterozygous indels in Silkie and L2, respectively (fig. 3). Insertions accounted for 298,503 and 299,267 events, and deletions for 337,518 and 334,871 events in Silkie and L2, respectively. Indels accounted for only 11% of all events identified, but they involved 28% of all variant bases. The indel sizes ranged from 1 to 73 bp in length and homozygous indels show a wider length distribution than heterozygous ones (fig. 3). The size range over which indels were recognized was limited by the length of the reads. In the analysis of our 120 bp Illumina sequencing reads, the largest identified indel was 41 bp by SAMtools. In total, 968 and 1,005 indels for Silkie and L2, respectively, were found overlapping with coding sequences and can potentially affect protein functions.
F

Distribution of indel lengths in (A) the Silkie and (B) the L2.

Distribution of indel lengths in (A) the Silkie and (B) the L2. To assess the reliability of our data, PCR amplification and Sanger sequencing were applied to 352 SNPs and 38 indels to determine whether they agreed with the deep sequencing results in the same individuals in which they were detected (supplementary table S4, Supplementary Material online). We found that 335 (95.2%) SNPs and 26 (68.4%) indels were consistent with the Illumina sequencing data (supplementary fig. S3 and table S5, Supplementary Material online).

CNV and SV Identification

Putative CNVs (≥2 kb) were detected by identifying genomic regions significantly different in coverage depth between the Silkie and L2 mapped read data sets using the software CNV-Seq (Xie and Tammi 2009). In total, 8,839 CNVs for Silkie relative to L2 were observed, involving ∼24.6 Mb of the reference assembly used for mapping (table 2). The CNVs varied in length from 2,081 bp to 45,241 bp and the mean and median were 2,785 bp and 2,081 bp, respectively (fig. 4).
Table 2

Over-represented GO Categories Associated with Chicken CNVs

TermDescriptionP ValueFold EnrichmentFDR
GO:0003700Transcription factor activity1.48E-067.330.0015
GO:0043565Sequence-specific DNA binding2.42E-068.570.0024
GO:0006355Regulation of transcription, DNA-dependent5.32E-065.770.0065
GO:0051252Regulation of RNA metabolic process6.08E-065.680.0074
GO:0030528Transcription regulator activity2.75E-055.150.0272

Note.—FDR, false discovery rate.

F

Summary of resequencing data of the Silkie and L2 genomes. Distribution of SNPs, indels, and structural variants (≥50 bp) in the Silkie and L2 genomes, relative to the reference red jungle fowl genome. The circular map shows the genomic distributions of different classes of same variations for the Silkie and L2 genomes based on the resolution of 1 Mb bins. Chromosomes are shown in different colors in the outermost circle. The heatmaps from the outermost to the innermost circle show the abundance of four structural variants (length ≥ 50 bps): deletions, insertions, tandem duplications, and inversions. The purple and blue colors in the CNV (≥5 kb) ring represent gain and loss of copy number variation for the Silkie genome relative to the L2 genome. For SNPs, the red color stands for the homozygous SNPs, whereas blue for the heterozygous SNPs in the exonic regions.

Summary of resequencing data of the Silkie and L2 genomes. Distribution of SNPs, indels, and structural variants (≥50 bp) in the Silkie and L2 genomes, relative to the reference red jungle fowl genome. The circular map shows the genomic distributions of different classes of same variations for the Silkie and L2 genomes based on the resolution of 1 Mb bins. Chromosomes are shown in different colors in the outermost circle. The heatmaps from the outermost to the innermost circle show the abundance of four structural variants (length ≥ 50 bps): deletions, insertions, tandem duplications, and inversions. The purple and blue colors in the CNV (≥5 kb) ring represent gain and loss of copy number variation for the Silkie genome relative to the L2 genome. For SNPs, the red color stands for the homozygous SNPs, whereas blue for the heterozygous SNPs in the exonic regions. Over-represented GO Categories Associated with Chicken CNVs Note.—FDR, false discovery rate. Among the 209 CNVRs larger than 5 kb, 53 (25.4%) variants together completely covered 66 annotated genes, which are enriched for transcription factor activity (P < 0.001) (supplementary table S6, Supplementary Material online). Using the Ensembl gene annotations, we detected CNV genes and then assigned a CNV estimate to each gene. The availability of the two genomes helped find the causative mutations associated with interesting traits in Silkie caused by CNV. For instance, a duplicated region on Chr20 containing endothelin 3 (EDN3), a gene with a known role in promoting melanoblast proliferation, increases the expression level of EDN3 and causes dermal and internal organ hyperpigmentation in Silkie (Dorshorst et al. 2011; Shinomiya et al. 2012). This duplication is easily identified and confirmed in our study (fig. 5). This CNV is validated by using quantitative Real-Time PCR assays (supplementary fig. S4, Supplementary Material online).
F

CNVs overlapping with the EDN3 gene region. (A) Log2 ratio plot of the EDN3 (Q3MU75_CHICK; ENSGALT00000039060) gene region. Each point shows the log2 of the number of Silkie reads mapped to the number of L2 reads mapped. Points are colored based on the log10 P value calculated by the CNV-seq software. (B) The CNVR containing the EDN3 gene in chromosome 20 as visualized using the UCSC Genome Browser.

CNVs overlapping with the EDN3 gene region. (A) Log2 ratio plot of the EDN3 (Q3MU75_CHICK; ENSGALT00000039060) gene region. Each point shows the log2 of the number of Silkie reads mapped to the number of L2 reads mapped. Points are colored based on the log10 P value calculated by the CNV-seq software. (B) The CNVR containing the EDN3 gene in chromosome 20 as visualized using the UCSC Genome Browser. Using Pindel v. 0.2.4p (Ye et al. 2009), we generated a catalogue of 23,454 structural variants (SVs) (≥50 bp), including 12,068 and 10,778 deletions and 278 and 330 insertions for Silkie and L2, respectively, and also combinations of SVs using stringent SV detection constraints (fig. 4, table 3, and supplementary table S7, Supplementary Material online). The vast majority of SVs are located in non-coding regions. Ensembl annotated genes overlapped with SVs are not significantly enriched for any GO categories. We compared the SV affected genes in Silkie to those in L2 and found that only 13.3% of the predicted SVs were shared between them and all of the shared SVs are large deletions, suggesting that most of the gene-affecting SV events occurred after the separation of the two breeds.

Annotation of SNPs and Indels

We conducted a DAVID functional annotation clustering analysis (Huang et al. 2007; Huang et al. 2009) of genes containing variants to identify molecular functions (MF) and biological processes (BP) enriched for these classes of genetic variants. In our data set, nsSNPs were detected in 7,302 and 7,360 genes in Silkie and L2, respectively (supplementary fig. S6 and table S8, Supplementary Material online). The SNPs in gene regions for Silkie and L2 were annotated using the Ensembl gene set (17,934 genes). We found, for the two genomes, 2,463,943 and 2,364,973 SNPs in introns, 28,131 and 27,823 SNPs in untranslated regions (UTRs), 2,312 and 2,351 SNPs at splice sites, and 68,898 and 69,035 SNPs in coding regions, leading to 19,259 and 19,380 nonsynonymous nucleotide changes in the two genomes (table 1, fig. 3, supplementary table S8, Supplementary Material online). We applied SIFT (Ng and Henikoff 2003; Kumar et al. 2009) to predict radical nsSNPs (Supplementary table S9, Supplementary Material online). We defined a large-effect nsSNP as a radical nsSNP located in a protein domain predicted by InterProScan (supplementary table S10, Supplementary Material online) (Quevillon et al. 2005). We found that the large-effect nsSNPs in L2 are significantly enriched in the nitrogen compound biosynthetic process, cofactor binding, and vitamin binding, whereas the nsSNPs in Silkie are not enriched in any group. These results suggest that the phenotypes associated with genes containing these mutations may represent specific characteristics of these breeds. Statistics of Structural Variants in Silkie and L2

Mutation and Selection

We investigated two types of loss-of-function mutations: stop-gained (nonsense) mutations and frameshift mutations (table 4). The frameshift mutations of L2 are enriched in the histone-lysine N-methyltransferase and amino acid transmembrane transporter activities, whereas those of Silkie are enriched in the nucleotide binding and the modification-dependent protein catabolic process. The stop-gained mutations of L2 are enriched in endopeptidase inhibitor activity, whereas those of Silkie are enriched in proteins involved in muscle contraction, response to radiation, adult behavior, cell death, and regulation of programmed cell death.
Table 4

Top Annotation Clusters of Loss-of-Function Mutations Identified by the DAVID Functional Annotation Clustering Tool

BreedsMutation TypesRepresentative Annotation TermsEnrichment Score
SilkieStop-gainedMuscle contraction1.87
Response to radiation1.30
Adult behavior1.27
Cell death1.15
Regulation of programmed cell death1.05

FrameshiftNucleotide binding1.33
Modification-dependent protein catabolic process1.05

L2Stop-gainedEndopeptidase inhibitor activity1.04

FrameshiftHistone-lysine N-methyltransferase activity1.28
Amino acid transmembrane transporter activity1.26

Large-effect nsSNPNitrogen compound biosynthetic process2.65
Cofactor binding1.47
Vitamin binding1.01

NOTE.—The genes were analyzed by the Functional Annotation Clustering Tool. The top annotation clusters have group enrichment scores greater than 1 were listed. The representative biology terms associated with the top annotation clusters are manually summarized.

Top Annotation Clusters of Loss-of-Function Mutations Identified by the DAVID Functional Annotation Clustering Tool NOTE.—The genes were analyzed by the Functional Annotation Clustering Tool. The top annotation clusters have group enrichment scores greater than 1 were listed. The representative biology terms associated with the top annotation clusters are manually summarized. To understand the genetic and genomic changes associated with chicken domestication, we searched for local reductions in heterozygosity that might have accompanied selective sweeps in the common ancestor of Silkie and L2. We examined 10-kb sliding windows with at least 10 homozygous SNP sites in every 0.5-kb step. We then computed how often at least 90% of the homozygous SNP sites are identical in the two domestic lines but different in the red jungle fowl. The segments identified contained 509 genes, representing 2.84% of total Ensemble annotated genes (supplementary table S11, Supplementary Material online). We went further to investigate the functional distribution of loss-of-function mutations (fig. 6), which are predicted to disable gene functions. Loss-of-function mutations have been proposed to be an important consequence of domestication (Olson 1999). However, in agreement with the previous study (Rubin et al. 2010), we found little evidence that selection for loss-of-function mutations played an important role in chicken domestication. Stop-gained mutations accounted for 1.31% and 1.37% of all Ensembl annotated genes for Silkie and L2, respectively, and frameshift mutations accounted for 4.53% and 4.61% of all Ensembl annotated genes, while stop-gained and frameshift mutations accounted for 1.57% and 3.14% of gene located in the putative selective sweeps, respectively (Pearson's χ2 test, P > 0.1). These low proportions suggest that loss-of-function mutations were not enriched in the selective sweeps. We considered the overlap of our CNVRs with putative selective sweeps and found only six Ensembl annotated genes within CNVRs (ENS-12789, ENS-07597, ENS-08000, ENS-09357, ENS-10772, and ENS-11056; only the last five digits of the Ensembl chicken gene annotation were shown).
F

Distributions of H values. The horizontal dashed lines indicate the threshold at H = −1.

Distributions of H values. The horizontal dashed lines indicate the threshold at H = −1. Among the 509 genes in the putative selective sweeps in our study, 46 were also found in the study of Rubin et al. (2010). TSHR, IGF1, PMCH, TBC1D1, ARID4B, ROBO2, ANK2, SLC16A12, and OSGIN1 showed the highly selective sweeps in all domestic or commercial breeds (Rubin et al. 2010). In the putative selective sweeps, IGF1 and HMGA2 have already been shown to be associated with body weight gains in Silkie (Tang et al. 2010; Song et al. 2011). It is also interesting to investigate whether protein-coding changes played an important role in evolution under domestication. We found that the protein-coding changes (nsSNP, frameshift, stop-gained) occurred in 50.29% of the genes in the putative selective sweeps compared with the 42.11% and 42.49% in the set of all genes in the Silkie and L2 genomes (Pearson's χ2 test, P < 0.001). The genes in the putative selective sweep regions are functionally enriched for the GTPase regulator activity, the glycoprotein biosynthetic process, nucleoside binding, the regulation of generation of precursor metabolites and energy, the regulation of cell adhesion, the purinergic nucleotide receptor activity, microtubule binding, stem cell development, and cell motion (table 5).
Table 5

Top Annotation Clusters of Gene Identified in the Putative Selective Sweep Regions by the DAVID Functional Annotation Clustering Tool

Representative Annotation TermsEnrichment Score
GTPase regulator activity1.75
Glycoprotein biosynthetic process1.70
Nucleoside binding1.43
Regulation of generation of precursor metabolites and energy1.40
Regulation of cell adhesion1.34
Purinergic nucleotide receptor activity, G-protein coupled1.25
Microtubule binding1.23
Stem cell development1.16
Cell motion1.05

Note.—The genes were analyzed by the Functional Annotation Clustering Tool. The top annotation clusters that have group enrichment scores >1 are listed. The representative biology terms associated with the top annotation clusters are manually summarized.

Top Annotation Clusters of Gene Identified in the Putative Selective Sweep Regions by the DAVID Functional Annotation Clustering Tool Note.—The genes were analyzed by the Functional Annotation Clustering Tool. The top annotation clusters that have group enrichment scores >1 are listed. The representative biology terms associated with the top annotation clusters are manually summarized.

nsSNPs and QTLs

Of the 7,291 and 7,349 nsSNP-containing genes of Silkie and L2, respectively, 4,665 (63.98%) and 4,736 (64.43%) matched to the genes that were found to locate at positions of significant QTLs (data from Chicken QTLdb, http://www.animalgenome.org/cgi-bin/QTLdb/GG/index). However, nsSNP-containing genes are not particularly concentrated in the QTL regions because the proportion of the genome covered by all the QTL regions is 64.11%, which is close to the proportion of nsSNP-containing genes (∼64%). We used DAVID to cluster radical nsSNPs in particular QTL regions and found that some significant enrichments of functionally related genes are associated with the traits (Huang et al. 2007; Huang et al. 2009). For instance, radical nsSNP-containing immunity genes are enriched in the QTL regions associated with several traits such as body weight, early mortality, oxygen saturation, cloacal bacterial burden after a challenge with Salmonella E, and Salmonella presence in ovary for both Silkie and L2. This may suggest that radical nsSNPs in immunity genes were subjected to human selection because they are associated with these economic traits. However, it is premature to draw a definitive conclusion. Associations between genotypes and phenotypes have been recently reported in many studies in chickens. Some SNP sites identified in L2 or/and Silkie have been reported to be associated with phenotypes in other chicken breeds (table 6). For mutations found in a single breed, MX1 exon 13 polymorphisms in broiler chickens are associated with commercial traits such as body weight, and early and late mortality (Livant et al. 2007), and with morbidity, early mortality, viral shedding, and cytokine responses in chickens infected with an avian influenza virus (Ewald et al. 2011). Pro-opiomelanocortin (POMC) was associated with growth and carcass traits in Anak and Gushi chickens (Bai et al. 2012), greater body weight in females of commercial broilers (Sharma et al. 2008), and a mutation in aggrecan (ACAN) was associated with the incidence and severity of tibial dyschondroplasia (TD) in chickens (Ray et al. 2006). Inducible nitric oxide synthase (iNOS) is associated with general mortality and other performance traits in three elite commercial broiler chicken lines raised in high and low hygiene environments (Ye et al. 2006). Gonadotropin-Releasing Hormone Receptor (GnRHR) is associated with egg-laying traits in the Wenchang Chicken (Wu et al. 2007).
Table 6

Identified nsSNPs Reportedly Associated with Phenotypes in the Other Breeds of Chicken

GeneEnsembl IDaChrBreedbMutationAssociationReference
MXENS-259991L2Radical nsSNPMorbidity, early mortality, viral shedding, and cytokine responsesLivant et al. 2007; Ewald et al. 2011
POMCENS-267933L2Radical nsSNPProduction traitsSharma et al. 2008; Bai et al. 2012
Ovocalyxin-32ENS-156349BRadical nsSNPEggshell quality traitsTakahashi et al. 2010
GnRHRENS-2102010SStop gainedEgg-laying traitsWu et al. 2007
ACANENS-3856910L2Radical nsSNPTibial dyschondroplasiaRay et al. 2006
iNOSENS-0912919L2Radical nsSNPPerformanceYe et al. 2006
CCL4ENS-3520619BRadical nsSNPFeather damageBiscarini et al. 2010
PRDM16ENS-0159021BRadical nsSNPPerformance traitsHan et al. 2012
MBL2ENS-0476521BRadical nsSNPExperimental Pasteurella multocida infectionSchou et al. 2010
LIPLENS-24882ZBRadical nsSNPFat depositionLiu et al. 2006

aOnly the last five digits of the Ensembl chicken gene annotation are shown.

bS: Silkie; L2: Taiwanese L2 chicken; B: Both breeds of S and L2.

Identified nsSNPs Reportedly Associated with Phenotypes in the Other Breeds of Chicken aOnly the last five digits of the Ensembl chicken gene annotation are shown. bS: Silkie; L2: Taiwanese L2 chicken; B: Both breeds of S and L2. For mutations found in both breeds, mannan-binding lectin (MBL) plays an important role as the first line of defense against Pasteurella multocida by diminishing the infection before the adaptive immune response takes over (Schou et al. 2010). Ovocalyxin-32 is associated with eggshell traits such as egg weight, short length of the egg, long length of the egg, and non-destructive deformation (Takahashi et al. 2010). C–C motif chemokine 4 (CCL4) was found to be associated to plumage condition in laying hens (Biscarini et al. 2010) and also in the selective sweeps we found. PR domain containing 16 gene (PRDM16) was found to have positive effects on chicken growth, fatness, and meat quality traits at different stages (Han et al. 2012). Lipoprotein lipase gene (LPL) is significantly associated with intermuscular fat width, abdominal fat weight, and thickness of subcutaneous fat in chickens (Liu et al. 2006).

Discussion

Applying Illumina sequencing, we obtained the first draft genome sequences of Silkie and L2 and identified a total of 7.6 million SNPs in comparison with the genome of their wild ancestor, the red jungle fowl. DNA Capture Array, exome sequencing, and other target-enrichment strategies for next-generation sequencing have allowed the sequencing of targeted regions in the genome more efficiently and economically (Mamanova et al. 2010; Teer and Mullikin 2010; Mertes et al. 2011), especially with rapidly increasing numbers of samples. These techniques have been used successfully to identify human disease-causing variants in some cases (Bamshad et al. 2011; Gilissen et al. 2011; Ku et al. 2011; Gilissen et al. 2012; Gonzaga-Jauregui et al. 2012; Rabbani et al. 2012) and have been useful for crop improvement in agriculture (Singh et al. 2012). However, exome capture by hybridization can introduce considerable coverage variation (Majewski et al. 2011), affecting comparative analysis. Moreover, targeted sequencing of only specific regions or a specified list of protein-coding sequences could miss DNA variations, especially valuable genetic variants in intronic and intergenic regions which potentially affect gene expression. Furthermore, whole-genome sequencing can also detect CNVs as well as SVs, giving us a more comprehensive view of genetic variation in a genome. We chose to use several PE libraries for whole-genome resequencing. PE reads can provide rigorous read alignment and enhance the accuracy and coverage of SNP calling and consensus sequence inference. Applying PE reads to study structural variation, we provided the first genome-wide pattern of structural variation and copy number variants in the chicken using stringent SV detection constraints. We found that 96% of the predicted SVs resided in intronic (41%) or intergenic regions (55%), consistent with a previous finding in chickens using PE sequencing of reduced representation libraries (Kerstens et al. 2011). It is possible that some of the SVs identified have contributed to phenotypic differences among domestic chickens and the red jungle fowl. For example, a large insertion affecting the expression of BMP12 has been shown to be the causative mutation associated with the naked neck trait (Mou et al. 2011). Next-generation sequencing technologies and analysis programs can provide an efficient pipeline to characterize CNVs at the genome-wide level. CNVs have been analyzed in several domesticated animals (Clop et al. 2012). In particular, comparative genomic hybridization (CGH) and SNP arrays have been applied to screen for CNVs in chickens (Wang et al. 2010b; Clop et al. 2012; Wang et al. 2012b; Jia et al. 2013) and a few traits have been shown to be associated with CNVs. For instance, the chicken pea comb phenotype was linked to a duplication near the first intron of SOX5 (Wright et al. 2009). The hyperpigmentation in the Silkie is caused by duplicated regions containing endothelin 3 (EDN3) (Dorshorst et al. 2011; Shinomiya et al. 2012). Late feathering is caused by a partial duplication of the PRLR and SPEF2 genes in chickens (Elferink et al. 2008). However, accurate CNV detection using NGS data is still difficult due to the propensity of falsely detecting CNVs by the software CNV-Seq since it does not take local read count variability into account (Klambauer et al. 2012). The chicken has been used for quantitative genetic studies for decades (Georges and Andersson 2003; Siegel et al. 2006; Georges 2007). Vast numbers of QTLs are being detected in experiments for a large variety of economic traits in chickens, such as growth, carcass composition, reproductive, behaviors, disease resistance, etc. (Andersson 2001; Burt and Pourquie 2003). The chicken may be the only farm animal that can be applied for refining the map position of QTL with relatively low costs (Andersson and Georges 2004). Several candidate genes were successfully identified to be responsible for chicken disease resistance by combining high-resolution QTL mapping, comparative mapping, and functional genomic data (Vallejo et al. 1998; Zhu et al. 2001; Lipkin et al. 2002). The growing genomic resources in addition to a relatively short reproduction time make the chicken even more ideal for unraveling the molecular basis of phenotypic diversity in birds (Burt and Pourquie 2003). With the availability of genome-wide variations of Silkie and L2, we tried to identify candidate genes for important traits in chicken domestication using multiple approaches including SIFT analysis, functional enrichment, and selective sweep detection. We found that the genes involved in nitrogen compound metabolism, the ATP-binding cassette (ABC) transporters, extracellular matrix, and cytoskeletons are enriched in the large-effect containing genes or in the selective sweeps. The ABC transporter superfamily is the largest transporter gene family responsible for transporting specific molecules across cell membranes and essential for regulating organismic homeostasis in all animals (Dean and Annilo 2005); they also might be important for silkworm domestication (Xie et al. 2012). Moreover, genes involved in extracellular matrix have been shown to be differentially expressed between domestic chickens and red jungle fowls, suggesting possible selection pressures on this kind of genes (Li et al. 2012). Similarly, the genes involved in these processes or functions are found to evolve faster in other domesticated animals, such as dogs, cattle, and pigs (Groenen et al. 2012). Although we only sequenced one individual for each of the two breeds, we can still identify local reductions in heterozygosity due to selective sweeps. We found that coding sequence mutations were slightly enriched in the putative selective sweeps. This observation suggests that physiological traits are likely to be artificially selected since adaptive mutations affecting physiology are more likely to occur in the protein-coding regions than in the cis-regulatory regions of genes (Carroll 2005). We also found important genes such as TSHR, IGF1, PMCH, and TBC1D1 that are associated with growth, appetite, and metabolic regulation in broiler in a previous study (Rubin et al. 2010). The thyroid stimulating hormone receptor (TSHR) mutation could be involved in regulating photoperiod control of reproduction (Yoshimura et al. 2003; Hanon et al. 2008; Nakao et al. 2008), affecting development, growth, and behavior in the domestic chicken with selective advantages during domestication. Therefore, TSHR is suggested to be a domestication gene in chickens, where all individuals of domestic breeds carry the same mutant allele. In fact, Silkie and L2 also carry the G558A mutation, which may drive the residue outwards from the cell membrane and thus influence ligand binding (Rubin et al. 2010). Our results indicate that these genes are also important for lowly selected breeds of chickens. Within the selective sweeps in all of the domestic chickens used in our and Rubin et al.'s studies, some of the genes have also been detected to be associated with domestication traits in chicken or other farm animals, reinforcing their important roles in chicken domestication. For instance, ERO1LB and ARID4B have been detected to be associated with residual feed intake in swines (Gorbach 2011). ARID4B encodes a subunit of the histone deacetylase-dependant SIN3A transcriptional corepressor complex, which functions in various cellular processes including proliferation, differentiation, apoptosis, oncogenesis, and cell fate determination (Wu et al. 2006; Winter et al. 2012). In addition, NELL1 has been identified in a selective sweep in broilers (Elferink et al. 2012), and ESRP2 is associated with chicken abdominal fat contents (Zhang et al. 2012). NELL1 encodes a cytoplasmic protein, which contains epidermal growth factor (EGF)-like repeats and may be involved in cell growth regulation and differentiation in bone and cartilage (Zhang et al. 2010; Chen et al. 2011; Zhang et al. 2011; Zou et al. 2011; Cowan et al. 2012; Siu et al. 2012), and ESPR2 encodes an epithelial cell-type-specific splicing regulator (Warzecha et al. 2009a, 2009b; Warzecha et al. 2010; Dittmar et al. 2012). ROBO2, which was identified in highly selective sweeps in commercial broilers, also appeared in our study. ROBO2 encodes a protein that is a receptor for SLIT2, which is known to function in axon guidance and cell migration (Anitha et al. 2008). These findings imply that the selection for traits controlled by these genes occurred early and throughout the history of chicken domestication, maintaining a low heterozygosity in these genes. Thus, our approach is successful in identifying some of the important genes in domestication. ANK2, SLC16A12, ARID4B, and OSGIN1, which were found in the highly selective sweep regions in all domestic lines of chickens in the study of Rubin et al., have also been identified in the selective sweep regions in our study. They are excellent candidate genes for functional studies in animal sciences. Differential expression of ANK2, which encodes a member of the ankyrin proteins required for targeting and stability of Na+/Ca++ exchanger 1 in cardiomyocytes during cardiac muscle contraction (Mohler 2006; Hashemi et al. 2009), between two layer lines, Lohmann Selected Leghorn (LSL) and Lohmann Brown (LB), has also been found in one study using genome-wide microarray analyses (Habig et al. 2012). In addition to the above issues, it is important to learn the genetic basis underlying phenotypic differences that are responsible for specific breed characteristics. We observed enrichments in different functional categories in the two chicken breeds we studied, implying differentially selective forces. The observed enrichment in nitrogen compound biosynthetic process, cofactor binding, and vitamin binding genes may be related to selection of L2 for a meat and egg purpose. In conclusion, we present a whole genome map of SNPs, indels, SVs, and CNVs of two chicken breeds here. Genome-wide comparisons with trait data of chicken breeds using the SNPs, indels, SVs, and CNVs identified here will provide additional clues to the genetic and genomic bases of the interesting traits of domestic chickens and will be a useful resource for future studies of the molecular basis of disease and phenotypic variation in chickens.

Supplementary Material

Supplementary tables S1–S11 and figures S1–S5 are available at Genome Biology and Evolution online (http://www.gbe.oxfordjournals.org/).
Table 3

Statistics of Structural Variants in Silkie and L2

SV TypeBreed
SilkieL2
Large deletion (≥50 bps)
    Downstream149138
    Exonic5460
    Intergenic65015446
    Intronic47914271
    ncRNA_exonic35
    Splicing1314
    Upstream114145
    Upstream; downstream610
    UTR33626
    UTR522
Large insertion (≥50 bps)
    Downstream32
    Exonic10
    Intergenic145168
    Intronic104113
    Upstream43
Tandem duplication (≥ 50bps)
    Downstream45
    Exonic24
    Exonic; splicing01
    Intergenic97123
    Intronic2948
    Splicing01
    Upstream26
Inversion (≥50 bps)
    Downstream02
    Exonic1012
    Intergenic5648
    Intronic3031
    Upstream10
    Splicing01
  95 in total

Review 1.  Genetic dissection of phenotypic diversity in farm animals.

Authors:  L Andersson
Journal:  Nat Rev Genet       Date:  2001-02       Impact factor: 53.242

2.  Analysis of relative gene expression data using real-time quantitative PCR and the 2(-Delta Delta C(T)) Method.

Authors:  K J Livak; T D Schmittgen
Journal:  Methods       Date:  2001-12       Impact factor: 3.608

3.  SIFT: Predicting amino acid changes that affect protein function.

Authors:  Pauline C Ng; Steven Henikoff
Journal:  Nucleic Acids Res       Date:  2003-07-01       Impact factor: 16.971

4.  Genetics. Chicken genome--science nuggets to come soon.

Authors:  Dave Burt; Olivier Pourquie
Journal:  Science       Date:  2003-06-13       Impact factor: 47.728

Review 5.  Positional identification of structural and regulatory quantitative trait nucleotides in domestic animal species.

Authors:  M Georges; L Andersson
Journal:  Cold Spring Harb Symp Quant Biol       Date:  2003

Review 6.  Domestic-animal genomics: deciphering the genetics of complex traits.

Authors:  Leif Andersson; Michel Georges
Journal:  Nat Rev Genet       Date:  2004-03       Impact factor: 53.242

7.  Quantitative trait locus mapping in chickens by selective DNA pooling with dinucleotide microsatellite markers by using purified DNA and fresh or frozen red blood cells as applied to marker-assisted selection.

Authors:  E Lipkin; J Fulton; H Cheng; N Yonash; M Soller
Journal:  Poult Sci       Date:  2002-03       Impact factor: 3.352

8.  Relationship between reproductive performance and immunity in Taiwan country chickens.

Authors:  C H Chao; Y P Lee
Journal:  Poult Sci       Date:  2001-05       Impact factor: 3.352

9.  Screening for highly heterozygous chickens in outbred commercial broiler lines to increase detection power for mapping quantitative trait loci.

Authors:  J J Zhu; H S Lillehoj; H H Cheng; D Pollock; M Sadjadi; M G Emara
Journal:  Poult Sci       Date:  2001-01       Impact factor: 3.352

10.  Light-induced hormone conversion of T4 to T3 regulates photoperiodic response of gonads in birds.

Authors:  Takashi Yoshimura; Shinobu Yasuo; Miwa Watanabe; Masayuki Iigo; Takashi Yamamura; Kanjun Hirunagi; Shizufumi Ebihara
Journal:  Nature       Date:  2003-11-13       Impact factor: 49.962

View more
  34 in total

Review 1.  Development, regeneration, and evolution of feathers.

Authors:  Chih-Feng Chen; John Foley; Pin-Chi Tang; Ang Li; Ting Xin Jiang; Ping Wu; Randall B Widelitz; Cheng Ming Chuong
Journal:  Annu Rev Anim Biosci       Date:  2014-11-03       Impact factor: 8.923

2.  Complex evolutionary trajectories of sex chromosomes across bird taxa.

Authors:  Qi Zhou; Jilin Zhang; Doris Bachtrog; Na An; Quanfei Huang; Erich D Jarvis; M Thomas P Gilbert; Guojie Zhang
Journal:  Science       Date:  2014-12-11       Impact factor: 47.728

3.  Positive selection rather than relaxation of functional constraint drives the evolution of vision during chicken domestication.

Authors:  Ming-Shan Wang; Rong-Wei Zhang; Ling-Yan Su; Yan Li; Min-Sheng Peng; He-Qun Liu; Lin Zeng; David M Irwin; Jiu-Lin Du; Yong-Gang Yao; Dong-Dong Wu; Ya-Ping Zhang
Journal:  Cell Res       Date:  2016-04-01       Impact factor: 25.617

4.  863 genomes reveal the origin and domestication of chicken.

Authors:  Ming-Shan Wang; Mukesh Thakur; Min-Sheng Peng; Yu Jiang; Laurent Alain François Frantz; Ming Li; Jin-Jin Zhang; Sheng Wang; Joris Peters; Newton Otieno Otecko; Chatmongkon Suwannapoom; Xing Guo; Zhu-Qing Zheng; Ali Esmailizadeh; Nalini Yasoda Hirimuthugoda; Hidayat Ashari; Sri Suladari; Moch Syamsul Arifin Zein; Szilvia Kusza; Saeed Sohrabi; Hamed Kharrati-Koopaee; Quan-Kuan Shen; Lin Zeng; Min-Min Yang; Ya-Jiang Wu; Xing-Yan Yang; Xue-Mei Lu; Xin-Zheng Jia; Qing-Hua Nie; Susan Joy Lamont; Emiliano Lasagna; Simone Ceccobelli; Humpita Gamaralalage Thilini Nisanka Gunwardana; Thilina Madusanka Senasige; Shao-Hong Feng; Jing-Fang Si; Hao Zhang; Jie-Qiong Jin; Ming-Li Li; Yan-Hu Liu; Hong-Man Chen; Cheng Ma; Shan-Shan Dai; Abul Kashem Fazlul Haque Bhuiyan; Muhammad Sajjad Khan; Gamamada Liyanage Lalanie Pradeepa Silva; Thi-Thuy Le; Okeyo Ally Mwai; Mohamed Nawaz Mohamed Ibrahim; Megan Supple; Beth Shapiro; Olivier Hanotte; Guojie Zhang; Greger Larson; Jian-Lin Han; Dong-Dong Wu; Ya-Ping Zhang
Journal:  Cell Res       Date:  2020-06-25       Impact factor: 25.617

5.  Integrating transcriptome and genome re-sequencing data to identify key genes and mutations affecting chicken eggshell qualities.

Authors:  Quan Zhang; Feng Zhu; Long Liu; Chuan Wei Zheng; De He Wang; Zhuo Cheng Hou; Zhong Hua Ning
Journal:  PLoS One       Date:  2015-05-14       Impact factor: 3.240

6.  Genome-wide analysis reveals the extent of EAV-HP integration in domestic chicken.

Authors:  David Wragg; Andrew S Mason; Le Yu; Richard Kuo; Raman A Lawal; Takele Taye Desta; Joram M Mwacharo; Chang-Yeon Cho; Steve Kemp; David W Burt; Olivier Hanotte
Journal:  BMC Genomics       Date:  2015-10-14       Impact factor: 3.969

Review 7.  The challenges and importance of structural variation detection in livestock.

Authors:  Derek M Bickhart; George E Liu
Journal:  Front Genet       Date:  2014-02-18       Impact factor: 4.599

8.  Detection and characterization of small insertion and deletion genetic variants in modern layer chicken genomes.

Authors:  Clarissa Boschiero; Almas A Gheyas; Hannah K Ralph; Lel Eory; Bob Paton; Richard Kuo; Janet Fulton; Rudolf Preisinger; Pete Kaiser; David W Burt
Journal:  BMC Genomics       Date:  2015-07-31       Impact factor: 3.969

9.  Genome-wide identification of copy number variations between two chicken lines that differ in genetic resistance to Marek's disease.

Authors:  Yiyuan Yan; Ning Yang; Hans H Cheng; Jiuzhou Song; Lujiang Qu
Journal:  BMC Genomics       Date:  2015-10-23       Impact factor: 3.969

10.  Large-scale genomic analysis reveals the genetic cost of chicken domestication.

Authors:  Ming-Shan Wang; Jin-Jin Zhang; Xing Guo; Ming Li; Rachel Meyer; Hidayat Ashari; Zhu-Qing Zheng; Sheng Wang; Min-Sheng Peng; Yu Jiang; Mukesh Thakur; Chatmongkon Suwannapoom; Ali Esmailizadeh; Nalini Yasoda Hirimuthugoda; Moch Syamsul Arifin Zein; Szilvia Kusza; Hamed Kharrati-Koopaee; Lin Zeng; Yun-Mei Wang; Ting-Ting Yin; Min-Min Yang; Ming-Li Li; Xue-Mei Lu; Emiliano Lasagna; Simone Ceccobelli; Humpita Gamaralalage Thilini Nisanka Gunwardana; Thilina Madusanka Senasig; Shao-Hong Feng; Hao Zhang; Abul Kashem Fazlul Haque Bhuiyan; Muhammad Sajjad Khan; Gamamada Liyanage Lalanie Pradeepa Silva; Le Thi Thuy; Okeyo A Mwai; Mohamed Nawaz Mohamed Ibrahim; Guojie Zhang; Kai-Xing Qu; Olivier Hanotte; Beth Shapiro; Mirte Bosse; Dong-Dong Wu; Jian-Lin Han; Ya-Ping Zhang
Journal:  BMC Biol       Date:  2021-06-16       Impact factor: 7.431

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.