Rooksana E Noorai, Vijay Shankar, Nowlan H Freese, Christopher M Gregorski, Susan C Chapman.
Abstract
Gallus gallus (chicken) is phenotypically diverse, with over 60 recognized breeds, among the myriad species within the Aves lineage. Domestic chickens have been under artificial selection by humans for thousands of years for agricultural purposes. The North American Araucana (NAA) breed arose as a cross between the Chilean "Collonocas", which laid blue eggs and were rumpless, and the "Quetros", which had unusual tufts but were tailed. NAAs were introduced from South America in the 1940s and have been kept as show birds by enthusiasts since then because of several distinctive traits: blue eggshells, characteristic ear-tufts, a pea comb, and rumplessness. The population has maintained variants for the clean-faced and tufted traits, as well as the tailed and rumpless traits, making it advantageous for genetic studies. Genome resequencing of six NAA chickens with a mixture of these traits was done to 71-fold coverage using Illumina HiSeq 2000 paired-end reads. Trimmed and concordant reads were mapped to the Gallus_gallus-5.0 reference genome (galGal5), generated from a female Red Junglefowl (UCD001). To identify candidate genes associated with traits of the NAA, the NAA genomes were compared with those of the Korean Araucana, Korean Domestic and White Leghorn breeds. Genomic regions with significantly reduced levels of heterogeneity were detected on five different chromosomes in NAA. The sequence data generated confirm the identity of variants responsible for the blue eggshell, pea comb, and rumplessness traits of NAA and propose one for ear-tufts.
Year: 2019 PMID: 31821332 PMCID: PMC6903725 DOI: 10.1371/journal.pone.0225834
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Photographs of the four traits known to be associated with North American Araucanas (NAAs).
(A) NAA with ear-tufts, a pea comb, and rumplessness and (B) blue-shelled eggs laid by NAA.
Phenotypes for NAA selected for whole-genome resequencing.
| Bird ID | Flock Location | Sex | Color | Tufted or Clean Faced | Rumpless or Tailed |
|---|---|---|---|---|---|
| | S.C. | M | White | Tufted | Rumpless |
| | S.C. | M | White | Tufted | Rumpless |
| | S.C. | U | White | Clean | Rumpless |
| | N.J. | F | Blue | Tufted | Rumpless |
| | N.J. | M | Black Breasted Red | Clean | Tailed |
| | S.C. | M | Black | Tufted | Rumpless |
1 S.C. location 1
2 S.C. location 2
*Heterozygous for rumpless
Fig 2. The relationship between groups of birds.
(A) Principal Components Analysis and (B) Dendrogram. Colors of dots in PCA and labels in dendrogram correspond to different groups of birds. Violet dots and text represent North American Araucana, green dots and text represent Korean Domestics, gold dots and text represent White Leghorns, and red dots and text represent Korean Araucana.
Fig 3. Visualization of multidimensional scaling plot.
Discriminant model from random forest analysis on North American Araucanas (NAAs) compared to all other groups. The ellipses represent three standard errors around the centroid for each group. Violet dots represent NAA. Yellow dots represent the combined group of birds from Korean Domestics, White Leghorns and Korean Araucana.
Candidate genes selected based on annotation of the top 22 markers found to be within genes and fitting the pattern of the alternate allele being present at a higher frequency in the NAA samples.
| Ensembl ID | Gene name | Gene description | VEP results | snpEff results |
|---|---|---|---|---|
| ENSGALG00000043223 | | collagen type XXVI alpha 1 chain | intron variant | intron variant |
| ENSGALG00000011630 | | GLI family zinc finger 2 | intron variant | intron variant |
| ENSGALG00000041098 | | novel gene | intron variant | intron variant |
| ENSGALG00000015211 | | dystrobrevin alpha | intron variant | intron variant |
| ENSGALG00000014974 | | mindbomb E3 ubiquitin protein ligase 1 | intron variant | intron variant |
| ENSGALG00000005657 | | corticotropin-releasing factor receptor 2 | intron variant | intron variant |
| ENSGALG00000046171 | | novel gene | intron variant | NA |
| ENSGALG00000002242 | | polypeptide N-acetylgalactosaminyltransferase 9 | NA | intergenic region |
| ENSGALG00000002272 | | nucleolar complex associated 4 homolog (NOC4L) | NA | intergenic region |
| ENSGALG00000004756 | | centrosomal protein 89 | upstream gene variant | upstream gene variant |
| ENSGALG00000004814 | | Rhophilin Rho GTPase binding protein 2 | downstream gene variant | intergenic region |
| ENSGALG00000034534 | | Fanconi anemia core complex associated protein 24 | downstream gene variant | downstream gene variant |
| ENSGALG00000006562 | | MCF.2 cell line derived transforming sequence | intron variant | intron variant |
| ENSGALG00000036938 | | RALY RNA binding protein like | intron variant | intron variant |
| ENSGALG00000029378 | | integrin alpha-9 precursor | intron variant | intron variant |
| ENSGALG00000000908 | | tRNA-specific adenosine deaminase 1 | intron variant | intron variant |
| ENSGALG00000008477 | | exocyst complex component 4 | intron variant | intron variant |
| ENSGALG00000015379 | | transgelin 3 | intron variant | intron variant |
| ENSGALG00000015307 | | ABI family member 3 binding protein | intron variant | intron variant |
| ENSGALG00000011910 | | MNAT1, CDK activating kinase assembly factor | intron variant | intron variant |
| ENSGALG00000017389 | | SIX homeobox 4 | upstream gene variant | upstream gene variant |
| ENSGALG00000031741 | | Focal adhesion kinase 1 | intron variant | intron variant |
| ENSGALG00000042125 | | adaptor related protein complex 2 beta 1 subunit | intron variant | intron variant |
| ENSGALG00000037890 | | novel gene | NA | upstream gene variant |
| ENSGALG00000016845 | | NAD(P)HX dehydratase | intron variant | intron variant |
| ENSGALG00000042644 | | novel gene | downstream gene variant | downstream gene variant |
| ENSGALG00000016547 | | taxilin gamma | upstream gene variant | upstream gene variant |
| ENSGALG00000016548 | | synapse associated protein 1 | NA | intergenic region |
| ENSGALG00000010352 | | Rho guanine nucleotide exchange factor 26 | intron variant | intron variant |
| ENSGALG00000041255 | | disintegrin and metalloproteinase domain-containing protein 12 isoform 2 precursor | intron variant | intron variant |
Fig 4. Selective sweeps.
The negative tail of the ZHp distribution presented along chromosomes 1–28, 30, 32, 33, and LGE64. The Z and W chromosomes were not included because the sex of many birds in the study was not reported. Each dot represents a creeping window of up to 40 kb. The horizontal red line marks the suggestive sweep significance level at ZHp = -4. The horizontal blue line marks the strong sweep significance level at ZHp = -6. (A) Results from six North American Araucana, (B) results from three Korean Araucana, (C) results from three White Leghorns, and (D) results from three Korean Domestics.
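The ZHp statistic plotted in Fig 4 can be illustrated with a short sketch. It assumes the commonly used definition of pooled heterozygosity, Hp = 2·ΣnMAJ·ΣnMIN / (ΣnMAJ + ΣnMIN)², Z-transformed across all windows; the captions here do not spell out the formula, so this is an assumption rather than the authors' exact procedure. The function names, the input layout, and the use of the population standard deviation are illustrative choices. Only the -4 and -6 cutoffs come from the caption.

```python
from statistics import mean, pstdev


def pooled_heterozygosity(major_counts, minor_counts):
    """Hp for one window, from per-SNP major and minor allele read counts."""
    n_maj = sum(major_counts)
    n_min = sum(minor_counts)
    total = n_maj + n_min
    if total == 0:
        raise ValueError("window contains no allele counts")
    return 2 * n_maj * n_min / total ** 2


def z_transform(hp_values):
    """Z-transform Hp values across all windows (population SD assumed)."""
    mu = mean(hp_values)
    sigma = pstdev(hp_values)
    if sigma == 0:
        raise ValueError("all windows have identical Hp")
    return [(hp - mu) / sigma for hp in hp_values]


def classify_sweeps(zhp_values, suggestive=-4.0, strong=-6.0):
    """Label each window using the cutoffs drawn in Fig 4."""
    labels = []
    for z in zhp_values:
        if z <= strong:
            labels.append("strong")
        elif z <= suggestive:
            labels.append("suggestive")
        else:
            labels.append("none")
    return labels
```

A window in which every read carries the major allele gives Hp = 0, which is what drives ZHp toward the negative tail in regions of low heterozygosity.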
Descriptive statistics of the negative ZHp values for all conducted selective sweeps.
| Statistic | NAA (n = 6) | NAA (n = 3) | KA | KD | WL |
|---|---|---|---|---|---|
| Maximum -ZHp | 0 | 0 | 0 | 0 | 0 |
| Minimum -ZHp | -6.27 | -4.8 | -4.33 | -3.31 | -1.87 |
| Mode (NModes) | 0 (20,831) | 0 (20,283) | -0.02 (17,509) | -3.31 (31,397) | -1.87 (171,172) |
| Median | -0.59 | -0.62 | -0.69 | -0.68 | -0.80 |
| Mean | -0.86 | -0.90 | -0.89 | -0.94 | -0.93 |
| Standard deviation | 0.87 | 0.86 | 0.79 | 0.86 | 0.63 |
| # of Creeping windows | 1,885,372 | 1,839,836 | 1,926,092 | 1,789,416 | 1,993,801 |
Regions defined as strong selective sweeps in six NAAs.
| Chr | Start | Stop | Full NAA set (6 birds): windows | Full NAA set: Avg ZHp | Full NAA set: SD | Half NAA set (3 birds): windows | Half NAA set: Avg ZHp | Half NAA set: SD | KA set (3 birds): windows | KA set: Avg ZHp | KA set: SD | Known gene(s) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 19,176,800 | 19,225,898 | 10 | -6.19 | 0.11 | 10 | -4.72 | 0.10 | 10 | 1.05 | 0.12 | None |
| 1 | 19,248,994 | 19,452,814 | 376 | -6.23 | 0.04 | 376 | -4.77 | 0.04 | 376 | 0.87 | 0.70 | None |
| 1 | 65,229,480 | 65,268,953 | 1 | -6.04 | - | 1 | -4.56 | - | 1 | 1.99 | - | |
| 2 | 147,275,921 | 147,337,002 | 33 | -6.19 | 0.01 | 33 | -4.69 | 0.02 | 33 | 0.33 | 0.15 | |
| 4 | 31,400,689 | 31,482,936 | 68 | -6.11 | 0.06 | 68 | -4.78 | 0.01 | 68 | -0.96 | 0.12 | |
| 4 | 55,307,353 | 55,357,189 | 4 | -6.13 | 0.04 | 4 | -4.65 | 0.05 | 4 | 0.64 | 0.06 | |
| 15 | 2,976,357 | 3,132,521 | 389 | -6.14 | 0.07 | 427 | -4.69 | 0.08 | 427 | 0.91 | 0.26 | |
| 18 | 390,709 | 469,520 | 68 | -6.19 | 0.07 | 68 | -4.71 | 0.07 | 68 | 1.31 | 0.10 | |
| Total | | | 949 | | | 987 | | | 987 | | | |
*Half group windows overlap with full group, no windows outside of full group were counted.
Break end structural variants present in study population on chromosome 15 (NC_006102.4).
All tufted birds are characterized by an 8.3 kb heterozygous deletion on chromosome 15.
| Start | Stop | Size (bp) | NAA01 (Tufted) | NAA02 (Tufted) | NAA03 (Clean) | NAA04 (Tufted) | NAA05 (Clean) | NAA06 (Tufted) | Gene | Description | Position |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 347,555 | 348,072 | 517 | U (0/1) | ✓ (1/1) | NA (0/0) | NA (0/0) | NA (0/0) | ✓ (1/1) | | ubiquitin conjugating enzyme E2 L3 | intronic |
| 1,019,583 | 1,027,972 | 8,389 | ✓ (0/1) | ✓ (0/1) | NA (0/0) | U (0/1) | NA (0/0) | U (0/1) | | thioredoxin reductase 2 | exon (whole) |
| 2,166,472 | 2,167,057 | 585 | ✓ (1/1) | ✓ (0/1) | NA (0/0) | ✓ (1/1) | ✓ (1/1) | ✓ (1/1) | No genes | NA | intergenic |
| 2,826,383 | 2,826,602 | 219 | ✓ (0/1) | U (1/1) | ✓ (1/1) | ✓ (1/1) | NA (0/0) | ✓ (1/1) | | splicing factor SWAP | intronic |
| 3,225,855 | 3,226,300 | 445 | ✓ (1/1) | NA (0/0) | NA (0/0) | NA (0/0) | NA (0/0) | NA (0/0) | | RIMS binding protein 2 | intronic |
| 4,106,455 | 4,106,572 | 117 | ✓ (0/1) | NA (0/0) | NA (0/0) | NA (0/0) | NA (0/0) | U (0/1) | | uncharacterized (long non-coding RNA) | intronic |
| 4,189,586 | 4,189,756 | 170 | ✓ (0/1) | NA (0/0) | NA (0/0) | NA (0/0) | NA (0/0) | NA (0/0) | No genes | NA | intergenic |
| 4,264,865 | 4,270,692 | 5,827 | U (1/1) | U (0/1) | U (0/1) | U (1/1) | NA (0/0) | ✓ (1/1) | | uncharacterized (long non-coding RNA) | intronic |
| 4,280,664 | 4,281,215 | 551 | ✓ (1/1) | U (0/1) | ✓ (1/1) | NA (0/0) | ✓ (1/1) | NA (0/0) | | uncharacterized (long non-coding RNA) | intronic |
| 5,075,785 | 5,076,236 | 451 | ✓ (1/1) | NA (0/0) | NA (0/0) | NA (0/0) | NA (0/0) | ✓ (1/1) | | uncharacterized (long non-coding RNA) | exon (partial) |
| 5,218,248 | 5,218,471 | 223 | NA (0/0) | NA (0/0) | NA (0/0) | NA (0/0) | ✓ (0/1) | ✓ (0/1) | | phosphatidylinositol transfer protein membrane associated 2 | intronic |
| 6,597,976 | 6,598,893 | 917 | NA (0/0) | NA (0/0) | NA (0/0) | NA (0/0) | ✓ (0/1) | ✓ (1/1) | | coronin 1C | intronic |
| 7,598,349 | 7,598,465 | 116 | NA (0/0) | ✓ (0/1) | ✓ (0/1) | NA (0/0) | NA (0/0) | NA (0/0) | | tetratricopeptide repeat domain 28 | exon (partial) |
| 7,720,778 | 7,720,969 | 191 | NA (0/0) | NA (0/0) | NA (0/0) | NA (0/0) | ✓ (0/1) | NA (0/0) | | tetratricopeptide repeat domain 28 | intronic |
| 7,746,729 | 7,747,057 | 328 | ✓ (0/1) | NA (0/0) | NA (0/0) | NA (0/0) | ✓ (0/1) | NA (0/0) | | solute carrier family 2 member 11-like 5 | intronic |
| 7,838,965 | 7,839,053 | 88 | ✓ (1/1) | NA (0/0) | NA (0/0) | NA (0/0) | U (0/1) | NA (0/0) | | zinc and ring finger 3 | intronic |
| 9,286,836 | 9,287,033 | 197 | NA (0/0) | ✓ (1/1) | ✓ (1/1) | ✓ (1/1) | U (1/1) | U (1/1) | No genes | NA | intergenic |
| 9,608,637 | 9,609,387 | 750 | NA (0/0) | U (0/1) | NA (0/0) | NA (0/0) | ✓ (0/1) | U (0/1) | No genes | NA | intergenic |
| 10,428,724 | 10,428,823 | 99 | NA (0/0) | ✓ (1/1) | NA (0/0) | NA (0/0) | NA (0/0) | NA (0/0) | | reticulon 4 receptor | intronic |
| 10,746,849 | 10,747,291 | 442 | ✓ (1/1) | U (1/1) | ✓ (1/1) | NA (0/0) | ✓ (1/1) | ✓ (1/1) | | uncharacterized (long non-coding RNA) | intronic |
| 12,123,689 | 12,123,786 | 97 | ✓ (1/1) | NA (0/0) | ✓ (1/1) | NA (0/0) | NA (0/0) | NA (0/0) | | uncharacterized (long non-coding RNA) | intronic |
NA—Structural variant not present in vcf
U—Structural variant present in unfiltered vcf
✓—Structural variant present in filtered vcf (passed filtering)
0/1—Heterozygous for structural variation
1/1—Homozygous for structural variation
0/0—Homozygous for reference alignment
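The statement that all tufted birds carry a heterozygous 8.3 kb deletion can be checked against the genotype calls in the table above. The sketch below is illustrative only: it hard-codes four of the table's rows, takes the six sample columns in the table's order (Tufted, Tufted, Clean, Tufted, Clean, Tufted), and treats calls that appear only in the unfiltered VCF (marked U) the same as filtered calls. The function name and data layout are hypothetical, not taken from the paper.

```python
# Phenotype of each sample column, in the order given in the table header.
PHENOTYPES = ["Tufted", "Tufted", "Clean", "Tufted", "Clean", "Tufted"]

# (start, stop) on chromosome 15 -> genotype per sample, copied from the table.
SV_CALLS = {
    (347555, 348072): ["0/1", "1/1", "0/0", "0/0", "0/0", "1/1"],
    (1019583, 1027972): ["0/1", "0/1", "0/0", "0/1", "0/0", "0/1"],
    (2166472, 2167057): ["1/1", "0/1", "0/0", "1/1", "1/1", "1/1"],
    (4264865, 4270692): ["1/1", "0/1", "0/1", "1/1", "0/0", "1/1"],
}


def segregates_with_tufted(genotypes, phenotypes=PHENOTYPES):
    """True if every tufted bird is 0/1 and every clean-faced bird is 0/0."""
    return all(
        gt == ("0/1" if phenotype == "Tufted" else "0/0")
        for gt, phenotype in zip(genotypes, phenotypes)
    )


candidates = [
    region for region, gts in SV_CALLS.items() if segregates_with_tufted(gts)
]

for start, stop in candidates:
    print(f"chr15:{start}-{stop} ({stop - start} bp)")
```

Of the four rows included here, only the 8,389 bp variant at 1,019,583 to 1,027,972 satisfies the filter.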
Detecting the mutation for pea comb by identifying copy-number variations in SOX5.
This table contains the median value of Log2 ratios comparing WL to NAA, KA and KD in the first intron of SOX5.
| | WL01 | WL02 | WL03 |
|---|---|---|---|
| | 2.34 | 2.31 | 2.38 |
| | 3.73 | 3.81 | 3.78 |
| | 2.26 | 2.34 | 2.25 |
| | 2.08 | 2.11 | 2.19 |
| | 3.30 | 3.35 | 3.32 |
| | 3.09 | 3.18 | 3.15 |
| | 2.85 | 2.93 | 2.70 |
| | 2.94 | 3.02 | 2.99 |
| | 2.94 | 3.02 | 3.00 |
| | 0.05 | 0.09 | 0.17 |
| | -0.2 | -0.11 | -0.08 |
| | -0.04 | 0.01 | 0.08 |
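A median log2 ratio of the kind reported in the table above could be computed along the following lines. This is a minimal sketch under stated assumptions: the inputs are per-bin read depths over the region of interest for one test sample and one White Leghorn reference, depths are already normalized for library size, and the ratio is taken as test over reference. The function name, the binning, and the optional pseudocount are hypothetical and are not described in the source.

```python
import math
from statistics import median


def median_log2_ratio(test_depths, reference_depths, pseudocount=0.0):
    """Median of per-bin log2(test / reference) read-depth ratios.

    A positive pseudocount avoids division by zero in bins with no coverage.
    """
    if not test_depths or len(test_depths) != len(reference_depths):
        raise ValueError("depth lists must be non-empty and of equal length")
    ratios = [
        math.log2((t + pseudocount) / (r + pseudocount))
        for t, r in zip(test_depths, reference_depths)
    ]
    return median(ratios)
```

Under these assumptions, a value near 0 means the two birds have similar copy number in the region, while a value near 2 means roughly fourfold higher depth in the test sample.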
Results from genotyping of wildtype, tailed, partial and rumpless NAA chicken breeds for SNP variants SV1 and SV2 from wing vein blood.
| | Total number of birds tested | SV1 | One or both alleles | SV2 | One or both alleles |
|---|---|---|---|---|---|
| | 18 | 18 | both | 18 | both |
| | 42 | 42 | both | 42 | both |
| | 21 | 19 | one | 19 | one |
| | | 2 | one | 2 | both |
| | 61 | 26 | one | 26 | one |
| | | 32 | both | 32 | both |
| | | 3 | both | 3 | one |