Alana R Rodney, Reuben M Buckley, Robert S Fulton, Catrina Fronick, Todd Richmond, Christopher R Helps, Peter Pantke, Dianne J Trent, Karen M Vernau, John S Munday, Andrew C Lewin, Rondo Middleton, Leslie A Lyons, Wesley C Warren.
Abstract
Over 94 million domestic cats are susceptible to cancers and other common and rare diseases. Whole exome sequencing (WES) is a proven strategy to study the variants that cause such diseases. Presented is a 35.7 Mb exome capture design based on the annotated Felis_catus_9.0 genome assembly, covering 201,683 regions of the cat genome. Whole exome sequencing was conducted on 41 cats with known and unknown genetic diseases and traits, of which ten cats had matching whole genome sequence (WGS) data available, used to validate WES performance. At 80 × mean exome depth of coverage, 96.4% of on-target base coverage had a sequencing depth > 20-fold, while over 98% of single nucleotide variants (SNVs) identified by WGS were also identified by WES. Platform-specific SNVs were restricted to sex chromosomes and a small number of olfactory receptor genes. Within the 41 cats, we identified 31 previously known causal variants and discovered new gene candidate variants, including novel missense variants for polycystic kidney disease and atrichia in the Peterbald cat. These results show the utility of WES to identify novel gene candidate alleles for diseases and traits for the first time in a feline model.
Year: 2021 PMID: 33785770 PMCID: PMC8009874 DOI: 10.1038/s41598-021-86200-7
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Description and diseases of 41 cats for WES evaluation.
| No | Id | Breed | Sex | Disease/Trait | Gene(s) |
|---|---|---|---|---|---|
| 1 | 19725 | Lykoi | F | Lykoi | |
| 2 | 13230 | Mixed Breed | F | Bengal PRA/Bobbed tail | |
| 3 | 14056 | Mixed Breed | M | Persian PRA/ | |
| 4 | 17994 | Mixed Breed | F | Hydrocephalus | |
| 5 | 19067 | Munchkin | F | Dwarfism/Dominant White | |
| 6 | 5012 | Oriental | M | Lymphoma | |
| 7 | 20382 | Peterbald | M | ||
| 8 | 11615 | Random Bred | M | ||
| 9 | 18528 | Random Bred | M | ||
| 10 | 20424 | Siberian | F | ||
| 11 | 22550 | Bengal | F | Polyneuropathy | |
| 12 | 20957 | Devon Rex | U | Papilloma virus | |
| 13 | 22752 | Devon Rex | M | Neurological disorder | |
| 14–15 | 21983/21464 | Ojos Azules | 1M:1F | Ojos Azules | |
| 16 | 20964 | Oriental | F | Cardiac disease | |
| 17 | 22728 | Random Bred | F | Cystinuria | |
| 18 | 20617 | Random Bred | M | Neuronal ceroid lipofuscinosis | |
| 19 | 20948 | Random Bred | M | Cinnamic acid urea | |
| 20 | 21153 | Random Bred | M | Ambulatory paraparesis | |
| 21 | 22287 | Random Bred | F | Myotonia congenita | |
| 22 | 22397 | Random Bred | M | Neurological disorder | |
| 23 | 22505 | Random Bred | M | Cardiac disease | |
| 24 | 22623 | Random Bred | U | Pycnodysostosis | |
| 25 | 22740 | Random Bred | F | Epidermolysis bullosa | |
| 26–27 | 22741/22742 | Random Bred | 1F:1M | Eyelid coloboma | |
| 28 | 22751 | Random Bred | M | Ehlers-Danlos | |
| 29–30 | 22763/22764 | Random Bred | 2F | Hypothyroidism | |
| 31–32 | 22761/22762 | Savannah | 2M | Hypovitaminosis D | |
| 33 | 21984 | Scottish Fold | F | Cardiac disease | |
| 34–35 | 20384/20385 | Selkirk Rex | 1F:1U | Seizures | |
| 36 | 20953 | Siamese | F | Cardiac disease | |
| 37 | 22622 | Siberian | U | PKD | |
| 38 | 22711 | Singapura | F | Hypovitaminosis D | |
| 39–40 | 8641/8642 | Tennessee Rex | 1F:1M | Rexoid hair coat | |
| 41 | 6623 | Oriental | M | Lymphoma | |
| Total: 41 | | 14 breeds | 19F:18M:4U | ~31 diseases and traits | |
A complete description of diseases and traits for the entire cohort. Candidate genes are potential genes that have been identified with less evidence of a causal mutation.
U unknown sex, F female, M male.
a Mutations as tentative causal variants for diseases presented.
Summary of Metrics across both cohorts.
| Metric | Average (first 10) | Range (first 10) | Average (cohort of 31) | Range (cohort of 31) |
|---|---|---|---|---|
| Depth of coverage | 267 × | 76–485 × | 80 × | 60–108 × |
| % of bases covered | 99.1% | 92.3–100% | 96.4% | 91–98% |
| % reads aligned | 99.9% | 99.9–100% | 82% | 75–85% |
Figure 1. The proportion of bases covered with the exome capture probes. The initial 10 samples are colored in red. The X axis shows the depth of coverage, that is, how many times a nucleotide base is covered, starting at a depth of 10x and increasing to 50x.
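A coverage curve of the kind described in this caption can be sketched from per-base depths. The following is only an illustration: it assumes depths are already available as a list of integers and that a base counts as covered when its depth is at least the threshold. The function name `fraction_covered` and the toy depths are hypothetical and do not come from the study.

```python
def fraction_covered(depths, thresholds=(10, 20, 30, 40, 50)):
    """Return {threshold: fraction of bases with depth >= threshold}."""
    if not depths:
        raise ValueError("depths must not be empty")
    total = len(depths)
    # For each threshold, count the bases whose depth reaches it.
    return {t: sum(d >= t for d in depths) / total for t in thresholds}


# Toy per-base depths (illustrative only, not data from the study).
toy_depths = [5, 12, 25, 33, 48, 60, 80, 95]
curve = fraction_covered(toy_depths)
```

Plotting `curve` with the thresholds on the X axis gives one line per sample, which is the layout the caption describes.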
Indel consequence counts of WES versus WGS as determined by variant effect predictor.
| Impact | Consequence | WES common (%) | WES exclusive (%) | WES total | WGS common (%) | WGS exclusive (%) | WGS total |
|---|---|---|---|---|---|---|---|
| High | Frameshift | 1440 (93) | 109 (7) | 1549 | 1451 (84.8) | 260 (15.2) | 1711 |
| High | Splice acceptor | 69 (83.1) | 14 (16.9) | 83 | 71 (69.6) | 31 (30.4) | 102 |
| High | Splice donor | 107 (88.4) | 14 (11.6) | 121 | 107 (81.1) | 25 (18.9) | 132 |
| High | Start lost | 11 (100) | 0 (0) | 11 | 11 (84.6) | 2 (15.4) | 13 |
| High | Stop gained | 16 (76.2) | 5 (23.8) | 21 | 17 (56.7) | 13 (43.3) | 30 |
| High | Stop lost | 12 (92.3) | 1 (7.7) | 13 | 12 (85.7) | 2 (14.3) | 14 |
| High | All | 1602 (92.1) | 137 (7.9) | 1739 | 1615 (83.6) | 316 (16.4) | 1931 |
| Moderate | Inframe deletion | 709 (90.5) | 74 (9.5) | 783 | 710 (91.1) | 69 (8.9) | 779 |
| Moderate | Inframe insertion | 557 (92.4) | 46 (7.6) | 603 | 557 (90) | 62 (10) | 619 |
| Moderate | Protein altering | 13 (81.3) | 3 (18.8) | 16 | 13 (54.2) | 11 (45.8) | 24 |
| Moderate | All | 1267 (91.2) | 122 (8.8) | 1389 | 1268 (90.1) | 139 (9.9) | 1407 |
| Low | 3′ UTR | 173 (91.5) | 16 (8.5) | 189 | 176 (81.5) | 40 (18.5) | 216 |
| Low | 5′ UTR | 194 (96.5) | 7 (3.5) | 201 | 195 (91.5) | 18 (8.5) | 213 |
| Low | Splice region | 641 (94.8) | 35 (5.2) | 676 | 644 (92.9) | 49 (7.1) | 693 |
| Low | Start retained | 7 (100) | 0 (0) | 7 | 7 (100) | 0 (0) | 7 |
| Low | Stop retained | 10 (100) | 0 (0) | 10 | 10 (83.3) | 2 (16.7) | 12 |
| Low | All | 299 (94.3) | 18 (5.7) | 317 | 302 (92.9) | 23 (7.1) | 325 |
| All | All | 4333 (92.5) | 351 (7.5) | 4684 | 4364 (87.8) | 609 (12.2) | 4973 |
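The percentages in parentheses appear to be each count's share of that platform's row total. A minimal sketch of that arithmetic, using the frameshift row above as input, is shown here; the function name `platform_shares` is hypothetical, and rounding to one decimal is an assumption about how the table was prepared.

```python
def platform_shares(common, exclusive):
    """Return (total, common %, exclusive %), percentages rounded to one decimal."""
    total = common + exclusive
    if total == 0:
        raise ValueError("no variants to summarise")
    return (
        total,
        round(100 * common / total, 1),
        round(100 * exclusive / total, 1),
    )


# Frameshift row of the indel table: WES 1440 common / 109 exclusive,
# WGS 1451 common / 260 exclusive.
wes = platform_shares(1440, 109)
wgs = platform_shares(1451, 260)
```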
SNV consequence counts of WES versus WGS as determined by variant effect predictor.
| Impact | Consequence | WES common (%) | WES exclusive (%) | WES total | WGS common (%) | WGS exclusive (%) | WGS total |
|---|---|---|---|---|---|---|---|
| High | Splice acceptor | 97 (97) | 3 (3) | 100 | 98 (89.9) | 11 (10.1) | 109 |
| High | Splice donor | 137 (97.9) | 3 (2.1) | 140 | 139 (88) | 19 (12) | 158 |
| High | Start lost | 63 (96.9) | 2 (3.1) | 65 | 63 (100) | 0 (0) | 63 |
| High | Stop gained | 237 (97.9) | 5 (2.1) | 242 | 232 (92.8) | 18 (7.2) | 250 |
| High | Stop lost | 35 (100) | 0 (0) | 35 | 36 (97.3) | 1 (2.7) | 37 |
| High | All | 569 (97.8) | 13 (2.2) | 582 | 568 (92.1) | 49 (7.9) | 617 |
| Moderate | Missense | 43,518 (99.3) | 309 (0.7) | 43,827 | 43,419 (98.1) | 821 (1.9) | 44,240 |
| Moderate | All | 43,516 (99.3) | 309 (0.7) | 43,825 | 43,417 (98.1) | 821 (1.9) | 44,238 |
| Low | 3′ UTR | 2022 (97.9) | 43 (2.1) | 2065 | 2031 (94.7) | 114 (5.3) | 2145 |
| Low | 5′ UTR | 2458 (99.5) | 13 (0.5) | 2471 | 2459 (98.6) | 35 (1.4) | 2494 |
| Low | Splice region | 3938 (99.5) | 21 (0.5) | 3959 | 3923 (98.7) | 50 (1.3) | 3973 |
| Low | Stop retained | 60 (100) | 0 (0) | 60 | 58 (96.7) | 2 (3.3) | 60 |
| Low | Synonymous | 87,341 (99.6) | 321 (0.4) | 87,662 | 87,182 (98.9) | 956 (1.1) | 88,138 |
| Low | All | 88,584 (99.6) | 336 (0.4) | 88,920 | 88,417 (98.9) | 975 (1.1) | 89,392 |
| All | All | 144,012 (99.4) | 834 (0.6) | 144,846 | 143,745 (98.5) | 2194 (1.5) | 145,939 |
Figure 2. Variant calling statistics for 10 cats sequenced on both platforms. (a) Venn diagrams showing the number of exclusive and common variants per platform. Dark red text indicates the number of variants found in WES and black text indicates the number of variants found in WGS. The number of common variants differs between platforms because common variants were identified prior to filtering. (b) The number of SNPs found in each sample on both platforms. (c) The percentage of SNPs found as exclusive to each sample for each platform. The first, third, eighth, and tenth samples are males. All other samples are female. (d) Allele count distribution for common and exclusive SNPs on both platforms. WES SNPs are shown on top and WGS SNPs are shown upside down on the bottom. In addition, the Ti/Tv ratio for sets of SNPs is also shown.
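The Ti/Tv ratio mentioned in panel (d) is the number of transitions (A<->G, C<->T) divided by the number of transversions (all other single-base changes). The sketch below shows the standard calculation on a toy list of (reference, alternate) allele pairs. The function name `ti_tv_ratio` and the input format are assumptions for illustration, not the pipeline used in the study.

```python
# Transitions are purine<->purine or pyrimidine<->pyrimidine changes.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def ti_tv_ratio(snps):
    """Compute transitions / transversions for (ref, alt) single-base pairs."""
    ti = tv = 0
    for ref, alt in snps:
        ref, alt = ref.upper(), alt.upper()
        if ref == alt:
            continue  # not a variant; skip
        if (ref, alt) in TRANSITIONS:
            ti += 1
        else:
            tv += 1
    if tv == 0:
        raise ValueError("no transversions; ratio is undefined")
    return ti / tv


# Toy SNP set (illustrative only): four transitions, two transversions.
toy_snps = [("A", "G"), ("C", "T"), ("G", "A"), ("T", "C"), ("A", "C"), ("G", "T")]
ratio = ti_tv_ratio(toy_snps)
```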
Mean SNVs per individual for ten WES and WGS cats.
Genes: top 50 WGS outliers.
| Chromosome | WGS male | WGS female | WGS difference (%)a | WES male | WES female | WES difference (%)a |
|---|---|---|---|---|---|---|
| Autosome | 1595.00 | 1445.67 | 149.33 (9.36) | 946.25 | 872.83 | 73.42 (7.76) |
| X chromosome | 1363.75 | 22.83 | 1340.92 (98.33) | 829.75 | 23.00 | 806.75 (97.73) |
a Percentage differences in parentheses were calculated as a fraction of mean SNVs per male individual.
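The footnote's calculation can be sketched as the male minus female mean, expressed as a percentage of the male mean. The example uses the two WGS rows of the table as input; the function name `sex_difference` is hypothetical, and rounding to two decimals is an assumption.

```python
def sex_difference(male_mean, female_mean):
    """Return (male - female, difference as % of the male mean), rounded to 2 dp."""
    if male_mean == 0:
        raise ValueError("male mean must be non-zero")
    diff = male_mean - female_mean
    return round(diff, 2), round(100 * diff / male_mean, 2)


# WGS values from the table above.
wgs_autosome = sex_difference(1595.00, 1445.67)
wgs_x = sex_difference(1363.75, 22.83)
```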
Figure 3. Gene-wise platform bias. Each point on the scatterplot is a gene, with the y axis displaying the difference in SNP counts per gene. Genes with more WGS SNPs than WES SNPs have positive values, whereas genes with more WES SNPs have negative values. Expected SNP number is calculated as the mean number of SNPs per gene across both platforms and is plotted on a log scale.
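The two quantities plotted in this figure can be sketched as follows, assuming per-gene SNP counts are held in dictionaries keyed by gene name. The caption does not state whether the difference is normalised, so a plain WGS minus WES difference is assumed here. The function name `platform_bias` and the gene names are hypothetical.

```python
def platform_bias(wgs_snps, wes_snps):
    """Return {gene: (WGS - WES difference, mean SNPs across both platforms)}."""
    result = {}
    for gene in sorted(set(wgs_snps) | set(wes_snps)):
        wgs = wgs_snps.get(gene, 0)  # genes absent on a platform count as 0
        wes = wes_snps.get(gene, 0)
        result[gene] = (wgs - wes, (wgs + wes) / 2)
    return result


# Toy counts (illustrative only, not data from the study).
toy_wgs = {"GENE_A": 12, "GENE_B": 3}
toy_wes = {"GENE_A": 8, "GENE_B": 5, "GENE_C": 2}
bias = platform_bias(toy_wgs, toy_wes)
```

Positive differences mark genes with more WGS SNPs and negative differences mark genes with more WES SNPs, matching the sign convention in the caption.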
Figure 4. Distribution of SNPs per gene along chromosome X. (a) Total SNPs per kb of coding sequence per gene. (b) Sex-biased variant detection along chromosome X. Bias is calculated as the fold-change ratio between the mean number of SNPs per individual per gene for males and females. Specifically, this was calculated for each gene as log2((mean male SNPs + 1)/(mean female SNPs + 1)). One was added to both the numerator and the denominator to avoid undefined results, such as division by zero, for genes with no SNPs in one sex.
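The bias formula in panel (b) translates directly into code. This is a minimal sketch of the stated expression; the function name `sex_bias` and the toy inputs are hypothetical.

```python
import math


def sex_bias(mean_male_snps, mean_female_snps):
    """log2 fold change of male over female mean SNPs, with a pseudocount of 1."""
    return math.log2((mean_male_snps + 1) / (mean_female_snps + 1))


# Toy values (illustrative only): the pseudocount keeps the result defined
# even when one or both sexes have no SNPs in a gene.
male_only = sex_bias(3, 0)
no_snps = sex_bias(0, 0)
female_only = sex_bias(0, 3)
```

Positive values indicate more SNPs detected in males, negative values more in females, and zero indicates no difference.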