Hyo Jun Lee, Yoon Ji Chung, Sungbong Jang, Dong Won Seo, Hak Kyo Lee, Duhak Yoon, Dajeong Lim, Seung Hwan Lee.
Abstract
It was hypothesized that single-nucleotide polymorphisms (SNPs) extracted from text-mined genes could be more tightly related to the causal variants for each trait, and that differential weighting of this SNP panel in the GBLUP model could improve the performance of genomic prediction in cattle. Fitting two genomic relationship matrices (GRMs) as separate random effects, one constructed from text-mined SNPs and one from the remaining SNPs of the 777K SNP set (exp_777K), gave better accuracy than fitting a single GRM (Im_777K) for six traits (backfat thickness: +0.002; eye muscle area: +0.014; Warner-Bratzler Shear Force of semimembranosus and longissimus dorsi: +0.024 and +0.068; intramuscular fat content of semimembranosus and longissimus dorsi: +0.008 and +0.018). These results suggest that incorporating text mining into genomic prediction is valuable, and further studies using text mining can be expected to yield significant results.
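The two-GRM setup described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the abstract does not state how the GRMs were built, so VanRaden's first method is an assumption here, and the function names `vanraden_grm` and `split_grms`, the simulated genotypes, and the SNP mask are all hypothetical.

```python
import numpy as np

def vanraden_grm(genotypes: np.ndarray) -> np.ndarray:
    """GRM from an (animals x SNPs) matrix of 0/1/2 allele counts.

    Assumption: VanRaden's first method; the source does not name the method.
    """
    p = genotypes.mean(axis=0) / 2.0           # allele frequency per SNP
    keep = (p > 0.0) & (p < 1.0)               # drop monomorphic SNPs
    z = genotypes[:, keep] - 2.0 * p[keep]     # centre each SNP column
    scale = 2.0 * np.sum(p[keep] * (1.0 - p[keep]))
    return z @ z.T / scale

def split_grms(genotypes: np.ndarray, text_mined_mask: np.ndarray):
    """Build one GRM from text-mined SNPs and one from all remaining SNPs."""
    grm_tm = vanraden_grm(genotypes[:, text_mined_mask])
    grm_exp = vanraden_grm(genotypes[:, ~text_mined_mask])
    return grm_exp, grm_tm

# Simulated data for illustration only: 20 animals, 500 SNPs,
# with the first 50 SNPs treated as "text-mined".
rng = np.random.default_rng(0)
genotypes = rng.integers(0, 3, size=(20, 500)).astype(float)
text_mined_mask = np.zeros(500, dtype=bool)
text_mined_mask[:50] = True

grm_exp, grm_tm = split_grms(genotypes, text_mined_mask)
print(grm_exp.shape, grm_tm.shape)
```

In a multi-GRM GBLUP, the two matrices would then enter the mixed model as covariance structures of two separate random genetic effects, each with its own variance component.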
Year: 2020 PMID: 33264312 PMCID: PMC7710051 DOI: 10.1371/journal.pone.0241848
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Summary statistics of text mining and SNP calling.
| Trait | Article | Gene | SNP | Used query |
|---|---|---|---|---|
| CWT | 1,893 | 288 | 17,662 | carcass weight[TIAB] OR dressed weight[TIAB] |
| SF | 1,097 | 156 | 6,143 | Warner-Bratzler Shear Force [TIAB] OR cuttability [TIAB] OR meat tenderness [TIAB] |
| IMF | 1,854 | 576 | 30,983 | intramuscular fat [TIAB] |
| BF | 602 | 195 | 9,335 | back fat [TIAB] |
| EMA | 546 | 167 | 12,371 | eye muscle area [TIAB] OR ribeye [TIAB] OR rib eye [TIAB] |
CWT: Carcass weight; SF: Warner-Bratzler Shear Force; IMF: intramuscular fatty acid content; BF: Backfat thickness; EMA: Eye muscle area; Article: number of articles searched in PubMed; Gene: number of mined genes from searched articles; SNP: number of SNPs called from imputed 777K markers; Used query: queries used to search articles in PubMed.
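The queries in the table use PubMed's `[TIAB]` (title/abstract) field tag. As a hedged sketch, they could be submitted programmatically through NCBI's public E-utilities ESearch endpoint; the source does not say how the searches were actually run, so the helper `esearch_url`, the `QUERIES` dictionary, and the `retmax` value are illustrative assumptions. The code only builds request URLs and makes no network call; article counts returned today would differ from those in the table.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# NCBI E-utilities ESearch endpoint (public API).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Queries copied from the table; trait keys follow the footnote abbreviations.
QUERIES = {
    "CWT": "carcass weight[TIAB] OR dressed weight[TIAB]",
    "SF": "Warner-Bratzler Shear Force [TIAB] OR cuttability [TIAB] "
          "OR meat tenderness [TIAB]",
    "IMF": "intramuscular fat [TIAB]",
    "BF": "back fat [TIAB]",
    "EMA": "eye muscle area [TIAB] OR ribeye [TIAB] OR rib eye [TIAB]",
}

def esearch_url(term: str, retmax: int = 10000) -> str:
    """Build an ESearch request URL for a PubMed query (hypothetical helper)."""
    params = {"db": "pubmed", "term": term, "retmode": "json", "retmax": retmax}
    return f"{ESEARCH}?{urlencode(params)}"

for trait, query in QUERIES.items():
    print(trait, esearch_url(query))
```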
The 30 gene symbols that appeared with the highest frequency in text mining.
| Trait | Symbol | Freq | Trait | Symbol | Freq | Trait | Symbol | Freq |
|---|---|---|---|---|---|---|---|---|
| | | 36 | | | 35 | | | 19 |
| | | 28 | | | 26 | | | 18 |
| | | 25 | | | 24 | | | 15 |
| | | 24 | | | 18 | | | 15 |
| | | 24 | | | 16 | | | 15 |
| | | 20 | | | 15 | | | 13 |
| | | 19 | | | 14 | | | 13 |
| | | 19 | | | 14 | | | 12 |
| | | 19 | | | 11 | | | 12 |
| | | 19 | | | 11 | | | 11 |
| | | 18 | | | 10 | | | 11 |
| | | 17 | | | 10 | | | 9 |
| | | 16 | | | 9 | | | 9 |
| | | 15 | | | 8 | | | 9 |
| | | 13 | | | 8 | | | 9 |
| | | 13 | | | 8 | | | 8 |
| | | 13 | | | 8 | | | 8 |
| | | 13 | | | 8 | | | 8 |
| | | 13 | | | 8 | | | 7 |
| | | 13 | | | 8 | | | 7 |
| | | 11 | | | 8 | | | 7 |
| | | 11 | | | 8 | | | 7 |
| | | 11 | | | 7 | | | 7 |
| | | 10 | | | 7 | | | 6 |
| | | 10 | | | 7 | | | 6 |
| | | 10 | | | 7 | | | 6 |
| | | 10 | | | 7 | | | 6 |
| | | 10 | | | 7 | | | 5 |
| | | 10 | | | 7 | | | 5 |
| | | 10 | | | 7 | | | 5 |

| Trait | Symbol | Freq | Trait | Symbol | Freq |
|---|---|---|---|---|---|
| | | 110 | | | 105 |
| | | 104 | | | 80 |
| | | 19 | | | 70 |
| | | 18 | | | 54 |
| | | 17 | | | 52 |
| | | 16 | | | 52 |
| | | 14 | | | 47 |
| | | 11 | | | 38 |
| | | 11 | | | 36 |
| | | 9 | | | 36 |
| | | 8 | | | 27 |
| | | 8 | | | 26 |
| | | 8 | | | 25 |
| | | 7 | | | 23 |
| | | 7 | | | 22 |
| | | 6 | | | 22 |
| | | 6 | | | 20 |
| | | 6 | | | 19 |
| | | 6 | | | 17 |
| | | 6 | | | 16 |
| | | 6 | | | 15 |
| | | 6 | | | 14 |
| | | 5 | | | 14 |
| | | 5 | | | 14 |
| | | 4 | | | 14 |
| | | 4 | | | 14 |
| | | 4 | | | 14 |
| | | 4 | | | 13 |
| | | 4 | | | 13 |
| | | 4 | | | 13 |
The top five significant biological processes for each trait.
| Trait | GO_ID | Biological process | GeneRatio | −log10(adjusted P) |
|---|---|---|---|---|
| | GO:0009725 | response to hormone | 19.8% | 9.5 |
| | GO:0010469 | regulation of signaling receptor activity | 21.4% | 8.2 |
| | GO:0009719 | response to endogenous stimulus | 24.6% | 7.5 |
| | GO:0043066 | negative regulation of apoptotic process | 19.0% | 6.9 |
| | GO:0043069 | negative regulation of programmed cell death | 19.0% | 6.7 |
| | GO:0019752 | carboxylic acid metabolic process | 21.7% | 2.6 |
| | GO:0043436 | oxoacid metabolic process | 21.7% | 2.4 |
| | GO:0072330 | monocarboxylic acid biosynthetic process | 12.0% | 2.3 |
| | GO:0006082 | organic acid metabolic process | 21.7% | 2.3 |
| | GO:0032787 | monocarboxylic acid metabolic process | 14.5% | 1.8 |
| | GO:0019216 | regulation of lipid metabolic process | 11.5% | 12.7 |
| | GO:0032787 | monocarboxylic acid metabolic process | 15.3% | 12.2 |
| | GO:0006629 | lipid metabolic process | 23.4% | 11.5 |
| | GO:0006631 | fatty acid metabolic process | 11.1% | 9.4 |
| | GO:0046890 | regulation of lipid biosynthetic process | 7.7% | 9.3 |
| | GO:0009725 | response to hormone | 23.4% | 9.6 |
| | GO:0032868 | response to insulin | 12.8% | 8.2 |
| | GO:1901700 | response to oxygen-containing compound | 28.7% | 8.1 |
| | GO:0009719 | response to endogenous stimulus | 28.7% | 8.0 |
| | GO:0043434 | response to peptide hormone | 12.8% | 5.7 |
| | GO:1901652 | response to peptide | 14.1% | 4.1 |
| | GO:0032868 | response to insulin | 11.3% | 4.0 |
| | GO:0010243 | response to organonitrogen compound | 19.7% | 4.0 |
| | GO:0043434 | response to peptide hormone | 12.7% | 3.6 |
| | GO:0062013 | positive regulation of small molecule metabolic process | 9.9% | 3.5 |
GeneRatio: gene calling rate, i.e., the ratio of genes involved in each biological process among the entire set of text-mined genes; −log10(adjusted P): −log10 of the P-value adjusted by the Bonferroni method.
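The two derived columns of this table can be reproduced with a few lines of arithmetic. The sketch below follows the footnote's definitions; the function names and the example counts (57 genes in a term out of 288 text-mined genes, a raw P-value of 1e-12 over 1,000 tested terms) are hypothetical values chosen for illustration and are not taken from the source.

```python
import math

def gene_ratio(genes_in_process: int, total_text_mined_genes: int) -> float:
    """Share of the text-mined gene set annotated to one biological process."""
    return genes_in_process / total_text_mined_genes

def neg_log10_bonferroni(p_value: float, n_tests: int) -> float:
    """-log10 of a Bonferroni-adjusted P-value (adjusted P is capped at 1)."""
    adjusted = min(1.0, p_value * n_tests)
    return -math.log10(adjusted)

# Hypothetical inputs, for illustration only.
ratio = gene_ratio(57, 288)
score = neg_log10_bonferroni(1e-12, 1000)
print(f"GeneRatio = {ratio:.1%}, -log10(adjusted P) = {score:.1f}")
```

Because the Bonferroni adjustment multiplies each raw P-value by the number of tests, a term needs a very small raw P-value to reach scores like those in the table.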
Fig 1. The karyotype of QTL regions registered in QTLDB, text-mined regions, and the intersection of both regions.
Each karyotype represents the region for the trait indicated above. Percentages in parentheses beside the trait names indicate the ratio of text-mined region within QTLDB region.
Fig 2. Manhattan plots with results of the genome-wide association study using text-mined SNPs for each trait.
The y-axis shows the −log10 P-value of each SNP and the x-axis is the marker index. The green line is the Bonferroni threshold, 0.05/number of markers. The blue line is the suggestive threshold, 0.1/number of markers.
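The heights of the two threshold lines on the −log10 scale follow directly from the caption. A minimal sketch, assuming the number of markers equals the number of text-mined SNPs for the trait (17,662 for the carcass weight query in the summary table); the function name `threshold_lines` is hypothetical.

```python
import math

def threshold_lines(n_markers: int) -> tuple[float, float]:
    """Return (Bonferroni, suggestive) thresholds on the -log10(P) scale."""
    bonferroni = -math.log10(0.05 / n_markers)
    suggestive = -math.log10(0.1 / n_markers)
    return bonferroni, suggestive

# Assumption: marker count taken from the text-mined SNP count for carcass weight.
bonferroni, suggestive = threshold_lines(17662)
print(f"Bonferroni line: {bonferroni:.2f}, suggestive line: {suggestive:.2f}")
```

Whatever the marker count, the two lines are always log10(2), about 0.30, apart, because the suggestive level is exactly twice the Bonferroni level.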
Variance components for different marker sets.
| Trait | Value | Im_777K | exp_777K | exp_777K + tm_SNPs |
|---|---|---|---|---|
| | Genetic variance | 913.66 | 908.35 | 705.05 + 171.76 |
| | Residual variance | 1287.6 | 1297.2 | 1307.3 |
| | Heritability | 0.42 | 0.41 | 0.4 |
| | Genetic variance | 9.51 | 9.44 | 8.91 + 0.63 |
| | Residual variance | 13.65 | 13.71 | 13.64 |
| | Heritability | 0.41 | 0.41 | 0.41 |
| | Genetic variance | 50.37 | 50.04 | 48 + 2.43 |
| | Residual variance | 77.59 | 77.87 | 77.55 |
| | Heritability | 0.39 | 0.39 | 0.39 |
| | Genetic variance | 0.11 | 0.11 | 0.07 + 0.04 |
| | Residual variance | 1.02 | 1.02 | 1.02 |
| | Heritability | 0.1 | 0.09 | 0.09 |
| | Genetic variance | 0.13 | 0.12 | 0.07 + 0.04 |
| | Residual variance | 0.55 | 0.55 | 0.55 |
| | Heritability | 0.19 | 0.18 | 0.17 |
| | Genetic variance | 0.66 | 0.67 | 0.65 + 0.000024 |
| | Residual variance | 2.46 | 2.44 | 2.47 |
| | Heritability | 0.21 | 0.22 | 0.21 |
| | Genetic variance | 5.28 | 5.24 | 4.34 + 0.73 |
| | Residual variance | 11.51 | 11.55 | 11.72 |
| | Heritability | 0.32 | 0.31 | 0.3 |
Im_777K: variance components estimated with the imputed 777K SNPs; exp_777K: variance components estimated with the imputed 777K SNPs excluding text-mined SNPs; exp_777K + tm_SNPs: variance components estimated when the two marker sets (exp_777K and text-mined SNPs) were fitted as separate genetic variances. The first genetic variance is the exp_777K component and the second is the text-mined SNP component.
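Assuming the three values reported per trait are the genetic variance, the residual variance, and the heritability h² = σ²g / (σ²g + σ²e), the first block of the table can be checked with the short calculation below. The function name `heritability` is hypothetical; the numbers are those of the first trait block, and in the two-GRM model the genetic variance is taken as the sum of the two components.

```python
def heritability(genetic_variances: list[float], residual_variance: float) -> float:
    """h2 = total genetic variance / (total genetic + residual variance)."""
    total_genetic = sum(genetic_variances)
    return total_genetic / (total_genetic + residual_variance)

# First trait block of the table.
h2_single = heritability([913.66], 1287.6)           # Im_777K, one GRM
h2_two = heritability([705.05, 171.76], 1307.3)      # exp_777K + tm_SNPs, two GRMs

# Share of the genetic variance captured by the text-mined SNPs.
share_tm = 171.76 / (705.05 + 171.76)

print(f"h2 (one GRM)  = {h2_single:.2f}")
print(f"h2 (two GRMs) = {h2_two:.2f}")
print(f"text-mined share of genetic variance = {share_tm:.1%}")
```

The computed heritabilities agree with the tabulated 0.42 and 0.4 after rounding, which supports reading the rows this way.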
Carcass traits: average correlation between the GEBV and corrected phenotypic values (y), with standard error, across the 10 validation sets.
Meat quality traits: average correlation between the GEBV and corrected phenotypic values (y), with standard error, across the 10 validation sets.
| Trait | Im_777K | exp_777K | exp_777K + tm_SNPs |
|---|---|---|---|
| CWT | 0.453 ± 0.01 | 0.449 ± 0.01 | 0.451 ± 0.01 |
| BF | 0.419 ± 0.01 | 0.413 ± 0.01 | 0.421 ± 0.01 |
| EMA | 0.423 ± 0.01 | 0.429 ± 0.01 | 0.437 ± 0.004 |
| SF (semimembranosus) | 0.105 ± 0.04 | 0.102 ± 0.02 | 0.129 ± 0.03 |
| SF (longissimus dorsi) | 0.121 ± 0.03 | 0.115 ± 0.04 | 0.189 ± 0.03 |
| IMF (semimembranosus) | 0.16 ± 0.02 | 0.15 ± 0.03 | 0.168 ± 0.02 |
| IMF (longissimus dorsi) | 0.207 ± 0.04 | 0.163 ± 0.03 | 0.225 ± 0.02 |
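The accuracy measure in this table, the average correlation between GEBV and corrected phenotype over 10 validation sets, can be sketched as below. The data are simulated and the function name `validation_accuracy` is hypothetical; the source does not state how the standard error was computed, so the standard error of the mean across sets is an assumption.

```python
import numpy as np

def validation_accuracy(gebv_sets, y_sets):
    """Mean and standard error of corr(GEBV, y) across validation sets."""
    r = np.array([np.corrcoef(g, y)[0, 1] for g, y in zip(gebv_sets, y_sets)])
    # Assumption: SE taken as sample SD of the correlations / sqrt(number of sets).
    return r.mean(), r.std(ddof=1) / np.sqrt(len(r))

# Simulated stand-ins for predicted breeding values and corrected phenotypes.
rng = np.random.default_rng(1)
gebv_sets, y_sets = [], []
for _ in range(10):
    gebv = rng.normal(size=200)
    y = gebv + rng.normal(scale=2.0, size=200)   # phenotype = signal + noise
    gebv_sets.append(gebv)
    y_sets.append(y)

mean_r, se_r = validation_accuracy(gebv_sets, y_sets)
print(f"accuracy = {mean_r:.3f} ± {se_r:.3f}")
```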
Accuracy of evenly-mined GBLUP and text-mined GBLUP.
| Traits | exp_777k + tm_SNPs | exp_777k + em_SNPs |
|---|---|---|
| CWT | 0.451 ± 0.01 | 0.471 ± 0.01 |
| BF | 0.421 ± 0.01 | 0.419 ± 0.01 |
| EMA | 0.437 ± 0.004 | 0.438 ± 0.01 |
| SF (semimembranosus) | 0.129 ± 0.03 | 0.099 ± 0.02 |
| SF (longissimus dorsi) | 0.189 ± 0.03 | 0.095 ± 0.02 |
| IMF (semimembranosus) | 0.168 ± 0.02 | 0.147 ± 0.02 |
| IMF (longissimus dorsi) | 0.225 ± 0.02 | 0.202 ± 0.03 |
exp_777k + em_SNPs: multi-GRM GBLUP fitted with the evenly-mined SNPs and the remaining SNPs as separate marker sets.