Pratyaydipta Rudra, W Jenny Shi, Brian Vestal, Pamela H Russell, Aaron Odell, Robin D Dowell, Richard A Radcliffe, Laura M Saba, Katerina Kechris.
Abstract
BACKGROUND: Heritability of a phenotypic or molecular trait measures the proportion of variance that is attributable to genotypic variance. It is an important concept in breeding and genetics. Few methods are available for calculating heritability for traits derived from high-throughput sequencing.
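As a point of reference for the definition above, the following minimal sketch estimates the proportion of variance attributable to strain for a single continuous trait, using the classical one-way ANOVA (intraclass correlation) estimator. This is not the count-based mixed-model approach the paper develops; the function name `anova_heritability` and the balanced strains-by-replicates layout are assumptions made for illustration.

```python
import numpy as np

def anova_heritability(y):
    """Estimate between-strain variance / total variance for one trait.

    y: array of shape (S, R), S strains with R replicates each
       (balanced design; an assumption of this sketch).
    """
    y = np.asarray(y, dtype=float)
    n_strains, n_reps = y.shape
    strain_means = y.mean(axis=1)
    grand_mean = y.mean()

    # Mean squares between and within strains (one-way ANOVA).
    ms_between = n_reps * np.sum((strain_means - grand_mean) ** 2) / (n_strains - 1)
    ms_within = np.sum((y - strain_means[:, None]) ** 2) / (n_strains * (n_reps - 1))

    # Method-of-moments estimate of the strain variance, truncated at zero.
    var_strain = max((ms_between - ms_within) / n_reps, 0.0)
    total = var_strain + ms_within
    return var_strain / total if total > 0 else 0.0
```

The estimator assumes roughly Gaussian residuals, which is one reason count data from sequencing call for the model-based alternatives discussed in the paper.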
Keywords: Compound Poisson mixed model; Heritability; Negative binomial mixed model; RNAseq; Recombinant inbred panel; Variance partition coefficient
Year: 2017 PMID: 28253840 PMCID: PMC5333443 DOI: 10.1186/s12859-017-1539-6
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Simulation setup summary
| Simulation | (ϕ, α) or (p, ϕ, α) | σ² | (G, S, R) |
|---|---|---|---|
| I. Parameter effects | Constant for all features | Random samples from Unif | (1000, 50, 6) |
| II. Exhaustive combo of parameters | Ind. combo of parameters for every feature | Random samples from Unif (0,5) | (1000, 50, 3) |
| III. Observed combo of parameters | Estimated from the LXS miRNA dataset | Estimated from the LXS miRNA dataset | (881, 59, 2 or 3) |
| IV. Size & power | Estimated from the LXS miRNA dataset | 0, 0.1, 0.25, 0.5, 0.75 or 1 | (1000, 50, 3) |
| V. Confidence intervals | Specifically chosen to generate heritability scores 0.2, 0.5 and 0.8 | Specifically chosen to generate heritability scores 0.2, 0.5 and 0.8 | (500, 50, 3) |
The parameters (ϕ, α) and (p, ϕ, α) are for NB-sim and CP-sim, respectively; σ² denotes the random effect variance in either model. The last column shows the number of features (G), strains (S), and biological replicates (R) in each simulation. Under each scenario, we simulated data using both the NB-sim and CP-sim models. Simulations II and III include 10 replicated synthetic datasets for each case.
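To make the simulation setup concrete, here is a minimal sketch of drawing counts for one feature from a negative binomial model with a strain-level random effect. The parametrization is an assumption, since this excerpt does not define it: α is taken as the intercept on the log scale, ϕ as the dispersion (variance = μ + ϕμ²), and σ² as the variance of a Gaussian strain effect. The function name `simulate_nb_counts` is hypothetical.

```python
import numpy as np

def simulate_nb_counts(alpha, phi, sigma2, n_strains=50, n_reps=3, seed=0):
    """Simulate an (n_strains, n_reps) count matrix for one feature.

    Assumed model (not taken from the source text):
        b_s ~ Normal(0, sigma2)           strain random effect
        mu_s = exp(alpha + b_s)           strain-specific mean
        y_sr ~ NegBin(mean=mu_s, dispersion=phi)
    """
    rng = np.random.default_rng(seed)
    strain_effect = rng.normal(0.0, np.sqrt(sigma2), size=n_strains)
    mu = np.exp(alpha + strain_effect)

    # Convert (mean, dispersion) to NumPy's (n, p) parametrization:
    # n = 1/phi and p = n / (n + mu) give mean mu and variance mu + phi * mu**2.
    size = 1.0 / phi
    prob = size / (size + mu)
    return rng.negative_binomial(size, prob[:, None], size=(n_strains, n_reps))

# Example: one feature with 50 strains and 3 replicates, as in most scenarios above.
counts = simulate_nb_counts(alpha=3.0, phi=0.1, sigma2=0.5)
```

Setting `sigma2=0` removes all strain-to-strain variation, which corresponds to a trait with no heritable component under these assumptions.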
Fig. 1. Comparison of heritability precision for different combinations of true model parameters (Simulation I). Heatmap of Root Mean Square Error (RMSE) corresponding to the four methods for estimating VPC. The data are generated from (a) NB-sim and (b) CP-sim. The parameter combinations for the simulation are shown by the sidebars; the darker the color in the sidebar, the higher the value of the parameter. In the heat map, red indicates better performance (small RMSE) compared to green.
Fig. 2. Bias comparisons for the four methods for a single NB-sim dataset with 1000 features, 50 strains, and 3 replicates per strain. Bias is measured as estimated VPC − true VPC.
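The two accuracy summaries used in these comparisons, per-feature bias (estimated VPC minus true VPC) and RMSE across features, can be computed as in the sketch below. The function name `bias_and_rmse` and the example values are illustrative only and are not taken from the paper.

```python
import numpy as np

def bias_and_rmse(estimated_vpc, true_vpc):
    """Return per-feature bias and the overall root mean square error."""
    est = np.asarray(estimated_vpc, dtype=float)
    true = np.asarray(true_vpc, dtype=float)
    bias = est - true                      # estimated VPC - true VPC
    rmse = float(np.sqrt(np.mean(bias ** 2)))
    return bias, rmse

# Illustrative values only.
bias, rmse = bias_and_rmse([0.5, 0.7], [0.4, 0.8])
```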
Fig. 3. Accuracy comparisons for Simulation III. RMSE for NB-fit (salmon), CP-fit (green), and VST (blue) across 10 simulated datasets generated based on either (a) NB-sim or (b) CP-sim. The true positives are defined as miRNAs with true VPC greater than 0.5.
Type-I error and power (α=0.05) of the four methods for data simulated from NBMM and CPMM
| Method / σ² | NB-sim: 0 | NB-sim: 0.10 | NB-sim: 0.25 | NB-sim: 0.50 | NB-sim: 0.75 | NB-sim: 1 | CP-sim: 0 | CP-sim: 0.10 | CP-sim: 0.25 | CP-sim: 0.50 | CP-sim: 0.75 | CP-sim: 1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CP-fit | 0.10 | 0.78 | 0.89 | 0.94 | 0.96 | 0.97 | 0.04 | 0.72 | 0.86 | 0.92 | 0.96 | 0.96 |
| NB-fit | 0.04 | 0.71 | 0.86 | 0.92 | 0.95 | 0.95 | 0.04 | 0.72 | 0.86 | 0.92 | 0.96 | 0.96 |
| VST | 0 | 0.56 | 0.76 | 0.86 | 0.90 | 0.92 | 0 | 0.55 | 0.76 | 0.86 | 0.90 | 0.92 |
The value in a cell denotes the value of the power function (rounded to 2 decimal places) for each case. The value of σ² drives the power. The case σ² = 0 reflects the null hypothesis, so the corresponding cell value is the type-I error. Columns with σ² > 0 indicate the power for increasing strain-specific variance.
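Each entry in a table of this kind is an empirical rejection rate: the fraction of simulated features whose test p-value falls below the significance level. A minimal sketch follows, with the hypothetical function name `rejection_rate`. When the data are simulated with zero strain-specific variance the result estimates the type-I error; otherwise it estimates power.

```python
import numpy as np

def rejection_rate(p_values, level=0.05):
    """Fraction of tests rejected at the given significance level."""
    p = np.asarray(p_values, dtype=float)
    return float(np.mean(p < level))

# Illustrative p-values only: two of the four fall below 0.05.
rate = rejection_rate([0.01, 0.20, 0.04, 0.50])
```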
Coverage of the VPC by 95% confidence intervals (based on 500 simulations)
| Method / True VPC | NB-sim: Low | NB-sim: Medium | NB-sim: High | CP-sim: Low | CP-sim: Medium | CP-sim: High |
|---|---|---|---|---|---|---|
| CP-fit | 91.8 (0.32) | 93.2 (0.32) | 94.2 (0.19) | 93.4 (0.32) | 92.4 (0.33) | 92.8 (0.20) |
| NB-fit | 90.2 (0.31) | 91.8 (0.32) | 92.0 (0.18) | 93.4 (0.31) | 90.4 (0.31) | 78.4 (0.19) |
| VST | 92.0 (0.32) | 91.6 (0.33) | 92.4 (0.20) | 94.8 (0.32) | 89.6 (0.34) | 84.0 (0.22) |
Coverage percentage is shown at each level of true heritability score for data generated from NB-sim and CP-sim. True VPCs of 0.2, 0.5 and 0.8 were considered as representatives of low, medium and high heritability. The values in parentheses show the average lengths of the intervals for the different cases. 200 bootstraps are used in each case.
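The two quantities reported per cell, coverage percentage and average interval length, can be computed from a set of simulated intervals as sketched below. The function name `coverage_summary` is hypothetical, and the sketch takes the interval endpoints as given; it does not reproduce the bootstrap procedure used to obtain them.

```python
import numpy as np

def coverage_summary(lower, upper, true_vpc):
    """Return (coverage in percent, mean interval length).

    lower, upper: interval endpoints, one pair per simulated dataset.
    true_vpc: the VPC value used to generate the data.
    """
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    covered = (lower <= true_vpc) & (true_vpc <= upper)
    return 100.0 * float(covered.mean()), float(np.mean(upper - lower))

# Illustrative intervals only: one of three contains the true value 0.5.
coverage, mean_length = coverage_summary([0.1, 0.3, 0.6], [0.4, 0.7, 0.9], 0.5)
```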
Fig. 4. Heritability comparison across estimation methods for the LXS miRNA preprocessed dataset. The histograms with rug (tick marks along the x-axis) and kernel density plots are shown along the diagonal. The panels below the diagonal show the scatter plots and the LOESS fits for the pairwise comparisons. The corresponding correlation values are listed above the diagonal.
Top heritable miRNA based on the LXS dataset
| miRNA | VPC (NB) | p-value (NB) | VPC (CP) | p-value (CP) |
|---|---|---|---|---|
| novel:chr10_26214 | 0.959 | 2.2e–39 | 0.978 | 9.6e–38 |
| mmu-miR-5621-5p | 0.947 | 1.2e–27 | 0.955 | 1.1e–28 |
| mmu-miR-466q | 0.941 | 1.4e–22 | 0.982 | 1.5e–21 |
| mmu-miR-9769-3p | 0.914 | 8.5e–33 | 0.994 | 1.6e–33 |
| novel:chr4_11381 | 0.898 | 2.6e–31 | 0.996 | 8.8e–28 |
| novel:chr8_23508 | 0.867 | 1.8e–25 | 0.994 | 1.8e–27 |
| mmu-miR-7057-5p | 0.844 | 5.5e–27 | 0.979 | 5.3e–25 |
The second and fourth columns are the VPC scores for the datasets processed and fit under NBMM (NB-proc & NB-fit) and processed and fit under CPMM (CP-proc & CP-fit), respectively. The corresponding p-values for testing the presence of heritability are listed in the adjacent columns. The features are sorted by their heritability estimates using the NB-fit method
Fig. 6. Sequencing read examples for (a) a top heritable miRNA feature and (b) a top heritable mRNA feature. Each boxplot summarizes the reads for one strain, and the boxplots are sorted by strain mean in increasing order. The colors of the boxes have no special significance. The estimated VPC scores are reported in the top left tables.
Fig. 5. Heritability comparison for the LXS mRNA dataset. The histograms and kernel density plots are shown along the diagonal. The panels below the diagonal show the scatter plots and the LOESS fits for the pairwise comparisons. The corresponding correlation values are listed above the diagonal.