Sergio Pulido-Tamayo, Jorge Duitama, Kathleen Marchal.
Abstract
Identification of genomic regions associated with a phenotype of interest is a fundamental step toward solving questions in biology and improving industrial research. Bulk segregant analysis (BSA) combined with high-throughput sequencing is a technique to efficiently identify these genomic regions. However, distinguishing true from spuriously linked genomic regions and accurately delineating the genomic positions of the truly linked regions requires complex statistical models, which are currently implemented in software tools that are generally difficult for non-expert users to operate. To facilitate the exploration and analysis of data generated by bulk segregant analysis, we present EXPLoRA-web, a web service wrapped around our previously published algorithm EXPLoRA, which exploits linkage disequilibrium to increase the power and accuracy of quantitative trait loci identification in BSA. EXPLoRA-web provides a user-friendly interface that enables easy data upload and parallel processing of different parameter configurations. Results are provided graphically and as a BED file and/or text file, and the input is expected in widely used formats, enabling straightforward BSA data analysis. The web server is available at http://bioinformatics.intec.ugent.be/explora-web/.
Year: 2016 PMID: 27105844 PMCID: PMC4987886 DOI: 10.1093/nar/gkw298
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Overview of the web service. (A) Input experimental information. (B) Upload count data. (C) Parameter selection. The black line corresponds to the cumulative distribution of allele frequencies (alternative read count/total read count) derived from the uploaded data and is used to estimate the probability distribution that models the emission probability of the neutral-state allele distribution. The blue lines represent the β-distribution models for each of the three α/β ratios: line 1 corresponds to a setting with high specificity and low sensitivity, line 2 to medium specificity and sensitivity, and line 3 to low specificity and high sensitivity. (D) Visual output. The X-axis corresponds to the chromosomal positions and the Y-axis to the posterior probabilities obtained for each marker site. (E) Posterior distributions of the marker sites for each parameter setting. (F) BED file indicating the regions linked to the phenotype.
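The comparison described for panel (C) can be illustrated with a minimal Python sketch. It computes allele frequencies as alternative read count/total read count, evaluates their empirical cumulative distribution, and sets that next to the cumulative distribution of a β-distribution for three α/β settings. This is not the EXPLoRA-web implementation: the function names, the example counts, and the three (α, β) pairs are hypothetical, chosen only for illustration, and they are not mapped to the specificity/sensitivity settings of the figure.

```python
import math


def allele_frequencies(counts):
    """Return alt/total for each (alt_reads, total_reads) pair.

    Sites with zero coverage are skipped (an assumption of this sketch).
    """
    return [alt / total for alt, total in counts if total > 0]


def empirical_cdf(values, x):
    """Fraction of observed values that are <= x."""
    return sum(v <= x for v in values) / len(values)


def beta_pdf(x, a, b):
    """Density of Beta(a, b) at x, for 0 < x < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


def beta_cdf(x, a, b, steps=2000):
    """Midpoint-rule approximation of the Beta(a, b) CDF at x.

    Adequate for a, b >= 1; a dedicated library routine would be
    preferable for production use.
    """
    if x <= 0:
        return 0.0
    if x >= 1:
        return 1.0
    h = x / steps
    return h * sum(beta_pdf((i + 0.5) * h, a, b) for i in range(steps))


# Hypothetical marker sites: (alternative read count, total read count).
EXAMPLE_COUNTS = [(12, 40), (30, 60), (45, 50), (0, 0), (5, 50)]

# Hypothetical (alpha, beta) pairs standing in for the three alpha/beta ratios.
EXAMPLE_SETTINGS = {
    "setting 1": (2.0, 6.0),
    "setting 2": (2.0, 2.0),
    "setting 3": (6.0, 2.0),
}

if __name__ == "__main__":
    freqs = allele_frequencies(EXAMPLE_COUNTS)
    for x in (0.25, 0.5, 0.75):
        row = [f"empirical={empirical_cdf(freqs, x):.2f}"]
        for name, (a, b) in EXAMPLE_SETTINGS.items():
            row.append(f"{name}={beta_cdf(x, a, b):.2f}")
        print(f"x={x:.2f}: " + ", ".join(row))
```

Plotting the empirical curve together with the three model curves over a grid of x values would give a view comparable to the black and blue lines described in the caption.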