Jinyan Huang, Jun Chen, Mark Lathrop, Liming Liang.
Abstract
SUMMARY: RNA sequencing (RNA-seq) is becoming a method of choice for studying transcriptomes, including the mapping of gene expression quantitative trait loci (eQTLs). RNA sample contamination or swapping is a serious problem for downstream analysis: it may produce false discoveries and reduce the power to detect true biological relationships. When genetic data are available, for example in eQTL studies or when samples have previously been genotyped or DNA sequenced, genetic data and RNA-seq data can be combined to detect sample contamination and resolve sample swaps. In this article, we introduce a tool (IDCheck) that allows easy assessment of concordance between genotype (from SNP arrays or DNA sequencing) and gene expression (RNA-seq) samples. IDCheck compares the identity of RNA-seq reads and SNP genotypes using a likelihood-based method. Based on maximum likelihood estimates of the relevant parameters, it can detect sample contamination and identify the correct sample pairs when swapping has occurred. Our tool provides an efficient and convenient way to evaluate and resolve these problems. AVAILABILITY: A complete description of the software is included on the application home page. The software is freely available in the public domain at http://eqtl.rc.fas.harvard.edu/idcheck/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
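The abstract does not spell out IDCheck's likelihood model, so the following is only a minimal sketch of how a likelihood-based concordance check of this general kind could look, not the authors' implementation. It assumes a simple binomial model of allele-specific read counts at each SNP, a fixed sequencing error rate, a single contamination fraction `alpha` estimated by grid search, and no allele-specific expression. All function names, parameters, and the toy data are hypothetical.

```python
import math

def alt_read_prob(alt_fraction, error=0.01):
    """Probability that a read shows the alternate allele, allowing for read error."""
    return alt_fraction * (1.0 - error) + (1.0 - alt_fraction) * error

def log_likelihood(genotypes, read_counts, alpha=0.0, alt_freqs=None, error=0.01):
    """Binomial log-likelihood of RNA-seq (ref, alt) read counts given SNP genotypes.

    genotypes   : alternate-allele counts per SNP (0, 1 or 2)
    read_counts : (ref_reads, alt_reads) per SNP from the RNA-seq sample
    alpha       : assumed fraction of reads coming from a contaminating source
    alt_freqs   : assumed population alternate-allele frequencies per SNP,
                  used only when alpha > 0
    """
    total = 0.0
    for i, (g, (ref, alt)) in enumerate(zip(genotypes, read_counts)):
        p = alt_read_prob(g / 2.0, error)
        if alpha > 0.0:
            # Mix in reads drawn from the (assumed) contaminating population.
            p = (1.0 - alpha) * p + alpha * alt_read_prob(alt_freqs[i], error)
        total += alt * math.log(p) + ref * math.log(1.0 - p)
    return total

def best_match(read_counts, genotype_panel, error=0.01):
    """Return the genotyped sample ID whose genotypes best explain the RNA-seq reads."""
    return max(
        genotype_panel,
        key=lambda sid: log_likelihood(genotype_panel[sid], read_counts, error=error),
    )

def estimate_contamination(genotypes, read_counts, alt_freqs, error=0.01, step=0.01):
    """Grid-search maximum likelihood estimate of alpha on [0, 0.5]."""
    grid = [k * step for k in range(int(round(0.5 / step)) + 1)]
    return max(
        grid,
        key=lambda a: log_likelihood(genotypes, read_counts, a, alt_freqs, error),
    )

# Toy data: two genotyped individuals and one RNA-seq sample labelled "B"
# whose reads actually agree with individual "A" (a swap).
panel = {"A": [0, 1, 2, 0, 2], "B": [2, 1, 0, 2, 0]}
rna_counts = [(20, 0), (10, 10), (0, 20), (20, 0), (0, 20)]
print(best_match(rna_counts, panel))
```

In this sketch, a swap is flagged when the best-matching genotype sample differs from the labelled one, and contamination is flagged when the estimated `alpha` is clearly above zero. A real analysis would also need to handle low-coverage SNPs and allele-specific expression.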
Year: 2013 PMID: 23559639 DOI: 10.1093/bioinformatics/btt155
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937