Ying Liu, ChienHsun Huang, Inchi Hu, Shaw-Hwa Lo, Tian Zheng.
Abstract
Current sequencing technology enables the generation of whole genome sequencing data sets that contain a high density of rare variants, each of which is carried by at most 5% of the sampled subjects. Such variants are involved in the etiology of most common diseases in humans. These diseases can be studied through relevant longitudinal phenotype traits. Tests for association between such genotype information and longitudinal traits allow the study of the function of rare variants in complex human disorders. In this paper, we propose an association-screening framework that highlights the genotypic differences observed on rare variants and the longitudinal nature of phenotypes. In particular, both the variants within a gene and the longitudinal phenotypes are used to create partitions of subjects. Association between the 2 sets of constructed partitions is then evaluated. We apply the proposed strategy to the simulated data from Genetic Analysis Workshop 18 and compare the results with those from the sequence kernel association test (SKAT) using receiver operating characteristic (ROC) curves.
Year: 2014 PMID: 25519328 PMCID: PMC4143709 DOI: 10.1186/1753-6561-8-S1-S47
Source DB: PubMed Journal: BMC Proc ISSN: 1753-6561
Figure 1. Clustering of individuals using single-nucleotide polymorphisms (SNPs) with minor allele frequencies (MAFs) between 0.01 and 0.05 for MAP4. A, Shown are 10 clusters; the numbers at the top are the odds ratios within each partition block, based on blood pressures. Each row is a SNP, and each column is an individual. SNPs are ordered by decreasing MAF (from top to bottom). Green vertical bars indicate subjects with higher blood pressures (see text). Genotype aa is plotted in red, aA in blue, and AA in white (a denotes the minor allele). The partitions of the 849 individuals are indicated by dotted lines. Most partition elements are driven by similarity on rarer SNPs rather than on more common SNPs. B, Clustering of individuals using their systolic blood pressure (SBP) curves from the first simulation. Individuals are reasonably grouped into 1 high blood pressure cluster and 1 low blood pressure cluster.
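The per-block odds ratios mentioned in panel A can be illustrated with a minimal sketch. The caption does not define the odds ratio, so the version below is an assumption: for each genotype partition block, it compares the odds of belonging to the high blood pressure cluster inside the block with the odds outside it. The function name `block_odds_ratios`, its arguments, and the optional continuity correction are hypothetical and are not taken from the paper.

```python
from collections import Counter


def block_odds_ratios(genotype_block, high_bp, correction=0.5):
    """Odds ratio of high-BP membership inside each genotype block vs. outside it.

    genotype_block: one block label per subject (genotype-based partition).
    high_bp: one bool per subject (True = high blood pressure cluster).
    correction: constant added to every cell of the 2x2 table
        (Haldane-Anscombe style) so that empty cells do not divide by zero.
        This is an assumption of the sketch, not something the caption states.
    """
    if len(genotype_block) != len(high_bp):
        raise ValueError("both partitions must cover the same subjects")

    total_high = sum(1 for h in high_bp if h)
    total_low = len(high_bp) - total_high
    block_size = Counter(genotype_block)
    high_in_block = Counter(b for b, h in zip(genotype_block, high_bp) if h)

    ratios = {}
    for block, n in block_size.items():
        a = high_in_block[block]  # high BP, inside the block
        b = n - a                 # low BP, inside the block
        c = total_high - a        # high BP, outside the block
        d = total_low - b         # low BP, outside the block
        ratios[block] = ((a + correction) * (d + correction)) / (
            (b + correction) * (c + correction)
        )
    return ratios


# Toy example with 8 subjects and 2 genotype blocks (made-up labels).
blocks = [0, 0, 0, 0, 1, 1, 1, 1]
high = [True, True, True, False, False, False, False, True]
print(block_odds_ratios(blocks, high, correction=0.0))
```

An odds ratio well above 1 for a block would suggest that subjects sharing that rare-variant pattern are enriched in the high blood pressure cluster; values near 1 suggest no such enrichment.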
Figure 2. Average ROC curves across simulation replicates for 3 methods. Shown are results for 10 clusters using inverse-probability weighting. Areas under the curve (AUCs) for the different methods are compared using paired Wilcoxon signed rank tests based on the 200 replicates, with the resulting p values shown in the table below. fpr, i.e., false positive rate, is the ratio of the number of noncausal genes claimed to be causal to the number of true noncausal genes; tpr, i.e., true positive rate, is the ratio of the number of causal genes claimed to be causal to the number of true causal genes.
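The fpr and tpr definitions in this caption can be made concrete with a short sketch. It assumes each gene receives a screening score where a larger value means stronger evidence of association, and that the true causal status of each gene is known, as it is for simulated data. The function names `roc_points` and `auc` and the toy scores are hypothetical and are not taken from the paper.

```python
def roc_points(scores, is_causal):
    """Return (fpr, tpr) pairs as the claim threshold is relaxed step by step.

    scores: one score per gene; genes at or above the threshold are claimed causal.
    is_causal: one bool per gene giving the true (simulated) causal status.
    """
    n_causal = sum(1 for c in is_causal if c)
    n_noncausal = len(is_causal) - n_causal
    if n_causal == 0 or n_noncausal == 0:
        raise ValueError("need at least one causal and one noncausal gene")

    ranked = sorted(zip(scores, is_causal), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    true_pos = false_pos = 0
    i = 0
    while i < len(ranked):
        # Genes with tied scores cross the threshold together.
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1]:
                true_pos += 1
            else:
                false_pos += 1
            j += 1
        points.append((false_pos / n_noncausal, true_pos / n_causal))
        i = j
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Toy example: 4 genes, 2 of them truly causal (made-up scores).
scores = [0.9, 0.8, 0.7, 0.6]
truth = [True, False, True, False]
curve = roc_points(scores, truth)
print(curve)
print(auc(curve))
```

Computing one AUC per replicate for each method yields paired samples, which could then be compared with a paired test such as `scipy.stats.wilcoxon`.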