Abstract
BACKGROUND: The rapid development of high-throughput omics technologies has generated increasing interest in algorithms for cutoff point identification. Existing cutoff methods and tools identify cutoff points based on the association of a continuous variable with another variable, such as phenotype, disease state, or treatment group. These approaches are not applicable to descriptive studies, in which continuous variables are reported without a known association with any biologically meaningful variable.
Keywords: -omics; Cutoff; Descriptive genomics; Dichotomization; Threshold
Year: 2022 PMID: 35287573 PMCID: PMC8922865 DOI: 10.1186/s12864-022-08427-6
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 Illustration of the method for cutoff point identification in descriptive high-throughput biological studies. Variable distributions (A, D, F) and biological categories enriched in shortlists identified using cutoff points (B, E, G) are shown for the following datasets: genes expressed in the human cerebral cortex (A, B), genes sensitive to chemical exposures (D, E), and proteins expressed in the adult human heart (F, G). Figure C illustrates changes in the number of shortlisted genes identified by the described cutoff algorithm in relation to the number of genes in the dataset. The number of shortlisted genes is shown as a percentage of the total number of shortlisted genes identified for the complete dataset (16,353 genes). In graphs A, D, and F, A is the curve of the original data distribution, B is a linear shortcut connecting the first and the last points of A, and C is a family of linear functions perpendicular to B. Four C functions are shown in figure A. In figures C and D, the longest segments of the corresponding C functions are shown. Red vertical lines in figures A, D, and F correspond to the cutoff points.
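
The geometric construction in the caption (curve A, chord B, perpendiculars C) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. It assumes that the cutoff is placed at the point of curve A whose perpendicular segment to chord B is longest, that the values are ranked in descending order, and that rank and value are used on their raw scales without normalization. The function name `cutoff_by_max_distance` is hypothetical.

```python
import math


def cutoff_by_max_distance(values):
    """Return (rank, value) of the point on the ranked curve farthest from the chord.

    Assumptions (not confirmed by the source): values are ranked in
    descending order, and the cutoff is the point with the longest
    perpendicular segment to the chord joining the first and last points.
    """
    ys = sorted(values, reverse=True)  # curve A: value as a function of rank
    n = len(ys)
    if n < 3:
        raise ValueError("need at least three values")

    # Chord B connects the first point (0, ys[0]) and the last point (n - 1, ys[-1]).
    x0, y0 = 0.0, ys[0]
    dx, dy = float(n - 1), ys[-1] - ys[0]
    chord_len = math.hypot(dx, dy)

    best_rank, best_dist = 0, -1.0
    for rank, y in enumerate(ys):
        # Length of the perpendicular (a C segment) from the point to chord B.
        dist = abs(dy * (rank - x0) - dx * (y - y0)) / chord_len
        if dist > best_dist:
            best_rank, best_dist = rank, dist
    return best_rank, ys[best_rank]


# Toy data, invented for illustration only.
scores = [8, 100, 1, 90, 6, 80, 0, 10, 4, 2]
rank, value = cutoff_by_max_distance(scores)
print(rank, value)
```

Because perpendicular distance depends on the units of both axes, rescaling rank and value (for example to the range 0 to 1) may move the selected point; whether the published method rescales is not stated in this record.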