
Bimodal gene expression and biomarker discovery.

Adam Ertel

Abstract

With insights gained through molecular profiling, cancer is recognized as a heterogeneous disease with distinct subtypes and outcomes that can be predicted by a limited number of biomarkers. Statistical methods such as supervised classification and machine learning identify distinguishing features associated with disease subtype but are not necessarily clear or interpretable on a biological level. Genes with bimodal transcript expression, however, may serve as excellent candidates for disease biomarkers with each mode of expression readily interpretable as a biological state. The recent article by Wang et al, entitled "The Bimodality Index: A Criterion for Discovering and Ranking Bimodal Signatures from Cancer Gene Expression Profiling Data," provides a bimodality index for identifying and scoring transcript expression profiles as biomarker candidates with the benefit of having a direct relation to power and sample size. This represents an important step in candidate biomarker discovery that may help streamline the pipeline through validation and clinical application.

Keywords:  bimodal; biomarkers; cancer; gene expression microarrays; genomics

Year:  2010        PMID: 20234772      PMCID: PMC2834379          DOI: 10.4137/cin.s3456

Source DB:  PubMed          Journal:  Cancer Inform        ISSN: 1176-9351


High-throughput gene expression assays generate large-scale datasets that are useful for gaining insight into healthy biological systems, disease phenotypes, and biomarkers that are representative of these phenotypes. The recent publication by Wang et al1 provides a sound approach for mining these expression datasets to identify and rank a class of genes, those with bimodal expression profiles, that may serve as ideal biomarker candidates. Biomarkers that correspond with disease phenotypes are a useful tool for the diagnosis, treatment, and prognosis of disease. Cancer, as a heterogeneous disease, has many subtypes that respond differently to treatment and have different overall prognoses.2 Biomarkers with accurate and reliable assays can be useful in identifying specific cancer subtypes and guiding treatment in the age of personalized or precision medicine. Molecular profiles with bimodal expression provide excellent candidates for biomarkers because the modes can be used to classify samples into two distinct expression states. During the biomarker discovery process, a bimodal expression profile may be considered meaningful when the modes of expression correspond with binary biological phenotypes, such as healthy and disease states. A biomarker that is deemed meaningful then needs follow-up studies to determine its sensitivity and specificity before it can be considered accurate and reliable for practical application. However, it is rare that a molecule associated with a disease phenotype can be assayed with the sensitivity and specificity required for a clinical diagnostic test.3 One advantage of biomarker candidates with bimodal profiles at the transcript level is that they may be easily translated to the protein level and IHC staining, which makes a greater variety of assays available.
Bimodal transcript expression typically corresponds with membrane and extracellular proteins, where molecules used as cancer biomarkers primarily localize.3,4 A variety of available assays may need to be evaluated at the gene or protein level before adequate reliability is obtained. Estrogen receptor, for example, has served as an important biomarker in breast cancer, but assays have had varying success, and some but not all assays capture a bimodal distribution.5,6 The method presented in Wang et al was applied to the MDA133 breast cancer microarray dataset previously published by Hess et al.1,7 The MDA133 microarray dataset is accompanied by clinical information including immunohistochemistry (IHC) scores for markers currently used to evaluate breast cancer, including estrogen receptor (ESR1), progesterone receptor (PGR), and human epidermal growth factor receptor 2 (HER2, or ERBB2). These markers define subcategories of breast cancer that differ in response to therapy as well as overall survival.2 IHC scores for these proteins are graded by pathologists and used to guide the diagnosis and treatment of breast cancer subtypes, and there is evidence that transcript profiles for these markers correlate well with protein measures.8,9 The IHC profiles of these markers, based on the dataset from Hess et al available at http://bioinformatics.mdanderson.org/pubdata.html, follow a bimodal distribution (Fig. 1).7 A bimodal distribution is suitable for defining a cut point between the two modal peaks. The cut points used for the IHC scores of each molecule in the Hess et al7 dataset demonstrate this and are identified with dashed vertical red lines in Figure 1. With the bimodal distributions of IHC scores established for these markers, Wang et al1 investigated the gene expression profiles and Bimodality Index for these three genes and found that they all had high scores for bimodality.
The bimodal expression profiles for these three transcripts, using log2-transformed data from Hess et al,7 are shown in Figure 2. The software package for computing the bimodality index also provides parameters for the bimodal mixture distributions, which were used to define the marker classification thresholds shown as dashed vertical red lines in Figure 2. While the authors only commented on the proportion of samples represented by each mode, the degree of shading shows that the mode of expression from the transcript profile correlates well with the mode of expression from the IHC score. This not only validates the bimodality index in real data, but also demonstrates that an automated transcript-based assay may be an attractive alternative to manually scored IHC.
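To make the quantities in this paragraph concrete, the following is a minimal sketch rather than the authors' software. It assumes the definition of the bimodality index given by Wang et al, BI = sqrt(pi(1 - pi)) * |mu1 - mu2| / sigma, for a two-component normal mixture with a shared standard deviation. The expectation-maximization fit, the function names, the synthetic expression values, and the equal-posterior rule used for the threshold are all illustrative assumptions; the thresholds in Figure 2 may have been derived differently.

```python
import math
import random


def fit_two_normal_mixture(values, n_iter=200):
    """Fit a two-component normal mixture with a shared standard deviation by EM.

    Returns (pi_low, mu_low, mu_high, sigma), where pi_low is the estimated
    proportion of samples in the lower mode. Hypothetical helper, not the
    published package.
    """
    xs = sorted(values)
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two values")
    half = n // 2
    # Crude starting values: means of the lower and upper halves of the data.
    mu_low = sum(xs[:half]) / half
    mu_high = sum(xs[half:]) / (n - half)
    grand_mean = sum(xs) / n
    sigma = max(math.sqrt(sum((x - grand_mean) ** 2 for x in xs) / n) / 2, 1e-9)
    pi_low = 0.5
    for _ in range(n_iter):
        # E-step: posterior probability that each value belongs to the lower mode.
        resp = []
        for x in xs:
            a = pi_low * math.exp(-0.5 * ((x - mu_low) / sigma) ** 2)
            b = (1.0 - pi_low) * math.exp(-0.5 * ((x - mu_high) / sigma) ** 2)
            if a + b > 0.0:
                resp.append(a / (a + b))
            else:  # both densities underflowed: assign to the nearer mean
                resp.append(1.0 if abs(x - mu_low) <= abs(x - mu_high) else 0.0)
        # M-step: update the weight, the two means, and the shared deviation.
        w_low = min(max(sum(resp), 1e-9), n - 1e-9)
        mu_low = sum(r * x for r, x in zip(resp, xs)) / w_low
        mu_high = sum((1.0 - r) * x for r, x in zip(resp, xs)) / (n - w_low)
        var = sum(r * (x - mu_low) ** 2 + (1.0 - r) * (x - mu_high) ** 2
                  for r, x in zip(resp, xs)) / n
        sigma = max(math.sqrt(var), 1e-9)
        pi_low = w_low / n
    if mu_low > mu_high:  # keep the "low" label on the smaller mean
        mu_low, mu_high, pi_low = mu_high, mu_low, 1.0 - pi_low
    return pi_low, mu_low, mu_high, sigma


def bimodality_index(pi_low, mu_low, mu_high, sigma):
    """BI = sqrt(pi * (1 - pi)) * |mu1 - mu2| / sigma."""
    return math.sqrt(pi_low * (1.0 - pi_low)) * abs(mu_high - mu_low) / sigma


def equal_posterior_threshold(pi_low, mu_low, mu_high, sigma):
    """Value at which both modes are equally probable (assumed threshold rule)."""
    if mu_high == mu_low:
        return mu_low
    shift = sigma ** 2 * math.log(pi_low / (1.0 - pi_low)) / (mu_high - mu_low)
    return 0.5 * (mu_low + mu_high) + shift


# Synthetic log2 expression values for one transcript (made-up numbers).
random.seed(0)
expression = ([random.gauss(5.0, 0.5) for _ in range(80)]
              + [random.gauss(9.0, 0.5) for _ in range(53)])
params = fit_two_normal_mixture(expression)
print("bimodality index:", round(bimodality_index(*params), 2))
print("classification threshold:", round(equal_posterior_threshold(*params), 2))
```

Because the index combines the separation of the modes with the balance of their proportions, a transcript with two well-separated modes of similar size scores higher than one with a small outlier group at the same separation.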
Figure 1.

Histograms representing IHC scores for ESR1, PGR, and ERBB2. These three IHC markers appear as bimodal distributions in the MD Anderson 133-sample dataset. Dashed vertical red lines define thresholds for dichotomizing values as marker-positive and marker-negative.

Figure 2.

Histograms representing transcript level distributions for the ESR1, PGR, and ERBB2 genes. The transcripts for these three genes have bimodal distributions, with the dashed vertical line representing the classification threshold between the two modes. The histogram shading represents the proportion of marker-positive IHC scores in each bin (dark blue corresponds to marker-negative IHC and white corresponds to marker-positive IHC). The solid red line represents the bimodal distribution density estimate based on parameters from the bimodality index software package.

The correspondence between protein-level and transcript-level expression for these three markers shows much promise for the application of the bimodality index to biomarker discovery. However, a bimodal expression profile alone does not imply that a molecule will have a meaningful correlation with a biological or clinical variable of interest. The authors provide an example of a problematic candidate: the marker creatine kinase, brain (CKB) has a strong bimodal profile that appears to be associated with breast cancer and, furthermore, with advanced stage of disease, but it has provided limited value as a prognostic marker in this disease.10 Recognizing that many biomarker candidates will turn out to be false positives emphasizes the advantage of using a score such as the bimodality index, in that it relates directly to power and sample size and provides a ranking system for the systematic assessment and validation of biomarker candidates. This aspect should prove valuable in efficiently evaluating biomarker candidates from discovery through validation to establish clinically relevant molecules and assays.
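The validation step discussed in this commentary, in which a dichotomized transcript marker is checked against a reference label such as IHC status, can be illustrated with a short sketch. The function name, the expression values, the reference labels, and the threshold below are hypothetical and are not taken from the MDA133 dataset.

```python
def sensitivity_specificity(marker_values, reference_positive, threshold):
    """Compare a thresholded marker against a binary reference standard.

    A sample is called marker-positive when its value exceeds the threshold.
    Returns (sensitivity, specificity); a value is NaN when undefined.
    """
    tp = fp = tn = fn = 0
    for value, is_positive in zip(marker_values, reference_positive):
        called_positive = value > threshold
        if called_positive and is_positive:
            tp += 1
        elif called_positive and not is_positive:
            fp += 1
        elif not called_positive and is_positive:
            fn += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Made-up log2 transcript values and reference (for example IHC) calls.
transcript = [4.8, 5.1, 5.5, 8.7, 9.2, 9.0, 6.9, 8.8]
ihc_positive = [False, False, False, True, True, True, True, False]
sens, spec = sensitivity_specificity(transcript, ihc_positive, threshold=7.0)
print("sensitivity:", sens, "specificity:", spec)
```

A candidate with a high bimodality index can still perform poorly here, which is why ranking by bimodality is a starting point for validation and not a substitute for it.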
  10 in total

1.  Estrogen receptor testing of breast cancer in current clinical practice: what's the question?

Authors:  Stuart J Schnitt
Journal:  J Clin Oncol       Date:  2006-03-27       Impact factor: 44.544

2.  Bimodal population or pathologist artifact?

Authors:  David L Rimm; Jennifer M Giltnane; Christopher Moeder; Malini Harigopal; Gina G Chung; Robert L Camp; Barbara Burtness
Journal:  J Clin Oncol       Date:  2007-06-10       Impact factor: 44.544

3.  Pharmacogenomic predictor of sensitivity to preoperative chemotherapy with paclitaxel and fluorouracil, doxorubicin, and cyclophosphamide in breast cancer.

Authors:  Kenneth R Hess; Keith Anderson; W Fraser Symmans; Vicente Valero; Nuhad Ibrahim; Jaime A Mejia; Daniel Booser; Richard L Theriault; Aman U Buzdar; Peter J Dempsey; Roman Rouzier; Nour Sneige; Jeffrey S Ross; Tatiana Vidaurre; Henry L Gómez; Gabriel N Hortobagyi; Lajos Pusztai
Journal:  J Clin Oncol       Date:  2006-08-08       Impact factor: 44.544

4.  Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications.

Authors:  T Sørlie; C M Perou; R Tibshirani; T Aas; S Geisler; H Johnsen; T Hastie; M B Eisen; M van de Rijn; S S Jeffrey; T Thorsen; H Quist; J C Matese; P O Brown; D Botstein; P E Lønning; A L Børresen-Dale
Journal:  Proc Natl Acad Sci U S A       Date:  2001-09-11       Impact factor: 11.205

5.  Determination of oestrogen-receptor status and ERBB2 status of breast carcinoma: a gene-expression profiling study.

Authors:  Yun Gong; Kai Yan; Feng Lin; Keith Anderson; Christos Sotiriou; Fabrice Andre; Frankie A Holmes; Vicente Valero; Daniel Booser; John E Pippen; Svetislava Vukelja; Henry Gomez; Jaime Mejia; Luis J Barajas; Kenneth R Hess; Nour Sneige; Gabriel N Hortobagyi; Lajos Pusztai; W Fraser Symmans
Journal:  Lancet Oncol       Date:  2007-03       Impact factor: 41.316

6.  Identification of molecular apocrine breast tumours by microarray analysis.

Authors:  Pierre Farmer; Herve Bonnefoi; Veronique Becette; Michele Tubiana-Hulin; Pierre Fumoleau; Denis Larsimont; Gaetan Macgrogan; Jonas Bergh; David Cameron; Darlene Goldstein; Stephan Duss; Anne-Laure Nicoulaz; Cathrin Brisken; Maryse Fiche; Mauro Delorenzi; Richard Iggo
Journal:  Oncogene       Date:  2005-07-07       Impact factor: 9.867

7.  A list of candidate cancer biomarkers for targeted proteomics.

Authors:  Malu Polanski; N Leigh Anderson
Journal:  Biomark Insights       Date:  2007-02-07

8.  The bimodality index: a criterion for discovering and ranking bimodal signatures from cancer gene expression profiling data.

Authors:  Jing Wang; Sijin Wen; W Fraser Symmans; Lajos Pusztai; Kevin R Coombes
Journal:  Cancer Inform       Date:  2009-08-05

9.  Switch-like genes populate cell communication pathways and are enriched for extracellular proteins.

Authors:  Adam Ertel; Aydin Tozeren
Journal:  BMC Genomics       Date:  2008-01-04       Impact factor: 3.969

10.  Creatine kinase BB isoenzyme levels in tumour cytosols and survival of breast cancer patients.

Authors:  N Zarghami; M Giai; H Yu; R Roagna; R Ponzone; D Katsaros; P Sismondi; E P Diamandis
Journal:  Br J Cancer       Date:  1996-02       Impact factor: 7.640

