
Gene expression versus sequence for predicting function: Glia Maturation Factor gamma is not a glia maturation factor.

Michael G Walker.   

Abstract

It is standard practice, whenever a researcher finds a new gene, to search databases for genes that have a similar sequence. It is not standard practice, whenever a researcher finds a new gene, to search for genes that have similar expression (co-expression). Failure to perform co-expression searches has led to incorrect conclusions about the likely function of new genes, and has led to wasted laboratory attempts to confirm functions incorrectly predicted. We present here the example of Glia Maturation Factor gamma (GMF-gamma). Despite its name, it has not been shown to participate in glia maturation. It is a gene of unknown function that is similar in sequence to GMF-beta. The sequence homology and chromosomal location led to an unsuccessful search for GMF-gamma mutations in glioma. We examined GMF-gamma expression in 1432 human cDNA libraries. Highest expression occurs in phagocytic, antigen-presenting and other hematopoietic cells. We found GMF-gamma mRNA in almost every tissue examined, with expression in nervous tissue no higher than in any other tissue. Our evidence indicates that GMF-gamma participates in phagocytosis in antigen-presenting cells. Searches for genes with similar sequences should be supplemented with searches for genes with similar expression to avoid incorrect predictions.

Year:  2003        PMID: 15626333      PMCID: PMC5172355          DOI: 10.1016/s1672-0229(03)01007-6

Source DB:  PubMed          Journal:  Genomics Proteomics Bioinformatics        ISSN: 1672-0229            Impact factor:   7.691


Introduction

When a researcher finds a new gene, they search GenBank or other databases for genes that have a similar sequence. It is not standard practice, when a researcher finds a new gene, to search for genes that have similar expression (co-expression). Failure to perform co-expression searches has led to incorrect conclusions about the likely functions of new genes, and has led to wasted laboratory attempts to confirm functions incorrectly predicted by sequence similarity. We describe here the case of Glia Maturation Factor gamma. Despite its name, Glia Maturation Factor gamma (GMF-gamma) has not been shown to participate in glia maturation. It is a recently identified gene of unknown function. It was first described by Asai, who named it based on its homology to GMF-beta (82% amino acid identity, 70% DNA identity), a neural growth and maturation factor. Asai and colleagues examined GMF-gamma expression in eight human tissues and found it predominantly in lung, heart, and placenta, with trace expression in brain, liver, skeletal muscle, kidney, and pancreas. (Their libraries did not include hemic or immune system tissues.) Peters and colleagues determined that GMF-gamma is located on human chromosome 19 at band q13.2, a region that is frequently deleted in malignant glioma. The possible role of GMF-gamma in glial differentiation (based on its homology to GMF-beta) and its chromosomal location in a region linked to malignant glioma led Peters and colleagues to study its potential role in this disease. They examined 41 gliomas but found no mutations in GMF-gamma, indicating that it is not the 19q13.2 glioma gene. Thus, the function of GMF-gamma is currently unknown.

Results and Discussion

We detected GMF-gamma mRNA in 292 of 1432 cDNA libraries (Table 1), in every tissue category except stomatognathic. It is most abundant in hemic and immune tissues (87 of 179 libraries). These 179 hemic and immune libraries were derived from peripheral blood, umbilical cord blood, lymph nodes, thymus, spleen, bone marrow, tonsil, and Jurkat cells. We detected GMF-gamma in monocytes, macrophages, dendritic cells, B-cells, T-cells, blast cells, eosinophils, and mast cells in the blood samples. We also detected it in lymph node, spleen, thymus, and tonsil tissue. It was not detected in bone marrow samples. The high expression in hemic and immune tissues, absence in bone marrow, and ubiquitous but low-level expression in other tissue samples are consistent with expression in mature blood cells.
Table 1

Distribution of GMF-gamma ESTs by Tissue Type

Tissue Category | Number of GMF-gamma ESTs | Number of ESTs in libraries from this tissue | Percent GMF-gamma ESTs | Number of libraries | Number of libraries in which GMF-gamma is detected
Cardiovascular System | 17 | 272986 | 0.006 | 74 | 11
Connective Tissue | 8 | 151678 | 0.005 | 54 | 6
Digestive System | 35 | 521762 | 0.007 | 155 | 28
Embryonic Structures | 8 | 108468 | 0.007 | 24 | 5
Endocrine System | 14 | 233683 | 0.006 | 63 | 7
Exocrine Glands | 29 | 258383 | 0.011 | 67 | 18
Genitalia, Female | 21 | 456353 | 0.005 | 117 | 14
Genitalia, Male | 21 | 463016 | 0.005 | 120 | 14
Germ Cells | 7 | 48181 | 0.015 | 5 | 2
Hemic and Immune | 721 | 725942 | 0.099 | 179 | 87
Liver | 5 | 115620 | 0.004 | 37 | 5
Musculoskeletal | 22 | 162801 | 0.014 | 50 | 13
Nervous System | 55 | 995533 | 0.006 | 231 | 31
Pancreas | 6 | 111771 | 0.005 | 25 | 4
Respiratory System | 41 | 412898 | 0.010 | 96 | 28
Sense Organs | 1 | 25345 | 0.004 | 10 | 1
Skin | 4 | 72732 | 0.006 | 18 | 2
Stomatognathic | 0 | 14712 | 0.000 | 17 | 0
Unclassified/Mixed | 14 | 159180 | 0.009 | 22 | 8
Urinary Tract | 30 | 295517 | 0.010 | 68 | 8
Totals | 1059 | 5606561 | 0.019 | 1432 | 292

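The percentage column in Table 1 appears to be the GMF-gamma EST count divided by the total EST count for the tissue category, multiplied by 100. The following Python sketch assumes that reading and uses four rows copied from the table to recompute the percentages and the share of libraries in which GMF-gamma was detected. The function and variable names are illustrative and are not part of the original analysis.

```python
# Rows copied from Table 1:
# (GMF-gamma ESTs, total ESTs, number of libraries, libraries with GMF-gamma detected)
TABLE_1_ROWS = {
    "Cardiovascular System": (17, 272986, 74, 11),
    "Hemic and Immune": (721, 725942, 179, 87),
    "Nervous System": (55, 995533, 231, 31),
    "Totals": (1059, 5606561, 1432, 292),
}


def percent_gmf_ests(gmf_ests, total_ests):
    """Percent of a category's ESTs that are GMF-gamma (assumed column definition)."""
    return 100.0 * gmf_ests / total_ests


def fraction_libraries_detected(n_libraries, n_detected):
    """Fraction of a category's libraries in which GMF-gamma was detected."""
    return n_detected / n_libraries


for tissue, (gmf, total, libs, detected) in TABLE_1_ROWS.items():
    print(
        f"{tissue}: {percent_gmf_ests(gmf, total):.3f}% of ESTs, "
        f"detected in {fraction_libraries_detected(libs, detected):.1%} of libraries"
    )
```

Under this reading, the hemic and immune category comes out at about 0.099% of ESTs with detection in 87 of 179 libraries (about 49%), against about 0.006% of ESTs and 31 of 231 libraries (about 13%) for the nervous system.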
Table 2 shows genes with expression patterns similar to that of GMF-gamma by co-expression analysis (see Methods). For all the co-expressed genes in Table 2, the probability that the co-expression with GMF-gamma is due to chance is less than 1.0 × 10^-15 by the Fisher Exact test. What do the genes with which GMF-gamma is co-expressed indicate about its likely function?
Cathepsin S (CTSS) is a cysteine protease expressed in antigen-presenting cells; it cleaves the MHC class II invariant chain, a step that is required for antigen presentation (3, 4, 5, 6). It is found at the highest levels in spleen, heart, and lung; in the lung it is detected only in macrophages. CD53 is a member of the transmembrane-4 superfamily (TM4SF) of proteins that act as linker molecules, recruiting protein kinase C (PKC) into a signaling complex with beta integrins. It is expressed predominantly in lymphoid-myeloid lineage cells, participates in the LFA-1-dependent pathway of lymphocyte activation and cell adhesion, and associates with MHC class II molecules at the plasma membrane of lymph cells. HLA-DR alpha is an MHC class II protein expressed specifically by antigen-presenting cells; it regulates antigen presentation. The Fc-epsilon-receptor gamma is a subunit of several Fc receptors, and is expressed in antigen-presenting cells and other leukocytes. Fc receptors phagocytose antigen-antibody complexes, induce antigen presentation by MHC II proteins, and participate in signal transduction (11, 12, 13, 14, 15, 16, 17). PKC isoforms are important in phagocytosis by Fc receptors. Lysosomal protein (LAPTm5) is expressed preferentially in hematopoietic cell lines, is localized to lysosomes, and binds to ubiquitin. Adra and colleagues found high expression of LAPTm5 in peripheral blood leukocytes, lung, thymus, and spleen in human adult tissues.
Lymphocyte function-associated antigen 1 (LFA-1) integrin beta subunit (alternate name CD18/CD11) is a leukocyte cell surface adhesion molecule that mediates signal transduction and co-operates with Fc receptors (20, 21, 22, 23). L-plastin (LPL) is a calcium-regulated actin-binding protein expressed in leukocytes, fibroblasts, and a diverse set of solid tumors (24, 25, 26). LPL is phosphorylated in phagocytes in response to inflammatory stimuli, which also increase actin polymerization (24, 27). Pleckstrin is a protein kinase C substrate that is expressed in macrophages and localizes to the phagosomal membranes upon ingestion of opsonized antigen (18, 28). It is found in lymphocytes, monocytes, granulocytes, and platelets but is not detected in non-hematopoietic cells. It is induced upon differentiation of hematopoietic cells and is thought to participate in signaling. p40phox is expressed specifically in phagocytic cells, is a PKC substrate, and participates in the oxidation of phagocytosed particles (31, 32, 33, 34). Upstream stimulatory factor 1 (USF1) is a transcription factor that regulates differentiation of hematopoietic cells and activates MHC genes, among others (35, 36, 37). Co-expression of GMF-gamma with these genes indicates that it is expressed particularly in phagocytic and antigen-processing cells. GMF-gamma is co-expressed at lower levels with many other MHC, proteasome, and blood-cell-specific genes.
Table 2

Genes Co-expressed with GMF-gamma (p-values from Fisher Exact Test)

Gene | -log of p-value
LAPTm5 | 30
HLA-DR alpha | 29
Fc-epsilon R1 gamma | 26
CD53 | 25
LFA-1/integrin beta | 25
LCP1 / L-plastin / p65 | 24
pleckstrin p47 | 23
p40phox | 23
USF1 | 23
CTSS cathepsin S | 23

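Table 2 reports each association as the negative logarithm of its p-value. The short Python sketch below converts those entries back to p-values and checks them against the bound of 1.0 × 10^-15 quoted in the text. It assumes the logarithm is base 10, which the table does not state, and the names used are illustrative only.

```python
import math

# -log p-values copied from Table 2.
# ASSUMPTION: the logarithm is base 10; the table does not say so explicitly.
NEG_LOG_P = {
    "LAPTm5": 30,
    "HLA-DR alpha": 29,
    "Fc-epsilon R1 gamma": 26,
    "CD53": 25,
    "LFA-1/integrin beta": 25,
    "LCP1 / L-plastin / p65": 24,
    "pleckstrin p47": 23,
    "p40phox": 23,
    "USF1": 23,
    "CTSS cathepsin S": 23,
}

THRESHOLD = 1.0e-15  # upper bound on the chance probability quoted in the text


def to_p_value(neg_log_p):
    """Convert a -log10 p-value back to a p-value."""
    return 10.0 ** (-neg_log_p)


p_values = {gene: to_p_value(value) for gene, value in NEG_LOG_P.items()}
all_below_threshold = all(p < THRESHOLD for p in p_values.values())
largest_p = max(p_values.values())

print("all below threshold:", all_below_threshold)
print("orders of magnitude below threshold (weakest entry):",
      round(math.log10(THRESHOLD / largest_p)))
```

If the base-10 assumption holds, even the weakest entries in the table (a value of 23) sit about eight orders of magnitude below the quoted bound.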
Is expression of GMF-gamma in these cell types consistent with the types of tissues in which it is most abundant? Asai found GMF-gamma at the highest levels in human lung, heart, and placenta. In the Life-Seq collection of 1432 cDNA libraries, we find it predominantly in hemic and immune tissues, and most specifically in hematopoietic cells, but detect it in every tissue category except stomatognathic (Table 1), and not in bone marrow samples. Macrophages are particularly abundant in lung and placenta. Coronary vascular epithelial cells perform phagocytosis and antigen presentation, and genes with which GMF-gamma is co-expressed, such as cathepsin S, are expressed at high levels in heart tissue. Thus, the tissues in which other researchers and we detect GMF-gamma are consistent with expression in phagocytic and antigen-processing cells. We would expect that GMF-gamma is absent from bone marrow because immature blood cells do not perform phagocytosis or antigen processing, and therefore do not express these genes. We would expect that GMF-gamma is present at low levels in virtually every other tissue because mature blood cells are present in virtually every tissue.
Tsuiki and colleagues cloned a rat gene with 91% amino acid identity to human GMF-gamma, which they named rGMF-gamma. They examined rGMF-gamma expression in eight rat tissue samples. Northern blots indicated the strongest expression in thymus, testis, and lung. Western blots indicated the strongest expression in spleen and thymus. Expression was low or absent in brain, skin, small intestine, and stomach. In rat testis, they found rGMF-gamma mRNA in spermatids, while in brain it was localized around pyramidal cells within CA3 of the hippocampus. Four phosphorylation sites that are targets of PKC, PKA, and other kinases in vitro are conserved in rGMF-beta and rGMF-gamma, and Tsuiki concurred with the view that these genes may be involved in cell differentiation and growth via signal transduction.
Nishiwaki and colleagues investigated the roles of rGMF-beta and rGMF-gamma in development and growth of the rat retina. They reported that rGMF-gamma is synthesized and localized mainly in Muller glial cells in the rat retina during fetal development. Earlier research has shown that retinal Muller glial cells are phagocytic, can express MHC class II determinants, and can function as antigen-presenting cells (40, 41).
Expression analysis provides hypotheses about the likely cell-specificity and function of GMF-gamma, but these hypotheses need confirmation in direct experiments. The primary utility of an expression database analysis is to suggest experiments that are most likely to be fruitful, thereby saving research time and expense.
The co-expression analysis makes several assumptions that are violated to greater or lesser degrees by aspects of the library selection and preparation. For example, libraries are not completely independent, because more than one library may be obtained from a single patient. Normalizing or subtracting makes the detection of associations between genes expressed at different levels more difficult. The cDNA libraries used in this analysis were prepared at different times and with different methods, and may not be consistent. The effects of different cDNA library samples, different normalization, different preparation methods, or preparation at different times are most likely to obscure true relationships. Such differences will make the calculated probability of association less accurate. However, it is unlikely that a pattern that is consistent across 1432 libraries, has good p-values, and is consistent with known biological relationships would be introduced by the cumulative random effects of such differences. Thus, co-expression analysis will yield false negatives, but it is unlikely to yield false positive results with a database of this size.
We have observed that human GMF-gamma is expressed predominantly in phagocytic cells, is absent from immature blood cells, and is co-expressed with phagocytosis and antigen processing genes. The evidence indicates that GMF-gamma is involved in phagocytosis and antigen presentation. Despite its name, Glia Maturation Factor gamma is unlikely to be a glia maturation factor. These results indicate that searches for genes with similar sequences should be supplemented with searches for genes with similar expression, to avoid incorrect predictions of putative gene function.

Methods

We examined the expression of GMF-gamma in 1432 human cDNA libraries from diverse anatomic and pathologic states. Some libraries were subtracted or normalized to enrich rare mRNA. Approximately 5000 cDNAs from each library were sequenced by gel electrophoresis, assembled, and aligned against known genes. All genes that were detected in at least five of the 1432 libraries were included in the analysis described here, which yielded 37,071 genes, gene fragments, or splice variants. To identify genes with an expression pattern similar to that of GMF-gamma, we performed co-expression analysis using the Guilt-by-Association (GBA) algorithm. Briefly, in a GBA analysis, we consider a gene to be present (expressed) in a library if cDNA corresponding to that gene is detected in the library. We consider a gene to be absent (not expressed) in a library when no cDNA for that gene is detected. For a given pair of genes, the co-expression data can be summarized in a two-by-two contingency table whose four entries count the libraries in which both genes are detected, neither gene is detected, only the first gene is detected, or only the second gene is detected. From the contingency table, we determine the probability that the co-expression occurs by chance using a chi-square test or a Fisher Exact test (43).
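The presence/absence procedure described above can be sketched in a few lines of Python. This is a minimal illustration and not the GBA implementation itself: it builds the two-by-two table from the sets of libraries in which each gene was detected, then computes a one-sided Fisher exact p-value from the hypergeometric distribution using only the standard library. The one-sided form, the function names, and the toy library identifiers are all assumptions made for the example; the Methods do not say which form of the test was used.

```python
from math import comb


def cooccurrence_table(libs_a, libs_b, n_libraries):
    """Two-by-two counts for a gene pair: (both, only A, only B, neither).

    libs_a and libs_b are sets of library identifiers in which each gene
    was detected; n_libraries is the total number of libraries examined.
    """
    both = len(libs_a & libs_b)
    only_a = len(libs_a - libs_b)
    only_b = len(libs_b - libs_a)
    neither = n_libraries - both - only_a - only_b
    return both, only_a, only_b, neither


def fisher_exact_greater(both, only_a, only_b, neither):
    """One-sided Fisher exact p-value (ASSUMPTION: one-sided test).

    Returns the probability, with the row and column totals held fixed,
    of seeing at least `both` libraries that contain both genes.
    """
    n = both + only_a + only_b + neither
    n_a = both + only_a  # libraries in which gene A is detected
    n_b = both + only_b  # libraries in which gene B is detected
    denominator = comb(n, n_b)
    numerator = sum(
        comb(n_a, k) * comb(n - n_a, n_b - k)
        for k in range(both, min(n_a, n_b) + 1)
    )
    # Integer true division stays accurate even when the binomials are huge.
    return numerator / denominator


# Toy data (hypothetical): 10 libraries, identified by the integers 0..9.
gene_a_libs = {0, 1, 2, 3}
gene_b_libs = {0, 1, 2, 4}
table = cooccurrence_table(gene_a_libs, gene_b_libs, 10)
print(table, fisher_exact_greater(*table))
```

Working with exact integer binomial coefficients avoids floating-point overflow for a collection of 1432 libraries, at the cost of some speed. A screen of one gene against tens of thousands of others would probably use an optimized statistics library instead.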
References (10 of 42 shown)

1.  Transmembrane-4 superfamily proteins associate with activated protein kinase C (PKC) and link PKC to specific beta(1) integrins.

Authors:  X A Zhang; A L Bontrager; M E Hemler
Journal:  J Biol Chem       Date:  2001-04-26       Impact factor: 5.157

2.  Expression of the protein kinase C substrate pleckstrin in macrophages: association with phagosomal membranes.

Authors:  J H Brumell; J C Howard; K Craig; S Grinstein; A D Schreiber; M Tyers
Journal:  J Immunol       Date:  1999-09-15       Impact factor: 5.422

3.  A role for the actin-bundling protein L-plastin in the regulation of leukocyte integrin function.

Authors:  S L Jones; J Wang; C W Turck; E J Brown
Journal:  Proc Natl Acad Sci U S A       Date:  1998-08-04       Impact factor: 11.205

4.  Cloning of a rat glia maturation factor-gamma (rGMFG) cDNA and expression of its mRNA and protein in rat organs.

Authors:  H Tsuiki; K Asai; M Yamamoto; K Fujita; Y Inoue; Y Kawai; T Tada; A Moriyama; Y Wada; T Kato
Journal:  J Biochem       Date:  2000-03       Impact factor: 3.387

5.  Expression of glia maturation factor during retinal development in the rat.

Authors:  A Nishiwaki; K Asai; T Tada; T Ueda; S Shimada; Y Ogura; T Kato
Journal:  Brain Res Mol Brain Res       Date:  2001-11-01

6.  T lymphocyte development in the absence of Fc epsilon receptor I gamma subunit: analysis of thymic-dependent and independent alpha beta and gamma delta pathways.

Authors:  H Heiken; R J Schulz; J V Ravetch; E L Reinherz; S Koyasu
Journal:  Eur J Immunol       Date:  1996-08       Impact factor: 5.532

7.  Upregulation of L-plastin gene by testosterone in breast and prostate cancer cells: identification of three cooperative androgen receptor-binding sequences.

Authors:  C S Lin; A Lau; C C Yeh; C H Chang; T F Lue
Journal:  DNA Cell Biol       Date:  2000-01       Impact factor: 3.311

8.  DNA binding of USF is required for specific E-box dependent gene activation in vivo.

Authors:  A Kiermaier; J M Gawn; L Desbarats; R Saffrich; W Ansorge; P J Farrell; M Eilers; G Packham
Journal:  Oncogene       Date:  1999-12-02       Impact factor: 9.867

9.  Human cathepsin S: chromosomal localization, gene structure, and tissue distribution.

Authors:  G P Shi; A C Webb; K E Foster; J H Knoll; C A Lemere; J S Munger; H A Chapman
Journal:  J Biol Chem       Date:  1994-04-15       Impact factor: 5.157

10.  Integrin LFA-1 interacts with the transcriptional co-activator JAB1 to modulate AP-1 activity.

Authors:  E Bianchi; S Denti; A Granata; G Bossi; J Geginat; A Villa; L Rogge; R Pardi
Journal:  Nature       Date:  2000-04-06       Impact factor: 49.962
