John M J Herbert, Dov Stekel, Sharon Sanderson, Victoria L Heath, Roy Bicknell.
Abstract
BACKGROUND: This study improves differential gene expression analysis using complementary DNA (cDNA) libraries in two ways: first, by introducing an accurate method of assigning Expressed Sequence Tags (ESTs) to genes, and second, by using a novel likelihood ratio statistic to score differential gene expression between two pools of cDNA libraries. These methods were applied to the latest available cell line and bulk tissue cDNA libraries in a two-step screen to predict novel tumour endothelial markers. First, endothelial cell lines were subtracted in silico from non-endothelial cell lines to identify endothelial genes. A second subtraction, bulk tumour versus normal tissue, was then used to predict tumour endothelial markers.
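The abstract does not state the form of the likelihood ratio statistic. Purely as an illustration of the idea, the sketch below scores one gene with a generic Poisson log-likelihood ratio (a G-test style statistic) comparing its EST counts in two pools. The function name, the example pool sizes, and the choice of this particular statistic are assumptions, not the authors' published method.

```python
import math

def log_likelihood_ratio(count_a: int, total_a: int,
                         count_b: int, total_b: int) -> float:
    """Generic Poisson log-likelihood ratio for one gene in two EST pools.

    Null hypothesis: the gene has the same per-EST rate in both pools.
    This is an illustrative statistic, not the paper's exact formula.
    """
    if count_a + count_b == 0:
        return 0.0  # gene not observed in either pool
    pooled_rate = (count_a + count_b) / (total_a + total_b)

    def term(observed: int, pool_size: int) -> float:
        # observed * ln(observed / expected); a zero count contributes nothing
        if observed == 0:
            return 0.0
        return observed * math.log(observed / (pool_size * pooled_rate))

    return 2.0 * (term(count_a, total_a) + term(count_b, total_b))

# Hypothetical pools of equal size: 9 ESTs in pool A, none in pool B.
score = log_likelihood_ratio(9, 1000, 0, 1000)
print(round(score, 3))
```

A larger score means the observed counts are less compatible with equal expression in the two pools. Turning scores into q-values requires a separate multiple-testing step that is not shown here.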
Year: 2008 PMID: 18394197 PMCID: PMC2346479 DOI: 10.1186/1471-2164-9-153
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. An overview of the EST to gene assignment process. Each EST sequence is BLAST searched against a RefSeq mRNA database, and the EST is assigned to the best-matching mRNA. In tandem, a mapping of all ESTs and RefSeq mRNAs to the human genome assigns ESTs to genes based on genome position. A decision tree makes the final assignment based on the quality of the alignment and the agreement between the two methods. If the genome position and the BLAST result agree, the EST is assigned. If they do not agree but the BLAST result is of high quality (> 92% and > 100 bp alignment), the EST is also assigned. For any other result, the EST is removed from the analysis.
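The decision tree in this caption can be sketched as a small function. This is one reading of the caption, not the authors' code: the function and parameter names are hypothetical, the "> 92%" threshold is assumed to refer to percent identity, and an EST with no BLAST hit is assumed to fall under "any other result".

```python
from typing import Optional

MIN_IDENTITY_PCT = 92.0  # BLAST match must exceed this (assumed to be % identity)
MIN_ALIGN_LEN_BP = 100   # BLAST alignment length must exceed this

def assign_est(blast_gene: Optional[str],
               genome_gene: Optional[str],
               identity_pct: float,
               align_len_bp: int) -> Optional[str]:
    """Return the gene assigned to an EST, or None if the EST is removed."""
    if blast_gene is None:
        # No BLAST hit: treated here as "any other result" (assumption).
        return None
    if blast_gene == genome_gene:
        # Both methods agree, so the EST is assigned.
        return blast_gene
    if identity_pct > MIN_IDENTITY_PCT and align_len_bp > MIN_ALIGN_LEN_BP:
        # Methods disagree, but the BLAST alignment is of high quality.
        return blast_gene
    return None

# Hypothetical inputs for illustration.
print(assign_est("TIE1", "TIE1", 85.0, 60))    # agreement
print(assign_est("TIE1", "CD34", 98.5, 250))   # disagreement, strong BLAST hit
print(assign_est("TIE1", "CD34", 90.0, 250))   # disagreement, weak BLAST hit
```

Because every EST ends with either exactly one gene or no gene, this kind of rule cannot produce the multi-gene assignments counted for the earlier method.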
A comparison of EST to gene assignment methods
| Method | Endothelial EST pool count | ESTs unambiguously assigned to a gene | ESTs assigned to more than 1 gene | % success rate for the total pool |
| Huminiecki and Bicknell [8] | 11,117 | 5,889 | 5,228 | 53 |
| Method described here | 11,117 | 10,153 | 0 | 91 |
A comparison of the two EST to gene assignment methods using the same data. The new method of EST to gene assignment improved accuracy, enabling a higher percentage of ESTs to be unambiguously assigned than with the Huminiecki and Bicknell method [8].
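The success rates in the table follow directly from the counts. The short check below reproduces them. The variable names are illustrative, and reading the unassigned remainder for the new method as ESTs removed by the decision tree is an inference from the Figure 1 caption rather than a statement in the table.

```python
POOL_SIZE = 11_117  # endothelial ESTs in the shared test pool

unambiguous = {
    "Huminiecki and Bicknell": 5_889,
    "Method described here": 10_153,
}

def success_rate(assigned: int, pool: int = POOL_SIZE) -> int:
    """Percentage of the pool unambiguously assigned, rounded to an integer."""
    return round(100 * assigned / pool)

rates = {method: success_rate(n) for method, n in unambiguous.items()}

# ESTs in the pool that the new method did not assign to any gene.
unassigned_new = POOL_SIZE - unambiguous["Method described here"]

print(rates)
print(unassigned_new)
```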
Endothelial specific genes found using the original data
| Gene | FDR q-value | Endothelial ESTs | Non-endothelial ESTs |
| ECSM2 | 0.0000 | 9 | 0 |
| TFPI | 0.0000 | 7 | 0 |
| MMRN1 | 0.0000 | 5 | 0 |
| TIE1 | 0.0000 | 5 | 0 |
| ACTA1 | 0.0000 | 5 | 0 |
| ECSM1 | 0.0002 | 4 | 0 |
| CD34 | 0.0002 | 4 | 0 |
| BMX | 0.0031 | 3 | 0 |
| LOC650049 | 0.0031 | 3 | 0 |
| APLN | 0.0031 | 3 | 0 |
| DUS4L | 0.0031 | 3 | 0 |
| FABP4 | 0.0031 | 3 | 0 |
| LOC643977 | 0.0031 | 3 | 0 |
| PAQR3 | 0.0031 | 3 | 0 |
A list of predicted endothelial specific genes using the new EST to gene assignment and statistical methods but with the original cDNA library data from Huminiecki and Bicknell [8]. 14 genes were predicted as significantly endothelial specific. A further 160 genes were predicted as showing significantly upregulated endothelial expression (q-value <= 0.01) (additional file 1) but were not endothelial specific (i.e. had EST hits in the non-endothelial pool). With the new analysis there was no longer a need to cross reference to SAGE libraries for accurate prediction.
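The distinction drawn in this caption between endothelial specific and merely upregulated genes can be written as a simple rule. The sketch assumes the q-value already reflects enrichment in the endothelial pool. The function name and category labels are hypothetical, and the third example uses made-up values.

```python
Q_THRESHOLD = 0.01  # significance cut-off quoted in the caption

def classify(q_value: float, non_endothelial_ests: int) -> str:
    """Label a gene from its FDR q-value and non-endothelial EST count."""
    if q_value > Q_THRESHOLD:
        return "not significant"
    if non_endothelial_ests == 0:
        return "endothelial specific"
    return "upregulated, not specific"

print(classify(0.0031, 0))  # BMX row from the table above
print(classify(0.3696, 0))  # above the threshold
print(classify(0.0010, 2))  # hypothetical gene with non-endothelial hits
```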
Comparison of methods with Table 7 from Huminiecki and Bicknell (2000)
| Gene | FDR q-value | Endothelial ESTs | Non-endothelial ESTs | Original UniGene ID |
| ECSM2 | | 9 | 0 | Hs.30089 |
| MMRN1 | | 5 | 0 | Hs.268107 |
| ECSM1 | | 4 | 0 | Hs.13957 |
| FABP4 | | 3 | 0 | Hs.83213 |
| RASIP1 | 0.3696 | 1 | 0 | Hs.233955 |
| RAMP2 | - | 0 | 0 | Hs.155106 |
| VWF | | 27 | 1 | Hs.110802 |
| CD93 (ECSM3) | | 4 | 1 | Hs.8135 |
| ROBO4 (ECSM4) | | 4 | 1 | Hs.111518 |
| CDH5 | | 4 | 1 | Hs.76206 |
| EDN1 | | 7 | 2 | Hs.2271 |
| SDPR | | 6 | 2 | Hs.26530 |
| PECAM1 | | 24 | 5 | Hs.78146 |
| EFEMP1 | | 40 | 8 | Hs.76224 |
| COL4A1 | 0.5598 | 4 | 16 | Hs.119129 |
| CTGF | | 30 | 49 | Hs.75511 |
Listing of the genes from Table 7 of our earlier work [8] and how they came out in the new analysis. 13 of the 16 genes were significantly endothelial; however, non-endothelial hits to known endothelial genes showed that the choice of non-endothelial cell lines could be improved. q-values in bold denote a significance threshold of <= 0.01.
Latest endothelial libraries available at GenBank
| New/Original | cDNA library | Count |
| Original | Stratagene endothelial cell 937223 | 7173 |
| Original | Aorta endothelial cells, TNF alpha-treated | 1908 |
| Original | Aorta endothelial cells | 1245 |
| Original | Human endothelial cells, large insert, pCMV expression library | 859 |
| Original | Umbilical vein endothelial cells II | 404 |
| Original | Human aortic endothelium | 20 |
| Original | HDMEC cDNA library | 12 |
| Original | Umbilical vein endothelial cells I | 9 |
| Original | Human endothelial cell (Y. Mitsui) | 3 |
| New | PUAEN2 | 9382 |
| New | Sugano cDNA library, coronary artery endothelial cell | 4707 |
| New | VESEN1 | 1316 |
| New | VESEN2 | 1173 |
| New | HEV PCR-select | 1049 |
| New | UMVEN2 | 433 |
| New | Human Endothelial cells | 346 |
| New | Sugano cDNA library, umbilical vein endothelial cell | 342 |
| New | PUAEN1 | 326 |
| New | UMVEN1 | 167 |
| New | CAE | 88 |
| New | Human umbilical vein Endothelial Cell cDNA library | 48 |
| New | Sugano cDNA library, endothelial cell | 28 |
| New | Human umbilical vein cord | 15 |
| New | IMS_CAS | 15 |
| New | Human umbilical venous cord | 12 |
| New | HUVEC cDNA Library | 12 |
| New | HUVEC Subtracted Library 1 | 8 |
| New | Plasmid subtractive library of human umbilical vein endothelial cells (HUVEC) stimulated by lipopolysaccharide | 8 |
| New | IMS_CAE | 4 |
| New | Homo sapiens umbilical vein | 2 |
| Total | | 31114 |
GenBank endothelial cDNA libraries that were used in this study. 21 new libraries have been submitted since our 2000 analysis [8]. The 30 combined libraries incorporate 31,114 endothelial ESTs.
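The library and EST totals quoted in this caption can be checked against the per-library counts in the table. The snippet below is only a consistency check, and the variable names are illustrative.

```python
# EST counts per library, copied from the table in row order.
original_counts = [7173, 1908, 1245, 859, 404, 20, 12, 9, 3]
new_counts = [9382, 4707, 1316, 1173, 1049, 433, 346, 342, 326, 167, 88,
              48, 28, 15, 15, 12, 12, 8, 8, 4, 2]

n_libraries = len(original_counts) + len(new_counts)
total_ests = sum(original_counts) + sum(new_counts)

print(len(new_counts), n_libraries, total_ests)
```

The new libraries contribute more ESTs than the original nine combined, which is the main reason the updated screen has more data to work with.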
Endothelial specific genes from cDNA library analysis with latest data
| Gene | FDR q-value | Endothelial ESTs | Non-endothelial ESTs |
| MMP1 | 0 | 203 | 0 |
| ROBO4 | 0 | 130 | 0 |
| SPARCL1 | 5.21E-70 | 97 | 0 |
| VWF | 1.33E-52 | 73 | 0 |
| HHIP | 6.58E-44 | 61 | 0 |
| C9orf26 | 1.60E-23 | 33 | 0 |
| RHOJ | 4.68E-22 | 31 | 0 |
| BMX | 2.45E-21 | 30 | 0 |
| ELTD1 | 1.26E-20 | 29 | 0 |
| MMRN1 | 1.84E-18 | 26 | 0 |
| EMCN | 5.17E-17 | 24 | 0 |
| CDH5 | 2.75E-16 | 23 | 0 |
| SOX7 | 3.94E-14 | 20 | 0 |
| ARHGAP24 | 1.03E-12 | 18 | 0 |
| FGD5, PCDH12 | 1.03E-12 | 18 | 0 |
| CD93 | 5.37E-12 | 17 | 0 |
| ERG, MYCT1 | 2.64E-11 | 16 | 0 |
| FLJ22746 | 1.28E-10 | 15 | 0 |
| SELE | 6.56E-10 | 14 | 0 |
| ANGPT2, TCF4 | 3.36E-09 | 13 | 0 |
| EDG1 | 1.68E-08 | 12 | 0 |
This table shows the top 24 genes from the 104 genes in the human genome with the most endothelial specific expression profile predicted by applying the new analysis to the latest cDNA libraries. The other 80 endothelial specific genes are in additional file 8.
Figure 2. Real-time PCR analysis of randomly chosen endothelial predicted genes across a range of cell types. Real-time PCR was carried out on the predicted endothelial genes ECSM2, MMP1, SOX18, ERG, RHOJ and APLN. The graphs illustrate the power of the bioinformatics models, as all genes examined were upregulated in or specific to HUVECs and/or HDMECs.
Figure 3. Real-time PCR analysis of randomly chosen endothelial predicted genes across a range of cell types. Real-time PCR was carried out on the predicted endothelial genes MMRN2, STAB1, LYL1, ELTD1, EFEMP1 and BMX. The graphs illustrate the power of the bioinformatics models, as all genes examined were upregulated in or specific to HUVECs and/or HDMECs.
The top nine tumour endothelial markers
| Gene | Product |
| SPHK1 | sphingosine kinase 1 isoform 2 |
| KCTD15 | potassium channel tetramerisation domain containing 15 |
| LRRC8C | factor for adipocyte differentiation 158 |
| PCDH12 | protocadherin 12 precursor |
| C12orf11 | hypothetical protein LOC55726 |
| ECSM2 | hypothetical protein LOC641700 |
| GBP4 | guanylate binding protein 4 |
| IKBKE | IKK-related kinase epsilon |
| MED28 | mediator of RNA polymerase II transcription, subunit 28 homolog |
This table lists the top nine tumour endothelial markers from the analyses.