Raj Chari, Kim M Lonergan, Raymond T Ng, Calum MacAulay, Wan L Lam, Stephen Lam.
Abstract
BACKGROUND: Lung cancer is the most common cause of cancer-related deaths. Tobacco smoke exposure is the strongest aetiological factor associated with lung cancer. In this study, using serial analysis of gene expression (SAGE), we comprehensively examined the effect of active smoking by comparing the transcriptomes of clinical specimens obtained from current, former and never smokers, and identified genes showing both reversible and irreversible expression changes upon smoking cessation.
Year: 2007 PMID: 17727719 PMCID: PMC2001199 DOI: 10.1186/1471-2164-8-297
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Demographics of subjects in study
| Subject | Sex | Age | Pack-years | Smoking status* | Years since quitting | FEV1 (% predicted) | Sample ID in previous study** |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Current 1 | F | 63 | 40 | CS | N/A | 69 | BE-13 |
| Current 2 | M | 56 | 62 | CS | N/A | 89 | BE-7 |
| Current 3 | F | 63 | 44 | CS | N/A | 96 | BE-12 |
| Current 4 | M | 68 | 81 | CS | N/A | 76 | BE-1 |
| Current 5 | M | 64 | 45 | CS | N/A | 73 | BE-2 |
| Current 6 | M | 66 | 53 | CS | N/A | 85 | - |
| Current 7 | M | 52 | 48.1 | CS | N/A | 63 | - |
| Current 8 | F | 55 | 34.4 | CS | N/A | 81 | - |
| Former 1 | M | 68 | 33 | FS | 19 | 50 | BE-3 |
| Former 2 | M | 69 | 100 | FS | 1 | 21 | BE-4A/4B |
| Former 3 | M | 68 | 30 | FS | 1 | 30 | BE-9 |
| Former 4 | M | 70 | 75 | FS | 17 | 76 | BE-5 |
| Former 5 | M | 67 | 55 | FS | 5 | N/A | BE-6 |
| Former 6 | M | 65 | 82 | FS | 10 | 59 | BE-10 |
| Former 7 | F | 56 | 64 | FS | 1.5 | 71 | BE-11A |
| Former 8 | F | 63 | 45 | FS | 4.5 | 83 | BE-14 |
| Former 9 | M | 72 | 40 | FS | 32 | 87 | BE-15 |
| Former 10 | F | 71 | 56 | FS | 16 | 58 | BE-16 |
| Former 11 | M | 72 | 63 | FS | 6 | N/A | BE-8B |
| Former 12 | M | 69 | 55.3 | FS | 21 | 57 | - |
| Never 1 | M | 58 | 0 | NS | N/A | 115 | - |
| Never 2 | F | 56 | 0 | NS | N/A | 104 | - |
| Never 3 | M | 53 | 0 | NS | N/A | N/A | - |
| Never 4 | F | 81 | 0 | NS | N/A | N/A | - |
* CS = Current Smoker, FS = Former Smoker, NS = Never Smoker
** A subset of samples was used in a previous study by Lonergan et al. 2006 [55]
Figure 1. (A) SAGE library statistics: summary statistics of the 24 SAGE libraries analyzed in this study. Mapping information was based on the May 10th, 2006 version of SAGEGenie [45]. In total, over 3,000,000 SAGE tags were sequenced, with over 110,000 unique tags represented after excluding super singleton tags (tags which have a count of 1 in a single library only). Approximately 75% of these 110,000 unique tags (potentially representing as many unique transcripts) mapped to an annotated UniGene cluster. As multiple SAGE tags frequently map to the same UniGene cluster, we identified a total of 25,653 distinct UniGene clusters within our dataset, approximately 68% of which represent previously characterized genes. Notably, 25% of the unique tags had no mapping, suggesting that a substantial part of the transcriptome remains unannotated. (B) Transcriptome Venn diagram: Venn diagram of the transcriptomes of current, former and never smokers. Reported is the number of tags expressed in every library of a group at a raw tag count greater than or equal to 2, representing the tags which are constitutively expressed in each set. Nearly 2000 SAGE tags, mapping to over 1700 genes, are common to all 24 SAGE libraries. The smaller number of never-smoker libraries may have contributed to the higher number of preferentially expressed transcripts in this group.
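The two filters described in this caption (excluding super singleton tags, and keeping tags with a raw count of at least 2 in every library of a group) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the dictionary layout and the toy counts are all assumptions introduced here.

```python
from collections import Counter

def drop_super_singletons(libraries):
    """Remove tags whose only occurrence is a count of 1 in a single library."""
    totals = Counter()
    for counts in libraries.values():
        for tag, n in counts.items():
            totals[tag] += n
    # A super singleton has a summed count of exactly 1 across all libraries.
    keep = {tag for tag, total in totals.items() if total > 1}
    return {name: {t: n for t, n in counts.items() if t in keep}
            for name, counts in libraries.items()}

def constitutive_tags(libraries, group, min_count=2):
    """Tags with a raw count >= min_count in every library of the group."""
    sets = [{t for t, n in libraries[name].items() if n >= min_count}
            for name in group]
    return set.intersection(*sets) if sets else set()

# Hypothetical raw counts, for illustration only.
toy_libraries = {
    "current_1": {"GGCCCAGGCC": 40, "TTAAAAATTC": 9, "AAAAAAAAAA": 1},
    "current_2": {"GGCCCAGGCC": 35, "TTAAAAATTC": 1},
    "never_1": {"GGCCCAGGCC": 3, "TTAAAAATTC": 2, "CCCCCCCCCC": 1},
}
filtered = drop_super_singletons(toy_libraries)
current_core = constitutive_tags(filtered, ["current_1", "current_2"])
```

In this toy input, `AAAAAAAAAA` and `CCCCCCCCCC` are dropped as super singletons, and only `GGCCCAGGCC` reaches a count of 2 or more in both current-smoker libraries.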
Figure 2. (A) Cluster analysis of current, former and never smokers: single-linkage hierarchical clustering using the 609 SAGE tags listed in Additional file 5, representing tags differentially expressed between current and never smokers. The distance measure used was Euclidean distance. The visualization package Genesis [23] was used for clustering. Green rectangles represent samples in which expression of the particular gene is low relative to the other samples, and red rectangles represent samples in which the gene is highly expressed relative to the other samples. (B) Principal component analysis of current, former and never smokers. Expression values were scaled to tags per million (TPM). Each tag was then normalized by dividing its value by the maximum value for that tag seen in all the libraries. This ratio was then multiplied by 6 and reduced by 3 to put the values in the range of -3 to 3. A covariance-based approach was used, implemented with the Statistics Toolbox in MATLAB (MathWorks). Current smokers are represented in red, former smokers in blue and never smokers in green.
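The scaling described for panel (B) can be written out in a few lines. The sketch below is an illustration only: the study used MATLAB, and the function names `to_tpm` and `scale_tag`, as well as the handling of a tag that is zero in every library, are assumptions introduced here.

```python
def to_tpm(raw_counts):
    """Scale one library's raw tag counts to tags per million (TPM)."""
    total = sum(raw_counts.values())
    return {tag: n * 1_000_000 / total for tag, n in raw_counts.items()}

def scale_tag(values):
    """Map one tag's TPM values across all libraries into the range -3 to 3.

    Each value is divided by the tag's maximum, multiplied by 6 and
    reduced by 3, so 0 maps to -3 and the maximum maps to 3.
    """
    peak = max(values)
    if peak == 0:
        # Assumption: a tag absent from every library is placed at the floor.
        return [-3.0 for _ in values]
    return [6.0 * v / peak - 3.0 for v in values]

# Hypothetical values for illustration.
tpm = to_tpm({"A": 1, "B": 3})
scaled = scale_tag([0, 50, 100])
```

With these inputs, the tag values 0, 50 and 100 map to -3, 0 and 3, which shows that the transform is a linear rescaling of each tag relative to its own maximum.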
Figure 3. Principal component analysis of current, former and never smokers using (A) the 161 tags deemed reversible upon smoking cessation (Additional file 6) and (B) the 152 tags deemed irreversible upon smoking cessation (Additional file 7). Expression values were scaled to tags per million (TPM). Each tag was then normalized by dividing its value by the maximum value for that tag seen in all the libraries. This ratio was then multiplied by 6 and reduced by 3 to put the values in the range of -3 to 3. A covariance-based approach was used, implemented with the Statistics Toolbox in MATLAB (MathWorks). Current smokers are represented in red, former smokers in blue and never smokers in green.
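As a rough illustration of what a covariance-based principal component analysis does, the sketch below computes the sample covariance between tags and extracts the first component by power iteration. Power iteration is a stand-in chosen here for brevity, not the routine in the MATLAB Statistics Toolbox; the function names and the toy matrix (rows are libraries, columns are scaled tag values) are hypothetical.

```python
import math

def covariance_matrix(rows):
    """Sample covariance between columns (tags); rows are libraries."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    return cov, centered

def first_component(cov, iterations=500):
    """Leading eigenvector of the covariance matrix by power iteration."""
    p = len(cov)
    # Uneven start vector; it must not be orthogonal to the leading eigenvector.
    vec = [1.0 / (j + 1) for j in range(p)]
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0:
            break
        vec = [x / norm for x in nxt]
    return vec

def project(centered, vec):
    """Score of each library on the given component."""
    return [sum(c[j] * vec[j] for j in range(len(vec))) for c in centered]

# Hypothetical scaled values for three libraries and three tags.
toy = [[-3.0, -1.5, 3.0], [0.0, 0.0, 0.0], [3.0, 1.5, -3.0]]
cov, centered = covariance_matrix(toy)
pc1 = first_component(cov)
scores = project(centered, pc1)
```

The sign of a principal component is arbitrary, so only the relative positions of the libraries along the component are meaningful.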
Reversible gene expression upon smoking cessation related to xenobiotic metabolism and DNA adduct formation (genes in bold have not been previously associated with smoking)
| SAGE tag | Gene symbol | Gene name | Current mean* | Former mean* | Never mean* | P-value |
| --- | --- | --- | --- | --- | --- | --- |
| GGCCCAGGCC | ALDH3A1 | Aldehyde dehydrogenase 3 family, member A1 | 4355 | 313 | 261 | 0.00002 |
| TTAAAAATTC | ADH7 | Alcohol dehydrogenase 7 (class IV) | 899 | 145 | 130 | 0.00002 |
| AGGTCTGCCA*** | AKR1C2 | Aldo-keto reductase family 1, member C2 | 547 | 116 | 74 | 0.00002 |
| AATGCTTTTA | CYP1B1 | Cytochrome P450, family 1, subfamily B, polypeptide 1 | 204 | 13 | 0 | 0.00002 |
| TTATCAAATC | NQO1 | NAD(P)H dehydrogenase, quinone 1 | 809 | 202 | 149 | 0.00003 |
| CAAATAAACC | PIR | Pirin (iron-binding nuclear protein) | 260 | 47 | 43 | 0.00003 |
| GGCCCCATTT | CBR1 | Carbonyl reductase 1 | 144 | 31 | 24 | 0.00003 |
| TATTTTTGTT | TXNRD1 | Thioredoxin reductase 1 | 250 | 88 | 78 | 0.00006 |
| GGTGGTGTCT | GPX2 | Glutathione peroxidase 2 (gastrointestinal) | 384 | 40 | 46 | 0.00011 |
| CAAGACCAGT | GSTA2 | Glutathione S-transferase A2 | 1436 | 485 | 528 | 0.00019 |
| GCTTGAATAA | AKR1B10 | Aldo-keto reductase family 1, member B10 (aldose reductase) | 332 | 10 | 15 | 0.0003 |
| GTGATGTAAG | SRXN1 | Sulfiredoxin 1 homolog (S. cerevisiae) | 63 | 14 | 11 | 0.0003 |
| TTTTCTGAAA | TXN | Thioredoxin | 698 | 326 | 212 | 0.00048 |
| CTTGCATAAG | CYP1A1 | Cytochrome P450, family 1, subfamily A, polypeptide 1 | 89 | 2 | 0 | 0.00048 |
| GCAAGAAGAG | ALDH3A1 | Aldehyde dehydrogenase 3 family, member A1 | 77 | 10 | 2 | 0.00048 |
| AGAACAAAAC | PRDX1 | Peroxiredoxin 1 | 1043 | 418 | 510 | 0.00071 |
| CGGCTGAATT | PGD | Phosphogluconate dehydrogenase | 252 | 104 | 80 | 0.00106 |
| AAGAGTTTTG | AKR1B1 | Aldo-keto reductase family 1, member B1 (aldose reductase) | 25 | 5 | 6 | 0.00141 |
| GCTGAGATGA** | CYP4F11 | Cytochrome P450, family 4, subfamily F, polypeptide 11 | 22 | 6 | 2 | 0.00141 |
| GGCGCCTCCT | TALDO1 | Transaldolase 1 | 232 | 76 | 94 | 0.00152 |
| ACATCCTAGG | ALDH1A1 | Aldehyde dehydrogenase 1 family, member A1 | 60 | 28 | 30 | 0.00216 |
| TTAGAAGGAA | NQO1 | NAD(P)H dehydrogenase, quinone 1 | 41 | 14 | 9 | 0.00216 |
| AGGTCTACCA | AKR1C2 | Aldo-keto reductase family 1, member C2 | 270 | 32 | 8 | 0.00292 |
| ATTAGGCCTG | TXNRD1 | Thioredoxin reductase 1 | 51 | 19 | 17 | 0.00297 |
| GAGAGCTTTG | AKR1C3 | Aldo-keto reductase family 1, member C3 | 149 | 21 | 22 | 0.00298 |
| TACGCTTGGT | CYB5R1 | Cytochrome b5 reductase 1 | 68 | 32 | 30 | 0.00298 |
| CACTGCCTTG | FTH1 | Ferritin, heavy polypeptide 1 | 59 | 23 | 17 | 0.00298 |
| CTGCTGCACT | GSR | Glutathione reductase | 126 | 54 | 50 | 0.0041 |
| ACCTTGGGGT | NQO1 | NAD(P)H dehydrogenase, quinone 1 | 73 | 19 | 6 | 0.0041 |
| AATGGAAACT | GCLM | Glutamate-cysteine ligase, modifier subunit | 34 | 16 | 9 | 0.03186 |
*Mean in tags per million (TPM)
** Changed mapping with TAGMapper [56]
***Tag maps with equal reliability to AKR1C1
Reversible gene expression upon smoking cessation related to mucus secretion (genes in bold have not been previously associated with smoking)
| SAGE tag | Gene symbol | Gene name | Current mean* | Former mean* | Never mean* | P-value |
| --- | --- | --- | --- | --- | --- | --- |
| CTCCACCCGA | TFF3 | Trefoil factor 3 (intestinal) | 4974 | 1978 | 1722 | 0.00019 |
| TTGGTTTTTG | CXCL6 | Chemokine (C-X-C motif) ligand 6 | 147 | 414 | 371 | 0.0003 |
| CCTATCAGTA | MSMB | Microseminoprotein, beta- | 15881 | 4405 | 2948 | 0.00048 |
| AGGGAGGCAG | SCGB1A1 | Secretoglobin, family 1A, member 1 | 135 | 473 | 436 | 0.0041 |
| TATCACATTC | CXCL6 | Chemokine (C-X-C motif) ligand 6 | 10 | 42 | 29 | 0.00573 |
| TTGCACCCTT | MSMB | Microseminoprotein, beta- | 71 | 16 | 9 | 0.00735 |
| AGCTTAATGA** | SCGB1A1 | Secretoglobin, family 1A, member 1 | 557 | 1478 | 3269 | 0.00956 |
| GAAAAAATAG | SCGB1A1 | Secretoglobin, family 1A, member 1 (uteroglobin) | 88 | 288 | 281 | 0.0124 |
| GACAAGGATG | CX3CL1 | Chemokine (C-X3-C motif) ligand 1 | 29 | 63 | 93 | 0.01586 |
*Mean in tags per million (TPM)
** Changed mapping with TAGMapper [56]
Figure 4. SAGE and quantitative PCR (qRT-PCR) analysis of select genes: (A) Genes found to have reversible expression upon smoking cessation. Box plots of SAGE data and histograms of qRT-PCR data for CABYR, ENTPD8 and TFF3. The distributions of ratios for both current vs. former and current vs. former and never (Additional file IV) were found to be statistically different. (B) Genes found to be either partially or fully irreversible. Box plots of SAGE data and histograms of qRT-PCR data for MUC5AC and GSK3B. The distributions of ratios for current vs. former and former vs. never were statistically different for MUC5AC; in addition, GSK3B was statistically significant for the combination of current and former vs. never. Box plot analysis was done using the Statistics Toolbox in MATLAB (MathWorks). Red lines in the boxes represent the median expression value in tags per million (TPM), and red "plus" signs represent outliers (values which are greater than 1.5 times the maximum value). The bottom and top of the boxes represent the 2nd and 3rd quartiles of the data, respectively. The error bars represent the 5th and 95th percentiles of the data. Quantitative RT-PCR validation was performed on a second cohort of nine current smokers, seven former smokers and six never smokers. Plotted is the average expression ratio, relative to the average expression in never smokers, of current (red), former (blue) and never (green) smokers. Statistical significance was determined using a one-tailed p-value from the Mann-Whitney U test (Supplemental Table IX).
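For cohorts as small as those in the validation set, a one-tailed Mann-Whitney U test can be computed exactly by enumerating every way of assigning the pooled ranks to the first group. The sketch below illustrates that idea under stated assumptions: the caption does not say whether an exact or an approximate p-value was used, and the function names and expression ratios shown are hypothetical.

```python
from itertools import combinations

def midranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def mann_whitney_greater(x, y):
    """U statistic of x and exact one-tailed p-value for 'x tends to exceed y'."""
    ranks = midranks(list(x) + list(y))
    n1, n2 = len(x), len(y)
    offset = n1 * (n1 + 1) / 2
    u_obs = sum(ranks[:n1]) - offset
    hits = total = 0
    # Every choice of n1 pooled ranks is equally likely under the null hypothesis.
    for combo in combinations(range(n1 + n2), n1):
        total += 1
        if sum(ranks[i] for i in combo) - offset >= u_obs:
            hits += 1
    return u_obs, hits / total

# Hypothetical expression ratios, for illustration only.
current = [5.1, 6.3, 7.0]
never = [1.2, 2.4, 3.3]
u_stat, p_one_tailed = mann_whitney_greater(current, never)
```

Groups of nine and seven samples give 11,440 possible rank assignments, so full enumeration stays cheap at the cohort sizes reported here.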
Figure 5. Expression trends of specific genes related to muco-ciliary function and airway restitution as compared with smoking status and lung cancer: TFF3, CABYR, and MUC5AC are overexpressed in current smokers, with lower expression in both former and never smokers. Conversely, SCGB1A1 shows the opposite effect, with lower expression in current smokers than in former and never smokers. MUC5AC and TFF3 are known to be components of mucus. EGFR levels are positively correlated with smoking status, with modestly higher levels in current smokers. MUC5AC and EGF have been shown to interact with EGFR in the process of airway restitution, and SCGB1A1 has been shown to decrease levels of cyclooxygenase 2 (COX2) in cancer cells. Interestingly, within this process alone, we see reversible (TFF3, CABYR), partially reversible (MUC5AC) and completely irreversible (GSK3B) expression changes upon smoking cessation. Values refer to tag counts as tags per million (TPM).