Sathya B1, Akila Parvathy Dharshini2, Gopal Ramesh Kumar2.
Abstract
High-throughput sequencing of RNA (RNA-Seq) was developed primarily to analyze global gene expression in different tissues. It is also an efficient way to discover coding SNPs, and it is particularly effective for SNP identification when multiple individuals with different genetic backgrounds are used. The objective of this study was to discover SNPs and INDELs in the human airway transcriptome of healthy never smokers, healthy current smokers, smokers without lung cancer and smokers with lung cancer. A preliminary comparative analysis of these four data sets is expected to yield SNP and INDEL patterns responsible for lung cancer. A total of 85,028 SNPs and 5738 INDELs in healthy never smokers, 32,671 SNPs and 1561 INDELs in healthy current smokers, 50,205 SNPs and 3008 INDELs in smokers without lung cancer and 51,299 SNPs and 3138 INDELs in smokers with lung cancer were identified. SNPs and INDELs in genes previously reported as differentially expressed were also analyzed. Smokers were found to have SNPs at positions 62,186,542 and 62,190,293 in the SCGB1A1 gene and at positions 180,017,251, 180,017,252 and 180,017,597 in the SCGB3A1 gene, and INDELs at position 35,871,168 in the NFKBIA gene and position 180,017,797 in the SCGB3A1 gene. The SNPs identified in this study provide a resource for genetic studies in smokers and should contribute to the development of personalized medicine. This study is only preliminary, and more vigorous data analysis and wet-lab validation are required.
Keywords: Airway transcriptome; INDEL; Lung cancer; Next generation sequencing (NGS); SNP; Secretoglobin
Year: 2014 PMID: 26937342 PMCID: PMC4745382 DOI: 10.1016/j.atg.2014.12.003
Source DB: PubMed Journal: Appl Transl Genom ISSN: 2212-0661
Fig. 1 Schematic representation of the workflow of the current study.
Details of samples used for NGS data analysis.
| Data accession | Category | Details | Age | Status |
|---|---|---|---|---|
| SRR192333 | Healthy never smokers | 3 female | ~ 29 | Healthy normal |
| SRR192334 | Current smokers | 1 male 2 female | ~ 41.7 | Smokers |
| SRR192335 | Smokers without lung cancer | 1 male 2 female | ~ 49 | 2 former and 1 current smokers |
| SRR192336 | Smokers with lung cancer | 2 male 1 female | ~ 64.7 | 2 former and 1 current smokers with cancer |
List of differentially expressed genes in diverse samples.
| Gene symbol | Gene name | Location | Gene expression/sample |
|---|---|---|---|
| S100A8 | Calgranulin A | Chr1: 153,362,50–153,363,664 | Upregulation in smokers |
| S100A9 | Calgranulin B | Chr1: 153,330,330–153,333,503 | Upregulation in smokers |
| CYP4F2 | Cytochrome p450 | Chr19: 15,988,834–16,008,885 | Upregulation in smokers |
| NFKBIA | NFKB inhibitor | Chr14: 35,870,717–35,873,952 | Upregulation in smokers with lung cancer |
| SCGB1A1 | Secretoglobin | Chr11: 62,172,575–62,190,667 | Differentially expressed in smokers with lung cancer |
| SCGB3A1 | Secretoglobin | Chr5: 180,017,103–180,018,540 | Upregulation in smokers with lung cancer |
| CCL20 | Chemokine | Chr2: 228,678,558–228,682,272 | Upregulation in smokers with lung cancer |
| IL8 | Interleukin 8 | Chr4: 74,606,223–74,609,433 | Upregulation in smokers with lung cancer |
| RP11-295J3.2 | ncRNA | Chr10: 127,660,757–127,661,695 | Downregulation in smokers with lung cancer and smokers |
| CTD-2325P2.2 | Pseudogene | Chr14: 69,159,807–69,160,300 | Upregulation in smokers with lung cancer, downregulation in smokers |
Number of SNPs present in four categories.
| Category | SNP | Transition | Transversion | Ti/Tv ratio |
|---|---|---|---|---|
| Healthy never smokers | 85,028 | 55,314 | 29,714 | 1.86 |
| Healthy current smokers | 32,671 | 21,185 | 11,486 | 1.84 |
| Smokers without lung cancer | 50,205 | 32,820 | 17,385 | 1.88 |
| Smokers with lung cancer | 51,299 | 33,063 | 18,236 | 1.81 |
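
The Ti/Tv column can be recomputed from the transition and transversion counts. The following is a minimal sketch, not the authors' pipeline: the helper names `is_transition` and `ti_tv_ratio` are hypothetical, and the counts are copied from the table above. A transition is a purine-to-purine (A/G) or pyrimidine-to-pyrimidine (C/T) substitution; any other single-base change is a transversion.

```python
# Illustrative sketch: helper names are hypothetical; counts come from the table above.
PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def is_transition(ref: str, alt: str) -> bool:
    """Return True for A<->G or C<->T substitutions, False for transversions."""
    pair = {ref.upper(), alt.upper()}
    if len(pair) != 2 or not pair <= PURINES | PYRIMIDINES:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    return pair <= PURINES or pair <= PYRIMIDINES


def ti_tv_ratio(transitions: int, transversions: int) -> float:
    """Ratio of transition count to transversion count."""
    return transitions / transversions


# (transitions, transversions) per category, as reported in the table.
COUNTS = {
    "Healthy never smokers": (55314, 29714),
    "Healthy current smokers": (21185, 11486),
    "Smokers without lung cancer": (32820, 17385),
    "Smokers with lung cancer": (33063, 18236),
}

ratios = {cat: ti_tv_ratio(ti, tv) for cat, (ti, tv) in COUNTS.items()}

for cat, ratio in ratios.items():
    print(f"{cat}: Ti/Tv = {ratio:.3f}")
```

The recomputed ratios agree with the reported column to within 0.01, and in every row the transition and transversion counts sum to the SNP total.
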
Number of known and novel SNPs/INDELs present in all four categories.
| Category | Total SNPs | Known SNPs | Novel SNPs | Total INDELs | Known INDELs | Novel INDELs |
|---|---|---|---|---|---|---|
| Healthy never smokers | 85,028 | 37,635 | 47,393 | 5738 | 2305 | 3433 |
| Healthy current smokers | 32,671 | 14,396 | 18,275 | 1561 | 654 | 910 |
| Smokers without lung cancer | 50,205 | 21,914 | 28,291 | 3008 | 1300 | 1708 |
| Smokers with lung cancer | 51,299 | 21,451 | 29,848 | 3138 | 1363 | 1775 |
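
The share of novel variants in each category follows directly from these counts. The sketch below is illustrative only: the names `TABLE`, `novel_fraction` and `find_mismatches` are hypothetical, and the numbers are copied from the table above as (total, known, novel). It also checks that the known and novel counts add up to the reported total.

```python
# Illustrative sketch: names are hypothetical; values are (total, known, novel)
# as printed in the table above.
TABLE = {
    "Healthy never smokers": {"SNP": (85028, 37635, 47393), "INDEL": (5738, 2305, 3433)},
    "Healthy current smokers": {"SNP": (32671, 14396, 18275), "INDEL": (1561, 654, 910)},
    "Smokers without lung cancer": {"SNP": (50205, 21914, 28291), "INDEL": (3008, 1300, 1708)},
    "Smokers with lung cancer": {"SNP": (51299, 21451, 29848), "INDEL": (3138, 1363, 1775)},
}


def novel_fraction(total: int, novel: int) -> float:
    """Fraction of variants that are novel, i.e. not previously reported."""
    return novel / total


def find_mismatches(table):
    """List (category, variant type, reported total, known + novel) where the sums disagree."""
    return [
        (category, kind, total, known + novel)
        for category, kinds in table.items()
        for kind, (total, known, novel) in kinds.items()
        if known + novel != total
    ]


for category, kinds in TABLE.items():
    for kind, (total, known, novel) in kinds.items():
        print(f"{category} {kind}: {novel_fraction(total, novel):.1%} novel")

for row in find_mismatches(TABLE):
    print("sum mismatch:", row)
```

With the values as printed, the check flags one row: the INDEL counts for healthy current smokers (654 + 910 = 1564, against a reported total of 1561). The source does not indicate which of the three figures is off.
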
SNPs/INDELs present in exonic/functional regions.
| Category | Synonymous-SNP | Non-synonymous-SNP | Frameshift deletion | Frameshift insertion | Non-frameshift deletion | Non-frameshift insertion |
|---|---|---|---|---|---|---|
| Healthy never smokers | 3116 | 3234 | 71 | 230 | 10 | 24 |
| Healthy current smokers | 1419 | 1429 | 23 | 71 | 4 | 9 |
| Smokers without lung cancer | 1830 | 2061 | 46 | 107 | 4 | 19 |
| Smokers with lung cancer | 1902 | 2058 | 48 | 116 | 5 | 13 |
Number of SNPs and INDELs present in differentially expressed genes.
| Category | SNPs | INDELs |
|---|---|---|
| Healthy never smokers | 39 | 14 |
| Healthy current smokers | 29 | 3 |
| Smokers without lung cancer | 27 | 7 |
| Smokers with lung cancer | 43 | 10 |
SNPs and INDELs present in different categories.
| Chr | Position | dbSNP | Ref/Alt | Category | Location | Gene | Type |
|---|---|---|---|---|---|---|---|
| 1 | 153,333,376 | Novel | A/C | CS + SNL + SL | UTR3 | S100A9 | NA |
| 4 | 74,606,669 | rs2227307 | T/G | CS + SNL + SL | Intronic | IL8 | NA |
| 1 | 153,362,719 | Novel | T/TC | CS + SNL + SL | Intronic | S100A8 | Frameshift insertion |
| 4 | 74,607,910 | rs2227543 | C/T | SNL + SL | Intronic | IL8 | NA |
| 4 | 74,608,162 | Novel | T/A | SNL + SL | Intronic | IL8 | NA |
| 4 | 74,608,163 | Novel | T/G | SNL + SL | Intronic | IL8 | NA |
| 4 | 74,608,408 | Novel | C/CA | SNL + SL | UTR3 | IL8 | NA |
| 11 | 62,186,542 | rs3741240 | G/A | SL | UTR5 | SCGB1A1 | NA |
| 11 | 62,190,293 | rs191704193 | G/A | SL | Intronic | SCGB1A1 | NA |
| 5 | 180,017,251 | Novel | C/A | SL | Exonic | SCGB3A1 | Nonsynonymous SNV |
| 5 | 180,017,252 | Novel | C/G | SL | Exonic | SCGB3A1 | Synonymous SNV |
| 5 | 180,017,597 | Novel | T/C | SL | Intronic | SCGB3A1 | NA |
| 14 | 35,871,168 | Novel | C/CT | SL | UTR3 | NFKBIA | NA |
| 5 | 180,017,797 | Novel | G/GA | SL | Exonic | SCGB3A1 | Frameshift insertion |
NA: not available; CS: current smokers; SNL: smokers with no lung cancer; SL: smokers with lung cancer.
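
The Category column encodes which sample groups carry each variant, so variants seen only in smokers with lung cancer can be selected by splitting that field. This is a minimal sketch under stated assumptions: the `Variant` class and its field names are hypothetical, the rows are transcribed from the table above, and a variant is treated as an INDEL whenever its Ref and Alt alleles differ in length. The gene on chromosome 14 is written as NFKBIA, the spelling used in the abstract.

```python
from dataclasses import dataclass


# Illustrative sketch: the class and field names are hypothetical.
@dataclass(frozen=True)
class Variant:
    chrom: str
    pos: int
    ref: str
    alt: str
    category: str  # e.g. "CS + SNL + SL"
    gene: str

    @property
    def groups(self) -> frozenset:
        """Sample groups carrying the variant, parsed from the Category field."""
        return frozenset(part.strip() for part in self.category.split("+"))

    @property
    def kind(self) -> str:
        """'SNP' for a single-base substitution, otherwise 'INDEL'."""
        return "SNP" if len(self.ref) == len(self.alt) == 1 else "INDEL"


# Rows transcribed from the table above.
VARIANTS = [
    Variant("1", 153333376, "A", "C", "CS + SNL + SL", "S100A9"),
    Variant("4", 74606669, "T", "G", "CS + SNL + SL", "IL8"),
    Variant("1", 153362719, "T", "TC", "CS + SNL + SL", "S100A8"),
    Variant("4", 74607910, "C", "T", "SNL + SL", "IL8"),
    Variant("4", 74608162, "T", "A", "SNL + SL", "IL8"),
    Variant("4", 74608163, "T", "G", "SNL + SL", "IL8"),
    Variant("4", 74608408, "C", "CA", "SNL + SL", "IL8"),
    Variant("11", 62186542, "G", "A", "SL", "SCGB1A1"),
    Variant("11", 62190293, "G", "A", "SL", "SCGB1A1"),
    Variant("5", 180017251, "C", "A", "SL", "SCGB3A1"),
    Variant("5", 180017252, "C", "G", "SL", "SCGB3A1"),
    Variant("5", 180017597, "T", "C", "SL", "SCGB3A1"),
    Variant("14", 35871168, "C", "CT", "SL", "NFKBIA"),
    Variant("5", 180017797, "G", "GA", "SL", "SCGB3A1"),
]

# Variants reported only in smokers with lung cancer (SL).
sl_only = [v for v in VARIANTS if v.groups == {"SL"}]

for v in sl_only:
    print(f"chr{v.chrom}:{v.pos} {v.ref}/{v.alt} {v.gene} ({v.kind})")
```

Under these assumptions, the SL-only subset matches the positions listed in the abstract: five SNPs in SCGB1A1 and SCGB3A1, and two INDELs in NFKBIA and SCGB3A1.
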