Fan Zhang, Jake Y Chen.
Abstract
BACKGROUND: Clinical proteomics aims to solve a specific clinical problem within the context of a clinical study. It has grown rapidly in biomarker discovery, especially in cancer diagnostics. Until recently, protein isoforms had not been viewed as a new class of early diagnostic biomarkers for clinical proteomics. A protein isoform is one of several different forms of the same protein, which may arise from single-nucleotide polymorphisms (SNPs), alternative splicing, or post-translational modifications (PTMs). Previous studies have shown that protein isoforms play critical roles in tumorigenesis, disease diagnosis, and prognosis. Identifying and characterizing protein isoforms is therefore essential to studying the molecular mechanisms and early detection of complex diseases such as breast cancer. However, traditional methods for protein isoform determination, such as EST sequencing, microarray profiling (exon arrays, exon-exon junction arrays), and mRNA next-generation sequencing, have limitations: 1) they do not measure at the protein level, 2) they provide no information on the connectivity of nonadjacent exons, 3) they provide no information on SNPs and PTMs, and 4) they have low reproducibility. Moreover, clinical proteomics studies face computational challenges: 1) low instrument sensitivity, 2) high data noise, and 3) high variability and low repeatability. These challenges persist even though recent advances in clinical proteomics technology, notably LC-MS/MS proteomics, have been used to identify candidate molecular biomarkers in a diverse range of samples, including cells, tissues, serum/plasma, and other body fluids.
Year: 2016 PMID: 27557076 PMCID: PMC5001247 DOI: 10.1186/s12864-016-2907-8
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Statistics of the peptide database
| Peptide Type | Number of Peptides | E_E Type | Number of Peptides |
|---|---|---|---|
| EXON_KB | 208269 | Normal E_E | 367956 |
| E_E_KB | 222731 | Skipping E_E | 3963972 |
| E_E_TH | 4109197 | Peptide Length (aa) | |
| E_I_TH | 413761 | Longest Exon | 6057 |
| I_E_TH | 106864 | Average Exon | 48 |
| | | Longest Junction | 140 |
| Total | 5060822 | Average Junction | 64 |
Among the 5060822 peptides in total, hypothetical exon-exon junctions (E_E_TH) account for the largest proportion and hypothetical intron-exon junctions (I_E_TH) for the smallest. Most exon-exon junctions are exon-skipping junctions, while a minority are normal. The average lengths are 64 aa for junctions and 48 aa for exons; the maximum lengths are 140 aa and 6057 aa, respectively. The peptide types are exon regions (EXON_KB), annotated exon-exon junctions (E_E_KB), hypothetical exon-exon junctions (E_E_TH), hypothetical exon-intron junctions (E_I_TH), and hypothetical intron-exon junctions (I_E_TH).
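The proportions in this caption can be recomputed directly from the table counts. The following is a minimal sketch, not part of the original analysis; the variable names are illustrative, and the counts are copied from the table above.

```python
# Peptide counts copied from the table (illustrative variable names).
peptide_counts = {
    "EXON_KB": 208269,
    "E_E_KB": 222731,
    "E_E_TH": 4109197,
    "E_I_TH": 413761,
    "I_E_TH": 106864,
}
exon_exon_counts = {"Normal E_E": 367956, "Skipping E_E": 3963972}

total = sum(peptide_counts.values())

# Share of each peptide type in the whole database, largest first.
shares = sorted(
    ((name, count / total) for name, count in peptide_counts.items()),
    key=lambda item: item[1],
    reverse=True,
)
for name, share in shares:
    print(f"{name:8s} {share:6.1%}")

# Exon-exon junctions (annotated plus hypothetical) split into normal and skipping.
exon_exon_total = peptide_counts["E_E_KB"] + peptide_counts["E_E_TH"]
for name, count in exon_exon_counts.items():
    print(f"{name:13s} {count / exon_exon_total:6.1%}")
```

Running the sketch prints each peptide type's share of the database and the normal versus skipping split of the exon-exon junctions, which makes it easy to cross-check the caption against the table.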
Fig. 1 Heatmap of 90 alternative splicing isoform markers differentiating the normal and cancer samples of Study II. The x-axis shows the 90 alternative splicing isoform markers. The y-axis shows the cancer and normal samples ordered by unsupervised clustering, with cancer samples at the top and normal samples at the bottom (H, healthy, green; C, cancer, blue). Red squares indicate presence and white squares absence.
Fig. 2 Five splicing types. Red, blue, and green boxes are exons. Pink boxes are retained introns. Black lines are introns.
Number of alternative splicing and normal markers in the normal and cancer samples
| | Healthy | Cancer | Total |
|---|---|---|---|
| Alternative Splicing | 7 | 60 | 67 |
| Normal | 22 | 1 | 23 |
| Total | 29 | 61 | 90 |
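Whether marker type and sample class are associated in this table can be checked with a standard Pearson chi-square test on the 2x2 counts. The sketch below is an illustration only: the source does not state that this test was applied to this table, and the function name `chi_square_2x2` is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for [[a, b], [c, d]].

    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the chi-square survival function
    # equals erfc(sqrt(x / 2)), so no external library is needed.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Counts from the table: rows = marker type, columns = (healthy, cancer).
stat, p = chi_square_2x2(7, 60, 22, 1)
print(f"chi-square = {stat:.2f}, p = {p:.3g}")
```

The uncorrected Pearson statistic is used here for simplicity; with small expected cell counts, Fisher's exact test or a continuity correction would be a more conservative choice.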
Fig. 3 Densities of genes with a single transcript and with multiple transcripts across the whole genome, Study II’s markers, and Study III’s markers. Alternative splicing isoform markers appear more likely to be found for genes with two or more transcript variants encoding different isoforms than for genes with only one transcript (chi-square P value = 1.35e-11 between the genome and Study II’s markers; chi-square P value = 0 between the genome and Study III’s markers).
Fig. 4 Heatmap of 26 alternative splicing isoform markers from Study II differentiating the normal and cancer samples of Study III. The x-axis shows the 26 alternative splicing isoform markers from Study II. The y-axis shows the cancer and normal samples in Study III ordered by unsupervised clustering, with healthy samples at the top and cancer samples at the bottom. Prediction results are shown in green for healthy and blue for cancer. Red squares indicate presence and white squares absence.