Matthew N Bainbridge, René L Warren, Martin Hirst, Tammy Romanuik, Thomas Zeng, Anne Go, Allen Delaney, Malachi Griffith, Matthew Hickenbotham, Vincent Magrini, Elaine R Mardis, Marianne D Sadar, Asim S Siddiqui, Marco A Marra, Steven J M Jones.
Abstract
BACKGROUND: High throughput sequencing-by-synthesis is an emerging technology that allows the rapid production of millions of bases of data. Although the sequence reads are short, they can readily be used for re-sequencing. By re-sequencing the mRNA products of a cell, one may rapidly discover polymorphisms and splice variants particular to that cell.
Year: 2006 PMID: 17010196 PMCID: PMC1592491 DOI: 10.1186/1471-2164-7-246
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Summary analysis of EST to human transcriptome/genome mapping
| Map Type | Count |
| Total | 181,279 |
Figure 1. A histogram showing the number of gene loci hit by a given number of ESTs.
Top 10 most abundant transcripts in androgen-stimulated LNCaP cells by EST count.
| EST count | Ensembl gene ID | Description |
| 22377 | ENSG00000198899* | ATP synthase a chain (EC 3.6.3.14) (ATPase protein 6). |
| 7595 | ENSG00000198916* | No description |
| 3200 | ENSG00000148341 | SH3 domain GRB2-like protein B2 (Endophilin B2). |
| 2112 | ENSG00000198804* | Cytochrome c oxidase subunit 1 (EC 1.9.3.1) (Cytochrome c oxidase polypeptide I). |
| 1678 | ENSG00000198886* | NADH-ubiquinone oxidoreductase chain 4 (EC 1.6.5.3) (NADH dehydrogenase subunit 4). |
| 1628 | ENSG00000198938* | Cytochrome c oxidase subunit 3 (EC 1.9.3.1) (Cytochrome c oxidase polypeptide III). |
| 1392 | ENSG00000186063 | No description |
| 1311 | ENSG00000198763* | NADH-ubiquinone oxidoreductase chain 2 (EC 1.6.5.3) (NADH dehydrogenase subunit 2). |
| 1201 | ENSG00000170421 | Keratin, type II cytoskeletal 8 (Cytokeratin 8) (K8) (CK 8). |
| 1088 | ENSG00000198744* | No description |
* indicates mitochondrial genes
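As a rough illustration of how strongly mitochondrial genes dominate this list, the following sketch sums the EST counts in the table above and reports the share attributed to the starred (mitochondrial) genes. The counts and gene IDs are copied from the table; the tuple layout and the function name `mitochondrial_share` are illustrative assumptions, not part of the original analysis.

```python
# EST counts and Ensembl gene IDs copied from the top-10 table above.
# The third field is True for rows flagged with "*" (mitochondrial genes).
top10 = [
    (22377, "ENSG00000198899", True),
    (7595, "ENSG00000198916", True),
    (3200, "ENSG00000148341", False),
    (2112, "ENSG00000198804", True),
    (1678, "ENSG00000198886", True),
    (1628, "ENSG00000198938", True),
    (1392, "ENSG00000186063", False),
    (1311, "ENSG00000198763", True),
    (1201, "ENSG00000170421", False),
    (1088, "ENSG00000198744", True),
]


def mitochondrial_share(rows):
    """Return (mito_total, overall_total, fraction) for (count, gene_id, is_mito) rows."""
    mito = sum(count for count, _, is_mito in rows if is_mito)
    total = sum(count for count, _, _ in rows)
    return mito, total, mito / total


mito, total, share = mitochondrial_share(top10)
print(f"{mito} of {total} top-10 ESTs ({share:.1%}) map to mitochondrial genes")
```

The share is relative to the ten listed genes only, not to the full set of mapped ESTs.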
Figure 2. A histogram showing the start of EST alignments to human transcript sequences (length > 500). Position is given as a percentage of the length of the transcript. ESTs which align to the positive or negative strands of the cDNA are shown in light or dark grey, respectively.
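The normalisation described in the Figure 2 caption can be sketched as follows: each alignment start is expressed as a percentage of transcript length, transcripts of 500 bases or fewer are skipped, and counts are kept separately per strand. The input layout (0-based start, transcript length, strand), the 10% bin width, and the function names are assumptions made for illustration; the binning used for the published figure is not stated here.

```python
def start_percent(align_start, transcript_length):
    """Alignment start (0-based) as a percentage of transcript length."""
    return 100.0 * align_start / transcript_length


def start_histogram(alignments, min_length=500, bin_width=10):
    """Bin alignment starts by relative position, separately for each strand.

    alignments: iterable of (align_start, transcript_length, strand),
    where strand is '+' or '-'. bin_width should divide 100 evenly.
    Transcripts with length <= min_length are ignored.
    """
    n_bins = 100 // bin_width
    hist = {"+": [0] * n_bins, "-": [0] * n_bins}
    for start, length, strand in alignments:
        if length <= min_length:
            continue
        pct = start_percent(start, length)
        # Clamp so that a start at exactly 100% falls in the last bin.
        bin_index = min(int(pct // bin_width), n_bins - 1)
        hist[strand][bin_index] += 1
    return hist


# Hypothetical alignments, not taken from the paper's data.
example = [
    (0, 1000, "+"),    # 0% of the transcript, positive strand
    (950, 1000, "+"),  # 95%, positive strand
    (600, 1200, "-"),  # 50%, negative strand
    (10, 400, "+"),    # skipped: transcript not longer than 500
]
hist = start_histogram(example)
print(hist["+"])
print(hist["-"])
```

Normalising by transcript length puts transcripts of different sizes on a common axis, which is what makes positional bias in the reads visible in a single histogram.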
25 novel alternative splicing events in androgen-stimulated LNCaP cells
| Transcript ID | Gene | Description | Splicing event |
| ENST00000248342 | eIF3k | Eukaryotic translation initiation factor 3 subunit 12 | 25 bp deletion of 3' end of exon 1 |
| ENST00000207437 | MLEY_HUMAN | Myosin light chain 1, slow-twitch muscle A isoform | 64 bp deletion of 5' end of exon 2 (contained in 5' UTR) |
| ENST00000330964 | RPS27L | 40S ribosomal protein S27-like protein | Deletion of retained intron in exon 1 (54 bp of coding sequence) |
| ENST00000358666 | UBL5 | Ubiquitin-like protein 5 | 121 bp deletion of 3' end of exon 1 (contained in 5' UTR) |
| ENST00000262746 | PRDX1 | Peroxiredoxin 1 | 227 bp deletion of 3' end of exon 1 (contained in 5' UTR) |
| ENST00000297290 | BRI3 | Brain protein I3 | See Figure 3. |
| ENST00000341480 | MED18 | Mediator of RNA polymerase II transcription, subunit 18 homolog | Deletion of retained intron in exon 3 (entirely contained in 3' UTR) |
| ENST00000270799 | RPL11 | 60S ribosomal protein L11 | 89 bp deletion of 3' end of exon 2 and 103 bp deletion of 5' end of exon 5 |
| ENST00000303553 | NDUFA3 | NADH-ubiquinone oxidoreductase B9 subunit | Deletion of 45 bp retained intron in exon 4 |
| ENST00000222673 | OGDH (MTpc) | 2-oxoglutarate dehydrogenase E1 component, mitochondrial precursor | Deletion of 210 bp retained intron (contained in 3' UTR) |
| ENST00000361643 | | This gene can be found on Chromosome MT at location 1,673–3,230. | Deletion of 47 bp |
| ENST00000361390 | ROPN1B (MT) | NADH-ubiquinone oxidoreductase chain 1 | Deletion of 648 bp |
| ENST00000302192 | Q8WUV6_HUMAN | Podocalyxin-like 2 | Deletion of 363 bp retained intron in exon 8 |
| ENST00000361643 | | This gene can be found on Chromosome MT at location 1,673–3,230. | Deletion of 1268 bp |
| ENST00000261798 | CSNK1A1 | Casein kinase I, alpha isoform | Insertion of unknown length between exons 4 and 5 |
| ENST00000339892 | | This gene can be found on Chromosome 1 at location 234,416,172–234,416,946. | Deletion of 42 bp in 3' UTR |
| ENST00000322297 | OAZ1 | Ornithine decarboxylase antizyme | Deletion of last 4 coding bp and 54 bp of 3' UTR |
| ENST00000224892 | LHPP | Phospholysine phosphohistidine inorganic pyrophosphate phosphatase | Deletion of last 105 bp of exon 1, exons 2–6, and first 608 bp of exon 7 |
| ENST00000361381 | NU4M_HUMAN (MT) | NADH-ubiquinone oxidoreductase chain 4 | Deletion of 152 bp |
| ENST00000358666 | UBL5 | Ubiquitin-like protein 5 | 80 bp deletion of 3' end of exon 1 (contained in 5' UTR) |
| ENST00000239377 | PCMT1 | Protein-L-isoaspartate(D-aspartate) O-methyltransferase | 47 bp deletion of 3' end of exon 7 (10 bp coding, remainder in 5' UTR) |
| ENST00000361899 | ATP6 (MT) | ATPase protein 6 | Deletion of 91 bp |
| ENST00000291565 | PDXK | Pyridoxal kinase | Deletion of 281 bp retained intron (contained in 3' UTR) |
| ENST00000308964 | | This gene can be found on Chromosome 19 at location 60,851,213–60,856,333. | 42 bp deletion of 3' end of exon 2, exons 3–5, and first 541 bp of exon 6 |
| ENST00000361390 | ROPN1B (MT) | NADH-ubiquinone oxidoreductase chain 1 | Deletion of 346 bp |
Figure 3. Alternative splicing of Brain Protein I3 (ENSG00000164713) showing a short insertion between two exons. The 5' and 3' ends of two exons are shown in black text, separated by an intron (full sequence not shown) in orange. Base positions where the EST aligns to the transcript are indicated in bold and italic type. The 40 base insertion is highlighted in blue.
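The HQD count, mean and median phred scores, and the HQD count with phred scores > 400
| HQD count | Mean phred score | Median phred score | HQD count (phred > 400) |
| 86 | 342.4 | 163 | 16 |
| 1364 | 270.2 | 136 | 175 |
| 29 | 233.2 | 122 | 4 |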
Counts collected for phred scores > 400 for each of the three HQD classes: those confirmable by Ensembl, those that occur in positions with no known variations, and those that have incorrect mutations at positions with known variations ("Other").
Figure 4. A histogram of ESTs that fail to map to the human genome at various p-values.
Identification of contamination in 454 EST data
| Source | EST count |
| E. coli | 133 |
| Enterococcus sp. | 86 |
| Staphylococcus sp. | 78 |
| Cloning vector | 42 |
| P. marinus (Sea Lamprey) | 37 |
| Other | 211 |