Yi-Ju Li, Puting Xu, Xuejun Qin, Donald E Schmechel, Christine M Hulette, Jonathan L Haines, Margaret A Pericak-Vance, John R Gilbert.
Abstract
BACKGROUND: Serial Analysis of Gene Expression (SAGE) is a powerful tool to determine gene expression profiles. Two types of SAGE libraries, ShortSAGE and LongSAGE, are classified based on the length of the SAGE tag (10 vs. 17 basepairs). LongSAGE libraries are thought to be more useful than ShortSAGE libraries, but their information content has not been widely compared. To dissect the differences between these two types of libraries, we utilized four libraries (two LongSAGE and two ShortSAGE libraries) generated from the hippocampus of Alzheimer and control samples. In addition, we generated two additional short SAGE libraries, the truncated long SAGE libraries (tSAGE), from LongSAGE libraries by deleting seven 5' basepairs from each LongSAGE tag.
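The truncation step described above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the function name `truncate_library`, the `end` parameter, and the toy tags are assumptions introduced here. The default trims the 5' end, following the wording of the abstract. Because distinct LongSAGE tags can collapse onto the same 10-bp tag, their counts are summed.

```python
from collections import Counter


def truncate_library(long_counts, n_remove=7, end="5prime"):
    """Collapse LongSAGE tag counts into truncated (tSAGE) tag counts.

    long_counts: mapping of 17-bp tag sequence -> observed count.
    n_remove:    number of basepairs to delete from each tag.
    end:         which end to trim, "5prime" (default, as worded in the
                 abstract) or "3prime". Hypothetical parameter.
    """
    if end not in ("5prime", "3prime"):
        raise ValueError("end must be '5prime' or '3prime'")
    short_counts = Counter()
    for tag, count in long_counts.items():
        if end == "5prime":
            short = tag[n_remove:]
        else:
            short = tag[:len(tag) - n_remove]
        # Several long tags may map to one short tag, so counts accumulate.
        short_counts[short] += count
    return short_counts


# Toy library (made-up sequences, for illustration only).
toy_long = {
    "AAAAAAACCCCCCCCCC": 3,
    "GGGGGGGCCCCCCCCCC": 2,
    "TTTTTTTGGGGGGGGGG": 1,
}
toy_tsage = truncate_library(toy_long)
print(dict(toy_tsage))
```

Truncation leaves the total tag count unchanged and can only reduce the number of unique tags, since merging is the only thing that happens to a tag.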
Year: 2006 PMID: 17109755 PMCID: PMC1676023 DOI: 10.1186/1471-2105-7-504
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Summary of SAGE tags for four SAGE libraries
| L_AD (T_AD) | L_Ctrl (T_Ctrl) | S_AD | S_Ctrl |
|---|---|---|---|
| 34,475 (28,804) | 30,581 (25,145) | 25,140 | 23,126 |
| 80,292 | 75,018 | 78,126 | 70,456 |
| 14,643 (24,129) | 11,646 (20,640) | 21,367 | 19,787 |
| 19,832 (4,675) | 18,935 (4,505) | 3,773 | 3,339 |
*Library abbreviations are as follows: LongSAGE AD (L_AD), tSAGE AD (T_AD), LongSAGE control (L_Ctrl), tSAGE control (T_Ctrl), ShortSAGE AD (S_AD), ShortSAGE control (S_Ctrl).
**Values for the truncated LongSAGE (tSAGE) libraries for AD and control are given in parentheses.
Redundancy and tag-to-gene mapping for unique tags with tag counts > 1 (confident tags).
| 8,670 | --- | 6,547 (75.5%) |
| 7,210 | --- | 5,149 (71.4%) |
| 7,699 | 379 (4.9%) | 7,265 (94.4%) |
| 6,195 | 356 (5.7%) | 5,762 (93.0%) |
a: Redundant tags are tSAGE tags that match more than one LongSAGE tag with a count greater than 1.
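The redundancy definition in the footnote can be expressed as a small grouping step. The sketch below is illustrative only: the names `redundant_tsage_tags` and `percent` are hypothetical, the count threshold is assumed to apply to the LongSAGE tags, and the trimmed end follows the abstract's wording (seven 5' basepairs). The `percent` helper simply reproduces how each percentage in the table relates to the first figure in its row, for example 379 of 7,699.

```python
from collections import defaultdict


def redundant_tsage_tags(long_counts, n_remove=7, min_count=2):
    """Return tSAGE tags derived from more than one confident LongSAGE tag.

    long_counts: mapping of LongSAGE tag -> count.
    min_count:   a LongSAGE tag is treated as confident when its count is
                 at least this value (count > 1, as in the table caption).
    """
    parents = defaultdict(set)
    for tag, count in long_counts.items():
        if count >= min_count:
            # Assumption: truncation removes n_remove bases from the 5' end.
            parents[tag[n_remove:]].add(tag)
    return {short: sorted(tags) for short, tags in parents.items() if len(tags) > 1}


def percent(part, whole):
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Toy data (made-up sequences): two confident long tags share one short tag.
toy_long = {
    "AAAAAAACCCCCCCCCC": 3,
    "GGGGGGGCCCCCCCCCC": 2,
    "TTTTTTTCCCCCCCCCC": 1,  # singleton, ignored by the threshold
    "AAAAAAAGGGGGGGGGG": 4,
}
print(redundant_tsage_tags(toy_long))
print(percent(379, 7699), percent(356, 6195))
```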
Figure 1. Distribution of SAGE tags. The distribution of SAGE tags depicted by the number of corresponding clusters in the LongSAGE, truncated LongSAGE, and ShortSAGE datasets.
Figure 2. Tag frequency comparison. Comparisons of tag frequencies between AD and controls of LongSAGE, ShortSAGE, and tSAGE libraries.
Figure 3. Properties of significantly differentially expressed tSAGE tags. A diagram relating the LongSAGE tags to the 400 tSAGE tags that are significantly differentially expressed between AD and control. The distribution of the tSAGE tags is summarized by the number of their corresponding LongSAGE tags.
Results of BLAST analysis for 100 orphan tags.
| 21 | 2 | 2% | 2 |
| 20 | 1 | 3% | 1 |
| 19 | 1 | 4% | 3 |
| 18 | 8 | 12% | 10 |
| 17 | 5 | 17% | 23 |
| ≥ 17* | 9 | 9% | 17 |
Summary of the number of orphan tags by the number of basepairs matched to a human gene sequence.
*Number of tags for which all 17 bp of the tag region matched a human gene sequence.
A list of genes mapping to nine orphan tags.
| CCAGCCGGGGTGACAGA | 5–21 | 17 | NM_000791.3 | Homo sapiens dihydrofolate reductase (DHFR), mRNA |
| CCAGCCGGGGTGACAGA | 5–21 | 17 | NM_001874.3 | Homo sapiens carboxypeptidase M (CPM), transcript variant 1, mRNA |
| CCAGCCGGGGTGACAGA | 5–21 | 17 | NM_024080.3 | Homo sapiens transient receptor potential cation channel, subfamily M, member 8 (TRPM8), mRNA |
| CCAGTCTGGGCAACAAG | 5–21 | 17 | NM_017437.1 | Homo sapiens cleavage and polyadenylation specific factor 2, 100 kDa (CPSF2), mRNA |
| CCAGTCTGGGCAACAAG | 5–21 | 17 | NM_181776.1 | Homo sapiens solute carrier family 36 (proton/amino acid symporter), member 2 (SLC36A2), mRNA |
| CCTGGCACTTTGGGAGG | 5–21 | 17 | NM_000641.2 | Homo sapiens interleukin 11 (IL11), mRNA |
| CCTGGCACTTTGGGAGG | 5–21 | 17 | NM_000997.3 | Homo sapiens ribosomal protein L37 (RPL37), mRNA |
| CCTGGCACTTTGGGAGG | 5–21 | 17 | NM_001009899.1 | Homo sapiens KIAA2018 (KIAA2018), mRNA |
| CCTGGCACTTTGGGAGG | 5–21 | 17 | NM_001032999.1 | Homo sapiens core-binding factor, runt domain, alpha subunit 2; translocated to, 2 (CBFA2T2), transcript variant 3, mRNA |
| CCTGGCACTTTGGGAGG | 5–21 | 17 | NM_001033564.1 | Homo sapiens hypothetical protein LOC619208 (LOC619208), mRNA |
| CCTGGCACTTTGGGAGG | 5–21 | 17 | NM_007042.1 | Homo sapiens ribonuclease P 14 kDa subunit (RPP14), mRNA |
| CCTGGCACTTTGGGAGG | 5–21 | 17 | NM_030579.1 | Homo sapiens outer mitochondrial membrane cytochrome b5 (CYB5-M), mRNA |
| CCTGGCACTTTGGGAGG | 5–21 | 17 | XM_371492.2 | PREDICTED: Homo sapiens similar to signal-transducing adaptor protein-2; brk kinase substrate (LOC388949), mRNA |
| CCTGGCACTTTGGGAGG | 5–21 | 17 | XM_379458.2 | PREDICTED: Homo sapiens hypothetical LOC401287 (LOC401287), mRNA |
| CCTGGCACTTTGGGAGG | 5–21 | 17 | XM_499056.1 | PREDICTED: Homo sapiens hypothetical gene supported by AK127523 (LOC441190), mRNA |
| CCTGGCACTTTGGGAGG | 5–21 | 17 | XM_499503.1 | PREDICTED: Homo sapiens hypothetical gene supported by AK127523 (LOC442499), mRNA |
| GTGCTGGGATAACTGGC | 4–21 | 18 | XM_499182.1 | PREDICTED: Homo sapiens hypothetical gene supported by AK128305 (LOC441501), mRNA |
| GTGCTGGGATAACTGGC | 5–21 | 17 | NM_005431.1 | Homo sapiens X-ray repair complementing defective repair in Chinese hamster cells 2 (XRCC2), mRNA |
| GTGCTGGGATAACTGGC | 5–21 | 17 | NM_016094.2 | Homo sapiens COMM domain containing 2 (COMMD2), mRNA |
| GTGCTGGGATAACTGGC | 5–21 | 17 | XM_374973.1 | PREDICTED: Homo sapiens similar to hypothetical protein (L1H 3 region) – human (LOC400025), mRNA |
| TGGTACACACCTGTAGT | 4–21 | 18 | NM_001008528.1 | Homo sapiens matrix-remodelling associated 7 (MXRA7), transcript variant 1, mRNA |
| TGGTACACACCTGTAGT | 4–21 | 18 | NM_152920.1 | Homo sapiens egf-like module containing, mucin-like, hormone receptor-like 2 (EMR2), transcript variant 6, mRNA |
| TGGTACACACCTGTAGT | 5–21 | 17 | NM_014573.1 | Homo sapiens hypothetical protein MAC30 (MAC30), mRNA |
Summary of the nine orphan tags for which all 17 basepairs (bp) of the tag region matched human gene sequences. The BLAST input sequences were 21 bp long, including the four bp of the restriction site, so the first nucleotide of the LongSAGE tag is at position 5 of the input sequence.
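The position convention in this table can be checked with a short helper. This is a sketch under the stated layout only (a 4-bp restriction site followed by a 17-bp tag in a 21-bp query); the function names `parse_range`, `matched_length`, and `covers_full_tag` are hypothetical and are not part of BLAST or of the authors' analysis.

```python
SITE_LEN = 4   # restriction-site bases at the start of the 21-bp query
TAG_LEN = 17   # LongSAGE tag bases that follow the site


def parse_range(text):
    """Parse a 1-based inclusive range such as '5–21' or '5-21'."""
    start, end = text.replace("–", "-").split("-")
    return int(start), int(end)


def matched_length(start, end):
    """Number of query positions covered by an inclusive match range."""
    return end - start + 1


def covers_full_tag(start, end, site_len=SITE_LEN, tag_len=TAG_LEN):
    """True when the match spans every tag position (5 through 21 here)."""
    tag_start = site_len + 1
    tag_end = site_len + tag_len
    return start <= tag_start and end >= tag_end


for span in ("5–21", "4–21"):
    s, e = parse_range(span)
    print(span, matched_length(s, e), covers_full_tag(s, e))
```

A range of 5–21 corresponds to exactly the 17 tag bases, while 4–21 adds one base of the restriction site, giving the 18 matched basepairs listed for some rows.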