Roland Schmucki, Marco Berrera, Erich Küng, Serene Lee, Wolfgang E Thasler, Sabine Grüner, Martin Ebeling, Ulrich Certa.
Abstract
BACKGROUND: Whole transcriptome analyses are an essential tool for understanding disease mechanisms. Approaches based on next-generation sequencing provide fast and affordable data but rely on the availability of annotated genomes. However, there are many areas in biomedical research that require non-standard animal models for which genome information is not available. This includes the Syrian hamster Mesocricetus auratus as an important model for dyslipidaemia because it mirrors many aspects of human disease and pharmacological responses. We show that complementary use of two independent next-generation sequencing technologies combined with mapping to multiple genome databases allows unambiguous transcript annotation and quantitative transcript imaging. We refer to this approach as "triple match sequencing" (TMS).
Year: 2013 PMID: 23575280 PMCID: PMC3639954 DOI: 10.1186/1471-2164-14-237
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Triple match sequencing (TMS): principle and workflow. (A) A complex SAGE-tag based deep sequencing library is generated from total RNA and sequenced on a high-throughput platform (e.g. ABI-SOLiD, ABI-Proton, Illumina) for transcript quantification based on read frequency. A second, normalized SAGE library is generated from the same RNA and sequenced on a long-read, low-throughput platform (e.g. Roche 454). mRNA reads from these two libraries share the distal 3′-sequence. The long reads are compared to related transcript databases (i.e. human, mouse, rat, dog; RefSeq) as detailed in Methods. (B) Deep sequencing libraries from total RNA were constructed for either the SOLiD or the 454 sequencing platform. To ensure representation of low-abundance transcripts, it is essential to normalize the 454 library by competitive hybridization. The long 454 reads are assembled into contigs and mapped to the human, rat, mouse and dog RefSeq databases, yielding about 10,800 annotated genes (the hamster liver transcriptome). Of 82 million short SAGE tags, 93% mapped to the hamster transcriptome and the remaining 7% to human RefSeq entries, allowing quantification over 4 orders of magnitude. All 454 and SOLiD sequencing data generated here are available on request.
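The two-tier tag assignment described in panel B (short tags counted against the annotated hamster transcriptome first, then against human RefSeq as a fallback) can be illustrated with a minimal sketch. It assumes exact-match lookup of tag sequences in two in-memory dictionaries. The function name `count_tags`, both index dictionaries, and the toy sequences are hypothetical and do not represent the authors' actual mapping pipeline.

```python
from collections import Counter


def count_tags(tags, hamster_index, human_index):
    """Count reads per gene, preferring hamster matches over human ones.

    Hypothetical helper: exact-match lookup is an assumption made for
    illustration, not the mapping method used in the paper.
    """
    counts = Counter()
    unmapped = 0
    for tag in tags:
        # First tier: the annotated hamster transcriptome.
        gene = hamster_index.get(tag)
        if gene is None:
            # Second tier: fall back to human RefSeq entries.
            gene = human_index.get(tag)
        if gene is None:
            unmapped += 1
        else:
            counts[gene] += 1
    return counts, unmapped


# Toy data (invented sequences and gene assignments).
hamster_index = {"ACGTACGT": "Alb", "TTGACCAA": "Apoa1"}
human_index = {"GGCATTAC": "LCAT"}
tags = ["ACGTACGT", "ACGTACGT", "TTGACCAA", "GGCATTAC", "NNNNNNNN"]

counts, unmapped = count_tags(tags, hamster_index, human_index)
```

Read frequency per gene then serves as the expression estimate. A real pipeline would use an aligner that tolerates mismatches rather than dictionary lookup.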
Figure 2. RNAseq (SAGE) based comparative analysis of gene expression in liver tissue of H. sapiens, M. fascicularis, S. scrofa, M. auratus and R. norvegicus. 48 genes from the public domain network REACTOME “Lipid digestion, mobilization, and transport” were selected to highlight species-specific differences in gene expression relevant for HDL biosynthesis (left panel). Gene clusters generated by hierarchical clustering are labelled CL1 to CL6. The highly abundant albumin transcript is marked. The right panel shows mRNA levels of a standard reference set of liver housekeeping genes compiled from public domain data (L. Badi, unpublished). The log2 values of normalized read counts are presented in a standard heat map as indicated. Grey fields indicate genes lacking valid expression data. SAGE tags of cluster 6 transcripts were quantified using the Chinese hamster genome draft as a resource because they had no matches in the 454 library due to low abundance. The RNA source for each species is given in the material section of the paper. All SAGE libraries were built using commercial kits, and the source of tissue is given under Methods. Abbreviations of the organisms included are indicated at the bottom.
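The heat map values are log2-transformed normalized read counts. The caption does not state the normalization scheme, so the sketch below assumes a simple counts-per-million scaling with a pseudocount of 1 to avoid taking the log of zero. The function name `log2_normalized`, its parameters, and the toy counts are hypothetical.

```python
import math


def log2_normalized(counts, scale=1_000_000, pseudocount=1.0):
    """Return log2(counts-per-`scale` + pseudocount) for each gene.

    Assumption: library-size scaling plus a pseudocount; the paper's
    actual normalization may differ.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("library has no mapped reads")
    return {
        gene: math.log2(n * scale / total + pseudocount)
        for gene, n in counts.items()
    }


# Toy liver library (invented numbers) spanning several orders of magnitude.
liver = {"ALB": 750_000, "APOA1": 249_999, "LCAT": 1}
values = log2_normalized(liver)
```

Scaling by library size makes libraries of different sequencing depth comparable, which is what a cross-species comparison of this kind requires.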