Atsushi Ogura, Mengjie Lin, Yuya Shigenobu, Atushi Fujiwara, Kazuho Ikeo, Satoshi Nagai.
Abstract
BACKGROUND: Metagenomic studies, accelerated by the evolution of sequencing technologies and the rapid development of genomic analysis methods, can reveal genetic diversity and biodiversity in various samples including those of uncultured or unknown species. This approach, however, cannot be used to identify active functional genes under actual environmental conditions. Metatranscriptomics, which is similar in approach to metagenomics except that it utilizes RNA samples, is a powerful tool for the transcriptomic study of environmental samples. Unlike metagenomic studies, metatranscriptomic studies have not been popular to date due to problems with reliability, repeatability, redundancy and cost performance. Here, we propose a normalized metatranscriptomic method that is suitable for the collection of genes from samples as a platform for comparative transcriptomics.
Year: 2011 PMID: 22369384 PMCID: PMC3333174 DOI: 10.1186/1471-2164-12-S3-S15
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Sequencing, quality control and assembly of the two libraries
| Stage | Metric | Non-normalized | Normalized |
|---|---|---|---|
| Raw data | Number of reads | 607,490 | 572,233 |
| Raw data | Average length | 309.2 bp | 275.8 bp |
| Raw data | Total base pairs | 187.9 Mbp | 157.8 Mbp |
| Quality control | Number of reads | 483,335 | 373,627 |
| Quality control | Average length | 333.5 bp | 323.2 bp |
| Quality control | Total base pairs | 161.0 Mbp | 120.8 Mbp |
| Assembly | Full-length | 45,064 | 49,121 |
| Assembly | Contig | 53,324 | 32,440 |
| Assembly | Singlet | 118,251 | 97,124 |
| Final | Total number of genes | 216,639 | 178,685 |
| Final | Total base pairs | 73.7 Mbp | 57.3 Mbp |
Raw data were produced on a Roche GS FLX sequencer. The quality control process removed low-quality sequences and vector sequences. After the identification of full-length genes, the assembly process classified the remaining sequences into contigs and singlets. The total number of genes is the sum of full-length genes, contigs, and singlets.
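As a quick consistency check on the table, the short Python sketch below recomputes the total number of genes as the sum of full-length genes, contigs, and singlets, and derives the fraction of reads retained after quality control. The counts are copied from the table; the variable and function names are illustrative and do not come from the original study.

```python
# Counts copied from the table; all names here are illustrative.
assembly = {
    "non_normalized": {"full_length": 45_064, "contig": 53_324, "singlet": 118_251},
    "normalized": {"full_length": 49_121, "contig": 32_440, "singlet": 97_124},
}
reported_total = {"non_normalized": 216_639, "normalized": 178_685}
reads = {
    "non_normalized": {"raw": 607_490, "qc": 483_335},
    "normalized": {"raw": 572_233, "qc": 373_627},
}


def total_genes(counts):
    """Sum full-length genes, contigs, and singlets for one library."""
    return sum(counts.values())


def qc_retention(read_counts):
    """Fraction of raw reads that remain after quality control."""
    return read_counts["qc"] / read_counts["raw"]


for library in assembly:
    total = total_genes(assembly[library])
    matches = total == reported_total[library]
    print(f"{library}: total={total:,} matches_table={matches} "
          f"qc_retention={qc_retention(reads[library]):.1%}")
```

Both recomputed totals agree with the "Total number of genes" row of the table.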
Figure 1. Comparison of the two libraries and efficiency of the normalization treatment. Reciprocal BLAT searches were performed, and genes with a match in the other library, searching from non-normalized to normalized and from normalized to non-normalized, are shown as “Common.” “Unique” genes are those with no match in the other library by homology search.
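The Common/Unique split described in this caption can be illustrated with a minimal Python sketch. It assumes the search results for one direction have already been parsed into (query, target) pairs; the gene identifiers are hypothetical toy data, and the matching thresholds are not given in the caption, so none are applied here.

```python
def classify_genes(gene_ids, hits):
    """Split one library's genes into (common, unique).

    gene_ids: identifiers of all genes in the query library.
    hits: (query_id, target_id) pairs from searching that library
          against the other one (parsing of search output not shown).
    """
    matched_queries = {query for query, _target in hits}
    all_genes = set(gene_ids)
    common = all_genes & matched_queries   # at least one match in the other library
    unique = all_genes - matched_queries   # no match in the other library
    return common, unique


# Hypothetical toy data: one call per search direction.
non_normalized = ["nn1", "nn2", "nn3"]
normalized = ["n1", "n2"]
hits_nn_to_n = [("nn1", "n1"), ("nn2", "n1")]
hits_n_to_nn = [("n1", "nn1")]

common_nn, unique_nn = classify_genes(non_normalized, hits_nn_to_n)
common_n, unique_n = classify_genes(normalized, hits_n_to_nn)
print(sorted(common_nn), sorted(unique_nn))
print(sorted(common_n), sorted(unique_n))
```

Because the two directions are evaluated separately, the number of Common genes can differ between the libraries, as in the toy data where two non-normalized genes match a single normalized gene.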
Figure 2. Function of metatranscriptome data. A. Proportions of genes with hits in the non-redundant nucleotide database (DB-hit), ribosomal genes, and chloroplast genes in the non-normalized and normalized libraries. B. Table showing the number of genes used as queries for the homology search, together with the numbers of DB-hit, no-hit, ribosomal, chloroplast, and uncultured genes. C. The top 10 DB-hits, shown with their accession numbers and frequencies among the queries of the non-normalized and normalized libraries.
Figure 3. Taxonomic distribution of the two libraries. A. Pie charts of the taxonomic distribution of the non-normalized and normalized libraries. Groups with a share of more than 2% are shown individually; all other groups are combined as “Other”. B. Search results for genes of the known dominant species, diatoms and dinoflagellates. Figures on the left represent the number of DB-hits, and those on the right represent the number of query-hits.
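The grouping rule used for the pie charts (groups above a 2% share shown individually, the rest pooled as “Other”) can be sketched in Python as follows. The function name and the taxon counts are hypothetical and are not data from the study.

```python
def group_minor_taxa(counts, threshold=0.02):
    """Keep taxa whose share exceeds `threshold`; pool the rest as "Other"."""
    total = sum(counts.values())
    grouped = {}
    other = 0
    for taxon, count in counts.items():
        if count / total > threshold:
            grouped[taxon] = count
        else:
            other += count
    if other:
        grouped["Other"] = other
    return grouped


# Hypothetical counts for illustration only.
toy_counts = {"taxon_a": 90, "taxon_b": 9, "taxon_c": 1}
print(group_minor_taxa(toy_counts))
```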