Alexander F Palazzo, T Ryan Gregory.
Year: 2014 PMID: 24809441 PMCID: PMC4014423 DOI: 10.1371/journal.pgen.1004351
Source DB: PubMed Journal: PLoS Genet ISSN: 1553-7390 Impact factor: 5.917
Figure 1. Summary of haploid nuclear DNA contents (“genome sizes”) for various groups of eukaryotes.
This graph is based on data for about 10,000 species [18], [19]. There is a wide range in genome sizes even among developmentally similar species, and there is no correspondence between genome size and general organism complexity. Humans, which have an average-sized genome for a mammal, are indicated by a star. Note the logarithmic scale.
Figure 2. Levels of protein-coding and intergenic RNAs in mammalian cells.
(A) Analysis of nascent and total poly(A)+ RNA levels from mouse liver nuclei. Nascent (i.e., polymerase-associated) RNA and poly(A)+ RNA were isolated from mouse liver nuclei and analyzed by high-throughput sequencing. Individual reads were categorized by their source. Exonic and intronic reads are from known reference genes (i.e., “RefSeq” genes), while intergenic reads originate from nonreferenced loci (i.e., “non-RefSeq”) in the mouse genome. Reproduced from [85]. (B) Empirical Cumulative Distribution Function (ECDF) of transcript expression in each cell compartment as determined by the ENCODE consortia. Results are shown for RNA that either contains (“polyA+”) or lacks (“polyA−”) a poly(A) tail, in the nuclear and cytosolic fractions. Each human cell line that was analyzed is represented by three lines, one for each pool of RNA (red for protein-coding RNAs, blue for lncRNAs [“noncoding”], and green for intergenic transcripts [“novel intergenic”]). The lines indicate the cumulative fraction of RNAs in a given pool (y-axis) that are expressed at levels equal to or less than the reads per kilobase per million mapped reads (RPKM) value on the x-axis. Total numbers in each pool are as follows: reference protein-coding genes: 20,679; loci producing lncRNAs: 9,277; and regions producing intergenic transcripts: 41,204. Transcripts with expression levels of 0 RPKM were adjusted to an artificial value of 10⁻⁶ RPKM so that the onset of each graph represents the fraction of nonexpressed genes or loci. Note that 1–4 RPKM is approximately equivalent to one copy per tissue culture cell [46], [129]. From this figure, one can deduce that the vast majority of intergenic transcripts are present at levels of less than one copy per cell. Reproduced with permission from [46].
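The way panel (B) is read can be illustrated with a minimal Python sketch. It is not the ENCODE pipeline: the function names `rpkm` and `ecdf`, the read counts, and the five expression values are hypothetical and chosen only for illustration. The sketch assumes the standard RPKM definition (reads divided by transcript length in kilobases and by millions of mapped reads) and applies the caption's adjustment of 0 RPKM to 10⁻⁶ RPKM before building the ECDF.

```python
from bisect import bisect_right

ZERO_FLOOR = 1e-6  # artificial RPKM assigned to non-expressed loci, as in the caption


def rpkm(read_count, length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count / (length_bp / 1_000) / (total_mapped_reads / 1_000_000)


def ecdf(values, floor=ZERO_FLOOR):
    """Return F, where F(x) is the fraction of values that are <= x.

    Values of exactly 0 are first replaced by `floor`, so F(floor)
    equals the fraction of non-expressed loci.
    """
    adjusted = sorted(floor if v == 0 else v for v in values)
    n = len(adjusted)

    def F(x):
        return bisect_right(adjusted, x) / n

    return F


# Hypothetical transcript: 500 reads, 2,000 bp long, 10 million mapped reads.
example_rpkm = rpkm(500, 2_000, 10_000_000)

# Hypothetical pool of five loci (RPKM values); two are not expressed.
F = ecdf([0, 0, 0.2, 0.5, 3.0])
fraction_not_expressed = F(ZERO_FLOOR)  # onset of the curve
fraction_below_one_rpkm = F(1.0)        # loci at or below 1 RPKM
```

In this toy pool, the curve starts at the fraction of loci with no expression and rises to the fraction at or below 1 RPKM, which is the lower end of the range the caption equates with roughly one copy per cell.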