Matheus Sanitá Lima, David Roy Smith.
Abstract
Organelle genomes are among the most sequenced kinds of chromosome. This is largely because they are small and widely used in molecular studies, but also because next-generation sequencing technologies made sequencing easier, faster, and cheaper. However, studies of organelle RNA have not kept pace with those of DNA, despite huge amounts of freely available eukaryotic RNA-sequencing (RNA-seq) data. Little is known about organelle transcription in nonmodel species, and most of the available eukaryotic RNA-seq data have not been mined for organelle transcripts. Here, we use publicly available RNA-seq experiments to investigate organelle transcription in 30 diverse plastid-bearing protists with varying organelle genomic architectures. Mapping RNA-seq data to organelle genomes revealed pervasive, genome-wide transcription, regardless of the taxonomic grouping, gene organization, or noncoding content. For every species analyzed, transcripts covered ≥85% of the mitochondrial and/or plastid genomes (all of which were ≤105 kb), indicating that most of the organelle DNA, coding and noncoding, is transcriptionally active. These results follow earlier studies of model species showing that organellar transcription is coupled and ubiquitous across the genome, requiring significant downstream processing of polycistronic transcripts. Our findings suggest that noncoding organelle DNA can be transcriptionally active, raising questions about the underlying function of these transcripts and underscoring the utility of publicly available RNA-seq data for recovering complete genome sequences. If pervasive transcription is also found in bigger organelle genomes (>105 kb) and across a broader range of eukaryotes, this could indicate that noncoding organelle RNAs are regulating fundamental processes within eukaryotic cells.
Keywords: RNA-seq; mitochondrial transcription; organelle gene expression; plastid transcription; protists
Year: 2017 PMID: 28935754 PMCID: PMC5677165 DOI: 10.1534/g3.117.300290
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Figure 2. Full transcription of small mitochondrial genomes in Apicomplexa. Mapping histograms (or transcription maps) depict the coverage depth (the number of transcripts mapped per nucleotide) on a log scale. We used the organelle genome annotations already present in the genome assemblies deposited in GenBank (accession numbers provided in Table 1 and Table S1). Mapping contigs are not to scale, and the direction of transcription is indicated by the direction of the arrows that represent annotated genes. Mapping histograms were obtained from Geneious v9.1.6 (Kearse ).
Figure 3. Polycistronic transcription in mitochondrial genomes of chlorophytes, raphidophytes, and glaucophytes. C. moewusii (Chlorophyta), H. akashiwo (Raphidophyta), and C. paradoxa (Glaucophyta) exhibited clear drops of transcript coverage in some potentially noncoding regions (intergenic regions, introns, and hypothetical proteins). Mapping histograms follow the same structure as in Figure 2 and mapping contigs are not to scale.
Figure 4. Entire and near-entire transcriptional coverage of diverse plastid genomes. V. brassicaformis (Chromerida) exhibited entire genome transcription, whereas Helicosporidium sp. (Chlorophyta) and E. huxleyi (Haptophyta) had near-entire genome transcriptional coverage. Drops in coverage occurred mostly in intergenic regions of the E. huxleyi plastid genome. Mapping histograms follow the same structure as in Figure 2 and Figure 3; mapping contigs are not to scale.
Figure 1. Pervasive organelle genome transcription across the eukaryotic tree of life. Organelle genomes ≤105 kb are fully or almost fully transcribed in diverse eukaryotic groups, regardless of their coding content and structure. Outer dashed boxes summarize the breadth of organelle genomes analyzed within each major eukaryotic group. Representations of organelle genomes and organelles are not to scale. Refseq coverage represents the percentage of the reference genome sequence that was covered by one or more RNA-seq reads in the mapping analyses. The phylogenetic tree is adapted from Burki (2014) for the relationships among major groups; branches within groups are merely illustrative and not based on sequence analyses. The tree was generated using the NCBI Common Tree taxonomy tool (Federhen 2012) and iTOL v3.4.3 (Letunic and Bork 2016).
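The two coverage measures used throughout (mean coverage in reads per nucleotide, and Refseq coverage as the percentage of reference positions covered by one or more reads) can be illustrated with a minimal Python sketch. This is not the Geneious procedure used in the study; the function name `coverage_stats` and the representation of mapped reads as 0-based, end-exclusive `(start, end)` intervals are assumptions made for illustration only.

```python
def coverage_stats(genome_length, read_intervals):
    """Return (mean depth in reads/nt, percent of positions with >= 1 read).

    Assumption: each mapped read is given as a 0-based (start, end) pair
    with `end` exclusive. This is an illustrative format, not Geneious output.
    """
    # Difference array: +1 where a read starts, -1 just past where it ends.
    delta = [0] * (genome_length + 1)
    for start, end in read_intervals:
        start = max(start, 0)
        end = min(end, genome_length)
        if start >= end:
            continue  # read lies entirely outside the reference
        delta[start] += 1
        delta[end] -= 1

    total_depth = 0   # sum of per-nucleotide depths
    covered = 0       # positions covered by one or more reads
    running = 0       # depth at the current position
    for pos in range(genome_length):
        running += delta[pos]
        total_depth += running
        if running > 0:
            covered += 1

    mean_depth = total_depth / genome_length
    percent_covered = 100.0 * covered / genome_length
    return mean_depth, percent_covered


# Toy example: a 10-nt reference with two overlapping reads.
mean_depth, percent_covered = coverage_stats(10, [(0, 5), (3, 8)])
print(mean_depth, percent_covered)
```

In the toy example, positions 8 and 9 receive no reads, so the Refseq coverage falls below 100% even though the mean depth is one read per nucleotide. The two measures are therefore reported separately.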
Diverse organelle (mitochondrial and plastid) genomes and their respective transcription rates (mean and percent coverage)
| Taxonomic Group and Species | Organelle | GenBank Entry | Genome Size (bp) | Mean Coverage (Reads/nt) | % Refseq | % Coding |
|---|---|---|---|---|---|---|
| API - | MT | NC_011005.1 | 5,895 | 710.9 | 99.7 | 67.5 |
| API - | MT | LK023131.1 | 5,957 | 3,111.9 | 100 | 92.4 |
| API - | MT | AY282930.1 | 5,959 | 368.3 | 100 | 55.7 |
| API - | MT | NC_007243.1 | 5,990 | 693.6 | 100 | 56.3 |
| API - | MT | NC_009902.1 | 6,005 | 614.8 | 99.9 | 63.5 |
| API - | APIC | NC_011395.1 | 35,107 | 71.6 | 90.2 | 54.1 |
| API - | MT | LN871600.1 | 10,547 | 5.2 | 93.4 | 37 |
| CP - | MT | NC_026573.1 | 14,029 | 136.9 | 95.8 | 86.4 |
| DF - | MT | LC002801 | 19,577 | 2,763 | 100 | 7.4 |
| CP - | MT | NC_001872.1 | 22,897 | 59.8 | 86.7 | 55.4 |
| CP - | MT | GQ497137 | 24,321 | 2,942.4 | 99.8 | 87.7 |
| PP - | MT | NC_007683.1 | 36,392 | 98.9 | 97.9 | 90 |
| RP - | MT | NC_002007.1 | 36,753 | 1,250.4 | 98.7 | 81.5 |
| RP - | MT | NC_017751.1 | 37,023 | 24.4 | 85.6 | 63.2 |
| PP - | MT | NC_023354.1 | 37,402 | 165.1 | 92.8 | 89.9 |
| PP | MT | NC_013476.1 | 37,657 | 145.9 | 100 | 89.4 |
| EP - | MT | NC_022258.1 | 38,057 | 118.7 | 95.8 | 88.8 |
| RH - | MT | NC_016738.1 | 38,690 | 205.2 | 98.5 | 81.3 |
| RP - | MT | NC_017837.1 | 41,688 | 16.2 | 88 | 56.6 |
| DT - | MT | NC_027265.1 | 46,283 | 1,261.3 | 96.4 | 71.5 |
| CP - | MT | NC_012643.1 | 47,425 | 180.6 | 94 | 82.5 |
| CP - | MT | NC_017841.1 | 49,343 | 147.4 | 94.7 | 65 |
| CP - | PT | NC_008100.1 | 37,454 | 103.6 | 98 | 94.9 |
| GP - | MT | NC_017836.1 | 51,557 | 3,355.9 | 94.6 | 58.9 |
| CP - | MT | NC_024626.1 | 52,528 | 23,494.2 | 86.6 | 63 |
| CA - | MT | NC_005255.1 | 67,737 | 24.9 | 94.2 | 52.3 |
| CP - | PT | NC_012575.1 | 72,585 | 2,854.1 | 93.7 | 67.8 |
| CP - | PT | NC_024828.1 | 81,133 | 142.1 | 85.5 | 90.6 |
| CR - | PT | HM222968 | 85,535 | 5,523.6 | 100 | 88.5 |
| HP - | PT | NC_007288.1 | 105,309 | 789.9 | 97 | 85.8 |
| HP - | PT | NC_020371.1 | 95,281 | 2,771.8 | 99.4 | 81 |
| API - | APIC | NC_001799.1 | 34,996 | 1,501.4 | 95 | 80.7 |
API, Apicomplexa; MT, mitochondrion; CP, Chlorophyta; DF, Dinoflagellates; PP, Phaeophyta; RP, Rhodophyta; EP, Eustigmatophytes; RH, Raphidophyta; DT, Diatoms; PT, plastid; GP, Glaucophyta; CA, Charophyta; CR, Chromerida; HP, Haptophyta; APIC, apicoplast.
% Refseq: percentage of the reference genome sequence that is covered by one or more reads in the mapping contig.
% Coding: percentage of the organelle genome occupied by coding regions (tRNA-, rRNA-, and protein-coding genes). The “% coding” of each genome was determined for this study using the function “extract annotation” in Geneious. We extracted tRNA-, rRNA-, and protein-coding (coding sequence) gene annotations, then excluded spurious annotations and calculated the total length of the remaining coding sequences.
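The “% coding” calculation described above can be sketched in Python as follows. This is a hedged illustration, not the Geneious workflow itself: the function name `percent_coding`, the 0-based end-exclusive interval format, and the choice to merge overlapping annotations so that shared nucleotides are counted once are all assumptions, since the source does not state how overlapping genes were handled.

```python
def percent_coding(genome_length, annotations):
    """Return the percentage of the genome covered by coding annotations.

    Assumptions (not from the source): annotations are 0-based (start, end)
    pairs with `end` exclusive, spurious annotations have already been
    removed, and overlapping annotations are merged so that each
    nucleotide is counted only once.
    """
    merged = []
    for start, end in sorted(annotations):
        if merged and start <= merged[-1][1]:
            # Overlaps or abuts the previous interval: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])

    coding_length = sum(end - start for start, end in merged)
    return 100.0 * coding_length / genome_length


# Toy example: a 100-nt genome with two overlapping genes and one separate gene.
print(percent_coding(100, [(0, 30), (20, 50), (60, 70)]))
```

If overlapping annotations were instead summed without merging, the result could exceed the true fraction of annotated nucleotides, which is why the merge step is included in this sketch.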