Youjin Deng, Tom Hsiang, Shuxian Li, Longji Lin, Qingfu Wang, Qinghe Chen, Baogui Xie, Ray Ming.
Abstract
Mitochondrial DNA (mtDNA) is a core non-nuclear genetic material found in all eukaryotic organisms, and its size varies extensively in the Eumycota, even within species. In this study, the mitochondrial genomes of six isolates of Annulohypoxylon stygium (Lév.) were assembled from raw PacBio and Illumina sequencing reads. The diversity of genomic structures, conserved genes, intergenic regions and introns was analyzed and compared. The genomes ranged in size from 132 to 147 kb, contained the same sets of conserved protein-coding, tRNA and rRNA genes, and shared the same gene arrangement and orientation. In addition, most intergenic regions were homogeneous and similar in size, except for the region between the cytochrome b (cob) and cytochrome c oxidase I (cox1) genes, which ranged from 2,998 to 8,039 bp among the six isolates. Sixty-five intron insertion sites and 99 different introns were detected in these genomes. Each genome contained 45 or more introns, which varied in distribution and content. Introns from homologous insertion sites also showed high diversity in size, type and content. Comparison of introns at the same loci revealed some complex introns, such as twintrons and ORF-less introns. Forty-four short fragment insertions were detected within introns, within intergenic regions, or as introns, some of them located in conserved domain regions of homing endonuclease genes. Insertions of short fragments such as small inverted repeats might affect or hinder the movement of introns, allowing introns to accumulate in the mitochondrial genomes analyzed and enlarging these genomes. This study showed that the evolution of fungal mitochondrial introns is complex, and the results suggest short fragment insertions as a potential factor leading to larger mitochondrial genomes in A. stygium.
Keywords: ORF-less intron; Twintron; intron insertion site; mitochondrial genome; small inverted repeat
Year: 2018; PMID: 30250455; PMCID: PMC6140425; DOI: 10.3389/fmicb.2018.02079
Source DB: PubMed; Journal: Front Microbiol; ISSN: 1664-302X; Impact factor: 5.640
Figure 1. Diagram of the As31 mitochondrial genome. Fifteen conserved protein-coding genes, 26 tRNAs, small and large subunit rRNAs, and three transferase genes are shown. Blue arrows represent protein-coding genes; pink arrows represent rRNAs; dark red arrows represent tRNAs; green arcs represent introns of conserved protein-coding genes and rRNAs. The mitochondrial genomes of the other five isolates have diagrams similar to that of As31, including gene positions, orientation and order.
Comparison of mitochondrial genomes of six isolates of Annulohypoxylon stygium.
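| Isolate | Genome size (bp) | GC content (%) | Conserved protein-coding genes (no.) | Conserved protein-coding genes (bp) | tRNAs (no.) | tRNAs (bp) | rRNAs (no.) | rRNAs (bp) | Introns (no.) | Introns (bp) | Intergenic regions (bp) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| As03 | 145,449 | 30.17 | 15 | 15,003 | 26 | 1,938 | 2 | 4,341 | 49 | 87,530 | 38,059 |
| As15 | 131,996 | 29.81 | 15 | 15,012 | 26 | 1,938 | 2 | 4,426 | 47 | 79,669 | 32,373 |
| As23 | 147,325 | 29.77 | 15 | 15,015 | 26 | 1,938 | 2 | 4,431 | 47 | 92,443 | 34,920 |
| As24 | 140,328 | 30.05 | 15 | 15,015 | 26 | 1,938 | 2 | 4,430 | 47 | 85,627 | 34,740 |
| As28 | 143,806 | 30.18 | 15 | 15,012 | 26 | 1,938 | 2 | 4,424 | 48 | 88,863 | 34,991 |
| As31 | 133,761 | 29.86 | 15 | 15,012 | 26 | 1,938 | 2 | 4,427 | 45 | 81,437 | 32,369 |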
rps3 (1,422 bp) is located in an intron of rnl and was therefore counted in the table both as a conserved protein-coding gene and as part of an intron.
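The double counting of rps3 can be checked arithmetically. The sketch below is illustrative only: it assumes that the length columns of the table are, in order, conserved protein-coding genes, tRNAs, rRNAs, introns, and the remaining (intergenic) sequence. These column meanings are inferred from the values rather than stated in the record, and the names `rows` and `reconstructed_size` are hypothetical.

```python
# Values copied from the comparison table above.
# ASSUMPTION: the length columns are, in order, conserved protein-coding genes,
# tRNAs, rRNAs, introns, and remaining (intergenic) sequence, all in bp.
RPS3_BP = 1422  # rps3 lies inside an rnl intron, so it appears in two columns

# isolate: (genome size, protein-coding, tRNA, rRNA, introns, last column)
rows = {
    "As03": (145449, 15003, 1938, 4341, 87530, 38059),
    "As15": (131996, 15012, 1938, 4426, 79669, 32373),
    "As23": (147325, 15015, 1938, 4431, 92443, 34920),
    "As24": (140328, 15015, 1938, 4430, 85627, 34740),
    "As28": (143806, 15012, 1938, 4424, 88863, 34991),
    "As31": (133761, 15012, 1938, 4427, 81437, 32369),
}


def reconstructed_size(pcg, trna, rrna, introns, rest, overlap=RPS3_BP):
    """Sum the component lengths and subtract the doubly counted rps3."""
    return pcg + trna + rrna + introns + rest - overlap


for isolate, (genome, *parts) in rows.items():
    total = reconstructed_size(*parts)
    print(f"{isolate}: reported {genome}, reconstructed {total}, match={total == genome}")
```

Under these assumptions the component lengths add up to the reported genome size for each isolate once the 1,422 bp of rps3 is subtracted once.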
Information on introns.
| Intron | Position | Type | As03 | As15 | As23 | As24 | As28 | As31 | Length (bp) | Lowest similarity (%) | Intron group | Domain | ORF |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| nad6-i1 | 80 | I | + | + | + | + | + | + | 1,911–1,923 | 97.3 | ID | L1 | |
| cox3-i1 | 73 | I | + | + | – | – | + | + | 1,330–1,337 | 94 | IB | L1 | orf262 |
| cox3-i2 | 73–76 | I | – | – | + | + | – | – | 2,496 | 100 | IB | L1; L1 | orf262; orf241 |
| cox3-i3 | 92 | I | – | + | – | – | + | + | 2,378 | 100 | II | RT | orf547 |
| cox3-i4 | 143 | I | + | + | + | + | + | + | 1,681–1,689 | 99 | ID | double L1 | orf405 |
| cox3-i5 | 183 | I | + | + | + | + | + | + | 1,640–1,649 | 98.5 | IA | double L1 | orf362 |
| cox3-i6 | 214 | I | – | – | + | + | – | – | 1,779 | 99.9 | IA | L1 | orf266 |
| | | II | + | – | – | – | – | – | 1,743 | – | IA | L1 | orf254 |
| rnl-i1 | 305 | I | – | – | + | + | – | – | 2,562 | 99.96 | II | RVT | orf489 |
| | | II | + | – | – | – | – | – | 3,048 | – | II | RVT | orf625 |
| rnl-i2 | 510 | I | + | – | – | + | – | – | 1,530 | 99.74 | I(derived) | GIY-YIG | orf225 |
| rnl-i3 | 743 | I | + | + | – | – | + | – | 1,862–1,863 | 99.7 | IC1 | GIY-YIG | orf253 |
| | | II | – | – | – | + | – | – | 521 | – | IC1 | – | – |
| rnl-i4 | 1,086 | I | – | – | – | – | + | + | 2,289 | 100 | IC1 | double L1 | – |
| | | II | – | + | + | + | – | – | 464–466 | 99.1 | – | – | – |
| | | III | + | – | – | – | – | – | 1,022 | – | IC1 | – | |
| rnl-i5 | 1,277 | I | + | – | – | – | – | – | 1,317 | – | IB | L1 | orf182 |
| rnl-i6 | 2,291 | I | + | + | + | + | + | + | 2,709–2,717 | 97.7 | rps3 | – | rps3 |
| rnl-i7 | 2,372 | I | + | – | – | – | – | – | 1,640 | – | IC2 | double L1 | orf450 |
| | | II | – | + | + | – | + | + | 1,314 | 99.8 | IC2 | double L1 | orf382 |
| | | III | – | – | – | + | – | – | 1,362 | – | IC2 | double L1 | orf382 |
| rnl-i8 | 2,428 | I | – | + | + | + | + | + | 1,144–1,148 | 99.6 | IA | L2 | orf261 |
| nad2-i1 | 125 | I | – | + | + | + | + | + | 1,487–1,490 | 98.7 | IC2 | double L1 | orf416 |
| | | II | + | – | – | – | – | – | 1,454 | – | IC2 | double L1 | – |
| nad2-i2 | 210 | I | – | + | + | + | + | + | 2,548 | 98.9 | II | RT | orf697 |
| nad2-i3 | 253 | I | + | + | + | + | + | + | 1,397 | 99.6 | – | double L1 | – |
| nad2-i4 | 329 | I | – | – | – | + | – | – | 2,403 | – | II | RT | orf711 |
| nad2-i5 | 566 | I | + | – | – | – | – | – | 855 | – | – | L1 | orf207 |
| atp9-i1 | 60 | I | + | – | – | – | – | – | 519 | – | IA | – | – |
| | | II | – | + | – | – | + | + | 667 | 100 | IA | – | – |
| | | III | – | – | + | + | – | – | 673–675 | 99.7 | IA | – | – |
| cox2-i1 | 77 | I | – | + | – | – | + | – | 1,134 | 100 | IB | GIY-YIG | orf250 |
| nad5-i1 | 83 | I | + | – | – | – | – | – | 2,012 | – | – | double L1 | orf485 |
| nad5-i2 | 108 | I | + | + | + | + | + | + | 1,372 | 99.4 | IC2 | double L1 | orf376 |
| nad5-i3 | 138 | I | – | + | – | – | – | + | 2,895–2,896 | 99.9 | ID | double L1; double L2 | – |
| | | II | – | – | + | + | + | – | 4,523–4,524 | 99.9 | – | double L1; double L2; GIY-YIG | orf405 |
| nad5-i4 | 187 | I | + | + | + | + | + | + | 1,307–1,312 | – | IB | L1 | orf340 |
| nad5-i5 | 233 | I | + | + | + | + | + | + | 2,566–2,574 | 97.7 | ID; ID | – | orf165; orf132 |
| nad5-i6 | 304 | I | + | + | + | + | + | + | 1,210 | 99.6 | IB | L1 | orf380 |
| nad5-i7 | 468 | I | + | – | – | – | – | – | 42 | – | – | – | – |
| cob-i1 | 67 | I | – | + | + | – | + | + | 3,482–3,489 | 99.5 | IB(5') | L1; L1 | orf529; orf461 |
| cob-i2 | 134 | I | + | + | + | + | + | + | 1,562–1,564 | 99.3 | ID | GIY-YIG | orf216 |
| cob-i3 | 143 | I | – | – | + | + | – | – | 4,153–4,155 | 99.9 | ID; IB | double L1 | – |
| | | II | + | + | – | – | + | + | 6,251–6,259 | 99.5 | ID; IB | double L1 | – |
| cob-i4 | 167 | I | + | + | + | + | + | + | 1,244–1,260 | 94.7 | IA | – | orf222 |
| cob-i5 | 207 | I | + | + | + | + | + | + | 933–934 | 96 | ID | L2 | orf219 |
| cob-i6 | 277 | I | + | – | – | – | – | – | 2,156 | – | IB | L2 (b); GIY-YIG | orf296 |
| cox1-i1 | 59 | I | + | – | – | – | + | – | 2,858 | 99.8 | II | RVT | orf852 |
| cox1-i2 | 73 | I | + | + | – | – | + | – | 1,263 | 99.4 | IB | L1 | – |
| | | II | – | – | – | + | – | – | 2,561 | – | ID; IB | double L1; L1 | orf309; – |
| | | III | – | – | – | – | – | + | 2,693 | – | IC2; IB | double L1; L1 | orf401; orf194 |
| | | IV | – | – | + | – | – | – | 4,039 | – | ID; IC2; IB | double L1; double L1; L1 | orf309; orf417; orf194 |
| cox1-i3 | 82 | I | + | – | – | – | – | – | 2,606–2,822 | – | II | RT | orf675 |
| | | II | – | + | + | – | + | + | 2,676–2,679 | 100 | II | RT | orf675 |
| | | III | – | – | – | + | – | – | 2,822 | – | II | RT | orf675 |
| cox1-i4 | 93 | I | + | – | – | – | – | – | 2,596 | – | IB | double L1 | – |
| cox1-i5 | 98 | I | + | + | + | + | + | + | 1,138 | 100 | IB | L1 | orf316 |
| cox1-i6 | 131 | I | + | + | + | + | + | + | 1,717–1,779 | 92.9 | IB | GIY-YIG | orf275 |
| cox1-i7 | 149 | I | – | – | + | – | – | – | 2,682 | – | IB | L1; L1 | orf326 |
| cox1-i8 | 174 | I | + | – | + | + | – | – | 2,465–2,468 | 96.3 | IB | – | orf718 |
| cox1-i9 | 205 | I | – | + | – | – | + | + | 1,300 | 100 | ID | double L1 | orf173 |
| | | II | + | – | + | + | – | – | 1,368 | 100 | ID | double L1 | orf173 |
| cox1-i10 | 214 | I | + | + | + | + | + | + | 1,062 | 99.3 | IB | double L1 | orf316 |
| cox1-i11 | 243 | I | + | – | – | – | – | – | 4,817 | – | IB; ID | L1; double L1 | orf404 |
| | | II | – | + | – | – | – | – | 1,576 | – | IB | L1 | orf404 |
| | | III | – | – | + | – | – | – | 3,212 | – | IB | L1 | orf404 |
| | | IV | – | – | – | + | + | + | 3,084–3,086 | 99.2 | IB; ID | L1; double L1 | orf404 |
| cox1-i12 | 257 | I | + | + | + | + | + | + | 1,128 | 99.3 | IB | L1 | orf346 |
| cox1-i13 | 269 | I | + | – | – | – | – | – | 1,427 | – | IB | L1 | – |
| | | II | – | + | – | – | + | + | 1,495 | 100 | IB | L1 | – |
| cox1-i14 | 294 | I | + | – | – | – | – | – | 1,984 | 100 | – | L2 | orf501 |
| | | II | – | + | + | + | + | + | 1,841–1,847 | 99.5 | IB | L1; L2 | orf445 |
| cox1-i15 | 323 | I | – | + | + | + | + | + | 2,427 | 98.3 | I(derived) | GIY-YIG; GIY-YIG | orf268; orf331 |
| | | II | + | – | – | – | – | – | 2,474 | – | I(derived) | GIY-YIG; GIY-YIG | orf430; orf331 |
| cox1-i16 | 339 | I | + | + | + | + | + | + | 1,206–1,229 | 96.9 | I(derived) | double L1 | orf277 |
| cox1-i17 | 397 | I | – | + | – | – | + | + | 1,607 | 100 | IB | GIY-YIG | orf445 |
| nad1-i1 | 49 | I | – | + | – | – | + | + | 2,231 | 100 | IB(3') | GIY-YIG | orf320 |
| nad1-i2 | 84 | I | + | – | – | – | – | – | 1,467 | – | – | double L1 | – |
| | | II | – | + | + | + | + | + | 1,594–1,596 | 97.6 | – | double L1 | – |
| nad1-i3 | 99 | I | + | + | + | + | + | + | 1,445–1,480 | 97.6 | IC2 | double L1 | – |
| nad1-i4 | 214 | I | – | + | – | – | – | + | 2,471 | 100 | IB | GIY-YIG (b) | – |
| | | II | – | – | + | – | – | – | 3,903 | – | IB | GIY-YIG (b); double L1 (b) | |
| | | III | – | – | – | + | – | – | 3,878 | – | IB | GIY-YIG (b); double L1 | orf392 |
| | | IV | – | – | – | – | + | – | 3,843 | – | IB | GIY-YIG (b); double L1 | orf392 |
| | | V | + | – | – | – | – | – | 3,896 | – | IB | GIY-YIG (b); double L1 | orf327 |
| nad4-i1 | 152 | I | – | – | + | + | – | – | 2,138 | 100 | – | double L1 | orf302 |
| atp6-i1 | 120 | I | + | + | + | + | + | + | 1,490–1,511 | 96.9 | IB | double L1 | orf346 |
| atp6-i2 | 181 | I | – | + | – | – | + | + | 1,918 | 100 | – | GIY-YIG | orf382 |
| | | II | – | – | + | – | – | – | 3,269 | – | – | GIY-YIG; GIY-YIG (b) | – |
| atp6-i3 | 198 | I | – | – | + | – | – | – | 1,946 | – | IC2 | GIY-YIG | – |
| | | II | + | – | – | – | – | – | 1,905 | – | – | GIY-YIG | – |
| rns-i1 | 860 | I | – | – | + | + | – | – | 2,389–2,390 | 99.9 | II | double L1 | orf366 |
| rns-i2 | 910 | I | + | – | – | – | – | – | 12 | – | – | – | – |
| | | II | – | + | + | + | + | + | 50 | 100 | – | – | – |
| rns-i3 | 1,108 | I | – | – | – | + | – | – | 1,406 | – | IC2 | double L1 | orf418 |
| rns-i4 | 1,440 | I | – | + | – | – | + | + | 432 | 100 | ID | – | – |
| rns-i5 | 1,557 | I | + | + | – | – | + | + | 960–972 | 97 | – | GIY-YIG | – |
| | | II | – | – | + | – | – | – | 1,007 | – | – | GIY-YIG | – |
| | | III | – | – | – | + | – | – | 1,061 | – | – | GIY-YIG | – |
Column 2: “position” is the insertion site of the intron in the conserved gene, given in aa for protein-coding genes and in bp for rRNA sequences; a range means that the exact insertion site is uncertain but lies within that interval. Column 3: “type” distinguishes the introns located at the same insertion site; introns with more than 90% nucleotide identity and similar length are classified as the same type. Columns 4–9: +/– indicates the presence or absence of the corresponding intron. Column 11: “lowest similarity” is the lowest similarity value among the pairwise alignments of introns of the same type. Column 13: L1, LAGLIDADG_1 domain; L2, LAGLIDADG_2 domain; RT, reverse transcriptase domain; RVT, RNA-dependent DNA polymerase domain; (b), degraded domain. Column 14: “ORF” represents intact conserved domain sequences.
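The “lowest similarity” value described in the note above can be illustrated with a short sketch. It assumes that the introns of one type have already been aligned to a common length and that similarity is the percentage of identical alignment columns; the record does not state which alignment tool or identity definition was used, so both are assumptions. The sequences and the names `percent_identity` and `lowest_similarity` are hypothetical.

```python
from itertools import combinations
from typing import Dict, Optional


def percent_identity(a: str, b: str) -> float:
    """Percent identical columns between two pre-aligned sequences.

    ASSUMPTION: gaps are written as '-', columns where both sequences have a
    gap are ignored, and a gap opposite a base counts as a mismatch.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(x, y) for x, y in zip(a, b) if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("no comparable alignment columns")
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


def lowest_similarity(aligned: Dict[str, str]) -> Optional[float]:
    """Lowest pairwise identity within one intron type.

    Returns None when only one isolate carries the type, since no pairwise
    comparison is possible (shown as a dash in the table).
    """
    if len(aligned) < 2:
        return None
    return min(percent_identity(a, b) for a, b in combinations(aligned.values(), 2))


# Toy, made-up sequences standing in for one intron type in three isolates.
toy = {
    "As03": "ACGTACGTAC",
    "As15": "ACGTACGTAC",
    "As28": "ACGTTCGTAC",
}
print(lowest_similarity(toy))
```

With real data, each value in `aligned` would be one row of a multiple sequence alignment of the introns assigned to the same type.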
Figure 2. Possible evolution of complex introns. Del operators point to the insertion sites of introns into the conserved genes. Black bars indicate conserved sequences of introns. Bars of other colors represent new insertion elements. Co-linearity between introns is indicated by dotted lines. Conserved domain regions are shown by solid lines under the bars.
Figure 3. Formation of three ORF-less introns. Black bars indicate conserved sequences of introns. Blue bars represent lost parts of sequences. Bright green bars represent short insertion fragments. Conserved domain regions are represented by solid lines under the bars.