Eun-Chae Kwon1, Jong-Hwa Kim2,3, Nam-Soo Kim4,5. 1. Department of Molecular Bioscience, Kangwon National University, Chuncheon, 24341, Korea. 2. Department of Horticulture, Kangwon National University, Chuncheon, 24341, Korea. 3. Institute of Oriental Biomedicine, Kangwon National University, Chuncheon, 24341, Korea. 4. Department of Molecular Bioscience, Kangwon National University, Chuncheon, 24341, Korea. kimnamsu@kangwon.ac.kr. 5. Institute of Biotechnology and Bioscience, Kangwon National University, Chuncheon, 24341, Korea. kimnamsu@kangwon.ac.kr.
Abstract
BACKGROUND: Chloroplasts are a common feature of plants. During plant evolution, the chloroplasts in each plant lineage have shaped their own genomes (plastomes) through structural changes and the transfer of many genes to the nuclear genome. Some plastid genes contain introns, most of which are group II introns.
OBJECTIVE: This study aimed to gain genomic and evolutionary insights into plastomes from green algae to flowering plants.
METHODS: Plastomes of 115 species from green algae, bryophytes, pteridophytes (spore-bearing vascular plants), gymnosperms, and angiosperms were retrieved from the NCBI organelle genome database. Plastome structure, gene content, and GC content were analyzed with Python code developed in-house. Intronic features, including presence/absence, length, and intron phase, were analyzed manually from the annotations in NCBI.
RESULTS: The canonical quadripartite structure was retained in most plastomes, except for a few that had lost an inverted repeat (IR). Expansion, reduction, or deletion of IRs resulted in length variation among the plastomes. The number of protein-coding genes ranged from 40 to 92, with an average of 79.43 ± 5.84 per plastome, and gene losses were apparent in specific lineages. The number of trn genes ranged from 13 to 33, with an average of 21.19 ± 2.42 per plastome. Ribosomal RNA genes (rrn) were located in the IRs and were therefore present in duplicate, except in species that had lost one of the IRs. GC content varied from 24.9 to 51.0%, with an average of 38.21 ± 3.27%, indicating a bias toward high AT content. Plastid introns were present in 18 protein-coding genes, six trn genes, and one rrn gene. Intron losses occurred among orthologous genes in different plant lineages. The plastid introns were long compared with nuclear introns, which might be related to the difference between spliceosomal nuclear introns and self-splicing group II plastid introns. The trnK-UUU intron contained the maturase-encoding matK gene, except in the chlorophyte algae and monilophyte ferns, in which trnK-UUU was lost but matK was retained. There were many annotation artefacts in the intron positions in the NCBI database. In the analysis of intron phases, phase 0 introns were more frequent than phase 1 and phase 2 introns. Phase polymorphism was observed in the introns of clpP and was derived from nucleotide insertion. Plastid trn introns were long compared with archaeal or eukaryotic nuclear tRNA introns. Of the six plastid trn introns, one was at the D loop and the other five were at the anticodon loop. The insertion sites were conserved among the tRNA genes of archaea, eukaryotic nuclei, and plastids.
CONCLUSIONS: The current study updated previous findings on structural variation, gene content, and GC content of chloroplast genomes from green algae to flowering plants. The study also included some novel findings and discussions on plastome introns, including their length variation and phase variation. We also presented and corrected some false annotations of introns in protein-coding and tRNA genes in the genome database; these corrections might be confirmed by chloroplast transcriptome analysis in the future.
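The two quantities reported above, GC content and intron phase, can be illustrated with a minimal Python sketch. This is not the in-house code used in the study, which is not shown here. The function names gc_percent and intron_phases, and the exon lengths in the example, are hypothetical and chosen only for illustration. The sketch assumes the usual definition of intron phase as the number of coding nucleotides upstream of the intron, modulo 3, so the only possible phases are 0, 1, and 2.

```python
from typing import List


def gc_percent(seq: str) -> float:
    """Return GC content as a percentage of the unambiguous A/C/G/T bases."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * gc / (gc + at)


def intron_phases(exon_lengths: List[int]) -> List[int]:
    """Return the phase of each intron of a protein-coding gene.

    exon_lengths holds the coding length of each exon in transcript order.
    An intron's phase is the count of coding bases before it, modulo 3:
    0 = between codons, 1 = after the first codon base, 2 = after the second.
    """
    phases = []
    upstream = 0
    # The last exon is followed by no intron, so it is skipped.
    for length in exon_lengths[:-1]:
        upstream += length
        phases.append(upstream % 3)
    return phases


if __name__ == "__main__":
    # Hypothetical three-exon gene; the lengths are made up for illustration.
    print(intron_phases([71, 292, 228]))
    print(gc_percent("ATGC"))
```

Under this definition, a single-nucleotide insertion in an upstream exon shifts the phase of every downstream intron by one, which is one way a phase polymorphism among orthologous introns can arise.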