Novel Structural Variation and Evolutionary Characteristics of Chloroplast tRNA in Gossypium Plants.

Ting-Ting Zhang1, Yang Yang1, Xiao-Yu Song1, Xin-Yu Gao1, Xian-Liang Zhang2, Jun-Jie Zhao2, Ke-Hai Zhou2, Chang-Bao Zhao2, Wei Li2, Dai-Gang Yang2, Xiong-Feng Ma2,3, Zhong-Hu Li1.   

Abstract

Cotton is one of the most important fiber and oil crops in the world. Chloroplast genomes harbor their own genetic materials and are considered to be highly conserved. Transfer RNAs (tRNAs) act as "bridges" in protein synthesis by carrying amino acids. Currently, the variation and evolutionary characteristics of tRNAs in the cotton chloroplast genome are poorly understood. Here, we analyzed the structural variation and evolution of chloroplast tRNA (cp tRNA) based on eight diploid and two allotetraploid cotton species. We also investigated the nucleotide evolution of chloroplast genomes in cotton species. We found that cp tRNAs in cotton encoded 36 or 37 tRNAs, and 28 or 29 anti-codon types with lengths ranging from 60 to 93 nucleotides. Cotton chloroplast tRNA sequences possessed specific conservation and, in particular, the Ψ-loop contained the conserved U-U-C-X3-U. The cp tRNAs of Gossypium L. contained introns, and cp tRNAIle contained the anti-codon (C-A-U), which was generally the anti-codon of tRNAMet. The transition and transversion analyses showed that cp tRNAs in cotton species were iso-acceptor specific and had undergone unequal rates of evolution. The intergenic region was more variable than coding regions, and non-synonymous mutations have been fixed in cotton cp genomes. On the other hand, phylogeny analyses indicated that cp tRNAs of cotton were derived from several inferred ancestors with greater gene duplications. This study provides new insights into the structural variation and evolution of chloroplast tRNAs in cotton plants. Our findings could contribute to understanding the detailed characteristics and evolutionary variation of the tRNA family.

Keywords:  chloroplast tRNA; cotton; evolution; phylogenetic relationship; structural variation

Year:  2021        PMID: 34071968      PMCID: PMC8228828          DOI: 10.3390/genes12060822

Source DB:  PubMed          Journal:  Genes (Basel)        ISSN: 2073-4425            Impact factor:   4.096


1. Introduction

Chloroplasts (cp) are unique oblate organelles equipped with important photosynthesis functions [1]. Besides the complete photosynthetic components, they have also been reported as active organelles participating in other diverse biomolecule processes including biosynthesis of starch, fatty acids, pigments, amino acids, etc. [2,3,4]. Some previous studies found that chloroplasts originated from cyanobacterial ancestors [5,6]. For most plants, the cp genome is a double-stranded circular unit involving four main regions: two inverted repeats (IRs) separated by a large single copy (LSC) region and a small single copy (SSC) region [7,8]. In addition, the features of non-recombination and maternal inheritance (in most angiosperms) of the cp genome provide the opportunity to employ it as material for evolutionary and genomics research [9,10]. There are many genes, such as protein-coding genes and multiple tRNA genes, present in the cp genome [11]. Relevant studies have shown that most of these genes play a role in the organization of photosynthesis and biochemical reactions [12,13,14]. However, it is worth noting that the genetic and mutational characteristics of cp genes, especially cp tRNA genes in angiosperms, still need research. Generally, single nucleotide polymorphisms (SNPs) and insertions/deletions are the basis of the differences between most alleles. With the characteristics of high abundance, a fairly low mutation rate, and the adaptability to automatic genotyping, SNPs are employed more frequently than other genetic markers, such as microsatellites [15]. Insertions and deletions (indels) are essential sources of polymorphic markers for high-resolution genetic mapping of traits and association studies based on candidate genes or possibly the whole genome [16]. Furthermore, SNPs and indels are also important in the nucleotide sequence evolution analysis of genomes. 
The calculation of single nucleotide variation may help estimate and comprehend genetic variations of different genome regions. According to previous studies, the polynucleotide sequences of tRNAs fold back, forming cloverleaf-like structures with hydrogen bonds, and then turn into L-shaped tertiary structures [17]. The secondary structure of tRNA contains an acceptor arm, dihydrouridine arm (D-arm), dihydrouridine loop (D-loop), anti-codon arm, anti-codon loop, variable loop, pseudouridine arm (Ψ-arm), and pseudouridine loop (Ψ-loop) [18,19]. With this specific structure, tRNA plays an important role in protein synthesis by carrying amino acids to the ribosome [20,21]. Recently, researchers conducted a series of analyses related to tRNAs on the genomic level. Asymmetric combinations and divided segments in tRNA genes would help to understand the diversity of tRNA molecules [22]. Additionally, studies showed that the heterogeneous tRNA fragments play multiple roles in terms of size, nucleotide composition, biogenesis, and even biological disease [23]. Moreover, wobble modifications were frequently found in tRNAs in diverse species after the discovery of tRNA molecules [24,25,26]. In recent years, studies on the genomic structure and evolution of tRNAs in plants have been attracting researchers’ attention. For example, analyses of detailed molecular aspects in cyanobacterial tRNAs, complete genomic features of tRNAs in Oryza sativa L., and the evolutionary perspective of chloroplast tRNAs (cp tRNAs) in some economic monocots were previously conducted [27,28,29]. Gossypium L., or the cotton plant, is an important economic and oil crop that has been cultivated worldwide. Gossypium is a large genus belonging to the angiosperm family Malvaceae. Fryxell divided the cotton genus into four subgenera, with a total of 51 species, including 46 diploid (2n = 2× = 26) and five tetraploid (2n = 4× = 52) species [30]. 
The genome composition of tetraploid cotton is heterogeneous, combining the A and D genomes: Gossypium hirsutum L. (AD1), G. barbadense L. (AD2), G. tomentosum Nuttall ex Seemann (AD3), G. mustelinum Miers ex Watt (AD4), and G. darwinii Watt (AD5). The diploid cottons fall into eight genome groups, A–G and K. Among them, G. hirsutum, G. barbadense, G. arboreum L., and G. herbaceum L. are cultivars, while the others are wild species. Nowadays, the most widely cultivated cotton species are tetraploid G. hirsutum and G. barbadense [31]. At present, with the rapid development of genome sequencing technology, whole-genome sequences of cotton have been released, which provide the foundation for further analysis of cotton genomes [32,33]. The genome-wide landscape of genomic variation of cotton was constructed through SNP distribution density detection, and functional genes that encode proteins involved in regulation of tissue growth, stress responses, and disease resistance were reported [34,35,36,37,38,39]. Moreover, a previous study on variations of repeat sequences and cp evolutionary relationships detected divergence hotspots in plastid genomes and helped to understand phylogenetic relationships among major Gossypium lineages [40]. However, the evolutionary patterns of Gossypium chloroplast tRNAs are still unclear, so a study of the genomic and evolutionary characteristics of cotton cp tRNAs is warranted. In this study, we investigated ten globally representative cotton cp genomes, including eight diploid and two allotetraploid species. The aims of our study were to: (1) identify the genomic characteristics and diversification of cp tRNAs in cotton; (2) analyze the evolutionary relationship of introns in cp tRNA genes; (3) estimate the evolutionary characteristics of SNPs and the indel mutation rate of the cotton chloroplast genome; and (4) investigate the evolutionary pattern of cp tRNAs in cotton species.

2. Materials and Methods

2.1. Identification of tRNAs

We downloaded the 10 cotton cp genomes (8 diploid and 2 allotetraploid species: Gossypium arboreum L., G. anomalum Wawra and Peyritsch, G. robinsonii (F. Muell.) J. H. Willis, G. klotzschianum Andersson, G. somalense (Gürke) J. B. Hutch., G. longicalyx Hutchinson and Lee, G. bickii Prokhanov, G. populifolium (Benth.) F. Muell., G. hirsutum L., and G. barbadense L.) from the National Center for Biotechnology Information (NCBI) (Table 1). These cotton species are widely distributed near the equator, and G. hirsutum, G. barbadense, and G. klotzschianum are mainly distributed in the Americas. Subsequently, tRNA gene sequences, excluding intergenic regions, were identified and extracted from the cp genomes using Geneious 8.0.2 [41].
Table 1

Gossypium species information in the study.

| Karyotype | Species | Accession Number | Wild/Cultivars |
|---|---|---|---|
| A2 | Gossypium arboreum | NC_016712 | cultivars |
| B1 | G. anomalum | NC_023213 | wild |
| C2 | G. robinsonii | NC_018113 | wild |
| D3-k | G. klotzschianum | NC_033394 | wild |
| E2 | G. somalense | NC_018110 | wild |
| F1 | G. longicalyx | NC_023216 | wild |
| G1 | G. bickii | NC_023214 | wild |
| K2 | G. populifolium | NC_033398 | wild |
| AD1 | G. hirsutum | HQ901196 | cultivars |
| AD2 | G. barbadense | HQ901199 | cultivars |

2.2. Structural Analysis of tRNAs

ARAGORN and tRNAscan-SE software [42,43] were employed to investigate the secondary structure of tRNA sequences of cp genomes. ARAGORN was run with its default parameters. The parameters of tRNAscan-SE were set as: bacterial for sequence source, default for search mode, formatted (FASTA) for query sequences, and universal for genetic code for tRNA isotype prediction.

2.3. Sequence Alignment

To identify the presence of consensus sequences, sequence alignments were carried out for the intron sequences of the cotton cp tRNAs, Pinus armandii Franch., Marchantia polymorpha L., Raphanus sativus L., Spirogyra maxima (Hassall) Wittrock, Alsophila spinulosa (Wallich ex Hooker) R. M. Tryon Contr. Gray Herb., Zea mays L., Gloeocapsa sp. PCC 73106, Nostoc sp. PCC 7107, and Nostoc sp. PCC 7524 using Multalin software with default parameters [44,45,46].

2.4. Phylogenetic Tree Construction

The phylogenetic tree was constructed using MEGA 7 software [47,48]. To investigate the evolution of chloroplast tRNAs, a matrix of whole tRNA sequences was created with Clustal Omega before the phylogenetic tree was constructed. MEGA 7 was used to convert the matrix file into MEGA file format for tree construction, and the substitution model with the lowest Bayesian information criterion (BIC) score was selected. The Kimura 2-parameter + G + I model had the lowest BIC score (8980.50) and was therefore adopted for the phylogenetic tree construction. The other related parameters were as follows: phylogeny reconstruction for analysis, maximum likelihood statistical method, bootstrap method in phylogeny test, 1000 bootstrap replicates, nucleotide type, γ distributed with invariant sites (G + I) model, 5 discrete γ categories, partial deletion for gaps/missing data treatment, 95% site coverage cutoff, and very strong for branch swap filter.

2.5. Analysis of Disparity Index

A disparity index test of pattern heterogeneity was conducted to check the homogeneity of nucleotide substitutions and find whether all substitutions in nucleotides happened at equal rates, i.e., homogeneity in the process of evolution. The statistical parameters were as follows based on a previous study [29]: disparity index test for substitution pattern homogeneity, in sequence pairs, 10,001 Monte Carlo replications, nucleotide for substitution type, partial deletion of gaps/missing data treatment, and site coverage cutoff 95%.

2.6. Transition/Transversion Analysis

The transition and transversion rates of the tRNA genes were analyzed according to their isotypes using MEGA 7 software [49]. The parameters were set as follows: substitution pattern estimation (ML) for analysis, automatic (neighbor-joining tree), maximum likelihood statistical method, nucleotide for substitution type, Kimura 2-parameter model, γ distributed (G) site rates, 5 discrete γ categories, partial deletion of gaps/missing data treatment, 95% site coverage cutoff, and very strong branch swap filter.
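To make the transition/transversion distinction concrete, the sketch below classifies differences between two aligned sequences. It is only an illustration: a transition swaps a purine for a purine or a pyrimidine for a pyrimidine, and a transversion swaps between the two classes. The function names are hypothetical, and simple pairwise counting is not the maximum likelihood estimate that MEGA reports.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(a, b):
    """Return 'transition', 'transversion', or None if the site is not a substitution."""
    # Treat RNA and DNA alphabets alike by mapping U to T.
    a = a.upper().replace("U", "T")
    b = b.upper().replace("U", "T")
    valid = PURINES | PYRIMIDINES
    # Identical bases, gaps, and ambiguity codes are skipped.
    if a == b or a not in valid or b not in valid:
        return None
    same_class = (a in PURINES) == (b in PURINES)
    return "transition" if same_class else "transversion"


def count_ts_tv(seq1, seq2):
    """Count transitions and transversions between two aligned, equal-length sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    counts = {"transition": 0, "transversion": 0}
    for a, b in zip(seq1, seq2):
        kind = classify_substitution(a, b)
        if kind is not None:
            counts[kind] += 1
    return counts


# Toy aligned fragments (hypothetical, not taken from the cotton data).
print(count_ts_tv("GGGGAUA", "GAGGCUA"))
```

In the toy pair, G→A is a transition and A→C is a transversion; all other sites are identical.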

2.7. Evolutionary Analysis of Single Nucleotide Polymorphisms

The diversity of single nucleotide polymorphisms of 8 diploid Gossypium cp genomes was calculated based on 3 regions: coding regions, introns, and intergenic spacers. The synonymous (dS) and non-synonymous (dN) substitutions of coding genes were also calculated by DnaSP v5.10 software [50]. The coding regions, introns, and intergenic regions of the studied Gossypium cp genome were extracted through Geneious 8.0.2 software and aligned manually [41].

2.8. Calculation of Mutation Rate

The rate of indel mutation (μ, per site per year) was calculated with the formula $\mu = \frac{m}{nT}$, where m is the number of sites of observed mutation, n is the total number of sites, and T is the divergence time of Gossypium. The μ value of structural mutations was calculated according to the method of Saitou and Ueda [51]. The T value was obtained through relevant published literature searches in the Fossilworks database and the Cenozoic Angiosperm Database [52,53]. Additionally, the mutation rates of protein-coding genes and tRNA genes were calculated.

2.9. Duplication/Loss Analysis of tRNA Genes

To investigate duplication or loss events of the tRNAs, we used the NCBI taxonomy browser to construct the species tree of G. arboreum, G. anomalum, G. robinsonii, G. klotzschianum, G. somalense, G. longicalyx, G. bickii, G. hirsutum, G. barbadense, and G. populifolium. Additionally, the previously constructed phylogenetic tree of the tRNAs was employed as the gene tree. Subsequently, Notung 2.9 software [54,55] was used to reconcile the gene tree and the species tree and obtain the gene duplication and loss nodes.

3. Results

3.1. Basic Characteristics of Cotton Chloroplast tRNAs

The results of detailed genomic analysis showed that G. arboreum, G. anomalum, G. robinsonii, G. klotzschianum, G. somalense, G. hirsutum, G. barbadense, G. longicalyx, and G. populifolium each encoded 37 tRNAs, while only G. bickii encoded 36 tRNAs (Table 2). The length of the chloroplast tRNAs ranged from 60 nt (tRNAGly in G. arboreum, GCC) to 93 nt (tRNASer, UGA), with an average length of 76 nt (Table S1).
Table 2

Distribution of tRNA isotypes in cotton chloroplast genome.

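Number of tRNAs per isotype in each genome:

| tRNA Isotype | A2 (1) | B1 (2) | C2 (3) | D3-K (4) | E2 (5) | F1 (6) | AD1 (7) | AD2 (8) | G1 (9) | K2 (10) |
|---|---|---|---|---|---|---|---|---|---|---|
| Alanine | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
| Glycine | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
| Proline | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| Threonine | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
| Valine | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 |
| Serine | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 |
| Arginine | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 |
| Leucine | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 |
| Phenylalanine | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| Asparagine | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
| Lysine | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| Aspartate | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| Glutamate | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| Histidine | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| Glutamine | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| Isoleucine | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 |
| Methionine | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 1 | 2 |
| Tyrosine | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| Cysteine | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| Tryptophan | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| Selenocysteine | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Suppressor | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Total | 37 | 37 | 37 | 37 | 37 | 37 | 37 | 37 | 36 | 37 |

(1) Gossypium arboreum; (2) G. anomalum; (3) G. robinsonii; (4) G. klotzschianum; (5) G. somalense; (6) G. longicalyx; (7) G. hirsutum; (8) G. barbadense; (9) G. bickii; (10) G. populifolium.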

The genomic analysis results showed that chloroplast tRNA genes of the investigated cotton plants coded 28 or 29 anti-codon types, of which only G. anomalum, G. longicalyx, and G. bickii coded 29 anti-codons. The most common anti-codons observed in cp tRNAs were UGC-tRNAAla, GCC-tRNAGly, GAC-tRNAVal, ACG-tRNAArg, CAA-tRNALeu, GUU-tRNAAsn, CAU-tRNAIle, GAU-tRNAIle, and CAU-tRNAMet. Additionally, each of these genes, except GCC-tRNAGly and CAU-tRNAMet, had two copies (Table S2). GCC (tRNAGly) was found with one copy in G. anomalum, G. longicalyx, and G. bickii but two copies in the other cotton species. Similarly, CAU (tRNAMet) was found with one copy in G. bickii but two copies in the other cotton species. There were 33 tRNAs with various iso-acceptors missing in the cp genome of Gossypium (Table S2). UCC-tRNAGly was observed in G. anomalum, G. longicalyx, and G. bickii. All investigated cotton species were shown to possess at least one anti-codon type for each kind of tRNA. Furthermore, the anti-codon CAU was a typical characteristic of tRNAMet, which harbored only one type of iso-acceptor. Apart from the existence of the anti-codon CAU in tRNAMet, tRNAIle was also observed to encode the anti-codon CAU in the cotton cp genome (Table S2). All the tRNA gene families were analyzed by multiple sequence alignment, from which the limited conservation and consensus sequences were found in the Ψ-arm and Ψ-loop. The Ψ-arm of tRNAs was observed to contain the G-G consensus sequence, and the Ψ-loop was observed to contain the conserved sequence U-U-C-X3-U (Table 3).
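The reason a CAU anti-codon on tRNAIle stands out can be seen from plain base pairing. The sketch below assumes strict Watson-Crick pairing only, with no wobble and no base modification, and derives the codon an anti-codon would read by taking its reverse complement. The helper name is hypothetical.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_to_codon(anticodon):
    """Return the codon (5'->3') paired by an anti-codon (5'->3') under Watson-Crick rules."""
    # Codon and anti-codon pair antiparallel, so reverse before complementing.
    return "".join(COMPLEMENT[base] for base in reversed(anticodon.upper()))


for anticodon in ("CAU", "GAU", "UGC", "GCC"):
    print(anticodon, "->", anticodon_to_codon(anticodon))
```

Under this simplification CAU pairs with AUG, the methionine codon, whereas GAU pairs with the isoleucine codon AUC.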
Table 3

Sequence alignment and the presence of isotype-specific conserved nucleotide consensus sequences in the cotton chloroplast tRNAs.

tRNA IsotypeAC-ArmD-ArmD-LoopANC-ArmANC-LoopVariable RegionΨ-ArmΨ-Loop
AlanineGGGGAUAGCUCAGUUGGUACCGCUCUUGCAUAUGUCAGCGGUUCGAGU
ArginineGXGXCX2Gx3AX2GGAUA*****CUXCXAAGUGGUUCGAAU
AsparagineGUCGGGAGCUCAGUUGGUAGUCGGCUGUUAAUGGUCGUAGGUUCGAAU
AspartateGGGAUUGGUUCAAUUGGUCACCGCCCUGUCAAAAGCUGCGGGUUCGAGC
CysteineGGCGACAGCCGAGCGGUAAGGGGACUGCAAAUAUUCCCCAGUUCAAAU
GlutamateGX2CX3GX3AGXGGUX1–3CX2CXCUUUCAXX2GX1–2X3GXUUCXAXU
GlutamineUGGGGCGGCCAAGUGGUAACGGGUUUUGGUCUAUGCGGAGGUUCGAAU
GlycineGCGGAUAGUCGAAUGGUAAAUCUCUUUGCCAAAGACGCGGGUUCGAUU
HistidineGCGGAUGGCCAAGUGGAUCAAGUGGAUUGUGAACAUGCGCGGGUUCAAUU
IsoleucineGCAUCCAGCUGAAUGGUUAACCCAACUCAUAAAAUUCGUAGGUUCAAUU
LeucineGX6GXGAAAUXGX3–4AX3GXCUX4AXGX9–12X3GGUUCXAGU
LysineGGGUUGCACUCAACGGUAUCGGCUUUUAACUAGUUCCGGGUUCGAGU
MethionineXCX5–6X3GAGUX5–6*****XUCAUAXX2GUCAUXGGUUCAAAU
PhenylalanineGUCGGGAGCUCAGUUGGUAGAGGACUGAAAAGUGUCACCAGUUCAAAU
ProlineAGGGAUGGCGCAGCUUGGUAUUUGUUUUGGGUAUGUCACGGGUUCAAAU
SerineGGAGAGAGCX1–2X4GX3–4AX2GX1–2XUXGXAXX4GX15–19GAGGGUUCGAAU
ThreonineXGCCX0–4XCUCAGXGGUAXCGCXX3GUAAX2GUCAUCGGUUCX3U
TryptophanGCGCUCUGUUCAGUUCGGUAUGGGUCUCCAAAAUGUCGUAGGUUCAAAU
TyrosineGGGUCGACCCGAGCGGUUAAACGGACUGUAAAGGCAGCUGGUUCAAAU
ValineAGGGAUAACUCAGCGGUAUCACCUUGACGUAAGUCAUCAGUUCGAGC

Note: The asterisk (*****) shows the absence of conserved nucleotide consensus sequence in the respective region of the chloroplast tRNAs.
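The Ψ-loop consensus U-U-C-X3-U can be expressed as a simple pattern, where X3 stands for any three nucleotides. The following is a minimal sketch, not the authors' procedure; the function name is hypothetical, and the example loops are assumed to be the last seven nucleotides of the corresponding rows of Table 3.

```python
import re

# U-U-C, then any three nucleotides, then U: seven positions in total.
PSI_LOOP_CONSENSUS = re.compile(r"UUC[ACGU]{3}U")


def has_psi_loop_consensus(loop):
    """Return True if the whole loop sequence matches U-U-C-X3-U."""
    return PSI_LOOP_CONSENSUS.fullmatch(loop.upper()) is not None


# Assumed Ψ-loop sequences (final 7 nt of the alanine, lysine, and aspartate rows).
for name, loop in [("Alanine", "UUCGAGU"), ("Lysine", "UUCGAGU"), ("Aspartate", "UUCGAGC")]:
    print(name, has_psi_loop_consensus(loop))
```

A loop ending in C, as assumed here for aspartate, does not match the pattern, which is consistent with the consensus applying to most but not all tRNAs.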

Most of the tRNAs possessed a G nucleotide at the first position of the acceptor arm, while tRNAGln began with a U and tRNAPro and tRNAVal began with an A (Table 3). Except for tRNALys, tRNAMet, tRNAThr, tRNATyr, and tRNAVal, the tRNAs contained a G at the first nucleotide of the D-arm. Additionally, there was usually an A at the last position of the D-loop and ANC-loop. At the final site of the D-arm, except for tRNAArg, tRNAGlu, tRNAIle, tRNALeu, tRNAMet, tRNASer, and tRNATyr, the other investigated tRNAs were found to have a C (Table 3). We also observed that all tRNALys, tRNAAla, tRNAGlu, tRNAArg, tRNATyr, and a few tRNALeu had a C-C-A tail at the 3′ end.

3.2. Diversification of tRNA Structure

The tRNA feature of possessing various arms and loops was responsible for protein translation. The results of structure analysis showed that the acceptor arm in cotton chloroplast tRNA contained 6 to 7 nt. Among the 369 investigated tRNA sequences, only nine were found to contain 6 nt, while the remaining 360 tRNAs (97.56%) contained 7 nt (Table S3). In addition, the D-arm was observed to contain 2 to 4 nt, among which 20 contained only 2 nt, and 110 (29.81%) contained 3 nt. The remaining tRNAs (64.50%) were observed to possess 4 nt in the D-arm. Moreover, the D-loop contained 7 to 11 nt. For all the involved tRNAs, 84 contained 7 nt in their D-loops; 69 (18.70%) contained 8; 105 (28.46%) contained 9; 40 (10.84%) contained 10; and the rest (18.98%) contained 11 nt (Table S3). In all 369 tRNAs, the anti-codon arm had mainly 4 to 5 nt. Among them, 319 (86.45%) had 5 nt, and 50 (13.55%) had 4 nt. In addition, 359 tRNAs (97.29%) contained 7 nt in their anti-codon loops and 10 (only 2.71%) contained 9 nt. This showed that the conservative sequence of anti-codon loops was rather typical (Table 3 and Table S3). For variable loops, 10 tRNAs (2.71%) contained 2; 30 tRNAs (8.13%) contained 3, 67 tRNAs (18.16%) contained 4; 213 tRNAs (57.72%) contained 5; 39 tRNAs (10.57%) contained 6; and 10 tRNAs (2.71%) contained 8 nt (Table S3). While the Ψ-arms of all the analyzed chloroplast tRNAs were observed to contain 5 nt, most tRNAs (349, 94.58%) had 7 nt in their Ψ-loops, apart from several tRNAArg (Table S3).
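The percentages quoted for each structural element are shares of the 369 tRNAs. As a rough illustration, the sketch below tabulates a length distribution and reproduces the variable-loop figures from the counts given in the text. The function name is hypothetical, and the list of lengths is rebuilt from the reported counts rather than from Table S3.

```python
from collections import Counter


def length_distribution(lengths):
    """Map each length to (count, percentage of all sequences, rounded to 2 decimals)."""
    counts = Counter(lengths)
    total = sum(counts.values())
    return {
        length: (n, round(100 * n / total, 2))
        for length, n in sorted(counts.items())
    }


# Variable-loop lengths rebuilt from the counts reported above (369 tRNAs in total).
variable_loop_lengths = [2] * 10 + [3] * 30 + [4] * 67 + [5] * 213 + [6] * 39 + [8] * 10

for length, (count, percent) in length_distribution(variable_loop_lengths).items():
    print(f"{length} nt: {count} tRNAs ({percent}%)")
```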

3.3. Chloroplast tRNA Contained Introns

The cotton chloroplast tRNAs had intron annotations according to previous studies. The tRNAVal of G. populifolium was observed to contain an intron in its anti-codon loop region (Figure 1). The introns in bacterial and plant chloroplast tRNAs had conserved G-A-T-T-T and C-T-T-C-A consensus sequences (Figure 2). Phylogenetic analysis showed that chloroplast tRNA introns grouped with the introns in cyanobacteria (Figure 3). Introns contained in the same tRNAs of most plants (tRNAIle and tRNALeu) tended to appear in the same branch. This revealed their close phylogenetic relationship. The introns in chloroplast tRNAVal of G. populifolium and Zea mays L. aggregated to the same branch (Figure 3), showing that the introns of corn and cotton have a close phylogenetic relationship.
Figure 1

Presence of an intron in chloroplast tRNA, located in the tRNA anti-codon loop. The blue dots represent base pairs with two hydrogen bonds; the red dots represent base pairs with three hydrogen bonds; and the green dot indicates the position of the intron.

Figure 2

Multiple sequence alignment of introns in tRNA of plant chloroplast and cyanobacteria. Introns in bacterial and chloroplast tRNAs had conserved G-A-T-T-T and C-T-T-C-A consensus sequences.

Figure 3

Phylogenetic relationship of introns in chloroplast tRNAs (from angiosperms, gymnosperms, ferns, bryophytes, and algae). Introns of chloroplast tRNAs grouped with those of cyanobacteria, illustrating a common cyanobacterial origin of introns in chloroplasts. Introns in chloroplast tRNAVal of Gossypium populifolium and Zea mays grouped together, indicating a shared evolutionary ancestor of corn and cotton introns. Introns present in the same tRNA of plants (tRNAIle and tRNALeu) tended to appear in the same branch, showing their close phylogenetic relationship.

3.4. Chloroplast tRNAs with Non-Typical Features

A few unconventional tRNAs were observed in cotton cp genomes. tRNALeu, tRNASer, and tRNATyr were observed to have a loop in the variable region (Figure 4). In these non-typical tRNAs, the anti-codon loop harbored 7 nt with the X-U-X3-A-A consensus sequence, and the stem of the anti-codon loop contained 4 to 5 nt. The variable loop region was observed to have 2 to 8 nt for tRNALeu, tRNASer, and tRNATyr. The stem of the variable region contained 3 to 7 nt pairs. Notably, tRNASer had mainly 6 or 7 nt pairs (Figure 4). These loop structures in the variable regions of tRNAs raise the question of whether they play an important role in protein translation in the chloroplast.
Figure 4

Structures of the chloroplast tRNAs showed the presence of a loop structure in the variable region: (a–c) tRNALeu; (d–f) tRNASer; (g) tRNATyr. Anti-codon loop had seven nucleotides with the conservative X-U-X3-A-A consensus sequence in these tRNAs. Loop structure of the variable region was observed to contain three to eight nucleotides from tRNALeu, tRNASer, and tRNATyr.

3.5. Cotton Chloroplast tRNAs Were Derived from Several Evolutionary Ancestors

The phylogenetic tree of tRNA sequences from cotton and from other species spanning a wide range of taxonomic positions, including algae (Nostoc sp. PCC 7524), bryophytes (Dumortiera hirsuta), ferns (Psilotum nudum), gymnosperms (Pinus taeda), and angiosperms (Arabidopsis lyrata), presented three major clusters (Table 4). In all, there were 87 groups in integrated clade I, 43 groups in integrated clade II, and 7 groups in integrated clade III. In the phylogenetic tree, tRNASer, tRNALeu, tRNAArg, tRNAVal, tRNAIle, tRNAMet, and tRNAGln were polyphyletic, positioned in two or more integrated clades. In integrated clade I, most of tRNAMet was clustered into two polyphyletic sub-clades (containing nine groups), one embedded in the tRNAThr group and another adjacent to tRNACys and tRNAPhe, while tRNAMet of cyanobacteria was independently clustered to its tRNAArg. In integrated clade II, tRNALeu was closely related to cyanobacterial tRNAGln, even with its polyphyly. tRNATrp, except the cyanobacterial one, was clustered into a monophyletic group, while the cyanobacterial one was embedded into the sister clade of the tRNATrp group. tRNAIle was found in all three clades. Interestingly, tRNALeu, tRNAIle, tRNAGln, tRNAPhe, and tRNAArg present in integrated clade I were also found in clade II, and the tRNAs in integrated clade III were also found in clade I (Figure 5).
Table 4

The distribution of types of tRNAs in the three major clusters of the phylogenetic tree.

| Integrated Clades | Types of tRNAs |
|---|---|
| clade I | tRNASer, tRNALeu, tRNAArg, tRNAMet, tRNAAla, tRNAGly, tRNAAsp, tRNALys, tRNAVal, tRNAIle, tRNAThr, tRNAPro, tRNAGln, tRNACys, tRNAPhe |
| clade II | tRNAIle, tRNALeu, tRNAGln, tRNATyr, tRNAHis, tRNAAsn, tRNAPhe, tRNAGlu, tRNATrp, tRNAArg |
| clade III | tRNAThr, tRNASer, tRNAVal, tRNAMet, tRNAIle |
Figure 5

Phylogenetic tree of chloroplast tRNAs. A, alanine (Ala); R, arginine (Arg); N, asparagine (Asn); D, aspartate (Asp); C, cysteine (Cys); Q, glutamine (Gln); E, glutamate (Glu); G, glycine (Gly); H, histidine (His); I, isoleucine (Ile); L, leucine (Leu); K, lysine (Lys); M, methionine (Met); F, phenylalanine (Phe); P, proline (Pro); S, serine (Ser); T, threonine (Thr); W, tryptophan (Trp); Y, tyrosine (Tyr); V, valine (Val). The green solid circle represents Gossypium; the red solid triangle represents P. taeda; the pink solid diamond represents A. lyrata; the yellow solid triangle represents P. nudum; the orange solid triangle represents D. hirsuta; and the blue solid square represents Nostoc sp. PCC 7524. Phylogenetic analysis showed the polyphyletic origin of chloroplast tRNAs.

3.6. Transition/Transversion of tRNAs

tRNAs have evolved with almost equal transition and transversion rates in spite of the small probability of transition or transversion events in tRNAs. In the present study, we found some intriguing phenomena in the substitution rate of cotton chloroplast tRNAs. A transition rate of 16.66 and transversion rate of 16.68 were found in tRNAAla, tRNAPhe, tRNAAsp, tRNAPro, tRNATyr, tRNAVal, and tRNAIle. This showed that the transversion rate was slightly higher than the transition rate, and these tRNAs evolved with almost the same transition and transversion rates (Figure 6, Table S4). The highest transition rate was 50.00 for tRNATrp and the highest transversion rate was 25.00 for tRNAHis. Correspondingly, the lowest transition rate (0.00) was observed for tRNAHis, and the lowest transversion rate (0.00) for tRNATrp (Figure 6). This indicated that the tRNATrp of the cotton cp genome had experienced a high transition rate without any transversion. Similarly, tRNAHis had experienced a high rate of transversion but no transition. For tRNALys, tRNAArg, tRNAMet, tRNAAsn, tRNACys, tRNASer, tRNAGlu, tRNAGly, and tRNALeu, the transition rate was higher than the transversion rate (Figure 6A). Additionally, the transition rate was two times higher than transversion for tRNAArg, tRNACys, and tRNAGlu, which showed that the evolution of chloroplast tRNAArg, tRNACys, and tRNAGlu tended toward transition rather than transversion. For tRNAGln and tRNAThr, the transversion rate was apparently higher than the transition rate (Figure 6B). This indicated that these tRNA iso-acceptors experienced transversion substitutions more easily than transition. The substitution rates of overall cp tRNAs showed that the average transition rate (23.64) was greater than the transversion rate (13.18) (Figure 6B).
Figure 6

Rates of transition (blue) and transversion (gray) of chloroplast tRNAs. (A) a–i refer to R, Arg; S, Ser; C, Cys; E, Glu; G, Gly; W, Trp; L, Leu; K, Lys; M, Met. (B) j–u refer to A, Ala; P, Pro; N, Asn; D, Asp; T, Thr; Q, Gln; H, His; I, Ile; Y, Tyr; V, Val; F, Phe; overall.

3.7. Evolutionary Characteristics of Single Nucleotide Polymorphisms

The biallelic and parallel mutation SNPs of the eight diploid Gossypium cp genomes were calculated. There were 2709 SNPs in Gossypium cp genomes (Table 5). They were subdivided into coding, intron, and intergenic spacer regions for further analyses. Among the 2709 SNPs, 906 were in coding regions, 299 were in intron regions, and 1504 were in intergenic spacers. The percentage of SNPs to total length was 1.14, 1.39, and 2.90%, respectively. In coding regions, the overall ratio of nonsynonymous mutations to synonymous mutations (dN/dS) was 3.04.
Table 5

Genomic distribution of biallelic single nucleotide polymorphic loci in the eight chloroplast genomes of diploid Gossypium plants.

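| Genome Region | Length (bp) | Value | % |
|---|---|---|---|
| Total substitutions | 162,231 | 2709 | 1.67 |
| Coding regions | 79,244 | 906 | 1.14 |
| Non-synonymous | / | 681.58 | 0.86 |
| Synonymous | / | 224.42 | 0.28 |
| dN/dS | / | 3.04 | / |
| Intron | 21,443 | 299 | 1.39 |
| Intergenic spacer | 51,920 | 1504 | 2.90 |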
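The percentages and the dN/dS value in Table 5 follow directly from the counts and region lengths. The sketch below recomputes them as a sanity check; the function names are hypothetical, and the ratio is taken between the non-synonymous and synonymous values exactly as reported in the table.

```python
def snp_density_percent(snp_count, region_length_bp):
    """Percentage of sites in a region that carry a SNP."""
    return 100.0 * snp_count / region_length_bp


def dn_ds_ratio(non_synonymous, synonymous):
    """Ratio of non-synonymous to synonymous substitutions."""
    if synonymous == 0:
        raise ValueError("synonymous value must be non-zero")
    return non_synonymous / synonymous


# (SNP count, region length in bp) taken from Table 5.
regions = {
    "Coding regions": (906, 79_244),
    "Intron": (299, 21_443),
    "Intergenic spacer": (1504, 51_920),
}
for name, (snps, length) in regions.items():
    print(f"{name}: {snp_density_percent(snps, length):.2f}%")

print(f"dN/dS: {dn_ds_ratio(681.58, 224.42):.2f}")
```

A ratio above 1, as here, is what the text refers to when it says non-synonymous mutations have been fixed in the cotton cp genomes.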

3.8. Mutation Rate of Chloroplast Genome

The divergence time of Gossypium from Theobroma was estimated to be 9.8 Mya [52,53]. The total length of the eight diploid cotton cp genomes was 156,796 bp. The mutation rate (μ1) of the protein-coding genes and the mutation rate (μ2) of the tRNA genes were calculated using the length of genomes, the number of indel mutations (522 for protein-coding genes and 133 for tRNA genes), and the divergence time. The resulting values were μ1 = 0.34 × 10⁻⁹ per site per year for protein-coding genes and μ2 = 0.09 × 10⁻⁹ per site per year for tRNA genes.
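The two rates can be recomputed from the figures in this section. The sketch below assumes the rate is the number of observed indel sites divided by the product of the total number of sites and the divergence time in years, an assumption that reproduces both reported values. The function name is hypothetical.

```python
def indel_mutation_rate(mutated_sites, total_sites, divergence_years):
    """Indel mutation rate per site per year: m / (n * T)."""
    return mutated_sites / (total_sites * divergence_years)


TOTAL_SITES = 156_796        # total length in bp, as reported above
DIVERGENCE_YEARS = 9.8e6     # 9.8 Mya

mu_protein = indel_mutation_rate(522, TOTAL_SITES, DIVERGENCE_YEARS)
mu_trna = indel_mutation_rate(133, TOTAL_SITES, DIVERGENCE_YEARS)

# Express both in units of 10^-9 per site per year, as in the text.
print(f"protein-coding genes: {mu_protein * 1e9:.2f} x 10^-9")
print(f"tRNA genes: {mu_trna * 1e9:.2f} x 10^-9")
```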

3.9. tRNA Duplication/Loss Events

Besides substitution, duplication and loss events of genes had a vital influence on their evolution. The analysis of duplication or loss events indicated that the investigated cotton chloroplast tRNA genes had experienced 226 duplication and 93 co-duplication events (Figure S1). However, only 63 loss events were observed (Figure S1, Table S5). The results showed that the duplication of genes was greater than the loss of genes for all of the tRNAs.

4. Discussion

The nucleotide composition of tRNA is closely related to its higher-order structure, which is responsible for the translation process. In many species, the tRNA family is conserved in evolution [56,57]. As one of the major gene components of the semi-autonomous chloroplast, tRNAs were shown to have several basic conserved genomic features. tRNALeu and tRNASer were observed to have 80 nt or more. These two tRNA isotypes were also found to harbor more than 83 nt in Adoxaceae plants, which shows that the gene sequences of tRNASer and tRNALeu are longer than those of other tRNAs [58]. In addition, some tRNA iso-acceptors were not observed in the cp genome of Gossypium, as was also reported for species in Gramineae [28]. This is perhaps related to the codon usage and wobble base pairing in the genomic constitution of plants [59]. Furthermore, selenocysteine and suppressor tRNAs were lacking in the cotton cp genome, although they are present in Oryza sativa, Saccharum officinarum L., Sorghum bicolor (L.) Moench, Triticum aestivum L., and Zea mays. This may be related to the biotoxin accumulation ability of these two tRNAs [29]. The differences in the quantity (36 or 37) and anti-codon types (28 or 29) of tRNAs between diploid and allotetraploid cotton species were not significant. This may be related to the stability and conservation of tRNA genes and the long cultivation of cotton species, which reduced the level of interspecific diversity [60,61]. The tRNA family was conserved in its genomic composition. In this study, most of the involved tRNAs were observed to harbor the conserved sequence U-U-C-X3-U in their Ψ-loop (Table 3). Similarly, conserved U-U-C-A nucleotides were found in the Ψ-loop region in Adoxaceae [58]. This suggests that the short consensus sequences might be important components of the molecular recognizer in the Ψ-loop and are associated with the recognition of ribosomal RNA during the translation process [62]. 
In our study, we found that the acceptor arm of cotton chloroplast tRNA contained 6 to 7 nt; the D-arm contained 2 to 4 nt; the D-loop contained 7 to 11 nt; the anti-codon loop contained 7 or 9 nt; the Ψ-arm contained 5 nt; and the Ψ-loop had 7 nt. Similarly, in gymnosperm chloroplast genomes, the acceptor arm of tRNAs has 6 bp to 7 bp; the D-arm has 3 bp or 4 bp; the D-loop contains 7 nt to 11 nt; the anti-codon loop contains 7 nt; the Ψ-arm contains 5 bp; and the Ψ-loop has 7 nt [63]. Our results and previous findings together suggest that chloroplast tRNAs are highly conserved, though a few tRNAs contain rare secondary structures [64,65]. In the present study, some tRNALeu, tRNASer, and tRNATyr were found to include a loop structure in the variable region. This unconventional structure reflects the structural variation existing in tRNAs and might be relevant to the maintenance of tRNA structure and to the interaction with the D-arm and Ψ-arm [66]. Introns, which break the continuity of numerous eukaryotic genes, have previously been reported in archaeal and eukaryotic genomes [67]. Here, an intron was observed in chloroplast tRNAs of G. populifolium, which is consistent with previous reports in other organisms [68,69]. In general, a group I intron has a few hundred nucleotides because of its built-in ribozyme unit [70], whereas tRNAs are sequences of fewer than 100 nucleotides that fold into a cloverleaf secondary structure [71]. Thus, the intron we observed in this study might be part of a group I intron contained within the cotton chloroplast genome. Phylogenetic analysis revealed that the introns in chloroplast tRNAVal of Gossypium populifolium and Zea mays clustered on the same branch (Figure 3), suggesting that the introns of corn and cotton have a close phylogenetic relationship.
Additionally, chloroplast tRNAs with introns were grouped with cyanobacteria, providing supportive evidence for a common cyanobacterial origin of cp tRNAs [58]. In the cotton chloroplast genome, the anti-codon CAU was observed not only in tRNAMet but also in tRNAIle. This may be associated with the modification of tRNA [72]. Three types of tRNAs with the anti-codon CAU (tRNAfMet, tRNAMet, and tRNAkIle) were found in a bacterial genome, and tRNAkIle was able to recognize the codon of isoleucine instead of methionine after anti-codon modification [73]. In the tRNAIle CAU observed in plant genomes, the first position of the anti-codon is a lysidine-like nucleotide. At the gene level, however, the methionine-specifying anti-codon CAT is found. After modification of the C residue of the CAT anti-codon, the mature tRNA shows isoleucine-identifying activity rather than methionine-identifying activity [74,75]. The phylogenetic tree showed three obvious clusters and diverse groupings, with tRNA groups from plants of widely different taxonomic positions appearing in integrated clades I and II, clades I and III, and clades I, II, and III. This indicates frequent duplication and divergence during their evolution. In addition, tRNASer, tRNALeu, tRNAArg, tRNAVal, tRNAIle, tRNAMet, and tRNAGln were found in more than two integrated clades, which indicates their multiple evolutionary origins. The diverse groupings and the overlapping of tRNA groups from different plants suggest that the tRNAs have several inferred ancestors, including tRNAMet, tRNAIle, etc., in their evolutionary history [63]. The interlaced emergence of tRNA groups (tRNAMet, tRNAThr, tRNAArg, tRNATrp, tRNAVal, and tRNASer) also suggests that the evolutionary relationships among these tRNA types are relatively close.
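The relationship between the gene-level anti-codon CAT, the transcript-level anti-codon CAU, and the methionine codon it would read without modification can be sketched as follows. This is a minimal illustration that assumes plain Watson-Crick pairing; the function names are hypothetical, and the sketch does not model the lysidine-like modified base that changes the decoding specificity.

```python
# Watson-Crick complements for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def gene_to_rna(dna: str) -> str:
    """Write a DNA sequence in RNA letters (T becomes U)."""
    return dna.upper().replace("T", "U")


def codon_for_anticodon(anticodon: str) -> str:
    """Codon (5'->3') that pairs with an unmodified anti-codon (5'->3')."""
    # Pairing is antiparallel, so reverse the anti-codon before complementing.
    return "".join(COMPLEMENT[base] for base in reversed(anticodon.upper()))


anticodon = gene_to_rna("CAT")         # anti-codon as written in the gene
print(anticodon)                       # CAU
print(codon_for_anticodon(anticodon))  # AUG, the methionine codon
```

Under this simple pairing rule an unmodified CAU anti-codon reads AUG, which is why modification of the first anti-codon position is needed for the tRNA to decode isoleucine instead.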
As is the case for most genetic variation, the overall transition rate of Gossypium cp tRNAs was greater than the transversion rate. However, tRNATrp and tRNAHis showed different substitution preferences, which may be caused by various factors, such as their neighboring bases and the efficiency of the DNA repair system [76]. SNPs in coding regions, intron regions, and intergenic spacers accounted for 1.14, 1.39, and 2.90% of the total cotton cp genomes, respectively. This implies that SNP density varied across the cotton cp genome and that the intergenic spacers of Gossypium were more variable than the intron regions. This may be associated with the richness of A/T and G/C repetitive units in certain regions of the cp genome. It also implies that the evolutionary mutation potential of different positions in the genome is unbalanced [77]. In coding regions, the overall ratio of non-synonymous to synonymous mutations (dN/dS) was 3.04, showing that non-synonymous mutations had been fixed in the cotton cp genome. In the diploid Gossypium cp genomes examined, the mutation rate of protein-coding genes (μ1), 0.34 × 10⁻⁹ per site per year, was significantly higher than the mutation rate of tRNA genes (μ2), 0.09 × 10⁻⁹ per site per year. Because μ1 is more than three times μ2, tRNA genes and structures appear to have considerable evolutionary stability compared with protein-coding genes [78]. It should be noted that the genomic data employed in our study still have limitations, and the addition of more diverse gene sequences might provide more interesting results. Among the multiple factors involved in tRNA evolution, most important gene functions have evolved from gene duplication events [79,80]. In this study, there were about five times as many duplication events as loss events.
Many studies have also shown that frequent duplication greatly promotes evolution and functional diversification in gene families [81,82]. This may provide further support for the view that cotton tRNAs evolved from polyphyletic ancestors.
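The ratios quoted in this discussion can be checked with a few lines of arithmetic. The sketch below uses only the counts and rates reported in the text; the variable names are hypothetical. It assumes that the "about five times" comparison counts duplication and co-duplication events together, since duplication events alone give a smaller ratio.

```python
# Event counts reported for cotton chloroplast tRNA genes.
duplications = 226
co_duplications = 93
losses = 63

# Mutation rates per site per year in diploid Gossypium cp genomes.
mu_protein = 0.34e-9  # protein-coding genes (mu1)
mu_trna = 0.09e-9     # tRNA genes (mu2)

dup_only_ratio = duplications / losses
dup_total_ratio = (duplications + co_duplications) / losses
rate_ratio = mu_protein / mu_trna

print(round(dup_only_ratio, 2))   # 3.59
print(round(dup_total_ratio, 2))  # 5.06
print(round(rate_ratio, 2))       # 3.78
```

Under that assumption, the combined count of 319 events against 63 losses reproduces the roughly fivefold figure, and the rate ratio of about 3.8 agrees with the statement that μ1 is more than three times μ2.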

5. Conclusions

Cotton chloroplast genomes encode 36 or 37 tRNAs representing 28 or 29 anti-codon types, with lengths ranging from 60 (tRNAGly) to 93 nucleotides (tRNASer). Thirty-three anti-codon types, including AGC-tRNAAla, are absent from the cp genome of Gossypium. The CAU anti-codon is encoded in both tRNAMet and tRNAIle. The acceptor arm of cotton chloroplast tRNA contains 6 to 7 nt; the D-arm contains 2 to 4 nt; the D-loop contains 7 to 11 nt; the anti-codon loop contains 7 or 9 nt; the Ψ-arm contains 5 nt; and the Ψ-loop has 7 nt. The Ψ-arm of cotton chloroplast tRNAs contains the G-G consensus sequence, and the Ψ-loop contains the conserved U-U-C-X3-U motif. Additionally, cotton chloroplast tRNAs were found to contain introns, and a few tRNALeu, tRNASer, and tRNATyr were observed to have a loop in the variable region. Furthermore, phylogenetic analysis suggests that the tRNAs possibly have several inferred ancestors, including tRNAMet, tRNAIle, etc., in their evolutionary history. The average transition rate of all cp tRNAs examined was greater than their transversion rate. The density of SNPs was unbalanced, and the intergenic spacers of Gossypium were more variable than the intron regions. Gene duplication events (226 duplication and 93 co-duplication) have occurred more frequently than gene loss events (63) in cotton chloroplast tRNAs. These results provide helpful insights into the detailed characteristics and evolutionary variation of the tRNA family.