Molecular evolution of the ent-kaurenoic acid oxidase gene in Oryzeae.

Abstract

We surveyed the substitution patterns in the ent-kaurenoic acid oxidase (KAO) gene in 11 species of Oryzeae with an outgroup in the Ehrhartoideae. The synonymous and non-synonymous substitution rates showed a high positive correlation with each other, but were negatively correlated with codon usage bias and GC content at third codon positions. The substitution rate was heterogeneous among lineages. Likelihood-ratio tests showed that the non-synonymous/synonymous rate ratio changed significantly among lineages. Site-specific models provided no evidence for positive selection at any codon of the KAO gene. This finding suggested that the significant rate heterogeneity among some lineages may have been caused by variability in the relaxation of the selective constraint among lineages or by neutral processes.

Keywords: codon usage bias; ent-kaurenoic acid oxidase (KAO); positive selection; rate heterogeneity; substitution rate

Year: 2012 PMID: 22888294 PMCID: PMC3389533 DOI: 10.1590/S1415-47572012005000020

Source DB: PubMed Journal: Genet Mol Biol ISSN: 1415-4757 Impact factor: 1.771

Introduction

Gibberellins (GAs) are an important class of plant hormones involved in the regulation of various growth and developmental processes in higher plants (Appleford ). The absence of GAs results in dwarfism in some plant species. ent-kaurenoic acid oxidase (KAO), a member of the CYP88A subfamily of cytochrome P450 enzymes, catalyzes a three-step reaction in the gibberellin biosynthetic pathway from ent-kaurenoic acid to GA12 (Helliwell ). A primary goal of molecular evolutionary studies is to estimate the rate of DNA mutation and elucidate the mechanisms of molecular evolution. Such studies frequently involve a comparison of orthologous DNA fragments among species to determine evolutionary rates and an assessment of the evolutionary processes involved, e.g., natural selection, rate heterogeneity of lineages and mutational biases. Analysis of the molecular evolutionary patterns of different genes provides understanding of the evolutionary processes and pressures experienced by particular lineages. The tribe Oryzeae (Poaceae) includes approximately 12 genera and more than 70 species distributed throughout tropical and temperate regions of the world (Clayton and Renvoize, 1986; Vaughan, 1994). In the genus Oryza, the Asian cultivated rice (Oryza sativa L.) is one of the world’s most important crops and a primary food source for more than one-half of the world’s population (Chandler and Wessler, 2001). This species has become a model monocotyledon in scientific research and its entire genome has been sequenced. Other members of the Oryzeae are also of economic importance, including wild species of Oryza that can be used in the genetic improvement of rice. Analysis of the substitution patterns in the KAO gene can provide insights into the driving forces that have led to evolutionary change in this gene in Oryzeae. 
In addition, the identification of patterns of molecular evolution in the KAO gene can improve our understanding of the evolutionary history of some Oryzeae species. In this work, we examined the heterogeneity of the substitution rate in the KAO gene among various genera and species of Oryzeae and sought to identify the possible causes of such heterogeneity. We also looked for evidence of natural selection in the exon regions of the KAO gene.

Materials and Methods

Plant material

A portion of the KAO gene was isolated and sequenced from members of the rice tribe (Oryzeae) (Table 1). Eleven diploid species were selected to represent the major phylogenetic lineages of Oryzeae (Figure S1, Supplementary Material) (Guo and Ge, 2005). These consisted of seven Oryza species representing six diploid genome types, namely, Oryza sativa (AA), O. meridionalis (AA), O. punctata (BB), O. officinalis (CC), O. australiensis (EE), O. brachyantha (FF), O. granulata (GG), and one species from each of four other genera in the tribe Oryzeae (Leersia tisserantti, Chikusichloa aquatica, Luziola leiocarpa, and Rhynchoryza subulata) (Table 1). Ehrharta erecta, a species in the tribe Ehrhartoideae, which is a sister tribe to the Oryzeae, was used as an outgroup (GPWG, 2001; Guo and Ge, 2005). Plastid, mitochondrial and nuclear gene sequences have been used to establish the phylogeny of the Oryzeae (Ge ; Guo and Ge, 2005; Tang ) and have provided an important framework for the study of molecular evolution in this group (Figure S1, Supplementary Material).

Table 1

Species used in this study.

Species	Genome	Accession^a	Country
Oryza sativa	A	japonica	GenBank
O. meridionalis	A	105282	Australia
O. punctata	B	103903	Tanzania
O. officinalis	C	104972	China
O. australiensis	E	105263	Australia-PNAS
O. brachyantha	F	105151	Sierra Leone-PNAS
O. granulata	G	M8-15	Ledong, Hainan
Leersia tisserantti	—	105610	Cameroon
Chikusichloa aquatica	—	106186	Japan
Rhynchoryza subulata	—	100913	Argentina
Luziola leiocarpa	—	82043	Argentina
Ehrharta erecta	—	218290	South Africa

^a All accessions were obtained from the International Rice Research Institute at Los Baños, Philippines.

DNA extraction, amplification and sequencing

Total DNA was isolated from silica-gel dried or fresh leaves as described by Ge . A 1–2 kb fragment of the KAO gene containing several exons and introns was obtained by using the polymerase chain reaction (PCR) in conjunction with the forward primer KAOF (5′-CAGGACGTTCATGTTCAGCAG-3′) and the reverse primers KAOR1 (5′-TCGTCGCCAAGCAGTTGTC-3′) and KAOR2 (5′-GCCAAGCAGTTGTCCAC-3′) (Figure 1). The PCR was done in a total volume of 25 μL that contained 5–50 ng of template DNA, 0.2 μM of each primer, 200 μM of each dNTP, 10 mM Tris-HCl (pH 8.3), 50 mM KCl, 1.5 mM MgCl₂ and 0.75 U of ExTaq DNA polymerase (TaKaRa, Shiga, Japan). Amplifications were done in a T gradient 96 U thermocycler (Biometra, Göttingen, Germany) as follows: 3 min at 94 °C, followed by 33 cycles of 30 s at 94 °C, 30 s at 56 °C, 2.5 min at 72 °C and a final extension at 72 °C for 10 min. Further internal primers used for sequencing were: KAO707F 5′-ACCGTCTTCCTCCAGGAGAAC-3′ (Tm = 61.9 °C), KAO931F 5′-GATGCACTTCCTCTCACAG-3′ (Tm = 57.6 °C) and KAO1478F 5′-CGTCAACATCTCCTTCGTGTC-3′ (Tm = 60 °C) (Yang ). All of the sequences were deposited in GenBank under accession numbers EF577665-EF577670 and EU179429-EU179435 (Table 2).

Figure 1

Schematic diagram of the KAO gene and the regions sequenced in this study. Boxes and lines indicate exons and introns, respectively. Exons are labeled with Roman numerals. Primer locations are shown above the diagram.

Table 2

Information for the KAO gene sampled in this study.

Species	Total length (bp)	Coding length (bp)	ENC^a (coding)	CBI (coding)	GC (coding)	GC1,2 (coding)	GC3 (coding)	GC (noncoding)	Accession number
Oryza sativa	2231	1053	35.02	0.693	0.600	0.492	0.818	0.307	AP004572^b
O. meridionalis	1819	1053	35.60	0.678	0.598	0.492	0.818	0.334	EU179429
O. punctata	1833	1053	35.13	0.685	0.597	0.486	0.820	0.333	EF577665
O. officinalis	1844	1053	39.10	0.641	0.600	0.493	0.815	0.328	EF577666
O. australiensis	1867	1053	39.27	0.637	0.602	0.495	0.818	0.334	EF577667
O. brachyantha	2626	1053	39.20	0.642	0.606	0.498	0.823	0.334	EF577668
O. granulata	1808	1053	37.72	0.662	0.612	0.501	0.832	0.336	EF577669
Leersia tisserantti	1775	1053	48.56	0.405	0.565	0.489	0.718	0.327	EF577670
Luziola leiocarpa	1826	1050	38.67	0.636	0.612	0.503	0.831	0.336	EU179408
Chikusichloa aquatica	1772	1047	42.48	0.568	0.598	0.490	0.814	0.338	EU179409
Rhynchoryza subulata	1790	1047	42.02	0.569	0.595	0.490	0.805	0.328	EU179410
Ehrharta erecta	2363	1026	53.65	0.390	0.541	0.451	0.723	0.324	EU179411
Mean ± SE^e	1962.83 ± 81.51	1049.50 ± 2.24	40.54 ± 1.61	0.601 ± 0.030	0.594 ± 0.006	0.490 ± 0.004	0.803 ± 0.011	0.330 ± 0.002

^a ENC – effective number of codons (Wright, 1990); CBI – codon bias index; GC1,2 – G+C content at the first and second codon positions; GC3 – G+C content at third codon positions.

^b Sequences downloaded from GenBank.

^e Average for 11 species of Oryzeae.

Sequence analysis

Sequences were aligned using ClustalX v.1.81 (Thompson ) and refined by manual adjustment based on the predicted amino acid sequence. The amino acid sequences (excluding introns) were sufficiently conserved across the 12 species to provide unambiguous alignments. We examined the possibility of sequence saturation using DAMBE v.4.5.45 (Xia and Xie, 2001). Pairwise synonymous and non-synonymous substitutions per site (dS and dN) among the 11 species were estimated for the coding regions of the KAO gene. The extent of codon usage bias often reflects the degree of selective constraint in a gene (Sharp, 1991; Sharp ). To measure the extent of codon usage bias, we estimated the effective number of codons (ENC) and codon bias index (CBI) using DnaSP v.4.10.9 (Rozas and Rozas, 1999). The ENC values range from 20 (only one codon is used for each amino acid, i.e., the codon bias is maximal) to 61 (all synonymous codons for each amino acid are equally used, i.e., there is no codon bias) (Wright, 1990). The CBI values range from 0 (uniform use of synonymous codons) to 1 (maximum codon bias) (Morton, 1993). Variation in the rate of synonymous substitution among genes may be related to codon use (Sharp, 1991). Therefore, several parameters related to codon usage bias, such as the GC content at the first and second codon positions (GC1, 2), as well as third codon positions (GC3), were also estimated using DnaSP v.4.10.9 (Rozas and Rozas, 1999).
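The position-specific GC measures described above can be illustrated with a minimal sketch. This is not the DnaSP implementation; the function name and the toy sequence are hypothetical, and the sketch assumes an in-frame, gap-free coding sequence with no ambiguous bases (DnaSP may treat stop codons, gaps and ambiguities differently).

```python
def gc_by_codon_position(cds: str) -> dict:
    """Return overall GC, GC1,2 and GC3 fractions for an in-frame coding sequence.

    Assumption: the sequence starts at codon position 1, has no gaps or
    ambiguous bases, and its length is a multiple of three.
    """
    seq = cds.upper().replace("U", "T")
    if len(seq) == 0 or len(seq) % 3 != 0:
        raise ValueError("sequence length must be a positive multiple of 3")

    def gc_fraction(bases: str) -> float:
        # Fraction of G or C among the supplied bases.
        return sum(base in "GC" for base in bases) / len(bases)

    first, second, third = seq[0::3], seq[1::3], seq[2::3]
    return {
        "GC": gc_fraction(seq),               # all positions
        "GC1,2": gc_fraction(first + second),  # first and second codon positions
        "GC3": gc_fraction(third),             # third codon positions
    }


# Hypothetical toy sequence (four codons), not taken from the KAO alignment.
toy = gc_by_codon_position("ATGGCCGCGTAA")
print(toy)
```

For real data, the same function would be applied to each species' aligned coding region after removing gap columns.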

Detecting rate heterogeneity among lineages

The relative-rate test based on the method of Muse and Gaut (1994), as implemented in HyPhy (Pond ), was used to detect variation in the synonymous and non-synonymous substitution rates along different lineages, with Ehrharta erecta as the reference sequence. This method compares the substitution rates of two lineages with reference to a third, outgroup lineage. In the first model, the two ingroup lineages descending from their most recent common ancestor are constrained to have the same substitution rate. In the second model, the two lineages may have different substitution rates. A likelihood ratio test is used to determine which of the models better explains the data (Muse and Gaut, 1994).

Detection of positive selection

The ratio ω (dN/dS) provides an effective means of detecting selection or selective pressure on a gene or gene region, with ω < 1, ω = 1 and ω > 1 indicating negative selection, neutral evolution and positive selection, respectively (Yang, 2006). We ran likelihood-based analyses using the CODEML program of PAML 4 (Yang, 2007) to explore the selective processes acting on the KAO gene. First, we used the branch models to examine whether the evolutionary rates differed among lineages within the gene tree. The one ratio model (M0) assumes a single ω for all branches and all sites. In contrast, the free ratio model (Mf) postulates an independent ω ratio for each branch of the tree. A likelihood ratio test (LRT) was used to decide whether there was a significant difference between M0 and Mf. The model with the higher likelihood value was assumed to be the better model (Bielawski and Yang, 2003; Yang and Nielsen, 1998). We next used site-specific models to detect whether particular amino acid residues were subject to positive selection (Yang, 2006). The neutral model (M1a) classifies all of the sites into two categories, i.e., strict constraint (0 < ω < 1) (purifying selection) and neutral (ω = 1). The positive selection model (M2a) extends M1a by adding a third category of sites under positive selection (ω > 1). The beta model (M7) assumes a beta distribution for the ω ratios over sites, and the beta and ω model (M8) adds an extra class of sites whose ω ratio is estimated independently from the data. M8 and M2a allow for positive selection and are compared with M7 and M1a, respectively. If the LRT is significant and there is a site with ω > 1 then positive selection is invoked for the gene (Bielawski and Yang, 2003; Yang, 2006).

Results and Discussion

Previous studies showed that the KAO gene was a single-copy gene (Helliwell ; Sakamoto ; Yamaguchi, 2008) and the loss-of-function mutant exhibits a typical phenotype, indicating the functional importance of this enzyme in GA biosynthesis (Sakamoto ). In view of the importance of comparing orthologous rather than paralogous genes when estimating substitution rates, we initially examined this issue and found that the KAO gene was orthologous in all of the species analyzed. The similarity of the aligned coding regions ranged from 87.5% to 99.5% (Figure S2, Supplementary Material). Sequences of the KAO gene were isolated from all of the Oryzeae species and from the outgroup, Ehrharta erecta. The sequenced regions ranged in size from 1772 bp to 2626 bp and their aligned coding regions varied from 1047 bp to 1053 bp (Table 2). The total GC content and the GC content of the third position of the codons (GC3) were similar across species. Table 2 summarizes the sequence data for this gene.

Codon usage bias and its correlation with GC3 and substitution rates

Codon usage bias has been important in studies of molecular evolution because it provides examples of weak selection at the molecular level. CBI and ENC were calculated to measure the degree of codon usage bias. CBI showed a marked negative correlation with ENC (r² = 0.958, p < 0.0001) (Figure 2A), indicating that either index could be used to measure the degree of codon usage bias. In this study, ENC was used (lower ENC values indicate stronger bias).
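The direction and strength of this relationship can be checked directly from the ENC and CBI columns of Table 2. The sketch below assumes that the correlation was computed over the 11 Oryzeae species with the outgroup excluded (the text does not state this explicitly) and uses a plain Pearson correlation; the helper name `pearson_r` is illustrative only.

```python
import math

# ENC and CBI values for the 11 Oryzeae species, copied from Table 2.
# Assumption: the outgroup (Ehrharta erecta) is excluded from the correlation.
enc = [35.02, 35.60, 35.13, 39.10, 39.27, 39.20, 37.72, 48.56, 38.67, 42.48, 42.02]
cbi = [0.693, 0.678, 0.685, 0.641, 0.637, 0.642, 0.662, 0.405, 0.636, 0.568, 0.569]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(enc, cbi)
r_squared = r * r
print(f"r = {r:.3f}, r^2 = {r_squared:.3f}")
```

Under that assumption the coefficient is negative and r² should land close to the reported value, which is expected because a high ENC and a low CBI both indicate weak codon bias.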

Figure 2

Relationships between the effective number of codons (ENC) and the codon bias index (CBI) (A), synonymous substitution rates (dS) (B) and non-synonymous substitution rates (dN) (C); between dS and dN (D); between dS and GC content at third codon positions (GC3) (E); and between GC content at the first and second codon positions (GC1,2) and GC3 (F).

To determine the relative effects of mutation pressure versus natural selection on codon composition, we examined the relationship between the GC content at third codon positions (GC3) and the GC content at the first and second codon positions (GC1,2). GC1,2 ranged from 48.9% to 50.3% and tended to be positively correlated with GC3 (r² = 0.227), although this correlation was not significant (p = 0.139) (Figure 2F). This pattern of base composition suggests that the GC content is most likely the result of mutation pressure since natural selection acts differently on different codon positions (Shackelton ). Interestingly, after excluding L. tisserantti, GC1,2 showed a significant positive correlation with GC3 (r² = 0.604, p < 0.05) (data not shown), which further supported the view that these changes were most likely the result of mutation pressure. dS was positively correlated with dN (r² = 0.498, p < 0.05) (Figure 2D), as also observed in other organisms (Bielawski ; Dunn ; Hurst and Williams, 2000; Kusumi ), and negatively correlated with codon bias (r² = 0.713, p < 0.05) (Figure 2B) and GC3 (r² = 0.796, p < 0.001) (Figure 2E). The negative correlation between dS and codon usage bias may be explained by natural selection (Bielawski ; Smith and Eyre-Walker, 2001; Urrutia and Hurst, 2001) since codon usage bias is a primary factor in dS variation among genes and is thought to be under natural selection, perhaps because of the need to maintain accuracy or speed in translation (Yang and Gaut, 2011). There was also a tendency for dN to be negatively correlated with codon usage bias (r² = 0.348), but this was not significant (p = 0.056) (Figure 2C). The latter would be consistent with sites that are functionally constrained and consequently conserved at the amino acid level. Such sites are also likely to experience stronger selection for translation accuracy and hence have a higher codon bias (Akashi, 2003). 
This might explain the negative correlation between dN and codon bias observed here (though not significant), and by others in enteric bacteria (Rocha, 2004; Sharp, 1991), Drosophila (Betancourt and Presgraves, 2002), yeast (Drummond ), and viruses (Duffy ). The fact that dN is correlated with codon bias suggests that codon bias might be used as a measure of the level of constraint upon a site or gene (Plotkin , 2006; Stoletzki and Eyre-Walker, 2007).

The driving forces governing evolution of the KAO gene in Oryzeae

A codon-based approach showed that the free ratio model (Mf) had a significantly higher log-likelihood (lnL = −4103.38) than the one ratio model (M0) (lnL = −4124.44) (p < 0.001) (Table 3). The dN/dS ratios varied across lineages from 0.0001 to 0.358, and one of the 21 lineages showed no predicted synonymous substitutions (so that its dN/dS ratio was reported as 999.000); for every other lineage, the estimated dN/dS ratio was less than 1. The ω value was estimated to be 0.079 under the M0 model, suggesting that purifying selection or selective constraint best explained the molecular evolution of the KAO gene, in agreement with the studies on anthocyanin pathway genes (Lu and Rausher, 2003; Rausher ).
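As a rough check on the model comparisons, the likelihood-ratio statistic can be recomputed from the log-likelihoods and parameter counts listed in Table 3. The sketch below assumes the usual chi-square approximation, with degrees of freedom equal to the difference in parameter counts. The function names are illustrative, and the closed-form tail probability is valid only for even degrees of freedom, which holds for the comparisons here (8 and 2).

```python
import math


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable (even df only).

    For df = 2k, P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i   # builds (x/2)^i / i! incrementally
        total += term
    return math.exp(-half) * total


def lrt(lnl_null: float, np_null: int, lnl_alt: float, np_alt: int):
    """Return (2*delta_lnL, df, p) for a pair of nested models."""
    stat = 2.0 * (lnl_alt - lnl_null)
    df = np_alt - np_null
    return stat, df, chi2_sf_even_df(stat, df)


# Values copied from Table 3: (lnL, number of parameters).
stat_branch, df_branch, p_branch = lrt(-4124.44, 23, -4103.38, 31)  # M0 vs Mf
stat_m2a, df_m2a, p_m2a = lrt(-4067.60, 24, -4067.60, 26)           # M1a vs M2a
stat_m8, df_m8, p_m8 = lrt(-4061.19, 24, -4061.19, 26)              # M7 vs M8

print(f"M0 vs Mf:   2dL = {stat_branch:.2f}, df = {df_branch}, p = {p_branch:.2e}")
print(f"M1a vs M2a: 2dL = {stat_m2a:.2f}, df = {df_m2a}, p = {p_m2a:.2f}")
print(f"M7 vs M8:   2dL = {stat_m8:.2f}, df = {df_m8}, p = {p_m8:.2f}")
```

The branch-model comparison reproduces the tabulated statistic of 42.12 with p < 0.001, whereas both site-model comparisons give a statistic of 0 and therefore p = 1.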

Table 3

Log likelihood values, ω ratios and parameter estimates for the KAO gene in models with variable ω ratios among codon sites.

Model	p^a	ln	Parameter estimates^b	Models compared	2ΔL	p-value
Mf	31	−4103.38	ω = 0.0001 ∼ 999.000, tree length^c = 2.140, kappa(ts/tv) = 1.103	M0–Mf	42.12	< 0.001
M0	23	−4124.44	ω = 0.079, tree length = 2.181, kappa (ts/tv) = 1.082
M1a	24	−4067.60	ω₀ = 0.049, p₀ = 0.921; ω₁ = 1.000, p₁ = 0.079	M1a–M2a	0	1
M2a	26	−4067.60	p₀ = 0.921, p₁ = 0.053, p₂ = 0.026, ω₂ = 1.000
M7	24	−4061.19	p = 0.282, q = 2.548	M7–M8	0	1
M8	26	−4061.19	p₀ = 1.000, p = 0.282, q = 2.548; p₁ = 0.000, ω = 8.931

^a p – number of parameters; ln – log-likelihood value of the data under each model.

^b Parameter estimates in different models.

^c Tree length is the sum of branch lengths.

The branch model test is a very conservative test of positive selection because it averages the ratio across all sites. We therefore used site-specific codon models to examine whether there was positive selection on codon sites. The M2a and M8 models, which allow for positive selection, were not significantly better than the null models M1a and M7 (for M1a vs. M2a, 2ΔL = 0, p = 1.0; for M7 vs. M8, 2ΔL = 0, p = 1.0) (Table 3). These results indicate that the KAO gene is under strong selective constraint and provide no evidence of past episodes of positive selection on this gene. Previous studies have shown that variation in the evolutionary rate among nucleotide sites may be attributed to differences in the frequency of positive selection (Yang ; Gaut ) or in the magnitude of selective constraints (Li, 1997; Rausher , 2008). In this study, the branch and codon models failed to detect any sign of positive selection for any lineage or codon of the KAO gene, suggesting that the significant rate heterogeneity among some lineages was attributable mainly to relaxed constraint among lineages or to neutral processes rather than to positive selection. However, the power to detect positive selection using the methods mentioned above may be low, especially when adaptive substitutions are spread across many amino acid sites (Pond ; Rausher ). Further investigations with alternative tests on intraspecific changes (Olsen ; Whitt ; Flowers ; Rausher ) would be necessary to detect evidence of positive selection.

Rate variation among lineages

There was significant heterogeneity in the synonymous and non-synonymous substitution rates of the KAO gene among lineages of the rice tribe (Table 4), especially in C. aquatica and L. leiocarpa. Among 55 relative-rate tests for synonymous substitutions, 11 comparisons were significant at the 5% or 1% level. Likewise, among 55 relative-rate tests for non-synonymous substitutions, the null hypothesis of rate homogeneity was rejected for 18 comparisons. dN appeared to be decelerated in C. aquatica and L. leiocarpa, as did dS in C. aquatica. The significant slowdown in the rate of synonymous and non-synonymous substitutions in the C. aquatica and L. leiocarpa lineages may reflect differences in the intensity of selection, i.e., the KAO gene may be under different functional constraints in different lineages.
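The number of comparisons follows from the sampling design: each relative-rate test pairs two of the 11 ingroup species against the fixed outgroup. A short sketch of this arithmetic, using the species abbreviations from Table 4 (the variable names are illustrative):

```python
from itertools import combinations

# Ingroup species abbreviations as used in Table 4; Ehrharta erecta is the
# fixed outgroup and is therefore not part of the pairings.
ingroup = ["Osat", "Omer", "Opun", "Ooff", "Oaus", "Obra",
           "Ogra", "Ltis", "Llei", "Caqu", "Rsub"]

pairs = list(combinations(ingroup, 2))
n_pairs = len(pairs)          # pairwise tests per rate class (dS or dN)
total_tests = 2 * n_pairs     # dS and dN together

# Counts of significant tests as reported in the text.
frac_ds = 11 / n_pairs
frac_dn = 18 / n_pairs

print(n_pairs, total_tests)
print(f"significant dS tests: {frac_ds:.1%}, significant dN tests: {frac_dn:.1%}")
```

This gives 55 tests per rate class and 110 in total, matching the caption of Table 4.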

Table 4

Results of 110 relative-rate tests for dS (lower triangle) and dN (upper triangle). Rejection of rate equality is indicated by * at the 0.05 level, ** at the 0.01 level, or *** at the 0.001 level. Ehrharta erecta was used as the outgroup in all comparisons. Species names that were inferred to have evolved more quickly in each pairwise comparison are indicated in the table by the first letter of the genus name and the first three letters of the species name.

	Osat	Omer	Opun	Ooff	Oaus	Obra	Ogra	Ltis	Llei	Caqu	Rsub

Osat	-								***Osat	**Osat
Omer		-				*Omer	*Ogra		***Omer	***Omer
Opun			-			*Opun			***Opun	**Opun
Ooff				-					**Ooff	**Ooff
Oaus					-				**Oaus	*Oaus
Obra						-			*Obra
Ogra							-		*Ogra
Ltis								-	**Ltis	*Ltis
Llei									-
Caqu	***Osat	***Omer	***Opun	**Ooff	**Oaus	**Obra	***Ogra	***Ltis	**Llei	-	* Rsub
Rsub							*Ogra			*Rsub	-

Caqu – Chikusichloa aquatica, Llei – Luziola leiocarpa, Ltis – Leersia tisserantti, Oaus – O. australiensis, Obra – O. brachyantha, Ogra – O. granulata, Omer – O. meridionalis, Ooff – O. officinalis, Opun – O. punctata, Osat – O. sativa and Rsub – Rhynchoryza subulata.

Several mechanisms could explain the observed rate heterogeneity, including life history traits such as generation time, biochemical features such as efficiency of DNA repair machinery, and environmental variables such as energy and temperature (Eyre-Walker and Gaut, 1997; Li, 1997; Brown ; Soria-Hernanz ). Rate heterogeneity may also result from differences in population size since variation in population size can alter evolutionary rates within a lineage (Eyre-Walker and Gaut, 1997; Lynch and Conery, 2003) and vice versa. Variation in the nucleotide substitution rates of the KAO gene significantly changed the ω ratios of the respective lineages. These features of the KAO gene in Oryzeae resulted from the influence of various factors that affected the evolution of these species and their ancestors. A detailed knowledge of these factors will help us to understand the evolutionary history of Oryzeae species.

Conclusions

The results of this study showed that codon usage bias was negatively correlated with synonymous and non-synonymous substitution rates, a finding consistent with the importance of codon usage. CBI was negatively correlated with ENC, thus confirming that the two indices measure the degree of codon usage bias consistently. There was considerable heterogeneity in the nucleotide substitution rates of the KAO gene and this significantly affected the ω ratios of the respective lineages. We detected no evidence of positive selection and no positively selected codons in this gene, a finding indicative of substantial selective constraint. These features of nucleotide substitutions in the KAO gene reflected the influence of various factors on the evolution of many Oryzeae species and their ancestors.

48 in total

1. DnaSP version 3: an integrated program for molecular population genetics and molecular evolution analysis.

Authors: J Rozas; R Rozas
Journal: Bioinformatics Date: 1999-02 Impact factor: 6.937

2. Covariation of GC content and the silent site substitution rate in rodents: implications for methodology and for the evolution of isochores.

Authors: L D Hurst; E J Williams
Journal: Gene Date: 2000-12-30 Impact factor: 3.688

3. Maximum likelihood methods for detecting adaptive evolution after gene duplication.

Authors: Joseph P Bielawski; Ziheng Yang
Journal: J Struct Funct Genomics Date: 2003

4. Increased rates of molecular evolution in an equatorial plant clade: an effect of environment or phylogenetic nonindependence?

Authors: Jeremy M Brown; Gregory B Pauly
Journal: Evolution Date: 2005-01 Impact factor: 3.694

5. Codon usage and selection on proteins.

Authors: Joshua B Plotkin; Jonathan Dushoff; Michael M Desai; Hunter B Fraser
Journal: J Mol Evol Date: 2006-10-14 Impact factor: 2.395

6. A likelihood approach for comparing synonymous and nonsynonymous nucleotide substitution rates, with application to the chloroplast genome.

Authors: S V Muse; B S Gaut
Journal: Mol Biol Evol Date: 1994-09 Impact factor: 16.240

7. Synonymous codon usage in Escherichia coli: selection for translational accuracy.

Authors: Nina Stoletzki; Adam Eyre-Walker
Journal: Mol Biol Evol Date: 2006-11-13 Impact factor: 16.240

8. Contrasting evolutionary forces in the Arabidopsis thaliana floral developmental pathway.

Authors: Kenneth M Olsen; Andrew Womack; Ashley R Garrett; Jane I Suddith; Michael D Purugganan
Journal: Genetics Date: 2002-04 Impact factor: 4.562

9. Determinants of DNA sequence divergence between Escherichia coli and Salmonella typhimurium: codon usage, map position, and concerted evolution.

Authors: P M Sharp
Journal: J Mol Evol Date: 1991-07 Impact factor: 2.395

10. Statistical methods for detecting molecular adaptation.

Authors:
Journal: Trends Ecol Evol Date: 2000-12-01 Impact factor: 17.712