
Phylogenetic relationships in the genus Avena based on the nuclear Pgk1 gene.

Yuanying Peng1, Pingping Zhou1,2, Jun Zhao1, Junzhuo Li1, Shikui Lai1, Nicholas A Tinker3, Shu Liao1, Honghai Yan1,2.   

Abstract

The phylogenetic relationships among 76 Avena taxa, representing 14 diploids, eight tetraploids, and four hexaploids were investigated by using the nuclear plastid 3-phosphoglycerate kinase gene (Pgk1). A significant deletion (131 bp) was detected in all the C genome homoeologues which reconfirmed a major structural divergence between the A and C genomes. Phylogenetic analysis indicated the Cp genome is more closely related to the polyploid species than is the Cv genome. Two haplotypes of Pgk1 gene were obtained from most of the AB genome tetraploids. Both types of the barbata group showed a close relationship with the As genome diploid species, supporting the hypothesis that both the A and B genomes are derived from an As genome. Two haplotypes were also detected in A. agadiriana, which showed close relationships with the As genome diploid and the Ac genome diploid, respectively, emphasizing the important role of the Ac genome in the evolution of A. agadiriana. Three homoeologues of the Pgk1 gene were detected in five hexaploid accessions. The homoeologues that might represent the D genome were tightly clustered with the tetraploids A. maroccana and A. murphyi, but did not show a close relationship with any extant diploid species.

Year:  2018        PMID: 30408035      PMCID: PMC6224039          DOI: 10.1371/journal.pone.0200047

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

The genus Avena L. belongs to the tribe Aveneae of the grass family (Poaceae). It contains approximately 30 species [1–4] reflecting a wide range of morphological and ecological diversity over the temperate and subtropical regions [5]. The evolutionary history of Avena species has been discussed for decades, and remains a matter of debate despite considerable research effort in this field. Cytologically, three ploidy levels are recognized in the genus Avena: diploid, tetraploid, and hexaploid, with a base number of seven chromosomes [6, 7]. The diploids are divided clearly into two distinct lineages with the A and C genomes. All hexaploid species share the same genomic constitution of ACD, corroborated by fertile interspecific crosses among each other, as well as by their similar genome sizes [8]. With less certainty, the tetraploids have been designated as AB or AA, AC or DC, and CC genomes [9]. It is noteworthy that the B and D genomes within the polyploid species have not been identified in any extant diploid species. There are three C genome diploid species, which have been grouped into two genome types (Cp and Cv) according to their karyotypes [10]. Both types show a high degree of chromosome affinity to the polyploid C genome [9–14], but neither has been indisputably identified as the C genome progenitor of the polyploids. The A genome origin of polyploid oats has also been under intense scrutiny. However, there is no conclusive evidence regarding which A genome diploid contributed to the polyploid oats. There are up to 12 species designated as A genome diploids. These species have been further subdivided into five sub-types of Ac, Ad, Al, Ap and As genomes, according to their karyotypes [6, 7]. Most research based on karyotype comparisons [6, 15], in situ hybridization [11, 16–18], as well as the alignments of nuclear genes [13, 14] suggests that one of the As genome species may be the A genome donor of polyploid oats.
Alternatively, some studies have proposed the Ac genome diploid A. canariensis [19], or the Al genome diploid A. longiglumis [9, 12] as the most likely A genome donor. The absence of diploids with the B and D genomes complicates the B and D genome donor identification. It is generally accepted that both B and D genomes are derived from A genomes, due to the high homology between the B and A genomes [11, 20], as well as between the D and A genomes [16, 19, 21]. Our recent study based on high-density genotyping-by-sequencing (GBS) markers [9] provided strong evidence that the three tetraploid species formerly designated as AC genomes are much closer to the C and D genomes of the hexaploids than they are to the hexaploid A genome. These findings suggest that the hexaploid D genome exists in the extant tetraploids. However, no extant diploid species, even the Ac genome diploid A. canariensis, which was considered as the most likely D genome progenitor based on direct evidence from morphological features [22] and indirect evidence from fluorescent in situ hybridization (FISH) [18], showed enough similarity to the D genome of tetraploid and hexaploid oats to warrant consideration as a direct D genome progenitor. In the case of the B genome, an initial study of chromosome pairing of hybrids between the AB genome tetraploids and the As genome diploids suggested that the B genome arose from the As genome through autoploidization [23]. This hypothesis was supported by another GBS study [19], which showed that the AB genome tetraploid species fell into a tight cluster with As genome diploids. However, other evidence from C-banding [24], FISH [17], RAPD markers [25], and DNA sequence alignment [14] has indicated a clear distinction between A and B genomes, suggesting an allotetraploid origin of the AB genome tetraploid species. 
The most probable A genome progenitor of the AB genome tetraploids is assumed to be an As genome diploid species, while the origin of the B genome of these species remains controversial. Single or low copy nuclear genes are widely used in phylogenetic analyses due to their bi-parental inheritance and to the informativeness of mutations. Such studies have successfully revealed multiple polyploid origins, and clarified hybridization events in a variety of plant families [26, 27]. In a previous study [14], we investigated the relationships among Avena species by sequencing the single-copy nuclear Acetyl-CoA carboxylase gene (Acc1). The results provided some useful clues to the relationships of Avena species. The Pgk1 gene, which encodes the plastid 3-phosphoglycerate kinase, is another nuclear gene that has been widely used to reveal the evolutionary history of the Triticum/Aegilops complex due to its single copy status per diploid chromosome in grasses [26, 28, 29]. The Pgk1 gene is now considered to be superior to the Acc1 gene in phylogenetic analysis, since it has more parsimony informative sites than the Acc1 gene [26, 29]. In the present study, we sequenced cloned Pgk1 gene copies from 76 accessions representing the majority of Avena species, in an attempt to further clarify evolutionary events in this important genus.

Materials and methods

Plant materials

A total of 76 accessions from 26 Avena species were investigated to represent the geographic range of six sections in Avena, together with one accession from Trisetopsis turgidula as a functional outgroup (Table 1). All seeds were provided by Plant Gene Resources of Canada (PGRC) or the National Small Grains Collection, Agriculture Research Service, United States Department of Agriculture (USDA, ARS) with the exception of the three accessions of A. insularis, which were kindly provided by Dr. Rick Jellen, Brigham Young University, Provo, UT, USA. The species A. atherantha, A. hybrida, A. matritensis and A. trichophylla described in Baum’s [1] monograph and A. prostrata described by Ladizinsky [30] were not included due to a lack of viable material.
Table 1

List of materials used in the present study including species, haplomes, accession number, origin, the number of sequenced clones, abbreviation displayed in MJ network, and the sequence number in Genbank (https://www.ncbi.nlm.nih.gov).

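| Taxa | Haplome | Accession number | Origin* | Number of sequenced clones | Abbreviation | GenBank accession |
|---|---|---|---|---|---|---|
| Section Ventricosa | | | | | | |
| A. clauda Dur. | Cp | CN 19242 | Turkey | 6 | CLA1_1 | KU888786 |
| | | CN 21378 | Greece | 2 | CLA2_1 | KU888787 |
| | | CN 21388 | Algeria | 2 | CLA3_1 | KU888804 |
| | | CN 24695 | Turkey | 2 | CLA4_1 | KU888784 |
| A. eriantha Dur. (syn A. pilosa Bieb.) | Cp | CIav 9050 | United Kingdom | 2 | ERI1_1 | KU888785 |
| | | PI 367381 | Madrid, Spain | 2 | ERI2_1 | KU888805 |
| A. ventricosa Balansa ex Coss. | Cv | CN 21405 | Algeria | 2 | VEN1_1 | KU888806 |
| | | CN 39706 | Azerbaijan | 2 | VEN2_1 | KU888807 |
| Section Agraria | | | | | | |
| A. brevis Roth | As | CIav 1783 | German | 1 | BRE1_1 | KU888707 |
| | | CIav 9113 | Europe | 2 | BRE2_1 | KU888718 |
| | | PI 258545 | Portugal | 1 | BRE3_1 | KU888710 |
| A. hispanica Ard. | As | CN 25676 | Portugal | 2 | HIS1_1 | KU888714 |
| | | CN 25727 | Portugal | 2 | HIS2_1 | KU888711 |
| | | CN 25766 | Portugal | 2 | HIS3_1 | KU888709 |
| | | CN 25778 | Portugal | 1 | HIS4_1 | KU888712 |
| A. nuda L. | As | PI 401795 | Netherlands | 2 | NUD1_1 | KU888734 |
| A. strigosa Schreb. | As | PI 83722 | Australia | 6 | STR1_1 | KU888719 |
| | | PI 158246 | Lugo, Spain | 2 | STR2_1 | KU888713 |
| | | CIav 9066 | Ontario, Canada | 3 | STR3_1 | KU888708 |
| Section Tenuicarpa | | | | | | |
| A. agadiriana Baum & Fedak | AB | CN 25837 | Africa: Morocco | 5 | AGA1_1 | KU888753 |
| | | | | | AGA1_2 | KU888774 |
| | | CN 25854 | Africa: Morocco | 4 | AGA2_1 | KU888777 |
| | | | | | AGA2_2 | KU888754 |
| | | CN 25856 | Africa: Morocco | 3 | AGA3_1 | KU888776 |
| | | | | | AGA3_2 | KU888751 |
| | | CN 25863 | Africa: Morocco | 3 | AGA4_1 | KU888775 |
| | | CN 25869 | Africa: Morocco | 4 | AGA5_1 | KU888752 |
| | | | | | AGA5_2 | KU888778 |
| A. atlantica Baum & Fedak | As | CN 25849 | Africa: Morocco | 2 | ATL1_1 | KU888757 |
| | | CN 25859 | Africa: Morocco | 1 | ATL2_1 | KU888756 |
| | | CN 25864 | Africa: Morocco | 2 | ATL3_1 | KU888739 |
| | | CN 25887 | Africa: Morocco | 2 | ATL4_1 | KU888737 |
| | | CN 25897 | Africa: Morocco | 1 | ATL5_1 | KU888736 |
| A. barbata Pott ex Link | AB | PI 296229 | Northern, Israel | 5 | BAR1_1 | KU888723 |
| | | PI 337802 | Izmir, Turkey | 8 | BAR2_1 | KU888722 |
| | | | | | BAR2_2 | KU888732 |
| | | PI 337826 | Greece | 6 | BAR3_1 | KU888720 |
| | | PI 282723 | Northern, Israel | 6 | BAR4_1 | KU888729 |
| | | PI 337731 | Macedonia, Greece | 8 | BAR5_1 | KU888731 |
| | | PI 367322 | Beja, Portugal | 6 | BAR6_1 | KU888730 |
| A. canariensis Baum et al | Ac | CN 23017 | Canary Islands | 6 | CAN1_1 | KU888779 |
| | | CN 23029 | Canary Islands | 2 | CAN2_1 | KU888782 |
| | | CN 25442 | Canary Islands | 1 | CAN3_1 | KU888780 |
| | | CN 26172 | Canary Islands | 2 | CAN4_1 | KU888783 |
| | | CN 26195 | Canary Islands | 2 | CAN5_1 | KU888781 |
| A. damascena Rajah & Baum | Ad | CN 19457 | Syria | 1 | DAM1_1 | KU888744 |
| | | CN 19458 | Syria | 2 | DAM2_1 | KU888745 |
| | | CN 19459 | Syria | 2 | DAM3_1 | KU888747 |
| A. hirtula Lag. | As | CN 19530 | Antalya, Turkey | 2 | HIR1_1 | KU888738 |
| | | CN 19739 | Algeria | 2 | HIR2_1 | KU888762 |
| | | CN 21703 | Morocco | 2 | HIR3_1 | KU888717 |
| A. longiglumis Dur. | Al | CIav 9087 | Oran, Algeria | 6 | LON1_1 | KU888741 |
| | | CIav 9089 | Libya | 2 | LON2_1 | KU888749 |
| | | PI 367389 | Setubal, Portugal | 1 | LON3_1 | KU888750 |
| A. lusitanica Baum | As | CN 25885 | Morocco | 1 | LUS1_1 | KU888746 |
| | | CN 25899 | Morocco | 1 | LUS2_1 | KU888748 |
| | | CN 26265 | Portugal | 2 | LUS3_1 | KU888742 |
| | | CN 26441 | Spain | 2 | LUS4_1 | KU888763 |
| A. wiestii Steud. | As | PI 53626 | Giza, Egypt | 2 | WIE1_1 | KU888715 |
| | | CIav 9053 | Ontario, Canada | 2 | WIE2_1 | KU888716 |
| Section Ethiopica | | | | | | |
| A. abyssinica Hochst. | AB | PI 411163 | Seraye, Eritrea | 4 | ABY1_1 | KU888724 |
| | | PI 411173 | Tigre, Ethiopia | 6 | ABY2_1 | KU888740 |
| | | | | | ABY2_2 | KU888725 |
| A. vaviloviana Mordv. | AB | PI 412761 | Eritrea | 4 | VAV1_1 | KU888743 |
| | | | | | VAV1_2 | KU888728 |
| | | PI 412766 | Shewa, Ethiopia | 5 | VAV2_1 | KU888726 |
| | | | | | VAV2_2 | KU888735 |
| Section Pachycarpa | | | | | | |
| A. insularis Ladiz. | AC(DC) | sn | Sicily, Italy | 4 | INS1_1 | KU888794 |
| | | | | | INS1_2 | KU888705 |
| | | 6-B-22 | Sicily, Gela, Italy | 4 | INS2_1 | KU888706 |
| | | | | | INS2_2 | KU888796 |
| | | INS-4 | Sicily, Gela, Italy | 3 | INS3_1 | KU888790 |
| | | | | | INS3_2 | KU888704 |
| A. maroccana Grand. (syn. A magna Murphy et Terrell) | AC(DC) | CIav 8330 | Morocco | 3 | MAR1_1 | KU888773 |
| | | | | | MAR1_2 | KU888799 |
| | | CIav 8331 | Khemisset, Morocco | 3 | MAR2_1 | KU888721 |
| | | | | | MAR2_2 | KU888800 |
| A. murphyi Ladiz. | AC(DC) | CN 21989 | Spain | 4 | MUR1_1 | KU888767 |
| | | | | | MUR1_2 | KU888802 |
| | | CN 25974 | Morocco | 3 | MUR2_1 | KU888769 |
| | | | | | MUR2_2 | KU888788 |
| Section Avena | | | | | | |
| A. fatua L. | ACD | PI 447299 | Gansu, China | 6 | FAT1_1 | KU888768 |
| | | | | | FAT1_2 | KU888795 |
| | | | | | FAT1_3 | MH780169 |
| | | PI 544659 | United States | 7 | FAT2_1 | KU888764 |
| | | | | | FAT2_2 | KU888760 |
| | | | | | FAT2_3 | KU888798 |
| A. occidentalis Dur. | ACD | CN 4547 | Canary Islands, Spain | 6 | OCC1_1 | KU888791 |
| | | | | | OCC1_2 | MH780167 |
| | | | | | OCC1_3 | MH780165 |
| | | CN 23036 | Canary Islands, Spain | 8 | OCC2_1 | KU888755 |
| | | | | | OCC2_2 | KU888803 |
| | | | | | OCC2_3 | KU888771 |
| | | CN 25942 | Morocco | 7 | OCC3_1 | KU888733 |
| | | | | | OCC3_2 | KU888789 |
| | | | | | OCC3_3 | KU888758 |
| | | CN 25956 | Morocco | 8 | OCC4_1 | KU888801 |
| | | | | | OCC4_2 | KU888772 |
| A. sativa L. | ACD | PI 194896 | Gonder, Ethiopia | 6 | SAT1_1 | KU888727 |
| | | | | | SAT1_2 | KU888759 |
| | | | | | SAT1_3 | KU888793 |
| | | PI 258655 | Russian Federation | 8 | SAT2_1 | KU888797 |
| | | | | | SAT2_2 | KU888766 |
| | | | | | SAT2_3 | KU888761 |
| A. sterilis L. | ACD | PI 411503 | Alger, Algeria | 8 | STE1_1 | KU888765 |
| | | | | | STE1_2 | MH780168 |
| | | PI 411656 | Tigre, Ethiopia | 7 | STE2_1 | KU888792 |
| | | | | | STE2_2 | KU888770 |
| | | | | | STE2_3 | MH780166 |
| Outgroup | | | | | | |
| Trisetopsis turgidula Röser & A. Wölk | | PI 364343 | Maseru, Lesotho | 1 | | KU888808 |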

* Origin represents the collection site of wild material where this information is available, otherwise it represents the earliest source for which information is available.


DNA isolation, cloning and sequencing

Genomic DNA was isolated from fresh leaves of single plants following a standard CTAB protocol [31]. Pgk1 gene sequences were amplified by using a pair of Pgk1-specific primers, PGKF1 (5’-TCGTCCTAAGGGTGTTACTCCTAA-3’) and PGKR1 (5’-ACCACCAGTTGAGATGTGGCTCAT-3’) described by Huang et al. [28]. Polymerase chain reactions (PCR) were carried out under cycling conditions reported previously [26]. High fidelity Taq DNA polymerase (Ex-Taq, Takara, Japan, Cat#RR001A) was used to reduce the potential PCR-based mutation. After estimating the size by 1.0% agarose gel, PCR products were purified using the QIAquick gel extraction kit (QIAGEN Inc., USA). The purified products were cloned into the pMD19-T vector (Takara) following the manufacturer's instructions. Initially, 6–8 positive clones from each of four accessions from 4 diploid species, including A. canariensis (Ac), A. longiglumis (Al), A. strigosa (As), and A. clauda (Cp), were sequenced to confirm that the Pgk1 gene was present in Avena diploid species as a single copy. After confirming its single copy status in diploid species, 2–3 positive clones were selected and sequenced from each accession of the remaining diploid species. In order to isolate all possible homoeologous sequences in polyploid species, 4–6 positive clones from each accession of the tetraploid species and 5–10 positive clones from each accession of the hexaploid species were selected and sequenced. All the cloned PCR products were sequenced on both strands by a commercial company (Sangon Biotech Co., Ltd., Shanghai, China) based on Sanger sequencing technology.
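The number of clones sequenced per accession determines how likely it is that every homoeologue is sampled at least once. As a rough illustration only, the sketch below assumes that each sequenced clone is an independent and equally likely draw from the homoeologues present. This idealization ignores PCR and cloning bias and is not part of the original methods; the function name `prob_all_copies_recovered` is hypothetical.

```python
from math import comb


def prob_all_copies_recovered(n_copies: int, n_clones: int) -> float:
    """Probability that n_clones random clones include every one of n_copies
    homoeologues, assuming independent and equally likely draws
    (inclusion-exclusion over the copies that could be missed)."""
    return sum(
        (-1) ** j * comb(n_copies, j) * ((n_copies - j) / n_copies) ** n_clones
        for j in range(n_copies + 1)
    )


if __name__ == "__main__":
    # Clone counts are the ends of the sampling ranges stated in the text.
    for label, copies, clones in [
        ("tetraploid", 2, 4),
        ("tetraploid", 2, 6),
        ("hexaploid", 3, 5),
        ("hexaploid", 3, 10),
    ]:
        p = prob_all_copies_recovered(copies, clones)
        print(f"{label}: {copies} copies, {clones} clones -> {p:.3f}")
```

Under these assumptions, four clones recover both copies of a tetraploid with probability 0.875, and five clones recover all three copies of a hexaploid with probability of about 0.617, so failing to detect a homoeologue in some accessions is not unexpected.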

Sequence alignment and phylogenetic analysis

The homology of sequences was verified using the BLAST program in NCBI. In order to reduce the matrix size of the dataset, redundant sequences were removed, keeping one representative sequence if several identical sequences were derived from the same accession. Sequences were aligned using ClustalW software with default parameters [32] followed by manual correction. Substitution saturation of Pgk1 sequences was examined using DAMBE version 5 [33] by calculating and plotting pairwise rates of transitions and transversions against sequence divergence under the TN93 model. Phylogenetic trees were created by using maximum parsimony (MP) and Bayesian inference (BI). MP analysis was performed in PAUP* 4.0b10 [34] using the heuristic search with 100 random addition sequence replicates and the Tree Bisection-Reconnection (TBR) branch swapping algorithm. Bootstrap support from 1000 replicates was used to assess the robustness of branches [35]. Gaps in the sequence alignment were disregarded using the option “gapmode = missing”, which is consistent with an assumption that insertion/deletion events are an independent stochastic process from SNP substitutions. BI analysis was carried out by using MrBayes v3.2 [36]. The best-fit substitution model for BI analysis was GTR+Γ+I, which was determined by using MrModelTest v2.3 under the Akaike information criterion (AIC) (http://www.ebc.uu.se/systzoo/staff/nylander.html). Four Markov chain Monte Carlo (MCMC) chains with default prior settings were run simultaneously. To ensure the two runs converged onto the stationary distribution, 6,000,000 generations were run to make the standard deviation of split frequencies fall below 0.01. Samples were taken every 100 generations. The first 25% of samples from each run were discarded as the “burn-in”. The 50% majority-rule consensus tree was constructed from the remaining trees. Posterior probability (PP) values were used to evaluate the statistical confidence of each node.
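The MCMC settings above imply a fixed number of sampled trees per run. The following sketch only works through that arithmetic. It is not a MrBayes call: the function name `mcmc_sample_budget` is hypothetical, and the count ignores the initial (generation 0) sample that some programs also write out.

```python
def mcmc_sample_budget(generations, sample_every, burnin_fraction, n_runs):
    """Return the number of sampled, discarded and retained trees implied by
    the chain length, sampling interval and burn-in fraction."""
    samples_per_run = generations // sample_every
    burnin_per_run = int(samples_per_run * burnin_fraction)
    kept_per_run = samples_per_run - burnin_per_run
    return {
        "samples_per_run": samples_per_run,
        "burnin_per_run": burnin_per_run,
        "kept_per_run": kept_per_run,
        "kept_total": kept_per_run * n_runs,
    }


if __name__ == "__main__":
    # Values from the text: 6,000,000 generations, sampling every 100
    # generations, 25% burn-in, two runs.
    print(mcmc_sample_budget(6_000_000, 100, 0.25, 2))
```

With the stated settings, each run yields 60,000 samples, of which 15,000 are discarded and 45,000 are kept, so the consensus tree would summarize about 90,000 trees across the two runs.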

Network analysis

The median-joining (MJ) network [37] method has been demonstrated to be an effective method for assessing the relationship in closely related lineages [38], and thus was applied in this study. As MJ algorithms are designed for non-recombining molecules [37], DNA recombination was tested using a pragmatic approach, Genetic Algorithm Recombination Detection (GARD), described by Pond et al. [39]. The test was carried out on a web-based interface for GARD at http://www.datamonkey.org/GARD/. Based on this test, the intron data were used for MJ reconstruction because no recombination signal was detected in them, whereas potential recombination signals were detected in the exon regions. The MJ network analysis was performed using the Network 4.6.1.4 program (Fluxus Technology Ltd, Clare, Suffolk, UK).

Results

Sequence analysis

A total of 268 clones were sequenced from 76 accessions of 26 Avena species. BLASTn analysis indicated that these sequences ranged in identity from 84% to 87% with wheat Pgk1 (AF343478) with high query coverage (more than 90%), and from 77% to 100% with wheat Pgk2 (AF343449) but with very low query coverage (less than 35%), confirming the proper identity of all clones as Pgk1. Following removal of the redundant sequences within each accession, 109 sequences were identified, including one from each of the 44 diploid accessions, 37 unique sequences from 22 tetraploids, and 28 from 10 hexaploids. Theoretically, 44 homoeologues should be isolated from 22 tetraploid accessions, and 30 single-copy homoeologues were expected from 10 hexaploid accessions. Despite a high number of cloning attempts in the A. barbata accessions (Table 1), only one copy was detected in five of the six accessions, whereas two very similar copies (differing at only one site in exon 2) were detected in the sixth accession. It is possible that these accessions contain genomes of high similarity or are of autopolyploid origin. Another possibility that cannot be ruled out within the polyploids is the loss of one gene copy through homoeologous recombination or deletion. All of the Pgk1 gene sequences isolated in this study contain 5 exons and 4 introns, covering a total length from 1391 bp to 1527 bp, which is consistent with previous studies of this gene in wheat [28] and Kengyilia [26]. The alignment of Pgk1 sequences was edited and deposited in TreeBase (http://treebase.org) under the following URL: http://purl.org/phylo/treebase/phylows/study/TB2:S23228. Including both exons and introns, this alignment resulted in a matrix of 1539 nucleotide positions, of which 11.6% (179/1539) were variable, and 10.1% (156/1539) were parsimony informative. The nucleotide frequencies were 0.259 (A), 0.300 (T), 0.206 (C), and 0.235 (G).
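The counts of variable and parsimony-informative positions reported above can be illustrated with a minimal sketch. It uses the usual definitions (a variable column has at least two different nucleotides; a parsimony-informative column has at least two nucleotides that each occur in at least two sequences) and treats gaps as missing data, in line with the “gapmode = missing” setting. The function name `site_summary` and the toy alignment are hypothetical, and this is not the software used in the study.

```python
from collections import Counter


def site_summary(alignment):
    """Count variable and parsimony-informative columns in an alignment.

    `alignment` is a list of equal-length strings. Characters other than
    A, C, G and T (gaps, N, and so on) are treated as missing data.
    """
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all sequences must have the same aligned length")
    n_variable = 0
    n_informative = 0
    for column in zip(*alignment):
        counts = Counter(c.upper() for c in column if c.upper() in "ACGT")
        if len(counts) >= 2:
            n_variable += 1
            # Informative: at least two states, each seen in >= 2 sequences.
            if sum(1 for n in counts.values() if n >= 2) >= 2:
                n_informative += 1
    return n_variable, n_informative


if __name__ == "__main__":
    toy = ["ACGTA", "ACGTA", "ATGCA", "ATG-A"]  # made-up sequences
    print(site_summary(toy))
    # Percentages reported in the text for the 1539-position matrix.
    print(round(100 * 179 / 1539, 1), round(100 * 156 / 1539, 1))
```

In the toy alignment, the second column is variable and informative, while the fourth column is variable but not informative because one of its two states occurs only once.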
A significant (131 bp) insertion/deletion feature (Fig 1A) occurred at position 968, whereby all non-C genome type sequences contained the inserted (or non-deleted) region. Further analysis indicated that this region is likely an inserted inverted repeat, which belongs to the MITE stowaway element. Its secondary structure is shown in Fig 1B. This insertion/deletion event could be used as a genetic marker for rapid diagnosis of Avena species containing the C genome.
Fig 1

Pgk1 gene sequence analysis.

(A) Partial alignment of the amplified Pgk1 gene of Avena species (B) Secondary structure of the deletion sequence between the A and C genomes.


Phylogenetic analyses

The substitution plot for Pgk1 (Fig 2) indicated that the Pgk1 gene was not saturated and that it could be used for phylogenetic analysis. Phylogenetic trees of 76 Avena accessions with the oat-like species Trisetopsis turgidula as outgroup were generated through maximum parsimony and Bayesian inference approaches on the non-redundant dataset. The parsimony analysis resulted in 80 equally parsimonious trees (consistency index (CI) = 0.637, retention index (RI) = 0.956). BI analysis inferred an almost identical tree topology as the MP analysis (S1 Fig).
Fig 2

Saturation plot for transition and transversion of Pgk1 gene sequences.

The crosses are the number of transition events; the triangles are the number of transversion events. The x axis shows the genetic distance based on the TN93 model; the y axis is the proportion of transitions or transversions, calculated as the number of transitions or transversions divided by the sequence length. The curves show the trends of transitions and transversions as the genetic distance increases.
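The y-axis quantity described in this caption can be sketched for a single pair of aligned sequences as follows. This is an illustrative calculation, not the DAMBE implementation: the function name `ts_tv_proportions` and the example sequences are hypothetical, positions with a gap or ambiguous base are assumed not to count as differences, and the TN93 distance on the x axis is not computed here.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def ts_tv_proportions(seq1, seq2):
    """Return (transition proportion, transversion proportion) for two
    aligned sequences, dividing each count by the aligned length."""
    if len(seq1) != len(seq2) or not seq1:
        raise ValueError("sequences must be non-empty and of equal length")
    transitions = 0
    transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue  # identical, gapped or ambiguous position
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1  # purine<->pyrimidine
    length = len(seq1)
    return transitions / length, transversions / length


if __name__ == "__main__":
    # Made-up pair: one C/T transition and one T/A transversion in 10 sites.
    print(ts_tv_proportions("ACGTACGTAC", "ATGTACGAAC"))
```

Plotting these two proportions against a pairwise distance for every sequence pair gives the kind of scatter shown in the figure; saturation would appear as a plateau in the transition curve at larger distances.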

Both Fig 3 and S1 Fig show that the Pgk1 gene sequences from 76 Avena accessions were split into two distinct clades with high BS (100% and 95%) and PP (100% and 100%) support. One clade contained all C-genome type sequences, henceforth referred to as the C genome clade. The other clade contained all sequences from the species carrying the A genome, henceforth referred to as the A genome clade. The C genome clade was composed of two major subclades. All Cv genome diploids formed the subclade C1 with 100% BS and 100% PP support, while subclade C2 included six Cp diploid accessions, seven AC(DC) genome tetraploid accessions and nine hexaploid accessions with 74% BS and 99% PP support.
The Pgk1 gene sequences in the A genome clade were further split into five major subclades. One genome copy of the AC(DC) genome tetraploid species A. insularis clustered with five accessions of the Ac genome diploid species A. canariensis and one genome homoeologue of the AB genome tetraploid species A. agadiriana, consequently forming the subclade A1 with low BS (51%) and PP (less than 90%) support. Within A1, the Ac genome diploids showed close relationships with the AB genome tetraploid species A. agadiriana. Subclade A2 was composed of four accessions of the AB genome tetraploid A. agadiriana, nine hexaploid accessions (A. occidentalis CN 23036, CN 25942 and CN 4547, A. sativa PI 194896 and PI 258655, A. fatua PI 447299 and PI 544659, A. sterilis PI 411503 and PI 411656) and four As genome diploid accessions (A. atlantica CN 25849 and CN 25859, A. lusitanica CN 26441, and A. hirtula CN 19739). One genome sequence of the AC(DC) genome tetraploids (without A. insularis) together with eight hexaploid taxa formed a homogeneous clade (A3) that was separated from other species with high BS (100%) and PP (100%) support. The subclade A4 consisted of the Ad genome diploid A. damascena, the Al genome diploid A. longiglumis, and the As genome diploid A. lusitanica. The remaining sequences from the A genome diploids and the AB genome tetraploids (without A. agadiriana) formed a relatively broader cluster A5, together with two hexaploid accessions (A. sativa PI 194896 and A. occidentalis CN 25942) and one AC(DC) genome tetraploid accession (A. maroccana CIav 8331).
Fig 3

Maximum parsimony tree derived from Pgk1 sequence data.

The tree was constructed using a heuristic search with TBR branch swapping. Numbers above the branches are bootstrap support (BS) values ≥50%. Accession number, species name and haplome are indicated for each taxon.

Three groups of haplotypes of Pgk1 sequences were identified in eight of ten hexaploid accessions (A. fatua PI 447299 and PI 544659, A. occidentalis CN 25942, CN 23036, and CN 4547, A. sativa PI 194896 and PI 258655, and A. sterilis PI 411656). These sequences fell into four subclades. One group clustered with the C genome diploids in subclade C2, and one group clustered with AC(DC) genome tetraploids in subclade A3. We hypothesize that these two types represent homoeologues from the C and D genomes, respectively. This interpretation is consistent with strong evidence presented by Yan et al. [9] that the AC(DC) tetraploids contain the progenitor D genome of the hexaploids. A third and fourth group fell into subclades A2 and A5. Since these two groups are highly separated, it is possible that they represent different A-genome events leading to different hexaploid lineages.
To gain better insight into relationships within closely related lineages, MJ network reconstruction based on the haplotypes of Pgk1 sequences was employed. Due to the potential presence of recombination in the exon regions, the intron data was used for MJ network reconstruction. A total of 41 haplotypes were derived from 109 Pgk1 gene sequences (Fig 4). This low level of haplotype diversity demonstrates the high conservation of this gene within genus Avena. The MJ network recovered a nearly identical phylogenetic reconstruction to that based on the MP and BI trees; therefore, we identified the clades from the MP results (Fig 3) within the MJ network (Fig 4). Based on the topology and frequency of haplotypes, the MJ network was split into two main groups.
The two major groups representing two distinct types of haplotypes (A and C genomes) were distinguished due to the 131 bp insertion/deletion. Ten C genome haplotypes were observed, which were much less diverse than the 31 A genome haplotypes. The two main groups were further subdivided into clusters corresponding to the seven MP-based subclades discussed earlier.
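The step from 109 sequences to 41 haplotypes amounts to grouping identical intron sequences and counting how many sequences share each one. A minimal sketch of that grouping is given below. It is not the procedure used by the Network program: the function name `collapse_haplotypes` is hypothetical, the labels follow the Table 1 abbreviation scheme, and the short sequences are invented for illustration.

```python
from collections import defaultdict


def collapse_haplotypes(sequences):
    """Group identical aligned sequences into haplotypes.

    `sequences` maps a label to its aligned sequence. Returns a list of
    (sequence, labels) pairs, most frequent haplotype first.
    """
    groups = defaultdict(list)
    for label, seq in sequences.items():
        groups[seq.upper()].append(label)
    return sorted(groups.items(), key=lambda item: (-len(item[1]), item[0]))


if __name__ == "__main__":
    # Toy data only: these are not the real intron sequences.
    toy = {
        "STR1_1": "ACGTTGCA",
        "BRE1_1": "ACGTTGCA",
        "HIS1_1": "ACGTTGCA",
        "CAN1_1": "ACGATGCA",
        "CLA1_1": "AC--TGCA",
    }
    for seq, labels in collapse_haplotypes(toy):
        print(seq, len(labels), labels)
```

The number of labels attached to each haplotype corresponds to the node size in an MJ network, and the labels themselves show which species share a haplotype.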
Fig 4

Median-joining networks based on 41 Pgk1 gene haplotypes of intron regions derived from 26 Avena species.

Each circular node represents a single haplotype, with relative size being proportional to the frequency of that haplotype. Distinct colors in the same haplotype node represent different species sharing the same haplotype (colors are arbitrary). Median vectors (mv) represent the putative missing intermediates. Numbers along network branches indicate the number of bases involved in mutations between two nodes. Clusters (surrounded by dashed lines) are named based on clade names shown in the MP tree (Fig 3). Three-letter abbreviations of species names are listed in Table 1. The numbers immediately after each species abbreviation represent different accessions of the same species, and the number following the underscore identifies different haplotypes from the same accession.
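The labelling scheme described in this caption (three-letter species abbreviation, accession index, underscore, haplotype index) can be decoded mechanically. The sketch below assumes that every label follows exactly this pattern, as the entries in Table 1 do; the function name `parse_label` is hypothetical.

```python
import re

# Three capital letters, accession index, underscore, haplotype index.
LABEL_PATTERN = re.compile(r"^([A-Z]{3})(\d+)_(\d+)$")


def parse_label(label):
    """Split a label such as 'AGA1_2' into (species, accession, haplotype)."""
    match = LABEL_PATTERN.match(label)
    if match is None:
        raise ValueError(f"unexpected label format: {label!r}")
    species, accession, haplotype = match.groups()
    return species, int(accession), int(haplotype)


if __name__ == "__main__":
    for label in ["AGA1_2", "OCC3_3", "CLA1_1"]:
        print(label, parse_label(label))
```

For example, `AGA1_2` denotes the second haplotype from the first A. agadiriana accession (CN 25837 in Table 1).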


Discussion

Two distinct diploid lineages exist in genus Avena

A significant 131 bp insertion/deletion separated all Avena diploid species into two distinct groups representing the A and C genomes, respectively (Figs 1 and 4). These groups were also separated based on the MP or BI analysis that ignored gaps (Fig 3 and S1 Fig), indicating that the separation of A and C genomes is the most ancient major articulation in the genus Avena, a result that is consistent with most other literature [13, 14, 40]. MJ network analysis revealed that the C genome diploids have much lower levels of haplotype diversity than the A genome diploids. Within the C genome diploids, the Cp genome haplotypes were relatively more diverse than those of the Cv genome. These results might be explained by the geographic distribution of these species. The A genome diploids are distributed in a large region between latitude 20 and 40° N, while the C genome diploid species are restricted to a narrow territory along the Mediterranean shoreline [1]. The geographic distributions of the C genome diploid species are overlapping, but the range of the Cp genome diploid species is much broader than that of the Cv genome diploid species [41]. The A genome diploid species are the most diverse set of species in genus Avena, and chromosome rearrangements have occurred during the divergence of A-genomes from a common progenitor [41], resulting in the subdivision of the A genome into five types, of which we have investigated four. Our results showed that species with genome types Ac, Al, and Ad formed groups that correspond well with previously reported structural differences. However, the As genome diploids appear to be much more diverse than previously reported, and are scattered into different subclades (Figs 3 and 4). Baum [1] divided all As genome diploids into two sections, section Agraria and section Tenuicarpa. All species of section Agraria have florets with a domesticated (non-shattering) base, whereas the other As species share relatively narrow spikelets. 
However, classification based on simple morphological traits is increasingly controversial. In this study, the As genome diploid species of section Agraria showed a high degree of genetic homogeneity, consistently forming their own subclade A5, but other As genome species in section Tenuicarpa did not have their own subclade. A. wiestii showed a close relationship with the species of section Agraria, suggesting that it may be better classified within that section. This result is in agreement with previous studies based on RAPD [42] and karyotypic comparisons [43]. Accessions of the other two As genome species of section Tenuicarpa (A. atlantica and A. hirtula) were scattered into different subclades. These results were also observed in other studies [13, 14]. A. lusitanica, another As species of section Tenuicarpa, diverged from the other As species, but showed a close relationship with the Ad genome species A. damascena. This divergence has also been observed in many other studies [8, 9, 14, 40]. These and other incongruences between morphological characters and genetic differences raise questions about appropriate taxonomical classifications among As genome species.

The As and Ac genomes played roles in the AB tetraploid formation

Four recognized species have been proposed to have an AB genome composition. Of these, A. barbata, A. abyssinica and A. vaviloviana are grouped into a biological species known as the barbata group, while A. agadiriana is distinct [25]. Our results confirmed the reported structural differences between these two groups (Fig 3). Two different Pgk1 gene sequences were detected from most of the AB genome tetraploids, supporting their allotetraploid origins. However, the genomes of A. barbata showed the least divergence, with only one of six A. barbata accessions providing multiple sequences, both of which were very similar. It seems that little divergence has occurred within the genome of A. barbata compared with that of A. abyssinica and A. vaviloviana. This result has also been observed in a previous study based on FISH and Southern hybridization analysis [17], which found that some B chromosomes of A. vaviloviana are involved in inter-genomic translocations, while these rearrangements were not detected in A. barbata. There is little doubt that the A genome diploids have been involved in the formation of the barbata species. Some studies have suggested that both the A and B genomes of barbata species are diverged As genomes [16, 23, 44], while others have proposed that the B genome might have originated from other A genome diploid species [24, 25, 45]. In this study, both types of Pgk1 sequences detected from the barbata group showed a high degree of genetic homogeneity with the As genome diploids (Fig 3), thus it was impossible to determine which type represents the A or B genome. The recently discovered tetraploid species A. agadiriana was also proposed to have an AB genome composition because of its high affinity with A. barbata [23]. However, this designation has been questioned due to chromosomal divergences between A. agadiriana and the barbata species, as revealed by cytological studies [45] and by molecular data [9, 13, 14].
In the current study, two distinct types of Pgk1 sequences were obtained in A. agadiriana. One copy clustered with the Ac genome species A. canariensis, whereas the other copy fell into cluster A2 with the As species A. atlantica, A. hirtula, A. lusitanica, and the hexaploids A. occidentalis, A. fatua and A. sativa (Fig 3 and S1 Fig). These results are in agreement with our previous studies based on the nuclear Acc1 gene [14] and GBS markers [9], and they support the proposal that A. agadiriana contains a different combination of A and/or B genomes from the barbata group, and that one of its two genomes originates from the Ac genome species A. canariensis, whereas the other one is closely related to the As species.

The tetraploid species A. maroccana and A. murphyi are closely related to the hexaploids, while A. insularis is diverged

The other tetraploid group (Avena sect. Pachycarpa) contains three species, A. maroccana, A. murphyi, and the recently discovered A. insularis. Initial studies based on genomic in situ hybridization [46] supported an AC genome designation for these species. However, this designation has been challenged by FISH analysis, which has revealed that this set of tetraploid species, like the D chromosomes of the hexaploid oats, lacks a repetitive element that is diagnostic of the A genome [18]. This, together with other molecular evidence [14, 47] and our recent whole-genome analysis based on GBS markers [9], suggests that these tetraploid species contain the genome designated as D in hexaploid oats, and that they are more properly designated as DC genome species. In the present study, two distinct Pgk1 homoeologues were detected in each of the three AC(DC) species, with each pair falling consistently into two clusters within the C and the A genome clades, respectively (Fig 3 and S1 Fig). The C-copy sequences of these tetraploids clustered consistently with the C-type homoeologues of the hexaploids, while the A/D genome homoeologues, with the exception of those from A. insularis and one sequence from A. maroccana (CIav 8331), fell into subclade A3 along with a set of sequences from the hexaploid oats (Fig 3). Considering that the other Pgk1 gene sequences from the hexaploid oats clustered with the C or A genome diploids, we deduced that the sequences falling in subclade A3 must represent the D genome homoeologues of the hexaploids and of the AC(DC) species A. maroccana and A. murphyi. This result is not fully consistent with our previous GBS study: although A. maroccana and A. murphyi were very similar to hexaploid oat and were designated as DC genomes, our GBS work suggested that A. insularis was also a DC genome that was even more similar to the hexaploids [9].
In the existing literature, all three of these tetraploid species have variously been considered as the tetraploid ancestor of the hexaploids [4, 9, 48]. In view of the genome structure of these tetraploids [24, 49] and the meiotic chromosome pairing of their interspecific hybrids [50], all of these tetraploids are proposed to have diverged from a common ancestral tetraploid after the occurrence of some large chromosome rearrangements [24, 49]. However, it cannot be ruled out that these tetraploids might have originated independently from different diploid ancestors, since they have shown close relationships with different diploid species [40]. Interestingly, the A/D-type homoeologues of A. insularis fell into a group with the Ac genome species A. canariensis and the AB genome species A. agadiriana. In fact, previous studies have revealed that A. canariensis is closely related to the DC genome tetraploids [15]. These results suggest the possibility that A. canariensis was involved in contributing an early version of a D genome in all three AC(DC) genome tetraploids. Nevertheless, we do not have an explanation for why the D genome copy of Pgk1 in A. insularis could have diverged so far from the version found in the hexaploids, especially since the C genome copies remain identical.

The genome origins of the hexaploid species

It is now generally accepted that two distinct steps were involved in the evolution of hexaploid oats. The first step would have been the formation of a DC genome hybrid from ancestral D and C genome diploids, followed by doubling of the chromosomes to form an allotetraploid. The second step would have involved hybridization of a DC tetraploid with a more recent A genome diploid, followed by doubling of the triploid hybrid [9, 13]. The diploid progenitor of the hexaploid C genome was probably restricted to the narrow geographic range where the three extant C genome diploids are distributed. However, numerous inter-genomic translocations among hexaploid chromosomes [9, 11, 51, 52] have decreased the homology between the C genome diploids and the hexaploid C genome, making the identification of the C genome donor of the hexaploids challenging. In this study, the Cp genome species shared the highest degree of genetic similarity with both the AC(DC) genome tetraploids and the hexaploids, leading us to conclude that a Cp genome species was the C genome donor of the polyploids. This conclusion is supported by other evidence from nuclear genes [13, 53]. This is important, since it was recently demonstrated that the maternal tetraploid and hexaploid genomes originated from an A genome species, not from a C genome species [54], rendering comparisons to the Cv vs Cp maternal genomes irrelevant in determining the origin of the nuclear C genome in the hexaploids.
The A genome origin of the hexaploids remains a matter of debate, and many A genome diploids have been suggested as putative diploid progenitors, as summarized by Peng et al. [13]. FISH analysis showed that an As-specific DNA repeat was restricted to the As and Al genomes, as well as the hexaploid A genome [18]. In this study, a close relationship between the As genome diploid A. atlantica and some hexaploid haplotypes was observed in the phylogenetic tree (Fig 3) and the MJ network (Fig 4). An A. atlantica genome origin is consistent with previous studies based on IGS-RFLP analysis [12] and the ppcB1 gene [40]. However, there is evidence in our work that some hexaploids may have an alternate A genome origin, closer to the Agraria section of As diploids. The presence of multiple A genome origins could explain variable results that have been reported in studies of hexaploid phylogeny.
In this study, strong evidence is presented for a D genome origin in the tetraploids A. maroccana and A. murphyi (Figs 3 and 4). However, these D genome sequences did not show a close relationship with any diploid species investigated in this study. Other than the discrepancy with A. insularis, this result is consistent with our recent GBS study [9]. One factor that may hinder the discovery of a D genome progenitor is the presence of inter-genomic translocations among all three genomes in the hexaploid [9, 52]. Two hexaploid accessions (A. occidentalis CN 25942 and A. sativa PI 194896) did not contribute haplotypes that clustered with the putative D genome sequences (subclade A3 in Fig 3). Although this may be a result of incomplete sampling, it may also result from inter-genomic translocations that have duplicated or eliminated copies of Pgk1.
In conclusion, this is the most comprehensive study to date that investigates a phylogeny in genus Avena using a single informative nuclear gene. It confirms or clarifies most previous work, and presents strong evidence in support of a working hypothesis for the origin of hexaploid oat. However, many questions still remain, and these will be best addressed through further studies involving multiple nuclear genes or whole genomes. We are collaborating on work that will provide exome-based gene diversity studies, but this work will require complete hexaploid reference sequences before it can be properly analyzed. Such reference sequences are currently in progress, so the next few years may see a revolution in our understanding of Avena phylogeny. Nevertheless, as many researchers in this field are aware, the polyploid species in this genus have experienced extensive chromosome rearrangement, which will continue to complicate phylogenetic studies. It may even be necessary to generate a pan-genome hexaploid reference sequence before definitive statements can be made.

Consensus tree based on 110 Pgk1 sequences reconstructed using Bayesian inference.

The GTR+Γ+I model was chosen as the best-fit substitution model by using MrModelTest v2.3 under AIC. Bayesian posterior probability (PP) values equal to or greater than 90% are shown above the branches. Accession number, species name and haplome are indicated for each taxon. (TIF)
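
As an illustration of the model-selection step named in this caption, the following minimal Python sketch ranks candidate substitution models by the Akaike information criterion, AIC = 2k - 2 ln L, where k is the number of free model parameters and ln L is the maximized log-likelihood. It is not the MrModelTest procedure itself: the log-likelihood values are invented for illustration, and the names `aic`, `rank_models`, `MODEL_PARAMS` and `HYPOTHETICAL_LNL` are assumptions introduced here. Branch-length parameters are left out of k, on the assumption that all models are scored on the same tree and so share that count.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: AIC = 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


# Free substitution-model parameters, excluding branch lengths
# (assumed identical across models because the tree is fixed).
MODEL_PARAMS = {
    "JC": 0,         # equal rates, equal base frequencies
    "HKY": 4,        # 1 transition/transversion ratio + 3 free base frequencies
    "GTR": 8,        # 5 free exchange rates + 3 free base frequencies
    "GTR+G": 9,      # GTR + gamma shape parameter
    "GTR+G+I": 10,   # GTR + gamma shape + proportion of invariable sites
}

# HYPOTHETICAL log-likelihoods, invented for this example only.
# They are not values from the study.
HYPOTHETICAL_LNL = {
    "JC": -5200.0,
    "HKY": -5100.0,
    "GTR": -5080.0,
    "GTR+G": -5040.0,
    "GTR+G+I": -5035.0,
}


def rank_models(lnl: dict, params: dict) -> list:
    """Return (model, AIC) pairs sorted from best (lowest AIC) to worst."""
    scores = {model: aic(lnl[model], params[model]) for model in lnl}
    return sorted(scores.items(), key=lambda item: item[1])


ranking = rank_models(HYPOTHETICAL_LNL, MODEL_PARAMS)
best_model, best_aic = ranking[0]

for model, score in ranking:
    print(f"{model:8s} AIC = {score:.1f}")
```

With these invented inputs the richest model is ranked first, because its gain in log-likelihood outweighs the penalty of 2 per extra parameter. With real data the ranking depends entirely on the fitted likelihoods.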