
Assembly and comparative analysis of the complete mitochondrial genome of three Macadamia species (M. integrifolia, M. ternifolia and M. tetraphylla).

Yingfeng Niu1, Yongjie Lu2, Weicai Song2, Xiyong He1, Ziyan Liu1, Cheng Zheng1, Shuo Wang2, Chao Shi2, Jin Liu1.   

Abstract

BACKGROUND: Macadamia is a true dicotyledonous plant that thrives in a mild, humid, low-wind environment. It is cultivated and traded internationally for its high-quality nuts and thus has significant development prospects and scientific research value. However, information on the genetic resources of Macadamia spp. remains scanty.
RESULTS: The mitochondrial (mt) genomes of three economically important Macadamia species, Macadamia integrifolia, M. ternifolia and M. tetraphylla, were assembled from Illumina sequencing data. The results showed that each species has 71 genes, including 42 protein-coding genes, 26 tRNAs, and 3 rRNAs. Repeated sequence analysis, RNA editing site prediction, and analysis of genes migrating from chloroplast (cp) to mt were performed in the mt genomes of the three Macadamia species. Phylogenetic analysis based on the mt genomes of the three Macadamia species and 35 other species was conducted to reveal the evolution and taxonomic status of Macadamia. Furthermore, the characteristics of the plant mt genome, including genome size and GC content, were studied through comparison with 36 other plant species. Finally, non-synonymous (Ka) and synonymous (Ks) substitution analysis showed that most of the protein-coding genes in the mt genome underwent negative selection, indicating their importance in the mt genome.
CONCLUSION: The findings of this study provide a better understanding of the Macadamia genome and will inform future research on the genus.

Year:  2022        PMID: 35503755      PMCID: PMC9064092          DOI: 10.1371/journal.pone.0263545

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.752


1. Introduction

Macadamia spp. belong to the family Proteaceae, order Proteales, class Magnoliopsida. The Proteaceae family has five subfamilies, 80 genera, and over 1600 species [1, 2]. Most of them are distributed in Oceania and South Africa, while a few occur in East Asia and South America. Notably, more than 100 species in the Proteaceae family produce flowers that are traded internationally [3]. In addition, the species grown in the northeastern part of Oceania are also rich in nuts. The genus Macadamia comprises four species: Macadamia integrifolia, M. jansenii, M. ternifolia, and M. tetraphylla. These species are naturally distributed in the subtropical rain forests from southeastern Queensland to northeastern New South Wales in Australia [4, 5]. Among them, M. integrifolia and M. tetraphylla produce edible nuts; thus, most commercial cultivars are either these two species or their hybrids. The other two species, M. jansenii and M. ternifolia, produce non-edible nuts containing high levels of bitter cyanogenic glycosides and thus have not been used in breeding [6, 7]. Macadamia seeds are sweet, with high nutritional and medicinal value, and have therefore earned the reputation of "King of Thousand Fruits". They are also traded internationally due to their high economic value [8].

Mitochondria (mt) are organelles that primarily convert biomass energy in living cells into chemical energy to fuel biological activities [9]. Additionally, they participate in other biological processes, including cell differentiation, apoptosis, cell growth, and cell division [10-13]. Therefore, mt are central to life activities within individual cells and the entire organism [14]. Both plastids and mt harbor genetic information and are thought to have evolved through endosymbiosis of free-living bacteria [15-17]. In most seed plants, nuclear genetic information is inherited from both parents, while the chloroplast (cp) and mt genomes are inherited maternally [18]. Thus, the influence of paternal genes can be temporarily ignored, which reduces the difficulty of genetic research and facilitates the study of genetic mechanisms [19].

Studies have shown that the size of the mt genome varies significantly between species. For example, plants have a larger mt genome than animals [20]. Furthermore, mt genome size in seed plants can vary by at least one order of magnitude, from ~222 Kb in Brassica napus [21] and ~316 Kb in Allium cepa [22] to ~3.9 Mb in Amborella trichopoda [23] and a striking ~11.3 Mb in Silene conica [24]. This variation may be caused by the abundance of non-coding regions and repeated elements in the plant mt genome [25]. DNA recombination between homologous sequences produces small circular sub-genomic DNA molecules, which coexist with the complete "master" genome in the cell. These genomes typically have repeats of several kb, leading to multiple heterogeneous forms of the genome [26-31]. The mutation rate of plant mt genomes is very low; however, their rearrangement rate is so high that there is almost no conservation of synteny [32-34].

The development of cost-effective and more efficient DNA sequencing methods, such as high-throughput sequencing, has accelerated mt genome sequencing. As of June 2021, the mt genomes of 618 green plant species had been released in the NCBI database (https://www.ncbi.nlm.nih.gov/). Long-term mutually beneficial symbiosis caused the mt to lose some of the original DNA, possibly by transfer, leaving only the DNA encoding it [35, 36]. Mt DNA integrates DNA from various sources by intracellular and horizontal transfer [37]. Therefore, the mt genome varies remarkably among plant species in length, gene sequence, and gene content [33]. The smallest known mt genome of a terrestrial plant is about 66 Kb and the largest is 11.3 Mb [24, 38, 39]; the number of genes is usually between 32 and 67 [40].
In this study, the mt genomes of three Macadamia species were sequenced, assembled, and annotated. Also, their genomic and structural features were analyzed and compared with other angiosperms (and gymnosperms). This study improves our understanding of Macadamia genetics and provides crucial data to inform future research on the evolution of mt genomes of land plants.

2. Materials and methods

2.1 Genome sequencing

The three Macadamia species examined in this study were collected from the Yunnan Institute of Tropical Crops (Xishuangbanna, China; 101°28’ E, 21°92’ N). Total genomic DNA was extracted from fresh leaves using a modified CTAB method [41]. The quantity and quality of the extracted DNA were assessed by spectrophotometry, and its integrity was evaluated by 1% (w/v) agarose gel electrophoresis. Qualified DNA samples were used for Illumina DNA library construction according to the standard procedure. A paired-end sequencing library with an insert size of 350 bp was constructed and sequenced on the Illumina HiSeq 4000 high-throughput sequencing platform. The sequencing strategy was PE150 (paired-end, 150 bp), with a sequencing data volume of not less than 1 Gb. The original image data files produced by the sequencer were converted into raw reads, and CASAVA software was used for base calling.

2.2 Genome assembly and annotation

SPAdes v.3.5.0 [42] was used to assemble the mt genome sequences. To correct the assembly, the raw sequencing data were mapped to the mitochondrial sequences using Geneious [43]. DOGMA [44] and NCBI were used to annotate the mt genome. BLASTn and BLASTp were used to compare the mt protein-coding genes and rRNAs with those of related species. tRNAscan-SE 2.0 [45] and ARWEN [46] were used to annotate tRNAs, and tRNAs with unreasonable length or incomplete structure were eliminated. Subsequently, a tRNA secondary structure diagram was generated. The final mt genomes of M. integrifolia, M. ternifolia, and M. tetraphylla have been deposited in GenBank (accession numbers: MW566570/MW566571/MW566572).

2.3 Analysis of repeat structure and sequence

Microsatellites within the mt genomes of the three Macadamia species were identified using MISA [47, 48]. The minimum numbers of repeats for motif lengths of 1, 2, 3, 4, 5, and 6 bp were set to 10, 6, 5, 4, 3, and 3, respectively. Tandem repeats were detected using Tandem Repeats Finder v4.09 [49] with default parameters.
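The thresholds above can be illustrated with a small script. This is only a minimal sketch of perfect-SSR detection using regular expressions; it is not MISA itself, and the function names and the example sequence are assumptions made for illustration.

```python
import re

# Minimum repeat counts per motif length (bp), as given in Section 2.3.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 4, 5: 3, 6: 3}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'AA', 'ATAT')."""
    return (motif + motif).find(motif, 1) == len(motif)


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples; start is 0-based."""
    seq = seq.upper()
    hits = []
    for size, min_rep in MIN_REPEATS.items():
        # One copy of the motif followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):  # skip e.g. 'AA' runs already found as 'A'
                hits.append((match.start(), motif, len(match.group(0)) // size))
    return sorted(hits)


# Hypothetical toy sequence: a 12-bp poly-A run and six AT repeats.
example = "GC" + "A" * 12 + "CGC" + "AT" * 6 + "GGC"
ssrs = find_ssrs(example)
```

Because the scan is a simple left-to-right regex search, overlapping candidates and compound SSRs are not handled the way a dedicated tool would handle them.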

2.4 DNA transformation from cp to mt and RNA editing analyses

The cp genome of M. integrifolia (NC_025288) was downloaded from the NCBI database. Chloroplast-like sequences were identified and the genome was mapped using TBtools [50]. The online program Predictive RNA Editor for Plants (PREP) suite [51] was adopted to identify the possible RNA editing sites in the protein-coding genes of the three Macadamia species. The cutoff value was set as 0.2 to ensure accurate prediction. The protein-coding genes from other plant mt genomes were used as references to reveal the RNA editing sites in the mt genomes of the three Macadamia species.

2.5 Phylogenetic tree construction and Ka/Ks analysis

The genome sequences of the three Macadamia species were compared with those of 35 other plant species (S1 Table) to further verify their phylogenetic position. The complete mt genome sequences of these species were available in the NCBI database. Phylogenetic analyses were performed on 23 conserved protein-coding genes (atp1, atp4, atp6, atp8, atp9, ccmB, ccmC, ccmFc, ccmFn, cob, cox1, cox2, cox3, matR, nad1, nad2, nad3, nad4, nad4L, nad5, nad6, nad7 and nad9) that were extracted from the mt genomes of the 35 plant species using TBtools [50]. These conserved genes were then aligned using Muscle [52] implemented in MEGA X [53]; the alignment was modified manually to eliminate gaps and missing data. The GTR + G + I model was determined to be the best model based on the Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) calculated by ModelFinder [54]. The Maximum Likelihood (ML) algorithm in MEGA X [53] was used to construct a phylogenetic tree. The bootstrap consensus tree was inferred from 1000 replications. Cycas taitungensis and Ginkgo biloba were designated as the outgroup in this analysis. The Ka and Ks substitution rates of protein-coding genes in the mitochondrial genomes of the three Macadamia species and other higher plants were analyzed. BLASTn in TBtools was used to extract the sequences of the corresponding protein-coding genes in the Macadamia and N. nucifera genomes. The Ka and Ks substitution rates of each protein-coding gene were estimated using the N. nucifera genome as a reference.
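For readers unfamiliar with the model-selection step, the two criteria are AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L, where k is the number of free parameters, n the number of alignment sites, and ln L the maximised log-likelihood; the model with the lowest score is preferred. The sketch below only illustrates the arithmetic: the log-likelihoods, parameter counts and alignment length are hypothetical values, not results from this study.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_sites):
    """Bayesian Information Criterion: k ln n - 2 ln L."""
    return n_params * math.log(n_sites) - 2 * log_likelihood


# Hypothetical fits: model name -> (log-likelihood, number of free parameters).
fits = {"JC": (-10500.0, 1), "HKY+G": (-10120.0, 6), "GTR+G+I": (-10050.0, 10)}
n_sites = 5000  # hypothetical alignment length

# The preferred model minimises the criterion.
best_aic = min(fits, key=lambda m: aic(*fits[m]))
best_bic = min(fits, key=lambda m: bic(*fits[m], n_sites))
```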

3. Results and discussion

3.1 Genomic features of the mt genomes of the three Macadamia species

The mt genomes of M. integrifolia, M. ternifolia and M. tetraphylla have the circular structure typical of terrestrial plant mt genomes (Fig 1). A total of 71 unique genes were identified in the mt genomes of the three Macadamia species, including 42 protein-coding, 26 tRNA, and 3 rRNA genes (Table 1). In addition, two copies of rrn26, ccmB, rps19, trnN-GTT, and trnH-GTG, and seven copies of trnM-CAT were identified. It has been established that the mt genomes of land plants contain a variable number of introns [55]. In the present study, the three mt genomes had ten intron-containing genes, with lengths ranging from 13 bp (rps3) to 31,841 bp (cox2): ccmFC, rpl2, rps3, and rps10 had two introns; cox2 had three; nad1, nad4, and nad5 had four; and nad2 and nad7 had five. All protein-coding genes had ATG as the start codon, except atp6, cox1, nad1, nad4L, rps4, and rps10, which had ACG. The termination codons of the protein-coding genes were distributed as follows: TAA 45.2%, TGA 28.6%, TAG 14.3%, CAA 9.5%, and CGA 2.4%.
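The termination-codon percentages can be reproduced from Table 1. The sketch below assumes counts tallied by hand from the "Inferred termination codon" column, with the duplicated genes ccmB and rps19 each counted twice, giving 42 gene copies; the variable names are illustrative.

```python
from collections import Counter

# Termination codons per gene copy, tallied by hand from Table 1
# (assumption: ccmB and rps19 are counted twice because they are duplicated).
stop_counts = Counter({"TAA": 19, "TGA": 12, "TAG": 6, "CAA": 4, "CGA": 1})

total = sum(stop_counts.values())

# Percentage of gene copies ending in each codon, rounded to one decimal.
usage = {codon: round(100 * n / total, 1) for codon, n in stop_counts.items()}
```

With these counts the rounded percentages match the values quoted in the text.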
Fig 1

Circular map of the mitochondrial genomes of the three Macadamia species.

Gene map showing 71 annotated genes of different functional groups.

Table 1

Gene profile and organization of three Macadamia species (M. integrifolia, M. ternifolia and M. tetraphylla).

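| Group of genes | Gene/element | Size (bp) | GC content | Amino acids (aa) | Inferred initiation codon | Inferred termination codon |
|---|---|---|---|---|---|---|
| ATP synthase | atp1 | 1530 | 45.29% | 509 | ATG | TGA |
| | atp4 | 597 | 43.05% | 198 | ATG | TAG |
| | atp6 | 783 | 39.08% | 260 | ACG | TAA |
| | atp8 | 480 | 40.63% | 159 | ATG | TAA |
| | atp9 | 225 | 46.67% | 74 | ATG | CAA |
| Cytochrome c biogenesis | ccmB(2) | 621, 621 | 42.83% | 206 | ATG | TGA |
| | ccmC | 771 | 44.23% | 256 | ATG | TAA |
| | ccmFC^a | 1356 | 46.53% | 451 | ATG | TAA |
| | ccmFN | 1734 | 47.58% | 577 | ATG | TGA |
| Ubiquinol cytochrome c reductase | cob | 1182 | 42.39% | 393 | ATG | TGA |
| Cytochrome c oxidase | cox1 | 1584 | 44.26% | 527 | ACG | TAA |
| | cox2^a | 822 | 42.34% | 273 | ATG | TAG |
| | cox3 | 798 | 45.11% | 265 | ATG | TGA |
| Maturases | matR | 1968 | 52.64% | 655 | ATG | TAG |
| NADH dehydrogenase | nad1^a | 978 | 44.99% | 325 | ACG | TAA |
| | nad2^a | 1467 | 40.90% | 488 | ATG | TAA |
| | nad3 | 357 | 41.74% | 118 | ATG | TAA |
| | nad4^a | 1488 | 42.67% | 495 | ATG | TGA |
| | nad4L | 303 | 37.29% | 100 | ACG | TAA |
| | nad5^a | 1989 | 41.78% | 662 | ATG | TAA |
| | nad6 | 630 | 40.95% | 209 | ATG | TGA |
| | nad7^a | 1185 | 45.23% | 394 | ATG | TAG |
| | nad9 | 573 | 42.93% | 190 | ATG | TAA |
| Ribosomal proteins (LSU) | rpl2^a | 999 | 52.15% | 332 | ATG | CAA |
| | rpl5 | 561 | 44.74% | 186 | ATG | TAA |
| | rpl10 | 516 | 46.32% | 171 | ATG | TAA |
| | rpl16 | 492 | 43.09% | 163 | ATG | TAA |
| Ribosomal proteins (SSU) | rps1 | 606 | 43.56% | 201 | ATG | TAA |
| | rps2 | 648 | 39.20% | 215 | ATG | CAA |
| | rps3^a | 1692 | 43.91% | 563 | ATG | TAG |
| | rps4 | 1059 | 40.51% | 352 | ACG | TAA |
| | rps7 | 447 | 43.18% | 148 | ATG | TAA |
| | rps10^a | 333 | 39.04% | 110 | ACG | CGA |
| | rps11 | 444 | 45.27% | 147 | ATG | CAA |
| | rps12 | 378 | 45.50% | 125 | ATG | TGA |
| | rps13 | 351 | 39.60% | 116 | ATG | TGA |
| | rps14 | 303 | 40.92% | 100 | ATG | TAG |
| | rps19(2) | 285, 285 | 40.00% | 94 | ATG | TAA |
| Transport membrane protein | sdh3 | 336 | 37.20% | 111 | ATG | TGA |
| | sdh4 | 450 | 41.33% | 149 | ATG | TGA |
| Ribosomal RNAs | rrn5 | 119 | 52.94% | | | |
| | rrn18 | 2061 | 55.12% | | | |
| | rrn26(2) | 3989, 3989 | 53.02% | | | |
| Transfer RNAs | trnR-CCG | 75 | 57.33% | | | |
| | trnN-GTT^b(2) | 75, 72 | 49.33% | | | |
| | trnD-GTC^b | 74 | 63.51% | | | |
| | trnC-GCA | 76 | 52.63% | | | |
| | trnQ-TTG | 72 | 47.22% | | | |
| | trnE-TTC | 72 | 50.00% | | | |
| | trnG-GCC | 74 | 54.05% | | | |
| | trnH-GTG^b(2) | 75, 75 | 54.67% | | | |
| | trnK-TTT | 75 | 46.67% | | | |
| | trnM-CAT^b(7) | 72, 75, 73, 72, 77, 72, 72 | 59.72%, 46.67%, 43.84%, 59.72%, 44.16%, 59.72%, 59.72% | | | |
| | trnF-AAA | 75 | 49.33% | | | |
| | trnF-GAA | 74 | 47.30% | | | |
| | trnP-TGG | 75 | 54.67% | | | |
| | trnS-TGA | 88 | 51.14% | | | |
| | trnS-GCT | 91 | 46.15% | | | |
| | trnW-CCA^b | 74 | 51.35% | | | |
| | trnY-GTA | 84 | 51.19% | | | |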

Notes: The numbers after the gene names indicate the duplication number. Lowercase a indicates the genes containing introns, and lowercase b indicates the chloroplast-derived genes.

The size and GC content of the mt genome are its primary characteristics. Here, we compared the size and GC content of mt genomes between the three Macadamia species and 36 other green plants, including four algae, three bryophytes, two gymnosperms, four monocots, and 23 dicots (S1 Table). The size of the mt genomes ranged from 22,897 bp (Chlamydomonas moewusii) to 2,709,526 bp (Cucumis melo) (Fig 2). The mt genomes of the three Macadamia species are larger than those of the algae and bryophytes. The GC content of the mt genomes was also highly variable, ranging from 32.24% in Sphagnum palustre to 50.36% in Ginkgo biloba. Overall, the GC content of angiosperm mt genomes (including monocots and dicots) is higher than that of bryophytes but lower than that of gymnosperms [56, 57], implying that GC content fluctuated after angiosperms diverged from bryophytes and gymnosperms. Interestingly, GC content fluctuated significantly in algae but was mostly conserved in angiosperms, although angiosperm genome sizes vary significantly.
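GC content as compared here is simply the share of G and C among the unambiguous bases of a sequence. A minimal sketch follows, assuming the genome is available as a plain string; the function name and the choice to ignore ambiguous bases such as N are illustrative assumptions.

```python
def gc_content(seq):
    """Percentage of G + C among unambiguous bases (A, C, G, T) in seq."""
    seq = seq.upper()
    counts = {base: seq.count(base) for base in "ACGT"}
    called = sum(counts.values())
    if called == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100 * (counts["G"] + counts["C"]) / called


# Toy examples; ambiguous bases such as N are ignored.
balanced = gc_content("ATGC")
gc_rich = gc_content("GGCCNNAT")
```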
Fig 2

The sizes and GC contents of 39 plant mitochondrial genomes.

The blue dots represent the genome size and the orange trend line shows the variation of GC content across the different taxa.


3.2 Repeat sequences analysis

Microsatellites, or simple sequence repeats (SSRs), are DNA fragments composed of short repeating units of 1–6 base pairs [58]. Their value derives from their polymorphism, relative abundance, codominant inheritance, large-scale genome coverage, and simple detection by PCR [59]. Based on the SSR analysis, we identified 87 SSRs, with monomers and dimers accounting for 70.11% of the total. Adenine (A) was the most repeated monomer, with 19 (38%) of the 50 identified monomer SSRs. The AT repeat was the most common dimer SSR, accounting for 66.67% of all the identified dimers. Only one hexamer [ATTAGG(X3)] was present in the mt genomes of the three Macadamia species. Among potential reference species, only the mt genome of Nelumbo nucifera has been published in the NCBI database. N. nucifera belongs to the family Nelumbonaceae, in the same order (Proteales) as Macadamia. Therefore, the mt genome of N. nucifera was used as a reference for comparative analysis in the present study. N. nucifera had fewer monomers than the three Macadamia species, but significantly more pentamers and hexamers (Fig 3A). Moreover, the SSRs in the mt genomes of M. integrifolia, M. ternifolia, M. tetraphylla, and N. nucifera were mainly single-nucleotide A/T motifs and dimer AT/TA motifs. Within the genus Macadamia, the mt SSRs of the different species are highly similar (Fig 3B). Compared with N. nucifera, however, there were both differences and similarities. For example, the single nucleotide A/T in the three Macadamia species has 23-unit repeats, while N. nucifera has only nine. Nevertheless, their single-nucleotide C/G numbers were the same (two-unit repeats) (Fig 3B). In addition, the AG/CT and AT/AT motif unit repetitions are the same, although N. nucifera also has an AC/GT motif, which is lacking in the three Macadamia species. Interestingly, the pentanucleotides AATGT/ACATT, ACTAG/AGTCT, and ACATT/AGTAT also had the same number of repetitions in the three Macadamia species and N. nucifera. Overall, the longer the nucleotide motif, the greater the difference between the three Macadamia species and N. nucifera.
Fig 3

The comparison of microsatellites and oligonucleotide repeats in three Macadamia species and N. nucifera mitochondrial genomes.

Tandem repeats, with core repeating units ranging from 1 to 200 bases, are widely present in the genomes of eukaryotes and some prokaryotes [60]. In the present study, 25, 21, and 20 tandem repeats (10–33 bp) with a match greater than 95% were identified in M. integrifolia, M. ternifolia, and M. tetraphylla, respectively (S2–S4 Tables). The numbers of tandem repeats of 11–20 bp and 21–30 bp varied significantly among the three Macadamia species (Fig 3C): M. ternifolia had the fewest, while M. integrifolia and M. tetraphylla had very similar numbers. Compared to the three Macadamia species, N. nucifera had the fewest tandem repeats of 11–20 bp and 21–30 bp and the most of 0–10 bp, 31–40 bp, and 41–50 bp. No repeats of 51–60 bp were found in any of the four genomes, while the number of repeats was the same for 60–70 bp and above.

3.3 The prediction of RNA editing

RNA editing is a post-transcriptional process entailing the addition, deletion, or conversion of bases in the coding region of a transcribed RNA. The conversion of cytosine to uridine is common in the cp and mt genomes of plants [61-65], which improves protein preservation in plants. The accurate detection of RNA editing is inseparable from proteomics data. In the present study, we predicted RNA editing sites in 42 protein-coding genes (including two multi-copy genes, ccmB and rps19) in the mt genomes of the three Macadamia species using the PREP-mt program [51]. The numbers of predicted RNA editing sites were 688, 689, and 688 in M. integrifolia, M. ternifolia, and M. tetraphylla, respectively (Fig 4). Among the protein-coding genes, nad4 had the most RNA editing sites (59 sites), while atp8, rpl2, rpl10, rps1, rps2, rps7, rps10, rps11, rps13, rps14, rps19, sdh3, and sdh4 had fewer than 10 RNA editing sites. In total, 236 RNA editing events occurred at the first base position of the codon and 472 at the second base position, while no RNA editing occurred at the third base position. M. ternifolia had one more RNA editing site than the other two Macadamia species.
Fig 4

The distribution of RNA-editing sites in the mt protein-coding genes of three species of Macadamia.

The bars of different colors represent the number of RNA-editing sites of each gene.

RNA editing increases the diversity of start and stop codons in protein-coding genes. After RNA editing, the hydrophobicity of 30.2% (208 positions) of the edited amino acids and the hydrophilicity of 12.5% (86 positions) remained unchanged in the M. integrifolia and M. tetraphylla mt genomes. However, 8.3% (57 positions) of amino acids were converted from hydrophobic to hydrophilic, and 47.9% (330 positions) from hydrophilic to hydrophobic. In addition, five amino acids were converted from glutamine to stop codons and two from arginine to stop codons (Table 2). The most frequent conversions were serine to leucine (23.3%, 160 sites), proline to leucine (22.4%), and serine to phenylalanine (15.3%). The remaining 269 RNA editing sites included other RNA editing types, such as Ala-Val, His-Tyr, Leu-Phe, Pro-Phe, Pro-Ser, Arg-Cys, Arg-Trp, Thr-Ile, Thr-Met, Gln-X, and Arg-X (X = stop codon). Compared to M. integrifolia and M. tetraphylla, M. ternifolia had only one more RNA-edited site (Leu-Phe).
Table 2

Prediction of RNA editing sites.

| Type | RNA editing | Number | Percentage |
|---|---|---|---|
| hydrophobic | GCA (A) => GTA (V) | 1 | 30.23% |
| | GCG (A) => GTG (V) | 6 | |
| | GCT (A) => GTT (V) | 4 | |
| | CTC (L) => TTC (F) | 7 | |
| | CTT (L) => TTT (F) | 16 | |
| | CCC (P) => TTC (F) | 6 | |
| | CCT (P) => TTT (F) | 14 | |
| | CCA (P) => CTA (L) | 61 | |
| | CCC (P) => CTC (L) | 14 | |
| | CCG (P) => CTG (L) | 44 | |
| | CCT (P) => CTT (L) | 35 | |
| hydrophilic | CAT (H) => TAT (Y) | 24 | 12.50% |
| | CAC (H) => TAC (Y) | 11 | |
| | CGC (R) => TGC (C) | 15 | |
| | CGT (R) => TGT (C) | 36 | |
| hydrophobic-hydrophilic | CCA (P) => TCA (S) | 16 | 8.28% |
| | CCC (P) => TCC (S) | 13 | |
| | CCG (P) => TCG (S) | 6 | |
| | CCT (P) => TCT (S) | 22 | |
| hydrophilic-hydrophobic | CGG (R) => TGG (W) | 43 | 47.97% |
| | TCC (S) => TTC (F) | 47 | |
| | TCT (S) => TTT (F) | 58 | |
| | TCA (S) => TTA (L) | 101 | |
| | TCG (S) => TTG (L) | 59 | |
| | ACA (T) => ATA (I) | 7 | |
| | ACC (T) => ATC (I) | 1 | |
| | ACG (T) => ATG (M) | 8 | |
| | ACT (T) => ATT (I) | 6 | |
| hydrophilic-stop | CGA (R) => TGA (X) | 2 | 1.02% |
| | CAG (Q) => TAG (X) | 1 | |
| | CAA (Q) => TAA (X) | 4 | |

Notes: Compared with the other two species of Macadamia, M. ternifolia had only one more RNA-editing site (CTT (L) = >TTT (F)).
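The totals in Table 2 can be cross-checked with a short tally. The sketch below transcribes the codon changes and counts from Table 2 and sums them by class and by edited codon position; the data layout and variable names are assumptions made for illustration. Under this reading, the position totals (236 first-position and 472 second-position events) exceed the 688 sites because the 20 CCC => TTC and CCT => TTT conversions change both positions.

```python
from collections import Counter

# class -> list of (codon before editing, codon after editing, count), from Table 2.
TABLE2 = {
    "hydrophobic": [
        ("GCA", "GTA", 1), ("GCG", "GTG", 6), ("GCT", "GTT", 4),
        ("CTC", "TTC", 7), ("CTT", "TTT", 16), ("CCC", "TTC", 6),
        ("CCT", "TTT", 14), ("CCA", "CTA", 61), ("CCC", "CTC", 14),
        ("CCG", "CTG", 44), ("CCT", "CTT", 35),
    ],
    "hydrophilic": [
        ("CAT", "TAT", 24), ("CAC", "TAC", 11), ("CGC", "TGC", 15),
        ("CGT", "TGT", 36),
    ],
    "hydrophobic-hydrophilic": [
        ("CCA", "TCA", 16), ("CCC", "TCC", 13), ("CCG", "TCG", 6),
        ("CCT", "TCT", 22),
    ],
    "hydrophilic-hydrophobic": [
        ("CGG", "TGG", 43), ("TCC", "TTC", 47), ("TCT", "TTT", 58),
        ("TCA", "TTA", 101), ("TCG", "TTG", 59), ("ACA", "ATA", 7),
        ("ACC", "ATC", 1), ("ACG", "ATG", 8), ("ACT", "ATT", 6),
    ],
    "hydrophilic-stop": [
        ("CGA", "TGA", 2), ("CAG", "TAG", 1), ("CAA", "TAA", 4),
    ],
}

class_totals = Counter()     # edited codons per class
position_totals = Counter()  # C-to-T events per codon position (1-based)

for cls, rows in TABLE2.items():
    for before, after, count in rows:
        class_totals[cls] += count
        for pos in range(3):
            if before[pos] != after[pos]:
                position_totals[pos + 1] += count

total_sites = sum(class_totals.values())
percentages = {c: round(100 * n / total_sites, 2) for c, n in class_totals.items()}
```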


3.4 DNA migration from cp to mt

The cp-like sequences in the mt genome were detected by comparison against the complete cp genome sequence of M. integrifolia obtained from the NCBI database (Fig 5). We detected 28 fragments in the mt genome of M. integrifolia, ranging in size from 32 bp to 5,210 bp. The cp-like sequences totaled 36,902 bp, accounting for 5.4% of the mt genome. Five complete annotated tRNA genes were detected, namely trnH-GTG, trnM-CAT, trnW-CCA, trnD-GTC, and trnN-GTT, along with some fragments of rrn18 genes. The findings also revealed that the 28 insertion regions accounted for 23.2% of the cp genome, including seven complete protein-coding genes (petL, petG, ndhE, rps15, rpl23(X2), rpl2) and eight complete tRNA genes (trnH-GUG, trnD-GUC, trnM-CAU, trnW-CCA, trnP-UGG, trnP-GGG, trnI-CAU, trnN-GUU). In addition, several protein-coding genes, including psbA, rpoB, psbD, psbC, ndhC, rpl2, ycf2(X2), ndhB, rps7(X2), ndhD, ndhB and ycf1, and some tRNA genes (trnI-GAU, trnA-UGC, trnN-GUU) were identified as having migrated from the cp genome into the mt genome. However, most of these genes lost their integrity during evolution, and only partial sequences were found in the mt genome. Furthermore, most cp-like sequences were located in the spacer regions of the mt genome. These findings are consistent with previous research showing that, during evolution, tRNA genes were more conserved than protein-coding genes and rRNA genes since they play an important role in the mt genome [66].
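The proportion of a genome covered by transferred fragments is the total length of the fragment intervals, counted without double-counting overlaps, divided by the genome length. The sketch below shows that calculation under stated assumptions: the coordinates and genome length are hypothetical toy values, not data from this study, and the function name is illustrative.

```python
def covered_length(intervals):
    """Bases covered by 1-based inclusive (start, end) intervals, merging overlaps."""
    total = 0
    current_start = current_end = None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            # Disjoint from the running interval: close it and open a new one.
            if current_end is not None:
                total += current_end - current_start + 1
            current_start, current_end = start, end
        else:
            # Overlapping: extend the running interval.
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start + 1
    return total


# Hypothetical cp-like fragments on a hypothetical 1,000-bp genome.
fragments = [(1, 100), (50, 150), (301, 332)]
genome_length = 1000
covered = covered_length(fragments)
percent_covered = 100 * covered / genome_length
```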
Fig 5

Schematic representation of mitochondrial genome, chloroplast genome and chloroplast-like sequence of M. integrifolia.

Dots and heat maps inside the two chromosomes show where genes are located. The green lines in the circle show the regions of chloroplast-like sequences inserted from the chloroplast genome into the mt genome.


3.5 Phylogenetic analysis within higher plant mt genomes

Australia is the origin and center of diversity of the Proteaceae, a family distributed across remnant landmasses of the southern supercontinent Gondwana [67]. The order Proteales, which includes Proteaceae, Platanaceae and Nelumbonaceae, was established relatively recently on the basis of molecular data, and morphological synapomorphies for the order are yet to be identified [68, 69]. Phylogenetic analysis was performed to understand the evolution of the three Macadamia species compared to 29 dicots, four monocots, and two gymnosperms (outgroups). The phylogenetic tree was constructed from a data matrix of 23 conserved protein-coding genes (Fig 6). The tree strongly supports the separation of Proteales from rosids and asterids, of eudicots from monocots, and of angiosperms from gymnosperms. The evolutionary relationships among all the taxa, which fall into 20 families (Leguminosae, Cucurbitaceae, Apiaceae, Apocynaceae, Solanaceae, Rosaceae, Caricaceae, Brassicaceae, Salicaceae, Bataceae, Malvaceae, Vitaceae, Lamiaceae, Nelumbonaceae, Proteaceae, Butomaceae, Arecaceae, Poaceae, Cycadaceae, and Ginkgoaceae), were efficiently deduced in the phylogenetic tree (Fig 6). The Macadamia chloroplast genome confirms the placement of this family with the morphologically divergent Platanaceae (plane tree family) and Nelumbonaceae (sacred lotus family) in the basal eudicot order Proteales [70]. In addition, phylogenetic analysis of chloroplast genomic variation revealed a latitudinal population structure of wild M. integrifolia germplasm, suggesting long-term regional isolation of maternal lineages [71]. Overall, evolutionary analyses of organelle genomes suggest that Proteaceae are most closely related to Nelumbonaceae.
Fig 6

The phylogenetic relationships of the three Macadamia species with 35 other plant species.

The Maximum Likelihood tree was constructed based on the sequences of 23 conserved protein-coding genes. Colors indicate the families to which the species belong.


3.6 The substitution rates of protein-coding genes

In genetics, non-synonymous (Ka) and synonymous (Ks) substitution rates help in understanding the evolutionary dynamics of protein-coding genes among related species, since the Ka/Ks ratio indicates the type of selection acting on a gene [72, 73]. In the present study, N. nucifera was used as a reference species to calculate the Ka/Ks ratios of 40 protein-coding genes present in the mt genomes of the three Macadamia species. The Ks of atp9 and rps14 and the Ka of rps12 were 0. In most protein-coding genes, the Ka/Ks ratio was significantly less than 1 (Fig 7). However, the Ka/Ks ratios of nad4, rpl2, rps3, rps4, and rps10 were greater than 1, with rps3 reaching 2.34, implying that these genes might have undergone positive selection after Macadamia and N. nucifera diverged from their last common ancestor [74]. In addition, the Ka/Ks ratios of the ATP synthase, cytochrome c biogenesis, ubiquinol cytochrome c reductase, and maturase genes were below 1, implying that negative selection acted on these genes (Table 2). Therefore, these genes may be highly conserved during the evolution of higher plants [75].
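The interpretation of Ka/Ks used above can be written as a small helper: a ratio above 1 suggests positive selection, a ratio below 1 suggests negative (purifying) selection, and the ratio is undefined when Ks is 0, as for atp9 and rps14. This is a minimal sketch; the function name, return labels and example rate values are hypothetical, and only the rps3 ratio of 2.34 comes from the text.

```python
import math


def classify_selection(ka, ks):
    """Label the selection regime implied by Ka and Ks substitution rates."""
    if ks == 0:
        return "undefined"  # Ka/Ks cannot be computed when Ks is 0
    ratio = ka / ks
    if math.isclose(ratio, 1.0):
        return "neutral"
    return "positive" if ratio > 1 else "negative"


# Hypothetical rates chosen so that Ka/Ks is about 2.34, as reported for rps3.
rps3_like = classify_selection(0.0234, 0.0100)
# Ka = 0 with Ks > 0 (as reported for rps12) gives a ratio of 0.
rps12_like = classify_selection(0.0, 0.05)
# Ks = 0 (as reported for atp9 and rps14) leaves the ratio undefined.
atp9_like = classify_selection(0.01, 0.0)
```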
Fig 7

The Ka/Ks values of 40 protein-coding genes of three Macadamia species.

4. Conclusions

The complete mt genomes of M. integrifolia, M. ternifolia and M. tetraphylla share many common features with other angiosperm mt genomes. In this study, we found that the mt genomes of the three Macadamia species were circular, like most mt genomes. Comparison with the GC content of the mt genomes of 36 other green plants supported the conclusion that GC content is highly conserved in the Macadamia species and in angiosperms. In addition, we analyzed SSRs and longer tandem repeats in the three genomes. Furthermore, 688 RNA editing sites were identified in 42 protein-coding genes, providing important clues for predicting gene function with new codons. By detecting gene migration, we observed that 28 fragments (with five complete tRNA genes) were transferred from the cp genome to the mt genome. The subsequent phylogenetic analysis also showed the accuracy of mt genome data in plant classification. Moreover, based on the Ka/Ks substitution analysis, most protein-coding genes have undergone negative selection, indicating that the protein-coding genes in the mt genome are conserved in Macadamia species. The findings of this study provide information on the mt genomes of Macadamia species, which is key to understanding the evolutionary history of the family Proteaceae.

The abbreviations and NCBI accession numbers of mt genomes used in this study.

(XLSX)

Perfect tandem repeats in the Macadamia integrifolia mitochondrial genome.

(XLSX)

Perfect tandem repeats in the Macadamia ternifolia mitochondrial genome.

(XLSX)

Perfect tandem repeats in the Macadamia tetraphylla mitochondrial genome.

(XLSX)

14 Feb 2022
PONE-D-22-01656
Assembly and comparative analysis of the complete mitochondrial genome of three Macadamia species (M. integrifolia, M. ternifolia and M. tetraphylla)
PLOS ONE

Dear Dr. Shi,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please consider both reviewers comments carefully. Note the detailed comments of reviewer #1 are provided in the manuscript PDF file.

Please submit your revised manuscript by Mar 31 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,
Yanbin Yin
Academic Editor
PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Thank you for stating the following in the Acknowledgments Section of your manuscript: "This work was supported by the National Natural Science Foundation of China (No. 31760215 and No. 31801022) and the Technology Innovation Talents Project of Yunnan Province (2018HB086)." We note that you have provided funding information that is not currently declared in your Funding Statement. However, funding information should not appear in the Acknowledgments section or other areas of your manuscript. We will only publish funding information present in the Funding Statement section of the online submission form. Please remove any funding-related text from the manuscript and let us know how you would like to update your Funding Statement. Currently, your Funding Statement reads as follows: "This work was supported by the National Natural Science Foundation of China (No. 31760215 and No. 31801022) and the Technology Innovation Talents Project of Yunnan Province (2018HB086)." Please include your amended statements within your cover letter; we will change the online submission form on your behalf.

3. PLOS requires an ORCID iD for the corresponding author in Editorial Manager on papers submitted after December 6th, 2016. Please ensure that you have an ORCID iD and that it is validated in Editorial Manager. To do this, go to ‘Update my Information’ (in the upper left-hand corner of the main menu), and click on the Fetch/Validate link next to the ORCID field. This will take you to the ORCID site and allow you to create a new iD or authenticate a pre-existing iD in Editorial Manager. Please see the following video for instructions on linking an ORCID iD to your Editorial Manager account: https://www.youtube.com/watch?v=_xcclfuvtxQ

4. Please include captions for your Supporting Information files at the end of your manuscript, and update any in-text citations to match accordingly. Please see our Supporting Information guidelines for more information: http://journals.plos.org/plosone/s/supporting-information.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes
Reviewer #2: Yes

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes
Reviewer #2: Yes

3. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes
Reviewer #2: Yes

4. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes
Reviewer #2: Yes

5. 
Review Comments to the Author Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters) Reviewer #1: Overall, the manuscript is well written and data/findings are clearly presented. The methods can be expanded and I have made notes on the manuscript to direct the authors on where to enhance the methods. Likewise, I have provided suggestions for additional literature that can be included in the manuscript at the discretion of the authors. Reviewer #2: The authors presented a study on three mitogenomes from Macadamia species, and did the comparison of genomic features, repeat sequences, RNA editing, transfer from plastome into mitogenome and phylogeny with other higher plants. This manuscript provides some interesting findings. But there are still a few questions that need to be clarified and improved. In order to make the manuscript much clearer and the conclusions more valid, comments as follows: 1. The Figure 1 on the circular map of three Macadamia mitogenomes, the authors should remake these maps, clear labels for each gene. 2. I suggest the authors take care of the space and consistence on the number and word, for example, “31841bp (cox2)”, there is no space between “31841” and “bp”, but in other places there is space between them; also for the number, “31841” using “31,841” style, etc. 3. The other big problem is the references, the author must be very careful on every reference, such as uppercase, lowercase, italic, journal names, etc. The author must follow the instructions of the reference of this journal. 4. What are the structural differences among the three mitogenomes? How the author verify the accuracy of the three assemblies, especially for a couple of bps difference among these genomes? 5. 
In the part of “Up to 34.3% (236 sites) RNA editing sites occurred in the first base position of the codon, 68.6% (472 sites) appeared in the second base position, and there was no RNA editing in the third base position.”, why the total percentage > 100%? 6. For the Phylogenetic analysis, the authors need to provide more interesting information or findings. 7. I suggest the authors add more findings from the structural and evolution innovations from these mitogenomes, such as group I and II introns, etc. ********** 6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #1: No Reviewer #2: No [NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.] While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step. Submitted filename: PONE-D-22-01656_reviewer_#.pdf Click here for additional data file. 
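Reviewer #2's fifth point above can be checked with a few lines of arithmetic. The sketch below uses only the two counts and two percentages quoted in that comment; the variable names are illustrative, and the denominator behind the quoted percentages is not stated anywhere in this correspondence, so the "implied total" is an inference from the quoted figures rather than a number taken from the manuscript.

```python
# Figures quoted in Reviewer #2's comment 5.
first_pos = 236   # editing sites at the first codon position
second_pos = 472  # editing sites at the second codon position
third_pos = 0     # "no RNA editing in the third base position"
quoted_pct = {"first": 34.3, "second": 68.6}

# If the three positions partition all editing sites, the shares must sum to 100%.
total = first_pos + second_pos + third_pos


def share(count, denominator):
    """Return count as a percentage of denominator, rounded to one decimal."""
    return round(100 * count / denominator, 1)


shares = {
    "first": share(first_pos, total),
    "second": share(second_pos, total),
    "third": share(third_pos, total),
}

# Back out the denominator that each quoted percentage would require.
# Assumption: each percentage was computed as count / some_total.
implied_totals = {
    "first": first_pos / (quoted_pct["first"] / 100),
    "second": second_pos / (quoted_pct["second"] / 100),
}

print("sum of counts:", total)
print("shares of that sum:", shares)
print("quoted percentages add to:", round(sum(quoted_pct.values()), 1))
print("implied denominators:", {k: round(v) for k, v in implied_totals.items()})
```

Both quoted percentages point to the same denominator, which is smaller than the sum of the two counts. That would explain why the percentages exceed 100% when added, and it is consistent with the authors' choice in their response to report the raw counts without percentages.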
23 Mar 2022

Dear Reviewers:

Reviewer #1: Overall, the manuscript is well written and data/findings are clearly presented. The methods can be expanded and I have made notes on the manuscript to direct the authors on where to enhance the methods. Likewise, I have provided suggestions for additional literature that can be included in the manuscript at the discretion of the authors.

Response: Thank you very much for your guidance. We have made the necessary modifications according to your notes.

Reviewer #2: The authors presented a study on three mitogenomes from Macadamia species, and did the comparison of genomic features, repeat sequences, RNA editing, transfer from plastome into mitogenome and phylogeny with other higher plants. This manuscript provides some interesting findings. But there are still a few questions that need to be clarified and improved. In order to make the manuscript much clearer and the conclusions more valid, comments as follows:

1. The Figure 1 on the circular map of three Macadamia mitogenomes, the authors should remake these maps, clear labels for each gene.

Response: Figure 1 has been redrawn and uploaded with the new manuscript.

2. I suggest the authors take care of the space and consistence on the number and word, for example, “31841bp (cox2)”, there is no space between “31841” and “bp”, but in other places there is space between them; also for the number, “31841” using “31,841” style, etc.

Response: We checked the full text and corrected similar errors.

3. The other big problem is the references, the author must be very careful on every reference, such as uppercase, lowercase, italic, journal names, etc. The author must follow the instructions of the reference of this journal.

Response: All references have been corrected to PLOS ONE style.

4. What are the structural differences among the three mitogenomes? How the author verify the accuracy of the three assemblies, especially for a couple of bps difference among these genomes?

Response: As described in the Materials and Methods, depth of coverage was used to correct the mitochondrial sequences. SPAdes v.3.5.0 was used to assemble the mt genome sequences. To correct the assembly, the raw sequencing data were mapped to the mitochondrial sequences using Geneious.

5. In the part of “Up to 34.3% (236 sites) RNA editing sites occurred in the first base position of the codon, 68.6% (472 sites) appeared in the second base position, and there was no RNA editing in the third base position.”, why the total percentage > 100%?

Response: This statement has been modified in the new manuscript: 236 RNA editing sites occurred in the first base position of the codon, 472 sites appeared in the second base position, and there was no RNA editing in the third base position.

6. For the Phylogenetic analysis, the authors need to provide more interesting information or findings.

7. I suggest the authors add more findings from the structural and evolution innovations from these mitogenomes, such as group I and II introns, etc.

Response: We have revised the above two points in the new manuscript.

Thank you very much for your guidance.

Submitted filename: Response to Reviewers.doc

20 Apr 2022

Assembly and comparative analysis of the complete mitochondrial genome of three Macadamia species (M. integrifolia, M. ternifolia and M. tetraphylla)

PONE-D-22-01656R1

Dear Dr. Shi,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements. Please fix the minor issues that the reviewer identified.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible, no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,
Yanbin Yin
Academic Editor
PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.
Reviewer #1: (No Response)
Reviewer #2: All comments have been addressed

2. Is the manuscript technically sound, and do the data support the conclusions?
Reviewer #1: Yes
Reviewer #2: Yes

3. Has the statistical analysis been performed appropriately and rigorously?
Reviewer #1: Yes
Reviewer #2: Yes

4. Have the authors made all data underlying the findings in their manuscript fully available?
Reviewer #1: Yes
Reviewer #2: Yes

5. Is the manuscript presented in an intelligible fashion and written in standard English?
Reviewer #1: Yes
Reviewer #2: Yes

6. Review Comments to the Author

Reviewer #1: I am pleased with the author's revision and attention to my comments. However, I have two very minor corrections for the authors to address:

Line 56: "Additionally,, they..." Please remove the redundant comma.

Lines 113-114: "...and NCBI were used to annotate the mt genome." Please cite the program also; I apologize that my previous comment was vague regarding an NCBI citation. Did you use ORF finder? If so, please cite: “…and ORF finder (NCBI) were used to annotate the mt genome.”

Reviewer #2: There are minor errors in the manuscript:

1. Line 56, two commas after "Additionally";
2. Line 123, "Macadamia" is not italic.

7. Do you want your identity to be public for this peer review?
Reviewer #1: No
Reviewer #2: No

25 Apr 2022

PONE-D-22-01656R1

Assembly and comparative analysis of the complete mitochondrial genome of three Macadamia species (M. integrifolia, M. ternifolia and M. tetraphylla)

Dear Dr. Shi:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,
PLOS ONE Editorial Office Staff
on behalf of
Dr. Yanbin Yin
Academic Editor
PLOS ONE