Literature DB >> 31188933

Araucaria angustifolia chloroplast genome sequence and its relation to other Araucariaceae.

José Henrique S G Brandão1, Nureyev F Rodrigues2, Maria Eguiluz1, Frank Guzman2, Rogerio Margis1,2,3.   

Abstract

Araucaria angustifolia is endemic to southern Brazil. Known as Brazilian pine, A. angustifolia is the only native conifer species with economic and social relevance in this country. Due to massive exploitation, it has suffered a significant population decline and currently is classified as critically endangered. This encouraged the scientific community to investigate genetic features in Brazilian pine to increase resources for management and preservation. In this work, RNA-Seq data was used to determine the complete nucleotide sequence of the A. angustifolia chloroplast genome (cpDNA). The cpDNA is 146,203 bp in length and contains 122 genes, including 80 protein-coding genes, 5 ribosomal RNA genes, and 37 tRNA genes. Coding regions comprise 45.02%, 4.96% correspond to rRNAs and tRNAs, and 50.02% of the genome encompasses non-coding regions. Genes found in the inverted repeat (IR) are present as single copy, with exception of the rrn5 and trnI-CAU loci. The typical LSC, SSC, IRa and IRb organization reported in several land-plant groups is not present in A. angustifolia cpDNA. Phylogenetic analyses using Bayesian and Maximum Likelihood methods clustered A. angustifolia in the Araucariaceae family, with A. heterophylla and A. columnaris as congeneric species. The screening of A. angustifolia cpDNA reveled 100 SSRs, 14 of them corresponding to tetrapolymer loci.

Entities:  

Year:  2019        PMID: 31188933      PMCID: PMC6905450          DOI: 10.1590/1678-4685-GMB-2018-0213

Source DB:  PubMed          Journal:  Genet Mol Biol        ISSN: 1415-4757            Impact factor:   1.771


Araucaria angustifolia (Bertol.) Kuntze syn. Columbea angustifolia, also known as Brazilian pine, is a native Brazilian tree of the class Pinopsida, order Pinales, and family Araucariaceae. It is widely distributed in the southern and southeastern areas of Brazil but also occurs in limited areas from Argentina and Paraguay (de Souza ). A. angustifolia is a dioecious wind-pollinated species with a mixed seed dispersion by barochoric authocory and birds (Lowe ). The seeds maintain elevated levels of water and active metabolic rates at the mature stage, resulting in a rapid loss of viability (Astarita ). Due to seed recalcitrance to storage, conservation strategies are restricted mainly to propagation by embryogenic cultures (Steiner ). A. angustifolia is one of the most important trees in its region of natural occurrence due to its relevant ecological, economic, and social functions. Its seeds are rich in starch, proteins, and flavonoids, having a high nutritional value during the winter season. As results of its social and economic relevance, A. angustifolia went through an indiscriminate exploitation and a substantial population decline, having been categorized as a critically endangered species in the International Union for the Conservation of Nature and Natural Resources (IUCN), Red List of Threatened Species (Thomas, 2013). The taxonomic classification of the genus Araucaria is well resolved (Stefenon ). It comprises 19 species with an interesting distribution worldwide. The species are distributed only in tropical and subtropical zones of the Southern hemisphere (Stefenon ). Seventeen species are present in Oceania, 13 of which are endemic to the small archipelago New Caledonia (Lu ). The two remaining species, Araucaria araucana, and Araucaria angustifolia, are present in southern South America (Lu ). Brazilian pine has been targeted by genetic studies that mainly focused on somatic embryogenesis, with the purpose of developing technologies for the conservation and genetic improvement of this species. One of these studies has generated RNA-seq data from early stage tissues and the libraries are available in the NCBI database (Elbl ). Once high-throughput sequencing data is generated, it can be used in a plethora of ways beyond the original purpose, and relevant information can be further explored from the targeted organism. In the present study, the RNA-seq data composed by 24 libraries (Elbl ) was used as input to perform the complete assembly and annotation of the A. angustifolia chloroplast (cp) genome. The paired-end sequence reads were filtered against 58 Pinidae cp genomes (Table S1) using BWA software with two mismatches allowed (Li and Durbin, 2009). The reads were used for an assembly de novo with ABySS software (Simpson ). The cp genome scaffolds were orientated using cp genome sequences of Araucaria heterophylla (NC_026450.1) using BLASTN (Camacho ). A gap region relative to an intergenic region was filled in after Sanger sequencing using the primers F: 5’ ACCGTGAGGGTTCAAGTCC and R: 3’ GTGGCACG AGGATTTTCAGT. For this purpose, total DNA was extracted by the CTAB method from young leaves of an A. angustifolia tree. DNA quality was evaluated by electrophoresis in a 1% agarose gel, and quantity was determined using a NanoDrop spectrophotometer (NanoDrop Technologies, Wilmington, DE, USA). Genes were annotated using GeSeq (Tillich ) and BLAST similarity searches. Transfer RNAs (tRNAs) were predicted using the Aragorn program (Laslett and Canback, 2004) implemented in the GeSeq program and confirmed by comparison with the appropriate homologs in A. heterophylla. The circular cp genome map was drawn using the online program OGDRAW (Lohse ). To determine the phylogenetic relationships of A. angustifolia in the Pinidae division and corroborate with the Brazilian pine plastid genome sequence, a set of 73 cp protein-coding sequences (Table S2) from 18 conifer species, 17 belonging to Pinidae (Table S3) and Ginkgo biloba serving as outgroup were used. Nucleotide sequences were aligned separately using MUSCLE available in MEGA version 6.0 (Tamura ). Alignments were concatenated and nucleotide positions of each gene were specified and a Bayesian tree was generated using MrBayes version 3.2.6 (Ronquist et ), with the JC evolutionary model as determined by MODELTEST version 3.7 (Posada and Crandall, 1998), and 10,000,000 generations sampled every 100 generations. The first 25% of trees were discarded as burn-in to produce a consensus phylogram, with posterior probability (PP) values for each node. Maximum Likelihood (ML) analysis was also applied, using RaxML (Stamatakis, 2014) program and the ML tree was compared to Bayesian topology. The phylogenetic trees were rooted and visualized using FigTree software (http://tree.bio.ed.ac.uk/software/figtree/). Simple sequence repeats (SSRs) were detected using MISA perl script, available at (http://pgrc.ipk-gatersleben.de/misa/), with thresholds of 10 repeat units for mononucleotide SSRs, five repeat units for di- and trinucleotide SSRs, and three repeat units for tetra-, penta- and hexanucleotide SSRs. The interruption threshold among SSRs, which indicates the maximum difference between two SSRs was 50 base pairs. A total of 229,914,266 high quality Illumina paired-end reads from the A. angustifolia transcriptome libraries generated via the HiScanSQ platform and available at NCBI Sequence Read Archive (SRA) under accession number SRP039545 were filtered against Pinidae cp genomes. The 2,107,993 obtained reads were de novo assembled into non-redundant contigs and singletons covering about 99.65% of the cp genome (minimum coverage of 23 reads, maximum coverage of 1,780 reads). Two final large scaffolds were obtained and joined into a single cp circular genome after the use of Sanger sequencing. The complete cp genome of A. angustifolia is 146,203 bp in size and was submitted to GenBank (accession number: MH599004). This size is similar to those found in other Araucaria species (Ruhsam ). In A. angustifolia, duplicated genes present in inverted repeat regions IRa and IRb are found as a single copy, with exception of a sequence of 513 bp corresponding to the rrn5 gene, placed between clpP and psbB, suggesting a recombination event, and the two inverted copies from tRNA-CAU gene (Figure 1). The same pattern was described in other species of the Araucariaceae family, Agathis dammara and Wollemia nobilis, which lack canonical IRs and harbor double inverted copies of rrn5 and tRNA-CAU in their plastomes (Wu and Chaw, 2014; Yap ). A reduced size of IRa and IRb was described in Pinus taeda L. (Asaf ). Loss of the IR was also reported in the chloroplast genomes of some species of Pinaceae and Cupressophytes (Wu ).
Figure 1

Gene map of the Araucaria angustifolia chloroplast genome. Genes drawn inside the circle are transcribed counterclockwise and those outsides are clockwise. Genes belonging to different functional groups are indicated by different tonalities. The darker gray in the inner circle corresponds to the GC content, while the lighter gray corresponds to the AT content. Red and blue arrows indicate, respectively, double inverted copies of the rrn5 and trnI-CAU genes.

The coding sequences comprise 45.02%, 4.96% correspond to rRNAs and tRNAs, and 50.02% of the genome comprises non-coding regions, including introns, pseudogenes and intergenic spacers (Table 1). In general, all genomic features showed similarity in size, structure, and gene abundance with other Araucaria species (Table S4) (Ruhsam ). The genome contains 122 genes in total, which includes 120 single-copy genes corresponding to 80 protein-coding genes, 36 transfer RNA (tRNA) genes and four ribosomal genes (rRNA) (Figure 1, Table 1). The cp genome has 14 intron-containing genes: 9 protein-coding genes, one pseudogene, and four tRNA genes. The rps12 gene, a trans-spliced gene entirely located in the LSC region, and the ycf3 gene contain two introns each; the other genes have only one intron each. The trnK-UUU intron has 2,407 bp, with the largest intron encompassing also the matK gene, a common feature of land-plants chloroplast genomes.
Table 1

Summary of the Araucaria angustifolia chloroplast genome features.

Feature Araucaria angustifolia
Total cpDNA size146,203 bp
Protein coding regions (%)45.02%
rRNA and tRNA (%)4.96%
Introns (% total)8.3%
Intergenic sequences and pseudogenes (%)41.72%
Number of genes122
Number of protein coding genes80
Number of tRNA genes36
Number of rRNA genes4
Number of duplicated genes2
Pseudogenes1
GC content (%)36.54%
A phylogenetic analysis was performed to evaluate the position of A. angustifolia in the Araucariaceae family and subclass Pinidae, and 73 protein-coding genes from other 18 conifers were used for this purpose. The final alignment reached 55,435 nucleotides. These species were intentionally sampled to comprise the main representative taxa of the subclass Pinidae, while Ginkgo biloba was used as outgroup. The Bayesian analysis resulted in a consistent phylogenetic relationship of A. angustifolia and the 18 conifer represented species (Figure 2). Within Pinidae we found, two branches, the first one represented by the family Pinaceae, order Pinales, and the other by the families Cupressaceae, Podocarpaceae, Araucariaceae comprising Conifer I and Conifer II, respectively (Lu ). Within Conifer II, the genera and species distribution is clear and well supported among Araucariaceae and Podocarpaceae, comprising the order Araucariales and Cupressaceae and comprising the order Cupressales (Lu ). Within Araucariaceae, Araucaria angustifolia, Araucaria heterophylla, and Araucaria columnaris were grouped together in the genus Araucaria, which grouped with Agathis dammara and Wollemia nobilis. In the monophyletic Araucaria clade, Araucaria columnaris and Araucaria heterophylla, representing endemic species from New Caledonia and Norfolk Island, Australia, respectively, are grouped together, and another basal branch corresponds to Araucaria angustifolia, endemic from South America (Lu ). The strongly supported topology within the genus Araucaria, family Araucariaceae, and among the other taxa (Cupressaceae, Podocarpaceae, and Pinaceae) is congruent with a series of phylogenetic studies (Lu ). The ML analysis corroborated the Bayesian approach (Figure S1), which strongly reinforces the importance of cpDNA for phylogenetic inference.
Figure 2

Phylogenetic tree of 18 species of Pinidae based on 73 cp protein-coding genes generated by Bayesian method. Numbers above each node are posterior probability values. Family and order are also indicated. Ginkgo biloba was used as outgroup.

Using the MISA perl script, 100 simple sequence repeats (SSRs) were detected in A. angustifolia cpDNA. The 53 homopolymers A/T and 24 dipolymers AT were the most common SSRs, while 14 different tetrapolymers and a single hexapolymer were also found (Table S5). SSR pentapolymers were not present in the cpDNA. The present A. angustifolia chloroplast genome is the first complete cpDNA sequence for this species and shows a set of features that could be further explored for population and phylogenetic studies within this group. Moreover, the present study increases the genetic and genomic resources available in Araucaria and shows that, as reported in bryophytes and angiosperms (Shi ), the plastome sequence can be straightforwardly assembled from transcriptome data generared for conifers.
  18 in total

1.  MEGA6: Molecular Evolutionary Genetics Analysis version 6.0.

Authors:  Koichiro Tamura; Glen Stecher; Daniel Peterson; Alan Filipski; Sudhir Kumar
Journal:  Mol Biol Evol       Date:  2013-10-16       Impact factor: 16.240

2.  Loss of different inverted repeat copies from the chloroplast genomes of Pinaceae and cupressophytes and influence of heterotachy on the evaluation of gymnosperm phylogeny.

Authors:  Chung-Shien Wu; Ya-Nan Wang; Chi-Yao Hsu; Ching-Ping Lin; Shu-Miaw Chaw
Journal:  Genome Biol Evol       Date:  2011-09-19       Impact factor: 3.416

3.  MrBayes 3.2: efficient Bayesian phylogenetic inference and model choice across a large model space.

Authors:  Fredrik Ronquist; Maxim Teslenko; Paul van der Mark; Daniel L Ayres; Aaron Darling; Sebastian Höhna; Bret Larget; Liang Liu; Marc A Suchard; John P Huelsenbeck
Journal:  Syst Biol       Date:  2012-02-22       Impact factor: 15.683

4.  Complete Chloroplast Genome of the Wollemi Pine (Wollemia nobilis): Structure and Evolution.

Authors:  Jia-Yee S Yap; Thore Rohner; Abigail Greenfield; Marlien Van Der Merwe; Hannah McPherson; Wendy Glenn; Geoff Kornfeld; Elessa Marendy; Annie Y H Pan; Alan Wilton; Marc R Wilkins; Maurizio Rossetto; Sven K Delaney
Journal:  PLoS One       Date:  2015-06-10       Impact factor: 3.240

5.  RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies.

Authors:  Alexandros Stamatakis
Journal:  Bioinformatics       Date:  2014-01-21       Impact factor: 6.937

6.  GeSeq - versatile and accurate annotation of organelle genomes.

Authors:  Michael Tillich; Pascal Lehwark; Tommaso Pellizzer; Elena S Ulbricht-Jones; Axel Fischer; Ralph Bock; Stephan Greiner
Journal:  Nucleic Acids Res       Date:  2017-07-03       Impact factor: 16.971

7.  Complete chloroplast genome sequence and comparative analysis of loblolly pine (Pinus taeda L.) with related species.

Authors:  Sajjad Asaf; Abdul Latif Khan; Muhammad Aaqil Khan; Raheem Shahzad; Sang Mo Kang; Ahmed Al-Harrasi; Ahmed Al-Rawahi; In-Jung Lee
Journal:  PLoS One       Date:  2018-03-29       Impact factor: 3.240

8.  Fast and accurate short read alignment with Burrows-Wheeler transform.

Authors:  Heng Li; Richard Durbin
Journal:  Bioinformatics       Date:  2009-05-18       Impact factor: 6.937

9.  OrganellarGenomeDRAW--a suite of tools for generating physical maps of plastid and mitochondrial genomes and visualizing expression data sets.

Authors:  Marc Lohse; Oliver Drechsel; Sabine Kahlau; Ralph Bock
Journal:  Nucleic Acids Res       Date:  2013-04-22       Impact factor: 16.971

10.  Full transcription of the chloroplast genome in photosynthetic eukaryotes.

Authors:  Chao Shi; Shuo Wang; En-Hua Xia; Jian-Jun Jiang; Fan-Chun Zeng; Li-Zhi Gao
Journal:  Sci Rep       Date:  2016-07-26       Impact factor: 4.379

View more
  1 in total

1.  The complete chloroplast genome of Araucaria cunninghamii (Araucariaceae).

Authors:  Jingyao Ping; Xin Luo; Ming Zhu; Rongjing Zhang; Chunmei Qian; Yingjuan Su; Ting Wang
Journal:  Mitochondrial DNA B Resour       Date:  2020-07-25       Impact factor: 0.658

  1 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.