The Complete Chloroplast Genome Sequence of Cicer bijugum, Genome Organization, and Comparison with Related Species.

Melih Temel¹, Yasin Kaymaz¹, Duygu Ateş¹, Abdullah Kahraman^1,2, Muhammed Bahattin Tanyolaç¹.

Abstract

Background: Chickpea is one of Turkey's most significant legumes and, because of its high nutritional value, is widely used in human nutrition. Chloroplasts, which have their own genetic material, are the organelles responsible for photosynthesis in plant cells, and their genome carries important information about the molecular features and evolutionary history of plants. Objective: Apart from Cicer arietinum, no chloroplast genome sequence has been reported for any Cicer species. This study therefore aimed to determine the complete chloroplast genome sequence of a wild type Cicer species, Cicer bijugum, and to compare it with that of the cultivated species, Cicer arietinum, using bioinformatics analysis tools.
Methods: In this study, we sequenced the whole chloroplast genome of Cicer bijugum, a wild type chickpea species, on a Next Generation Sequencing platform and compared it with the chloroplast genome of the cultivated chickpea species, Cicer arietinum, using online bioinformatics analysis tools.
Results: We determined the size of the chloroplast genome of C. bijugum as 124,804 bp and found that C. bijugum did not contain an inverted repeat region in its chloroplast genome. Comparative analysis of the C. bijugum chloroplast genome uncovered thirteen hotspot regions (psbA, matK, rpoB, rpoC1, rpoC2, psbI, psbK, accD, rps19, ycf2, ycf1, rps15, and ndhF), seven of which (matK, accD, rps19, ycf1, ycf2, rps15, and ndhF) could potentially be used as strong molecular markers for species identification. C. bijugum was phylogenetically closer to cultivated chickpea than to the other species examined.
Conclusion: As the first study in which the whole chloroplast genome of a wild chickpea species has been sequenced, this work is expected to guide researchers in future molecular, evolutionary, and genetic engineering studies on chickpea species.

Entities: Chemical

Keywords: Cicer bijugum; Wild type chickpea; bioinformatics; chloroplast genome; comparative genome analysis; genome organization

Year: 2022 PMID: 35814936 PMCID: PMC9199535 DOI: 10.2174/1389202923666220211113708

Source DB: PubMed Journal: Curr Genomics ISSN: 1389-2029 Impact factor: 2.689

INTRODUCTION

Legumes (Leguminosae or Fabaceae) are economically important angiosperms and form one of the largest families in the plant kingdom [1-3]. They are distributed worldwide and can grow under diverse climatic conditions, including Mediterranean, savanna, and arid regions [4]. Chickpea is among the most important cool-season grain legumes worldwide, ranking after beans and peas in production and consumption [5]. It is also a valuable food crop and one of Turkey's main sources of livelihood [6]. Grain legumes are cultivated mainly for their seeds [7], which are preferred in both human and livestock nutrition due to their high nutritional content, especially their rich protein content [5, 8-10]. Recent studies show that the origin of cultivated chickpea is Middle Asia (especially South-Eastern Turkey), while the origins of wild type Cicer species are Central and Western Asia, Northern Africa, and the Mediterranean region [11, 12].

Chickpea is a self-pollinated plant that belongs to the genus Cicer in the Fabaceae family. Cicer arietinum is the only cultivated Cicer species, and Cicer reticulatum is known as its wild ancestor [13]. Cicer reticulatum and Cicer echinospermum are close relatives of C. arietinum. Other wild type Cicer species include Cicer bijugum, Cicer pinnatifidum, and Cicer yamashitae [12]. Cicer bijugum is valuable to plant breeders because it is resistant to threats such as botrytis grey mold, pod borer, and ascochyta blight [14]. In addition to being resistant, C. bijugum is a tertiary genetic relative of C. arietinum and thus has the potential to be used as a gene donor for the improvement of C. arietinum [15]. In terms of crossability, wild type species have been divided into three gene pools, and C. bijugum is in the second group [16]. Except for C. reticulatum and C. echinospermum (members of the first gene pool), there is no evidence that wild relatives of C. arietinum, including C. bijugum, can be successfully crossed with C. arietinum by conventional breeding methods [17].

Chloroplasts are the organelles primarily responsible for photosynthesis and carbon fixation [18, 19]. Besides photosynthesis, their most important function, chloroplasts play a crucial role in the biosynthesis of nucleotides, fats, vitamins, amino acids, and phytohormones [20]. Chloroplasts have their own genome, separate from nuclear DNA, which encodes photosynthesis-related proteins, tRNAs, and rRNAs [21]. Chloroplasts are thought to have evolved through endosymbiosis and have a conserved structure with respect to gene content, gene organization, and gene structure [21, 22]. This conserved and non-recombinant genome structure makes chloroplasts suitable for phylogenetic, taxonomic, evolutionary, and molecular genetics research [23-25]. Moreover, chloroplasts can be modified by genetic engineering to confer various agronomic traits on plants and can be used as bioreactors for the production of commercial enzymes, biopharmaceuticals, and vaccines [18]. The chloroplast genome is maternally inherited and rich in genetic polymorphisms, making it a plentiful source of genetic information [26, 27]. It has a double-stranded circular structure and a variable size, usually ranging from 120 to 160 kb in plants, and it encodes 110-130 highly conserved genes with various functions, mostly related to photosynthesis [22, 28, 29]. In angiosperms, chloroplast genomes have a quadripartite structure comprising two inverted repeats (IRA and IRB), a large single copy (LSC) region, and a small single copy (SSC) region, each of a different length [30, 31].

On the other hand, some structural changes, such as the loss of one copy of the IR region, were observed in the C. bijugum chloroplast genome. Species that have only one copy of the IR belong to the Inverted Repeat Lacking Clade (IRLC), which lies within the Papilionoideae subfamily of the Fabaceae family [32]. Jansen et al. (2008) sequenced the complete chloroplast genome of C. arietinum and found that it likewise has only one IR region. They also detected 108 genes, while the infA, rps16, and ycf4 genes were absent [33]. The present study was carried out to determine how this structural change is organized in the relatives of cultivated chickpea.

Recent advances in Next Generation Sequencing (NGS) techniques have led to a rapid increase in chloroplast genome sequencing studies in plants. NGS techniques enable whole-genome sequencing (WGS) and allow many more base pairs to be read than classical sequencing methods. The use of NGS platforms has dramatically accelerated genome-based studies in fields such as molecular genetics, genomics, and phylogenetics [34, 35]. The large quantities of genomic data produced by high-throughput sequencing technologies can be processed more easily with the help of bioinformatics tools [36]. The first whole chloroplast genome sequencing study was performed on tobacco (Nicotiana tabacum) [37]. Today, whole chloroplast genome sequences of more than 800 plants are available in the GenBank database. Since the chloroplast genome carries important information about a plant's evolutionary history and photosynthesis, sequencing the whole chloroplast genome is critical for the precision of comparative genome analyses between plant species [22, 25].

The main purpose of this study is to reveal the whole chloroplast genome sequence of C. bijugum, detect the genes located in the C. bijugum chloroplast genome, and compare the orientations of both the chloroplast genome and its genes with those of the outgroup species. To date, no chloroplast genome of a wild type Cicer species has been sequenced, and this is the first study to reveal the whole chloroplast genome sequence of wild type C. bijugum. The results are intended to uncover the chloroplast genome structure of C. bijugum and illuminate the evolutionary development of chickpea species. The study also provides molecular and phylogenetic information that will contribute to further evolutionary and biotechnological studies on chickpea species.

MATERIALS AND METHODS

Plant Materials

Wild type chickpea species C. bijugum and the cultivated one C. arietinum were used in this research. The seeds of chickpea species were obtained from Harran University, Faculty of Agriculture, Department of Field Crops. The chickpea species used in this research were sown sequentially at the experimental station of the Faculty of Agriculture of Ege University, İzmir, Turkey. Genotypes were sown at equal intervals, 12 in each row. Approximately 20 cm spacing was left between each genotype of the same species and approximately 30 cm between each row. Distinct species were grown at least 40 cm apart from each other. Cicer seeds were sown in November 2019 and harvested in May 2020 when the leaves reached the fully green stage.

Chloroplast DNA Extraction

Young leaves of the chickpea genotypes (20 g fresh weight) were collected and transported to the laboratory in liquid nitrogen at -196°C. The harvested leaves were stored at +4°C for 3 days to reduce the amount of starch. Chloroplast DNA was isolated following the high-salt chloroplast DNA extraction method described by Shi et al. (2012), with some modifications [38]. The isolated DNA was dissolved in 100 μl of Tris-EDTA (TE) buffer (1X, pH 8.0). DNA purity was checked by running the samples on a 0.8% agarose gel, and DNA was quantified using a NanoDrop spectrophotometer (NanoDrop ND 1000, Thermo Scientific). The isolated chloroplast DNA samples were then stored at -80°C for further use.

Chloroplast DNA Sequencing, Assembly and Data Processing

After isolation, the high molecular weight chloroplast DNA samples were sent to the Beijing Genome Institute (BGI), where sequencing was carried out. The chloroplast genome of C. bijugum was sequenced using the Whole Genome Sequencing (WGS) approach. Briefly, the sequencing method was as follows. Before sequencing, sample concentration, integrity, and purity were tested. Concentration was measured with a fluorometer (Qubit Fluorometer, Invitrogen). Integrity and purity were assessed by electrophoresis on a 1% agarose gel for 40 minutes at 150 V. Next, 1 µg of C. bijugum chloroplast DNA was randomly fragmented by Covaris. Fragments with an average size of 200-400 bp were selected using the Agencourt AMPure XP-Medium kit. Fragments were end-repaired and then 3' adenylated, and adaptors were ligated to the ends of the 3' adenylated fragments. The adaptor-ligated fragments were amplified by Polymerase Chain Reaction (PCR), and the PCR products were purified using the Agencourt AMPure XP-Medium kit. The double-stranded PCR products were heat-denatured and circularized by the splint oligo sequence. The single-strand circle DNA (sscir DNA) was formatted as the final library. After library preparation, the chloroplast DNA was sequenced on the BGISEQ-500 NGS platform, generating 150 bp paired-end reads. The complete chloroplast genome of Carya illinoinensis (GenBank accession: MH909600.1) was used as a reference in the assembly of C. bijugum, and paired-end reads were assembled with the software organelle (1.7.4.1). The GeSeq online tool was used for chloroplast genome annotation (https://chlorobox.mpimp-golm.mpg.de/geseq.html). The physical plastid genome map of C. bijugum was constructed using the online tool OrganellarGenomeDRAW v1.3.1 (OGDRAW) [39]. The assembled genome sequences and their associated raw sequencing data are available in the European Nucleotide Archive (ENA) database under the study accession PRJEB47534, with the sample identification number ERS7635404.

Comparative Bioinformatic Analysis

Complete chloroplast genome sequences of C. bijugum and C. arietinum were compared using the mVISTA [40] program in Shuffle-LAGAN mode, with C. arietinum set as the reference genome. The annotation file of C. arietinum (Accession No: NC_011163.1) was obtained from the National Center for Biotechnology Information (NCBI) database. To align the chloroplast genomes and detect homologous regions, the progressiveMauve v2.4.0 algorithm in the MAUVE program [41] was used. To determine codon usage bias in the C. bijugum and C. arietinum chloroplast genomes, Relative Synonymous Codon Usage (RSCU) values and amino acid compositions were calculated in MEGA X v1.01 [42]. Codon usage frequencies were visualized using the "ggpubr" package in the R programming language. Forward, reverse, complementary, and palindromic repeats were detected with the REPuter [43] program (Hamming distance = 3, Maximum Computed Repeats = 50, and Minimum Repeat Size = 30). Tandem Repeats Finder [44] was used to reveal tandem repeats in the chloroplast genomes of C. bijugum and C. arietinum. Simple Sequence Repeat (SSR) analysis was carried out using MISA [45] with the following thresholds: > 10 for mononucleotide, > 5 for dinucleotide, > 5 for trinucleotide, > 3 for tetranucleotide, > 3 for pentanucleotide, and > 3 for hexanucleotide SSRs. Before nucleotide diversity analysis, the chloroplast genomes of C. bijugum, C. arietinum, and Medicago orbicularis were aligned using MAFFT v7.475 [46]. DnaSP v6.12.03 [47] was then used to estimate nucleotide polymorphism among these three chloroplast genome sequences, using a sliding window with a window length of 600 bp and a step size of 200 bp.
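The sliding-window setting described above (600 bp window, 200 bp step) can be illustrated with a minimal Python sketch. This is not DnaSP's implementation: the function names are hypothetical, sites with gaps or ambiguous bases are simply excluded pair by pair (an assumption that may differ from DnaSP's gap handling), and the toy alignment is invented for illustration only.

```python
from itertools import combinations

VALID_BASES = set("ACGT")

def pairwise_diversity(seqs):
    """Mean proportion of differing sites over all pairs of aligned sequences.

    Assumption: a site is skipped for a pair if either sequence has a gap
    or an ambiguous base there (pairwise exclusion).
    """
    proportions = []
    for a, b in combinations(seqs, 2):
        compared = mismatched = 0
        for x, y in zip(a, b):
            if x in VALID_BASES and y in VALID_BASES:
                compared += 1
                mismatched += x != y
        proportions.append(mismatched / compared if compared else 0.0)
    return sum(proportions) / len(proportions)

def sliding_pi(alignment, window=600, step=200):
    """Return (window midpoint, pi) for each window along the alignment."""
    length = len(alignment[0])
    results = []
    for start in range(0, length - window + 1, step):
        chunk = [s[start:start + window].upper() for s in alignment]
        results.append((start + window // 2, pairwise_diversity(chunk)))
    return results

# Toy alignment (invented): three 1000 bp sequences, one of which
# differs from the others at its last six sites.
toy_alignment = ["A" * 1000, "A" * 1000, "A" * 994 + "C" * 6]
toy_result = sliding_pi(toy_alignment)
for midpoint, pi in toy_result:
    print(midpoint, round(pi, 5))
```

In a real analysis the input would be the MAFFT alignment of the three chloroplast genomes, and windows with the highest pi values would be the candidate hotspots.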

Phylogenetic Analysis

The whole chloroplast genome sequences of nine species were used to construct a phylogenetic tree. Medicago sativa (NC_042841.1), Triticum aestivum (NC_002762.1), Glycine max (NC_007942.1), Phaseolus vulgaris (NC_009259.1), Vigna unguiculata (NC_018051.1), Arachis hypogaea (NC_037358.1), and Arabidopsis thaliana (NC_000932.1) were selected as outgroup species, and their accession numbers were retrieved from the NCBI database. The chloroplast genome sequences of all species were first aligned using MAFFT v7.475 [46], and the phylogenetic tree was then constructed in MEGA X v1.01 [42] using the maximum likelihood (ML) method, the GTRGAMMAI substitution model, and 1000 bootstrap replicates.

RESULTS

Chloroplast Genome Assembly

The whole chloroplast genome of C. bijugum was sequenced on the BGISEQ-500 NGS platform with a sequencing coverage of 100X. Sequencing yielded 150 bp reads, which were then remapped to the chloroplast genome of Cicer arietinum. The total length of the C. bijugum chloroplast genome was 124,804 bp (Fig. ).
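As a rough illustration of what 100X coverage means for a genome of this size, mean coverage can be approximated as (number of reads × read length) / genome length. The Python sketch below applies this to the reported read length and genome length. The function names are hypothetical, and the read count it derives is a back-of-the-envelope figure, not a number reported in the study.

```python
import math

def mean_coverage(n_reads, read_length, genome_length):
    """Approximate mean depth: total sequenced bases / genome length."""
    return n_reads * read_length / genome_length

def reads_for_coverage(target_coverage, read_length, genome_length):
    """Smallest number of reads that reaches the target mean depth."""
    return math.ceil(target_coverage * genome_length / read_length)

GENOME_LENGTH = 124_804  # bp, C. bijugum chloroplast genome (from the text)
READ_LENGTH = 150        # bp, read length (from the text)

needed = reads_for_coverage(100, READ_LENGTH, GENOME_LENGTH)
print(needed, round(mean_coverage(needed, READ_LENGTH, GENOME_LENGTH), 2))
```

This assumes every read maps to the chloroplast genome; in practice some reads come from nuclear or mitochondrial DNA, so more raw reads are needed.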

Chloroplast Genome Organization and Gene Content of C. bijugum

Unlike most angiosperms, the chloroplast genome of C. bijugum did not show a quadripartite structure. It consisted of three parts: LSC (84,705 bp), SSC (11,640 bp), and IR (28,459 bp) (Fig. ). The C. bijugum chloroplast genome contained a total of 113 genes, including 79 protein coding genes (70%), 30 tRNA genes (26.5%), and 4 rRNA genes (3.5%). The GC content of the C. bijugum chloroplast genome was 33.6% (Table ). When all genes were functionally classified, 59 genes were responsible for self-replication, 44 genes for photosynthesis, 5 genes for photosystem I, 15 genes for photosystem II, 1 gene for RUBISCO, 6 genes for ATP synthase, and 6 genes for cytochrome b/f complex. Six genes were involved in other functions (Table ).
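The region lengths and gene counts above can be cross-checked with a few lines of Python. This is only an arithmetic sketch using the figures quoted in the text; the variable and function names are hypothetical.

```python
# Region lengths (bp) and gene counts as reported in the text.
REGIONS = {"LSC": 84_705, "SSC": 11_640, "IR": 28_459}
GENES = {"protein-coding": 79, "tRNA": 30, "rRNA": 4}

def percentages(counts):
    """Share of each category in percent, rounded to one decimal place."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}

genome_length = sum(REGIONS.values())
gene_share = percentages(GENES)

print(genome_length)  # sum of the three regions
print(gene_share)     # protein-coding genes come out at 69.9%, i.e. about 70%
```

The three regions add up to the reported genome length, and the gene shares agree with the reported percentages after rounding.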

Comparative Genome Analysis

In this analysis, the sequence identity of the whole chloroplast genomes of C. bijugum and C. arietinum was analyzed using the mVISTA program, and the MegaBlast program was used to compute the percent identity between the two genomes. As expected, MegaBlast analysis showed that the C. bijugum and C. arietinum chloroplast genome sequences were highly similar, with a percent identity of 97.24%. This result indicates that the two chloroplast genomes are highly conserved at the genome level. The mVISTA analysis identified the coding regions that showed diversity: the matK, accD, ycf1, ycf2, rps15, rps19, and ndhF genes were divergent and can be used as molecular barcodes in studies such as species identification, phylogenetic analysis, and evolutionary and molecular research. In addition, the IR region was the most divergent region, and the non-coding regions showed higher variation than the coding regions (Fig. ). Gene orders and genome orientations of the C. bijugum and C. arietinum chloroplast genomes were investigated using the MAUVE program. Locally Collinear Blocks (LCBs) are defined as highly homologous genome regions in which no genome rearrangements have occurred [48]. The chloroplast genomes of C. bijugum and C. arietinum comprised 5 LCB regions. The orientations of the LCB regions were largely the same and linear, except for a very small 594 bp inversion in C. bijugum, labelled in yellow. This small inversion did not change the gene content, and no such inversion in that region has been reported in the literature for the legume family (Fig. ).

Codon Usage Frequency Analysis

Codon usage frequencies, RSCU values, and the amino acid composition of the C. bijugum chloroplast genome were calculated using the MEGA X v1.01 program based on protein-coding gene regions. In the C. bijugum chloroplast genome, 79 protein coding genes were encoded by 41,601 codons. The most abundant amino acid was Leucine, encoded by 4120 codons (10.52%), and the least abundant was Tryptophan, encoded by 608 codons (1.55%) (Fig. ). The two most abundant amino acids were Leucine and Isoleucine, respectively. RSCU values ranged from 0.42 to 2.17. High codon usage bias was detected in 29 codons with RSCU > 1, while low codon usage bias was detected in 33 codons with RSCU < 1. According to these results, the C. bijugum chloroplast genome showed low codon usage bias. No codon usage bias (RSCU = 1) was detected for Methionine and Tryptophan (Table S1). In addition, the third position of the highly preferred codons (RSCU > 1) was mostly adenine (A) or uracil (U) (Fig. ).
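For readers unfamiliar with RSCU, the value for a codon is its observed count divided by the count expected if all synonymous codons for that amino acid were used equally. The Python sketch below shows this calculation under stated assumptions: it uses the standard genetic code table, it is not the MEGA X implementation, the function names are hypothetical, and the short coding sequence is invented for illustration.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def count_codons(cds):
    """Count in-frame codons of a coding sequence (U is treated as T)."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    codons = (cds[i:i + 3] for i in range(0, usable, 3))
    return Counter(c for c in codons if c in CODON_TABLE)

def rscu(counts):
    """RSCU = observed count / mean count over the synonymous codon family."""
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        families[aa].append(codon)
    values = {}
    for codons in families.values():
        total = sum(counts.get(c, 0) for c in codons)
        if total == 0:
            continue  # amino acid not present in the input
        expected = total / len(codons)
        for c in codons:
            values[c] = counts.get(c, 0) / expected
    return values

# Invented toy sequence: Met, Leu (TTA), Leu (TTA), Leu (TTG), Trp, Leu (CTT).
toy_rscu = rscu(count_codons("ATGTTATTATTGTGGCTT"))
print(round(toy_rscu["TTA"], 2), toy_rscu["ATG"], toy_rscu["TGG"])
```

Methionine and Tryptophan each have a single codon, so their RSCU is always 1, which matches the observation in the text that these two amino acids show no codon usage bias.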

Repeat Sequences Analysis

In total, 107 SSRs were detected in the C. bijugum chloroplast genome: 72 mononucleotide, 27 dinucleotide, 1 trinucleotide, 6 tetranucleotide, and 1 pentanucleotide repeats. No hexanucleotide repeats were detected (Fig. ). Mononucleotide and dinucleotide repeats were the most abundant repeat types, with percentages of 67.2% and 25.2%, respectively. Among SSR motifs, A / T (67.2%) and AT / AT (24.2%) were the most common in the chloroplast genome of C. bijugum (Fig. ). Besides the SSRs, forward, reverse, palindromic, and complementary repeats were identified using the REPuter program: 27 forward, 3 reverse, 28 palindromic, and 2 complementary repeats. In addition, 38 tandem repeats were found in the C. bijugum chloroplast genome (Fig. ). Tandem and palindromic repeats were thus the most common of these other repeat types, with percentages of 38.7% and 28.5%, respectively.
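To make the SSR classification concrete, the following Python sketch finds perfect repeats of 1-6 bp motifs with a regular expression and tallies them by motif length. It is a simplified stand-in for MISA, not its actual algorithm. The minimum repeat numbers (10, 5, 5, 3, 3, 3) are an assumed reading of the thresholds given in the methods, the function name is hypothetical, and the test sequence is invented.

```python
import re
from collections import Counter

# Assumed minimum repeat number per motif length (mono- to hexanucleotide).
MIN_REPEATS = {1: 10, 2: 5, 3: 5, 4: 3, 5: 3, 6: 3}

def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (1-based start, motif length, motif, repeat number) tuples."""
    seq = seq.upper()
    hits = []
    for unit_len, n in sorted(min_repeats.items()):
        # A motif of unit_len bases followed by at least n - 1 exact copies.
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (unit_len, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(2)
            # Skip motifs that are themselves repeats of a shorter unit,
            # e.g. "AA" or "ATAT", so each SSR is reported only once.
            if any(motif == motif[:k] * (unit_len // k)
                   for k in range(1, unit_len) if unit_len % k == 0):
                continue
            hits.append((m.start() + 1, unit_len, motif,
                         len(m.group(1)) // unit_len))
    return sorted(hits)

# Invented toy sequence: a 12 bp poly-A run and six AT repeats.
toy_seq = "GC" + "A" * 12 + "GCGC" + "AT" * 6 + "CCG"
toy_hits = find_ssrs(toy_seq)
by_type = Counter(unit_len for _, unit_len, _, _ in toy_hits)
print(toy_hits)
print(dict(by_type))
```

Applied to a full chloroplast genome sequence, the tally in `by_type` would correspond to the mono- to hexanucleotide counts reported above, although exact numbers could differ from MISA because compound and interrupted repeats are not handled here.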

Divergent Hotspots Analysis

In chloroplast genomes, regions that show high variation are called hotspots [22]. The pi values, which indicate nucleotide diversity, were calculated using DnaSP v6.12.03. In the sliding window analysis, pi values ranged from 0.00333 to 0.33167; high pi values indicate high variation in a region, and low pi values indicate low variation. The analysis detected thirteen hotspot regions (psbA, matK, rpoB, rpoC1, rpoC2, psbI, psbK, accD, rps19, ycf2, ycf1, rps15, and ndhF) in the chloroplast genomes of C. bijugum, C. arietinum, and M. orbicularis. In addition, the IR region was the most divergent region compared to the other regions, which supported the result of the comparative genome analysis performed with mVISTA. The most divergent gene was ycf1 (Pi = 0.33167). Furthermore, non-coding regions were more divergent than coding regions, as in the comparative genome analysis (Fig. ).

Previous studies show that the chloroplast genome is very useful for revealing the evolutionary and phylogenetic relationships between species in the legume family [49]. In this study, C. bijugum was phylogenetically compared with C. arietinum and the selected outgroup species. To construct the phylogenetic tree, the complete chloroplast genome sequences of 9 species were used. Seven species belonged to the Fabaceae family, and 2 species (Arabidopsis thaliana and Triticum aestivum) were used as an outgroup. The tree was constructed with the ML method, and all of its branches had 100% bootstrap support. C. bijugum and C. arietinum formed a branch and were the closest species to each other. As expected, the legume species and the outgroups clustered in two separate branches. Among the legumes, Arachis hypogaea alone was located in a branch separate from the other legume species. According to the chloroplast genome sequences, Medicago sativa and Glycine max were the closest species to the Cicer species. The other legume species, Phaseolus vulgaris and Vigna unguiculata, were positioned together in another branch (Fig. ).

DISCUSSION

Chloroplast Genome Organization and Gene Content of C. bijugum and C. arietinum

In this research, the chloroplast genome lengths of C. bijugum and C. arietinum were 124,804 bp and 125,319 bp, respectively. According to the literature, chloroplast genome lengths of land plants vary between 115 and 165 kb [50]. The chloroplast genome of terrestrial plants mostly consists of LSC, SSC, and two inverted repeat regions (IRA and IRB) [51]. However, the chloroplast genomes of C. bijugum and C. arietinum, which belong to the Cicereae tribe, have lost one copy of their IR region, as in other IRLC members such as Galegeae, Millettieae, Caraganeae, Trifolieae, and Fabeae [52-54]. The GC contents of the C. bijugum and C. arietinum chloroplast genomes were 33.6% and 33.9%, respectively; the GC content of C. bijugum was thus lower than that of the cultivated species. In both species, the numbers of protein coding and rRNA genes were the same, but the number of tRNA genes differed. The difference was caused by the trnF-AAA gene, which was encoded in the C. bijugum chloroplast genome but not in the C. arietinum chloroplast genome. Jansen et al. (2008) annotated the C. arietinum chloroplast genome using the Dual Organellar Genome Annotator (DOGMA) tool and reported the absence of the rpl22, rps16, and infA genes [33]. In this study, the C. arietinum chloroplast genome was reannotated using the GeSeq annotation tool, and the genes reported as absent in Jansen's study were detected. In addition, the ycf3 and ycf4 genes of C. arietinum, which were not detected in Jansen's study, were detected under the names pafI and pafII, respectively. Comparison of the gene contents showed that C. arietinum and C. bijugum were highly similar.
In addition, the gene contents of the cultivated and wild species were largely consistent with those of other IRLC members reported in the literature [55, 56]. Sequence identity analysis with MegaBlast showed that the chloroplast genomes of C. bijugum and C. arietinum were highly similar, with a percent identity of 97.24%. This value indicates that the chloroplast genomes of wild and cultivated Cicer species have been highly conserved during evolution. The comparative genome analysis revealed seven potential marker gene regions (matK, rps19, accD, ycf2, ycf1, rps15, and ndhF) in the chloroplast genomes of C. bijugum and C. arietinum. Previous studies indicated that matK, ycf1, ycf2, and rps19 are among the strong molecular markers found in land plants [57-59]. In addition, variation was greater in non-coding regions than in coding regions, as mostly stated in the literature [60-62]. Contrary to what is often stated in the literature, the IR region was found to be the most variable region in this study [63]. Comparative genome analysis with MAUVE showed that the chloroplast genome orientations and gene contents of C. bijugum and C. arietinum were extremely similar except for the gene losses. These results were consistent with those of Munyao et al. (2020), who compared the chloroplast genomes of Chlorophytum comosum and Chlorophytum gallabatense [64]. An inversion detected in the chloroplast genome of C. bijugum was not detected in the C. arietinum chloroplast genome. In the present study, high molecular weight chloroplast DNA of C. bijugum was isolated and sequenced with high genome coverage (100X). These parameters indicate that the genome orientation determined for C. bijugum is accurate. The C. arietinum chloroplast genome, by contrast, was sequenced with low genome coverage using chloroplast-specific primers [33], and genomes sequenced by this method could have high error rates. Therefore, the inversion detected in C. bijugum is a true inversion that is not caused by sequencing errors, and no such inversion has been reported in the literature for the legume family.

In the C. bijugum and C. arietinum chloroplast genomes, the number of codons in C. bijugum was lower than in C. arietinum. Leucine was the most abundant amino acid (10.52% for C. bijugum and 10.28% for C. arietinum) and Tryptophan the least abundant (1.52% for C. bijugum and 1.55% for C. arietinum) in both chloroplast genomes. Similarly, Alzahrani et al. (2020) found that the most abundant amino acid was Leucine and the least abundant was Tryptophan in the Barleria prionitis chloroplast genome [65]. The percentages of the amino acids differed between the species: the percentage of Leucine increased from cultivated to wild type, whereas the percentage of Tryptophan decreased. Low codon usage bias was determined in the C. bijugum chloroplast genome, whereas codon usage in the C. arietinum chloroplast genome was balanced. The RSCU values of the two species were very close to each other. In both chloroplast genomes, Methionine (the start codon) and Tryptophan did not show any codon usage bias (RSCU = 1). As seen from the codon usage frequency graphs, and as in most land plant chloroplast genomes [31, 66], the third position of the most preferred codons (RSCU > 1) was rich in A/U.
SSRs are highly repetitive regions that are abundant in the genomes of eukaryotic organisms. They generally consist of repeats of 1-6 nucleotides and can be used as potential molecular markers in evolutionary and molecular genetic studies [58]. Moreover, SSRs have been reported to play an important role in phylogenetic analysis and genome rearrangements [67]. In the SSR analysis, mononucleotide and dinucleotide repeats were the most abundant SSR types in both the C. bijugum and C. arietinum chloroplast genomes, consistent with the results obtained by Li et al. (2017) and Li et al. (2021) [68, 69]. As seen from the figures, the SSR regions of both species were rich in A and T nucleotides. Although this abundance of A and T nucleotides in the SSR regions of land plant chloroplast genomes has been reported before [70, 71], the number of SSRs differed between the species: 107 and 103 SSRs were detected in the chloroplast genomes of C. bijugum and C. arietinum, respectively. With these in mind, SSRs can be used as strong molecular markers in phylogenetic analysis, evolutionary studies, or population structure research. Zhang et al. (2016) and Wang et al. (2021) previously reported that SSRs were strong molecular markers for land plants [19, 72]. Similar to the results obtained from the chloroplast genome of the wild-type legume Dipteryx alata [71], A / T and AT / AT were the most abundant SSR motifs in the chloroplast genomes of C. bijugum and C. arietinum. In the nucleotide diversity analysis of the chloroplast genomes, the IR region, contrary to the literature, showed more diversity than the LSC and SSC regions.
Considering the nucleotide positions where diversity was seen in the nucleotide diversity graph, it was determined that the coding regions were more conserved than the non-coding regions in the chloroplast genomes of the species. Similar to the results obtained from the comparative genome analysis, matK, accD, rps19, ycf2, ycf1, rps15, and ndhF genes in the coding regions were determined as the most divergent genes in nucleotide diversity analysis. With these in mind, it was determined that these gene regions could be used as molecular markers. Ding et al. (2021) previously reported that these genes were potential strong molecular marker regions for plants [73]. The most divergent region in the chloroplast genomes of species was found to be the ycf1 gene (Pi = 0.33167) located in the IR region. Jung et al. (2021) also reported that the ycf1 gene was one of the strongest molecular marker genes for chloroplast genomes of land plants [61]. When the phylogenetic tree of C. bijugum was examined, it was seen that C. bijugum formed a single branch with two legumes, C. arietinum and Medicago sativa, and as expected, C. bijugum was the closest species to C. arietinum. In Megablast and comparative genome analysis, it was detected that C. bijugum and C. arietinum had highly similar chloroplast genomes with respect to sequence identity, gene order, and genome orientations. These results were supported by the results obtained from phylogenetic analysis. Glycine max was found to be the closest species to C. bijugum, C. arietinum, and Medicago sativa. In legume species, Phaseolus vulgaris and Vigna unguiculata were separated from these three species and formed a single branch among themselves. Schwarz et al. (2017) reported that Glycine and Medicago genera were closer to Cicer genera, and Phaseolus vulgaris and Vigna unguiculata formed a separate group from these species [74]. 
The topology of the phylogenetic tree obtained in this analysis was consistent with the phylogenetic trees reported in other studies of species in the legume family [75-78].
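The nucleotide diversity (Pi) values discussed above are average pairwise differences per site, usually reported along the alignment in sliding windows. The following minimal sketch shows that calculation under stated assumptions: the input is a set of already aligned, equal-length sequences, alignment columns containing a gap are skipped, and the function names, window size, and step are hypothetical rather than the settings used in this study.

```python
from itertools import combinations


def nucleotide_diversity(aligned):
    """Average proportion of differing sites over all sequence pairs (gap columns skipped)."""
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must be aligned to the same length")
    columns = [col for col in zip(*aligned) if "-" not in col]
    pairs = list(combinations(range(len(aligned)), 2))
    if not columns or not pairs:
        return 0.0
    diffs = sum(col[i] != col[j] for col in columns for i, j in pairs)
    return diffs / (len(pairs) * len(columns))


def sliding_pi(aligned, window, step):
    """Return (window_start, Pi) for each full window along the alignment."""
    length = len(aligned[0])
    return [
        (start, nucleotide_diversity([s[start:start + window] for s in aligned]))
        for start in range(0, length - window + 1, step)
    ]


# Toy alignment with two differences in ten sites.
a = "ACGTACGTAC"
b = "ACGTTCGTAA"
print(nucleotide_diversity([a, b]))          # 0.2
print(sliding_pi([a, b], window=5, step=5))  # [(0, 0.2), (5, 0.2)]
```

With only two sequences, Pi in a window reduces to the fraction of sites that differ, so a peak such as the one reported for ycf1 marks a window with a high density of substitutions.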

CONCLUSION

This is the first study to report the whole chloroplast genome sequence of C. bijugum, a wild chickpea species. First, chloroplasts of C. bijugum were isolated, and high-molecular-weight chloroplast DNA was then extracted from them. The chloroplast genome of C. bijugum was sequenced at 100X coverage on a next-generation sequencing platform and then compared with that of the cultivated chickpea species Cicer arietinum and with other legumes by using bioinformatics tools. The analyses showed that the chloroplast genome of C. bijugum was 124,804 bp in length and encoded 113 genes in total. The chloroplast genomes of C. arietinum and C. bijugum shared 97.24% identity according to the MegaBlast tool. Comparative genome analysis revealed that the matK, accD, ycf1, ycf2, rps15, rps19, and ndhF genes were divergent regions. Codon usage frequency analysis showed that leucine was the most abundant amino acid and tryptophan the least abundant in the chloroplast genome of C. bijugum. Mononucleotide and dinucleotide SSRs were the most abundant SSR types, at 67.2% and 25.2%, respectively. Furthermore, tandem and palindromic repeats were the other most common repeat types, at 38.7% and 28.5%, respectively. Thirteen hotspot regions (psbA, matK, rpoB, rpoC1, rpoC2, psbI, psbK, accD, rps19, ycf2, ycf1, rps15, and ndhF) were detected in total. The phylogenetic tree showed that C. bijugum and C. arietinum were the closest species to each other. Taken together, these analyses examined the entire chloroplast genome sequence of C. bijugum in depth and provided useful information about the chloroplast genome structure, gene orientation, and molecular structure of the chloroplast. This information is expected to be valuable to researchers investigating species of the Fabaceae family and to guide future research on chickpea species, such as species identification, gene expression, comparative genome analyses, and molecular and phylogenetic analyses.

Table 1

Gene content table of C. bijugum and C. arietinum.

Species	Cicer bijugum	Cicer arietinum
Genome Size (bp)	124,804	125,319
LSC (bp) / percentage	84,705 / 67.9%	82,528 / 65.9%
SSC (bp) / percentage	11,640 / 9.3%	13,038 / 10.4%
IR (bp) / percentage	28,459 / 22.8%	29,753 / 23.7%
Total Gene Number	113	112
CDS / percentage	79 / 70%	79 / 70.5%
tRNA / percentage	30 / 26.5%	29 / 25.9%
rRNA / percentage	4 / 3.5%	4 / 3.6%
Average gene length (nt)	1,104.5	1,118.9
GC Ratio (%)	33.6	33.9
AT Ratio (%)	66.4	66.1
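The region percentages in Table 1 follow directly from the region sizes and the total genome size. As a small consistency check, the sketch below recomputes them from the table values; the dictionary layout and function name are illustrative assumptions, and the numbers are copied from Table 1.

```python
# Region sizes (bp) copied from Table 1.
GENOMES = {
    "Cicer bijugum": {"total": 124804, "LSC": 84705, "SSC": 11640, "IR": 28459},
    "Cicer arietinum": {"total": 125319, "LSC": 82528, "SSC": 13038, "IR": 29753},
}


def region_percentages(genome):
    """Return each region's share of the genome, in percent, rounded to one decimal."""
    total = genome["total"]
    return {
        name: round(100 * size / total, 1)
        for name, size in genome.items()
        if name != "total"
    }


for species, genome in GENOMES.items():
    regions_sum = sum(size for name, size in genome.items() if name != "total")
    # The three regions should add up to the reported genome size.
    print(species, region_percentages(genome), regions_sum == genome["total"])
```

For both species the LSC, SSC, and IR sizes sum exactly to the reported genome size, and the rounded shares match the percentages listed in the table.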

Table 2

Functions of genes located in C. bijugum.

Category	Group of Genes	Names of Genes
Self replication	Large subunit of ribosomal proteins	rpl2, rpl14, rpl16, rpl20, rpl22, rpl23, rpl32, rpl33, rpl36
	Small subunit of ribosomal proteins	rps2, rps3, rps4, rps7, rps8, rps11, rps12, rps14, rps15, rps16, rps18, rps19
	DNA-dependent RNA polymerase	rpoA, rpoB, rpoC1, rpoC2
	Ribosomal RNA Genes	rrn4.5, rrn5, rrn16, rrn23
	Transfer RNA Genes	trnH-GUG, trnK-UUU, trnM-CAU, trnT-GGU, trnT-UGU, trnV-UAC, trnV-GAC
		trnF-AAA, trnF-GAA, trnfM-CAU, trnL-UAA, trnL-CAA, trnL-UAG, trnS-UGA, trnS-GCU, trnS-GGA
		trnG-GCC, trnE-UUC, trnY-GUA, trnD-GUC, trnC-GCA, trnR-UCU
		trnR-ACG, trnQ-UUG, trnW-CCA, trnP-UGG, trnI-GAU, trnI-CAU, trnA-UGC, trnN-GUU
Genes for photosynthesis	Photosystem I	psaA, psaB, psaC, psaI, psaJ
	Photosystem II	psbA, psbB, psbC, psbD, psbE, psbF, psbH, psbI, psbJ, psbK, psbL, psbM, psbN, psbT, psbZ
	RUBISCO	rbcL
	Subunits of ATP synthase	atpA, atpB, atpE, atpF, atpH, atpI
	Subunits of NADH dehydrogenase	ndhA, ndhB, ndhC, ndhD, ndhE, ndhF, ndhG, ndhH, ndhI, ndhJ, ndhK
	Cytochrome b/f complex	petA, petB, petD, petG, petL, petN
Other genes	Protease	clpP
	Maturase	matK
	Envelope membrane protein	cemA
	Translation initiation factor	infA
	C-type cytochrome synthase gene	ccsA
	Subunit of Acetyl-CoA-carboxylase	accD
Genes of unknown functions	Hypothetical chloroplast reading frames	ycf1, ycf2, ycf3, ycf4
