Nunzio D'Agostino, Rachele Tamburino, Concita Cantarella, Valentina De Carluccio, Lorenza Sannino, Salvatore Cozzolino, Teodoro Cardi, Nunzia Scotti.
Abstract
Members of the genus Capsicum, which include both wild forms and cultivars of peppers and chilies, are of great economic importance. The high number of potentially informative characteristics that can be identified through next-generation sequencing technologies has given a major boost to evolutionary and comparative genomic research in higher plants. Here, we determined the complete nucleotide sequences of the plastomes of eight Capsicum species (eleven genotypes), representing the three main taxonomic groups in the genus, and estimated their molecular diversity. Comparative analyses highlighted a wide spectrum of variation, ranging from point mutations to small- and medium-sized insertions/deletions (InDels), with accD, ndhB, rpl20, ycf1, and ycf2 being the most variable genes. The global pattern of sequence variation is consistent with the phylogenetic signal. Maximum-likelihood tree estimation revealed that Capsicum chacoense is sister to the baccatum complex. Divergence and positive selection analyses showed that protein-coding genes were generally well conserved, but we identified 25 signatures of positive selection distributed across six genes involved in different essential plastid functions, suggesting positive selection during the evolution of Capsicum plastomes. Finally, the identified sequence variation allowed us to develop simple PCR-based markers useful in future work to discriminate species belonging to different Capsicum complexes.
Keywords: chloroplast genome; microsatellites; molecular markers; next-generation sequencing; pepper; perfect tandem repeats; sequence variability; simple sequence repeats; single-nucleotide polymorphism
Year: 2018 PMID: 30336638 PMCID: PMC6210379 DOI: 10.3390/genes9100503
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.096
Plastome features of the eleven Capsicum genotypes.
| Genotype Code | Species | Complex a | Germplasm Bank Identifier (ID) | Total Size (bp) | LSC d (bp) | SSC d (bp) | IR d (bp) | % GC |
|---|---|---|---|---|---|---|---|---|
| ann1 | | CA | CGN21526 b | 157,052 | 87,380 | 17,882 | 25,895 | 37.71 |
| ann2 | | CA | CAP319 c | 156,842 | 87,380 | 17,960 | 25,751 | 37.72 |
| ann3 | | CA | CAP1546 c | 156,872 | 87,341 | 17,917 | 25,807 | 37.73 |
| chi | | CA | CGN22099 b | 156,858 | 87,288 | 17,860 | 25,855 | 37.73 |
| fru | | CA | CGN22779 b | 156,836 | 87,359 | 17,911 | 25,783 | 37.72 |
| gal | | CA | CGN22208 b | 157,029 | 87,366 | 17,941 | 25,861 | 37.69 |
| cha | | CA/CB | CGN22084 b | 156,841 | 87,346 | 17,893 | 25,801 | 37.72 |
| bac.b | | CB | CGN23261 b | 157,053 | 87,350 | 17,973 | 25,865 | 36.45 |
| bac.p | | CB | CGN21512 b | 157,144 | 87,351 | 17,973 | 25,910 | 37.66 |
| pra | | CB | CGN20805 b | 157,056 | 87,351 | 17,973 | 25,866 | 37.66 |
| pub | | CP | CGN22108 b | 157,390 | 87,688 | 17,928 | 25,887 | 37.69 |
a Walsh and Hoot [20] and Ince, Karaca, and Onus [19]; CA: C. annuum; CB: C. baccatum; CP: C. pubescens; b from the Centre for Genetic Resources germplasm bank, The Netherlands; c from IPK Gatersleben germplasm bank, Germany; d LSC = large single-copy region; SSC = small single-copy region; IR = inverted repeat; GC = guanine/cytosine.
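The region sizes in the table above can be cross-checked against the reported totals. The sketch below assumes the usual quadripartite plastome layout, in which the total length equals LSC + SSC + 2 × IR because the inverted repeat occurs twice. The dictionary and function names are illustrative only; the values are copied from the table.

```python
# (LSC, SSC, IR, reported total) in base pairs, copied from the table above.
PLASTOMES = {
    "ann1":  (87380, 17882, 25895, 157052),
    "ann2":  (87380, 17960, 25751, 156842),
    "ann3":  (87341, 17917, 25807, 156872),
    "chi":   (87288, 17860, 25855, 156858),
    "fru":   (87359, 17911, 25783, 156836),
    "gal":   (87366, 17941, 25861, 157029),
    "cha":   (87346, 17893, 25801, 156841),
    "bac.b": (87350, 17973, 25865, 157053),
    "bac.p": (87351, 17973, 25910, 157144),
    "pra":   (87351, 17973, 25866, 157056),
    "pub":   (87688, 17928, 25887, 157390),
}


def expected_total(lsc: int, ssc: int, ir: int) -> int:
    """Total plastome length when the inverted repeat is present in two copies."""
    return lsc + ssc + 2 * ir


# Collect genotypes whose reported total disagrees with the sum of regions.
mismatches = {
    code: (expected_total(lsc, ssc, ir), total)
    for code, (lsc, ssc, ir, total) in PLASTOMES.items()
    if expected_total(lsc, ssc, ir) != total
}

print(mismatches)  # an empty dict means every row is internally consistent
```

A check of this kind only tests the arithmetic of the size columns; it says nothing about the % GC column.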
Figure 1. Map of the Capsicum pubescens chloroplast genome. Genes inside the outer circle are transcribed in the clockwise direction, while those outside are transcribed in the counterclockwise direction. Different color codes represent genes belonging to various functional groups. The circle inside the GC content graph marks the 50% threshold. The inverted repeat, large single-copy, and small single-copy regions are denoted by IR, LSC, and SSC, respectively.
Figure 2. Sliding window analysis of the multiple plastome sequence alignment within the Capsicum genus. The region with high nucleotide variability (Pi > 0.05), corresponding to the IR/SSC junction, is indicated. Window length = 200 base pairs (bp); step size = 50 bp.
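A sliding window scan of this kind can be sketched as follows. This is a minimal illustration, not the software used in the study, which is not named in this excerpt. It assumes Pi is the mean proportion of differing sites over all sequence pairs, and that positions where either sequence has a gap or ambiguous base are skipped for that pair; other tools may treat gaps differently. The function names are hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Mean pairwise proportion of differing sites (Pi) for aligned sequences.

    Positions where either sequence of a pair is not A/C/G/T are skipped
    for that pair (an assumption of this sketch).
    """
    pairs = list(combinations(seqs, 2))
    total = 0.0
    for a, b in pairs:
        compared = diffs = 0
        for x, y in zip(a, b):
            if x in "ACGT" and y in "ACGT":
                compared += 1
                diffs += x != y
        total += diffs / compared if compared else 0.0
    return total / len(pairs)


def sliding_pi(alignment, window=200, step=50):
    """Return (window start, Pi) tuples along an alignment of equal-length strings."""
    length = len(alignment[0])
    return [
        (start, nucleotide_diversity([s[start:start + window] for s in alignment]))
        for start in range(0, length - window + 1, step)
    ]


# Toy alignment; a real plastome alignment would use window=200, step=50.
toy = ["ACGTACGT", "ACGTACGA", "ACGTACGT"]
result = sliding_pi(toy, window=4, step=2)
variable = [start for start, pi in result if pi > 0.05]
print(result, variable)
```

Windows whose Pi exceeds a chosen threshold (0.05 in the figure) can then be reported as highly variable regions.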
Figure 3. Schematic representation of the five most variable genes (ndhB, accD, rpl20, ycf1 and ycf2) in the plastomes under investigation. Gray bars represent the multiple-sequence alignment (MSA) for each gene and are scaled according to the MSA length. Black boxes indicate highly variable regions in the MSA. Above each box, a snapshot of the MSA along with alignment positions is reported.
Figure 4. Phylogenetic tree of Capsicum genotypes. Phylogram of the best maximum-likelihood (ML) tree as determined using the RAxML software from the complete plastome dataset. Numbers associated with branches are ML bootstrap support values.
Figure 5. Results of molecular evolution analysis of plastid genes within the Capsicum genus. (A) Estimation of protein-coding gene divergence by the average branch length ± standard deviation for each gene tree; (B) number of putative sites under positive selection.
Examples of chloroplast molecular markers (single-nucleotide polymorphisms, SNPs; simple sequence repeats, SSRs; tandem repeats, TRs) identified in this study using the accession NC_026551 of C. lycianthoides as a reference.
| Marker | Region | ann1 | ann2 | ann3 | chi | fru | gal | cha | bac.b | bac.p | pra | pub | Notes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SNPs a | | | | | | | | | | | | | |
| AAACC[A/ | | 0 b | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | Gain of a |
| GAATT[C/ | | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | Loss of a |
| ATATT[C/ | | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | Loss of a |
| TGCGA[G/ | | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | Loss of a |
| TCTTG[C/ | | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | Loss of a |
| CCAGC[T/ | | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | Loss of a |
| SSRs c | | | | | | | | | | | | | |
| TTTC(A)nTCAT | | 9 d | 9 | 9 | 9 | 9 | 9 | 10 | 10 | 10 | 10 | 2 | |
| TCTG(T)nCAAA | | 12 | 12 | 12 | 12 | 12 | 12 | 11 | 11 | 11 | 11 | 10 | |
| AAT(ATAA)nAT | | 4 | 4 | 4 | 4 | 4 | 4 | 3 | 2 | 2 | 2 | 3 | |
| CTTC(CT)nTATC | | 5 | 5 | 5 | 5 | 5 | 5 | 4 | 5 | 5 | 5 | 5 | |
| TTTC(A)nGGTA | | 11 | 11 | 11 | 11 | 11 | 11 | 9 | 9 | 9 | 9 | 8 | |
| GTTA(T)nAGGT | | 14 | 14 | 14 | 14 | 14 | 14 | 15 | 16 | 16 | 16 | 13 | |
| TAAC(T)nGTTG | | 6 | 6 | 6 | 6 | 6 | 9 | 6 | 6 | 6 | 6 | 6 | |
| TRs e | | | | | | | | | | | | | |
| GGAT(TTATC…GCCTA)37AAGG | | 1 f | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2 | |
| AAGA(GAGTT…AAAGA)22AGAC | | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 3 | 3 | 3 | 1 | |
| TTAA(TTGGT…TTGTT)30TAAG | | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2 | 2 | 2 | 1 | |
| TCTC(ATTGA…ATTGT)25ATTT | | 2 | 2 | 2 | 2 | 2 | 2 | 1 | 1 | 1 | 1 | 1 | |
a The nucleotide in brackets (underlined) represents the alternative allele; b 0 = reference allele; 1 = alternative allele; c the nucleotide(s) in parentheses represent the repeat unit; n = number of repeats; d the numbers correspond to the number of repeat units in each genotype; e the nucleotides in parentheses represent the tandem repeat, and the number outside the parentheses corresponds to the length of the repeat; f the numbers correspond to the number of tandem repeats in each genotype.
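One way to read the tandem repeat rows is to ask which markers separate a whole complex from all other genotypes. The sketch below is illustrative and not the authors' procedure: it treats a marker as diagnostic for a complex when the copy numbers observed in that complex never occur outside it. The marker labels (TR37, TR22, TR30, TR25) are hypothetical names derived from the repeat lengths, and cha is kept as its own CA/CB group, following the plastome features table.

```python
GENOTYPES = ["ann1", "ann2", "ann3", "chi", "fru", "gal",
             "cha", "bac.b", "bac.p", "pra", "pub"]

# Complex assignment as listed in the plastome features table.
COMPLEX = {
    "ann1": "CA", "ann2": "CA", "ann3": "CA", "chi": "CA", "fru": "CA",
    "gal": "CA", "cha": "CA/CB", "bac.b": "CB", "bac.p": "CB", "pra": "CB",
    "pub": "CP",
}

# Tandem repeat copy numbers per genotype, copied from the table above.
# Keys are hypothetical labels based on the repeat length in bp.
TR_COPIES = {
    "TR37": [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2],
    "TR22": [1, 1, 1, 1, 1, 1, 1, 3, 3, 3, 1],
    "TR30": [1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 1],
    "TR25": [2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1],
}


def diagnostic_complexes(copies):
    """Return complexes whose copy numbers are not shared with any other group."""
    by_complex = {}
    for genotype, n in zip(GENOTYPES, copies):
        by_complex.setdefault(COMPLEX[genotype], set()).add(n)
    diagnostic = []
    for cx, alleles in by_complex.items():
        others = set().union(*(a for c, a in by_complex.items() if c != cx))
        if alleles.isdisjoint(others):
            diagnostic.append(cx)
    return diagnostic


summary = {marker: diagnostic_complexes(c) for marker, c in TR_COPIES.items()}
print(summary)
```

Copy-number differences of this kind change amplicon length, which is what makes a simple PCR assay able to distinguish the groups on a gel.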
Figure 6. Examples of chloroplast molecular markers developed in this study. PCR markers based on the presence of perfect tandem repeats and insertions/deletions (InDels) able to discriminate CB (A) and CP (B) complexes. PCR results from representative genotypes in each complex are shown. CB = C. baccatum; CP = C. pubescens; CA = C. annuum; 1 = bac.b; 2 = bac.p; 3 = bac.p2; 4 = bac.p3; 5 = bac.p4; 6 = pra; 7 = pub; 8 = pub2; 9 = pub3; 10 = cha; 11 = cha2; 12 = cha3; 13 = cha4; 14 = ann2; 15 = ann4; 16 = ann5; 17 = ann6; 18 = ann7; 19 = chi2; 20 = chi3; 21 = chi4; 22 = fru2; 23 = fru3; 24 = fru4.