Data characterizing the chloroplast genomes of extinct and endangered Hawaiian endemic mints (Lamiaceae) and their close relatives.

Andreanna J Welch1, Katherine Collins1, Aakrosh Ratan2, Daniela I Drautz-Moses3, Stephan C Schuster3, Charlotte Lindqvist1.   

Abstract

These data are presented in support of a plastid phylogenomic analysis of the recent radiation of the Hawaiian endemic mints (Lamiaceae), and their close relatives in the genus Stachys, "The quest to resolve recent radiations: Plastid phylogenomics of extinct and endangered Hawaiian endemic mints (Lamiaceae)" [1]. Here we describe the chloroplast genome sequences for 12 mint taxa. Data presented include summaries of gene content and length for these taxa, structural comparison of the mint chloroplast genomes with published sequences from other species in the order Lamiales, and comparisons of variability among three Hawaiian taxa vs. three outgroup taxa. Finally, we provide a list of 108 primer pairs targeting the most variable regions within this group and designed specifically for amplification of DNA extracted from degraded herbarium material.

Keywords:  Genome structure; Hawaii; Lamiaceae; Plastid genomes

Year:  2016        PMID: 27077093      PMCID: PMC4816906          DOI: 10.1016/j.dib.2016.03.037

Source DB:  PubMed          Journal:  Data Brief        ISSN: 2352-3409


Value of the data

These data provide a summary of the characteristics and structure of the chloroplast genomes of several taxa within Lamiaceae, which can be used to increase our understanding of molecular evolution of the chloroplast genome, as well as the evolution of its structure and function. A comparison of variable regions in mints can be used to identify rapidly evolving regions in other taxa. Primer sequences described here can be used to target highly variable regions in closely related taxa.

Data

Raw, demultiplexed sequence reads have been deposited in the NCBI sequence read archive (SRP070171) and full chloroplast genomes for 12 mint taxa have been deposited in GenBank (KU724130-KU724141). Data presented in the text include tables and figures giving information on gene content and variability in these 12 species, as well as comparison of genome structure with other members of the order Lamiales.

Experimental design, materials and methods

Samples, library construction, and shotgun sequencing

We selected 12 Hawaiian mint taxa for shotgun sequencing (five contemporary and seven from herbarium collections ranging up to ~100 years old), of which two extinct species were represented by two accessions each (see Tables 1 and 2 in [1]). We also sequenced four Stachys species, representing both close and more distantly related relatives. DNA extraction, library construction and shotgun sequencing followed the methods described in [1]. Briefly, approximately 100 mg dried leaf tissue was homogenized using the TissueLyser system (Qiagen), and DNA was extracted using the DNeasy plant mini kit (Qiagen). DNA isolated from herbarium samples was processed separately from contemporary samples using stringent protocols and controls to prevent and detect any potential contamination. For contemporary samples, DNA extracts were sheared to 200–600 bp via sonication in a Covaris S220; DNA from herbarium samples is naturally degraded and therefore was not sheared further. Genomic shotgun sequencing libraries were constructed following the standard Illumina TruSeq protocol for contemporary samples, or the NEBNext Library Prep Mastermix kit (New England Biolabs) for herbarium samples. Libraries were quantified using the PicoGreen High Sensitivity assay and then pooled and sequenced on the Illumina HiSeq and MiSeq platforms. Adapter sequences were trimmed from the reads using the AdapterRemoval software [2]. Assessment of DNA damage in old herbarium specimens was conducted using mapDamage 2.0 [3]. The presence of misincorporations characteristic of damaged DNA molecules typically found in old and degraded samples suggests that the data from herbarium samples are authentic; however, the overall levels of damage were low and within the range expected based on the age of the specimens (see Supplementary Figs. 2 and 3 in [1]).

Assembly of the Hawaiian mint reference chloroplast genome

Because no chloroplast genome sequence from a closely related taxon was available at the time this study was conducted, we implemented a combined reference-guided and de novo assembly approach [4] to determine the first complete chloroplast genome sequence for a Hawaiian mint. We assembled the sequence for Stenogyne haliakalae, an extinct species, as it had the largest number of reads. Briefly, the approach involved conducting both reference-guided assembly in YASRA 2.32 [5] with olive (Olea europaea, NC_013707; [6]) as the reference, as well as de novo assembly in SOAPdenovo v1.05 [7]. Assembly methods are described in more detail in [1]. The resulting contigs from both approaches were split into overlapping sequences, and then used as input for a further reference-guided assembly step in YASRA. Gaps between the final contigs were closed using PCR (see [1] for PCR reaction conditions and Supplementary Table 1 of this paper for primer information) and Sanger sequencing in both directions from high-quality DNA extracted from a contemporary sample of Stenogyne bifida. This ensured that amplification could be carried out over potentially large gaps, which would not be possible with degraded DNA from the extinct Stenogyne haliakalae. Contigs and Sanger sequences were aligned in Sequencher 4.7 (Gene Codes) to create a pseudo-reference sequence [4]. Reads from Stenogyne haliakalae were then mapped to the pseudo-reference using BWA v. 0.6.2 [8]. The reference sequence was further refined through Sanger sequencing of areas with low coverage or poor mapping quality (e.g., the border between the inverted repeat and single copy region). Reads were mapped to the final sequence, PCR duplicates were flagged and removed with the MarkDuplicates tool of the Picard command line toolset (http://picard.sourceforge.net/index.shtml), and a consensus sequence was called using SAMtools [9].
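The step in which contigs are "split into overlapping sequences" before re-assembly can be illustrated with a minimal sketch. The fragment length and overlap used below are hypothetical defaults chosen for illustration; the values actually used in the study are described in [1], not here, and the function name is not taken from any of the cited tools.

```python
def split_into_overlapping(seq, frag_len=500, overlap=100):
    """Split a contig into fragments of frag_len bases that overlap by `overlap` bases.

    frag_len and overlap are illustrative assumptions, not values from the study.
    """
    if frag_len <= 0 or not 0 <= overlap < frag_len:
        raise ValueError("need frag_len > 0 and 0 <= overlap < frag_len")
    if not seq:
        return []
    step = frag_len - overlap
    fragments = []
    start = 0
    while True:
        fragments.append(seq[start:start + frag_len])
        if start + frag_len >= len(seq):
            break  # the last fragment reaches the end of the contig
        start += step
    return fragments


contig = "ACGT" * 300  # toy 1,200 bp contig
pieces = split_into_overlapping(contig)
print([len(p) for p in pieces])  # [500, 500, 400]
```

Because consecutive fragments share their boundary bases, an assembler that receives them can in principle rejoin them while also merging in fragments that come from the other assembly approach.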

Assembly of additional mint chloroplast genomes

Complete or nearly complete chloroplast genomes were assembled using similar methods for 11 additional taxa: seven from the endemic Hawaiian mints (two of which were from herbarium samples) and four Stachys outgroups (see Tables 1 and 2 in [1]). Since the Hawaiian mints have diverged recently, we used the new chloroplast genome sequence from Stenogyne haliakalae as the reference during reference-guided assembly for the remaining Hawaiian taxa. The resulting contigs were aligned to create an interim sequence, and then the reads were mapped to this using BWA and a final consensus sequence called using SAMtools. Chloroplast genome sequences for the Stachys outgroup taxa were assembled in a similar manner. We first assembled the chloroplast genome sequence for Stachys chamissonis, as this species is most closely related to the Hawaiian lineage. We conducted independent YASRA runs using Olea europaea as the reference, in addition to newly available sequences from Stenogyne haliakalae, Sesamum indicum (NC_016433) [10], Origanum vulgare (JX880022) [11], and Salvia miltiorrhiza (NC_020431) [12]. The contigs from all five runs were aligned to create an interim sequence. The reads were then mapped to the interim sequence using BWA and a consensus called using SAMtools. Once the Stachys chamissonis sequence was assembled, we used this as the reference in YASRA for reference-guided assembly of both Stachys coccinea and Stachys sylvatica. For Stachys byzantina, the most distantly related outgroup, we performed the initial reference-guided assembly using the sequence from Stachys chamissonis, as well as Olea europaea and Sesamum indicum. The rest of the assembly proceeded as described for the other Stachys species.

Gene content and structure of mint chloroplast genomes

The Stenogyne haliakalae reference sequence and sequences from additional species were annotated using a combination of DOGMA [13], tRNAscan-SE [14], and additional manual BLAST searches. The borders of the inverted repeats were identified with the program Inverted Repeats Finder [15]. Overall the chloroplast genome sequences assembled here are very similar to other Lamiales. Table 1 shows the gene content of the Stenogyne haliakalae chloroplast genome, which mirrors the gene content of the chloroplast genomes for the 11 additional mint taxa, including the Stachys outgroups. Table 2 lists the genes that contain introns (and the number of introns present) in the mint taxa investigated here. Table 3 gives the lengths of the full chloroplast genome for each sequence assembled, as well as the lengths of the inverted repeats and single copy regions.

Table 1

Gene content of the chloroplast genome of Stenogyne haliakalae and 11 additional mint species.

Gene products | Genes
Photosystem I | psaA, psaB, psaC, psaI, psaJ, ycf3 [20], ycf4 [20]
Photosystem II | psbA, psbB, psbC, psbD, psbE, psbF, psbH, psbI, psbJ, psbK, psbL, psbM, psbN, psbT, psbZ/lhbA
Cytochrome b6/f | petA, petB, petD, petG, petL, petN
ATP synthase | atpA, atpB, atpE, atpF, atpH, atpI
Rubisco | rbcL
NADH oxidoreductase | ndhA, ndhB^a, ndhC, ndhD, ndhE, ndhF, ndhG, ndhH, ndhI, ndhJ, ndhK
RNA polymerase | rpoA, rpoB, rpoC1, rpoC2
Large subunit ribosomal proteins | rpl2^a, rpl14, rpl16, rpl20, rpl22, rpl23^a, rpl32, rpl33, rpl36
Small subunit ribosomal proteins | rps2, rps3, rps4, rps7^a, rps8, rps11, rps12^b, rps14, rps15, rps16, rps18, rps19
Other functions | accD, ccsA, cemA, clpP, matK, infA
Unknown functions | ycf1^b, ycf2^a, ycf15^a
Ribosomal RNAs | rrn23^a, rrn16^a, rrn5^a, rrn4.5^a
Transfer RNAs | trnA(UGC)^a, trnC(GCA), trnD(GUC), trnE(UUC), trnF(GAA), trnG(GCC), trnG(UCC), trnH(GUG), trnI(CAU)^a, trnI(GAU)*, trnK(UUU), trnL(UAA), trnL(UAG), trnL(CAA)^a, trnfM(CAU), trnM(CAU), trnN(GUU)^a, trnP(UGG), trnQ(UUG), trnR(ACG)^a, trnR(UCU), trnS(GCU), trnS(GGA), trnS(UGA), trnT(GGU), trnT(UGU), trnV(UAC), trnV(GAC)^a, trnW(CCA), trnY(GUA)

^a Gene fully located within the inverted repeats.

^b Gene partially located within the inverted repeats.

Table 2

Genes containing introns in the chloroplast genomes of Stenogyne haliakalae and 11 additional mint species. Numbers represent the lengths (bp) of exons and introns in S. haliakalae.

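Gene | Location | # Introns | Exon I | Intron I | Exon II | Intron II | Exon III
atpF | LSC | 1 | 143 | 656 | 410 | |
clpP | LSC | 2 | 70 | 658 | 291 | 616 | 227
ndhA | SSC | 1 | 552 | 1020 | 538 | |
ndhB | IR | 1 | 755 | 680 | 776 | |
petB | LSC | 1 | 5 | 718 | 650 | |
petD | LSC | 1 | 7 | 728 | 474 | |
rpl16 | LSC | 1 | 8 | 907 | 392 | |
rpl2 | IR | 1 | 390 | 658 | 433 | |
rpoC1 | LSC | 1 | 434 | 736 | 1634 | |
rps12 | LSC/IRa | 2 | 113 | ^a | 231 | 537 | 25
rps16 | LSC | 1 | 39 | 875 | 226 | |
trnA-UGC | IR | 1 | 37 | 807 | 34 | |
trnG-UCC | LSC | 1 | 22 | 690 | 47 | |
trnI-GAU | IR | 1 | 34 | 938 | 36 | |
trnK-UUU | LSC | 1 | 36 | 2509 | 34 | |
trnL-UAA | LSC | 1 | 36 | 488 | 49 | |
trnV-UAC | LSC | 1 | 37 | 579 | 36 | |
ycf3 | LSC | 2 | 123 | 713 | 229 | 725 | 152

^a Trans-spliced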

Table 3

Lengths (bp) of the long single copy region (LSC), short single copy region (SSC), and inverted repeats regions (IR) for the chloroplast genomes of Stenogyne haliakalae and 11 additional mint species.

Species | LSC | IR | SSC | Total
Haplostachys haplostachya | 81,755 | 25,441 | 17,495 | 150,132
Haplostachys linearifolia | 81,752 | 25,441 | 17,495 | 150,129
Phyllostegia velutina | 81,765 | 25,440 | 17,496 | 150,141
Phyllostegia waimeae | 81,744 | 25,448 | 17,497 | 150,137
Stachys byzantina | 81,245 | 25,480 | 17,517 | 149,722
Stachys chamissonis | 81,774 | 25,464 | 17,552 | 150,254
Stachys coccinea | 82,171 | 25,470 | 17,563 | 150,674
Stachys sylvatica | 81,807 | 25,414 | 17,560 | 150,195
Stenogyne bifida | 81,752 | 25,441 | 17,495 | 150,129
Stenogyne haliakalae | 81,364 | 25,436 | 17,499 | 149,736
Stenogyne kanehoana | 81,739 | 25,441 | 17,495 | 150,116
Stenogyne sessilis | 81,743 | 25,441 | 17,495 | 150,120

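The region lengths in Table 3 can be checked against the totals, because a chloroplast genome with the usual quadripartite structure contains the inverted repeat twice, so the total should equal LSC + 2 × IR + SSC. The sketch below applies that check to three rows copied from Table 3; the function and variable names are illustrative only.

```python
def expected_total(lsc, ir, ssc):
    """Total genome length implied by one LSC, one SSC and two IR copies."""
    return lsc + 2 * ir + ssc


# (LSC, IR, SSC, reported total) for three rows copied from Table 3
TABLE3_ROWS = {
    "Haplostachys haplostachya": (81755, 25441, 17495, 150132),
    "Stachys byzantina": (81245, 25480, 17517, 149722),
    "Stenogyne haliakalae": (81364, 25436, 17499, 149736),
}


def discrepancies(rows):
    """Return {species: reported - expected} for rows where the two differ."""
    out = {}
    for species, (lsc, ir, ssc, total) in rows.items():
        diff = total - expected_total(lsc, ir, ssc)
        if diff != 0:
            out[species] = diff
    return out


print(discrepancies(TABLE3_ROWS))  # {'Stenogyne haliakalae': 1}
```

For all rows of Table 3 except Stenogyne haliakalae the tabulated total equals this sum; for Stenogyne haliakalae the tabulated total is one base larger than the sum of its regions, and the data as presented do not indicate the reason.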
To compare the genome structure of the Hawaiian and Stachys taxa to other taxa in the order Lamiales, we conducted analyses in Mauve 2.3.1 [16] (Fig. 1). In the analysis we included Stenogyne haliakalae, Stachys byzantina, Ajuga reptans (NC_023102), Andrographis paniculata (NC_022451), Boea hygrometrica (NC_016468), Jasminum nudiflorum (NC_008407), Lindenbergia philippensis (NC_022859), Olea europaea (NC_013707), Origanum vulgare (JX880022), Pinguicula ehlersiae (NC_023463), Salvia miltiorrhiza (NC_020431), Schwalbea americana (NC_023115), Sesamum indicum (NC_016433), Tectona grandis (NC_020098), and Utricularia gibba (NC_021449). The chloroplast genome sequences of some taxa in this analysis demonstrated rearrangements and inversions. Therefore, one of the inverted repeats was trimmed off at the coordinates suggested by Inverted Repeats Finder so that homology could be determined with the remaining region. Seed weight was set to 19, the gap opening penalty was set to −200, and the gap extension penalty to −30.
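Trimming one inverted repeat copy before alignment amounts to deleting a coordinate interval from the linearized genome sequence. The following is a minimal sketch under stated assumptions: the coordinates are treated as 1-based and inclusive (the coordinate convention of the Inverted Repeats Finder output is not specified here), and the function name and toy sequence are hypothetical.

```python
def trim_inverted_repeat(seq, ir_start, ir_end):
    """Remove one inverted repeat copy given 1-based inclusive coordinates."""
    if not 1 <= ir_start <= ir_end <= len(seq):
        raise ValueError("coordinates must satisfy 1 <= ir_start <= ir_end <= len(seq)")
    return seq[:ir_start - 1] + seq[ir_end:]


# Toy genome laid out as LSC + IRb + SSC + IRa (IRa is the reverse complement of IRb)
lsc, irb, ssc, ira = "ACGTACGT", "GGCA", "TTAA", "TGCC"
genome = lsc + irb + ssc + ira
trimmed = trim_inverted_repeat(genome, 17, 20)  # drop the second repeat copy
print(trimmed)  # ACGTACGTGGCATTAA
```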

Fig. 1

Comparison of structure and similarity among 15 complete chloroplast genomes from the order Lamiales. Cistanche deserticola (102,657 bp) and Epifagus virginiana (70,028 bp), both members of the Orobanchaceae, are not considered here because they are parasitic and lack chlorophyll, thus demonstrating largely reduced chloroplast genomes. Blocks with the same color represent homologous regions free of internal structural changes for that subset of taxa, and those above the centerline for each taxon are in the same orientation as in Stenogyne haliakalae, whereas those below the line are in the reverse direction. Within each block a similarity profile for the region is plotted. Areas outside of blocks are presumed to represent lineage-specific regions of the chloroplast genome. One copy of the inverted repeat has been trimmed so that homology of the remaining repeat (area shaded in light gray) can be shown.

To compare the genome structure of all of the Hawaiian and Stachys mints to each other, we conducted a separate analysis in Mauve (Fig. 2). The sequences were assumed to be collinear. The seed weight was set to 7, the gap opening penalty was set to −200, and the gap extension penalty to −30.

Fig. 2

Conservation among 11 complete mint chloroplast genomes. The sequence for Haplostachys linearifolia was excluded due to missing data. A physical map is given at the top to show gene content and organization (see Fig. 2 in [1] for gene names and products). In the lower panels, regions of the genome are represented by bars, and those that are conserved among all 11 species are colored mauve, whereas those that are conserved among subsets of the taxa have different colors. The height of the bar shows the degree of similarity.

Variability in the chloroplast genome sequences of Hawaiian mints and Stachys outgroups

We investigated variability among the three highest quality chloroplast genome sequences of the Hawaiian mints (Stenogyne haliakalae, Stenogyne bifida, Haplostachys haplostachya), and the three highest quality sequences from the Stachys outgroups (Stachys chamissonis, Stachys coccinea, and Stachys sylvatica). These species represent all of the main lineages within our samples, and were sequenced at >10× depth. We used BWA to map the reads for each of these species onto the Stenogyne haliakalae reference genome and used SAMtools and BCFtools to call SNPs with a SNP quality score >30. Annotations of the S. haliakalae reference genome were transferred to the locations of the SNPs. We compared the levels of chloroplast genome diversity among the six genomes by identifying unique, variable positions, which we refer to as potentially informative characters (PICs) [17], [18]. We did not include indels and inversions in this definition, which have been included in other analyses of chloroplast genome variability. To analyze diversity among the chloroplast genomes, we compared the number of PICs present in 1000 bp non-overlapping sliding windows across the entire chloroplast genome sequences (Table 4, see also Fig. 5 in [1]). We also compared the number of PICs per locus for coding (Table 5, Fig. 3a), intron (Table 6, Fig. 3b), and intergenic spacer and pseudogene regions (Table 7, Fig. 3c). Because this approach does not take into account the length of the locus, very long loci appear to have more PICs than shorter loci. Therefore, we also divided the number of PICs by the total length of the region to give the percent PICs per locus (Table 5, Table 6, Table 7, Fig. 4). However, very short regions may still appear to have a high percentage of variable sites, when in fact only a small number of the sites were variable. To minimize this, we have excluded regions less than 100 bp in length, and for clarity we have also excluded regions that were conserved among all six taxa.
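The window counts can be sketched as follows, assuming the input is a list of 1-based variable positions (PICs) on the reference sequence. Binning a position by integer division by the window size reproduces the window boundaries used in Table 4 (1–999, 1000–1999, and so on). The function name and the toy positions are illustrative, not part of the original analysis.

```python
from collections import Counter


def pics_per_window(positions, genome_length, window=1000):
    """Count unique variable positions in non-overlapping windows.

    positions are 1-based; window i covers i*window .. (i+1)*window - 1,
    so the first window is 1..999 when window=1000.
    """
    for pos in positions:
        if not 1 <= pos <= genome_length:
            raise ValueError(f"position {pos} lies outside 1..{genome_length}")
    counts = Counter(pos // window for pos in set(positions))
    rows = []
    for i in range(genome_length // window + 1):
        begin = max(1, i * window)
        end = min((i + 1) * window - 1, genome_length)
        rows.append((begin, end, counts.get(i, 0)))
    return rows


# Toy example: six calls (one duplicated) on a 2,500 bp sequence
for row in pics_per_window([5, 999, 1000, 1500, 1500, 2400], 2500):
    print(row)
```

Running the same function once on the Stachys positions and once on the Hawaiian positions would give the two count columns of a table laid out like Table 4.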
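
Table 4

Sliding window analysis of variability of complete chloroplast genome sequences of three Hawaiian and three Stachys taxa (Stenogyne haliakalae, Stenogyne bifida, Haplostachys haplostachya, Stachys chamissonis, Stachys coccinea, and Stachys sylvatica). PICs – Potentially informative characters.

Begin | End | # PICs Stachys | # PICs Hawaiian
1 | 999 | 8 | 1
1000 | 1999 | 7 | 2
2000 | 2999 | 6 | 4
3000 | 3999 | 11 | 1
4000 | 4999 | 7 | 0
5000 | 5999 | 5 | 1
6000 | 6999 | 10 | 1
7000 | 7999 | 8 | 0
8000 | 8999 | 6 | 2
9000 | 9999 | 9 | 0
10,000 | 10,999 | 4 | 1
11,000 | 11,999 | 3 | 0
12,000 | 12,999 | 4 | 0
13,000 | 13,999 | 9 | 4
14,000 | 14,999 | 4 | 2
15,000 | 15,999 | 1 | 0
16,000 | 16,999 | 3 | 0
17,000 | 17,999 | 6 | 0
18,000 | 18,999 | 5 | 3
19,000 | 19,999 | 2 | 0
20,000 | 20,999 | 3 | 0
21,000 | 21,999 | 0 | 0
22,000 | 22,999 | 5 | 1
23,000 | 23,999 | 1 | 0
24,000 | 24,999 | 0 | 0
25,000 | 25,999 | 1 | 0
26,000 | 26,999 | 4 | 1
27,000 | 27,999 | 5 | 3
28,000 | 28,999 | 9 | 2
29,000 | 29,999 | 4 | 1
30,000 | 30,999 | 11 | 1
31,000 | 31,999 | 5 | 1
32,000 | 32,999 | 3 | 0
33,000 | 33,999 | 3 | 1
34,000 | 34,999 | 8 | 1
35,000 | 35,999 | 1 | 0
36,000 | 36,999 | 1 | 0
37,000 | 37,999 | 5 | 0
38,000 | 38,999 | 0 | 0
39,000 | 39,999 | 1 | 0
40,000 | 40,999 | 5 | 0
41,000 | 41,999 | 3 | 1
42,000 | 42,999 | 1 | 0
43,000 | 43,999 | 5 | 2
44,000 | 44,999 | 6 | 1
45,000 | 45,999 | 10 | 0
46,000 | 46,999 | 3 | 1
47,000 | 47,999 | 7 | 0
48,000 | 48,999 | 0 | 0
49,000 | 49,999 | 6 | 1
50,000 | 50,999 | 4 | 0
51,000 | 51,999 | 2 | 0
52,000 | 52,999 | 3 | 1
53,000 | 53,999 | 5 | 3
54,000 | 54,999 | 8 | 3
55,000 | 55,999 | 8 | 2
56,000 | 56,999 | 2 | 0
57,000 | 57,999 | 6 | 1
58,000 | 58,999 | 9 | 1
59,000 | 59,999 | 4 | 2
60,000 | 60,999 | 3 | 3
61,000 | 61,999 | 7 | 1
62,000 | 62,999 | 2 | 0
63,000 | 63,999 | 6 | 1
64,000 | 64,999 | 5 | 1
65,000 | 65,999 | 6 | 1
66,000 | 66,999 | 8 | 0
67,000 | 67,999 | 12 | 0
68,000 | 68,999 | 6 | 0
69,000 | 69,999 | 0 | 0
70,000 | 70,999 | 7 | 0
71,000 | 71,999 | 5 | 3
72,000 | 72,999 | 3 | 0
73,000 | 73,999 | 3 | 1
74,000 | 74,999 | 4 | 2
75,000 | 75,999 | 2 | 1
76,000 | 76,999 | 5 | 0
77,000 | 77,999 | 5 | 0
78,000 | 78,999 | 7 | 0
79,000 | 79,999 | 7 | 1
80,000 | 80,999 | 4 | 1
81,000 | 81,999 | 10 | 0
82,000 | 82,999 | 2 | 2
83,000 | 83,999 | 0 | 1
84,000 | 84,999 | 1 | 0
85,000 | 85,999 | 1 | 0
86,000 | 86,999 | 2 | 0
87,000 | 87,999 | 0 | 0
88,000 | 88,999 | 1 | 1
89,000 | 89,999 | 0 | 0
90,000 | 90,999 | 1 | 0
91,000 | 91,999 | 0 | 0
92,000 | 92,999 | 1 | 0
93,000 | 93,999 | 0 | 0
94,000 | 94,999 | 2 | 1
95,000 | 95,999 | 1 | 0
96,000 | 96,999 | 2 | 0
97,000 | 97,999 | 0 | 0
98,000 | 98,999 | 0 | 0
99,000 | 99,999 | 1 | 0
100,000 | 100,999 | 1 | 0
101,000 | 101,999 | 1 | 0
102,000 | 102,999 | 1 | 0
103,000 | 103,999 | 0 | 0
104,000 | 104,999 | 5 | 2
105,000 | 105,999 | 1 | 0
106,000 | 106,999 | 4 | 0
107,000 | 107,999 | 9 | 0
108,000 | 108,999 | 4 | 0
109,000 | 109,999 | 16 | 3
110,000 | 110,999 | 6 | 2
111,000 | 111,999 | 10 | 1
112,000 | 112,999 | 7 | 0
113,000 | 113,999 | 8 | 3
114,000 | 114,999 | 4 | 0
115,000 | 115,999 | 5 | 0
116,000 | 116,999 | 6 | 4
117,000 | 117,999 | 9 | 4
118,000 | 118,999 | 3 | 0
119,000 | 119,999 | 5 | 0
120,000 | 120,999 | 11 | 2
121,000 | 121,999 | 10 | 1
122,000 | 122,999 | 7 | 5
123,000 | 123,999 | 11 | 2
124,000 | 124,999 | 4 | 1
Min | | 0 | 0
Max | | 16 | 5
Total | | 565 | 104
Mean | | 4.52 | 0.83
Median | | 4.00 | 0.00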

Table 5

Comparison of variability of coding genes by exon in complete chloroplast genome sequences of three Hawaiian and three Stachys taxa (Stenogyne haliakalae, Stenogyne bifida, Haplostachys haplostachya, Stachys chamissonis, Stachys coccinea, and Stachys sylvatica). Note: Exon numbers are defined by position in overall chloroplast genome sequence (not direction of gene) and exons that are completely conserved among these taxa and/or shorter than 100 bp have been excluded. PICs – Potentially informative characters.

Region | Length | # PICs per locus Stachys | # PICs per locus Hawaiian | % PICs per locus Stachys | % PICs per locus Hawaiian
psbA | 1056 | 0 | 2 | 0.00 | 0.19
matK | 1530 | 14 | 5 | 0.92 | 0.33
psbK | 186 | 1 | 0 | 0.54 | 0.00
atpA | 1524 | 5 | 1 | 0.33 | 0.07
atpF Exon1 | 411 | 1 | 0 | 0.24 | 0.00
atpI | 744 | 1 | 0 | 0.13 | 0.00
rps2 | 711 | 1 | 0 | 0.14 | 0.00
rpoC2 | 4083 | 16 | 3 | 0.39 | 0.07
rpoC1 Exon1 | 1635 | 1 | 0 | 0.06 | 0.00
rpoC1 Exon2 | 435 | 1 | 0 | 0.23 | 0.00
rpoB | 3213 | 4 | 0 | 0.12 | 0.00
psbD | 1062 | 4 | 0 | 0.38 | 0.00
psbC | 1422 | 3 | 1 | 0.21 | 0.07
psaB | 2205 | 6 | 0 | 0.27 | 0.00
psaA | 2253 | 2 | 0 | 0.09 | 0.00
rps4 | 606 | 1 | 0 | 0.17 | 0.00
ndhJ | 477 | 3 | 0 | 0.63 | 0.00
ndhK | 678 | 3 | 0 | 0.44 | 0.00
atpE | 402 | 2 | 0 | 0.50 | 0.00
atpB | 1497 | 4 | 0 | 0.27 | 0.00
rbcL | 1446 | 12 | 6 | 0.83 | 0.41
accD | 1467 | 3 | 0 | 0.20 | 0.00
ycf4 | 555 | 2 | 0 | 0.36 | 0.00
cemA | 690 | 2 | 1 | 0.29 | 0.14
petA | 963 | 2 | 1 | 0.21 | 0.10
rpl33 | 201 | 1 | 0 | 0.50 | 0.00
rpl20 | 387 | 2 | 0 | 0.52 | 0.00
psbB | 1527 | 9 | 0 | 0.59 | 0.00
petB Exon2 | 651 | 2 | 1 | 0.31 | 0.15
rpoA | 1014 | 3 | 1 | 0.30 | 0.10
rpl36 | 114 | 2 | 0 | 1.75 | 0.00
rps8 | 414 | 1 | 0 | 0.24 | 0.00
rpl14 | 369 | 2 | 0 | 0.54 | 0.00
rpl16 Exon1 | 393 | 3 | 0 | 0.76 | 0.00
rps3 | 663 | 3 | 0 | 0.45 | 0.00
rpl22 | 465 | 1 | 1 | 0.22 | 0.22
rps19 | 279 | 6 | 0 | 2.15 | 0.00
rpl2 Exon1 | 434 | 1 | 0 | 0.23 | 0.00
ycf2 | 6849 | 5 | 2 | 0.07 | 0.03
ndhB Exon1 | 756 | 1 | 0 | 0.13 | 0.00
rps7 | 468 | 2 | 1 | 0.43 | 0.21
rrn23 | 2811 | 1 | 0 | 0.04 | 0.00
ndhF | 2229 | 16 | 0 | 0.72 | 0.00
rpl32 | 177 | 2 | 0 | 1.13 | 0.00
ccsA | 970 | 3 | 2 | 0.31 | 0.21
ndhD | 1521 | 11 | 1 | 0.72 | 0.07
psaC | 244 | 1 | 0 | 0.41 | 0.00
ndhE | 306 | 1 | 2 | 0.33 | 0.65
ndhG | 531 | 1 | 0 | 0.19 | 0.00
ndhI | 507 | 2 | 0 | 0.39 | 0.00
ndhA Exon1 | 539 | 3 | 0 | 0.56 | 0.00
ndhA Exon2 | 553 | 3 | 1 | 0.54 | 0.18
ycf1 SSC | 4487 | 43 | 11 | 0.96 | 0.25
Min | 114 | 0 | 0 | 0.00 | 0.00
Max | 6849 | 43 | 11 | 2.15 | 0.65
Total | 61110 | 225 | 43 | 23.43 | 3.45
Mean | 1153 | 4.25 | 0.81 | 0.44 | 0.07
Median | 678 | 2.00 | 0.00 | 0.33 | 0.00
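The percent PICs per locus values are simply the PIC count divided by the locus length, expressed as a percentage. The sketch below reproduces that calculation together with the two exclusion rules described in the text (loci shorter than 100 bp and loci conserved among all six taxa). The function names and the two made-up loci in the example are illustrative; the matK and rps19 rows are copied from Table 5.

```python
def percent_pics(n_pics, length):
    """Percentage of positions in a locus that are potentially informative."""
    if length <= 0:
        raise ValueError("length must be positive")
    return round(100.0 * n_pics / length, 2)


def summarize(loci, min_length=100):
    """Keep loci of at least min_length bp that vary in at least one group.

    loci: iterable of (name, length, pics_stachys, pics_hawaiian).
    """
    rows = []
    for name, length, stachys, hawaiian in loci:
        if length < min_length:
            continue  # very short loci give misleading percentages
        if stachys == 0 and hawaiian == 0:
            continue  # conserved among all six taxa
        rows.append((name, length, stachys, hawaiian,
                     percent_pics(stachys, length),
                     percent_pics(hawaiian, length)))
    return rows


loci = [
    ("matK", 1530, 14, 5),         # from Table 5
    ("rps19", 279, 6, 0),          # from Table 5
    ("short_locus", 80, 2, 0),     # hypothetical, dropped: shorter than 100 bp
    ("conserved_locus", 500, 0, 0) # hypothetical, dropped: no variable sites
]
for row in summarize(loci):
    print(row)
```

The rps19 row shows why the length normalization matters: with only six PICs it has the highest percentage among the coding regions in Table 5 because the locus is short.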

Fig. 3

Comparison of the number of potentially informative characters (PICs) per locus in (a) protein coding regions, (b) introns, and (c) intergenic spacers and pseudogenes for chloroplast genomes of three Hawaiian and three Stachys taxa: Stenogyne haliakalae, Stenogyne bifida, Haplostachys haplostachya, Stachys chamissonis, Stachys coccinea, and Stachys sylvatica. Regions that were completely conserved among these species and/or that are shorter than 100 bp are not shown. Pink=three Hawaiian taxa; Black=three Stachys taxa. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Table 6

Comparison of variability of introns in complete chloroplast genome sequences of three Hawaiian and three Stachys taxa (Stenogyne haliakalae, Stenogyne bifida, Haplostachys haplostachya, Stachys chamissonis, Stachys coccinea, and Stachys sylvatica). Note: Exon and intron numbers are defined by position in the overall chloroplast genome sequence (not direction of gene). The trnK-UUU intron contains the gene matK. Here trnK-UUU-1 refers to the region between exon 1 and matK, and trnK-UUU-2 refers to the region between matK and exon2. PICs – Potentially informative characters.

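Region | Length | # PICs per locus Stachys | # PICs per locus Hawaiian | % PICs per locus Stachys | % PICs per locus Hawaiian
trnK-UUU-1 | 264 | 5 | 0 | 1.89 | 0.00
trnK-UUU-2 | 714 | 5 | 0 | 0.70 | 0.00
rps16 | 874 | 5 | 1 | 0.57 | 0.11
trnG-UCC | 689 | 7 | 0 | 1.02 | 0.00
atpF | 655 | 3 | 0 | 0.46 | 0.00
rpoC1 | 762 | 4 | 1 | 0.52 | 0.13
ycf3-1 | 724 | 0 | 1 | 0.00 | 0.14
ycf3-2 | 712 | 2 | 0 | 0.28 | 0.00
trnL-UAA | 487 | 4 | 0 | 0.82 | 0.00
trnV-UAC | 478 | 1 | 1 | 0.21 | 0.21
clpP-1 | 615 | 6 | 0 | 0.98 | 0.00
clpP-2 | 657 | 4 | 0 | 0.61 | 0.00
petB | 717 | 2 | 0 | 0.28 | 0.00
petD | 727 | 4 | 2 | 0.55 | 0.28
rpl16 | 906 | 6 | 0 | 0.66 | 0.00
rpl2 | 657 | 2 | 2 | 0.30 | 0.30
trnI-GAU | 937 | 1 | 0 | 0.11 | 0.00
ndhA | 1019 | 10 | 7 | 0.98 | 0.69
Min | 264 | 0 | 0 | 0.00 | 0.00
Max | 1019 | 10 | 7 | 1.89 | 0.69
Sum | 12594 | 71 | 15 | 10.95 | 1.86
Mean | 699.7 | 3.94 | 0.83 | 0.61 | 0.10
Median | 713 | 4.00 | 0.00 | 0.56 | 0.00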

Table 7

Comparison of variability of intergenic spacers and pseudogenes in complete chloroplast genome sequences of three Hawaiian and three Stachys taxa (Stenogyne haliakalae, Stenogyne bifida, Haplostachys haplostachya, Stachys chamissonis, Stachys coccinea, and Stachys sylvatica). Note: Regions that are completely conserved among these taxa and/or shorter than 100 bp have been excluded. PICs – potentially informative characters.

Region | Length | # PICs per locus Stachys | # PICs per locus Hawaiian | % PICs per locus Stachys | % PICs per locus Hawaiian
trnH-GUG-psbA | 323 | 8 | 1 | 2.48 | 0.31
psbA-trnK-UUU | 244 | 2 | 0 | 0.82 | 0.00
trnK-UUU-rps16 | 643 | 5 | 0 | 0.78 | 0.00
rps16-trnQ-UUG | 828 | 10 | 1 | 1.21 | 0.12
psbK-psbI | 374 | 7 | 0 | 1.87 | 0.00
psbI-trnS-GCU | 121 | 1 | 1 | 0.83 | 0.83
trnS-GCU-trnG-UCC | 694 | 5 | 1 | 0.72 | 0.14
trnG-UCC-trnR-UCU | 160 | 2 | 0 | 1.25 | 0.00
atpF-atpH | 365 | 5 | 1 | 1.37 | 0.27
atpH-atpI | 975 | 8 | 5 | 0.82 | 0.51
rps2-rpoC2 | 337 | 2 | 0 | 0.59 | 0.00
rpoC2-rpoC1 | 154 | 1 | 0 | 0.65 | 0.00
rpoB-trnC-GCA | 1098 | 6 | 4 | 0.55 | 0.36
trnC-GCA-petN | 436 | 4 | 0 | 0.92 | 0.00
petN-psbM | 509 | 5 | 2 | 0.98 | 0.39
psbM-trnD-GUC | 518 | 3 | 0 | 0.58 | 0.00
trnE-UUC-trnT-GGU | 570 | 7 | 1 | 1.23 | 0.18
trnT-GGU-psbD | 1075 | 8 | 1 | 0.74 | 0.09
psbC-trnS-UGA | 232 | 3 | 1 | 1.29 | 0.43
trnS-UGA-psbZ | 328 | 3 | 0 | 0.91 | 0.00
psbZ-trnG-GCC | 224 | 2 | 0 | 0.89 | 0.00
trnG-GCC-trnfM-CAU | 149 | 1 | 0 | 0.67 | 0.00
psaA-ycf3 | 730 | 7 | 0 | 0.96 | 0.00
ycf3-trnS-GGA | 316 | 2 | 2 | 0.63 | 0.63
trnS-GGA-rps4 | 162 | 1 | 0 | 0.62 | 0.00
rps4-trnT-UGU | 385 | 3 | 1 | 0.78 | 0.26
trnT-UGU-trnL-UAA | 690 | 9 | 0 | 1.30 | 0.00
trnL-UAA-trnF-GAA | 295 | 3 | 0 | 1.02 | 0.00
trnF-GAA-ndhJ | 658 | 0 | 1 | 0.00 | 0.15
ndhC-trnV-UAC | 944 | 5 | 0 | 0.53 | 0.00
trnV-UAC-trnM-CAU | 188 | 1 | 0 | 0.53 | 0.00
trnM-CAU-atpE | 213 | 2 | 0 | 0.94 | 0.00
atpB-rbcL | 801 | 1 | 1 | 0.12 | 0.12
rbcL-accD | 707 | 7 | 2 | 0.99 | 0.28
accD-psaI | 397 | 1 | 1 | 0.25 | 0.25
psaI-ycf4 | 434 | 7 | 0 | 1.61 | 0.00
ycf4-cemA | 306 | 5 | 1 | 1.63 | 0.33
cemA-petA | 217 | 1 | 1 | 0.46 | 0.46
petA-psbJ | 1059 | 9 | 3 | 0.85 | 0.28
psbE-petL | 915 | 6 | 1 | 0.66 | 0.11
petL-petG | 174 | 2 | 0 | 1.15 | 0.00
petG-trnW-CCA | 125 | 1 | 0 | 0.80 | 0.00
trnP-UGG-psaJ | 270 | 2 | 1 | 0.74 | 0.37
psaJ-rpl33 | 478 | 5 | 0 | 1.05 | 0.00
rpl33-rps18 | 146 | 0 | 1 | 0.00 | 0.68
rps18-rpl20 | 216 | 5 | 0 | 2.31 | 0.00
rpl20-rps12 | 739 | 10 | 0 | 1.35 | 0.00
psbB-psbT | 184 | 1 | 1 | 0.54 | 0.54
psbH-petB | 124 | 1 | 0 | 0.81 | 0.00
petB-petD | 194 | 1 | 0 | 0.52 | 0.00
infA-rps8 | 123 | 1 | 0 | 0.81 | 0.00
rps8-rpl14 | 173 | 3 | 0 | 1.73 | 0.00
rpl14-rpl16 | 126 | 1 | 0 | 0.79 | 0.00
rpl16-rps3 | 141 | 1 | 1 | 0.71 | 0.71
rps12-3' end-trnV-GAC | 1448 | 3 | 0 | 0.21 | 0.00
trnA-UGC-rrn23 | 199 | 1 | 0 | 0.50 | 0.00
rrn4.5-rrn5 | 224 | 2 | 1 | 0.89 | 0.45
rrn5-trnR-ACG | 234 | 2 | 1 | 0.85 | 0.43
trnR-ACG-trnN-GUU | 569 | 2 | 0 | 0.35 | 0.00
ycf1 Truncated | 1059 | 1 | 0 | 0.09 | 0.00
ndhF-rpl32 | 435 | 5 | 2 | 1.15 | 0.46
rpl32-trnL-UAG | 737 | 13 | 2 | 1.76 | 0.27
ccsA-ndhD | 180 | 4 | 0 | 2.22 | 0.00
ndhD-psaC | 105 | 1 | 0 | 0.95 | 0.00
psaC-ndhE | 251 | 5 | 0 | 1.99 | 0.00
ndhG-ndhI | 375 | 6 | 0 | 1.60 | 0.00
rps15-ycf1 SSC | 389 | 3 | 0 | 0.77 | 0.00
Min | 105 | 0 | 0 | 0.00 | 0.00
Max | 1448 | 13 | 5 | 2.48 | 0.83
Sum | 29192 | 250 | 43 | 62.72 | 10.44
Mean | 435.7 | 3.73 | 0.64 | 0.94 | 0.16
Median | 328.0 | 3.00 | 0.00 | 0.82 | 0.00

Fig. 4

Comparison of the % PICs (potentially informative characters) per locus in (a) protein coding regions, (b) introns, and (c) intergenic spacers and pseudogenes for chloroplast genomes of three Hawaiian and three Stachys taxa: Stenogyne haliakalae, Stenogyne bifida, Haplostachys haplostachya, Stachys chamissonis, Stachys coccinea, and Stachys sylvatica. Regions that were completely conserved among these species or that are shorter than 100 bp are not shown. Pink=three Hawaiian taxa; Black=three Stachys taxa. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

To identify the most variable regions of the mint chloroplast genome for targeted re-sequencing and high resolution phylogenetic analyses, reads from all 15 taxa subjected to shotgun sequencing (including the partial genomes from all of the historical samples, except Phyllostegia variabilis) were mapped to the Stenogyne haliakalae reference sequence using BWA. SNPs were called with SAMtools and BCFtools, filtering out those with SNP quality <30. We selected a total of 108 variable loci (see Fig. 2 in [1] for a diagram of locations) identified from single copy regions, including (1) all the regions that had a variant position among the Hawaiian mints (except where every individual had an alternative allele as compared to the reference sequence) and (2) additional regions that had variant positions among at least two of the Stachys species. For each SNP, 100 bp of flanking sequence on either side was retrieved from the reference genome, and PCR primers were designed using BatchPrimer3 [19], with further manual examination for quality control (e.g. to ensure that primer sequences did not fall into a gap for one of the other taxa). Sequences complementary to the Illumina sequencing adapters were appended to the end of each primer (Table 8) so that sequencing libraries could be prepared directly from the cleaned multiplex PCR products. Overall, these regions represent roughly 20,000 bp of sequence from the chloroplast genome and contain additional variable sites beyond the initial targeted SNP.
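Two of the steps above are easy to illustrate: retrieving the flanking window around a SNP and prepending the adapter tail to a locus-specific primer. The sketch below uses the two tail sequences given in the Table 8 caption and the Mint204 locus-specific sequences from Table 8. The function names are hypothetical, and clamping the window at the ends of the reference is an assumption about how positions near the ends would be handled.

```python
# Tail sequences as given in the Table 8 caption
FORWARD_TAIL = "TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG"
REVERSE_TAIL = "GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG"


def flank_window(snp_pos, genome_length, flank=100):
    """1-based inclusive coordinates of the SNP plus `flank` bp on each side,
    clamped to the ends of the reference (clamping is an assumption)."""
    if not 1 <= snp_pos <= genome_length:
        raise ValueError("SNP position lies outside the reference")
    return max(1, snp_pos - flank), min(genome_length, snp_pos + flank)


def tail_primer(locus_specific, direction):
    """Prepend the forward ('F') or reverse ('R') tail to a locus-specific primer."""
    tails = {"F": FORWARD_TAIL, "R": REVERSE_TAIL}
    if direction not in tails:
        raise ValueError("direction must be 'F' or 'R'")
    return tails[direction] + locus_specific.replace(" ", "").upper()


# Locus-specific parts of Mint204F and Mint204R, as listed in Table 8
print(tail_primer("CGC CCC TCT ACT ATA ATG AAT GA", "F"))
print(tail_primer("AC AGG ATC CAG AAA AAG AAA GA", "R"))
print(flank_window(50, 149736))  # (1, 150)
```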

Table 8

Tailed primers used for multiplex amplification and targeted re-sequencing in mints. Sequences complementary to the Illumina sequencing adapters were appended to the end of each primer (Forward: 5’ TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG-[locus specific sequence] and Reverse: 5’ GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG-[locus specific sequence]). The loci within each region are also indicated. When the names of two genes are given with an underscore between them, the intergenic spacer between these genes is included in the amplified region.

Primer | Sequence 5’ – 3’ | Tm | Loci in region (excluding priming sites)
Mint204F | TCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG CGC CCC TCT ACT ATA ATG AAT GA | 69.3 | trnH-GUG_psbA
Mint204R | GTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GAC AGG ATC CAG AAA AAG AAA GA | 68.1 |
Mint771F | TCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG CCA TGA GCG GCT ACG ATA TT | 70.3 | psbA
Mint771R | GTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GCA GGC TGA GCA CAA CAT TCT | 70.2 |
Mint867F | TCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG GGA ACC ATG CAT AGC ACT GA | 70.4 | psbA
Mint867R | GTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GTA CCC AAT CGG TCA AGG AAG | 69.1 |
Mint1170F | TCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG CAC TCA CGA CCC ATG TAA CAA | 70.1 | psbA
Mint1170R | GTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GTG AGC CTG TTT CTG GAT CTC T | 69.5 |
Mint1939F | TCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG TCA AAA CAA AAG TTG AAT ACT CAG TTG | 67.6 | trnK-UUU Intron1, matK
Mint1939R | GTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GCG GAT TTG GTA TTT GGA TAT GA | 68.3 |
Mint3998F | TCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG GCT TCC CGT ATC AGG CAC T | 71.2 | trnK-UUU Intron2
Mint3998R | GTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GTC GAA TTC TTG GAA CGG AAC | 68.6 |
Mint4546F | TCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG CAG AAT TGT CAA AAT GTA TAG AGC A | 68.2 | trnK-UUU Exon2_rps16 Exon1
Mint4546R | GTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GCA TCG TGA TAA GCG ATC TGG | 69.2 |
Mint4714F | TCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG GAC CCA GAT CGC TTA TCA CG | 70.1 | trnK-UUU Exon2_rps16 Exon1
Mint4714R | GTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GTG TCA ATC CAA GAC AAT TTT GAA | 67.8 |
Mint5533F | TCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG TCC GAT CCA GTT ATT GAG ACG | 69.0 | rps16 Intron
Mint5533R | GTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GCG GGA ATC GAC TGT CCA TAG | 69.7 |
Mint5985F | TCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG GCC CCC GAG AAA TGA ATT A | 69.9 | rps16 Intron
Mint5985R | GTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GTA GAA AGC AAC GTG CGA CTT | 69.3 |
Mint6787F | TCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG TTC CAA TTT TGC ATT CGA GTC | 68.5 | rps16 Exon2_trnQ-UUG
Mint6787R | GTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GAA AAG GGT GAG TGG GTA GGA | 69.7 |
Mint7334F | TCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG TCA AAA ACG CAA CCA AAA TG | 68.4 | trnQ-UUG_psbK, psbK
Mint7334R | GTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GTG GCA TAA CAT CCA CGA TTG | 68.8 |
Mint8052F | TCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG GGT TTC CTT GGG TTT GGT TA | 69.9 | psbI-trnS-GCU, trnS-GCU
Mint8052R | GTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GGG AGA GAT GGC TGA GTG GAC | 70.3 |
Mint8691F | TCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG CCG AAC TCA AAA ATA AAC TGT CG | 68.7 | trnS-GCU_trnG-UCC Exon1
Mint8691R | GTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GCA AAA CGA GAA CGT TGC ACT | 69.7 |
Mint8791F | TCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG ATG AAG CCT CTT TCC CGA A | 69.8 | trnS-GCU_trnG-UCC Exon1, trnG-UCC Exon1
Mint8791R | GTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GCG GTT ACT AGA ACG AAT CAC ACT TT | 68.9 |
Mint10381F | TCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG TGC GCC AAT TCC AAT TTT A | 69.0 | atpA
Mint10381R | GTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GTC AAT CGG GAG ATG TTT CG | 68.7 |
Mint10531F | TCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG CAT TAA TGG CGG GTC TGA TT | 69.9 | atpA
Mint10531R | GTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GCT TTT GGA AAG AGC CGC TAA | 69.6 |
Mint11293F | TCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG CTG AGC AAT TCT TCC TGT TGC | 69.9 | atpA
Mint11293R | GTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GGG CGA TGG TAT TGC TCG TAT | 69.8 |
Mint13441F | TCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG CAG CAG CAA TAA CGG AAG C | 70.3 | atpH, atpH_atpI
Mint13441R | GTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GCG AAG TCG TTC TGA TGA TTC AA | 68.8 |
Mint13731F | TCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG AGC CAC GAC GAT ATG AAA GG | 69.4 | atpH_atpI
Mint13731R | GTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GAA AGT GGA TTG GTT GTC GAA | 68.7 |
Mint14054F | TCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG GAT TGA TCT AAG TTC ATG CAA TTT TT | 67.8 | atpH_atpI
Mint14054R | GTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GTT GTC CAC TTA ATA TCC TAC CTT TCC | 67.9 |
Mint16824F | TCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG TGA AAT CAA GAC CTT TGA TGT TAT T | 67.7 | rpoC2
Mint16824R | GTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GGA GGG TTG GAA CGA ACG TAT | 69.8 |
Mint17255F | TCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG CCG GAG TGG CCA AAT AAG T | 70.8 | rpoC2
Mint17255R | GTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GTT GGT ATT TTC TCC ATC CCA AT | 68.3 |
Mint17928F | TCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG TGG ATT TCA GCA AGT CGA TTC | 68.9 | rpoC2
Mint17928R | GTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GCC TTT ATG GAA ATG GGA AAC C | 68.8 |
Mint20423F | TCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG ATG CCA TTC CGA AGT GAT CT | 69.6 | rpoC2, rpoC2_rpoC1 Exon1
Mint20423R | GTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GCG CGA ATC AAG ATC GAG AAC | 69.3 |
Mint20572F | TCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG CGA TTT GAC AAT GGG TTT GA | 69.4 | rpoC2_rpoC1 Exon1, rpoC1 Exon1
Mint20572R | GTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GAA GCC ATA CAG GGG TTT TCC | 69.4 |
Mint21910F | TCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG GAG CTC ATT AAT TTA CCC CCA TC | 69.0 | rpoC1 Exon1
Mint21910R | GTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GGG GAA TGA ATG GGA AGA TCA | 69.2 |
Mint22486F | TCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG TCA AGT TAC CAG TGA AGA CTA AGC A | 69.0 | rpoC1 Intron
Mint22486R | GTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GGA TTC ATC ATT CGA AGG GAA GT | 68.8 |
Mint23294F | TCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG AAA TTC GAC TCC GCA TTG TT | 69.2 | rpoC1 Exon2
Mint23294R | GTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GGC CCA ATG GAG AGA TAG TCG | 69.7 |
Mint23536F | TCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG AAT CCA ATT CGG AGC TGT TG | 69.1 | rpoB
Mint23536R | GTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GTG GTC AAG TCA TAA AGT CAA ATA AA | 67.2 |
Mint24088F | TCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG GGC GCT ATT CGA TAA TGT CTG | 69.3 | rpoB
Mint24088R | GTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GGG AAG ACA CGG AWA CAA AGG | 69.3 |
Mint24204F | TCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG CAA CAG GTC TTC CAT CTT GC | 69.9 | rpoB
Mint24204R | GTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GCC GGG TTA TTG ATG TGA GGT | 70.0 |
Mint27582F | TCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG CTG ATT GGT TCA GTC TCA GTT TTT | 69.1 | rpoB_trnC-GCA
Mint27582R | GTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GGC GAG AGA ATG TTT TTA GCA TTG | 68.4 |
Mint27884F | TCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG GCA AAT CCT TTT TCC CCA GT | 70.1 | trnC-GCA, trnC-GCA_petN
Mint27884R | GTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GAA CGG GGC TTC ACA ATC TTT | 69.5 |
Mint28672F | TCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG AAG AGA TTC GGA TGA TTG GAA A | 68.4 | petN_psbM
Mint28672RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GCG AGC GCA CTA TAA TCA GCA69.9
Mint28881FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG GCT GAT TAT AGT GCG CTC GTT70.0petN_psbM, psbM
Mint28881RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GTC CTA CCG CCT TTC TGC TTA69.7
Mint30267FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG GGG GAG GAG TAG AAT CTC TTC A69.9trnE-UUC_trnT-GGU
Mint30267RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GTT GTT TCA AGA CCA GCC CTA69.3
Mint31916FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG GGG GTT GGT TCA CAG GTA CA71.0psbD
Mint31916RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GCC ACC TAA TTG ACA CCA ACG69.5
Mint32003FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG GGT CCT GAA GCA CAA GGA GA70.9psbD
Mint32003RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GGC AAT TGG ACC AGA GAA TGC69.6
Mint32348FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG TTC ATA ATT GGA CGC TGA ACC68.9psbD
Mint32348RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GGC GGT CAC CAT GGA ATA AGT70.0
Mint33303FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG GGG AGG GGG AGA TGT AAG AA70.6psbC
Mint33303RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GGA TAT GCC AGA TTC CAC CAA G69.1
Mint34416FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG GGG ATT CGA ACC CTC GAT A70.1trnS-UGA, trnS-UGA_psbZ
Mint34416RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GAC CGG TCG GTA GAT TCA CAC69.6
Mint35190FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG AAT GCG GAT ATG GTC GAA TG68.9trnG-GCC, trnG-GCC_trnfM_CAU
Mint35190RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GGG TTT GGC TCT TAC CCC TTT70.1
Mint35459FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG AGG ATT TGA ACC CGT GAC CT70.2trnfM-CAU, trnfM-CAU_rps14
Mint35459RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GAC GGT CGA CGA TCA TAA AGG68.9
Mint36179FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG CCA ATC TTG CTT GCA CAA TG69.7psaB
Mint36179RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GCT GGG CGT GGA TGT TCT TAT70.0
Mint36649FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG GTC CCA TGC CGA AAT ATC AC69.7psaB
Mint36649RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GGC CTG GAG ACT TTT TGG TTC69.5
Mint36907FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG CAT TGA GCA AAT ATG GGT TCG69.2psaB
Mint36907RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GGA GAT TAC AAC CCG GAG CAA69.9
Mint38366FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG TGC CAT AAT GCC TTT CAA ATC68.6psaB-psaA, psaA
Mint38366RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GAT CCA TCG TTT GGG CTC AT69.5
Mint41292FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG TAT TCG AAA CGC CTC GTG AT69.5ycf3 Exon1, ycf Intron1
Mint41292RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GCG GTT CTA AGG GAA GGG ATT69.9
Mint43421FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG TTG GAG CCT CGA AAG AAA GA69.5ycf3 Exon3_trnS-GGA
Mint43421RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GAC TCG GCC ATC TCT CCT ACA70.0
Mint47802FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG GAT GGC GTT TGA TAG AGG AAT C69.1ndhK
Mint47802RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GTT GTC CAC CTA AAC CGG AAG69.3
Mint50255FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG TTG GTA CCT AAA CGG GCA CT70.3trnV-UAC Intron, trnV-UAC Exon2
Mint50255RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GTC AGT TGG TAG AGC ACC TCG T70.0
Mint51642FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG TTC GGA TAA TTC GTC CAA CC68.9atpB
Mint51642RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GTG CCA AAG GGA TCT ATC CAG69.2
Mint51809FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG AGG ATC GGT CAA ATC GTC TG69.3atpB
Mint51809RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GTC GAC AAT ATC TTC CGT TTC G68.2
Mint52064FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG GGA AAT ATT CCG CCA TCG TT69.8atpB
Mint52064RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GTA TCC GTA TTT GGC GGA GTG69.1
Mint53793FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG GAC AAC TGT GTG GAC CGA TG70.3rbcL
Mint53793RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GAG CAC GTA GGG CTT TGA ATC69.3
Mint53955FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG AAA GCC CTA CGT GCT CTA CG70.0rbcL
Mint53955RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GAG CAG ATA ACC CCA ATT TCG68.7
Mint54300FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG TTT ATG CGT TGG AGA GAT CG68.8rbcL
Mint54300RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GAT TGG CAG TGA ATC CTC CTG69.3
Mint54974FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG GAC GTG ATC TTG CTG CTG AG70.3rbcL, rbcL_accD
Mint54974RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GGA TTG GGC CGA GTT TAA TTG68.9
Mint55144FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG CAA TTA AAC TCG GCC CAA TC69.4rbcL_accD
Mint55144RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GTT GTG GAT CCA AGA CAC CAA69.4
Mint55263FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG CGA AGA CTC CCA TTT TTC TCA69.5rbcL_accD
Mint55263RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GGT CTA TTC CAA CAC GGA ACG A69.5
Mint56316FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG CGC CCA CTG TAA GTG ATA GC70.3accD
Mint56316RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GGC ATT GAA CCC ACA ACT GCT70.4
Mint57465FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG TTT TTG AGT TCT ACA TTC CTT GGA C68.2accD_psaI
Mint57465RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GTG GGT ACC TCG ATT TAC TAT TTG T68.2
Mint58710FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG TGA TGA GAA TTT GAC TCC ACG A69.0ycf4, ycf4_cemA
Mint58710RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GTG TGA TGA TCA AAA AGT CGA TTG67.7
Mint59661FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG TCC GTC ACT TGT AGT TAT TTA TCA TTC67.6cemA, cemA_petA
Mint59661RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GTT CCC TGT CCA AGA TTC TGC69.3
Mint60529FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG AGA TTT ATC CCG ACG GAA GC69.4petA
Mint60529RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GTT GAT GGA TTC ACC CTC TGA69.1
Mint60945FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG CGG ATT TAT GGA CAT CAG GTT69.5petA_psbJ
Mint60945RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GTG GTC AAG TCA TAA AGT CAA ATA AA67.2
Mint61133FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG GAC TTG ACC ACC CCC TTC TT71.0petA_psbJ
Mint61133RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GAT TCT TTG TCC CAC GCA TTC68.8
Mint62265FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG CCC CCA GTA GAG ACT GGT ACG71.0psbL, psbL_psbF, psbF
Mint62265RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GAG TAC GCT GGT TGG CTG TTC70.0
Mint64488FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG AAA ACA AAC GCG CTA CCA AG69.4trnP-UGG, trnP-UGG_psaJ
Mint64488RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GCC AAT TGA AAT GTA AAA CGC TCT68.6
Mint65276FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG TTT ACT ATG GCT TTG CTT TGA TTT68.0psaJ_rpl33, rpl33
Mint65276RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GTT CCA AAA TCA CCG TTA CCC68.8
Mint65764FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG CGG GGG ATC GAA TTG ATT AT69.6rps18
Mint65764RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GGG TCG ACT CGA TTC TTT CAA AT68.8
Mint68295FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG TCT CTC GAT ACA TAA TCG AAT CTT TT67.5clpP Intron1, clpP Exon2
Mint68295RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GTG GTT GGA GGA GAA ATT ACC A69.0
Mint68800FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG CCT TTT GGT GCA TAC GGT TC70.1clpP Intron2
Mint68800RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GCC ATC GTG ATT TGG ATT GAA68.9
Mint71208FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG ATC GTG CGA CTT TGA AAT CC69.1psbB
Mint71208RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GCC CAA GTT TTT GGA ATG CTC69.2
Mint71701FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG CGA GAA CCA CCT AAA GTT CCA70.0psbT, psbT_psbN, psbN
Mint71701RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GCC GGG TAC GCC TTA TAT ACC69.6
Mint72221FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG GAC TCC TTT GAT GGG TGT CG70.2psbH, psbH_petB Exon1
Mint72221RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GGG CCG CAA ATT TGA GTT CTA69.5
Mint73489FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG TTG ACT TGG GTT ACG GGT GT70.3petB Exon2
Mint73489RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GCG AGT CAA GGT GGA TTG TCC70.0
Mint74004FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG TTA TGG GAG TGT GCG ACT TG69.6petD Intron
Mint74004RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GCG GAG CCT ACT CAT GTA CAA C69.5
Mint74125FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG CAG AAG ATG GGC TGG TTC AC70.6petD Intron
Mint74125RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GAA TGC AGA GGA AAT GAA TGC68.3
Mint75956FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG CGA CAT AAG GCG GTA AGA TGA69.9rpoA
Mint75956RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GTG AGA ATG TCC CGC ATG AAT69.4
Mint76974FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG TCT GCG GAT TAG TCG ACA TTT69.3rpl36
Mint76974RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GCC GAA ACA AGG ATT CGA AAG68.9
Mint79374FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG ATT GCT TTC CGG TTC ATT TC68.6rpl16 Intron
Mint79374RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GCA AGA GCT TCG AGC CAA TAA69.4
Mint79823FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG AAT ATG AAG CGA TGG GTT GG69.0rpl16 Intron, rpl16 Exon2, rpl16 Exon2_rps3
Mint79823RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GTG GCC AAT CAA ACA AAT TCC68.6
Mint80282FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG TGC CTG TTC AGT CAA TTC AA69.2rps3
Mint80282RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GCG CGA GGA ATC GAA GAA TTA69.1
Mint80833FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG TAG CCC GGG GTT TTA ATT TC69.2rpl22
Mint80833RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GCC GTT CCT ACG AGG AAA CAC69.9
Mint106808FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG TGG ACC CGT TTC TGA AGA GT70.1ycf1Truncated, ycf1Truncated_ndhF, ndhF
Mint106808RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GTT TTT GGA GAA GGG ATC AAA68.1
Mint107075FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG CCC ATC GTT TTC TTT TTG GA69.4ndhF
Mint107075RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GGG ATT CGG CAA GTT GGT ATG69.4
Mint107435FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG TCG TAT TGG CGG ATT CAT AA68.8ndhF
Mint107435RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GGG GGT AAA GGG TAT TCC AAA A69.1
Mint107714FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG TGA ATG TTT AAA TGC CCC TCA69.0ndhF
Mint107714RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GAG GTA CAC TTT CGC TTT GTG G69.1
Mint108575FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG CGC TTT TTG ACA AGC ATT TG69.1ndhF
Mint108575RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GTG GCT CAC GAT CAA GGA TAC69.1
Mint109649FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG GTA TCG GGC AGC GTT AAA AG69.7rpl32, rpl32_trnL-UAG
Mint109649RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GAT TCC CCG TTG AAG GAA ATG68.8
Mint110066FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG TCT CGC TAT CAA TCC ACA CAA69.2rpl32_trnL-UAG
Mint110066RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GTA GAA CCC TCC CTC CCC AAA70.4
Mint110450FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG GGT AGA CAC GCT GCT CTT AGG70.4trnL-UAG, trnL-UAG_ccsA, ccsA
Mint110450RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GTC GAA ACG ATC GAA AAA GAA67.9
Mint111053FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG CTA TGC GGC CCT TTT ATG TG70.0ccsA
Mint111053RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GTT GGA ATT CAC CAA CGA AAA68.3
Mint113090FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG GGA TCA TCC GAT TGA AAA TGA68.8ndhD
Mint113090RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GAC GAA CCA TTT TCC TTG CTT68.9
Mint113960FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG TTC AGC GGC TGC AAT AGT TA69.7ndhE
Mint113960RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GTG CCT ATT TAT TTT CCA TTG GT67.9
Mint116500FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG CCG TTA CCG TCG CTA TTA CAG69.9ndhA Intron
Mint116500RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GGC AGA CAG AAT TCC ATT GGT C69.2
Mint116861FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG TTG CAA TTC TCG TTT TTG GA68.7ndhA Intron
Mint116861RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GGA ATT GGG GCT TTA AGT TGG T69.4
Mint116914FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG TGG TGG ATA GGA ACA TAC TCT GG69.3ndhA Intron
Mint116914RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GTG GAT GGT TAG GAA GAC CAA A69.0
Mint117269FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG CAG CAA AAT TTT AAG CCG TTT T68.9ndhA Intron
Mint117269RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GCG TGT GAT TCG GTG AGA CAT69.9
Mint117433FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG TGT CTC ACC GAA TCA CAC GTA69.7ndhA Exon2
Mint117433RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GCG CCG ATC TTA GTA TTG GTG T69.6
Mint119376FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG TTT TGA CAA ATA GGC CAG CA69.3rps15
Mint119376RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GTG CAT CGA TTT CGG TTA TTT C68.0
Mint122192FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG CAT TTC TTG GTT TTC GAA TTT TT67.9ycf1 single copy
Mint122192RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GGC ATC TCC GAG TTG GAC AAA70.0
Mint122492FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG AAA AAC ACT TTT GAG AAC CCA TTT68.2ycf1 single copy
Mint122492RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GAA CCG AAC TCC CCT TTT GTT69.4
Mint122638FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG TTT TTC GGG GTG AAC AAA AG68.8ycf1 single copy
Mint122638RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GTG GTT GAC GGA TGG TAT TCA69.1
Mint123159FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG TTT GAA ATG CTT CCC CCT TA69.2ycf1 single copy
Mint123159RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GCG GCA CGG TAT AAT CAA AGG69.4
Mint124145FTCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG TTC GAT TAT AGG CGG GGA TA69.1ycf1 single copy
Mint124145RGTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GAT CGC TGG AAT CGA CCA TTT69.3
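
Every forward primer listed above begins with the same 33-nt 5' tail, and every reverse primer with the same 34-nt 5' tail; these match the standard Illumina overhang adapter sequences used in two-step amplicon library preparation. For anyone reusing the primers with a different library strategy, the following is a minimal Python sketch of how the locus-specific portion could be separated from the tail. The names `split_primer` and `gc_fraction` are hypothetical helpers introduced here for illustration and are not part of the original data set.

```python
# Shared 5' tails, copied from the primer sequences in the table
# (spaces removed). Treating them as removable adapter tails is an
# assumption of this sketch.
FORWARD_TAIL = "TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG"
REVERSE_TAIL = "GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG"


def split_primer(name: str, sequence: str) -> tuple[str, str]:
    """Return (tail, locus_specific) for a primer written as spaced triplets.

    The primer direction is taken from the last character of its name
    ("F" or "R"), following the naming used in the table.
    """
    seq = sequence.replace(" ", "").upper()
    if name.endswith("F"):
        tail = FORWARD_TAIL
    elif name.endswith("R"):
        tail = REVERSE_TAIL
    else:
        raise ValueError(f"cannot infer direction from primer name {name!r}")
    if not seq.startswith(tail):
        raise ValueError(f"{name} does not start with the expected tail")
    return tail, seq[len(tail):]


def gc_fraction(seq: str) -> float:
    """Fraction of unambiguous G or C bases; IUPAC codes such as W are not counted."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


if __name__ == "__main__":
    pair = {
        "Mint10381F": "TCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG TGC GCC AAT TCC AAT TTT A",
        "Mint10381R": "GTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GTC AAT CGG GAG ATG TTT CG",
    }
    for name, sequence in pair.items():
        _, locus = split_primer(name, sequence)
        print(name, locus, len(locus), round(gc_fraction(locus), 2))
```

The sketch only separates and summarizes sequences; it does not recompute the melting temperatures reported in the table, since the method used to derive them is not stated here.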
Subject area: Biology, genetics, genomics
More specific subject area: Molecular phylogenetics and evolution
Type of data: Tables and figures
How data was acquired: High-throughput sequencing of contemporary and herbarium samples was conducted on the Illumina HiSeq 2500 and MiSeq platforms, followed by both de novo and reference-guided assembly, read mapping, and functional annotation
Data format: Raw and analyzed
Experimental factors: De novo assemblies were created using SOAPdenovo and reference-guided assemblies were created using YASRA. Sequences were functionally annotated using DOGMA. SNPs were called and filtered using SAMtools and BCFtools
Experimental features: Data include chloroplast genome gene content, structure, and comparisons of variable loci in a suite of recently diverged species and outgroups
Data source location: Hawaii, North America, South America, Europe, Africa, and Asia
Data accessibility: Data are published with this article
