Cleiziane Bispo da Silva1, Hellen Ribeiro Martins Dos Santos1, Phellippe Arthur Santos Marbach2, Jorge Teodoro de Souza3, Valter Cruz-Magalhães1,3, Ronaldo Costa Argôlo-Filho1, Leandro Lopes Loguercio1.
Abstract
BACKGROUND: Intragenomic variability in 16S rDNA is a limiting factor for taxonomic and diversity characterization of Bacteria, and studies on its occurrence in natural/environmental populations are scarce. In this work, direct DNA amplicon sequencing coupled with frequent-cutter restriction analysis allowed detection of intragenomic 16S rDNA variation in culturable endophytic bacteria from cacao seeds in a fast and attractive manner.
Keywords: 16S rDNA; Bacterial diversity; Chimerical sequences; Genomic databases; PCR limitations; Polymicrobial samples
Year: 2019 PMID: 31768299 PMCID: PMC6874854 DOI: 10.7717/peerj.7452
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 2.984
Table 1. Approximate identification of culturable endophytic bacterial isolates from cacao, based on direct amplicon sequencing of the 16S rDNA V5–V9 hypervariable region.
| Electropherogram group (no. of isolates) | Isolate numbers | Identity (%) |
|---|---|---|
| Clean (15) | 5, | 100 |
| | 3, 11, 23, 34, 37, 38, 39 | 99 |
| Mixed (50) | 40, 63 | 99 |
| | | 98 |
| | 17, | 96 |
| | 36; 13, 10, 61; | 86–80 |
| | 2, | 79–70 |
Notes.
An expeditious characterization of each culturable endophytic isolate from cacao was performed through direct amplicon sequencing of gel-purified, single-band PCR products obtained with primers 799F/1492U, spanning the V5–V9 hypervariable region of 16S rDNA. Analysis of the resulting electropherograms indicated two major groups. The first showed clear, sharp and unambiguously single peaks for each nucleotide (high-quality sequences) and comprised 15 isolates, 93.3% of which showed the expected 3–5 AluI restriction fragments (see Results and Discussion). The second (mixed peaks) presented variable levels of background, lower-intensity peaks underneath each nucleotide-read peak (lower-quality sequences) and comprised 50 isolates, 64.0% of which showed the expected ≥6 AluI restriction fragments (see also the next note).
Numbers correspond to the identification of the culturable isolates in the local collection; underlined isolates are those whose AluI restriction patterns departed from the expected number of bands, i.e. ≥6 fragments for the single-peaks group and ≤5 fragments for the mixed-peaks group (see chi-square analysis in the Results text).
Considering the highest score obtained from the BlastN search (with 100% sequence coverage), the species/strains retrieved are shown with the corresponding accession number in parentheses. The lower identity levels (≤98%) in the right column reflect increasing interference from the lower-intensity underlying peaks, which generate a “chimeric-sequence” effect in the main base-called reads and thereby depart from the expected (99–100%) sequence similarity.
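The association between electropherogram group and AluI fragment count described in these notes can be checked with a 2×2 chi-square test. The sketch below is illustrative only: the counts are reconstructed from the quoted percentages (93.3% of 15 and 64.0% of 50), the single remaining clean isolate is assumed to fall in the ≥6-fragment class, and the statistic is a plain Pearson chi-square without continuity correction, which may differ from the test reported in the Results text.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Group sizes taken from the notes; counts below are reconstructed (assumption).
clean_total, mixed_total = 15, 50
clean_le5 = round(0.933 * clean_total)   # clean isolates with 3-5 fragments
mixed_ge6 = round(0.640 * mixed_total)   # mixed isolates with >=6 fragments
clean_ge6 = clean_total - clean_le5      # assumed to be the >=6-fragment class
mixed_le5 = mixed_total - mixed_ge6

chi2 = chi_square_2x2(clean_le5, clean_ge6, mixed_le5, mixed_ge6)

# Standard critical value for 1 degree of freedom at alpha = 0.05.
CRITICAL_DF1_ALPHA05 = 3.841

print(f"table = [[{clean_le5}, {clean_ge6}], [{mixed_le5}, {mixed_ge6}]]")
print(f"chi-square = {chi2:.3f}; exceeds critical value: {chi2 > CRITICAL_DF1_ALPHA05}")
```

Under these assumed counts the statistic is well above the 1-df critical value, consistent with the notes' statement that fragment number tracks electropherogram quality.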
Figure 1. Illustration of electropherograms and restriction profiles resulting from sequencing and AluI digestion of PCR-amplified V5–V9 region of 16S rRNA genes from endophytic bacteria from Theobroma cacao.
The procedure was applied to 65 culturable isolates from cacao seeds + pulp (see text); the data shown correspond to six representative isolates (identified by the numbers on top). The agarose-gel electrophoretic profiles of the single-fragment PCR amplifications with primers 799F/U1492R are shown at the top. ‘Clean’ electropherograms, characteristic of unique templates, are shown on the left side, whereas mixed ones, which can indicate multiple templates, appear on the right side. The gel image resulting from AluI digestion of the same amplicon appears next to each chromatogram. The percentage of identity with the best hit retrieved by BlastN is indicated beside each gel. (The sequences produced from the 65 isolates are provided in the Supplementary Information.)
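The logic behind reading template number from an AluI profile can be sketched in silico. AluI recognizes AGCT and cuts blunt between G and C; everything else in the block below is a hypothetical illustration: the function names and the two toy sequences are invented, and the band count ignores gel resolution, so it is not the procedure used to generate the figure.

```python
ALUI_SITE = "AGCT"   # AluI recognition site
CUT_OFFSET = 2       # blunt cut between G and C: AG^CT

def alui_fragments(seq):
    """Return the fragments of a linear sequence after complete AluI digestion."""
    seq = seq.upper()
    cuts = []
    pos = seq.find(ALUI_SITE)
    while pos != -1:
        cuts.append(pos + CUT_OFFSET)
        pos = seq.find(ALUI_SITE, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [seq[start:end] for start, end in zip(bounds, bounds[1:])]

def distinct_band_sizes(templates):
    """Sorted distinct fragment lengths pooled over one or more templates."""
    return sorted({len(frag) for t in templates for frag in alui_fragments(t)})

# Toy sequences (hypothetical, far shorter than a real V5-V9 amplicon).
single = "TTAGCTGGGAGCTAA"
second = "TTAGCTGGAGCTGGAA"

print(alui_fragments(single))                 # ['TTAG', 'CTGGGAG', 'CTAA']
print(distinct_band_sizes([single]))          # [4, 7]
print(distinct_band_sizes([single, second]))  # [4, 6, 7]
```

A second template that differs in the position of a single site adds a band size to the pooled profile, which is the pattern expected for mixed electropherograms.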
Figure 2. Example of a double-template mixed electropherogram for a 16S rRNA gene amplicon of an endophytic bacterial isolate from Theobroma cacao.
The region displayed shows specific stretches where a clean pattern changes to a double-sequence electropherogram, indicated by horizontal arrows. The base-called sequence of the respective 16S rRNA gene is given on top of the electropherogram (A) and as the top sequence (capital letters) in the alignment underneath. Visual inspection of peak intensities allowed identification of the main and secondary sequences, which were manually aligned as indicated (B). For the stretches where only single peaks were observed, the sequences of the two presumed templates appeared to be the same and are shown as capital letters in a single line. For the aligned ‘mixed’ stretches 1 and 2, the top sequence is the main one, and the borders of the overlapping stretches are indicated by dashed arrows and the base-pair number (position) in the main sequence; lowercase letters in the main (top) sequence are base-calling ambiguities resolved by visual inspection of the electropherogram. The secondary sequence is shown predominantly in lowercase, except for bases with only a single peak (capital letters). The alignment of the two sequences was optimized manually, with dashes introduced in each sequence for the best possible adjustment. Dots and stars underneath the alignment correspond to matches and mismatches, respectively; question marks indicate points where the secondary-sequence signals were too low to be discerned from background; bold ‘g’ letters and dashes suggest potential indels between the two sequences. The sequences could be defined and aligned unambiguously only up to position 468 bp.
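The annotation convention in this caption (dots for matches, stars for mismatches, question marks for unreadable secondary signal) can be expressed as a short routine. This is a sketch under assumptions: the function name and the toy sequences are hypothetical, comparison ignores letter case, and a blank is printed under gap columns because the caption does not state which symbol was used there.

```python
def annotate_alignment(main, secondary):
    """Build the symbol line for two manually aligned sequences of equal length.

    '.' = match, '*' = mismatch, '?' = secondary signal not discernible,
    ' ' = gap column (assumed convention, not stated in the caption).
    """
    if len(main) != len(secondary):
        raise ValueError("aligned sequences must have the same length")
    marks = []
    for a, b in zip(main, secondary):
        if "?" in (a, b):
            marks.append("?")
        elif "-" in (a, b):
            marks.append(" ")
        elif a.upper() == b.upper():
            marks.append(".")
        else:
            marks.append("*")
    return "".join(marks)

# Toy stretch (hypothetical): capitals = main read, lowercase = secondary read.
main_seq = "ACGTG-AT"
secondary_seq = "acga?gat"

print(main_seq)
print(secondary_seq)
print(annotate_alignment(main_seq, secondary_seq))
```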
Figure 3. Intragenomic variation of the 16S rRNA gene from bacterial species with their complete genome deposited at GenBank, identified by the BlastN search with sequences from endophytic isolates from Theobroma cacao.
Only the 15 unambiguous, clean sequences and the two mixed sequences with 99% identity (Table 1) were considered. The indicated genomes (species names in grey boxes) were found by BlastN with isolates #6 (A), 23 (B), 53 (C) and 63 (D), which are shown in bold as the reference sequence used in each of the 11 alignments. The identification and accession numbers of the corresponding strains with fully sequenced genomes are indicated on top of each alignment group. In each of these groups, the letters A, B, C, etc. in the left column correspond to the different types of 16S rRNA gene sequences found in the genome, with the number of copies of each type shown in parentheses beside them. The numbers in the first line (not to scale) correspond to the base-pair positions of the cacao isolate sequence where differences were found among the sequence types. An asterisk (*) indicates that the whole stretch matches the reference sequence (cacao isolate) on top of each alignment group. The percentage of sequence identity with respect to the cacao isolate is shown on the right side of each sequence. For the second alignment group of Bacillus pumilus (isolate 23), the stretch of ‘zeros’ in the reference sequence indicates a large gap in relation to the ‘F’ sequence (the gap is present in all sequence types, from ‘A’ to ‘E’).
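The per-type identity values described in this caption can be computed from aligned sequences along the following lines. This is a minimal sketch with hypothetical names and toy data: it assumes pre-aligned sequences of equal length and excludes gap columns from the comparison, a choice the caption does not specify and which may differ from how BlastN identity is reported.

```python
def percent_identity(reference, candidate):
    """Percent identity over aligned columns where both sequences have a base."""
    if len(reference) != len(candidate):
        raise ValueError("aligned sequences must have the same length")
    compared = matches = 0
    for r, c in zip(reference.upper(), candidate.upper()):
        if r == "-" or c == "-":
            continue  # gap columns are skipped (assumption)
        compared += 1
        matches += (r == c)
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared

# Toy data (hypothetical): sequence type -> (aligned sequence, copies in genome).
reference_isolate = "ACGTACGTAC"
sequence_types = {
    "A": ("ACGTACGTAC", 5),
    "B": ("ACGTACGAAC", 2),
    "C": ("ACG-ACGTAC", 1),
}

for label, (seq, copies) in sequence_types.items():
    identity = percent_identity(reference_isolate, seq)
    print(f"{label} ({copies}): {identity:.1f}%")
```

Listing the copy number next to each type, as the figure does, makes it easy to see whether the isolate matches the dominant 16S rRNA gene type of a genome or a minority one.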