Richard A Hurt, Steven D Brown, Mircea Podar, Anthony V Palumbo, Dwayne A Elias.
Abstract
Advances in high-throughput DNA sequencing technologies have supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for in-depth studies. Difficult-to-sequence regions in microbial genomes are often ruled "intractable", resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such problematic regions in the "non-contiguous finished" Desulfovibrio desulfuricans ND132 genome (6 intractable gaps) and the Desulfovibrio africanus genome (1 intractable gap). The polynucleotides surrounding each gap formed GC-rich secondary structures, making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. The developed procedures support accurate gene annotation and provide a step-wise method that reduces the effort required for genome finishing.
Year: 2012 PMID: 22859974 PMCID: PMC3409199 DOI: 10.1371/journal.pone.0041295
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Amplification/Sequencing Primers and Amplicon Electrophoretic Mobility Evaluation.
| Primer Position Relative to Gap | Amplicon Size | Gel Mobility | Estimated Gap Length | Alignment Gap Length |
| −363 | 693 bp | 700 bp | 10 bp | 18 bp |
| +330 | “ | “ | “ | “ |
| −217 | 508 bp | 540 bp | 30 bp | 30 bp |
| +291 | “ | “ | “ | “ |
| −250 | 372 bp | 375 bp | 3 bp | −1 bp |
| +122 | “ | “ | “ | “ |
| −127 | 387 bp | 400 bp | 15 bp | 13 bp |
| +260 | “ | “ | “ | “ |
| −260 | 449 bp | 520 bp | 60 bp | ND |
| +189 | “ | “ | “ | “ |
| −496 | 578 bp | 620 bp | 40 bp | ND |
| +82 | “ | “ | “ | “ |
| −129 | 207 bp | 260 bp | 60 bp | ND |
| +78 | “ | “ | “ | “ |
| −137 | 241 bp | 270 bp | 30 bp | ND |
| +104 | “ | “ | “ | “ |
Amplicon size based on the distance between the 5′ termini of the primer pair excluding any additional length from the gap.
Electrophoretic mobility rounded to nearest 10 bp based on a combination of Marker III (Roche), Marker V (Roche), and 100 bp ladder (NEB).
ND means not determined because the artificial contigs did not align in BLAST searches.
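The amplicon sizes in the table follow directly from the primer positions: each size is the sum of the upstream and downstream primer offsets from the gap. The following is a minimal sketch of that arithmetic, with the rows transcribed from the table. The function names are illustrative and not from the paper. The "mobility excess" it prints is only the raw difference between gel mobility and amplicon size; the estimated gap lengths reported in the table are rounded values and do not always equal this difference (for example, 520 bp minus 449 bp is 71 bp, while the table reports 60 bp).

```python
# Rows transcribed from the table above:
# (upstream primer offset, downstream primer offset, amplicon size in bp, gel mobility in bp)
ROWS = [
    (-363, 330, 693, 700),
    (-217, 291, 508, 540),
    (-250, 122, 372, 375),
    (-127, 260, 387, 400),
    (-260, 189, 449, 520),
    (-496, 82, 578, 620),
    (-129, 78, 207, 260),
    (-137, 104, 241, 270),
]


def expected_amplicon_size(upstream, downstream):
    """Distance between the primers' 5' termini, excluding the gap itself."""
    return abs(upstream) + downstream


def mobility_excess(amplicon_size, gel_mobility):
    """Apparent extra length on the gel relative to the gap-free amplicon."""
    return gel_mobility - amplicon_size


for up, down, size, mobility in ROWS:
    # Every tabulated amplicon size equals the sum of the two primer offsets.
    assert expected_amplicon_size(up, down) == size
    print(f"{up:+d}/{down:+d}: amplicon {size} bp, excess {mobility_excess(size, mobility)} bp")
```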
ND132 Template Amplification Summary.
| PCR Method | | Strand displacing | GC-Rich PCR System |
| Standard Cycling | 4 | ND | - |
| Ramped Extension | 1,2,3,4 | 1,2,3,4,5,6 | 5,6 |
| 7′-deaza-2′-dGTP | 1,2,3,4,5,6 | 1,2,3,4,5,6 | ND |
Method includes all additives, ions, and annealing and extension temperatures tested.
Amplification reagent is the GC-Rich PCR System (Roche).
Taq polymerase required 5% or 10% DMSO. Pfu polymerase required no additives.
7′-deaza-2′-dGTP was used with the ramped extension cycle. When used with Taq DNA polymerase, the reaction mixture was supplemented with 1 betaine and 5% DMSO. Pfu polymerase required no additives.
Sequences of the Six D. desulfuricans ND132 Gaps.
| Gap | Locus | Length (nt) | Sequence 5′-3′ |
| 1 | 657659–657675 | 17 | gag ggg gaa cct ctt tc |
| 2 | 1009606–1009634 | 29 | ctt gga aga ggt tcc ccc tct gga ctc cc |
| 3 | 1885438 | 0 | |
| 4 | 2275514–2275527 | 14 | aaa agt ttc ccc cc |
| 5 | 3127972–3128034 | 63 | ccc ttt gag aaa agg gtt ttt cct ccc ctt ccc ccg aac ccc cat ccc ctc ctt tcc cta aac |
| 6 | 3648109–3648144 | 36 | ctt tgc aaa ggg ttc cct ctc gcc ccc ctt ccc ccc |
Gap locations are given according to the updated GenBank genome sequence (CP003220).
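As a consistency check on the table, each locus span should equal the stated length and the number of nucleotides in the listed sequence. The sketch below transcribes the five gaps that have a sequence (gap 3 has length 0 and a single coordinate, so it is omitted) and also computes the GC fraction of each gap. The helper names are illustrative assumptions, not part of the paper's method.

```python
# (start, end) locus and sequence for each gap, transcribed from the table above.
GAPS = {
    1: ((657659, 657675), "gag ggg gaa cct ctt tc"),
    2: ((1009606, 1009634), "cttggaagaggttccccctctggactccc"),
    4: ((2275514, 2275527), "aaa agt ttc ccc cc"),
    5: ((3127972, 3128034),
        "ccc ttt gag aaa agg gtt ttt cct ccc ctt ccc ccg aac ccc cat ccc ctc ctt tcc cta aac"),
    6: ((3648109, 3648144), "ctt tgc aaa ggg ttc cct ctc gcc ccc ctt ccc ccc"),
}


def clean(seq):
    """Remove whitespace and upper-case the sequence."""
    return "".join(seq.split()).upper()


def locus_length(start, end):
    """Number of positions covered by an inclusive coordinate range."""
    return end - start + 1


def gc_fraction(seq):
    """Fraction of G and C nucleotides in the sequence."""
    seq = clean(seq)
    return (seq.count("G") + seq.count("C")) / len(seq)


for gap, ((start, end), seq) in GAPS.items():
    # The coordinate span and the sequence length should agree.
    assert locus_length(start, end) == len(clean(seq))
    print(f"gap {gap}: {len(clean(seq))} nt, GC fraction {gc_fraction(seq):.2f}")
```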
Figure 1. Example of the increase in secondary structure following determination of the gap sequences: Gap 1.
Left: secondary structure prediction done using the Mfold web server [29] to aid in primer site selection. The closed triangle shows the location of the gap. The structure was prepared by appending 100 nt at the termini of the 5′ and 3′ contigs surrounding the gap. Folding parameters used 50 mM NaCl and 2.5 mM MgCl2 at 60°C. Excess nucleotides were trimmed from the structure for presentation. Right: secondary structure including the determined gap sequence. The determined gap sequence nucleotides are highlighted in upper case. Folding parameters are as given for the structure prepared prior to gap sequence determination.
Survey of Additional Non-contiguous Finished Genomes.
| Non-contiguous Finished Genome (URL path or accession) | Total Number of Gaps | Positive for 2° Structure |
| /bren/ | 9 | 5 |
| /bcep bu72/ | 3 | 3 |
| /cspDL/ | 1 | 0 |
| CP003221 | 1 | 1 |
| CP003220 | 6 | 5 |
| /dfw101/ | 2 | 2 |
| /fran_eun1f/ | 1 | 1 |
| /lil21528/ | 17 | 5 |
| /mli4140/ | 6 | 3 |
| /meth4292/ | 1 | 1 |
| /sbal_183/ | 4 | 2 |
| /tbac/ | 2 | 2 |
http://genome.ornl.gov/microbial/dfw101/.
http://www.ncbi.nlm.nih.gov/nuccore/CP003220.
Secondary structure prior to determination of gap sequences based on location of gap within stem, loop, or adjoined to the stem.
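To summarize the survey, the per-genome counts can be totaled to give the overall share of gaps associated with predicted secondary structure. This is a small sketch using the rows as listed in the table; the variable names are illustrative.

```python
# (genome identifier, total gaps, gaps positive for secondary structure),
# transcribed from the survey table above.
SURVEY = [
    ("/bren/", 9, 5),
    ("/bcep bu72/", 3, 3),
    ("/cspDL/", 1, 0),
    ("CP003221", 1, 1),
    ("CP003220", 6, 5),
    ("/dfw101/", 2, 2),
    ("/fran_eun1f/", 1, 1),
    ("/lil21528/", 17, 5),
    ("/mli4140/", 6, 3),
    ("/meth4292/", 1, 1),
    ("/sbal_183/", 4, 2),
    ("/tbac/", 2, 2),
]

total_gaps = sum(total for _, total, _ in SURVEY)
positive_gaps = sum(positive for _, _, positive in SURVEY)
fraction_positive = positive_gaps / total_gaps

print(f"{positive_gaps} of {total_gaps} gaps ({fraction_positive:.0%}) "
      "were positive for secondary structure")
```

Across the twelve surveyed genomes this gives 30 of 53 gaps, or roughly 57%.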
Figure 2. Work Flow Diagram.
Top: 100 nt from either side of the gap are appended and used for an alignment search to identify any potential assembly problems causing the gap. N(n) denotes unknown nucleotides (50 or 100 by convention) inserted into the gaps in deposited sequence data. The resulting 200 nt artificial contig is subjected to 2° structure analysis. The folding analysis reveals additional 2° structures in the vicinity of the gap that may present added difficulty with amplification and sequencing. Primers are targeted to positions proximal to the gap relative to any additional problems identified in the folding evaluation. Where the gap position is involved with secondary structure, amplification with an SD polymerase and use of a two-step extension cycle (1 min at 72°C followed by 1 min at 75°C) supports amplification. The second segment of the extension cycle can be modified based on the thermal stability identified in the initial folding analysis.
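The first steps of this workflow can be sketched in code. The example below is an illustration under stated assumptions, not the authors' implementation: it assumes the deposited sequence marks the gap with a run of at least 50 N characters, builds the 200 nt artificial contig from the 100 nt flanks, and represents the two-step extension cycle as a list of (temperature in °C, seconds) pairs. All function names are hypothetical. Alignment and folding (for example, with BLAST and Mfold) are outside the scope of the sketch.

```python
import re


def split_at_gap(sequence, min_run=50):
    """Split a deposited sequence at its placeholder run of N characters.

    Assumes the gap is marked by at least `min_run` consecutive Ns.
    """
    match = re.search(r"N{%d,}" % min_run, sequence.upper())
    if match is None:
        raise ValueError("no placeholder run of N found")
    return sequence[:match.start()], sequence[match.end():]


def artificial_contig(left, right, flank=100):
    """Join the `flank` nt adjacent to the gap from each side."""
    return left[-flank:] + right[:flank]


def two_step_extension(second_temp_c=75):
    """Extension segments as (temperature in deg C, duration in seconds).

    The first segment is fixed at 72 deg C; the second can be adjusted
    according to the thermal stability seen in the folding analysis.
    """
    return [(72, 60), (second_temp_c, 60)]


# Toy sequence: 150 nt of flank, a 100 N placeholder, then 150 nt of flank.
deposited = "A" * 150 + "N" * 100 + "C" * 150
left, right = split_at_gap(deposited)
contig = artificial_contig(left, right)
print(len(contig), two_step_extension())
```

The resulting contig is what would be submitted to the alignment search and folding analysis before primer sites are chosen.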