Tal Murthy, Andreas Rolfs, Yanhui Hu, Zhenwei Shi, Jacob Raphael, Donna Moreira, Fontina Kelley, Seamus McCarron, Daniel Jepson, Elena Taycher, Dongmei Zuo, Stephanie E Mohr, Mauricio Fernandez, Leonardo Brizuela, Joshua LaBaer.
Abstract
The rapid development of new technologies for the high-throughput (HT) study of proteins has increased the demand for comprehensive plasmid clone resources that support protein expression. These clones must be full-length, sequence-verified, and in a flexible format. Generating these resources requires automated pipelines supported by software management systems. Although the availability of clone resources is growing, current collections are either incomplete or not fully sequence-verified. We report an automated pipeline, supported by several software applications, that enabled the construction of the first comprehensive sequence-verified plasmid clone resource, covering more than 96% of the protein-coding sequences in the genome of Francisella tularensis, a highly virulent human pathogen and the causative agent of tularemia. This clone resource was applied to an HT protein purification pipeline, which successfully produced recombinant proteins for 72% of the genes. These methods and resources represent significant technological steps towards exploiting the genomic information of F. tularensis in discovery applications.
Year: 2007 PMID: 17593976 PMCID: PMC1894649 DOI: 10.1371/journal.pone.0000577
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Schematic representation of the workflow used in genome cloning of Francisella tularensis.
The entire process, from design of primers to production of clonal glycerol stocks, is shown. Most steps are common to both phases; steps specific to Phase 2 are shown with a dashed line. The process began with design of primers for each ORF in the genome (Step 1A). The primers were used to amplify ORFs from genomic DNA (Step 1B). Subsequent amplification with universal primers (Step 1C) generated ORF sequences flanked by complete recombinational cloning sites for capture by BP (Step 3). For amplicons captured by In-Fusion in Phase 2, universal primed PCR was not necessary (Step 2B), as the capture reaction completes creation of the recombinational cloning sites. Successful PCR was monitored by agarose gel electrophoresis (Steps 2A and 2B). In Phase 1, all products were purified from preparative gels (Step 2B) and cloned into a recombinational cloning vector via the BP clonase reaction (Step 3), whereas in Phase 2, the capture method depended on ORF size as indicated, with only diagnostic gels needed for short amplicons (Step 2A) and preparative gels needed when In-Fusion capture was performed. Competent bacteria were transformed with the reaction mix to yield colonies, which were isolated robotically, cultured in liquid media, and stored as 15% glycerol stocks (Step 4).
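The branching described in this caption can be sketched as a small decision function. This is an illustrative sketch only, not the authors' software: the names `CapturePlan` and `plan_capture` are hypothetical, and the ORF-size cutoff separating short from long amplicons is not given in the text, so it is a required argument rather than a default. Treating the cutoff as exclusive for short amplicons is also an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CapturePlan:
    universal_pcr: bool  # second PCR with universal primers (Step 1C)
    gel: str             # "diagnostic" or "preparative"
    capture: str         # "BP" or "In-Fusion"


def plan_capture(phase: int, orf_length_bp: int, size_cutoff_bp: int) -> CapturePlan:
    """Pick PCR, gel, and capture steps for one ORF, following the caption.

    size_cutoff_bp is a hypothetical parameter: the caption only says that
    the Phase 2 capture method depended on ORF size.
    """
    if phase == 1:
        # Phase 1: every product was gel-purified and captured by BP.
        return CapturePlan(universal_pcr=True, gel="preparative", capture="BP")
    if phase == 2:
        if orf_length_bp < size_cutoff_bp:
            # Short amplicons: BP capture, diagnostic gel only.
            return CapturePlan(universal_pcr=True, gel="diagnostic", capture="BP")
        # Long amplicons: In-Fusion capture completes the cloning sites,
        # so the universal-primer PCR is skipped.
        return CapturePlan(universal_pcr=False, gel="preparative", capture="In-Fusion")
    raise ValueError(f"unknown phase: {phase}")
```

Calling `plan_capture(2, orf_length_bp, size_cutoff_bp)` with an ORF at or above the chosen cutoff returns an In-Fusion plan without the universal-primer PCR; any Phase 1 ORF returns a BP plan with a preparative gel.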
Summary of the cloning process of two annotations of F. tularensis
| | Phase 1 | Phase 2 |
| ORF Target | 2036 | 703 |
| Average ORF size (bp) | 798 (range 90–4,269) | 1025 (range 105–4,269) |
| Genome annotation | TIGR preliminary annotation (Feb 2004) | NCBI (Feb 2006) |
| Primer synthesis organization | Illumina | IDT |
| PCR polymerase | KOD | Phusion |
| Accuracy of polymerase (errors/bp) | 1/290,000 | 1/770,000 |
| Capture reaction | BP | Small gene: BP; large gene: In-Fusion |
| Isolate picking | 4 per ORF | 1 per ORF |
| Sequencing vector | pDONR221 | pDONR221 & pDEST-17 |
| PCR success rate | 100% | 100% |
| Capture success rate | 99.2% | 99.1% |
| Number of reads | 5835 | 3458 |
| Clones with linker changes | 182 (6.4%) | 6 (0.6%) |
| Clones with frameshift | 239 (8.4%) | 52 (5.3%) |
| Clones with in-frame ins/del | 7 (0.2%) | 0 |
| Clones with truncation mutation | 84 (2.9%) | 1 (0.1%) |
| Clones with ≥ 3 missense | 67 (2.3%) | 3 (0.3%) |
| Clones with LQ discrepancy or unassembled (not further pursued) | 768 | 229 |
| Clones matching the reference perfectly | 626 (21.9%) | 663 (95.3%) |
| Clones with silent mutations only | 143 (5.0%) | 7 (1.0%) |
| Clones with ≤ 2 missense | 736 (25.8%) | 26 (3.7%) |
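The sequence-analysis rows of the table suggest a simple triage of clones by discrepancy type. The sketch below illustrates one way such a rule could be written. It is an assumption-laden illustration, not the authors' analysis software: the `CloneReport` fields and the `classify` function are hypothetical, the accept/reject grouping is inferred from the row order and the ≥ 3 versus ≤ 2 missense split, and the order in which discrepancy types are checked is arbitrary.

```python
from dataclasses import dataclass


@dataclass
class CloneReport:
    """Hypothetical per-clone summary of discrepancies against the reference."""
    linker_changes: int = 0
    frameshifts: int = 0
    inframe_indels: int = 0
    truncations: int = 0
    missense: int = 0
    silent: int = 0
    low_quality_or_unassembled: bool = False


def classify(report: CloneReport) -> tuple[str, str]:
    """Return (decision, category) using the table's row categories."""
    if report.low_quality_or_unassembled:
        return ("not pursued", "LQ discrepancy or unassembled")
    if report.linker_changes > 0:
        return ("reject", "linker changes")
    if report.frameshifts > 0:
        return ("reject", "frameshift")
    if report.inframe_indels > 0:
        return ("reject", "in-frame ins/del")
    if report.truncations > 0:
        return ("reject", "truncation")
    if report.missense >= 3:
        return ("reject", ">= 3 missense")
    if report.missense > 0:
        return ("accept", "<= 2 missense")
    if report.silent > 0:
        return ("accept", "silent only")
    return ("accept", "perfect match")
```

A clone with two missense changes and no other discrepancy would fall into the accepted "<= 2 missense" group under this rule, while a single frameshift would reject it regardless of the missense count.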
Figure 2. Representative virtual protein analysis gel of 188 proteins produced via the high-throughput protein production pipeline.
The label NTFT02#### indicates the unique identifier for each ORF of Francisella tularensis. The expected molecular weights (based on predicted protein coding sequences of the ORFs) are shown below each lane. Black arrows (left side) indicate protein bands observed at approximately the expected molecular weight. MW, molecular weight; NS, no sample.
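As a rough illustration of how an expected molecular weight can be derived from a coding sequence, the sketch below uses the common rule of thumb of about 110 Da per amino acid residue. This is an approximation chosen for illustration and is not the calculation the authors used; the function name `estimate_mw_kda` and its parameters are hypothetical, and any fusion tag contributed by the expression vector is ignored.

```python
AVG_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average residue mass (assumption)


def estimate_mw_kda(orf_length_bp: int, includes_stop_codon: bool = True) -> float:
    """Approximate protein mass in kDa from ORF length in base pairs."""
    if orf_length_bp <= 0 or orf_length_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    residues = orf_length_bp // 3
    if includes_stop_codon:
        residues -= 1  # the stop codon does not encode a residue
    return residues * AVG_RESIDUE_MASS_DA / 1000.0
```

For an ORF of 798 bp (the Phase 1 average in the table), and assuming that length includes the stop codon, this gives roughly 29 kDa. An exact value would require summing the masses of the actual residues in the predicted sequence.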