William Nelson, Carol Soderlund.
Abstract
Recent advances in both clone fingerprinting and draft sequencing technology have made it increasingly common for species to have a bacterial artificial chromosome (BAC) fingerprint map, BAC end sequences (BESs) and draft genomic sequence. The FPC (fingerprinted contigs) software package contains three modules that maximize the value of these resources. The BSS (blast some sequence) module provides a way to easily view the results of aligning draft sequence to the BESs, and integrates the results with the following two modules. The MTP (minimal tiling path) module uses sequence and fingerprints to determine a minimal tiling path of clones. The DSI (draft sequence integration) module aligns draft sequences to FPC contigs, displays them alongside the contigs and identifies potential discrepancies; the alignment can be based either on individual BES alignments to the draft or on the locations of BESs that have been assembled into the draft. FPC also supports high-throughput fingerprint map generation, as its time-intensive functions have been parallelized for Unix-based desktops or servers with multiple CPUs. Simulation results are provided for the MTP, DSI and parallelization. These features are in the FPC V9.3 software package, which is freely available.
Year: 2009 PMID: 19181701 PMCID: PMC2655663 DOI: 10.1093/nar/gkp034
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. DSI alignment within FPC. The top track shows the clones of FPC contig 15, while the bottom track shows the aligned draft sequence contigs. Draft contig Asm15.1 was clicked with the mouse, causing it to be highlighted in blue, and the associated clones (those having BESs contained in Asm15.1) are also highlighted in the clone track. Alignments that have a reversed orientation are indicated by the notation ‘REV’, as seen on Asm30.1 at the left. The alignment of Asm15.1 illustrates detection of an assembly error, as described in the text. The draft contig lines are drawn to match the extent of the clones with which they share BES anchors; this makes small contigs appear larger than they actually are. Also, the alignment start/end labels reflect the locations of the first/last BES anchors, so they typically do not start exactly at 0 kb or end at the sequence endpoint.
Figure 2. An MTP clone pair, with confirming spanner and flankers. The tick marks represent bands and vertically aligned tick marks represent shared bands. All the bands are shown for candidate MTP clones CL and CR. Only the shared bands are shown for the spanner and flanker. Three bands are not confirmed by either the spanner or the flanker clones.
Figure 3. Effect of different parameters and fingerprint error levels on MTP performance for agarose and HICF. Aggregated MTP clone count and gap count are plotted from simulations of rice chr3, human chr21 and fly chr3L (total genomic sequence 95.2 Mb) for 10× clone coverage for HICF (‘×’) and agarose (‘+’). For HICF, the FPC MinOlap parameter was varied from 0 to 10 and the MinShared parameter was varied from 0 to 30, where MinOlap is the minimal amount of overlap between two clones based on FPC coordinates and MinShared is the minimal number of shared bands. For agarose, MinOlap varied from −6 to 10 and MinShared varied from 0 to 12.
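The two thresholds defined in the Figure 3 caption can be illustrated with a minimal Python sketch. This is not FPC's implementation: the `Clone` record, its field names, the exact-match treatment of bands, and the rule that a pair must satisfy both thresholds at once are all assumptions made for illustration. A negative MinOlap, as in the agarose range, admits pairs whose map coordinates are slightly separated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clone:
    """Hypothetical clone record; field names are illustrative, not FPC's."""
    name: str
    left: int          # left end in map coordinate units
    right: int         # right end in map coordinate units
    bands: frozenset   # fingerprint bands, simplified to exact integer values


def coordinate_overlap(a: Clone, b: Clone) -> int:
    """Overlap of two clones on the map; negative when they are separated."""
    return min(a.right, b.right) - max(a.left, b.left)


def shared_band_count(a: Clone, b: Clone) -> int:
    """Shared bands under exact matching (an assumption; real data needs a tolerance)."""
    return len(a.bands & b.bands)


def is_candidate_pair(a: Clone, b: Clone, min_olap: int, min_shared: int) -> bool:
    """Assumed rule: a pair qualifies only if it meets both minimums."""
    return (coordinate_overlap(a, b) >= min_olap
            and shared_band_count(a, b) >= min_shared)


cl = Clone("CL", 0, 30, frozenset({101, 115, 130, 142, 160}))
cr = Clone("CR", 25, 60, frozenset({142, 160, 171, 185}))

print(is_candidate_pair(cl, cr, min_olap=5, min_shared=2))  # True: overlap 5, 2 shared bands
print(is_candidate_pair(cl, cr, min_olap=6, min_shared=2))  # False: overlap is below 6
```

Raising either threshold admits fewer candidate pairs, and lowering it admits more; the caption describes sweeping both parameters to see the effect on MTP clone count and gap count.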
MTP performance tested in simulations using only fingerprint data (see Methods section)
| Method | Species | Clone coverage (×) | No. of contigs | No. of MTP clones | Average overlap (kb) | No. of gaps | MTP coverage (%) |
|---|---|---|---|---|---|---|---|
| HICF | human21 | 10 | 37 | 263 | 30 | 1 | 95 |
| HICF | human21 | 20 | 14 | 265 | 28 | 5 | 98 |
| Agarose | human21 | 10 | 136 | 315 | 51 | 7 | 95 |
| Agarose | human21 | 20 | 55 | 286 | 36 | 10 | 97 |
| HICF | rice3 | 10 | 55 | 277 | 29 | 6 | 98 |
| HICF | rice3 | 20 | 11 | 262 | 23 | 6 | 99 |
| Agarose | rice3 | 10 | 159 | 327 | 52 | 3 | 97 |
| Agarose | rice3 | 20 | 68 | 292 | 34 | 8 | 98 |
| HICF | fly3L | 10 | 35 | 192 | 36 | 2 | 97 |
| HICF | fly3L | 20 | 8 | 177 | 26 | 2 | 99 |
| Agarose | fly3L | 10 | 99 | 211 | 53 | 4 | 95 |
| Agarose | fly3L | 20 | 45 | 196 | 37 | 4 | 98 |
| | | | | | | Gaps (%) | |
| HICF | Average | 10 | - | - | 31 | 1.2 | - |
| HICF | Average | 20 | - | - | 20 | 1.8 | - |
| Agarose | Average | 10 | - | - | 52 | 1.6 | - |
| Agarose | Average | 20 | - | - | 36 | 2.8 | - |
(a) Human chr21 is 35.4 Mb, rice chr3 is 36.1 Mb and fly chr3L is 23.8 Mb.
(b) Average overlap of the clone pairs selected for the MTP.
(c) Number of gaps between clone pairs selected for the MTP, i.e. false-positive overlaps.
(d) MTP coverage of the genomic sequence.
(e) Gap percentage is the number of gaps divided by the number of MTP clone pairs.
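As a worked check of the gap percentages in the Average rows (1.2, 1.8, 1.6 and 2.8), the following Python sketch aggregates the per-species counts from the table. The table does not list the number of MTP clone pairs, so the sketch assumes the total number of MTP clones as the denominator; under that assumption the computed values round to the figures shown.

```python
# (method, clone coverage) -> [(MTP clones, gaps)] for human21, rice3, fly3L,
# copied from the per-species rows of the table above.
counts = {
    ("HICF", 10): [(263, 1), (277, 6), (192, 2)],
    ("HICF", 20): [(265, 5), (262, 6), (177, 2)],
    ("Agarose", 10): [(315, 7), (327, 3), (211, 4)],
    ("Agarose", 20): [(286, 10), (292, 8), (196, 4)],
}


def gap_percentage(entries):
    """Total gaps as a percentage of total MTP clones.

    Assumption: the clone count stands in for the clone-pair count,
    which is not listed in the table.
    """
    total_clones = sum(clones for clones, _ in entries)
    total_gaps = sum(gaps for _, gaps in entries)
    return 100.0 * total_gaps / total_clones


for (method, coverage), entries in counts.items():
    print(f"{method} {coverage}x: {gap_percentage(entries):.1f}%")
```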
MTP performance with simulated draft sequence and fingerprints
| Clone coverage (×) | Draft coverage (×) | No. of MTP clones | No. of gaps | Average overlap (kb) | Average FP overlap (kb) | Average BSS overlap (kb) | No. of FP pairs | No. of BSS pairs |
|---|---|---|---|---|---|---|---|---|
| HICF (×) | | | | | | | | |
| 10 | 0 | 263 | 1 | 30 | 30 | 0.0 | 226 | 0 |
| 10 | 1 | 265 | 0 | 28 | 35 | 0.4 | 184 | 44 |
| 10 | 2 | 264 | 1 | 28 | 36 | 1.0 | 167 | 60 |
| 10 | 4 | 261 | 0 | 26 | 42 | 7.7 | 104 | 120 |
| 10 | 7 | 258 | 0 | 20 | 55 | 17.0 | 18 | 203 |
| 20 | 0 | 263 | 5 | 28 | 28 | 0.0 | 248 | 0 |
| 20 | 1 | 273 | 1 | 26 | 38 | 0.6 | 169 | 89 |
| 20 | 2 | 272 | 2 | 25 | 42 | 1.5 | 144 | 113 |
| 20 | 4 | 271 | 1 | 22 | 47 | 8.2 | 84 | 172 |
| 20 | 7 | 259 | 0 | 15 | 71 | 13.3 | 10 | 234 |
| Agarose (×) | | | | | | | | |
| 10 | 0 | 315 | 7 | 51 | 50 | 0.0 | 179 | 0 |
| 10 | 1 | 320 | 7 | 47 | 58 | 0.3 | 151 | 33 |
| 10 | 2 | 319 | 3 | 44 | 62 | 1.3 | 127 | 56 |
| 10 | 4 | 317 | 0 | 41 | 71 | 6.2 | 91 | 90 |
| 10 | 7 | 318 | 0 | 38 | 104 | 13.5 | 44 | 138 |
| 20 | 0 | 286 | 10 | 36 | 36 | 0.0 | 231 | 0 |
| 20 | 1 | 279 | 5 | 28 | 41 | 0.4 | 150 | 74 |
| 20 | 2 | 282 | 1 | 26 | 48 | 2.3 | 115 | 112 |
| 20 | 4 | 274 | 0 | 24 | 61 | 6.9 | 62 | 157 |
| 20 | 7 | 270 | 1 | 21 | 113 | 11.5 | 16 | 199 |
The simulation used human chr21 sequence and draft sequence coverages from 0× to 7×; see Methods section for simulation details. The last four columns show the numbers and average overlaps of the MTP clone pairs selected based on fingerprint and sequence data; see Results section for further discussion.
(a) Number of gaps between clone pairs selected for the MTP, i.e. false-positive overlaps.
(b) Average overlap for both FP and BSS clone pairs.
(c) Average overlap between clone pairs selected from fingerprint overlaps.
(d) Average overlap between clone pairs selected from sequence overlaps, i.e. identified using the BSS routine with draft sequence and BESs.
Results of timing experiments on the FPC assembly and Ends→Ends algorithms
| | 1 processor | 2 processors | 3 processors | 4 processors |
|---|---|---|---|---|
| N × N comparison | 1 h 45 min | 0 h 57 min | 0 h 38 min | 0 h 30 min |
| Speedup | 1 | 1.8 | 2.76 | 3.5 |
| Clone ordering | 3 h 48 min | 2 h 9 min | 1 h 34 min | 1 h 16 min |
| Speedup | 1 | 1.76 | 2.4 | 3 |
| Total time | 5 h 33 min | 3 h 6 min | 2 h 12 min | 1 h 46 min |
| Speedup | 1 | 1.8 | 2.5 | 3.2 |
| Ends→Ends comparison | 2 h 51 min | 1 h 26 min | 0 h 58 min | 0 h 44 min |
| Speedup | 1 | 1.9 | 2.9 | 3.9 |
Times are in hours (h) and minutes (min). The speedup is in comparison to using one processor.
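The speedup rows can be recomputed from the listed times as the one-processor time divided by the time on p processors. The Python sketch below does this from the minute-level values in the table; because those times are rounded to the minute, the results agree with the reported speedups only to within about 0.1. The dictionary keys are labels chosen for this sketch, and treating the last timing row as the Ends→Ends comparison is an inference from the table title.

```python
def to_minutes(hours, minutes):
    """Convert an (h, min) pair from the table to total minutes."""
    return 60 * hours + minutes


# Times for 1, 2, 3 and 4 processors, copied from the table as (h, min).
timings = {
    "N x N comparison": [(1, 45), (0, 57), (0, 38), (0, 30)],
    "Clone ordering": [(3, 48), (2, 9), (1, 34), (1, 16)],
    "Total time": [(5, 33), (3, 6), (2, 12), (1, 46)],
    "Ends->Ends comparison": [(2, 51), (1, 26), (0, 58), (0, 44)],
}


def speedups(times):
    """Speedup relative to the single-processor time, one value per column."""
    baseline = to_minutes(*times[0])
    return [baseline / to_minutes(*t) for t in times]


for step, times in timings.items():
    formatted = ", ".join(f"{s:.2f}" for s in speedups(times))
    print(f"{step}: {formatted}")
```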