Felix D Guerrero, Paula Moolhuijzen, Daniel G Peterson, Shelby Bidwell, Elisabet Caler, Matthew Bellgard, Vishvanath M Nene, Appolinaire Djikeng.
Abstract
BACKGROUND: The size and repetitive nature of the Rhipicephalus microplus genome make obtaining a full genome sequence fiscally and technically problematic. To selectively obtain gene-enriched regions of this tick's genome, Cot filtration was performed, and the Cot-filtered DNA was sequenced via 454 FLX pyrosequencing.
Year: 2010 PMID: 20540747 PMCID: PMC2893602 DOI: 10.1186/1471-2164-11-374
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Maps of the sequenced BACs. Each BAC is represented on a number line marked in kb. Arrows above the number line mark where the Genscan search found GenBank sequences with significant sequence similarity (e < 0.001) to regions of each BAC. The direction of transcription is indicated, with the 5'-3' positive direction shown as left to right. The BACs represented are a) BM-66-M7; b) BM-77-G20; c) BM-129-N14; d) BM-77-J9; and e) BM-74-F12. The identity and e-value of each statistically significant Blast hit (e < 0.001) are also given for the gene predictions shown above the number line of each BAC. The complete Genscan results are in Additional Files 1 and 2.
Figure 2. NUCmer analysis of the BACs. NUCmer Version 3.06 was used to plot all 5 BAC sequences. Matches unique in the reference sequence but not necessarily unique in the query are shown in blue. All other matches are shown in red.
Summary of sequencing runs generated by 454 pyrosequencing.
| Run | No. of reads | Total no. of bases | Avg. read length (bp) | No. of non-redundant seqs |
|---|---|---|---|---|
| 1 | 372,821 | 91,115,654 | 244 | 280,859 |
| 2 | 417,596 | 101,746,015 | 243 | 279,784 |
| 3 | 508,619 | 124,597,626 | 245 | 209,344 |
| 4 | 390,263 | 89,660,849 | 230 | 212,275 |
| 5 | 421,619 | 97,803,973 | 232 | 171,554 |
| 6 | 560,643 | 128,732,263 | 230 | 218,709 |
| Total | 2,671,561 | 633,656,380 | 237 | 1,372,525 |
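The "Total" row can be cross-checked from the per-run figures. The short sketch below sums the six runs and recomputes the overall average read length; it assumes that the reported average is total bases divided by total reads, which the source does not state explicitly.

```python
# Per-run figures copied from the table above:
# (number of reads, total number of bases, number of non-redundant sequences)
runs = [
    (372_821, 91_115_654, 280_859),
    (417_596, 101_746_015, 279_784),
    (508_619, 124_597_626, 209_344),
    (390_263, 89_660_849, 212_275),
    (421_619, 97_803_973, 171_554),
    (560_643, 128_732_263, 218_709),
]

total_reads = sum(reads for reads, _, _ in runs)
total_bases = sum(bases for _, bases, _ in runs)
total_nonredundant = sum(nonred for _, _, nonred in runs)

# Assumption: the overall average is total bases / total reads,
# not the mean of the six per-run averages.
avg_read_length = total_bases / total_reads

print(total_reads, total_bases, total_nonredundant, round(avg_read_length))
```

Under that assumption the recomputed values agree with the "Total" row of the table.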
Summary of the raw data assembly of the six 454 runs using the Newbler assembler.
| Category | Quantity | Percent |
|---|---|---|
| Total number of reads | 2,671,519 | - |
| Total number of bases | 633,429,850 | - |
| Number of searches | 2,081,652 | - |
| Seed hits found | 218,338,521 | - |
| Overlaps found | 6,694,880 | 3.07 |
| Overlaps reported | 4,244,624 | 1.94 |
| Overlaps used | 2,699,511 | 63.6 |
| Number assembled | 487,166 | - |
| Number partial | 1,135,496 | - |
| Number singleton | 829,693 | - |
| Number repeat | 185,527 | - |
| Number outlier | 33,637 | - |
| Number of contigs | 15,221 | - |
| Number of bases | 9,801,045 | - |
| Average contig size | 643 | - |
| N50 contig size | 624 | - |
| Largest contig size | 1,797 | - |
| Q40 plus bases | 6,689,319 | 68.25 |
| Q39 minus bases | 3,111,726 | 31.75 |
| Number of contigs (all contigs) | 456,985 | - |
| Number of bases (all contigs) | 110,333,787 | - |
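The table does not say which quantity each "Percent" value is taken against. The sketch below tries one plausible reading: overlaps found and overlaps reported as a share of seed hits, overlaps used as a share of overlaps reported, and the Q40/Q39 rows as a share of the 9,801,045 contig bases. These denominators are assumptions inferred from the numbers, not something the source states, and `percent` is a helper name introduced here for illustration.

```python
def percent(part, whole, digits=2):
    """Return part/whole as a percentage rounded to `digits` decimals."""
    return round(100.0 * part / whole, digits)

# Values copied from the assembly summary table above.
seed_hits = 218_338_521
overlaps_found = 6_694_880
overlaps_reported = 4_244_624
overlaps_used = 2_699_511
contig_bases = 9_801_045
q40_plus = 6_689_319
q39_minus = 3_111_726

# Assumed denominators (inferred, not stated in the source).
print(percent(overlaps_found, seed_hits))            # table reports 3.07
print(percent(overlaps_reported, seed_hits))         # table reports 1.94
print(percent(overlaps_used, overlaps_reported, 1))  # table reports 63.6
print(percent(q40_plus, contig_bases))               # table reports 68.25
print(percent(q39_minus, contig_bases))              # table reports 31.75
```

With these denominators the recomputed percentages match the table, and the Q40-plus and Q39-minus base counts sum to the 9,801,045 contig bases.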
Figure 3. Distribution of the BmiGI member extensions resulting from reassembly with the Cot-selected genomic DNA sequences. The x-axis shows the extension length ranges in bp, and the y-axis shows the number of BmiGI Version 2.1 members extended within each range.