S Zhao, S Shatsman, B Ayodeji, K Geer, G Tsegaye, M Krol, E Gebregeorgis, A Shvartsbeyn, D Russell, L Overton, L Jiang, G Dimitrov, K Tran, J Shetty, J A Malek, T Feldblyum, W C Nierman, C M Fraser.
Abstract
A large-scale BAC end-sequencing project at The Institute for Genomic Research (TIGR) has generated one of the most extensive sets of sequence markers for the mouse genome to date. Using ABI3700 capillary sequencers, with a sequencing success rate of >80% and an average read length of 485 bp, we have generated 449,234 nonredundant mouse BAC end sequences (mBESs) totaling 218 Mb from 257,318 clones from libraries RPCI-23 and RPCI-24, representing 15x clone coverage, 7% sequence coverage, and a marker every 7 kb across the genome. A total of 191,916 BACs have sequences from both ends, providing 12x genome coverage. The average Q20 length is 406 bp and 84% of the bases have phred quality scores ≥20. RPCI-24 mBESs have more Q20 bases and longer reads on average than RPCI-23 sequences. ABI3700 sequencers and the sample tracking system ensure that >95% of mBESs are associated with the right clone identifiers. We have found that a significant fraction of mBESs contains L1 repeats and that approximately 48% of the clones have both ends with ≥100 bp of contiguous unique Q20 bases. About 3% of mBESs match ESTs, and >70% of matches were conserved between the mouse and the human or the rat. Approximately 0.1% of mBESs contain STSs. About 0.2% of mBESs match human finished sequences, and >70% of these sequences have EST hits. The analyses indicate that our high-quality mouse BAC end sequences will be a valuable resource to the community.
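The headline figures in the abstract can be cross-checked with simple arithmetic. The sketch below is illustrative only and is not taken from the paper: the read count, read length, and clone counts come from the abstract, while the genome size of about 3 Gb (`GENOME_BP`) is an assumption, since the abstract does not state the value the authors used. The function names are hypothetical.

```python
# Figures quoted in the abstract.
N_READS = 449_234          # nonredundant mouse BAC end sequences (mBESs)
MEAN_READ_BP = 485         # average read length in bp
N_CLONES = 257_318         # clones with at least one end sequence
N_PAIRED_CLONES = 191_916  # clones sequenced from both ends

# ASSUMPTION: the abstract gives no genome size; ~3 Gb is a round
# figure chosen here for illustration only.
GENOME_BP = 3_000_000_000


def total_bases(n_reads: int, mean_read_bp: int) -> int:
    """Total sequenced bases, estimated as reads x mean read length."""
    return n_reads * mean_read_bp


def sequence_coverage(total_bp: int, genome_bp: int) -> float:
    """Fraction of the genome represented by the sequenced bases."""
    return total_bp / genome_bp


def marker_spacing(genome_bp: int, n_reads: int) -> float:
    """Average distance in bp between markers if spread evenly."""
    return genome_bp / n_reads


total = total_bases(N_READS, MEAN_READ_BP)
coverage = sequence_coverage(total, GENOME_BP)
spacing = marker_spacing(GENOME_BP, N_READS)
paired_fraction = N_PAIRED_CLONES / N_CLONES

print(f"total sequence:    {total / 1e6:.0f} Mb")
print(f"sequence coverage: {coverage:.1%}")
print(f"marker spacing:    {spacing / 1e3:.1f} kb")
print(f"paired-end clones: {paired_fraction:.1%}")
```

Under the assumed 3 Gb genome, these values land close to the 218 Mb, 7%, and 7 kb quoted in the abstract. The 15x and 12x clone coverage figures depend on the average BAC insert size, which the abstract does not give, so they are left out of the sketch.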
Year: 2001 PMID: 11591651 PMCID: PMC311142 DOI: 10.1101/gr.179201
Source DB: PubMed Journal: Genome Res ISSN: 1088-9051 Impact factor: 9.043