MOTIVATION: Massive oligonucleotide hybridization is one of the most promising technologies for functional genome analysis. The critical step is designing sets of oligonucleotides that can be used effectively for identification by hybridization. RESULTS: Using a genetic algorithm, we designed sets of oligonucleotide probes capable of identifying new genes belonging to a defined gene family within a cDNA or genomic library. The approach is not limited by oligonucleotide length and admits the letter 'N' (any base) in the selected oligonucleotides. One of its major advantages is that it requires only low homology, so it can identify functional families whose sequences share little homology. We designed the oligonucleotide sets most selective for cDNA clones of transmembrane G protein-coupled receptors (GPCRs), a large family of proteins that form part of a modular system transducing extracellular signals to intracellular second messenger pathways. Identification accuracy was checked on an EST library containing 713 870 cDNA sequences. A set of 15 oligonucleotides, 7 to 14 bases in length, correctly identified 70% of the GPCR cDNA collection sequences with 0.02% false positives. AVAILABILITY: The software is available by FTP at ftp://ftp.bionet.nsc.ru/pub/biology/ and on the Web at http://www.bionet.nsc.ru/SRCG/Oligoselector/. CONTACT: kel@bionet.nsc.ru; sebastian.meier-ewert@gpc-ag.com
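To make the idea in the abstract concrete, the following is a minimal sketch of how a genetic algorithm might search for a probe set that matches a gene family while avoiding background sequences. It is not the authors' software. Every name here (`hits`, `score_set`, `fitness`, `mutate`, `evolve`) is hypothetical, hybridization is approximated by exact string matching with 'N' as a single-base wildcard, and the fitness weighting (`fp_weight`) and the mutation and crossover scheme are assumptions chosen only for illustration.

```python
import random
import re


def hits(oligo, seq):
    """True if the oligo occurs in seq; 'N' matches any single base.

    Assumption: exact matching stands in for real hybridization.
    """
    pattern = "".join("[ACGT]" if base == "N" else base for base in oligo)
    return re.search(pattern, seq) is not None


def score_set(oligos, family, background, min_hits=1):
    """Return (sensitivity, false_positive_rate) for a probe set.

    A sequence is called a family member if at least min_hits oligos match it.
    """
    def called(seq):
        return sum(hits(o, seq) for o in oligos) >= min_hits

    sensitivity = sum(called(s) for s in family) / len(family)
    fp_rate = sum(called(s) for s in background) / len(background)
    return sensitivity, fp_rate


def fitness(oligos, family, background, fp_weight=10.0):
    """Reward sensitivity, penalize false positives (weight is an assumption)."""
    sensitivity, fp_rate = score_set(oligos, family, background)
    return sensitivity - fp_weight * fp_rate


def mutate(oligo, rng, min_len=7, max_len=14):
    """Substitute one base, or grow/trim by one base within the length bounds."""
    op = rng.choice(["sub", "grow", "trim"])
    if op == "grow" and len(oligo) < max_len:
        return oligo + rng.choice("ACGTN")
    if op == "trim" and len(oligo) > min_len:
        return oligo[:-1]
    i = rng.randrange(len(oligo))
    return oligo[:i] + rng.choice("ACGTN") + oligo[i + 1:]


def evolve(family, background, set_size=3, pop_size=20, generations=30, seed=0):
    """Evolve a probe set; needs set_size >= 2 and family sequences >= 7 bases."""
    rng = random.Random(seed)

    def random_oligo():
        # Seed candidates with 7-mers taken from family sequences.
        s = rng.choice(family)
        i = rng.randrange(len(s) - 7 + 1)
        return s[i:i + 7]

    def key(individual):
        return fitness(individual, family, background)

    pop = [[random_oligo() for _ in range(set_size)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=key, reverse=True)
        parents = pop[: pop_size // 2]  # the fitter half survives unchanged
        children = []
        for _ in range(pop_size - len(parents)):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, set_size)  # one-point crossover between sets
            child = a[:cut] + b[cut:]
            j = rng.randrange(set_size)
            child[j] = mutate(child[j], rng)
            children.append(child)
        pop = parents + children
    return max(pop, key=key)
```

On real data, `family` would hold known members of the gene family and `background` a sample of unrelated library sequences. The two numbers returned by `score_set` correspond to the kind of figures reported in the abstract (fraction of family sequences identified and fraction of false positives), although the way the authors actually compute them may differ.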
Authors: Olga V Matveeva; Brian T Foley; Vladimir A Nemtsov; Raymond F Gesteland; Senya Matsufuji; John F Atkins; Alexei Y Ogurtsov; Svetlana A Shabalina Journal: BMC Bioinformatics Date: 2004-04-29 Impact factor: 3.169