Giuseppe Maccari (1), Federica Gemignani, Stefano Landi. (1) Laboratory of Microbiology and Genetics, Ospedale di Circolo e Fondazione Macchi, University of Insubria, Viale Borri 57, 21100 Varese, Italy. gpmaccari@gmail.com
Abstract
MOTIVATION: The complete sequencing of the human genome shows that only about 1% of the genome codes for proteins. Most of the genome consists of non-coding DNA, regulatory elements and junk DNA. Transcriptional regulation plays a central role in many critical cellular processes and responses, and it is a central force in the development and differentiation of multicellular organisms. Identifying regulatory elements is therefore one of the major tasks in interpreting the genome. To accomplish this task, we developed a robust and simple suite that allows direct access to genomic databases and immediate inspection of results. We introduce COMPASSS (COMplex PAttern of Sequence Search Software), a simple and effective tool for motif search in entire genomes. Motifs can be partially degenerate and interrupted by spacers of variable length.
RESULTS: We demonstrate the simplicity and robustness of this tool by mining real biological data. The test was performed on two well-known protein domains and a highly variable cis-acting element. COMPASSS successfully identifies both the protein domains and the semi-conserved cis-acting elements.
AVAILABILITY: The COMPASSS suite is available for Windows free of charge from our web sites: compasss.sourceforge.net/; www.stefanolandi.eu/
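To illustrate what a partially degenerate motif interrupted by variable-length spacers looks like in practice, the following is a minimal Python sketch that translates such a motif into a regular expression. It is not the COMPASSS implementation or its pattern syntax, which the abstract does not describe. The function names `motif_to_regex` and `find_motif`, the block/spacer input format, and the example motif are all assumptions made for illustration, and the sketch handles nucleotide sequences only.

```python
import re

# IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_to_regex(blocks, spacers):
    """Build a regex from conserved blocks joined by variable-length spacers.

    blocks  -- list of IUPAC strings (hypothetical input format)
    spacers -- list of (min_len, max_len) tuples, one between each pair of blocks
    """
    if len(spacers) != len(blocks) - 1:
        raise ValueError("need exactly one spacer between each pair of blocks")
    parts = []
    for i, block in enumerate(blocks):
        # Expand each degenerate symbol into its character class.
        parts.append("".join(IUPAC[base] for base in block.upper()))
        if i < len(spacers):
            lo, hi = spacers[i]
            # A spacer is any run of lo..hi unambiguous bases.
            parts.append("[ACGT]{%d,%d}" % (lo, hi))
    return "".join(parts)


def find_motif(sequence, blocks, spacers):
    """Return (start, end, text) for one match per start position.

    A lookahead is used so that matches starting at overlapping
    positions are all reported.
    """
    pattern = re.compile("(?=(" + motif_to_regex(blocks, spacers) + "))")
    return [(m.start(1), m.end(1), m.group(1))
            for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    # Hypothetical motif: TGACG, then 1 to 4 arbitrary bases, then CAYGTG.
    for hit in find_motif("AATTGACGAACACGTGTT", ["TGACG", "CAYGTG"], [(1, 4)]):
        print(hit)
```

A regular expression is a convenient fit for small examples like this one. Scanning an entire genome would more likely call for chunked reading or a dedicated pattern-matching algorithm, which this sketch does not attempt.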