Abstract
BACKGROUND: DNA sonification refers to the use of an auditory display to convey the information content of DNA sequence data. Six sonification algorithms are presented, each of which produces an auditory display. The algorithms are ordered from the simplest to the most complex. Three of them parse individual nucleotides, nucleotide pairs or codons into musical notes, giving rise to 4, 16 or 64 notes, respectively. Codons may also be parsed degenerately into 20 notes with respect to the genetic code. Lastly, nucleotide pairs can be parsed as two separate frames, or codons can be parsed as three reading frames, giving rise to multiple streams of audio.
Keywords: Auditory display; DNA sequence; Sonification
Year: 2017 PMID: 28438115 PMCID: PMC5404335 DOI: 10.1186/s12859-017-1632-x
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 A screen image from the DNA sonification website (http://dnasonification.org). Radio buttons to select any of the six sonification algorithms are located on the left of the page, and a brief usage guide to them is located on the right. Below these are options to process start and stop codons, and at the bottom left are the controls to play the resultant auditory display
Summary of the six sonification algorithms and their properties
| Algorithm | DNA motif size | Step range (notes) | Number of audio frames (instruments) | Silence on STOP codon / restart on START codon | Sonify occurrence of STOP/START |
|---|---|---|---|---|---|
| Reading frame codons | 3 bp | 20 | 3 | optional | optional |
| Protein sequence | 3 bp | 20 | 1 | no | optional |
| Tri-nucleotides | 3 bp | 64 | 1 | no | optional |
| Di-nucleotide pairs | 2 bp | 16 | 2 | no | no |
| Di-nucleotides | 2 bp | 16 | 1 | no | no |
| Mono-nucleotides | 1 bp | 4 | 1 | no | no |
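The relationship between motif size and step range in the table (1, 2 or 3 bp giving 4, 16 or 64 notes) can be illustrated with a minimal sketch. It assumes an arbitrary alphabetical base ordering and returns integer step indices rather than notes, because the actual motif-to-note assignment used by the website is not given here. The names `motif_table` and `parse_motifs` are hypothetical.

```python
from itertools import product

# Assumed base ordering; the real note assignment is not specified here.
BASES = "ACGT"


def motif_table(k: int) -> dict[str, int]:
    """Map every k-mer over BASES to a step index in range(4 ** k)."""
    return {"".join(p): i for i, p in enumerate(product(BASES, repeat=k))}


def parse_motifs(seq: str, k: int) -> list[int]:
    """Split seq into adjacent, non-overlapping k-mers and return step indices.

    A trailing fragment shorter than k is dropped.
    """
    table = motif_table(k)
    seq = seq.upper()
    return [table[seq[i:i + k]] for i in range(0, len(seq) - k + 1, k)]


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, len(motif_table(k)), parse_motifs("ATGACG", k))
```

Each step index would then be mapped onto a pitch. A single-frame algorithm needs only one such stream, which corresponds to the rows with one audio frame.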
Summary of how a DNA sequence is parsed by each of the six sonification algorithms to produce an auditory display
DNA sequences are coloured red or black to signify adjacent motifs within the reading frame that is parsed by the respective algorithm. Sharps, flats and the octave (pitch) of the notes are not shown for the sake of simplicity. The ‘Di-nucleotide pairs’ algorithm produces a more complex auditory display (twice the number of notes) than the ‘Di-nucleotides’ algorithm. Common notes between the two are indicated in blue. Similarly, the first ATG codon of the sequence is parsed to a C note by both the ‘Reading frame codons’ and ‘Protein sequence’ algorithms (coloured in blue). The ‘Reading frame codons’ algorithm produces two additional notes, absent from the ‘Protein sequence’ algorithm, prior to the sonification of the next common codon (ACG) of the first reading frame. The first TGA codon in the second reading frame and the GAC codon of the third reading frame are parsed to G and F notes, respectively, only by the ‘Reading frame codons’ algorithm
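The multi-frame parsing described in this caption can be sketched as follows. The example sequence `ATGACG` is an assumption pieced together from the codons named above (ATG and ACG in the first frame, TGA in the second, GAC in the third), since the full test sequence is not shown. Codons are reduced to the standard genetic code's 20 amino acids as a stand-in for the degenerate 20-note mapping. The assignment of amino acids to specific notes is not reproduced, and the names `split_frames` and `CODON_TO_AA` are hypothetical.

```python
# Standard genetic code in TCAG order; '*' marks a stop codon.
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TO_AA = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def split_frames(seq: str, k: int) -> list[list[str]]:
    """Return k frames, each a list of adjacent k-mers starting at offset 0..k-1."""
    seq = seq.upper()
    return [
        [seq[i:i + k] for i in range(offset, len(seq) - k + 1, k)]
        for offset in range(k)
    ]


if __name__ == "__main__":
    seq = "ATGACG"  # assumed fragment, see the note above
    # Three reading frames of codons, each reduced to amino-acid symbols.
    for n, frame in enumerate(split_frames(seq, 3), start=1):
        print("frame", n, frame, [CODON_TO_AA[c] for c in frame])
    # Two frames of di-nucleotides, as in the 'Di-nucleotide pairs' algorithm.
    print(split_frames(seq, 2))
```

In this sketch the first frame alone corresponds to what a single-frame ‘Protein sequence’ style parse would see, while the second and third frames supply the extra motifs that only a multi-frame algorithm sonifies.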
Test input DNA sequences used to assess the DNA sonification algorithms
All sequence names refer to input sequence data available on the DNA sonification homepage. Red text indicates stop and start codons in the ‘G Sequence’ and also sequence variations in the ‘Mutated G Sequence’ and ‘Human Telomeric DNA’ sequence
Fig. 2 Musical scores representing the auditory displays of a) the Human Telomeric DNA sequence and b) the RAS DNA sequence. In each case, DNA sequences were sonified (using the ‘Reading frame codons’ algorithm) to produce three staves of music that are played simultaneously. Each stave is labelled to show the reading frame it represents and the instrument used for its sonification. Measures are indicated at the top left of each stave