Sai Zou, Lei Wang, Junfeng Wang.
Abstract
In this paper, we first introduce the concept of a 'weight' for the 64 triplets and assign a different weight to each kind of triplet. We then give a novel 2D graphical representation that transforms a DNA sequence into a plot set, which facilitates quantitative comparison of DNA sequences. Combining this representation with a newly designed similarity measure, we introduce a novel approach to the similarity/dissimilarity analysis of DNA sequences. Finally, we apply the approach to the complete coding sequences of the β-globin genes of 11 species to illustrate the utility of the proposed method.
Year: 2014 PMID: 24383852 PMCID: PMC3896961 DOI: 10.1186/1687-4153-2014-1
Source DB: PubMed Journal: EURASIP J Bioinform Syst Biol ISSN: 1687-4145
Relationship between 20 different kinds of most common amino acids and 64 different kinds of mRNA codons
| mRNA codons | Amino acid | mRNA codons | Amino acid |
| GCU, GCC, GCA, GCG | Alanine | CUU, CUC, CUA, CUG, UUA, UUG | Leucine |
| CGU, CGC, CGA, CGG, AGA, AGG | Arginine | AAA, AAG | Lysine |
| GAU, GAC | Aspartic acid | AUG | Methionine |
| AAU, AAC | Asparagine | UUU, UUC | Phenylalanine |
| UGU, UGC | Cysteine | CCU, CCC, CCA, CCG | Proline |
| GAA, GAG | Glutamic acid | UCU, UCC, UCA, UCG, AGU, AGC | Serine |
| CAA, CAG | Glutamine | ACU, ACC, ACA, ACG | Threonine |
| GGU, GGC, GGA, GGG | Glycine | UGG | Tryptophan |
| CAU, CAC | Histidine | UAU, UAC | Tyrosine |
| AUU, AUC, AUA | Isoleucine | GUU, GUC, GUA, GUG | Valine |
| UAA, UAG, UGA | Stop codons |
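As a minimal sketch of how the codon table above can serve as a lookup, the following Python snippet transcribes the table into a dictionary and translates a short mRNA string. The names `CODON_GROUPS`, `CODON_TO_AA`, and `translate`, as well as the label "Stop" for UAA, UAG, and UGA, are illustrative assumptions and not part of the paper.

```python
# Codon groups transcribed from the table above.
# "Stop" is a label chosen here for the three stop codons (an assumption).
CODON_GROUPS = {
    "Alanine": "GCU GCC GCA GCG",
    "Arginine": "CGU CGC CGA CGG AGA AGG",
    "Aspartic acid": "GAU GAC",
    "Asparagine": "AAU AAC",
    "Cysteine": "UGU UGC",
    "Glutamic acid": "GAA GAG",
    "Glutamine": "CAA CAG",
    "Glycine": "GGU GGC GGA GGG",
    "Histidine": "CAU CAC",
    "Isoleucine": "AUU AUC AUA",
    "Leucine": "CUU CUC CUA CUG UUA UUG",
    "Lysine": "AAA AAG",
    "Methionine": "AUG",
    "Phenylalanine": "UUU UUC",
    "Proline": "CCU CCC CCA CCG",
    "Serine": "UCU UCC UCA UCG AGU AGC",
    "Threonine": "ACU ACC ACA ACG",
    "Tryptophan": "UGG",
    "Tyrosine": "UAU UAC",
    "Valine": "GUU GUC GUA GUG",
    "Stop": "UAA UAG UGA",
}

# Invert the grouping into a codon -> amino acid lookup.
CODON_TO_AA = {
    codon: amino_acid
    for amino_acid, group in CODON_GROUPS.items()
    for codon in group.split()
}


def translate(mrna):
    """Translate an mRNA string codon by codon, stopping at the first stop codon."""
    residues = []
    usable = len(mrna) - len(mrna) % 3  # ignore a trailing partial codon
    for i in range(0, usable, 3):
        amino_acid = CODON_TO_AA[mrna[i:i + 3]]
        if amino_acid == "Stop":
            break
        residues.append(amino_acid)
    return residues


print(len(CODON_TO_AA))           # number of distinct codons in the table
print(translate("AUGGUGCACUAA"))  # a short illustrative mRNA string
```

Checking that the dictionary holds 64 entries is a quick way to confirm the table was transcribed without omissions or duplicates.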
The corresponding triplets of 64 codons
| mRNA codons | DNA triplets | mRNA codons | DNA triplets |
| GCU, GCC, GCA, GCG | GCT, GCC, GCA, GCG | CUU, CUC, CUA, CUG, UUA, UUG | CTT, CTC, CTA, CTG, TTA, TTG |
| CGU, CGC, CGA, CGG, AGA, AGG | CGT, CGC, CGA, CGG, AGA, AGG | AAA, AAG | AAA, AAG |
| GAU, GAC | GAT, GAC | AUG | ATG |
| AAU, AAC | AAT, AAC | UUU, UUC | TTT, TTC |
| UGU, UGC | TGT, TGC | CCU, CCC, CCA, CCG | CCT, CCC, CCA, CCG |
| GAA, GAG | GAA, GAG | UCU, UCC, UCA, UCG, AGU, AGC | TCT, TCC, TCA, TCG, AGT, AGC |
| CAA, CAG | CAA, CAG | ACU, ACC, ACA, ACG | ACT, ACC, ACA, ACG |
| GGU, GGC, GGA, GGG | GGT, GGC, GGA, GGG | UGG | TGG |
| CAU, CAC | CAT, CAC | UAU, UAC | TAT, TAC |
| AUU, AUC, AUA | ATT, ATC, ATA | GUU, GUC, GUA, GUG | GTT, GTC, GTA, GTG |
| UAA, UAG, UGA | TAA, TAG, TGA |
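The correspondence in the table above amounts to replacing every U in an mRNA codon with T. A small sketch in Python, where the function name `to_dna` is a hypothetical helper introduced only for illustration:

```python
def to_dna(rna):
    """Return the DNA triplet(s) for an mRNA codon or sequence by replacing U with T."""
    return rna.upper().replace("U", "T")


# A few codons taken from the table above.
examples = ["GCU", "CUU", "AUG", "UAA"]
triplets = [to_dna(codon) for codon in examples]
print(triplets)
```

Codons without U, such as GAA or CGG, are returned unchanged, which matches the identical entries in the table.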
The mapping rules of
| Triplet | Weight | Triplet | Weight |
| GCT | 1.1 | CTT | 11.1 |
| GCC | 1.2 | CTC | 11.2 |
| GCA | 1.3 | CTA | 11.3 |
| GCG | 1.4 | CTG | 11.4 |
| | | TTA | 11.5 |
| | | TTG | 11.6 |
| CGT | 2.1 | AAA | 12.3 |
| CGC | 2.2 | AAG | 12.4 |
| CGA | 2.3 | | |
| CGG | 2.4 | | |
| AGA | 2.5 | | |
| AGG | 2.6 | | |
| GAT | 3.3 | TTT | 13.1 |
| GAC | 3.4 | TTC | 13.2 |
| AAT | 4.1 | CCT | 14.1 |
| AAC | 4.2 | CCC | 14.2 |
| | | CCA | 14.3 |
| | | CCG | 14.4 |
| TGT | 5.1 | TCT | 15.1 |
| TGC | 5.2 | TCC | 15.2 |
| | | TCA | 15.3 |
| | | TCG | 15.4 |
| | | AGT | 15.5 |
| | | AGC | 15.6 |
| GAA | 6.1 | ACT | 16.3 |
| GAG | 6.2 | ACC | 16.4 |
| | | ACA | 16.5 |
| | | ACG | 16.6 |
| CAA | 7.1 | TGG | 17.3 |
| CAG | 7.2 | | |
| GGT | 8.1 | TAT | 18.1 |
| GGC | 8.2 | TAC | 18.2 |
| GGA | 8.3 | | |
| GGG | 8.4 | | |
| CAT | 9.1 | GTT | 19.1 |
| CAC | 9.2 | GTC | 19.2 |
| | | GTA | 19.3 |
| | | GTG | 19.4 |
| ATT | 10.1 | ATG | 20.1 |
| ATC | 10.2 | | |
| ATA | 10.3 | | |
| TAA | 21.1 | | |
| TAG | 21.2 | | |
| TGA | 21.3 |
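To illustrate how the mapping rules above might be applied, the sketch below reads a DNA sequence triplet by triplet and looks up each weight. The weights are transcribed from the table as listed. Pairing each weight with its triplet position to form 2D points is an assumption made here for illustration; this record does not state the exact construction of the paper's plot set. The names `WEIGHTS` and `weight_points` are hypothetical.

```python
# Triplet weights transcribed from the mapping table above, as listed there.
WEIGHT_TABLE = """
GCT 1.1  GCC 1.2  GCA 1.3  GCG 1.4
CGT 2.1  CGC 2.2  CGA 2.3  CGG 2.4  AGA 2.5  AGG 2.6
GAT 3.3  GAC 3.4
AAT 4.1  AAC 4.2
TGT 5.1  TGC 5.2
GAA 6.1  GAG 6.2
CAA 7.1  CAG 7.2
GGT 8.1  GGC 8.2  GGA 8.3  GGG 8.4
CAT 9.1  CAC 9.2
ATT 10.1 ATC 10.2 ATA 10.3
CTT 11.1 CTC 11.2 CTA 11.3 CTG 11.4 TTA 11.5 TTG 11.6
AAA 12.3 AAG 12.4
TTT 13.1 TTC 13.2
CCT 14.1 CCC 14.2 CCA 14.3 CCG 14.4
TCT 15.1 TCC 15.2 TCA 15.3 TCG 15.4 AGT 15.5 AGC 15.6
ACT 16.3 ACC 16.4 ACA 16.5 ACG 16.6
TGG 17.3
TAT 18.1 TAC 18.2
GTT 19.1 GTC 19.2 GTA 19.3 GTG 19.4
ATG 20.1
TAA 21.1 TAG 21.2 TGA 21.3
"""

tokens = WEIGHT_TABLE.split()
WEIGHTS = {triplet: float(w) for triplet, w in zip(tokens[0::2], tokens[1::2])}


def weight_points(dna):
    """Return (triplet position, weight) pairs for a DNA coding sequence.

    Assumption: the x-coordinate is the 1-based triplet index; the paper's
    actual plotting rule may differ.
    """
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial triplet
    return [(i // 3 + 1, WEIGHTS[dna[i:i + 3]]) for i in range(0, usable, 3)]


# The first four triplets of the human sequence: ATG GTG CAC CTG.
print(weight_points("ATGGTGCACCTG"))
```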
The complete coding sequences of β-globin genes of 11 species
| Human | ATGGTGCACCTGACTCCTGAGGAGAAGTCTGCCGTTACTGCCCTGTGGGGCAAGGTGAACGTGGATGAAGTTGGTGGTGAGGCCCTGGGCAGGCTGCTGGTGGTCTACCCTTGGACCCAGAGGTTCTTTGAGTCCTTTGGGGATCTGTCCACTCCTGATGCTGTTATGGGCAACCCTAAGGTGAAGGCTCATGGCAAGAAAGTGCTCGGTGCCTTTAGTGATGGCCTGGCTCACCTGGACAACCTCAAGGGCACCTTTGCCACACTGAGTGAGCTGCACTGTGACAAGCTGCACGTGGATCCTGAGAACTTCAGGCTCCTGGGCAACGTGCTGGTCTGTGTGCTGGCCCATCACTTTGGCAAAGAATTCACCCCACCAGTGCAGGCTGCCTATCAGAAAGTGGTGGCTGGTGTGGCTAATGCCCTGGCCCACAAGTATCACTAA |
| Chimpanzee | ATGGTGCACCTGACTCCTGAGGAGAAGTCTGCCGTTACTGCCCTGTGGGGCAAGGTGAACGTGGATGAAGTTGGTGGTGAGGCCCTGGGCAGGTTGGTATCAAGGCTGCTGGTGGTCTACCCTTGGACCCAGAGGTTCTTTGAGTCCTTTGGGGATCTGTCCACTCCTGATGCTGTTATGGGCAACCCTAAGGTGAAGGCTCATGGCAAGAAAGTGCTCGGTGCCTTTAGTGATGGCCTGGCTCACCTGGACAACCTCAAGGGCACCTTTGCCACACTGAGTGAGCTGCACTGTGACAAGCTGCACGTGGATCCTGAGAACTTCAGGCTCCTGGGCAACGTGCTGGTCTGTGTGCTGGCCCATCACTTTGGCAAAG |
| Gorilla | ATGGTGCACCTGACTCCTGAGGAGAAGTCTGCCGTTACTGCCCTGTGGGGCAAGGTGAACGTGGATGAAGTTGGTGGTGAGGCCCTGGGCAGGCTGCTGGTGGTCTACCCTTGGACCCAGAGGTTCTTTGAGTCCTTTGGGGATCTGTCCACTCCTGATGCTGTTATGGGCAACCCTAAGGTGAAGGCTCATGGCAAGAAAGTGCTCGGTGCCTTTAGTGATGGCCTGGCTCACCTGGACAACCTCAAGGGCACCTTTGCCACACTGAGTGAGCTGCACTGTGACAAGCTGCACGTGGATCCTGAGAACTTCAAGCTCCTGGGCAATGTGCTGGTCTGTGTGCTGGCCCATCACTTTGGCAAAG |
| Black lemur | ATGACTTTGCTGAGTGCTGAGGAGAATGCTCATGTCACCTCTCTGTGGGGCAAGGTGGATGTAGAGAAAGTTGGTGGCGAGGCCTTGGGCAGGCTGCTGGTCGTCTACCCATGGACCCAGAGGTTCTTCGAGTCCTTTGGGGACCTGTCCTCTCCTTCTGCTGTTATGGGGAACCCTAAGGTGAAGGCCCATGGCAAGAAGGTGCTGAGTGCCTTTAGTGAAGGTCTGCATCACCTGGACAACCTCAAGGGCACCTTTGCTCAACTGAGTGAGCTGCACTGTGACAAGTTGCACGTGGATCCTCAGAACTTCACTCTCCTGGGCAACGTGCTGGTGGTTGTGCTGGCTGAACACTTTGGCAATGCATTCAGCCCGGCGGTGCAGGCTGCCTTTCAGAAGGTGGTGGCTGGTGTGGCCAATGCTCTGGCTCACAAGTACCACTGA |
| Norway rat | ATGGTGCACCTAACTGATGCTGAGAAGGCTACTGTTAGTGGCCTGTGGGGAAAGGTGAATGCTGATAATGTTGGCGCTGAGGCCCTGGGCAGGCTGCTGGTTGTCTACCCTTGGACCCAGAGGTACTTTTCTAAATTTGGGGACCTGTCCTCTGCCTCTGCTATCATGGGTAACCCCCAGGTGAAGGCCCATGGCAAGAAGGTGATAAATGCCTTCAATGATGGCCTGAAACACTTGGACAACCTCAAGGGCACCTTTGCTCATCTGAGTGAACTCCACTGTGACAAGCTGCATGTGGATCCTGAGAACTTCAGGCTCCTGGGCAATATGATTGTGATTGTGTTGGGCCACCACCTGGGCAAGGAATTCACCCCCTGTGCACAGGCTGCCTTCCAGAAGGTGGTGGCTGGAGTGGCCAGTGCCCTGGCTCACAAGTACCACTAA |
| House mouse | ATGGTGCACCTGACTGATGCTGAGAAGTCTGCTGTCTCTTGCCTGTGGGCAAAGGTGAACCCCGATGAAGTTGGTGGTGAGGCCCTGGGCAGGCTGCTGGTTGTCTACCCTTGGACCCAGCGGTACTTTGATAGCTTTGGAGACCTATCCTCTGCCTCTGCTATCATGGGTAATCCCAAGGTGAAGGCCCATGGCAAAAAGGTGATAACTGCCTTTAACGAGGGCCTGAAAAACCTGGACAACCTCAAGGGCACCTTTGCCAGCCTCAGTGAGCTCCACTGTGACAAGCTGCATGTGGATCCTGAGAACTTCAGGCTCCTAGGCAATGCGATCGTGATTGTGCTGGGCCACCACCTGGGCAAGGATTTCACCCCTGCTGCACAGGCTGCCTTCCAGAAGGTGGTGGCTGGAGTGGCCACTGCCCTGGCTCACAAGTACCACTAA |
| Goat | ATGCTGACTGCTGAGGAGAAGGCTGCCGTCACCGGCTTCTGGGGCAAGGTGAAAGTGGATGAAGTTGGTGCTGAGGCCCTGGGCAGGCTGCTGGTTGTCTACCCCTGGACTCAGAGGTTCTTTGAGCACTTTGGGGACTTGTCCTCTGCTGATGCTGTTATGAACAATGCTAAGGTGAAGGCCCATGGCAAGAAGGTGCTAGACTCCTTTAGTAACGGCATGAAGCATCTTGACGACCTCAAGGGCACCTTTGCTCAGCTGAGTGAGCTGCACTGTGATAAGCTGCACGTGGATCCTGAGAACTTCAAGCTCCTGGGCAACGTGCTGGTGGTTGTGCTGGCTCGCCACCATGGCAGTGAATTCACCCCGCTGCTGCAGGCTGAGTTTCAGAAGGTGGTGGCTGGTGTTGCCAATGCCCTGGCCCACAGATATCACTAA |
| Bovine | ATGCTGACTGCTGAGGAGAAGGCTGCCGTCACCGCCTTTTGGGGCAAGGTGAAAGTGGATGAAGTTGGTGGTGAGGCCCTGGGCAGGCTGCTGGTTGTCTACCCCTGGACTCAGAGGTTCTTTGAGTCCTTTGGGGACTTGTCCACTGCTGATGCTGTTATGAACAACCCTAAGGTGAAGGCCCATGGCAAGAAGGTGCTAGATTCCTTTAGTAATGGCATGAAGCATCTCGATGACCTCAAGGGCACCTTTGCTGCGCTGAGTGAGCTGCACTGTGATAAGCTGCATGTGGATCCTGAGAACTTCAAGCTCCTGGGCAACGTGCTAGTGGTTGTGCTGGCTCGCAATTTTGGCAAGGAATTCACCCCGGTGCTGCAGGCTGACTTTCAGAAGGTGGTGGCTGGTGTGGCCAATGCCCTGGCCCACAGATATCATTAA |
| Rabbit | ATGGTGCATCTGTCCAGTGAGGAGAAGTCTGCGGTCACTGCCCTGTGGGGCAAGGTGAATGTGGAAGAAGTTGGTGGTGAGGCCCTGGGCAGGCTGCTGGTTGTCTACCCATGGACCCAGAGGTTCTTCGAGTCCTTTGGGGACCTGTCCTCTGCAAATGCTGTTATGAACAATCCTAAGGTGAAGGCTCATGGCAAGAAGGTGCTGGCTGCCTTCAGTGAGGGTCTGAGTCACCTGGACAACCTCAAAGGCACCTTTGCTAAGCTGAGTGAACTGCACTGTGACAAGCTGCACGTGGATCCTGAGAACTTCAGGCTCCTGGGCAACGTGCTGGTTATTGTGCTGTCTCATCATTTTGGCAAAGAATTCACTCCTCAGGTGCAGGCTGCCTATCAGAAGGTGGTGGCTGGTGTGGCCAATGCCCTGGCTCACAAATACCACTGA |
| Opossum | ATGGTGCACTTGACTTCTGAGGAGAAGAACTGCATCACTACCATCTGGTCTAAGGTGCAGGTTGACCAGACTGGTGGTGAGGCCCTTGGCAGGATGCTCGTTGTCTACCCCTGGACCACCAGGTTTTTTGGGAGCTTTGGTGATCTGTCCTCTCCTGGCGCTGTCATGTCAAATTCTAAGGTTCAAGCCCATGGTGCTAAGGTGTTGACCTCCTTCGGTGAAGCAGTCAAGCATTTGGACAACCTGAAGGGTACTTATGCCAAGTTGAGTGAGCTCCACTGTGACAAGCTGCATGTGGACCCTGAGAACTTCAAGATGCTGGGGAATATCATTGTGATCTGCCTGGCTGAGCACTTTGGCAAGGATTTTACTCCTGAATGTCAGGTTGCTTGGCAGAAGCTCGTGGCTGGAGTTGCCCATGCCCTGGCCCACAAGTACCACTAA |
| ATGGTGCACTGGACTGCTGAGGAGAAGCAGCTCATCACCGGCCTCTGGGGCAAGGTCAATGTGGCCGAATGTGGGGCCGAAGCCCTGGCCAGGCTGCTGATCGTCTACCCCTGGACCCAGAGGTTCTTTGCGTCCTTTGGGAACCTCTCCAGCCCCACTGCCATCCTTGGCAACCCCATGGTCCGCGCCCACGGCAAGAAAGTGCTCACCTCCTTTGGGGATGCTGTGAAGAACCTGGACAACATCAAGAACACCTTCTCCCAACTGTCCGAACTGCATTGTGACAAGCTGCATGTGGACCCCGAGAACTTCAGGCTCCTGGGTGACATCCTCATCATTGTCCTGGCCGCCCACTTCAGCAAGGACTTCACTCCTGAATGCCAGGCTGCCTGGCAGAAGCTGGTCCGCGTGGTGGCCCATGCCCTGGCTCGCAAGTACCACTAA |
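Before sequences like those above are split into triplets, it can help to run a few basic consistency checks. The following sketch is one possible check, not part of the paper; `check_cds` and `STOP_TRIPLETS` are hypothetical names. Note that the chimpanzee and gorilla rows, as listed, end in ...GGCAAAG rather than a stop triplet, so such a check would flag them.

```python
STOP_TRIPLETS = {"TAA", "TAG", "TGA"}


def check_cds(seq):
    """Report basic properties expected of a complete coding sequence."""
    seq = seq.upper()
    return {
        "only_acgt": set(seq) <= set("ACGT"),
        "starts_with_atg": seq.startswith("ATG"),
        "length_multiple_of_3": len(seq) % 3 == 0,
        "ends_with_stop": seq[-3:] in STOP_TRIPLETS,
    }


# Short illustrative fragments, not full sequences from the table.
complete = check_cds("ATGGTGCACTAA")
truncated = check_cds("ATGGTGCACTTTGGCAAAG")
print(complete)
print(truncated)
```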
Figure 1. The 2D graphical representation of the complete coding sequence of the β-globin gene of human.
Figure 2. The 2D graphical representation of the complete coding sequence of the β-globin gene of chimpanzee.
Figure 3. The 2D graphical representation of the complete coding sequence of the β-globin gene of opossum.
The similarity/dissimilarity matrix for the coding sequences of Table 1 based on the weight deviation
| Human | 0 | 5.2500 | 4.3359 | 8.5891 | 10.670 | 9.7047 | 8.2219 | 8.1438 | 7.8281 | 15.6078 | 16.7109 |
| Chimpanzee | | 0 | 1.1266 | 8.0297 | 10.645 | 9.6016 | 8.4375 | 9.3219 | 9.6000 | 14.2578 | 15.8734 |
| Gorilla | | | 0 | 7.8688 | 9.9625 | 8.6063 | 7.6734 | 8.5578 | 8.5547 | 13.9719 | 14.8781 |
| Lemur | | | | 0 | 8.7219 | 9.5500 | 7.1328 | 9.3891 | 5.6891 | 12.9281 | 15.2000 |
| Rat | | | | | 0 | 6.0750 | 7.0484 | 9.3641 | 9.6578 | 13.5906 | 14.1219 |
| Mouse | | | | | | 0 | 9.4953 | 9.2641 | 10.7984 | 12.3406 | 12.3688 |
| Goat | | | | | | | 0 | 5.2625 | 8.7219 | 11.9703 | 14.5359 |
| Bovine | | | | | | | | 0 | 9.2906 | 12.5922 | 15.0234 |
| Rabbit | | | | | | | | | 0 | 14.8984 | 15.6953 |
| Opossum | | | | | | | | | | 0 | 14.2750 |
| 0 |
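The upper-triangular matrix above can be expanded into a full symmetric matrix and queried for the closest pairs. The sketch below is a reading aid under stated assumptions: the values are transcribed from the table, the label "Species 11" is a placeholder because the eleventh species name is missing from this record, and `nearest` and `closest_pair` are hypothetical names.

```python
# Row labels from the table; "Species 11" is a placeholder for the unnamed last row.
LABELS = ["Human", "Chimpanzee", "Gorilla", "Lemur", "Rat", "Mouse",
          "Goat", "Bovine", "Rabbit", "Opossum", "Species 11"]

# Off-diagonal entries of each row, left to right, as listed in the table.
UPPER = [
    [5.2500, 4.3359, 8.5891, 10.670, 9.7047, 8.2219, 8.1438, 7.8281, 15.6078, 16.7109],
    [1.1266, 8.0297, 10.645, 9.6016, 8.4375, 9.3219, 9.6000, 14.2578, 15.8734],
    [7.8688, 9.9625, 8.6063, 7.6734, 8.5578, 8.5547, 13.9719, 14.8781],
    [8.7219, 9.5500, 7.1328, 9.3891, 5.6891, 12.9281, 15.2000],
    [6.0750, 7.0484, 9.3641, 9.6578, 13.5906, 14.1219],
    [9.4953, 9.2641, 10.7984, 12.3406, 12.3688],
    [5.2625, 8.7219, 11.9703, 14.5359],
    [9.2906, 12.5922, 15.0234],
    [14.8984, 15.6953],
    [14.2750],
]

# Expand the upper triangle into a full symmetric matrix with a zero diagonal.
n = len(LABELS)
dist = [[0.0] * n for _ in range(n)]
for i, row in enumerate(UPPER):
    for offset, value in enumerate(row, start=1):
        j = i + offset
        dist[i][j] = dist[j][i] = value


def nearest(label):
    """Return the species with the smallest distance to the given species."""
    i = LABELS.index(label)
    j = min((k for k in range(n) if k != i), key=lambda k: dist[i][k])
    return LABELS[j]


# The pair with the smallest off-diagonal entry overall.
closest_pair = min(
    ((LABELS[i], LABELS[j], dist[i][j]) for i in range(n) for j in range(i + 1, n)),
    key=lambda item: item[2],
)

print(closest_pair)
print({name: nearest(name) for name in ("Human", "Rat", "Goat")})
```

Smaller entries indicate more similar sequences under this measure, so the lookup makes pairings such as rat with mouse and goat with bovine easy to read off the matrix.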
Figure 4. Phylogenetic tree based on the similarity matrix (Table 5).
The similarity/dissimilarity of the coding sequences
| Chimpanzee | 5.2500 | 0.0144 | 14.00 | 0.005069 | 0.863 |
| Gorilla | 4.3359 | 0.0125 | 13.63 | 0.006611 | 0.339 |
| Lemur | 8.5891 | - | 31.75 | 0.030894 | 1.188 |
| Rat | 10.670 | 0.1377 | 41.65 | 0.015539 | 1.966 |
| Mouse | 9.7047 | 0.1427 | 30.27 | 0.015700 | 0.735 |
| Goat | 8.2219 | 0.1161 | 31.39 | 0.020980 | 0.311 |
| Bovine | 8.1438 | 0.0773 | 30.68 | 0.017700 | 2.489 |
| Rabbit | 7.8281 | 0.1332 | 35.575 | 0.015788 | 1.372 |
| Opossum | 15.6078 | - | 48.701 | 0.033363 | 6.322 |
| 16.7109 | - | 70.46 | 0.025801 | 7.170 |