Zhiyi Mo, Wen Zhu, Yi Sun, Qilin Xiang, Ming Zheng, Min Chen, Zejun Li.
Abstract
A novel representation of DNA sequences that combines the global and local position information of the original sequence is proposed to distinguish different species. First, to exploit global information, a graphical representation of the DNA sequence is formulated along the curve of the Fermat spiral. Then, to capture the local characteristics of the sequence, each point on the Fermat spiral is assigned a mass based on the relationships among four neighboring nucleotides. The normalized moments of inertia of the Fermat spiral curve, which is composed of these points with mass, are calculated as the numerical description of the corresponding DNA sequence, using the first exons of beta-globin genes. With the Euclidean distance as the measure between numerical descriptions, the resulting similarities between species demonstrate the performance of the proposed method.
Year: 2018 PMID: 29765099 PMCID: PMC5953932 DOI: 10.1038/s41598-018-26005-3
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. The graphical representation of the human gene. From left to right and from top to bottom, the panels show the A, C, G and T subsequences, respectively.
The first exons of β-globin gene of different species.
| k | Species | Gene ID | N |
|---|---|---|---|
| 1 | Human | U01317 | 92 |
| 2 | Gorilla | X61109 | 93 |
| 3 | Chimpanzee | X02345 | 105 |
| 4 | Rat | X06701 | 92 |
| 5 | Mouse | V00722 | 93 |
| 6 | Lemur | M15734 | 92 |
| 7 | Rabbit | V00882 | 92 |
| 8 | Goat | M15387 | 86 |
| 9 | Bovine | X00376 | 86 |
| 10 | Opossum | J03643 | 92 |
| 11 | Gallus | V00409 | 92 |
The numerical representation of DNA sequence.
| Species | | | | |
|---|---|---|---|---|
| Human | 1.6674 | 1.7921 | 1.7233 | 1.7689 |
| Gorilla | 1.6674 | 1.7921 | 1.7233 | 1.7727 |
| Chimpanzee | 1.6885 | 1.8085 | 1.7239 | 1.7845 |
| Rat | 1.6074 | 1.7858 | 1.8242 | 1.7462 |
| Mouse | 1.5943 | 1.7480 | 1.8407 | 1.7921 |
| Lemur | 1.6781 | 1.6659 | 1.8149 | 1.8046 |
| Rabbit | 1.6454 | 1.7145 | 1.8690 | 1.7809 |
| Goat | 1.5416 | 1.6808 | 1.8246 | 1.8476 |
| Bovine | 1.5416 | 1.5929 | 1.8056 | 1.8471 |
| Opossum | 1.5693 | 1.6713 | 1.9416 | 1.7398 |
| Gallus | 1.7986 | 1.6879 | 1.9639 | 1.7140 |
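The link between these descriptors and the Euclidean distances reported for them can be checked directly. The following is a minimal sketch, assuming each row of the table is treated as a four-component vector; the names `descriptors` and `euclidean` are illustrative and do not come from the paper.

```python
import math

# Four-component descriptors copied from the table above (three species only).
descriptors = {
    "Human": (1.6674, 1.7921, 1.7233, 1.7689),
    "Gorilla": (1.6674, 1.7921, 1.7233, 1.7727),
    "Chimpanzee": (1.6885, 1.8085, 1.7239, 1.7845),
}


def euclidean(u, v):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


d_gorilla = euclidean(descriptors["Human"], descriptors["Gorilla"])
d_chimp = euclidean(descriptors["Human"], descriptors["Chimpanzee"])

print(round(d_gorilla, 4))  # 0.0038
print(round(d_chimp, 4))    # 0.0309
```

Both values agree with the Human-Gorilla and Human-Chimpanzee distances reported for this method. Other pairs may differ in the last digit, since the descriptors in the table are themselves rounded to four decimals.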
Similarity/dissimilarity matrix under the Euclidean distance.
| Species | Human | Gorilla | Chimp | Rat | Mouse | Lemur | Rabbit | Goat | Bovine | Opossum | Gallus |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Human | 0 | 0.0038 | 0.0309 | 0.1198 | 0.1470 | 0.1604 | 0.1670 | 0.2114 | 0.2616 | 0.2696 | 0.2983 |
| Gorilla | 0.0038 | 0 | 0.0292 | 0.1206 | 0.1465 | 0.1596 | 0.1668 | 0.2100 | 0.2604 | 0.2701 | 0.2991 |
| Chimp | 0.0309 | 0.0292 | 0 | 0.1365 | 0.1620 | 0.1707 | 0.1782 | 0.2281 | 0.2804 | 0.2870 | 0.2988 |
| Rat | 0.1198 | 0.1206 | 0.1365 | 0 | 0.0631 | 0.1513 | 0.0987 | 0.1602 | 0.2282 | 0.1684 | 0.2583 |
| Mouse | 0.1470 | 0.1465 | 0.1620 | 0.0631 | 0 | 0.1208 | 0.0683 | 0.1031 | 0.1763 | 0.1393 | 0.2582 |
| Lemur | 0.1604 | 0.1596 | 0.1707 | 0.1513 | 0.1208 | 0 | 0.0832 | 0.1443 | 0.1608 | 0.1792 | 0.2131 |
| Rabbit | 0.1670 | 0.1668 | 0.1782 | 0.0987 | 0.0683 | 0.0832 | 0 | 0.1355 | 0.1844 | 0.1209 | 0.1940 |
| Goat | 0.2114 | 0.2100 | 0.2281 | 0.1602 | 0.1031 | 0.1443 | 0.1355 | 0 | 0.0900 | 0.1618 | 0.3215 |
| Bovine | 0.2616 | 0.2604 | 0.2804 | 0.2282 | 0.1763 | 0.1608 | 0.1844 | 0.0900 | 0 | 0.1921 | 0.3433 |
| Opossum | 0.2696 | 0.2701 | 0.2870 | 0.1684 | 0.1393 | 0.1792 | 0.1209 | 0.1618 | 0.1921 | 0 | 0.2324 |
| Gallus | 0.2983 | 0.2991 | 0.2988 | 0.2583 | 0.2582 | 0.2131 | 0.1940 | 0.3215 | 0.3433 | 0.2324 | 0 |
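One simple way to read this matrix is to look up the closest other species for each row. The sketch below does only that; it is a plain nearest-neighbour lookup, not the clustering procedure behind the dendrogram, which is not specified here. The names `species`, `matrix` and `nearest_neighbour` are illustrative.

```python
species = ["Human", "Gorilla", "Chimp", "Rat", "Mouse", "Lemur",
           "Rabbit", "Goat", "Bovine", "Opossum", "Gallus"]

# Rows copied from the Euclidean distance matrix above, in the same order.
matrix = [
    [0, 0.0038, 0.0309, 0.1198, 0.1470, 0.1604, 0.1670, 0.2114, 0.2616, 0.2696, 0.2983],
    [0.0038, 0, 0.0292, 0.1206, 0.1465, 0.1596, 0.1668, 0.2100, 0.2604, 0.2701, 0.2991],
    [0.0309, 0.0292, 0, 0.1365, 0.1620, 0.1707, 0.1782, 0.2281, 0.2804, 0.2870, 0.2988],
    [0.1198, 0.1206, 0.1365, 0, 0.0631, 0.1513, 0.0987, 0.1602, 0.2282, 0.1684, 0.2583],
    [0.1470, 0.1465, 0.1620, 0.0631, 0, 0.1208, 0.0683, 0.1031, 0.1763, 0.1393, 0.2582],
    [0.1604, 0.1596, 0.1707, 0.1513, 0.1208, 0, 0.0832, 0.1443, 0.1608, 0.1792, 0.2131],
    [0.1670, 0.1668, 0.1782, 0.0987, 0.0683, 0.0832, 0, 0.1355, 0.1844, 0.1209, 0.1940],
    [0.2114, 0.2100, 0.2281, 0.1602, 0.1031, 0.1443, 0.1355, 0, 0.0900, 0.1618, 0.3215],
    [0.2616, 0.2604, 0.2804, 0.2282, 0.1763, 0.1608, 0.1844, 0.0900, 0, 0.1921, 0.3433],
    [0.2696, 0.2701, 0.2870, 0.1684, 0.1393, 0.1792, 0.1209, 0.1618, 0.1921, 0, 0.2324],
    [0.2983, 0.2991, 0.2988, 0.2583, 0.2582, 0.2131, 0.1940, 0.3215, 0.3433, 0.2324, 0],
]


def nearest_neighbour(i):
    """Return (distance, name) of the species closest to species[i], excluding itself."""
    candidates = [(matrix[i][j], species[j]) for j in range(len(species)) if j != i]
    return min(candidates)


for i, name in enumerate(species):
    dist, other = nearest_neighbour(i)
    print(f"{name:8s} -> {other:8s} ({dist:.4f})")
```

Under this reading, Human and Gorilla are each other's nearest neighbours, as are Rat and Mouse, and Goat and Bovine.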
Similarity/dissimilarity between Human and other species with different methods.
| Methods | Gorilla | Chimp | Rat | Mouse | Lemur | Rabbit | Goat | Bovine | Opossum | Gallus |
|---|---|---|---|---|---|---|---|---|---|---|
| Our work | 0.0038 | 0.0309 | 0.1198 | 0.1470 | 0.1604 | 0.1670 | 0.2114 | 0.2616 | 0.2696 | 0.2983 |
| Randic | 0.0210 | 0.0170 | 0.0430 | 0.0830 | 0.0870 | 0.0420 | 0.0610 | 0.0840 | 0.1480 | 0.1090 |
| Dai | 0.0120 | 0.0155 | 0.0704 | 0.0543 | 0.0603 | 0.0287 | 0.0169 | 0.0276 | 0.1389 | 0.1146 |
| Liu and Wang 2006 | 0.3070 | 0.3101 | 0.4256 | 0.3089 | 0.3688 | 0.2968 | 0.4341 | 0.4172 | 0.3805 | 0.4479 |
| Liao | 0.1651 | 0.4688 | 0.9202 | 0.6024 | 1.0110 | 0.7453 | 0.6010 | 0.6320 | 1.3710 | 1.5932 |
| Jafarzadeh | 0.0330 | 0.0920 | 0.2160 | 0.1630 | 0.1940 | 0.1240 | 0.1650 | 0.2210 | 0.1940 | 0.1940 |
| Bielinska-Waz | 0.0056 | 0.0314 | 0.1838 | 0.2395 | 0.2497 | 0.1844 | 0.1276 | 0.0872 | 0.3904 | 0.4687 |
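Because the methods in this table report values on different scales, comparing the order in which each method ranks the species is often more informative than comparing raw numbers. The sketch below is one possible way to do that, assuming the rows are copied as listed; `methods` and `ranking` are illustrative names, and ties are broken alphabetically by species name.

```python
species = ["Gorilla", "Chimp", "Rat", "Mouse", "Lemur",
           "Rabbit", "Goat", "Bovine", "Opossum", "Gallus"]

# Distances from Human to each species, one row per method, copied from the table above.
methods = {
    "Our work": [0.0038, 0.0309, 0.1198, 0.1470, 0.1604, 0.1670, 0.2114, 0.2616, 0.2696, 0.2983],
    "Randic": [0.0210, 0.0170, 0.0430, 0.0830, 0.0870, 0.0420, 0.0610, 0.0840, 0.1480, 0.1090],
    "Dai": [0.0120, 0.0155, 0.0704, 0.0543, 0.0603, 0.0287, 0.0169, 0.0276, 0.1389, 0.1146],
    "Liu and Wang 2006": [0.3070, 0.3101, 0.4256, 0.3089, 0.3688, 0.2968, 0.4341, 0.4172, 0.3805, 0.4479],
    "Liao": [0.1651, 0.4688, 0.9202, 0.6024, 1.0110, 0.7453, 0.6010, 0.6320, 1.3710, 1.5932],
    "Jafarzadeh": [0.0330, 0.0920, 0.2160, 0.1630, 0.1940, 0.1240, 0.1650, 0.2210, 0.1940, 0.1940],
    "Bielinska-Waz": [0.0056, 0.0314, 0.1838, 0.2395, 0.2497, 0.1844, 0.1276, 0.0872, 0.3904, 0.4687],
}


def ranking(values):
    """Species ordered from most to least similar to Human (smallest distance first)."""
    return [name for _, name in sorted(zip(values, species))]


for method, values in methods.items():
    order = ranking(values)
    print(f"{method:18s} closest: {order[0]:8s} farthest: {order[-1]}")
```

In this table, five of the seven methods place Gorilla closest to Human; Randic places Chimp first and Liu and Wang 2006 places Rabbit first.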
Figure 2. Cluster dendrogram.
Figure 3. Similarity values between human and other species with different methods.