Krzysztof Kowal, Angelika Tkaczyk, Tomasz Ząbek, Mariusz Pierzchała, Brygida Ślaska.
Abstract
Information about mtDNA methylation is still limited, so this epigenetic modification remains poorly understood. The lack of comprehensive data on the comparative epigenomics of mtDNA calls for systematic investigation of the epigenomic modification of mtDNA in different species. This is the first study to compare the theoretical CpG localization in the mtDNA reference sequences of 12 species. The aim of the study was to determine the localization of CpG sites and islands in the mtDNA of model organisms and to compare their distribution. The results are suitable for further investigations of mtDNA methylation. The analysis involved both strands of the mtDNA sequences of animal model organisms representing different taxonomic groups of invertebrates and vertebrates. For each sequence, parameters such as the number, length, and localization of CpG islands were determined with EMBOSS (European Molecular Biology Open Software Suite) software. The number of CpG sites in each sequence was determined with the newcpgseek algorithm. The results showed that methylation of mtDNA in the analysed species involved mitochondrial gene expression. Our analyses showed that CpG sites were commonly present in genomic regions including the D-loop, CYTB, ND6, ND5, ND4, ND3, ND2, ND1, COX3, COX2, COX1, ATP6, 16S rRNA, and 12S rRNA. The CpG distribution varied among the analysed species. Generally, the number of observed CpG sites in the mitochondrial genome was higher in the vertebrates than in the invertebrates. However, there was no relationship between the frequency of CpG sites in the mitochondrial genome and the complexity of the analysed organisms. Interestingly, in vertebrates the CpG sites of tRNA coding genes were usually clustered within a larger CpG region.
This paper may be a starting point for further research, since the collected information indicates possible methylation regions localized in the mtDNA of different species, including invertebrates and vertebrates.
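The quantities the abstract refers to (CpG count, %C + %G, and the observed/expected ratio) can be illustrated with a short Python sketch. It assumes the commonly used definition Obs/Exp = (number of CpG × length) / (number of C × number of G); the exact internals of the EMBOSS programs may differ, and the function names `reverse_complement` and `cpg_stats` are hypothetical helpers introduced only for this illustration.

```python
# Minimal sketch (not the EMBOSS implementation): basic CpG statistics
# for one sequence window, plus a helper to analyse the opposite strand.

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the opposite strand, read 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def cpg_stats(seq):
    """Count CpG dinucleotides and derive %C+G and the Obs/Exp ratio.

    Assumed formula: Obs/Exp = (CpG count * length) / (C count * G count).
    """
    s = seq.upper()
    n = len(s)
    c = s.count("C")
    g = s.count("G")
    cpg = s.count("CG")  # "CG" cannot overlap itself, so count() is exact
    pct_cg = 100.0 * (c + g) / n if n else 0.0
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return {"length": n, "cpg": cpg, "pct_cg": pct_cg, "obs_exp": obs_exp}


if __name__ == "__main__":
    window = "ATCGCGTACGGCTA"  # toy sequence, not real mtDNA
    print(cpg_stats(window))
    print(cpg_stats(reverse_complement(window)))
```

Because the CpG dinucleotide is its own reverse complement, this raw count is identical on both strands; strand-specific differences in a region-based tool therefore come from how regions are scored and delimited, not from the dinucleotide count itself.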
Keywords: CpG sites; model organisms; mtDNA
Year: 2020 PMID: 32290485 PMCID: PMC7222804 DOI: 10.3390/ani10040665
Source DB: PubMed Journal: Animals (Basel) ISSN: 2076-2615 Impact factor: 2.752
MtDNA reference sequences of analysed model organisms.
| Organism | Accession Number of Reference Sequence * | Length of MtDNA (bp **) |
|---|---|---|
| | | |
| | NC_001328.1 | 13,794 |
| | NC_024511.2 | 19,524 |
| | NC_026914.1 | 14,948 |
| | | |
| | NC_001804.1 | 16,407 |
| | NC_002333.2 | 16,596 |
| | NC_005797.1 | 16,369 |
| | NC_040970.1 | 16,785 |
| | NC_005089.1 | 16,299 |
| | NC_002008.4 | 16,727 |
| | NC_008143.1 | 16,916 |
| | KM679417.1 | 16,559 |
| | NC_012920.1 | 16,569 |
* NC, KM—nucleotide accession prefixes. ** bp—base pair.
Positions of CpG islands in the mitochondrial genomes of the analysed animals on the light strand.
| Organism | Genome Length (bp *) | % GC ** | Positions of CpG Islands *** | Genome Region | Length of CpG Islands (bp) | Sum of C+G **** | %C + %G | Obs/Exp ***** |
|---|---|---|---|---|---|---|---|---|
| | 16,596 | 0.40 | 3281..3531 | | 251 | 126 | 50.20 | 0.95 |
| | | | 6205..6432 | rep_origin, | 228 | 120 | 52.63 | 0.91 |
| | 16,785 | 0.46 | 8703..8925 | | 223 | 118 | 52.91 | 0.97 |
| | 16,727 | 0.40 | 16,137..16,449 | D-loop | 313 | 170 | 54.31 | 2.71 |
| | 16,559 | 0.44 | 14,246..14,447 | | 202 | 103 | 50.99 | 1.27 |
| | 16,569 | 0.44 | 7764..8036 | | 273 | 137 | 50.18 | 1.13 |
* bp—base pair. ** guanine–cytosine (GC) base pairs. *** guanine-cytosine-rich regions (CpG islands). **** cytosine (C), guanine (G). ***** the observed/expected ratio.
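The columns of the light-strand table are related arithmetically: the island length follows from its inclusive coordinates, and %C + %G is the sum of C+G divided by that length. A minimal Python sketch of this check is given below, using the six rows reported above; the helper names `island_length` and `percent_cg` are hypothetical.

```python
# Consistency check for the light-strand CpG island table.
# Each tuple: (start, stop, reported length, reported sum of C+G,
#              reported %C + %G), copied from the table.
ROWS = [
    (3281, 3531, 251, 126, 50.20),
    (6205, 6432, 228, 120, 52.63),
    (8703, 8925, 223, 118, 52.91),
    (16137, 16449, 313, 170, 54.31),
    (14246, 14447, 202, 103, 50.99),
    (7764, 8036, 273, 137, 50.18),
]


def island_length(start, stop):
    """Length of an island given inclusive 1-based coordinates."""
    return stop - start + 1


def percent_cg(sum_cg, length):
    """Share of C and G bases in the island, in percent."""
    return 100.0 * sum_cg / length


if __name__ == "__main__":
    for start, stop, length, sum_cg, pct in ROWS:
        ok_len = island_length(start, stop) == length
        ok_pct = abs(percent_cg(sum_cg, length) - pct) <= 0.01
        print(f"{start}..{stop}: length ok={ok_len}, %C+G ok={ok_pct}")
```

The same two relations can be applied to the heavy-strand table wherever both the coordinates and the counts are available.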
Positions of CpG islands in the mtDNA of the analysed animals on the H strand *.
| Organism | Genome Length (bp **) | % GC *** | Start and Stop of MtDNA Sequence **** | MtDNA Region | Length of CpG Islands (bp) ***** | Sum of C+G | %C + %G | Obs/Exp ****** |
|---|---|---|---|---|---|---|---|---|
| | 16,596 | 0.40 | 981..1180 | | 200 | 105 | 52.50 | 1.31 |
| | | | 6205..6432 | | 228 | 120 | 52.63 | 1.17 |
| | 16,407 | 0.42 | 145..370 | | | | | |
| | 16,916 | 0.43 | 51..311 | | | | | |
| | | | 12,371..12,699 | | | | | |
| | 16,785 | 0.46 | 1784..1992 | | | | | |
| | | | 6901..7108 | | 208 | 111 | 53.37 | 1.25 |
| | | | 9456..9794 | | 339 | 174 | 51.33 | 1.25 |
| | | | 9920..10,551 | | 632 | 323 | 51.11 | 0.99 |
| | | | 13,647..13,925 | | | | | |
| | | | 14,984..15,210 | | 227 | 119 | 52.42 | 1.20 |
| | | | 16,297..16,508 | | 212 | 110 | 51.89 | 0.99 |
| | 16,727 | 0.40 | 16,179..16,449 | D-loop VNTR (16,130..16,430) | 271 | 149 | 54.98 | 0.83 |
| | 16,559 | 0.44 | 2848..3136 | | 289 | 146 | 50.52 | 1.26 |
| | | | 5572..5779 | | 208 | 112 | 53.85 | 1.17 |
| | | | 12,379..12,642 | | | | | |
| | | | 14,246..14,447 | | 202 | 103 | 50.99 | 1.27 |
| | 16,569 | 0.44 | 1123..1352 | | | | | |
| | | | 3382..3717 | | 336 | 178 | 52.98 | 1.26 |
| | | | 12,907..13,115 | | | | | |
| | | | 14,804..15,044 | | 241 | 126 | 52.28 | 1.33 |
* genes in which CpG sites are frequently distributed among species were marked with bold font (genes encoded on the L strand). ** bp—base pair. *** guanine–cytosine (GC) base pairs. **** mitochondrial DNA (mtDNA). ***** guanine-cytosine-rich regions (CpG islands). ****** the observed/expected ratio.
Distribution of CpG sites in the mtDNA of the analysed animals including the L- strand and the H- strand.
| Strand | L | H | L | H | L | H | L | H | L | H | L | H | L | H | L | H | L | H | L | H | L | H | L | H |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Genomic region | |
| | 2 | 5 |
| | 2 | 4 |
| | 4 | 6 | 5 | 2 |
| | 2 | 2 | 2 | 2 |
| | 3 | 2 | 3 | 2 | 3 | 3 | 2 | 2 | 3 | 2 | 3 |
| | 2 | 2 | 3 |
| | 2 |
| | 3 | 2 |
| | 3 | 18 | 3 | 3 | 3 |
| | 5 | 3 | 3 |
| | 4 |
| | 4 |
| | 2 | 2 |
| | 3 |
| | 3 | 2 |
| | 2 | 15 | 2 | 3 | 8 | 4 | 7 | 18 | 2 | 6 | 2 | 6 | 10 |
| | 2 |
| | 2 | 4 | 2 | 7 | 4 |
| | 2 | 2 |
| | 6 | 4 |
| | 3 |
| | 2 |
| | 2 |
| | 14 | 40 | 79 | 174 | 16 | 28 | 143 | 308 | 163 | 328 | 98 | 226 | 247 | 492 | 183 | 326 | 110 | 201 | 192 | 317 | 170 | 332 | 196 | 356 |
* genes in which CpG sites are frequently distributed among species were marked with bold font. ** guanine-cytosine-rich sequences.
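The last row of the table above contains 24 values, one per strand column. A minimal Python sketch for summarising that row is shown below. It rests on two assumptions that the table itself does not state: that the row holds per-strand totals, and that the values alternate L, H in the same order as the Strand row. The name `strand_pairs` is a hypothetical helper.

```python
# Final row of the CpG-site distribution table, copied verbatim.
# ASSUMPTION: these are per-strand totals, alternating L, H per genome.
LAST_ROW = [
    14, 40, 79, 174, 16, 28, 143, 308, 163, 328, 98, 226,
    247, 492, 183, 326, 110, 201, 192, 317, 170, 332, 196, 356,
]


def strand_pairs(values):
    """Group a flat L, H, L, H, ... list into (L, H) tuples."""
    if len(values) % 2:
        raise ValueError("expected an even number of strand values")
    return list(zip(values[0::2], values[1::2]))


if __name__ == "__main__":
    for index, (light, heavy) in enumerate(strand_pairs(LAST_ROW), start=1):
        combined = light + heavy
        ratio = heavy / light
        print(f"genome {index:2d}: L={light:3d} H={heavy:3d} "
              f"L+H={combined:3d} H/L={ratio:.2f}")
```

Under these assumptions, every genome shows a larger value on the H strand than on the L strand.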
Distribution of CpG sites in regions overlapping more than one gene in mtDNA. *
| Species | Strand | Start and Stop of MtDNA Sequence | CpG Count | Genes/Replication Origin Region |
|---|---|---|---|---|
| | L | 3341..3356 | 3 | |
| | L | 2762..2788 | 4 | |
| | H | 1106..1134 | 4 | |
| | H | 2693..2819 | 12 | |
| | H | 7857..7908 | 6 | |
| | H | 8526..8861 | 25 | |
| | H | 15,468..15,523 | 6 | |
| | L | 11,558..11,579 | 3 | |
| | H | 951..1402 | 36 | |
| | H | 3727..3873 | 12 | |
| | H | 8802..8845 | 4 | |
| | H | 9538..9829 | 23 | |
| | H | 10,883..11,253 | 26 | |
| | L | 15,333..15,346 | 2 | |
| | H | 2606..2649 | 5 | |
| | H | 15,336..15,355 | 3 | |
| | L | 11,619..11,679 | 4 | |
| | L | 13,688..13,713 | 4 | |
| | H | 3624..3720 | 11 | |
| | H | 4664..5023 | 24 | |
| | H | 7648..7931 | 21 | |
| | H | 9918..9935 | 2 | |
| | H | 11,590..11,617 | 4 | |
| | H | 11,822..12,011 | 15 | |
| | H | 1199..2726 | 98 | |
| | H | 4971..5040 | 8 | |
| | H | 6404..6523 | 10 | |
| | H | 9542..10,097 | 37 | |
| | L | 7969..7991 | 3 | |
| | H | 2652..2692 | 5 | |
| | H | 4983..5183 | 12 | |
| | H | 7982..7995 | 3 | |
| | L | 5156..5187 | 4 | |
| | L | 7951..8003 | 5 | |
| | H | 7964..8004 | 6 | |
| | H | 8558..8720 | 12 | |
| | L | 5737..5768 | 5 | |
* CpG sites that are frequently repeated in the overlapping replication origin region, tRNA encoding genes, and COX1 gene were marked with bold font (genes encoded on the L-strand).
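To compare the overlapping regions listed above on a common scale, the CpG count can be related to the span length. The following Python sketch parses a coordinate string in the `start..stop` notation used in the table, accepting thousands separators and two or three dots as the separator, and reports CpG sites per 100 bp. The functions `parse_span` and `cpg_per_100bp` are hypothetical helpers, and density per 100 bp is an illustrative measure that the study itself does not report.

```python
import re

# Matches "1199..2726" or "15,468..15,523"; also accepts a third dot.
SPAN_PATTERN = re.compile(r"\s*([\d,]+)\.{2,3}([\d,]+)\s*")


def parse_span(text):
    """Convert a 'start..stop' string into a tuple of integers."""
    match = SPAN_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"unrecognised coordinate span: {text!r}")
    start = int(match.group(1).replace(",", ""))
    stop = int(match.group(2).replace(",", ""))
    if stop < start:
        raise ValueError(f"stop precedes start in {text!r}")
    return start, stop


def cpg_per_100bp(span, cpg_count):
    """CpG sites per 100 bp over an inclusive coordinate span."""
    start, stop = parse_span(span)
    length = stop - start + 1
    return 100.0 * cpg_count / length


if __name__ == "__main__":
    # Three rows taken from the table: (span, CpG count).
    for span, count in [("1199..2726", 98), ("9542..10,097", 37),
                        ("3341..3356", 3)]:
        print(f"{span}: {cpg_per_100bp(span, count):.2f} CpG per 100 bp")
```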