Jan Engelhardt, Oliver Scheer, Peter F Stadler, Sonja J Prohaska.
Abstract
DNA methylation is a crucial, abundant mechanism of gene regulation in vertebrates. It is less prevalent in many other metazoan organisms and completely absent in some key model species, such as Drosophila melanogaster and Caenorhabditis elegans. We report here a comprehensive study of the presence and absence of DNA methyltransferases (DNMTs) in 138 Ecdysozoa, covering Arthropoda, Nematoda, Priapulida, Onychophora, and Tardigrada. Three of these phyla have not been investigated for the presence of DNA methylation before. We observe that the loss of individual DNMTs occurred independently multiple times across ecdysozoan phyla. We computationally predict the presence of DNA methylation based on CpG rates in coding sequences using an implementation of Gaussian Mixture Modeling, MethMod. Integrating both analyses, we predict two previously unknown losses of DNA methylation in Ecdysozoa, one within Chelicerata (Mesostigmata) and one in Tardigrada. In the early-branching ecdysozoan Priapulus caudatus, we predict the presence of a full set of DNMTs and the presence of DNA methylation. Our results thus show a very diverse and independent evolution of DNA methylation in different ecdysozoan phyla, spanning a phylogenetic range of more than 700 million years.
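The observed/expected (O/E) CpG ratio that underlies the methylation prediction can be illustrated with a minimal sketch. It assumes one commonly used normalization, CpG count times sequence length divided by the product of the C and G counts; the exact formula used by MethMod is not stated in this record, and the function name `oe_cpg` is hypothetical.

```python
import math


def oe_cpg(seq: str) -> float:
    """Observed/expected CpG ratio of a nucleotide sequence.

    Assumed normalization (not confirmed by the source):
        O/E = (#CpG * length) / (#C * #G)
    Returns NaN when the sequence contains no C or no G.
    """
    s = seq.upper()
    n = len(s)
    c = s.count("C")
    g = s.count("G")
    cpg = s.count("CG")  # "CG" cannot overlap itself, so count() is exact
    if c == 0 or g == 0:
        return math.nan
    return cpg * n / (c * g)
```

A ratio well below 1 indicates CpG depletion, the signature expected in methylated coding sequences, whereas values near 1 are expected when CpGs are not depleted.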
Keywords: DNA methyltransferase; Evolutionary epigenetics; Gaussian mixture modeling; Observed/expected CpG ratio
Year: 2022 PMID: 35089376 PMCID: PMC8821070 DOI: 10.1007/s00239-021-10042-0
Source DB: PubMed Journal: J Mol Evol ISSN: 0022-2844 Impact factor: 2.395
Fig. 1 Conserved domains of animal DNA methyltransferases. Scaling and numbers refer to the human homologs
Fig. 2 Overview of the metazoan phylogeny with a focus on Ecdysozoa. The number of species per group used in this study is given in brackets. Lophotrochozoa and Deuterostomia are shown for orientation only
Fig. 3 Presence and absence of DNMT family members in Arthropoda, indicated by filled and open symbols, respectively, for DNMT1 (red), DNMT2 (green), and DNMT3 (blue). Data sources are indicated by symbol shape: circle, proteome; square, genome; triangle, transcriptome. The rightmost column (golden circles) shows the presence and absence of DNA methylation as predicted from the O/E CpG ratio. The absence of a golden circle indicates missing data. The species list is given on a turquoise background, with alternating shades indicating order membership. The name of the order (or of a suitable higher group, marked with an asterisk *) is given in bold. Alternating shades of brown indicate (from top to bottom) Chelicerata, Myriapoda, Multicrustacea, Branchiopoda, and Hexapoda. Stars in the species tree denote proposed loss events, inferred from the absence of a DNMT in all species of a subtree comprising at least two leaves, disregarding absences in species with transcriptomic data only (Color figure online)
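The loss-inference rule stated in this legend can be sketched as a small tree traversal. This is an illustration under stated assumptions, not the authors' implementation: the tree is a nested tuple of species names, `calls` maps each species to a `(present, source)` pair, an absence supported only by a transcriptome counts as unknown, and a subtree needs at least two informative absences to be reported. All names are hypothetical.

```python
from typing import Dict, List, Tuple, Union

Tree = Union[str, tuple]  # a leaf is a species name, an inner node a tuple of subtrees


def leaves(tree: Tree) -> List[str]:
    """Return all species names below a node, left to right."""
    if isinstance(tree, str):
        return [tree]
    names: List[str] = []
    for child in tree:
        names.extend(leaves(child))
    return names


def status(species: str, calls: Dict[str, Tuple[bool, str]]) -> str:
    """Classify one species as 'present', 'absent', or 'unknown'."""
    present, source = calls[species]
    if present:
        return "present"
    # Assumption: a missing hit in a transcriptome is not evidence of loss.
    return "unknown" if source == "transcriptome" else "absent"


def proposed_losses(tree: Tree, calls: Dict[str, Tuple[bool, str]]) -> List[List[str]]:
    """Return the leaf sets of maximal subtrees in which a loss is proposed."""
    names = leaves(tree)
    states = [status(name, calls) for name in names]
    if "present" not in states and states.count("absent") >= 2:
        return [names]  # maximal subtree found, do not descend further
    if isinstance(tree, str):
        return []
    found: List[List[str]] = []
    for child in tree:
        found.extend(proposed_losses(child, calls))
    return found
```

Reporting only maximal subtrees places each proposed loss once, at the deepest common ancestor of the affected species, rather than repeating it for every nested clade.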
Fig. 4 Presence and absence of DNMT family members in Nematoda. See Fig. 3 for detailed legend. Instead of order names, clade names are given (in bold)
Fig. 5 Presence and absence of DNMT family members in Priapulida, Onychophora, Tardigrada, and early-branching Metazoa. See Fig. 3 for detailed legend
Summary of the Gaussian Mixture Modeling for real and shuffled data
| Range | Real data (min.) | Real data (mean) | Real data (max.) | Shuffled data (min.) | Shuffled data (mean) | Shuffled data (max.) |
|---|---|---|---|---|---|---|
| Arthropoda | | | | | | |
| meanLow | 0.30 | 0.72 | 1.17 | 0.95 | 0.99 | 1.00 |
| meanHigh | 0.58 | 1.00 | 1.46 | 1.00 | 1.02 | 1.05 |
| distance | 0.01 | 0.28 | 0.63 | 0.00 | 0.03 | 0.11 |
| %low | 0.14 | 0.46 | 0.87 | 0.37 | 0.72 | 0.81 |
| Nematoda | | | | | | |
| meanLow | 0.34 | 0.94 | 1.16 | 0.93 | 0.98 | 1.00 |
| meanHigh | 0.59 | 1.10 | 1.48 | 1.00 | 1.02 | 1.07 |
| distance | 0.00 | 0.15 | 0.58 | 0.00 | 0.04 | 0.14 |
| %low | 0.13 | 0.59 | 0.96 | 0.49 | 0.74 | 0.82 |
“meanLow” and “meanHigh” are the means of the components with the lower and higher O/E CpG ratios (first and second row). The distance d between the means is given in the third row. “%low” gives the fraction of data points (transcripts) in the component with the lower O/E CpG ratio; “%low” + “%high” equals 1. Due to its extreme values, the nematode Loa loa was excluded from this table. Its values are: “meanLow” 1/1, “meanHigh” 4.53/1.18, d 3.55/0.18, and “%low” 0.99/0.98 for the real/shuffled data
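How the four quantities in this table relate to a fitted mixture can be shown with a minimal two-component, one-dimensional Gaussian mixture fitted by expectation-maximization. This is a plain-Python sketch, not MethMod itself: the initialization, the iteration count, and the use of the mixture weight as “%low” are assumptions, and the function name and result keys are hypothetical.

```python
import math


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Density of a normal distribution at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def fit_two_component_gmm(values, n_iter: int = 200, min_sigma: float = 1e-3) -> dict:
    """Fit a 2-component 1D GMM by EM and summarize it as in the table."""
    xs = sorted(values)
    if len(xs) < 2:
        raise ValueError("need at least two data points")
    half = len(xs) // 2
    # Assumed initialization: means of the lower and upper halves of the data.
    mu = [sum(xs[:half]) / half, sum(xs[half:]) / (len(xs) - half)]
    sigma = [max((xs[-1] - xs[0]) / 4.0, min_sigma)] * 2
    weight = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in xs:
            p = [weight[k] * normal_pdf(x, mu[k], sigma[k]) for k in range(2)]
            total = sum(p) or 1e-300  # guard against numerical underflow
            resp.append([pk / total for pk in p])
        # M-step: re-estimate mean, standard deviation, and weight.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # component has collapsed, keep its old parameters
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk
            sigma[k] = max(math.sqrt(var), min_sigma)
            weight[k] = nk / len(xs)
    lo, hi = sorted(range(2), key=lambda k: mu[k])
    return {
        "meanLow": mu[lo],
        "meanHigh": mu[hi],
        "distance": mu[hi] - mu[lo],  # d in the table
        "pct_low": weight[lo],        # "%low"; the other weight is 1 - pct_low
    }
```

Applied to the per-transcript O/E CpG ratios of one species, a large `distance` indicates two distinct gene populations, while a value near zero, as in the shuffled data, indicates a unimodal distribution.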
Relationship between the combination of DNMT candidates and the predicted methylation level. Shown is the number of species for which DNA methylation is predicted to be present or absent, classified by the combination of DNMT enzymes present
| Enzymes present | Total | Methylation present | Methylation absent |
|---|---|---|---|
| DNMT1 & DNMT3 | 45 | **36** | 9 |
| DNMT1 only | 28 | 16 | 12 |
| DNMT3 only | 7 | 0 | 7 |
| no DNMT1 & no DNMT3 | 46 | 3 | **43** |
| Total | 126 | 55 | 71 |
The numbers in bold correspond to the number of species for which the presence (DNMT1 & DNMT3) or absence (no DNMT1 & no DNMT3) of DNA methylation is very likely
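The two bold cells follow from the row totals and can be cross-checked against the column totals. The short sketch below performs that arithmetic on the values given in the table; the variable and function names are hypothetical.

```python
# Cell values as given in the table; the two bold cells are derived below.
rows = {
    "DNMT1 & DNMT3": {"total": 45, "absent": 9},
    "DNMT1 only": {"total": 28, "present": 16, "absent": 12},
    "DNMT3 only": {"total": 7, "present": 0, "absent": 7},
    "no DNMT1 & no DNMT3": {"total": 46, "present": 3},
}
column_totals = {"total": 126, "present": 55, "absent": 71}


def complete(row: dict) -> dict:
    """Fill in a missing 'present' or 'absent' count from the row total."""
    filled = dict(row)
    if "present" not in filled:
        filled["present"] = filled["total"] - filled["absent"]
    if "absent" not in filled:
        filled["absent"] = filled["total"] - filled["present"]
    return filled


completed = {name: complete(row) for name, row in rows.items()}

# Column sums over the completed rows must reproduce the bottom row.
sums = {key: sum(row[key] for row in completed.values()) for key in column_totals}
consistent = sums == column_totals
```

Both derived cells agree with the stated column totals of 55 species with predicted methylation and 71 without.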
Fig. 6 Each point shows one species analyzed by Gaussian Mixture Modeling (GMM). The axes are the means of the two components. The taxonomic group is indicated by the style of the point. The color indicates whether both DNMT1 and DNMT3 (green), only DNMT1 (red), only DNMT3 (black), or neither (blue) were found in the species. The diagonal lines mark fixed distances between the means of the two GMM components: a dotted line, a dashed line, and a solid line, the last being the selected threshold. 'EBM' stands for 'early-branching Metazoa', i.e., Porifera, Placozoa, and Cnidaria (Color figure online)