Salvo Danilo Lombardo1,2, Ivan Fernando Wangsaputra3, Jörg Menche1,2,4, Adam Stevens3.
Abstract
The early developmental phase is of critical importance for human health and disease later in life. To decipher the molecular mechanisms at play, current biomedical research is increasingly relying on large quantities of diverse omics data. The integration and interpretation of the different datasets pose a critical challenge towards the holistic understanding of the complex biological processes that are involved in early development. In this review, we outline the major transcriptomic and epigenetic processes and the respective datasets that are most relevant for studying the periconceptional period. We cover both basic data processing and analysis steps, as well as more advanced data integration methods. A particular focus is given to network-based methods. Finally, we review the medical applications of such integrative analyses.
Keywords: DOHaD; bioinformatics; development; epigenetics; integrative; networks; transcriptomics
Year: 2022 PMID: 35627149 PMCID: PMC9141211 DOI: 10.3390/genes13050764
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.141
Figure 1. Dissecting biological complexity in the different layers of biological organisation helps in prevention, diagnosis and treatment. (A) Genetic perturbation during the periconceptional period (first two weeks after conception) can propagate through the different layers of biological networks (transcriptome, epigenome, proteome, cellular level and organ level), leading to a predisposition to disease phenotypes later in life. Dissecting and integrating these biological layers is crucial for prevention, early diagnosis and potential treatments. (B) Early life conditioning can influence growth trajectories in life, contributing to a predisposition to different phenotypes (healthy, short, and obese). (C) Epigenetic modifications, such as DNA methylation in particular regions of the DNA containing imprinted genes, could alter the normal genetic balance of the maternal and paternal alleles. As an example, we show the consequences of alterations of the IGF2-H19 imprinted gene balance, which can lead to either gigantism (Beckwith–Wiedemann syndrome) or nanism (Russell–Silver syndrome).
Figure 2. Transcriptomic analysis pipeline. Starting from the raw reads, several steps are needed to obtain concrete biological results, such as the identification of differentially expressed genes or cluster marker genes. The first step is to align the reads to an annotated gene reference so that the number of reads for each gene can be counted. This yields a count matrix, which can be used for differential expression analysis: identifying genes that are significantly changed (up- or down-regulated) in certain conditions and visualising them, for example, in a volcano plot. In parallel, the dimensions of the count matrix can be reduced and visualised with several techniques (e.g., PCA, t-SNE, and UMAP). This makes it possible to identify clusters and, in the context of single-cell experiments, to infer developmental trajectories.
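The count-matrix step in this caption can be illustrated with a minimal sketch. It assumes a hypothetical toy count matrix (the gene names, sample grouping, and pseudocount are invented for illustration) and uses plain counts-per-million scaling followed by a log2 fold change. This is a deliberately simplified stand-in, not the statistical models that dedicated differential expression tools apply.

```python
import math
from statistics import mean

# Hypothetical toy count matrix: gene -> raw read counts per sample.
# Samples 0-2 are labelled "control" and samples 3-5 "treated" (illustrative only).
counts = {
    "geneA": [10, 12, 11, 40, 44, 48],
    "geneB": [100, 95, 105, 98, 102, 100],
    "geneC": [50, 55, 45, 12, 13, 11],
}
control_idx = [0, 1, 2]
treated_idx = [3, 4, 5]


def counts_per_million(counts):
    """Scale each sample (column) so that its counts sum to one million."""
    n_samples = len(next(iter(counts.values())))
    library_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    return {
        gene: [row[j] / library_sizes[j] * 1e6 for j in range(n_samples)]
        for gene, row in counts.items()
    }


def log2_fold_change(cpm, group_a, group_b, pseudocount=1.0):
    """Per-gene log2 ratio of mean CPM in group_b over mean CPM in group_a."""
    result = {}
    for gene, row in cpm.items():
        mean_a = mean(row[j] for j in group_a)
        mean_b = mean(row[j] for j in group_b)
        # The pseudocount avoids taking the log of zero for unexpressed genes.
        result[gene] = math.log2((mean_b + pseudocount) / (mean_a + pseudocount))
    return result


cpm = counts_per_million(counts)
lfc = log2_fold_change(cpm, control_idx, treated_idx)
for gene, value in sorted(lfc.items(), key=lambda kv: kv[1]):
    print(f"{gene}: log2FC = {value:+.2f}")
```

A volcano plot would pair each log2 fold change with a significance value from a statistical test; that test is omitted here to keep the sketch short.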
Figure 3. Epigenetic modifications occur on different biological scales. DNA-based mechanisms comprise histone modifications, which are chemical modifications (e.g., acetylation), as well as DNA methylation, chromatin remodelling, and transposons. RNA-based mechanisms are multiple, complex, and still only partially known: miRNA and the RISC complex can induce mRNA degradation; lncRNAs silence the activity of miRNA, while snRNA and tsRNA can both silence and induce mRNA translation to protein. piRNAs can interact with DNA, interfering with the genetic movements of transposons; snoRNAs induce chemical modifications at the mRNA level.
Figure 4. Biological networks and their topological characteristics. (A) Classification of biological networks into two major categories: physical and functional interactions. The first category includes the protein–protein interaction network (interactome), which represents the map of the physical interactions of all proteins, and the neural network (connectome), which shows the synapses that connect neurons. Networks constituted by functional interactions have edges that represent functional relationships, such as the level of expression (co-expression network) or gene regulation (gene regulatory network). (B) The most important features of a network are the hub (a node connected to many others), the motif (a structure that recurs in different parts of the network), and the community (a group of densely interconnected nodes).
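Two of the topological features named in this caption, hubs and motifs, can be made concrete with a small sketch. It assumes a hypothetical undirected toy network (the node names and edges are invented) and uses the triangle as the simplest example of a motif. Community detection normally relies on dedicated algorithms and is not shown.

```python
from itertools import combinations

# Hypothetical undirected toy network given as an edge list (names are illustrative).
edges = [
    ("P1", "P2"), ("P1", "P3"), ("P2", "P3"),  # three nodes forming a triangle
    ("P1", "P4"), ("P1", "P5"),                # extra edges that make P1 a hub
    ("P5", "P6"),
]


def build_adjacency(edges):
    """Map each node to the set of its neighbours."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def hubs(adjacency, top_n=1):
    """Return the top_n nodes ranked by degree (ties broken by name)."""
    ranked = sorted(adjacency, key=lambda n: (-len(adjacency[n]), n))
    return ranked[:top_n]


def triangles(adjacency):
    """Return all three-node cliques, each reported once as a sorted tuple."""
    found = set()
    for node, neighbours in adjacency.items():
        for a, b in combinations(sorted(neighbours), 2):
            if b in adjacency[a]:
                found.add(tuple(sorted((node, a, b))))
    return sorted(found)


adjacency = build_adjacency(edges)
print("Hubs:", hubs(adjacency))
print("Triangles:", triangles(adjacency))
```

Degree is only one of several possible hub criteria; it is used here because it matches the caption's definition of a hub as a node connected to many others.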
Published transcriptomic datasets within the context of early human development.
| Techniques | Sample Type | Number of Genes/Cells | Goals | Study |
|---|---|---|---|---|
| HG-U133 Plus 2.0 array (Affymetrix) | Oocytes | 1361 transcripts expressed in oocytes | Study of oocyte transcriptomes | [ |
| HG-U133 Plus 2.0 array (Affymetrix) | Oocytes | 1514 overexpressed in oocytes compared with cumulus cells | Understanding of the mechanisms regulating oocyte maturation | [ |
| HG-U133 Plus 2.0 array (Affymetrix) | Oocytes | 5331 transcripts enriched in metaphase II oocytes relative to somatic cells | Comprehension of genes expressed in in vivo matured oocytes | [ |
| HG-U133 Plus 2.0 array (Affymetrix) | Oocytes | 10,183 genes were expressed in germinal vesicle | Study of global gene expression in human oocytes at the later stages of folliculogenesis (germinal vesicle stage) | [ |
| HG-U133 Plus 2.0 array (Affymetrix) | Oocytes | Of the 8123 transcripts expressed in the oocytes, 374 genes showed significant differences in mRNA abundance in PCOS oocytes | Understanding of PCOS | [ |
| HG-U133 Plus 2.0 array (Affymetrix) | Oocytes | | Identification of new potential regulators and marker genes that are involved in oocyte maturation | [ |
| HG-U133 Plus 2.0 array (Affymetrix) | Oocytes | 283 genes found in the case report sample | Identification of molecular abnormalities in metaphase II (MII) oocytes | [ |
| Whole Genome Bioarrays printed with 54,840 discovery probes representing 18,055 human genes and an additional 29,378 human expressed sequence tags (EST) | Oocytes | 2000 genes were identified as expressed at more than 2-fold higher levels in oocytes matured in vitro than those matured in vivo | Analysis of the gene expression profile of oocytes following in vivo or in vitro maturation | [ |
| Applied Biosystems Human Genome Survey Microarray (32,878 60-mer oligonucleotide) | Oocytes | Germinal vesicle, in vivo-MII and IVM-MII oocytes expressed 12,219, 9735 and 8510 genes, respectively | Characterisation of the patterns of gene expression in germinal vesicle stage and meiosis II oocytes matured in vitro or in vivo | [ |
| HG-U133 Plus 2.0 array (Affymetrix) | Oocytes | 342 genes showed a significantly different expression level between the two age groups (women aged 36 years (younger) and women aged 37–39 years (older)) | Investigation of the effect of age on gene expression profile in mature oocytes | [ |
| Two cDNA microarrays, each containing about 20,000 targets (representing in total ~29,778 independent genes according to Unigene Build 155) | Oocytes and embryos | 1896 significant changes in expression following fertilization through day 3 of development | Global analysis of the preimplantation embryo transcriptome | [ |
| cDNA microarrays containing 9600 cDNA spots | Oocytes and embryos | 184, 29 and 65 genes were overexpressed in oocytes, 4- and 8-cell embryos, respectively | Identification of the differential expression profiles of genes in single oocytes, 4- and 8-cell preimplantation embryos | [ |
| Genome Survey Microarrays V2.0 (Applied Biosystems) | Oocytes and embryos | 107 DNA repair genes were detected in oocytes | Identification of the DNA repair pathways that may be active pre- and post-embryonic genome activation by investigating mRNA in human in vitro matured oocytes and blastocysts | [ |
| HG-U133 Plus 2.0 array (Affymetrix) | Oocytes and embryos | 5477 transcripts differentially expressed in the transition from mature oocyte (MII) to 2-day embryo and 2989 transcripts differentially expressed in the transition from 2-day to 3-day embryo | Study of global gene expression in human preimplantation development | [ |
| HG-U133 Plus 2.0 array (Affymetrix) | Oocytes and embryos | 45 eukaryotic initiation factors, 19 of which are differentially regulated between the 8-cell stage and blastocyst | Identification of gene networks behind cell fate decision in blastomeres | [ |
| Illumina HiSeq2000 unpaired (TruSeq) | Oocytes, embryos, and hESCs | 124 single cells: 90 from 20 oocytes and embryos, 8 from primary hESC outgrowth, and 26 from hESC passage 10, averaging 35.3 million reads per cell with an average read length of 100 bp. 22,687 maternally expressed genes detected, including 8701 lncRNAs, 2733 of them novel and developmental stage specific | Comparing the gene expression of human epiblast in vitro with hESCs | [ |
| Illumina HiSeq2000 paired-end (TruSeq) | Embryos | 86 single cells | Validating known marker genes and highlighting differences between human and mouse pre-implantation development | [ |
| Illumina HiSeq2000 single-end (Smart-seq2) | Embryos | 1529 single cells from 88 embryos of various developmental stages, averaging 8500 expressed genes | Showcasing the differentiation of cell lineage in pre-implantation embryos and X-chromosome dosage compensation in females | [ |
| Illumina HiSeq4000 paired-end (STRT-Seq and Trio-seq2) | Embryos | 7636 single cells from 65 pre/post implantation embryos | Observation of genome regulation surrounding implantation | [ |
Resources containing epigenetic data.
| Name | Type of Data | URLs | Description | Reference |
|---|---|---|---|---|
| National Institutes of Health Roadmap Epigenome Project | DNA methylation; Histone modifications; Chromatin accessibility; Small RNA-seq | | The consortium provides an analysis of stem cells and primary ex vivo tissues to collect normal epigenomes as a reference for comparison and integration in future studies. | [ |
| ENCODE (Encyclopedia of DNA Elements Project) | DNA binding; DNA accessibility; DNA methylation; Three-dimensional chromatin structure; Replication timing; Genotyping; snATAC-seq; DNA sequencing | | The consortium built a comprehensive parts list of functional elements in the human genome, including all the regulatory elements at different biological levels of complexity. | [ |
| Human Epigenome Consortium | Histone modifications; Chromatin accessibility; Methylome; Whole genome sequencing; TF-binding sites | | Large collection of studies containing human epigenome and transcriptome data grouped by tissue and cell type. | [ |
| Histone Infobase (HIstome) | Histone modifications | | Database covering 5 different types of histones, 8 types of their post-translational modifications and 13 classes of modifying enzymes. | [ |
| DeepBlue | DNA methylation; Histone modifications and variants; DNase I; Transcription factor binding sites; Chromatin accessibility | | This resource integrates different databases and sources into a large, comprehensive epigenomic tool that can be queried via a web interface or an API. | [ |
| MethBase | Methylome from different organisms | | For each methylome, they provide the methylation level at individual sites, regions of allele-specific methylation, hypo- or hyper-methylated regions, partially methylated regions, metadata and statistics. | [ |
| iMETHYL | Methylome; Whole genome sequencing | | A multi-omics data resource centred on DNA methylation, also including information about cell types. | [ |
| NONCODE | lncRNA | | This database comprises lncRNAs from different organisms in health and disease. | [ |
| miRBase | miRNA | | This is a searchable database of published miRNA sequences and annotations. | [ |
| PolymiRTS Database 3.0 | miRNA | | Database containing miRNA biological annotations, their relationships with disease states and gene expression, and their polymorphisms, variants and mutations. | [ |
| snOPY | snoRNA | | They provide a list of snoRNAs, snoRNA loci, target RNAs and orthologs for 39 different organisms. | [ |
| snoDB | snoRNA | | It harmonises human snoRNA information from different sources, such as sequence databases and target information. | [ |
| RMBase v2.0 | RNA modification peaks and sites | | This database is an important source for RNA modifications, including those of miRNAs, snRNAs and snoRNAs. | [ |
| mQTLdb | Methylome; Genotype profiling | | They provide methylation and genotype data on mother–child pairs, giving access to meQTL mapping across five different stages of life. | [ |
| Methylomic trajectories across fetal brain development | Methylome | | DNA methylation across fetal brain development. | [ |
| Methylation quantitative trait loci (mQTL) in the developing human brain and their enrichment in genomic regions associated with schizophrenia | Methylation quantitative trait loci | | DNA methylation quantitative trait loci of human fetal brain. | [ |