
Integrating Genes Affecting Coronary Artery Disease in Functional Networks by Multi-OMICs Approach.

Baiba Vilne^1,2, Heribert Schunkert^1,2.

Abstract

Coronary artery disease (CAD) and myocardial infarction (MI) remain among the leading causes of mortality worldwide, urgently demanding a better understanding of disease etiology and more efficient therapeutic strategies. Genetic predisposition as well as the environment and lifestyle are thought to contribute to disease risk. It is likely that non-linear and complex interactions occur between these multiple factors, involving simultaneous pathological changes in diverse cell types, tissues, and organs, at multiple molecular levels. Recent technological advances have exponentially expanded the breadth of available -omics data, from the genome, epigenome, transcriptome, proteome, and metabolome to even the microbiome. Integration of multiple layers of information across several -omics domains, i.e., the so-called multi-omics approach, currently holds promise as a path toward precision medicine. Indeed, a more meaningful interpretation of genotype-phenotype relationships and the development of successful therapeutics tailored to individual patients are urgently needed. In this review, we will summarize recent findings and applications of integrative multi-omics in elucidating the etiology of CAD/MI, with a special focus on established disease susceptibility loci sequentially identified in genome-wide association studies (GWAS) over the last 10 years. Moreover, in addition to the autosomal genome, we will also consider the genetic variation in our "second genome," the mitochondrial genome. Finally, we will summarize the current challenges in the field and point to future research directions required in order to successfully and effectively apply these approaches for precision medicine.


Keywords: cardiovascular disease; genomics; gut microbiome; metabolomics; multiomics; transcriptomics

Year: 2018 PMID: 30065929 PMCID: PMC6056735 DOI: 10.3389/fcvm.2018.00089

Source DB: PubMed Journal: Front Cardiovasc Med ISSN: 2297-055X

Introduction

In the current era of high-potency statin therapy, it is becoming increasingly clear that even individuals with normal LDL-cholesterol levels and no conventional risk factors may develop atherosclerosis (1). The most pertinent manifestation of atherosclerosis is coronary artery disease (CAD), a highly complex disease influenced by both multiple genetic risk variants and lifetime exposure to an atherogenic environment (2). A better understanding of the etiology of CAD, and new directions toward disease mechanisms that have not yet been addressed therapeutically, are urgently needed (3). During the last 10 years, the genetic risk has been thoroughly explored in numerous genome-wide association studies (GWAS), leading to the identification of >300 chromosomal loci that significantly affect the risk of CAD (4–15). More than 90% of these common disease risk variants are located outside the protein-coding regions and have modest effect sizes (2, 16). Collectively, they explain only ~25% of the overall disease heritability. This suggests that genetic variation may contribute to disease risk in a non-linear, interactive and complex way (17), leading to pathological changes in diverse cell types, tissues, and organs, at multiple molecular levels (18). Recent technological advances have exponentially expanded the breadth of available -omics data (17). High-throughput monitoring of the abundance of various biological molecules and determination of their variation between different conditions on a global scale has become possible, promoting a paradigm shift in the way we approach biomedical problems (19). At the same time, it has been increasingly recognized that no single type of data can fully capture the intricacy of most complex molecular traits that manifest collectively as disease phenotypes (20–22). 
Rather, it is the integration of multiple layers of information across several -omics domains, i.e., the so-called multi-omics approach [also referred to as integromics or panomics (19)], that holds promise for precision medicine (Figure 1).

Figure 1

Multi-omics approach for precision medicine. Multi-omics (i.e., genome, epigenome, transcriptome, proteome, metabolome, microbiome, and envirome) data are collected from patients and integrated to create their individual molecular signatures (i.e., complex biomarkers), which are then used to select an appropriate drug for a particular patient, thus improving the treatment efficiency and reducing the possible side effects.

Of note, integrative analysis across multiple omics layers can be conducted in two ways (Figure 2): pair-wise data integration and multi-dimensional, i.e., network-based, integration (22). Furthermore, pair-wise integrations can be divided into genetic and non-genetic correlations (22). In the first case, DNA variants (i.e., allelic distributions of single-nucleotide polymorphisms; SNPs) are tested for association with down-stream omics markers such as transcriptomic alterations, protein, metabolite or methylation levels, or quantitative and qualitative measures of the microbiome, via so-called quantitative trait loci (QTL) mapping. In the second scenario, one would explore correlations between down-stream omics data, e.g., correlation of CpG methylation levels with transcript expression or between the metabolome and the gut microbiome; however, it may be difficult to infer causal relationships in such cases (22). Considering the largely unexplored role of the established CAD risk loci from GWAS (23) and the central dogma that genetic variations control the transcriptome, which in turn affects e.g., the proteome (20) and metabolome (Figure 2, middle panel), our main focus will be pair-wise integrations linking genetic variation related to CAD risk to other down-stream omics layers such as the epigenome, transcriptome, proteome or metabolome. Although multi-dimensional integrations have been widely used in the field of cancer research, their application in the context of CAD has so far been limited (22). 
Moreover, in addition to the autosomal genome, we will also consider the genetic variation in our “second genome”—the mitochondrial genome and its contribution to CAD.
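The genetic pair-wise integration described above (QTL mapping) amounts, in its simplest form, to regressing a molecular trait on the number of copies of an allele that each individual carries. The following is a minimal sketch under an assumed additive model; the function name `qtl_additive_fit` and the toy data are illustrative assumptions, and real analyses additionally adjust for covariates and correct for multiple testing.

```python
from math import sqrt


def qtl_additive_fit(dosages, trait):
    """Least-squares fit of trait = intercept + beta * dosage.

    dosages: allele counts per individual (0, 1 or 2), an assumed additive coding.
    trait:   matching molecular measurements (e.g., transcript abundance).
    Returns the effect size (beta), the intercept, and the Pearson correlation r.
    """
    n = len(dosages)
    if n != len(trait) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_g = sum(dosages) / n
    mean_y = sum(trait) / n
    sxx = sum((g - mean_g) ** 2 for g in dosages)
    syy = sum((y - mean_y) ** 2 for y in trait)
    sxy = sum((g - mean_g) * (y - mean_y) for g, y in zip(dosages, trait))
    if sxx == 0 or syy == 0:
        raise ValueError("genotype and trait must both vary across individuals")
    beta = sxy / sxx
    return {
        "beta": beta,
        "intercept": mean_y - beta * mean_g,
        "r": sxy / sqrt(sxx * syy),
    }


# Toy data, invented purely for illustration (not taken from any cited study).
dosages = [0, 0, 1, 1, 2, 2]
expression = [1.0, 1.2, 1.9, 2.1, 3.0, 3.2]
fit = qtl_additive_fit(dosages, expression)
print(fit)
```

The same fit applies to any down-stream layer (methylation, protein, metabolite levels) simply by changing what is passed as `trait`.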

Figure 2

Multi-omics (i.e., autosomal and mitochondrial genome, epigenome, transcriptome, proteome, metabolome, microbiome, and envirome) data integration can be conducted in two ways. The first is pair-wise integration, which can be further divided into non-genetic (left panel) and genetic correlations (middle panel). In the non-genetic case, one would examine the correlation patterns between the down-stream omics layers (e.g., metabolome and gut microbiome), whereas the genetic case is achieved via so-called quantitative trait loci (QTL) mapping, linking genetic variation to methylation levels (meQTLs) or histone modifications (hQTLs), transcriptome (expression QTLs; eQTLs), protein (pQTLs), metabolite (mQTLs) or measures of the microbiome (mbQTLs). The second is multi-dimensional, i.e., network-based, integration (right panel); however, its application in the context of CAD has so far been limited (22).

Integrating genetic variation and epigenome

Epigenomic signatures reflect various DNA modifications and may affect gene regulatory mechanisms that do not involve changes in the DNA sequence per se. Thereby, epigenomics may become a critical mediator of environmental influences and risk factors acting on the genome (20, 24). Three unique, but highly interrelated, epigenetic processes can be distinguished: DNA methylation, histone modifications (e.g., methylation, acetylation, phosphorylation, ADP-ribosylation, and ubiquitination) and RNA-based mechanisms (e.g., microRNAs, long non-coding RNAs or lncRNAs, small interfering RNAs) (20, 24). Although technically non-coding RNAs belong to the epigenome (20), we will discuss them in the next section, as the respective omics data are acquired via transcriptome profiling (RNA-seq). DNA methylation and histone modifications are the best understood of the epigenetic mechanisms thus far and have been widely suggested to regulate gene expression and affect CAD risk factors including atherosclerosis, inflammation, hypertension and diabetes (25). DNA methylation consists of the covalent methylation of the C5 position of cytosine residues when they are followed by guanine residues (CpG dinucleotides). It is partly heritable, but it is also a dynamic process related to environmental stimuli and lifestyle factors (26). Hedman et al. (27) analyzed epigenetic changes associated with lipid concentrations and identified a number of meQTLs, enriched in signals from GWAS on lipid levels and CAD. For example, genome-wide significant variants (rs563290 and its proxies), associated with LDL cholesterol and CAD at APOB, were meQTLs for an LDL cholesterol-related differentially methylated locus (Table 1 and Figure 3).

Table 1

Genetic variation related to CAD/MI risk that has been associated with other down-stream omics layers such as transcriptome (mRNA, microRNAs and lncRNAs), epigenome, proteome or metabolome.

Data type	Tissue	Phenotypic Trait	SNP	Omics-Marker	References
Transcriptome: mRNAs	Visceral abdominal fat	HDL cholesterol level	rs4148008	ABCAB/ABCA5	Franzén et al. (28)
			rs11869286	STARD3
		Total cholesterol level	rs751557	EVI5
			rs174546	TMEM258
		LDL cholesterol level	rs12046679	PCSK9
		CAD	rs892006	G3BP1	Foroughi Asl et al. (29)
			rs6908994	PSORS1C3
			rs9930148	FLYWCH1
	Internal mammary artery, atherosclerotic aortic root		rs7500448	CDH13	Nelson et al. (13)
	Blood	Blood pressure	rs3184504	SH2B3, ALDH2, NAA25 (cis) and IL8, TAGAP (trans)	Huan et al. (30)
Transcriptome: microRNAs	Circulating leukocytes, human coronary artery smooth muscle cells (HCASMC)	CAD	rs12190287	miR-224: TCF21	Miller et al. (31) Bastami et al. (32)
		The effect of diet on plasma lipid levels	rs13702	miR-410:LPL	Richardson et al. (33)
		CAD	rs989727(rs7808424)	miR-202-5p:ASZ1	Bastami et al. (34)
			rs41269915(rs2229238)	miR-485-5p:UBE2Q1
			rs15563	hsa-miR-130a-5p:UBE2Z	Brænne et al. (16)
			rs3088442	hsa-miR-130a-5p:SLC22A3
			rs2266788	hsa-miR-4722-5p:APOA5
			rs72932707	hsa-miR-4722-5p:ICA1L
		HDL, LDL, and total cholesterol, triglycerides	rs2370747 (rs7115089)	miR-100-5p, miR-125b-5p	Huan et al. (35)
		CAD	rs11042699 (rs6578985)	miR-483-3p-IGF2
		Platelet count	rs4905998 (rs7149242)	miR-127-3p, miR-134, miR-370, miR-376a-3p, miR-382-5p, miR-431-5p, miR-433, miR-329, miR-409-3p, miR-494, miR-411-3p, miR-654-5p, miR-668, miR-543, miR-323a-3p, miR-337-3p
		3p/5p ratio	rs13064131	miR-28:LPP	Civelek et al. (36)
Transcriptome: lncRNAs	Internal mammary artery, atherosclerotic aortic root	CAD	rs1333045	FPKM1_group_33469_transcript_1	Ballantyne et al. (37)
		MI	rs1333049	FPKM1_group_33469_transcript_2
		T2D	rs2383208	FPKM1_group_33469_transcript_6
		Early MI	rs10757274	ANRIL	McPherson et al. (38)
Epigenome: DNA methylation		Hypertension	rs113460564, rs12443878, rs12444338, rs62040565, rs8060301	CDH13	Putku et al. (39)
		Diastolic blood pressure, serum high-density lipoprotein, high molecular weight adiponectin	rs8060301	cg09415485 (CDH13)
		High molecular weight adiponectin	rs2239857, rs77068073	CDH13
		Smoking (no association)	rs75509302	cg23576855 (AHRR)
		LDL cholesterol level	rs563290 (rs515135, rs562338)	cg05337441 (APOB)	Hedman et al. (27)
Proteome	Blood plasma	CAD	rs12740374	Granulin(CELSR2/SORT1)	Chen et al. (40)
			rs867186	Protein C (PROCR)	Howson et al. (14)
			rs1050362	apolipoprotein L1 (DHX38)
Metabolome	Blood plasma	CAD	rs715	CPS1,urea cycle metabolites,plasma glycine	Hartiala et al. (41)
	Blood plasma		rs10450989 (USP3), rs2228513 (HERC1), rs930491, rs11827377 (STIM1), rs3853422 (SEL1L), rs1869075 (FBXO25), rs9591507, rs17573278, rs894840, rs9285184 (SUGT1)	Circulating short-chain dicarboxylacylcarnitine (SCDA)	Kraus et al. (42)
Multi-OMICS	Low HDL and inflammatory pathways		rs241437	TAP2	Laurila et al. (43)
			rs9272143	HLA-DRB1,HLA-DQA1
Mitochondrial Genome	Blood	Hypertension	m.8701A>G	MT-ATP6	Zhu et al. (44)
		CAD	Haplogroup T		Kofler et al. (45)
			m.16189T>C		Mueller et al. (46)
			m.15927G>A		Jia et al. (47)

Figure 3

Hedman et al. (27) identified a SNP (rs515135) in an intron of APOB to be associated with LDL-C. Its proxy was also associated with CAD. Interestingly, this SNP represents a cis-meQTL. Black arrows indicate association findings. Red arrows indicate the presumed functional cascade leading to CAD.

Furthermore, the CDH13 (T-cadherin) locus may present an interesting example in the context of epigenetics and CAD. Putku et al. (39) reported several genetic variants in the promoter of CDH13 as meQTLs in hypertension patients (Table 1), several of which were also associated with high molecular weight adiponectin, a known ligand for CDH13, the binding of which results in increased proliferation and migration of endothelial cells (39). Moreover, Nelson et al. (13) recently identified a genetic variant in the intron of CDH13, which affects expression of this gene in vascular tissues and shows genome-wide significant association with CAD (28) (Table 1). Interestingly, the expression levels of CDH13 and lncRNAs from the same locus showed positive correlations, suggesting a functional link, as lncRNAs are known to display correlations with the expression of their neighboring protein-coding target genes (48). An exciting field of future research will be studies conducting parallel profiling of genetic variation with histone modifications and Hi-C and ChIA-PET-based chromatin contact maps to uncover local and distal histone quantitative trait loci (hQTLs) (49) in CAD patients. 
Overall, considering the role of epigenetic modifications as a critical mediator of environmental influences on the genome (20, 24), we urgently need more investigations studying DNA methylation and other epigenetic modifications genome-wide and in large enough cohorts, ideally also elucidating the differences between tissues and cells in healthy individuals vs. CAD patients. Moreover, this should be supplemented with careful documentation of multiple environmental and lifestyle factors over time, i.e., the envirome, as well as comprehensive clinical information to draw a link between the environment and CAD.

Integrating genetic variation and transcriptome

Transcriptomics reflects genome-wide measures of RNA levels, both protein-coding RNAs and non-coding RNAs (i.e., microRNAs, lncRNAs, and small interfering RNAs), under specific conditions or in a specific cell. Moreover, the transcript levels are examined both qualitatively (i.e., which transcripts are present, identification of novel transcripts, splice sites, and RNA editing sites) and quantitatively (quantification of transcript abundance) (21).

Protein-coding RNAs

Parallel assessments of genetic variation and transcriptome profiles across disease-relevant tissues, i.e., mapping expression quantitative trait loci (eQTLs) to identify susceptibility genes (mainly protein-coding), have been the most commonly applied approach (28, 29, 50–52). Björkegren et al. have performed a number of integrative network analyses, linking CAD risk variants and transcriptome data in seven disease-relevant vascular and metabolic tissues, collected from up to 600 CAD patients during coronary artery bypass surgery (28, 29, 53, 54). From these investigations, visceral abdominal fat has emerged as an important gene-regulatory site for blood lipids. Several risk SNPs for HDL-, LDL-, and total cholesterol levels, as well as for CAD, demonstrated significant eQTL effects in visceral abdominal fat (28, 29). Huan et al. (30) also used integrative analysis to investigate the molecular mechanisms of blood pressure regulation and identified a blood pressure associated SNP (rs3184504) in SH2B3, also associated with the expression (eQTL) of several genes, including SH2B3, in the genetically inferred causal blood pressure gene sets (Table 1 and Figure 4). Some of these genes were also perturbed in Sh2b3−/− mice, which demonstrate a blood pressure-related phenotype (30). The SNP rs3184504 has also previously been associated with CAD risk (9).

Figure 4

Huan et al. (30) uncovered a blood pressure associated SNP (rs3184504) in SH2B3, which also associates with the expression (eQTL) of several genes, including SH2B3 itself, in the genetically inferred causal blood pressure gene sets. The SNP rs3184504 has also previously been associated with CAD risk (9). Black arrows indicate association findings. Red arrows indicate the presumed functional cascade leading to CAD.

Much less investigated are non-coding RNA transcripts, such as microRNAs (miRNAs) and long non-coding RNAs (lncRNAs). Recent evidence suggests that at least some of these may play a role in CAD (55–58). Although technically non-coding RNAs belong to the epigenome (20), we will discuss them in this section, as the respective omics data are acquired via transcriptome profiling (RNA-seq).

Micro RNAs

MiRNAs are involved in the transcriptional control of all main cell types participating in atherosclerosis progression, including endothelial cells, vascular smooth muscle cells, and macrophages (32, 59). Several studies have investigated the differential expression patterns of miRNAs in plasma/serum, microparticles, whole blood, platelets, blood mononuclear intimal, and endothelial progenitor cells in CAD vs. non-CAD patients, as summarized by Malik et al. (60). In the majority of cases, up-regulation of different miRNAs in CAD patients was observed (60). Moreover, a growing body of evidence suggests that genetic variations in the miRNA targetome may lead to major deleterious outcomes (61, 62). For example, Miller et al. (31) have shown that an established CAD risk variant (rs12190287) resides in the 3′ untranslated region (3′ UTR) of the transcription factor TCF21 and alters the seed binding sequence for miR-224. Moreover, allelic imbalance studies in circulating leukocytes and human coronary artery smooth muscle cells have demonstrated a significant imbalance of the TCF21 transcript levels, which correlated with genotype at rs12190287, consistent with this variant contributing to allele-specific expression differences (31). Richardson et al. (33) have reported that a variant (rs13702) in the 3′ UTR of lipoprotein lipase (LPL) disrupts the binding of miR-410 and modulates the effect of diet on plasma lipid levels. Recently, Bastami et al. (34) performed a more systematic computational screening by mapping the established CAD risk variants to the miRNA targetome, identifying several links between SNPs and miRNAs (Table 1; https://www.ebi.ac.uk/gwas/). In a recent study from our group (16), we also mapped CAD risk variants from the CARDIoGRAMplusC4D GWAS meta-analyses (9) to 3′ UTR regions of genes to assess their overlaps with predicted target miRNA binding sites. 
Interestingly, the 3′ UTR region of MRAS was predicted to be targeted by 29 miRNAs, and 23 miRNAs were predicted to bind more than one candidate CAD gene (Table 1). Thus far, there have been relatively few studies investigating genome-wide miRNA eQTLs (miR-eQTLs). Huan et al. (35) identified a genetic variant (rs2370747) associated with miR-100-5p and miR-125b-5p expression, a proxy SNP of which was also associated with lipid traits (HDL-, LDL-, and total cholesterol as well as triglycerides). Moreover, it was found that both miRNAs were also differentially expressed in relation to HDL cholesterol (35). Civelek et al. (36) examined the genetic regulation of human adipose miRNA expression and its consequences for metabolic traits. Interestingly, this study showed how genetic variation might influence the processing of miRNAs, i.e., the ratio of miRNA expression from the 3p and 5p arms. It is known that a miRNA precursor can give rise to two mature miRNAs, one from the 3p and one from the 5p arm, one of which usually has higher expression than the other. The 3p/5p ratios of several miRNAs have been shown to be significantly different among various healthy tissues (63) and altered in pathological conditions compared with healthy controls (64). Civelek et al. demonstrated a significant association of the SNP rs13064131 with the 3p/5p ratio of miR-28, encoded from the LPP gene (Figure 5) (36). However, the SNP was not associated with the expression levels of the LPP transcript itself or with the abundance of miR-28-3p or miR-28-5p, suggesting that its effect on the 3p/5p ratio may be independent of transcription, possibly via degradation or stabilization mechanisms.
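To make the 3p/5p ratio concrete, the sketch below computes a log2 ratio of read counts from the two arms of a miRNA precursor and averages it within each genotype group. It is only an illustration: the function names, the pseudocount, and the read counts are assumptions chosen for readability, not the procedure or data of Civelek et al.

```python
from math import log2


def arm_log_ratio(count_3p, count_5p, pseudocount=1.0):
    """log2 ratio of 3p-arm to 5p-arm read counts.

    The pseudocount (an assumption of this sketch) avoids division by zero
    and log of zero when one arm has no reads.
    """
    if count_3p < 0 or count_5p < 0:
        raise ValueError("read counts must be non-negative")
    return log2((count_3p + pseudocount) / (count_5p + pseudocount))


def mean_ratio_by_genotype(samples):
    """Average the log2 3p/5p ratio within each allele-dosage group.

    samples: iterable of (dosage, count_3p, count_5p) tuples.
    Returns a dict mapping dosage -> mean log2 ratio, ordered by dosage.
    """
    grouped = {}
    for dosage, count_3p, count_5p in samples:
        grouped.setdefault(dosage, []).append(arm_log_ratio(count_3p, count_5p))
    return {d: sum(v) / len(v) for d, v in sorted(grouped.items())}


# Hypothetical read counts, invented for illustration only.
samples = [(0, 799, 99), (0, 399, 49), (1, 399, 99), (2, 199, 99)]
print(mean_ratio_by_genotype(samples))
```

A ratio that shifts with dosage while the summed abundance of both arms stays unchanged would point to an effect on processing or strand stability rather than on transcription.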

Figure 5

Civelek et al. (36) demonstrated a significant association of the SNP rs13064131 with the 3p/5p ratio of miR-28, encoded from the LPP gene. The miRNA processing and strand selection was adapted from (65).

Long non-coding RNAs

The recent discovery of an extensive catalog of lncRNAs, i.e., long RNA transcripts that do not code for proteins, has opened a new perspective on the importance of RNA-based mechanisms in gene regulation (24). LncRNAs are emerging as important regulators of various cellular processes, with many possible implications in cardiovascular disease pathophysiology (57, 58). In fact, the most prominent CAD risk locus at Chr9p21 (66, 67) harbors the lncRNA ANRIL (Antisense Non-coding RNA in the INK4 Locus, CDKN2B antisense RNA). Among the risk variants at this locus, rs10757274 is the strongest genetic predictor of early MI and is not associated with established CAD risk factors such as lipoproteins or hypertension, making ANRIL a key candidate (38). Interestingly, ANRIL is found both as a linear lncRNA (linANRIL), the transcript levels of which are known to positively correlate with disease severity (68), and as RNA circles (circANRIL) (69). Recently, Holdt et al. (69) demonstrated that circANRIL regulates the maturation of precursor ribosomal RNA (pre-rRNA), thereby impairing ribosome biogenesis and inducing nucleolar stress and apoptosis in vascular smooth muscle cells and macrophages (Figure 6). Carriers of the CAD-protective haplotype at 9p21 showed significantly increased expression of circANRIL (69).

Figure 6

Recently, Holdt et al. (69) demonstrated that circANRIL regulates the maturation of precursor ribosomal RNA (pre-rRNA), thereby impairing ribosome biogenesis and inducing nucleolar stress and apoptosis in vascular smooth muscle cells and macrophages. Moreover, carriers of the CAD-protective haplotype at 9p21 showed significantly increased expression of circANRIL.

Currently, however, there have not been many large-scale studies on lncRNAs in the context of CAD. Ballantyne et al. (37) recently conducted a genome-wide interrogation of long intergenic non-coding RNAs (lincRNAs) that associate with cardiometabolic traits in GWAS, including CAD, and also identified a number of CAD/MI and type 2 diabetes associated SNPs at Chr9p21 that overlapped lincRNA transcripts (Table 1) (37). In STARNET (28), 5.4% of the identified cis-expression quantitative trait loci (eQTLs) were related to the expression of lncRNAs; however, these have not been explored further so far. Overall, more studies focusing on non-coding RNAs in different CAD-relevant tissues in large enough cohorts will be required to yield insights into the possible functional roles of this portion of the transcriptome and its genetic determinants, in healthy and disease states. Moreover, considering that lncRNAs are generally expressed at lower levels, sufficient depth of coverage for RNA-seq experiments will need to be guaranteed (21).

Integrating genetic variation and proteome

Proteomics uses high-throughput approaches (mainly MS-based) to quantify protein abundance, post-translational modifications and interactions (e.g., using phage display and yeast two-hybrid assays) in a tissue, cell or fluid compartment, such as plasma or urine (21). Considering that the transcriptome is not linearly proportional to the proteome, that proteins are the biomolecules that execute cellular functions, and that many human diseases ultimately result from alterations in the proteome (70), such studies are urgently needed to facilitate the exploration of CAD etiology. However, proteome studies are still rare in relation to CAD, mostly due to the complex methodology involved. There have been some investigations in the past few years aiming at characterizing the proteomes of several CAD-related tissues and cell types, including human arterial smooth muscle cells (71) and platelets (72), as well as body fluids such as urine (73). Only a few studies (14, 40) have analyzed genetic variants that modify protein levels, i.e., the so-called protein quantitative trait loci (pQTLs) (Table 1). Chen et al. (40) assayed a pre-selected set of plasma proteins, identifying several pQTLs that overlapped with CAD risk SNPs and also explained a substantial proportion of inter-individual variation in protein abundance. For example, rs12740374 at the CELSR2/SORT1 locus, a variant associated with lipids and CAD, explained 15% of inter-individual variation in plasma granulin levels (Figure 7). Interestingly, progranulin binds to SORT1, and Sort1 knockout mice show markedly elevated levels of progranulin (40). Recently, it was also demonstrated that progranulin is involved in lysosomal homeostasis and lipid metabolism (74).
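A statement such as "a variant explained 15% of inter-individual variation" can be read as a ratio of sums of squares: the variation between genotype groups divided by the total variation in protein level. The sketch below illustrates that reading with genotype treated as a categorical factor; how the cited study actually computed its estimate is not stated here, so the function `variance_explained` and the toy values are assumptions for illustration.

```python
def variance_explained(genotypes, levels):
    """Fraction of total variance in `levels` attributable to genotype group.

    Computed as between-group sum of squares / total sum of squares,
    treating each genotype label as its own group (an assumption of this sketch).
    """
    n = len(levels)
    if n != len(genotypes) or n < 2:
        raise ValueError("need at least two paired observations")
    grand_mean = sum(levels) / n
    ss_total = sum((y - grand_mean) ** 2 for y in levels)
    if ss_total == 0:
        raise ValueError("protein levels do not vary")
    groups = {}
    for genotype, level in zip(genotypes, levels):
        groups.setdefault(genotype, []).append(level)
    ss_between = sum(
        len(values) * (sum(values) / len(values) - grand_mean) ** 2
        for values in groups.values()
    )
    return ss_between / ss_total


# Hypothetical plasma protein levels by genotype (arbitrary units, invented).
genotypes = ["GG", "GG", "GT", "GT", "TT", "TT"]
levels = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0]
print(variance_explained(genotypes, levels))
```

In this toy example the group means differ by one unit per allele, yet most of the spread lies within groups, which is why the fraction stays well below one.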

Figure 7

rs12740374 at the CELSR2/SORT1 locus, a variant associated with lipids and CAD, was recently found to display pQTL effects on plasma granulin levels (40), and progranulin is known to bind to SORT1. More recently, it was also demonstrated that progranulin is involved in lysosomal homeostasis and lipid metabolism (74).

As proteomics technologies improve over time (21), more genome-wide investigations of CAD-related alterations in the proteome and also the phosphoproteome, in increasing numbers of disease-relevant tissues, are expected to be conducted in the near future. However, as proteins are more sensitive to their environment (21), care will have to be taken during sample preparation steps to obtain accurate and reproducible results.

Integrating genetic variation and metabolome

An important additional functional layer in multi-omics data integration is the metabolome, as it represents an integrated state of all genetic, epigenetic and environmental factors, thus providing a link between genotype and phenotype (75). Metabolomics is an omics field that systematically identifies and quantifies multiple types of small molecules (typically < 1,500 Daltons), such as amino acids, fatty acids, carbohydrates and biochemical intermediates, i.e., metabolites (21). A plethora of metabolites in blood and urine have been associated with CAD and subsequent cardiovascular events (76–79) and have been demonstrated as promising biomarkers discriminating between CAD and non-CAD subjects (78), as well as between thrombotic MI and stable CAD cases (80). Kraus et al. (42) recently identified several genetic loci demonstrating associations with blood plasma metabolites (i.e., metabolomic quantitative trait loci; mQTLs), the strongest findings being for circulating short-chain dicarboxylacylcarnitine (SCDA) metabolite levels with variants in genes that regulate components of endoplasmic reticulum (ER) stress (Table 1 and Figure 8).

Figure 8

Kraus et al. (42) performed pathway-level integrative analyses and observed associations of circulating short-chain dicarboxylacylcarnitine (SCDA) with variants in ER stress genes, of which several genetic variants in the FBXO25 and SUGT1 genes also demonstrated evidence of cis-regulation in expression quantitative trait loci (eQTL) analyses and independently predicted CAD events.

Besides blood and urine, metabolomic profiles of vascular and metabolic tissues such as subcutaneous fat will need to be generated, ideally in conjunction with data from other omics layers. The gut microbiome would be of particular interest, considering the close link between the two (81). Of note, however, metabolic profiles are even more prone to variability arising from sample preparation and storage conditions, as well as from several other factors including patient heterogeneity (21). Hence, the required sample size has to be carefully considered to ensure confidence in the generated results.

Integrating genetic variation and microbiome

Microbiomics investigates all the microorganisms of a given community, including bacteria, viruses, and fungi, collectively known as the microbiota (and their genes constituting the microbiome) (21). The human microbiome is enormously complex, and there are substantial variations in microbiota composition between individuals resulting from seeding during birth and development, diet and other environmental factors, drugs and age (21). Thousands of different bacterial species make up the human microbiome, comprising a small number of abundant species and a large number of rare or low-abundance species, the differential functions of which remain poorly understood (82). Currently, several large-scale initiatives are emerging, including the American Gut Project (http://americangut.org/) and the British Gut Project (http://britishgut.org/), which are expected to produce a rich collection of anonymised human gut samples and lifestyle information for medical researchers. The gut microbiome has emerged as another rich source of information and as a possible new player contributing to CAD/MI pathogenesis (82–84). It has long been known that bacteria activate inflammatory pathways, and recent data demonstrate that the gut microbiome may also affect lipid metabolism and influence the development of obesity and atherosclerosis (84), suggesting that gut microbiota could be used as a diagnostic marker for CAD (85). The most investigated is the association between gut microbiota and fasting plasma levels of trimethylamine N-oxide (TMAO), a gut microbiota-dependent metabolite previously also associated with CAD and stroke (81, 86). Org et al. (81) demonstrated that certain blood plasma metabolites strongly correlated with gut microbial community structure and that some of these correlations may be specific for the pre-diabetic state. Le Chatelier et al. 
(84) used quantitative gut microbiome information to distinguish between individuals with “high bacterial richness” and “low bacterial richness,” where the latter were characterized by increased adiposity, insulin resistance and dyslipidemia in addition to a more pronounced inflammatory phenotype. Fu et al. (87) reported that gut microbiota richness and diversity were negatively correlated with triglycerides and positively correlated with HDL levels; however, this effect was independent of age, sex and host genetics. So far, genome-wide mapping of the so-called microbiome quantitative trait loci (mbQTLs) (88) in the context of CAD has not been performed and is definitely next in line, ideally in conjunction with comprehensive profiling of the metabolome in several tissues and body fluids in large enough cohorts.
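Richness and diversity, as used above, are summary statistics of a per-sample abundance table. A minimal sketch follows, assuming richness is the number of taxa with non-zero counts and diversity is the Shannon index; the cited studies may define these quantities differently (for example, by microbial gene counts), and the count vectors are invented for illustration.

```python
from math import log


def richness(counts):
    """Number of taxa observed at least once in a sample."""
    return sum(1 for c in counts if c > 0)


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample contains no reads")
    return -sum((c / total) * log(c / total) for c in counts if c > 0)


# Hypothetical read counts per taxon for two samples (invented).
even_sample = [25, 25, 25, 25]    # four equally abundant taxa
skewed_sample = [97, 1, 1, 1]     # same richness, one dominant taxon

for name, sample in (("even", even_sample), ("skewed", skewed_sample)):
    print(name, richness(sample), round(shannon_index(sample), 3))
```

The two samples have the same richness but very different Shannon indices, which is why studies typically report both measures.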

Integrating genetic variation and multiple omics datasets

An integrative analysis of genetic variation and transcriptome with additional high-throughput measurements may greatly improve the predictive power of disease networks (89). However, the number of studies conducting multi-omics integrations in the context of CAD is limited so far. Miller et al. (90) integrated genetic variation with investigations of chromatin state, enhancer activity and TF binding in human coronary artery smooth muscle cells and demonstrated, for example, that one of the lead candidate variants, rs17293632, located within an intergenic region of the SMAD3 gene, overlaps an open chromatin region. Moreover, it was observed that the major risk C allele was more strongly associated with open chromatin and resided in a canonical AP-1 motif, which was effectively destroyed by the minor protective T allele. Preferential AP-1 binding to the risk C allele was experimentally validated using allele-specific ChIP analyses (90). Kraus et al. (42) performed pathway-level integrative analyses, linking genetics, epigenetics, transcriptomics, and metabolomics profiles and implicating the ubiquitin proteasome system in cardiovascular disease pathogenesis. This study observed associations of circulating short-chain dicarboxylacylcarnitine (SCDA) with variants in ER stress genes, of which several genetic variants (Table 1 and Figure 8) in the FBXO25 and SUGT1 genes also demonstrated evidence of cis-regulation in expression quantitative trait loci (eQTL) analyses and independently predicted CAD events (42). Moreover, two other genes from the same ER stress pathway, BRSK2 and HOOK2, were identified as differentially methylated when comparing individuals with high and low SCDA levels. 
Subsequently, experimental validation in cultured human kidney cells exposed to fatty acid levels found in individuals with cardiometabolic disease demonstrated induced accumulation of SCDA metabolites in parallel with increases in the ER stress marker BiP (42). Shu et al. (20) investigated shared genetic regulatory networks for CAD and type 2 diabetes (T2D) and their key intervening drivers in multiple populations of diverse ethnicities by performing an integrative analysis of five multi-ethnic GWAS for CAD and T2D, eQTLs, ENCODE, as well as tissue-specific gene network models (both co-expression and graphical models) from disease-relevant tissues. This study identified pathways regulating the metabolism of lipids, glucose and branched-chain amino acids, as well as pathways governing oxidation, extracellular matrix and immune response, as shared pathogenic processes for both diseases. It also identified 15 key drivers, including HMGCR, CAV1, IGF1, and PCOLCE, whose network neighbors collectively accounted for ~35% of known GWAS hits for CAD and 22% for T2D (20). Laurila et al. (43) applied a combined approach using both QTLs and canonical pathway analysis to link genomics and transcriptome analysis from the subcutaneous adipose tissue with plasma HDL lipidomics profiling, highlighting a change in HDL particle quality toward a putatively more inflammatory and less atheroprotective phenotype in subjects with low HDL, due to their reduced antioxidative capacity. Within the HLA region, this study found two significant, dose-dependent cis-eQTL associations with low HDL and inflammatory pathways: rs241437 in the intron of TAP2 and rs9272143 between HLA-DRB1 and HLA-DQA1, the latter also being associated with down-regulation of antioxidative pathways in HDL particles (43). The application of multi-omics integrations in the field of CAD has so far been limited (22).
One of the main reasons for this is the current lack of appropriate data in sufficiently large cohorts. However, considering the great promise such studies hold for precision medicine, it is expected that parallel measurements on multiple omics layers will be rapidly collected during the next couple of years, also allowing a comprehensive comparison, validation and improvement of the existing computational integration methods.
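The allele-specific ChIP validation mentioned above for rs17293632 rests on comparing read counts between the two alleles at a heterozygous site. The following Python sketch shows one common way to assess such an imbalance, an exact two-sided binomial test against an expected 50:50 ratio. The read counts and the function name are hypothetical, and this is not necessarily the procedure used in (90).

```python
from math import comb

def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value: sum the probabilities of all
    outcomes that are no more likely than the observed count k."""
    probs = [comb(n, i) * p**i * (1 - p)**(n - i) for i in range(n + 1)]
    observed = probs[k]
    # Small relative tolerance guards against floating-point ties.
    return min(1.0, sum(q for q in probs if q <= observed * (1 + 1e-9)))

# Hypothetical ChIP read counts at a heterozygous site (assumption).
risk_allele_reads = 42   # reads carrying the C allele
other_allele_reads = 18  # reads carrying the T allele
n_reads = risk_allele_reads + other_allele_reads

p_value = binomial_two_sided_p(risk_allele_reads, n_reads)
print(f"allelic ratio = {risk_allele_reads / n_reads:.2f}, p = {p_value:.4f}")
```

A small p-value indicates that the transcription factor is bound preferentially to one allele. In practice, reference-mapping bias and overdispersion would also need to be considered.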

Mitochondrial genetic variation and downstream omics datasets

Dysfunction of mitochondria has been increasingly associated with obesity-related cardiometabolic diseases and CAD (91). Thus, genetic variation in the mitochondrial DNA (mtDNA), which carries 37 genes including 13 that encode OXPHOS subunits, as well as in the >1000 nuclear-encoded genes whose products are imported into mitochondria and are essential for their proper functioning, needs exploration for a better understanding of CAD genetics. The mitochondrial haplogroup T (45) and the mtDNA variants m.16189T>C (46) and m.15927G>A (47) have been associated with CAD in different ethnic groups. Another mitochondrial variant, m.8701A>G, has been associated with hypertension (44). This variant is located in the MT-ATP6 (ATP synthase/complex V F0 subunit 6) gene, which encodes part of the ATP synthase enzyme responsible for the final step of oxidative phosphorylation. On the functional level, experiments using transmitochondrial hybrid cells (cybrids) have shown that it alters mitochondrial matrix pH and intracellular calcium dynamics (Figure 9) (92).

Figure 9

Mitochondrial variant m.8701A>G is located in the MT-ATP6 (ATP synthase/complex V F0 subunit 6) gene, which encodes part of the ATP synthase enzyme responsible for the final step of oxidative phosphorylation, and has been associated with hypertension (44). On the functional level, experiments using transmitochondrial hybrid cells (cybrids) have shown that it alters mitochondrial matrix pH and intracellular calcium dynamics (92).

Similarly, other mitochondria-related omics data investigations could be of interest in the context of CAD, as Baccarelli et al. (93) reported that ATP synthesis genes, including the protein-encoding cytochrome c oxidase genes (MT-CO1, MT-CO2, and MT-CO3) and MT-TL1, were hypermethylated in platelets of CAD cases as compared to healthy controls. Using eQTLs in seven CAD-relevant vascular and metabolic tissues (53) in conjunction with established CAD risk loci from GWAS (9) and time-resolved transcriptome data from the aortic arch of mice with reversible hypercholesterolemia (94, 95), we recently demonstrated a massive down-regulation of nuclear-encoded mitochondrial genes (96), specifically at the time of rapid atherosclerotic lesion expansion and foam cell formation, which was largely reversible by genetically lowering plasma cholesterol. These mitochondrial signature genes were supported as causal for CAD in humans, as eQTLs representing their genes significantly overlapped with disease risk SNPs. In line with this, the STARNET (28) study recently examined mitochondrial (i.e., mtDNA-derived) gene expression and found a markedly lower expression of mitochondrial genes in the atherosclerotic aortic arterial wall as compared to the non-atherosclerotic arterial wall. Furthermore, genetic variation affecting the mitochondrial metabolome has remained largely unexplored. Hartiala et al. (41) searched for genetic factors associated with plasma betaine levels and determined their effect on CAD risk.
This resulted in the identification of two significantly associated loci on chromosomes 2q34 and 5q14.1. The lead variant on 2q34, rs715, localized to carbamoyl-phosphate synthase 1 (CPS1), which encodes a mitochondrial enzyme that catalyzes the first committed reaction and rate-limiting step in the urea cycle. Rs715 was also significantly associated with decreased levels of urea cycle metabolites and increased plasma glycine levels. Moreover, rs715 yielded a strikingly significant and protective association with decreased risk of CAD in women (41). Finally, in recent years, it has become increasingly evident that the gut microbiome produces metabolites that influence mitochondrial function and biogenesis (97); hence, the ancestral gut microbiome-mitochondrion connection and its relation to CAD might also need to be explored in the near future. Recent progress in next-generation sequencing (NGS) techniques has set the scene for a second “gold rush” in mitochondrial genomics, and mtDNAs are presently the most sequenced type of eukaryotic chromosome (98). At the same time, multi-omics investigations in mitochondria, mapping the genomes, transcriptomes, proteomes, and metabolomes in parallel, have so far been conducted only in yeast (99). Hence, although mitochondrial dysfunction has been associated with many human diseases, the respective proteins and pathways are not well-characterized (99), presenting an exciting future field of investigation, especially considering the fact that mitochondria play a key role in plasticity and adaptation to environmental change, including adaptation to physiological stress (100).
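The statement above that eQTL genes "significantly overlapped" with disease risk SNPs is, at its core, a set-overlap question. A minimal Python sketch of one standard way to quantify such an overlap is the one-sided hypergeometric test shown below. All gene counts and the function name are hypothetical, and the study cited as (96) may have used a different enrichment procedure.

```python
from math import comb

def hypergeom_enrichment_p(universe, set_a, set_b, overlap):
    """P(X >= overlap) when set_b items are drawn without replacement
    from a universe that contains set_a 'marked' items."""
    upper = min(set_a, set_b)
    total = comb(universe, set_b)
    tail = sum(
        comb(set_a, k) * comb(universe - set_a, set_b - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

# Hypothetical numbers (assumptions, not taken from the cited studies).
n_genes_tested = 20000      # all genes with expression data
n_signature_genes = 150     # e.g., a mitochondrial gene signature
n_risk_eqtl_genes = 300     # genes whose eQTLs coincide with CAD risk SNPs
n_shared = 12               # genes in both sets

expected = n_signature_genes * n_risk_eqtl_genes / n_genes_tested
p_value = hypergeom_enrichment_p(
    n_genes_tested, n_signature_genes, n_risk_eqtl_genes, n_shared
)
print(f"expected overlap = {expected:.2f}, observed = {n_shared}, p = {p_value:.2e}")
```

Exact integer arithmetic via `math.comb` avoids floating-point underflow for large gene universes. Linkage disequilibrium between SNPs and gene-length effects are ignored in this sketch.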

Conclusions and future directions

Given that CAD, like other common complex disorders, develops over time and involves both genetics and environment, full mechanistic insight will require coordinated sets of several -omics datasets at multiple time points, collected from many disease-relevant tissues and body fluids in sufficiently large cohorts (20, 21). Environmental risk factors can interact with the genome and perturb the epigenome to further modulate the transcriptome and proteome (20). Therefore, comprehensive monitoring and careful documentation of multiple environmental and lifestyle factors over time, i.e., the envirome, will be indispensable to yield significant insights into the complex etiology of CAD. Moreover, imaging and electronic health record data will also need to be considered. As more -omics and other data are generated, novel methods for efficient data integration, modeling, visualization and interpretation will be urgently needed to cope with these multi-dimensional data (101) and translate them into actionable precision medicine tools. Although there has been major progress in the development of multidimensional data integration algorithms and tools, the field is still in its infancy and the flexibility, effectiveness and robustness of data integration to extract biological insights are still restricted, especially when clinical outcomes (e.g., stable CAD vs. MI) need to be modeled (22, 101). In addition, we still face a number of technical challenges related to patient sampling and profiling. For example, as already recognized by Hasin et al. and others (20, 21), human studies are often affected by various confounding factors, which are difficult or even impossible to control for (e.g., diet and medications). Clearly, the available sample size will also play an important role for the multi-omics approach to produce meaningful insights into CAD (21) and allow the generation of reliable prediction models for more efficient design of therapeutics, tailored to individual needs.
According to Hasin et al., an underpowered study may not only miss true signals, but is also more likely to produce false positive results (21). Furthermore, even before and during data collection, careful attention has to be paid to data analysis requirements, e.g., sufficient depth of coverage for RNA-seq experiments (21).
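The point about underpowered studies can be illustrated with a short calculation. The sketch below computes the positive predictive value (PPV), i.e., the expected share of significant findings that are true, from the study's power, the significance threshold and the prior probability that a tested association is real. The chosen numbers and the function name are hypothetical and serve only to illustrate the argument.

```python
def positive_predictive_value(power, alpha, prior):
    """Share of significant findings that are true, given the study's
    power, its significance threshold alpha, and the prior probability
    that a tested association is real."""
    true_pos = power * prior
    false_pos = alpha * (1 - prior)
    return true_pos / (true_pos + false_pos)

# Hypothetical settings: 1 in 10 tested associations is real, alpha = 0.05.
for power in (0.8, 0.5, 0.2):
    ppv = positive_predictive_value(power, alpha=0.05, prior=0.1)
    print(f"power = {power:.1f} -> PPV = {ppv:.2f}")
```

With the false positive rate fixed by alpha, lowering the power reduces the number of true positives, so a larger fraction of the reported hits is false.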

Author contributions

BV and HS drafted and edited the manuscript.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

99 in total

1. Genetic Control of Chromatin States in Humans Involves Local and Distal Chromosomal Interactions.

Authors: Fabian Grubert; Judith B Zaugg; Maya Kasowski; Oana Ursu; Damek V Spacek; Alicia R Martin; Peyton Greenside; Rohith Srivas; Doug H Phanstiel; Aleksandra Pekowska; Nastaran Heidari; Ghia Euskirchen; Wolfgang Huber; Jonathan K Pritchard; Carlos D Bustamante; Lars M Steinmetz; Anshul Kundaje; Michael Snyder
Journal: Cell Date: 2015-08-20 Impact factor: 41.582

2. Genetic regulation of human adipose microRNA expression and its consequences for metabolic traits.

Authors: Mete Civelek; Raffi Hagopian; Calvin Pan; Nam Che; Wen-pin Yang; Paul S Kayne; Niyas K Saleem; Henna Cederberg; Johanna Kuusisto; Peter S Gargalovic; Todd G Kirchgessner; Markku Laakso; Aldons J Lusis
Journal: Hum Mol Genet Date: 2013-04-04 Impact factor: 6.150

3. Inheritance of coronary artery disease in men: an analysis of the role of the Y chromosome.

Authors: Fadi J Charchar; Lisa Ds Bloomer; Timothy A Barnes; Mark J Cowley; Christopher P Nelson; Yanzhong Wang; Matthew Denniff; Radoslaw Debiec; Paraskevi Christofidou; Scott Nankervis; Anna F Dominiczak; Ahmed Bani-Mustafa; Anthony J Balmforth; Alistair S Hall; Jeanette Erdmann; Francois Cambien; Panos Deloukas; Christian Hengstenberg; Chris Packard; Heribert Schunkert; Willem H Ouwehand; Ian Ford; Alison H Goodall; Mark A Jobling; Nilesh J Samani; Maciej Tomaszewski
Journal: Lancet Date: 2012-02-09 Impact factor: 79.321

4. Plasma cholesterol-induced lesion networks activated before regression of early, mature, and advanced atherosclerosis.

Authors: Johan L M Björkegren; Sara Hägg; Husain A Talukdar; Hassan Foroughi Asl; Rajeev K Jain; Cecilia Cedergren; Ming-Mei Shang; Aránzazu Rossignoli; Rabbe Takolander; Olle Melander; Anders Hamsten; Tom Michoel; Josefin Skogsberg
Journal: PLoS Genet Date: 2014-02-27 Impact factor: 5.917

5. Circular non-coding RNA ANRIL modulates ribosomal RNA maturation and atherosclerosis in humans.

Authors: Lesca M Holdt; Anika Stahringer; Kristina Sass; Garwin Pichler; Nils A Kulak; Wolfgang Wilfert; Alexander Kohlmaier; Andreas Herbst; Bernd H Northoff; Alexandros Nicolaou; Gabor Gäbel; Frank Beutner; Markus Scholz; Joachim Thiery; Kiran Musunuru; Knut Krohn; Matthias Mann; Daniel Teupser
Journal: Nat Commun Date: 2016-08-19 Impact factor: 14.919

Review 6. Multidimensional Integrative Genomics Approaches to Dissecting Cardiovascular Disease.

Authors: Douglas Arneson; Le Shu; Brandon Tsai; Rio Barrere-Cain; Christine Sun; Xia Yang
Journal: Front Cardiovasc Med Date: 2017-02-27

7. Network analysis of coronary artery disease risk genes elucidates disease mechanisms and druggable targets.

Authors: Harri Lempiäinen; Ingrid Brænne; Tom Michoel; Vinicius Tragante; Baiba Vilne; Tom R Webb; Theodosios Kyriakou; Johannes Eichner; Lingyao Zeng; Christina Willenborg; Oscar Franzen; Arno Ruusalepp; Anuj Goel; Sander W van der Laan; Claudia Biegert; Stephen Hamby; Husain A Talukdar; Hassan Foroughi Asl; Gerard Pasterkamp; Hugh Watkins; Nilesh J Samani; Timo Wittenberger; Jeanette Erdmann; Heribert Schunkert; Folkert W Asselbergs; Johan L M Björkegren
Journal: Sci Rep Date: 2018-02-21 Impact factor: 4.379

Review 8. Genome, transcriptome and proteome: the rise of omics data and their integration in biomedical sciences.

Authors: Claudia Manzoni; Demis A Kia; Jana Vandrovcova; John Hardy; Nicholas W Wood; Patrick A Lewis; Raffaele Ferrari
Journal: Brief Bioinform Date: 2018-03-01 Impact factor: 11.622

9. Genome-wide association study and targeted metabolomics identifies sex-specific association of CPS1 with coronary artery disease.

Authors: Jaana A Hartiala; W H Wilson Tang; Zeneng Wang; Amanda L Crow; Alexandre F R Stewart; Robert Roberts; Ruth McPherson; Jeanette Erdmann; Christina Willenborg; Stanley L Hazen; Hooman Allayee
Journal: Nat Commun Date: 2016-01-29 Impact factor: 14.919

10. Metabolomic Quantitative Trait Loci (mQTL) Mapping Implicates the Ubiquitin Proteasome System in Cardiovascular Disease Pathogenesis.

Authors: William E Kraus; Deborah M Muoio; Robert Stevens; Damian Craig; James R Bain; Elizabeth Grass; Carol Haynes; Lydia Kwee; Xuejun Qin; Dorothy H Slentz; Deidre Krupp; Michael Muehlbauer; Elizabeth R Hauser; Simon G Gregory; Christopher B Newgard; Svati H Shah
Journal: PLoS Genet Date: 2015-11-05 Impact factor: 5.917
