
Systems medicine of inflammaging.

Gastone C Castellani, Giulia Menichetti, Paolo Garagnani, Maria Giulia Bacalini, Chiara Pirazzini, Claudio Franceschi, Sebastiano Collino, Claudia Sala, Daniel Remondini, Enrico Giampieri, Ettore Mosca, Matteo Bersanelli, Silvia Vitali, Italo Faria do Valle, Pietro Liò, Luciano Milanesi.

Abstract

Systems Medicine (SM) can be defined as an extension of Systems Biology (SB) to clinical-epidemiological disciplines through a paradigm shift from a cell-centered toward a patient-centered framework. According to this vision, the three pillars of SM are biomedical hypotheses; experimental data, mainly obtained through omics technologies; and tailored computational, statistical and modeling tools. The three SM pillars are highly interconnected, and balancing them is crucial. Despite the great technological progress that has produced huge amounts of data (Big Data) and impressive computational facilities, biomedical hypotheses are still of primary importance. A paradigmatic example of a unifying biomedical theory is the concept of inflammaging. This complex phenotype is involved in a large number of pathologies and patho-physiological processes such as aging, age-related diseases and cancer, all sharing a common inflammatory pathogenesis. This biomedical hypothesis can be mapped onto an ecological perspective capable of describing, through quantitative and predictive models, some experimentally observed features, such as microenvironment, niche partitioning and phenotype propagation. In this article we show how this idea can be supported by computational methods that integrate, analyze and model large data sets, combining cross-sectional and longitudinal information on clinical, environmental and omics data of healthy subjects and patients to provide new multidimensional biomarkers capable of distinguishing between different conditions, e.g. healthy versus unhealthy state, physiological versus pathological aging.


Keywords: ecological model; inflammation; multi-scale; multilayer networks; networks; propagation

Substances: Biomarkers

Year: 2015 PMID: 26307062 PMCID: PMC4870395 DOI: 10.1093/bib/bbv062

Source DB: PubMed Journal: Brief Bioinform ISSN: 1467-5463 Impact factor: 11.622

Introduction

There is considerable debate about how to define Systems Medicine (SM). One idea is to define SM as an implementation of Systems Biology (SB) in the medical disciplines, with particular attention to clinical applications [1], including clinical bioinformatics and the discrimination of pathological states and related morbidities and comorbidities. Extending SB (Table 1) to clinical practice implies establishing a connection between a molecule-centered and a patient-centered world, through an organ-centered intermediate layer. This mapping (Figure 1) requires the extensive use of computational tools such as statistical, mathematical and bioinformatics techniques [1].

Table 1.

Increasing complexity and spreading of inflammation

Layer	BioMed hypothesis	Technology	Computation
Cellular	Inflammation Senescence	Epigenetic dysregulation Microscopy	Stochastic modeling Statistical distribution analysis inferential testing
Organ-tissue	Spreading of inflammation, including propagation of senescence	Sequencing Metabolomics Metagenomics	Bayesian methods Ecological modeling Diffusion models
Patient	Systemic inflammation Cancer Aging	Imaging (PET) Deep sequencing, including SNP detection	Population models Association testing (odds ratio, logistic regression) Texture analysis

Note. The propagation of inflammation through different layers of complexity can be disentangled by ad hoc combination of computational and technological tools, aimed to validate the biomedical hypotheses.

Figure 1.

SM as extension of SB. The SB basic cycle (red) is composed of Biological hypotheses, Technology (mainly devoted to omics measurements) and computational tools (statistical and modeling methods). The core SB cycle is then extended by increasing its complexity in a multi-scale way; starting from the cellular-subcellular domain, we reach the individual domain (the domain identified by a single patient and its internal structure in terms of organs and tissues). Finally, the higher domain is constituted by collection of patients and their interactions in an epidemiological context.

Within this perspective, SM is deeply related to Personalized (PeM) and Precision Medicine (PrM) [1-3]. PeM and PrM share the idea of the centrality of the individual patient as the common denominator for the development of tailored approaches such as therapies, drugs and treatments, according to the genetic background and, more in general, to the individual microenvironment [3]. The role of the microenvironment is becoming crucial in the comprehension of a variety of pathological and patho-physiological processes, such as cancer, diabetes, aging and age-related diseases [4-6]. 
Indeed, SM exploits, characterizes and quantifies the concept of microenvironment by novel analytical methods based on omics technologies and by the development of adequate computational models [7]. A research field where the microenvironment plays a central role is that of aging and age-related diseases. The microenvironment, mainly the circulating one, has been described as related to inflammation, contributing to the definition of inflammaging, a complex phenotype that affects a variety of age-related diseases [4-6], as demonstrated by a series of experiments collectively called ‘parabiosis’ [8]. These results gave rise to the concept of the Communicome, defined as the set of plasma proteins involved in the rejuvenation effect [8]. The parabiosis experiments can be interpreted within the framework of the so-called ‘bystander effect’ [9]. The bystander effect has been observed during aging progression, cancer induction and other pathologies in which it is believed to explain the systemic propagation between cells and organs [10]. In the human body, the most accessible fluids are serum and urine. The composition of these fluids can be assessed by a variety of methods that identify different fractions, such as metabolites, proteins and nucleic acids. These fractions are identified with omics methodologies such as metabolomics, proteomics and deep sequencing. Thus the body fluids, both extra- and intracellular, belong to the individual microenvironment (‘le milieu intérieur’) [11]. Our microenvironment is in dynamical equilibrium with a variety of cells, hosting, among others, nutrients and molecular messages crucially involved in the regulation of the cellular populations. The microenvironment can be considered as a particular ‘habitat’ where different cells, and in some cases other hosted organisms, live and interact. 
Within this perspective, the role of the microenvironment in shaping the other components of the body can be modeled from an ecological point of view. The same ecological approach can be used to model the ecosystems that live in parabiosis with our body, such as the gut microbiota (GM), the circulating virome (CV) and the transposable elements (TE) in our DNA [12-14], which have coevolved with the human host and its immune system (IS). This evolutionary-ecological framework is deeply related to inflammaging progression, given the role of the microenvironment, whose composition modulates the inflammatory process (pro- or anti-inflammatory action). A further computational tool is provided by complex networks [15] and their extension to multiple networks: multiplex and multilayer networks [16, 17]. Multilayer networks can be used to represent and quantify the interactions arising from the internal environment of the human body and those coming from the external environment. The architecture of complex networks is a natural embedding for diffusion processes shaped by topological constraints, such as those observed during inflammatory processes.
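As an illustration of diffusion constrained by network topology, the following minimal sketch spreads a signal over a two-layer multiplex network using explicit Euler steps of the graph-Laplacian diffusion equation. It is not a model taken from the cited works: the layer names, the node names, the `coupling` weight between copies of the same node and the `rate` parameter are all assumptions chosen for illustration.

```python
from collections import defaultdict


def build_multiplex(layers, coupling):
    """Return a weighted adjacency dict over (layer, node) pairs.

    layers   : dict layer_name -> list of (u, v) undirected edges
    coupling : weight of the link joining copies of the same node in
               different layers (hypothetical parameter)
    """
    adj = defaultdict(dict)
    present_in = defaultdict(set)  # node -> layers in which it appears
    for layer, edges in layers.items():
        for u, v in edges:
            adj[(layer, u)][(layer, v)] = 1.0
            adj[(layer, v)][(layer, u)] = 1.0
            present_in[u].add(layer)
            present_in[v].add(layer)
    for node, node_layers in present_in.items():
        ordered = sorted(node_layers)
        for i, a in enumerate(ordered):
            for b in ordered[i + 1:]:
                adj[(a, node)][(b, node)] = coupling
                adj[(b, node)][(a, node)] = coupling
    return adj


def diffuse(adj, state, rate=0.1, steps=50):
    """Explicit Euler steps of dx/dt = -rate * L x, with L the graph Laplacian.

    Stable as long as rate * (largest weighted degree) stays below 1.
    """
    x = {n: state.get(n, 0.0) for n in adj}
    for _ in range(steps):
        x = {n: x[n] + rate * sum(w * (x[m] - x[n]) for m, w in adj[n].items())
             for n in adj}
    return x


# Illustrative organs and layers (assumed, not measured data).
layers = {
    "circulatory": [("liver", "gut"), ("gut", "adipose")],
    "lymphatic": [("gut", "adipose"), ("adipose", "muscle")],
}
adj = build_multiplex(layers, coupling=0.5)
final = diffuse(adj, {("circulatory", "gut"): 1.0})
```

Because the edge weights are symmetric, the total amount of signal is conserved while it spreads within and across layers; a signal that starts in one layer reaches nodes present only in the other layer through the inter-layer coupling.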

The unifying biomedical hypothesis: inflammaging as the result of communicome and propagation

Chronic, low-grade inflammation is recognized as a main characteristic of aging bodies. This phenomenon, also referred to as inflammaging [4], is particularly important because practically all the major age-related diseases have an inflammatory pathogenic component [5]. Age is a major risk factor for these diseases, and because they all share inflammaging as a prominent determinant, inflammaging is a privileged target to combat age-related diseases as a whole and not one by one [6]. According to the most recent studies, inflammaging appears to be much more complex and entangled than previously thought, and a variety of interconnected tissues, organs and systems participate in producing inflammatory stimuli [4-6]. Among them we mention the IS, the adipose tissue, the muscles and the liver. The GM deserves particular attention, as does the variety of inflammatory stimuli sent through the general circulation to all the body organs [18]. The general idea is that the inflammatory stimuli fueling inflammaging can be exogenous (e.g. persistent viral infections, diet, lifestyle, see Figure 2) [19], but probably most of them are endogenously produced by the body and should be identified with the self-debris resulting from the continuous turnover of cells and tissues [4], e.g. circulating DNA and RNA (the virome) [13], pro-inflammatory agalactosylated N-glycans [20] and pro-inflammatory circulating microRNA (‘inflammaMIR’) [21].

Figure 2.

Patient data space. Personalized medicine gathers together a huge amount of data characterizing a single patient (red shape). This integrated system can be interpreted as a network of networks. These integrated networks belong to different complexity layers, i.e. the omics layer (e.g. DNA and cells), the anatomical-functional layer (e.g. skeleton, circulatory system, nervous system, lymphatic system) and finally, the environmental layer (e.g. dietary habits, diseases and drugs, social behaviors and sports). In particular, the anatomical-functional scale is a spatial multiplex network, i.e. nodes can be considered as specific regions in space, connected by different functional links. A colour version of this figure is available at BIB online: http://bib.oxfordjournals.org.

There is a general consensus that the most important phenomena involved in inflammaging at the cellular and molecular level are the following: (i) Cell senescence and its pro-inflammatory senescence-associated secretory phenotype, triggered by damaging agents (radiation, viruses) and likely by the continuous exposure to the above-mentioned self-debris [22]. It is interesting that cell senescence can spread to neighboring cells (‘senescence by senescence’) [23]. (ii) DNA damage, including damage to telomeres, by Reactive Oxygen Species (ROS) and by a variety of other agents, which in turn triggers a DNA damage response and the production of pro-inflammatory compounds [24]. All these mechanisms suggest targeted therapies aimed at reducing inflammaging: elimination of senescent cells [25], a diet enriched with omega-3 fatty acids, the Mediterranean diet [26] and other nutritional strategies, including caloric restriction (adapted to humans and taking into account genetic background) and intermittent fasting (Figure 2). Contemporary research is highlighting the role of inflammation in cancer induction and progression. 
Inflammaging therefore seems to be a unifying mechanism shared by different pathological processes and not just by aging and age-related diseases. A role of inflammation in cancer progression was already hypothesized by the ancient Greeks and has been progressively confirmed by a number of researchers [27]. The inflammatory state is sustained by cells and molecules that shape the microenvironment (Figure 3). Chronic inflammation affects all cancer stages: it increases the onset risk, supporting the initial genetic mutation or epigenetic mechanism, and it leads to cancer initiation, promoting tumor progression and supporting metastatic diffusion.

Figure 3.

The unifying BioMedical hypothesis of Inflammation as a driver of inflammaging and age-related pathological processes. The propagating (bystander) character of inflammation and inflammaging is strictly related to the microenvironment, including the circulating one (e.g. the communicome, virome, mobilome and GM), and its suggested interpretation from an ecological point of view.

The spreading character of inflammation has been quantified by the concept of Communicome, defined by [8] as the circulating plasma proteins (i.e. microenvironment) with endocrine activity, including inflammatory cytokines, hormone-like proteins, growth factors and so forth. These factors are involved in intercellular and organ communication, and their composition changes with age and pathologies are crucial for discriminating normal versus pathological trajectories. Another major player in shaping the individual microenvironment is the GM (Figure 3). GM is a microbial ecosystem with a complex dynamics that derives from the interactions with components such as the virome (the set of viruses in the host body) and the IS [28]. The collective genome of these symbiotic microorganisms (called ‘Microbiome’) [29] constantly interacts with the host genome, forming the so called ‘Metagenome’ [30]. The microbial genome is in dynamical relation with the host organism, helping in crucial functions such as metabolic processes (e.g. food absorption), short-chain fatty acids and vitamin production [31], but also shaping, controlling and protecting the IS development [32], so fueling the (co)-evolution of the host. It is through the interaction between the different components of the metagenome that the host health phenotype is defined [33]. GM is linked, through an interdependence relationship, to the host IS [34] and metabolism [35], becoming crucial for a large number of physio-pathological conditions and diseases. 
These include inflammatory and metabolic diseases, in particular obesity [36], metabolic syndrome (MS) and type 2 diabetes, as well as aging [37] and cancer. The availability of Next-Generation Sequencing (NGS) methods for the characterization of bacterial communities has contributed to the creation of a new research field, called metagenomics. Metagenomics is the set of omics measurements that quantify the composition of the metagenome and the interactions between the host and the microbiome at multiple levels: DNA (metagenome), RNA (meta-transcriptome), protein (meta-proteome) and metabolic network (metabolome).

The ecological framework: the examples of GM and TE

Two relevant examples of the feasibility of the microenvironment-related ecological hypothesis are the GM and TEs. Regarding the GM, there is growing interest in quantifying the biodiversity and dynamics that lead to a given composition of this complex ecosystem, in assessing its degree of homeostasis and its support capacity, and in predicting its temporal evolution. From this perspective, mathematical models clearly have a major role [38]. GM dynamics can be described from an ecological point of view, with a focus on GM biodiversity, and in particular on one of its components: the Relative Species Abundance (RSA) distribution. The RSA is defined within a single phylogenetic level and describes how common or rare a species is in comparison with other species; it is derived from a single family of statistical distributions, ranging from the Log-Series to a highly skewed and unveiled Log-Normal [39] (Figure 4). Modern ecological theories fall essentially into two main schools of thought: the niche assembly perspective and the dispersal assembly perspective [40].

Figure 4.

RSA of one GM sample from Claesson et al. (gray histogram) fitted with a mixture of two Negative Binomials (black line). The RSA is a measure of biodiversity and is usually represented in the form of Preston Plot, plotting the number of species that have a certain number of individuals (in log 2). The neutral model proposed by Volkov et al. predicts a Negative Binomial distribution for the RSA and fits well the TE population. The GM RSA is rather fitted by a mixture of two Negative Binomials, meaning that a relaxation of the neutrality assumption is needed. The GM is thus well described by a hybrid niche-neutral model, in which two neutral niches are considered.

The niche assembly perspective states that communities are groups of interacting species whose presence or absence, and even their relative abundance, can be deduced from deterministic ‘assembly rules’ that are based on the ecological niches or functional roles of each species. According to this view, species coexist in an interactive balance and a stable coexistence among competing species is made possible by niche partitioning [41]. On the other hand, the dispersal assembly (neutral theory) perspective asserts that communities are open, non-equilibrium assemblages of species largely put together by chance, history and random dispersal [40]. Species come and go, and their presence or absence is dictated by random dispersal and stochastic local extinction. Ecological communities are structured entirely by ecological drift (i.e. demographic stochasticity), random migration and random speciation. These two theories are not mutually exclusive; in fact it is possible to identify regimes where both of them are present. In particular, inflammaging supports both theories: the microenvironment can be conceived as an ecological niche, while the neutral theory efficiently describes and predicts the RSA shape, also in the presence of external perturbations (diet, lifestyle, etc.), for both GM and TE.

The TE ecosystem (the so-called mobilome) is another component of the microenvironment, related to the CV and describable as an ecological system living in a parabiotic or symbiotic (sometimes parasitic) relationship with the host DNA. TEs are also known as selfish DNA or jumping genes. They are present in the DNA of eukaryotic and prokaryotic organisms [42] and often constitute a large fraction of many genomes (45% of the human genome) [42]. TEs are DNA sequences that can change their position inside the genome through a process called transposition (or retrotransposition if the process is RNA mediated). This process can be replicative, i.e. producing another copy that will be inserted in another location in the genome (insertion), or not. The replicative process invades the host genome, increasing the genome length. TE activity may affect the host fitness negatively, neutrally and occasionally positively, generating a mutual selection owing to the interaction between the host and the elements living in its genome [43]. With the improvement of genomic sequencing techniques and the growth of sequencing data, a number of computational tools have appeared: ‘RepBase’, ‘Repbase Update’, ‘Pythia’, ‘CENSOR’, ‘RepeatMasker’ [44]. These databases and software tools provide information on TE sequences, copy number and their insertion coordinates in the genome. The analogy between an ecological system and TEs has been proposed in [12]. If a copy of a TE is considered as an individual, one TE species comprises closely genetically related TE copies that share the same interaction with their environment. The community of a genome contains all the copies of TEs irrespective of their subfamilies, families or classes, and it is analogous to the biotic portion of an ecosystem. The abiotic component is composed of the genes, the various kinds of noncoding sequences and the intracellular environment. It is still not clear whether TE community structure is mainly owing to specific element–host interactions or to stochastic processes; thus, both niche and neutral theories of ecology should and can be taken into account. Following from the concept of ecological niche, the ‘genomic niche’ for each TE species is inferred from its limiting factors and its genomic traits, i.e. any biological feature involved in its relationship with the genome. Traits could include preferential site attachment in the genome, strategy of replication and propensity to undergo horizontal transfer (transfer to a new genome out of the germ line). 
The neutral theory of ecology [45, 46] instead considers a dynamics driven by stochastic forces at the level of individuals, providing an alternative explanation to the establishment of complex TE communities.
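The stochastic, individual-level dynamics invoked by the neutral view can be made concrete with a toy simulation. The sketch below is a generic zero-sum neutral drift with immigration, written in the spirit of the dispersal-assembly picture; it is not the specific model of [45, 46], and the function name, the `immigration` probability and the species labels are assumptions introduced for illustration. All species are treated as equivalent: no niche or fitness difference enters the update rule.

```python
import random
from collections import Counter


def neutral_drift(community, steps, immigration=0.05, pool=None, seed=0):
    """Zero-sum neutral dynamics on a list of species labels.

    At each step one randomly chosen individual is replaced, either by an
    immigrant drawn from `pool` (with probability `immigration`) or by a copy
    of a randomly chosen resident (possibly the replaced individual itself,
    a simplification). Community size stays constant.
    """
    rng = random.Random(seed)
    community = list(community)
    pool = list(pool) if pool is not None else sorted(set(community))
    for _ in range(steps):
        dead = rng.randrange(len(community))
        if rng.random() < immigration:
            community[dead] = rng.choice(pool)
        else:
            community[dead] = rng.choice(community)
    return Counter(community)


# 500 individuals (e.g. TE copies or bacterial cells) from 20 equivalent species.
start = [f"sp{i % 20}" for i in range(500)]
abundances = neutral_drift(start, steps=20000)
```

Starting from equal abundances, drift alone produces an uneven abundance distribution, which is the kind of pattern the RSA summarizes.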

Technologies and computational tools

Metabolomics

Metabonomics is a well-recognized SM approach involving the study of metabolic responses to stimuli, with the aim of understanding their dynamics as well as organ-specific biochemical responses [47, 48]. Metabolomics generates multivariate information in cells, tissues and multi-compartmental biological systems, which reflects changes in biological processes [31, 49, 50]. Through the global study of low molecular weight compounds (<1500 Da), it characterizes the individual metabolic phenotype. Because specific physiological states, gene expression and environmental stressors can cause changes in the steady state of a biological system, monitoring metabolic perturbations provides a unique insight into intra- and extracellular regulatory processes involved in the metabolism [51-53]. To analyze the metabolites comprising the metabolome, at different concentrations and in multiple cellular compartments, different analytical technologies are jointly used. These metabolites include intermediaries of the metabolism, signaling molecules such as hormones and other secondary metabolites [48]. These techniques are mainly based on 1H nuclear magnetic resonance (NMR) and on mass spectrometry (MS) coupled to gas or high-performance liquid chromatography, with the addition of ultrahigh-performance liquid chromatography systems coupled to mass spectrometry [54]. NMR-based metabolomics provides efficient high-throughput analysis to holistically profile hundreds of metabolites with no a priori selection, while MS is characterized by its increased sensitivity, allowing precise quantification but, owing to the additional sample preparation steps, on a reduced number of samples. The two methods are used together to generate multivariate information, which is then deconvoluted by advanced statistical tools to provide meaningful biological readouts [55, 56]. 
Metabolomics is today applied directly to large population-based studies to enhance our understanding of the role of genetics, environmental factors and their interactions in individual susceptibility to disease and health [57].

Epigenomics

The inclusion of epigenetic data could play a substantial role in SM applications. The term ‘epigenetics’ defines the interplay between the genetic background and the environment that, through different molecular mechanisms, eventually produces the observed phenotypes. In particular, epigenetic mechanisms influence the expression of genes and the penetrance of the different allelic variants without changing the DNA sequence [58]. Among the different epigenetic mechanisms, such as microRNAs, protein ubiquitination, histone methylation and acetylation, one of the most studied is DNA methylation. DNA methylation occurs at specific genomic sequences, i.e. the CpG dinucleotide motif, where a cytosine is followed by a guanine (Fig. 5). There are about 30 million CpG sites throughout the genome, and small genomic regions of about 500–2000 bp that are significantly enriched in CpG sites are called ‘CpG islands’. These are classified as genic or non-genic CpG islands, depending on whether they map near a gene or in a gene desert region, respectively [59].

Figure 5.

Single-probe versus region-centric approaches for the analysis of DNA methylation microarrays. CpG DNA methylation measured by Infinium 450 k microarrays is expressed as a continuous value ranging from 0 (the CpG site is unmethylated in all the analyzed DNA molecules) to 1 (the CpG site is methylated in all the analyzed DNA molecules). In the analysis of differential methylation between two groups of samples, single-probe and region-centric approaches return different results. Single-probe analysis favors genomic regions like the one reported in box A, where a unique CpG site strongly differs in its methylation value between group A and group B. In the genomic region reported in box B, on the contrary, differences in DNA methylation values between groups A and B are smaller, but they involve several adjacent CpG sites; this configuration is preferentially identified by a region-centric approach. In the figure, CpG sites are represented as lollipops. A colour version of this figure is available at BIB online: http://bib.oxfordjournals.org.

While the role of the non-genic CpG islands is still not clear, the genic CpG islands are well-recognized functional elements, whose methylation is often correlated with the expression of the nearby gene. Even though this is not a strict rule, in general the hypo-methylation of a genic CpG island is correlated with the hyper-expression of the corresponding gene and, vice versa, hyper-methylation correlates with its hypo-expression. The state of the art in epigenomic technology is NGS, which has been successfully applied to the study of DNA methylation. The field of human aging has particularly benefited from these technologies. Indeed, age-related methylation changes are among the major molecular remodeling phenomena that occur during aging [60-62]. These results strongly support the idea that DNA methylation as a whole is deeply involved in the aging process and its related effects on physiological fitness. This makes DNA methylation one of the sharpest arrows for the Systems Medicine bow. Nevertheless, there are many open issues on the use of such data and on how to integrate them with the other omics markers.
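The contrast between single-probe and region-centric analysis described in the caption of Figure 5 can be sketched on toy data. The example below is only illustrative: it uses plain differences in mean methylation with arbitrary thresholds instead of a statistical test with multiple-testing correction, and the function names, thresholds, window size and beta values are all assumptions.

```python
from statistics import mean


def probe_deltas(group_a, group_b):
    """Per-CpG difference in mean methylation value (group B minus group A).

    Each group is a list of samples; each sample is a list of values between
    0 (unmethylated) and 1 (fully methylated), in genomic order.
    """
    n_probes = len(group_a[0])
    return [mean(s[i] for s in group_b) - mean(s[i] for s in group_a)
            for i in range(n_probes)]


def single_probe_hits(deltas, threshold=0.3):
    """Indices of CpG sites whose individual difference reaches the threshold."""
    return [i for i, d in enumerate(deltas) if abs(d) >= threshold]


def region_hits(deltas, window=4, threshold=0.1):
    """Start indices of windows of adjacent CpGs that all shift in the same
    direction and whose mean difference reaches the (smaller) threshold."""
    hits = []
    for start in range(len(deltas) - window + 1):
        chunk = deltas[start:start + window]
        same_sign = all(d > 0 for d in chunk) or all(d < 0 for d in chunk)
        if same_sign and abs(mean(chunk)) >= threshold:
            hits.append(start)
    return hits


# Probes 0-4 mimic box A (one strongly shifted CpG);
# probes 5-9 mimic box B (several adjacent CpGs with a small shift).
group_a = [[0.2, 0.2, 0.2, 0.2, 0.2, 0.5, 0.5, 0.5, 0.5, 0.5]] * 3
group_b = [[0.2, 0.2, 0.7, 0.2, 0.2, 0.65, 0.65, 0.65, 0.65, 0.65]] * 3
deltas = probe_deltas(group_a, group_b)
probe_level = single_probe_hits(deltas)
region_level = region_hits(deltas)
```

On these values the single-probe criterion flags only the isolated CpG of the box A pattern, while the windowed criterion flags only the run of adjacent CpGs of the box B pattern.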

Next-generation sequencing

The application of NGS has become a powerful approach for determining DNA and RNA sequences in ‘omics’ studies (e.g. genomics, viromics, metagenomics, transcriptomics and epigenomics). NGS techniques have been widely used to profile the genetic variation of human and environmental samples, as they permit high-resolution and high-throughput detection of DNA and RNA polymorphisms. The main steps for detecting DNA/RNA polymorphisms, also called ‘variants’, in NGS data are: quality control check, mapping reads to a reference genome, alignment processing, variant calling and annotation. The quality control check aims to remove sequence artifacts produced by sequencing platforms. It is performed by trimming or filtering reads according to base quality score thresholds. The FASTX-Toolkit (http://hannonlab.cshl.edu/fastx_toolkit/) is an example of software that offers a collection of tools for quality control on sequencing data. The accuracy of the read mapping step has a crucial role in variant detection, as reads that are not correctly aligned can lead to errors in variant and genotype classification. Recently, several fast and memory-efficient software solutions were developed, such as Bowtie2 [63], SOAPv2 [64] and BWA [65]. The manipulation and processing of alignment files (indexing, sorting and duplicate removal) can be done with Picard (http://broadinstitute.github.io/picard/) and SAMtools [66]. Previous studies demonstrated that the Phred-scaled quality scores produced by the sequencing platforms may not accurately reflect the true base-calling error rates [67, 68]. As the variant calling algorithms depend on these scores, a recalibration process can be performed with SOAPsnp [68] and GATK [69] to ensure their accuracy. 
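The quality control step relies on the Phred scale, in which a score Q corresponds to a base-calling error probability p = 10^(-Q/10). The following minimal sketch shows threshold-based trimming and filtering of a single read. It does not reproduce the implementation of any of the tools cited above; the function names, the Phred+33 encoding and the default thresholds are assumptions made for illustration.

```python
def phred_to_error(q):
    """Phred score Q -> base-calling error probability, p = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


def decode_qualities(quality_string, offset=33):
    """Decode a FASTQ quality string (Phred+33 encoding is assumed)."""
    return [ord(ch) - offset for ch in quality_string]


def trim_3prime(sequence, quality_string, min_q=20, offset=33):
    """Drop trailing bases whose quality score is below min_q."""
    quals = decode_qualities(quality_string, offset)
    end = len(quals)
    while end > 0 and quals[end - 1] < min_q:
        end -= 1
    return sequence[:end], quality_string[:end]


def passes_filter(quality_string, min_q=20, min_fraction=0.8, offset=33):
    """Keep a read only if at least min_fraction of its bases reach min_q."""
    quals = decode_qualities(quality_string, offset)
    return bool(quals) and sum(q >= min_q for q in quals) / len(quals) >= min_fraction


# Toy read: six high-quality bases followed by two low-quality ones.
read_seq, read_qual = "ACGTACGT", "IIIIII#!"
trimmed_seq, trimmed_qual = trim_3prime(read_seq, read_qual)
```

In this toy read the untrimmed quality string fails the filter (6 of 8 bases reach Q20), whereas the trimmed read passes it, which is why trimming is usually applied before filtering.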
Determining polymorphisms in sequencing data, also referred to as ‘variant calling’, permits the classification of allelic genotypes and, depending on the experimental design, also the discovery of molecular events such as somatic mutations, loss of heterozygosity, mRNA editing and clonal evolution of cancer samples. This kind of information is essential to differentiate between ‘normal’ and tumoral tissues and can be used to construct biomarker signatures for the early diagnosis of cancer risk. Variant calling can be performed by general-purpose software such as GATK and SAMtools, or by software designed for specific purposes, such as VarScan2 [70], SomaticSniper [71], JointSNVMix [72], Strelka [73] and MuTect [74], which were developed for the joint analysis of matched normal-cancer samples. The final step is variant annotation, so that variants can be prioritized according to their biological meaning and/or disease-causing potential. Annovar [75] and SnpEff [76] are examples of software that perform gene-based annotation of human variants, also taking into account information present in the dbSNP and 1000 Genomes databases. The choice of tools and parameters for sequencing data analysis may have a great impact on the final results, and it will vary according to the type of sample, sequencing platform and experimental design. NGS methods are widely used in metagenomics, where they rely on the taxonomic value of 16S rRNA for microbial classification. Usually 16S rRNA is massively sequenced by various methods (e.g. 454, Ion Torrent), after which the main step of 16S rRNA sequencing data processing is clustering. Clustering means grouping sequences according to some similarity criterion, and it underpins two fundamental analyses: the taxonomic classification of the bacteria in the sample and the ecological description of the population. 
The two widely used approaches put sequences into bins based on either their similarity to reference sequences (i.e. phylotyping by reference clustering) or their similarity to other sequences in the community [i.e. Operational Taxonomic Units (OTUs)] [77]. Reference clustering is of course helpful if the aim is taxonomic classification, but it implies a loss of information that makes it almost useless if the purpose is to give an ecological description of the microbiota community. The distribution of OTUs (RSA, see Figure 4) is an important component of biodiversity. To assess the microbiota RSA from 16S rRNA sequencing data, sequences are clustered with a de novo algorithm, thus producing the so-called OTUs. Because the 16S rRNA gene is a highly conserved sequence of the genome, sequences that belong to the same OTU will be phylogenetically close and can be used as ‘species’ to compute the RSA. Many different algorithms have been suggested to carry out de novo clustering: BLAST [78], MOTHUR [79], QIIME [80], CD-HIT [81] and UCLUST [82]. In particular, UCLUST is a greedy algorithm that uses less memory and is even faster than CD-HIT [82, 83]. It starts with an empty database in memory and then reads the sequences in input order. Sequences with a similarity greater than 97% are typically assigned to the same species, those with similarity >95% to the same genus, and those with >80% to the same phylum. OTUs defined by a certain similarity threshold therefore represent bacterial ‘species’ at a particular phylogenetic level. It follows that, starting from the OTU abundances, one can easily build the RSA distribution by plotting the number of species (y-axis) that have a certain number of individuals, i.e. sequences (x-axis), and thus obtain a measure of the microbiota biodiversity. Because of the long tail of the RSA, it is usual to take the base-2 logarithm of the x-axis, obtaining the so-called Preston plot (Figure 4).
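The path from sequences to a Preston plot can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not UCLUST itself: similarity is approximated by the fraction of matching positions between equal-length sequences (a real tool aligns the sequences), a sequence joins the first centroid at or above the threshold, and octave k collects the OTUs whose abundance lies in [2^k, 2^(k+1)). All function names are hypothetical.

```python
from collections import Counter


def identity(a, b):
    """Toy similarity: fraction of matching positions (0 if lengths differ)."""
    if len(a) != len(b) or not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_cluster(sequences, threshold=0.97):
    """Greedy de novo clustering: read sequences in input order, starting
    from an empty centroid database. A sequence joins the first centroid it
    matches at or above the threshold; otherwise it founds a new OTU."""
    centroids, sizes = [], []
    for seq in sequences:
        for i, centroid in enumerate(centroids):
            if identity(seq, centroid) >= threshold:
                sizes[i] += 1
                break
        else:
            centroids.append(seq)
            sizes.append(1)
    return centroids, sizes


def preston_bins(otu_abundances):
    """Count OTUs per log2 abundance class (octave k = floor(log2(n)))."""
    bins = Counter()
    for n in otu_abundances:
        if n < 1:
            raise ValueError("abundances must be positive integers")
        bins[n.bit_length() - 1] += 1
    return dict(sorted(bins.items()))


# Hypothetical OTU abundances (number of sequences per OTU).
abundances = [1, 1, 1, 2, 3, 4, 7, 8, 16, 100]
print(preston_bins(abundances))
```

Plotting the returned counts against the octave index gives the Preston representation of the RSA. Preston's original binning splits boundary abundances between neighbouring octaves, which this simplified version does not do.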
A remarkable result is that (see Figure 4) the RSA curves obtained from microbial and TE data follow the same statistical distribution. In particular, a population dynamics explanation has been given in [45] and [46] by a stochastic neutral model in which species interaction is neglected and the resulting RSA is described by a Negative Binomial Distribution. The parameters of this distribution are particularly sensitive to variations such as dietary regimes (e.g. the amount of proteins, fatty acids and fibers), pathological conditions (e.g. obesity, MS and type 2 diabetes) and aging. The characterization of these parameters (even in the multidimensional case) provides a further, more advanced method for classifying different health states.
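As a rough illustration of how the parameters of a Negative Binomial Distribution could be extracted from abundance data, the following Python sketch uses a method-of-moments estimate. It is an assumption-laden simplification and not the fitting procedure of [45] or [46]: it uses the parametrization with mean r(1-p)/p and variance r(1-p)/p^2, ignores the fact that observed abundances are at least 1 (zero truncation), and requires the sample variance to exceed the mean. The function names and the sample data are hypothetical.

```python
import math
from statistics import mean, pvariance


def fit_negative_binomial(abundances):
    """Method-of-moments estimate of (r, p) for a negative binomial with
    mean r(1-p)/p and variance r(1-p)/p^2. Needs variance > mean."""
    m = mean(abundances)
    v = pvariance(abundances)
    if v <= m:
        raise ValueError("data are not over-dispersed (variance <= mean)")
    p = m / v
    r = m * m / (v - m)
    return r, p


def negative_binomial_pmf(k, r, p):
    """Probability of abundance k, computed in log space for stability."""
    log_pmf = (math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
               + r * math.log(p) + k * math.log(1 - p))
    return math.exp(log_pmf)


# Hypothetical OTU abundances for one sample.
r, p = fit_negative_binomial([1, 3, 5, 7, 9])
print(r, p, negative_binomial_pmf(5, r, p))
```

Comparing the fitted (r, p) pairs across samples is one simple way to turn the RSA into a low-dimensional feature vector. A maximum-likelihood fit of a zero-truncated model would be the more careful choice for real data.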

Multilayer networks: a computational tool for multi-scale integration and diffusion

Network theory is a major branch of the so-called complexity science: it investigates the global topology and structural patterns of the interactions among the constituents of social groups, infrastructures, brain and biological networks [15, 84, 85]. Networks are the most natural way to model many types of biological knowledge at different levels: from single-cell protein interaction, transcriptional regulation and metabolic networks, up to neuronal, IS or ecological networks involving multiple cells or organisms. Network approaches can be useful, in a Personalized Medicine perspective [3], for the following purposes: (i) community detection (e.g. identification of functional modules or patient stratification) [86, 87]; (ii) centrality measures for ranking network elements (nodes, edges), from classical measures like Betweenness Centrality to more recent ones that are also suited for dense and weighted networks [88, 89]; (iii) characterization of network ensembles as null models, to identify nonrandom patterns in real networks and for multilevel network analysis embedding a priori biological information [90-92]. The omics era has converged with the era of ‘big data’ for all types of biological information: genomics, proteomics, interactomics, cellomics, organomics, in vitro and in vivo imaging [3], and it has generated a new approach to medicine, i.e. SM. The definition of SM is deeply related to complex networks: it involves a systemic view of the organism where the various building elements are considered in their interplay [16, 93, 94]. A paradigmatic example is the network of human diseases. The diseasome [95] is a bipartite network that connects human diseases (disease phenome) and human genes (disease genome). The strong connectivity of molecular systems implies that a specific dysfunction of an element propagates throughout the network of interactions and affects the activity of other components.
Therefore, different dysfunctions can lead to the same effect through different pathways. The concept of ‘disease module’ has been proposed to indicate a group of network components whose disruption leads to a particular disease phenotype. Single-target therapies may fail because they do not take the underlying network characteristics into account during target identification [96]. On the other hand, a drug used for one disease could also prove valuable for other diseases strongly connected to the original target node. Two examples of network-based disease targeting are the ‘central hit strategy’, which selectively focuses on central nodes/edges in the biological network, and the ‘network influence strategy’, in which several neighbors of central nodes/edges are targeted instead [94]. These approaches require detailed knowledge of the underlying networks and of their dynamics, as provided for example by controllability approaches, i.e. the study of how specific nodes can shift the dynamical status of a network [96-99]. The status of each patient can be described by several networks, e.g. related to the patient's genomic, proteomic or metabolic profile. When gene expression data are available, they can be mapped onto protein–protein networks, thus integrating general biological knowledge and sample-specific information at multiple levels (from the whole network to a single gene or pathway) [100]. Further, each patient can be summarized by a vector of network observables, like betweenness centrality, spectral centrality [88], community labels [86] or single-node entropy measures [100], and suitable clustering/discrimination algorithms can help to stratify or classify such samples. A possible vector is related to the concept of diffusion of information throughout a network.
This measure has been applied to the stratification of tumor mutations [101], the association of genes and protein complexes with diseases [102], the identification of biomarkers in genome-wide studies [103, 104] and the relation between viral (Epstein Barr Virus (EBV), Human Papilloma Virus (HPV) and Hepatitis C Virus (HCV)) targets and the corresponding cellular response [105, 106]. Diffusion algorithms simulate the diffusion of information from a subset of vertices to all the others. At the end of the process, the amount of information per vertex depends both on the initial distribution of information and on the network topology; it therefore allows all vertices to be ranked by their network proximity to the subset of vertices that carry the initial information. So far, the different networks for the same sample have been investigated separately. Recently, the framework of ‘multilayer networks’ has been introduced [16, 107–109]. SM seems naturally embedded in a multilayer network. Multilayer networks might represent the tools to successfully combine the interactions between the different constituents of the cell, and the answer to the aim of personalized medicine to define a ‘quantified self’ or to assess global wellness [3]. For example, a multilayer network provides novel approaches for the combined analysis of samples in different states [109]. Integrative multi-omics approaches offer a more comprehensive picture, especially when embedded in a priori biological knowledge [110], and different layers of disease-related information can be analyzed together [111]. Some multilayer integrative approaches, like iCluster [112] and MDI (Multiple Dataset Integration) [113], can be applied to several types of data sets, while others, such as Camelot [114] and CNAmet [115], were developed for specific combinations of omics.
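The diffusion idea discussed above, in which information spreads from a subset of vertices and the final amount per vertex is used for ranking, can be sketched as a random walk with restart. This is one common choice among several diffusion schemes and is not claimed to be the algorithm used in the cited studies. The restart probability of 0.3, the function name and the toy graph are assumptions for illustration.

```python
def diffuse(adjacency, seeds, restart=0.3, tol=1e-9, max_iter=1000):
    """Random walk with restart on an undirected graph.

    adjacency: dict mapping each node to the set of its neighbours.
    seeds: nodes that carry the initial information (uniformly weighted).
    Returns a dict node -> steady-state amount of information.
    """
    seeds = set(seeds)
    if not seeds:
        raise ValueError("at least one seed node is required")
    nodes = list(adjacency)
    p0 = {v: (1.0 / len(seeds) if v in seeds else 0.0) for v in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # A fraction `restart` of the information returns to the seeds.
        nxt = {v: restart * p0[v] for v in nodes}
        for u in nodes:
            degree = len(adjacency[u])
            if degree == 0:
                # An isolated node keeps its own share.
                nxt[u] += (1 - restart) * p[u]
                continue
            share = (1 - restart) * p[u] / degree
            for w in adjacency[u]:
                nxt[w] += share
        delta = sum(abs(nxt[v] - p[v]) for v in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical four-gene path a - b - c - d with information placed on 'a'.
graph = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
scores = diffuse(graph, seeds=["a"])
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

On this path graph the ranking follows the distance from the seed, which is the proximity ordering that the diffusion score is meant to capture.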
The integration strategies can be sequential or simultaneous: sequential approaches refine the results of one layer using further layers; conversely, simultaneous approaches jointly analyze all layers without imposing an a priori order. Formally speaking [16], a multilayer graph $\mathcal{M}$ is described by a set of $M$ graphs $\{G_\alpha\}$, $\alpha = 1, \dots, M$ (called layers), and by a set of connections between nodes of different layers $G_\alpha$ and $G_\beta$, with $\alpha \neq \beta$ (‘interlayer connections’, see Figure 6, red edges). ‘Intralayer connections’ connect nodes in the same layer (Figure 6, green and blue edges). We denote by $A^{[\alpha]}$ the adjacency matrix of the network layer $G_\alpha$ and by $A^{[\alpha,\beta]}$ the adjacency matrix representing interlayer connections between nodes of different layers $G_\alpha$ and $G_\beta$. The adjacency matrix of the projection network is indicated as $\mathrm{proj}(\mathcal{M})$ (shown in Figure 6).
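A small Python sketch may help to make the layered description concrete. It assumes, following the usual convention, that the projection network contains every node of every layer and every intralayer and interlayer edge. The node labels, the two toy layers G1 and G2 and the function name are hypothetical.

```python
def projection_adjacency(layers, interlayer_edges):
    """Build the adjacency matrix of the projection network.

    layers: dict layer name -> (set of nodes, list of intralayer edges).
    interlayer_edges: list of (u, w) pairs joining nodes of different layers.
    Returns (ordered node list, symmetric 0/1 matrix as nested lists).
    """
    nodes = sorted({v for layer_nodes, _ in layers.values() for v in layer_nodes})
    index = {v: i for i, v in enumerate(nodes)}
    size = len(nodes)
    matrix = [[0] * size for _ in range(size)]
    all_edges = [e for _, edges in layers.values() for e in edges]
    all_edges += list(interlayer_edges)
    for u, w in all_edges:
        # Undirected edges: fill both symmetric entries.
        matrix[index[u]][index[w]] = 1
        matrix[index[w]][index[u]] = 1
    return nodes, matrix


# Hypothetical two-layer example in the spirit of Figure 6.
layers = {
    "G1": ({"a", "b", "c"}, [("a", "b"), ("b", "c")]),  # intralayer edges of G1
    "G2": ({"x", "y"}, [("x", "y")]),                   # intralayer edges of G2
}
interlayer = [("a", "x"), ("c", "y")]                    # interlayer edges
nodes, matrix = projection_adjacency(layers, interlayer)
for name, row in zip(nodes, matrix):
    print(name, row)
```

The diagonal blocks of the resulting matrix hold the intralayer adjacency of each layer, and the off-diagonal blocks hold the interlayer connections.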

Figure 6.

A representation of a generic multilayer network composed of two graphs: G1 and G2. The interlayer connections are in red, while the intralayer connections are in green for graph G1 and in blue for graph G2. The adjacency matrix of the related projection network proj(M) is displayed in the lower-right corner. A colour version of this figure is available at BIB online: http://bib.oxfordjournals.org.

Multilayer networks can be divided into multiplex networks [116-119] and interacting networks of networks [120, 121]. In a multiplex network, the same set of nodes has different types of interactions in each layer. In the latter, nodes of different layers represent different elements of the system, as in the cell, where metabolites, proteins and transcription factors are distinct biological entities [109]. These systems show novel critical phenomena and dynamical processes [122-124]. Multiplex features like degree correlation [125] determine whether a hub in one layer is also a hub in another layer, while the overlap quantifies how much two nodes of the network interact in several layers at the same time. Recent applications to social or biological weighted networks [108, 109] show how novel and relevant information can be uncovered only by exploiting the multiplex nature of such systems. In a typical case-control design, e.g. cancer versus healthy samples, co-expression networks (based on sample omics measurements) are naturally embedded in a case-control duplex [109]. This duplex naturally stratifies into the backbone of interactions shared by both layers, highlighting layer-specific relations and behaviors. SM approaches can be embedded in networks of networks [3] or in multiplex networks (Figure 7), according to the model assumptions and to the available observables. The single-layer measures must be modified to take into account the multilayered structure, as in the definition of shortest paths [126].
Centrality measures [127] and Network entropy [100, 107, 109] can be redefined onto such structures, extending their meaning over different types of nodes or relationships. A still unexplored field concerns the use of information in some layers as priors for network-based statistical analysis of a single layer. This can be exploited for network reconstruction or for multidimensional clustering [128], with the features of one layer used as constraints for the other layers.
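Two of the multiplex features mentioned above, the edge overlap and the degree correlation between layers, can be computed with a short Python sketch. The definitions used here are simple assumptions for illustration: the overlap of a node pair is the number of layers in which the pair is linked, and the inter-layer degree correlation is the Pearson correlation between the degree sequences of two layers. The patient labels and function names are hypothetical.

```python
import math
from collections import Counter


def edge_overlap(layers):
    """Count, for every linked node pair, the number of layers containing it."""
    overlap = Counter()
    for edges in layers.values():
        for u, w in set(frozenset(e) for e in edges):
            overlap[frozenset((u, w))] += 1
    return overlap


def degrees(nodes, edges):
    """Degree of every node in one layer (nodes without links get 0)."""
    deg = {v: 0 for v in nodes}
    for u, w in edges:
        deg[u] += 1
        deg[w] += 1
    return deg


def degree_correlation(nodes, edges_a, edges_b):
    """Pearson correlation between the degree sequences of two layers."""
    da, db = degrees(nodes, edges_a), degrees(nodes, edges_b)
    xs = [da[v] for v in nodes]
    ys = [db[v] for v in nodes]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("a layer with constant degree has no defined correlation")
    return cov / (sx * sy)


# Hypothetical duplex over four patients: one similarity layer per omics type.
nodes = ["p1", "p2", "p3", "p4"]
layers = {
    "A": [("p1", "p2"), ("p1", "p3"), ("p1", "p4")],
    "B": [("p1", "p2"), ("p1", "p3"), ("p2", "p3")],
}
print(edge_overlap(layers))
print(degree_correlation(nodes, layers["A"], layers["B"]))
```

Pairs with an overlap equal to the number of layers form the shared backbone of the duplex, while pairs with an overlap of 1 are layer-specific.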

Figure 7.

A possible multiplex architecture for SM: in a multiplex network, the same set of nodes (patients) has different types of interactions in each layer. Each layer corresponds to a given omics measurement, and a link might be a similarity measure between two people. Only a few layers are represented for the sake of simplicity. The bottom layer is divided into Genetics and Environment, which may have different roles in the causation of a given phenotypical trait (e.g. clusters). A colour version of this figure is available at BIB online: http://bib.oxfordjournals.org.

A multiplex network approach has been used to find recurrent subgraphs in co-expression networks from biological data [129]. These subgraphs can be associated with functional modules, protein complexes or transcriptional modules for particular phenotypic conditions, confirming how phenotypes are related to complex structures of the genetic network. In the same context, sub-network extraction based on multi-objective optimization of several criteria permits the identification of modules enriched in multi-omics information [130]. Last but not least, multiplex networks are the ideal formalism for time-evolving networks [131]: if the same measurements are available at different stages (e.g. at disease onset, during and after therapy), neglecting the time dependence of the links compromises, from the point of view of personalized medicine, the correct reconstruction of the patient’s history and disease progression.

Conclusion and perspectives

In this article we discussed the hypothesis that inflammation/inflammaging can be a unifying mechanism for SM. The reasons are rooted in: (i) the multi-scale character of inflammaging, which starts as a local phenomenon and subsequently propagates at a systemic level; (ii) the involvement of the microenvironment as an active medium for exchanging signals between different layers of biological complexity; (iii) the possibility of using ecological modeling to describe and predict the behavior of systems such as the GM and the Mobilome; (iv) the natural embedding of SM in a multilayer network formalism, capable of integrating multi-omics data. Future challenges will include investigating the ecological niches and association dynamics of microbiomes by using 3D cartography of the human body [132]. One important direction will be the study of the impact of human microbiome distribution on immunity and on the tissue-specific epigenomic marks of disease- and trait-associated genetic variants (see [133] for a large collection of human epigenomes). This could be done for different populations, human mobility groups in routine [133] and shift conditions. This integrated analysis will lay the ground for a better understanding of the evolution of comorbidities and multi-morbidities, particularly in the presence of infectious diseases, and for finding appropriate therapies [134, 135]. We strongly believe that the combination of these concepts and tools will be the basis for an SM focused on the comprehension of patho-physiological processes such as aging and age-related pathologies, including cancer, and that it will pave the way toward a better understanding of the differences between health and disease states.

Key Points

Inflammation is a unifying Biomedical hypothesis that is particularly attractive for Systems Medicine, as it is multi-scale, involves multiple organs and propagates across multiple spatial and temporal scales.
Sequencing technologies are currently among the most powerful methods to obtain omics information from biological systems (Metagenome, Virome, Mobilome).
All biological systems communicate, so the Communicome is crucial, as is its metabolomic characterization.
An emerging strategy for the description and prediction of interconnected biological systems, including tissues and organs, is to model them as ecosystems, highlighting the role of the environment in the propagation process.
The human body, within the Systems Medicine vision, can be mapped onto a multilayer network capable of modeling and quantifying the endogenous and exogenous interactions.

Funding

All the authors acknowledge support by the Italian Ministry of Education and Research through the Flagship (PB05) InterOmics and EU projects FibeBiotics (GA: 289517), Mission-T2D (GA: 600803) and Methods for Integrated analysis of multiple Omics data sets (MIMOmics) (GA: 305280).

119 in total

1. Community structure in time-dependent, multiscale, and multiplex networks.

Authors: Peter J Mucha; Thomas Richardson; Kevin Macon; Mason A Porter; Jukka-Pekka Onnela
Journal: Science Date: 2010-05-14 Impact factor: 47.728

2. Catastrophic cascade of failures in interdependent networks.

Authors: Sergey V Buldyrev; Roni Parshani; Gerald Paul; H Eugene Stanley; Shlomo Havlin
Journal: Nature Date: 2010-04-15 Impact factor: 49.962

3. SOAP2: an improved ultrafast tool for short read alignment.

Authors: Ruiqiang Li; Chang Yu; Yingrui Li; Tak-Wah Lam; Siu-Ming Yiu; Karsten Kristiansen; Jun Wang
Journal: Bioinformatics Date: 2009-06-03 Impact factor: 6.937

4. Assessing and improving methods used in operational taxonomic unit-based approaches for 16S rRNA gene sequence analysis.

Authors: Patrick D Schloss; Sarah L Westcott
Journal: Appl Environ Microbiol Date: 2011-03-18 Impact factor: 4.792

5. Avalanche collapse of interdependent networks.

Authors: G J Baxter; S N Dorogovtsev; A V Goltsev; J F F Mendes
Journal: Phys Rev Lett Date: 2012-12-11 Impact factor: 9.161

6. Multiple percolation transitions in a configuration model of a network of networks.

Authors: Ginestra Bianconi; Sergey N Dorogovtsev
Journal: Phys Rev E Stat Nonlin Soft Matter Phys Date: 2014-06-26

Review 7. The virome in mammalian physiology and disease.

Authors: Herbert W Virgin
Journal: Cell Date: 2014-03-27 Impact factor: 41.582

Review 8. The senescence-associated secretory phenotype: the dark side of tumor suppression.

Authors: Jean-Philippe Coppé; Pierre-Yves Desprez; Ana Krtolica; Judith Campisi
Journal: Annu Rev Pathol Date: 2010 Impact factor: 23.472

Review 9. Chronic inflammation (inflammaging) and its potential contribution to age-associated diseases.

Authors: Claudio Franceschi; Judith Campisi
Journal: J Gerontol A Biol Sci Med Sci Date: 2014-06 Impact factor: 6.053

10. Integrative analysis of 111 reference human epigenomes.

Authors: Anshul Kundaje; Wouter Meuleman; Jason Ernst; Misha Bilenky; Angela Yen; Alireza Heravi-Moussavi; Pouya Kheradpour; Zhizhuo Zhang; Jianrong Wang; Michael J Ziller; Viren Amin; John W Whitaker; Matthew D Schultz; Lucas D Ward; Abhishek Sarkar; Gerald Quon; Richard S Sandstrom; Matthew L Eaton; Yi-Chieh Wu; Andreas R Pfenning; Xinchen Wang; Melina Claussnitzer; Yaping Liu; Cristian Coarfa; R Alan Harris; Noam Shoresh; Charles B Epstein; Elizabeta Gjoneska; Danny Leung; Wei Xie; R David Hawkins; Ryan Lister; Chibo Hong; Philippe Gascard; Andrew J Mungall; Richard Moore; Eric Chuah; Angela Tam; Theresa K Canfield; R Scott Hansen; Rajinder Kaul; Peter J Sabo; Mukul S Bansal; Annaick Carles; Jesse R Dixon; Kai-How Farh; Soheil Feizi; Rosa Karlic; Ah-Ram Kim; Ashwinikumar Kulkarni; Daofeng Li; Rebecca Lowdon; GiNell Elliott; Tim R Mercer; Shane J Neph; Vitor Onuchic; Paz Polak; Nisha Rajagopal; Pradipta Ray; Richard C Sallari; Kyle T Siebenthall; Nicholas A Sinnott-Armstrong; Michael Stevens; Robert E Thurman; Jie Wu; Bo Zhang; Xin Zhou; Arthur E Beaudet; Laurie A Boyer; Philip L De Jager; Peggy J Farnham; Susan J Fisher; David Haussler; Steven J M Jones; Wei Li; Marco A Marra; Michael T McManus; Shamil Sunyaev; James A Thomson; Thea D Tlsty; Li-Huei Tsai; Wei Wang; Robert A Waterland; Michael Q Zhang; Lisa H Chadwick; Bradley E Bernstein; Joseph F Costello; Joseph R Ecker; Martin Hirst; Alexander Meissner; Aleksandar Milosavljevic; Bing Ren; John A Stamatoyannopoulos; Ting Wang; Manolis Kellis
Journal: Nature Date: 2015-02-19 Impact factor: 69.504
