
Large-scale dynamics of horizontal transfers.

Luigi Grassi, Jacopo Grilli, Marco Cosentino Lagomarsino.

Abstract

The widespread exchange of genes between bacteria must have consequences on the global architecture of their genomes, which are being found in the abundant genomic data available today. Most of the expansion of bacterial protein families can be attributed to transfer events, which are positively biased for smaller evolutionary distances between genomes, and more frequent for classes that are larger, when summed over all known bacteria. Moreover, "innovation" events where horizontal transfers carry exogenous evolutionary families appear to be less frequent for larger genomes. This dynamic expansion of evolutionary families is interconnected with the acquisition of new biological functions and thus with the size and distribution of the genes' functional categories found on a genome. This commentary presents our recent contributions to this line of work and possible future directions.


Year:  2012        PMID: 23061026      PMCID: PMC3463476          DOI: 10.4161/mge.21112

Source DB:  PubMed          Journal:  Mob Genet Elements        ISSN: 2159-2543


The current era of fully sequenced genomes and metagenomes confronts us with the challenge and the opportunity of integrating unprecedented amounts of biological data. With such an abundance of data, simplifying views are often useful for identifying relevant biological phenomena. For example, one can characterize the content of a genome by partitioning it into classes at the functional and evolutionary levels. In other words, a genome can be divided into subsets describing its different operative elements, such as genes and their functional and evolutionary families, transposons and their families, noncoding RNAs, etc. Notably, studies following this approach revealed that some of these elementary features of the functional and evolutionary composition of a genome are often governed by simple quantitative laws. Considering the protein-coding part of genomes, it is often an advantage to focus on protein domains, rather than full proteins, because they are the building blocks of proteins and they follow similar trends. The sizes of protein/protein-domain families have a fat-tailed distribution whose “slope” depends on genome size. The overall number of families represented by at least one member exhibits a slower-than-linear scaling with the total number of genes in a genome. Biologically, the growth of evolutionary families derives from the combined processes of horizontal gene transfer, gene duplication, gene genesis and gene loss (Figs. 1 and 2). For prokaryotes, horizontal gene transfer (HGT), the acquisition of genetic material in a non-hereditary manner, is probably the main innovative force, and there are systematic indications that HGT dominates gene family expansion. The same process is presumably very important for the introduction of a new evolutionary family into an extant genome.
Accordingly, theoretical models have been proposed that account for the observed power-law distribution of family sizes, mostly using class-expansion/innovation/loss moves that abstractly mimic basic evolutionary moves such as horizontal transfer, gene duplication and gene loss. A related model, in addition to family size distributions, also explains and successfully fits the scaling of the number of distinct gene families represented in a genome as a function of genome size.
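The class-expansion/innovation moves mentioned above can be illustrated with a minimal simulation. The sketch below assumes a Chinese-restaurant-style growth rule, chosen here only as one simple member of this family of models; the function name `grow_genome` and the parameters `alpha` and `theta` are hypothetical, and their values are not fitted to any data. Each added domain either founds a new family (innovation) or joins an existing family with a weight that increases with the family's current size (rich-get-richer expansion). Gene loss is not modeled.

```python
import random


def grow_genome(n_domains, alpha=0.5, theta=1.0, seed=0):
    """Grow a toy genome one domain at a time; return the list of family sizes.

    alpha and theta are illustrative parameters (assumptions, not fitted values):
    larger alpha or theta makes innovation (a new family) more likely.
    """
    rng = random.Random(seed)
    sizes = []  # sizes[i] = number of domains currently in family i
    for n in range(n_domains):
        # Innovation probability shrinks as the genome (n) grows.
        p_new = (theta + alpha * len(sizes)) / (n + theta)
        if rng.random() < p_new:
            sizes.append(1)  # innovation: found a new family
        else:
            # Expansion: pick an existing family with weight (size - alpha),
            # so larger families are more likely to gain a member.
            weights = [s - alpha for s in sizes]
            i = rng.choices(range(len(sizes)), weights=weights)[0]
            sizes[i] += 1
    return sizes


if __name__ == "__main__":
    for n in (500, 2000):
        families = grow_genome(n)
        print(n, "domains ->", len(families), "families")
```

Under these assumptions the number of families grows more slowly than the number of domains, which is the qualitative sublinear trend described in the text; the sketch is not meant to reproduce the published fits.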

Figure 1. A hypothetical model describing the evolutionary dynamics of protein domains. In this model, horizontal gene transfer can play a double role: on the one hand it causes the expansion of existing families, and on the other it drives innovation by founding new families in a lineage that did not previously possess them.

Figure 2. A representation of all species examined in reference 21. Each bar on the outer circle represents a studied genome and links represent protein domains. Different genomes are connected if they share a domain subject to HGT (in the cross-genomic gene pool formed by the union of the analyzed genomes). The color of the links reflects the number of transfers as shown in the legend.

A related important observation is that horizontal gene transfer is reported to be generally biased toward closely related lineages rather than distantly related ones. This bias in transfer partners can create phylogenetic signals that are similar to shared ancestry but are not due to vertical inheritance. In other words, there exist HGT “exchange groups” of genomes, which are analogous to populations able to exchange alleles by recombination. Data and models, taken together, confronted us with two puzzles. First, the growth of the number of families with proteome size is sublinear, indicating that the introduction of new families becomes less likely relative to class expansion as genome size increases (both processes being presumably driven by HGT). Second, in order to reproduce the correct tails of the evolutionary family histograms, the models need to introduce a rich-get-richer principle for class expansion, where the probability of adding a new member to a class is proportional to the class size. Motivated by these questions, we recently performed a detailed analysis of 20 genomes of Proteobacterial species, evaluating the extent to which HGTs expand the genomes’ domain repertoires (Fig. 2).
As a control, we compared our results with those obtained with a database containing HGTs for hundreds of genomes. We used these data to address the two main questions: (1) does a “rich-get-richer” principle hold for genome growth by HGT, and (2) do horizontally transferred genes carry novel domains more than expected by chance? Currently, there is no systematic quantification of how HGT success is correlated with the existing pool of gene classes in a genome. One possibility is that HGT could act effectively as a duplication move in a larger cross-genomic gene family pool (affected by the ecosystems where genes can be exchanged). In some cases, this pool may resemble the genome in question in terms of frequency of gene families, thereby causing a larger class-expansion rate for larger families, but in general this is not necessarily true. Furthermore, HGT may be more likely to be successful for domains that are rare (in the “metagenome” creating the community gene pool or in the receiving genome). We found that horizontally transferred genes carry domains of exogenous families less frequently for larger genomes, although they might do so more than expected by chance. Additionally, protein domains that are more common in the total pool of genomes appear to have a proportionally higher chance to be transferred. Both features suggest that transfer events behave as if they were drawn randomly from a “cross-genomic” or metagenomic community gene pool, much like gene duplicates are drawn from a genomic gene pool. Since larger genomes will possess more domain classes, the first finding is also in agreement with the observation that the probability of true innovations will be smaller. Clearly, it is not obvious that transfers should behave as if they were drawn randomly from a common pool.
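The random-draw picture described above can be made concrete with a small calculation. The sketch below is a toy null model, not the analysis performed in the study: it assumes that each transfer picks a domain uniformly at random from a pooled set of domains, so that a family is drawn in proportion to its pool abundance. The function names and the pool counts are invented for illustration.

```python
def transfer_probability(pool_counts, family):
    """Chance that one random draw from the pool belongs to `family`."""
    return pool_counts[family] / sum(pool_counts.values())


def novelty_probability(pool_counts, genome_families):
    """Chance that one random draw brings a family the genome lacks."""
    total = sum(pool_counts.values())
    absent = sum(count for fam, count in pool_counts.items()
                 if fam not in genome_families)
    return absent / total


if __name__ == "__main__":
    # Hypothetical cross-genomic pool: family name -> number of domains.
    pool = {"A": 50, "B": 30, "C": 15, "D": 5}
    small_genome = {"A"}            # few families already present
    large_genome = {"A", "B", "C"}  # most families already present
    print(novelty_probability(pool, small_genome))
    print(novelty_probability(pool, large_genome))
```

In this toy setting a genome that already holds more families has a lower chance of receiving a novel one, and abundant families are transferred proportionally more often, which mirrors the two qualitative findings stated in the text.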
Other scenarios are possible in which either a decrease of novelty in larger genomes or a rich-get-richer class expansion by horizontal transfer, or both, are not trivially expected. For example, gene gains could be sampled from a very large effective pool of families, or horizontal transfers could be dominated by specific ecological or functional mechanisms. We know that transfers are not random; protein length plays a role, for example, and it is natural to expect that selective pressure favors the acquisition of specific traits. However, our results suggest that, when averaging over many transfer events, there is a large contribution of purely combinatorial and statistical aspects to the “emergent” overall distributions of HGTs and their contribution to protein families, as typically happens in systems of many agents (such as crowds, the stock exchange, species in ecosystems). In these cases the analysis tools and the modeling frameworks of statistical physics may prove useful, as they were developed with closely related phenomenologies in physical and chemical systems in mind. Notably, since the class sizes within a single genome are similar to the corresponding sizes in the cross-genome gene pool, this also has the consequence that classes should grow according to a rich-get-richer principle. The latter has often been assumed, but is not justified in current models. For gene duplications, a rich-get-richer principle follows from the null assumption that all genes of a given class are a priori equally likely to get duplicated. However, as we discussed, prokaryotes tend to add genes by HGT rather than by gene duplication. This behavior also affects the statistics of (domain) functional categories, which in the case of domains are typically made of the sum of a number of evolutionary classes, and empirically grow as power laws with genome size at a specific rate, termed “evolutionary potential”.
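The power-law growth of a functional category with genome size can be estimated as a slope on a log-log plot. The following is a minimal sketch of that estimate using ordinary least squares; the function name `fit_power_law` is hypothetical, and the data are synthetic (generated from an exact power law with an arbitrarily chosen exponent of 2), not measurements from any genome.

```python
import math


def fit_power_law(genome_sizes, category_counts):
    """Fit counts = prefactor * size**exponent by least squares in log-log space.

    Returns (exponent, prefactor).
    """
    xs = [math.log(g) for g in genome_sizes]
    ys = [math.log(c) for c in category_counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, math.exp(intercept)


if __name__ == "__main__":
    # Synthetic example only: an exact power law with exponent 2.
    sizes = [1000, 2000, 4000, 8000]
    counts = [0.001 * g ** 2 for g in sizes]
    exponent, prefactor = fit_power_law(sizes, counts)
    print(exponent, prefactor)
```

With real data the points scatter around the fitted line, so the exponent should be reported with an uncertainty estimate, which this sketch does not compute.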
The joint partitioning of genes into functional and evolutionary classes also shows relevant universal quantitative trends, and is connected to genome innovation by horizontal transfer. Presumably, the addition of new genes needs to follow correlated functional “recipes” where genes whose functions are related are added together. For example, with the classic “operon model” in mind (the general model of bacterial transcription control that partitions genes into regulatory genes, which respond to metabolic and environmental cues, and “structural” target genes, which perform specific tasks), it can be stated that the addition of transcription factors needs to be related to the addition of a set of metabolic enzymes that are related by common metabolic pathways. The consequences of these statements have been explored recently using an integrated approach of data analysis and models, and appear to explain very well the observed quantitative relationship between transcription factors and metabolic pathways, despite the fact that this might be subject to other constraints. However, we still know very little about the nature and the very existence of these recipes, and gathering new insight into the process of how a prokaryotic genome builds new functions will be important for future studies, with evident implications for the applicability of synthetic biology. The approach followed in our study disregarded relevant ecological aspects, such as population sizes, by assuming that a given domain has a certain occurrence just based on genomic sequences; these aspects will be important to explore in future studies. For a specific ecosystem, total domain occurrence should ideally be derived from a weighted average, where weights are empirical population sizes. Results from individual ecosystems should then be averaged over all the ecosystems concerning the considered set of species, weighted by their relevance in evolutionary terms.
We believe this can be addressed in future studies, as data of this kind start to become available. Overall, we believe there is great potential, and there are great unmet challenges, in genomics and metagenomics studies addressing the reciprocal roles of ecology and evolution.
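The population-weighted domain occurrence proposed above for a single ecosystem can be written as a short calculation. This is only a sketch under stated assumptions: the function name `weighted_occurrence`, the species labels, the per-genome domain counts and the population sizes are all invented for illustration, and the second averaging step over ecosystems is not shown.

```python
def weighted_occurrence(domain_counts, population_sizes):
    """Population-weighted mean occurrence of one domain in one ecosystem.

    domain_counts: species -> copies of the domain in that species' genome
    population_sizes: species -> empirical population size (the weight)
    """
    total_population = sum(population_sizes.values())
    weighted_sum = sum(domain_counts[species] * size
                       for species, size in population_sizes.items())
    return weighted_sum / total_population


if __name__ == "__main__":
    # Hypothetical numbers for two species in one ecosystem.
    counts = {"species_1": 2, "species_2": 6}
    populations = {"species_1": 300, "species_2": 100}
    print(weighted_occurrence(counts, populations))
```

The unweighted mean of the two counts would be 4, whereas weighting by population pulls the estimate toward the more abundant species.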