Conservation and Variation in Strategies for DNA Replication of Kinetoplastid Nuclear Genomes.

Catarina A Marques1, Richard McCulloch2.   

Abstract

INTRODUCTION: Understanding how the nuclear genome of kinetoplastid parasites is replicated received experimental stimulus from sequencing of the Leishmania major, Trypanosoma brucei and Trypanosoma cruzi genomes around 10 years ago. Gene annotations suggested key players in DNA replication initiation could not be found in these organisms, despite considerable conservation amongst characterised eukaryotes. Initial studies that indicated trypanosomatids might possess an archaeal-like Origin Recognition Complex (ORC), composed of only a single factor termed ORC1/CDC6, have been supplanted by the more recent identification of an ORC in T. brucei. However, the constituent subunits of T. brucei ORC are highly diverged relative to other eukaryotic ORCs and the activity of the complex appears subject to novel, positive regulation. The availability of whole genome sequences has also allowed the deployment of genome-wide strategies to map DNA replication dynamics, to date in T. brucei and Leishmania. ORC1/CDC6 binding and function in T. brucei displays pronounced overlap with the unconventional organisation of gene expression in the genome. Moreover, mapping of sites of replication initiation suggests pronounced differences in replication dynamics in Leishmania relative to T. brucei.
CONCLUSION: Here we discuss what implications these emerging data may have for parasite and eukaryotic biology of DNA replication.

Keywords:  DNA replication; Kinetoplastid parasites; Leishmania; Origin recognition complex (ORC); Origins of replication; Trypanosoma brucei

Year:  2018        PMID: 29491738      PMCID: PMC5814967          DOI: 10.2174/1389202918666170815144627

Source DB:  PubMed          Journal:  Curr Genomics        ISSN: 1389-2029            Impact factor:   2.236


INTRODUCTION

Every cell cycle, the genome must be completely and accurately duplicated before being transmitted to daughter cells. Incomplete or inaccurate replication risks genome instability, a hallmark of cancer [1], and it is vital that the entire genome is normally strictly replicated just once during the S phase of the cell cycle. Hence, DNA replication is a complex, tightly regulated process [2-5] that has been extensively studied in model eukaryotes, in bacteria and, most recently, in archaea [6]. DNA replication is initiated at origins of replication, which are found as a single sequence-defined site in all bacterial genomes. In archaea, origins are also defined by conserved sequences, but their organisation differs from bacteria in that some archaeal genomes possess a few origins. The large genomes of eukaryotes differ yet further, in that each linear chromosome contains many (sometimes hundreds to thousands) of potential origins of replication, which are not all activated in every cell cycle and are rarely defined by sequence elements but, instead, are associated with poorly defined features like chromatin structure and status [7]. Despite these differences, origins of replication across the three domains of life are recognised by initiator factors belonging to the AAA+ superfamily of NTPases [8]: the single DnaA [9] and Orc1/Cdc6 [10] factors in bacteria and archaea, respectively, and the six-subunit (Orc1-Orc6) origin recognition complex (ORC) in eukaryotes [11]. Binding of these factors demarcates the origins of replication and leads to the recruitment and activation of the replicative helicase. Subsequently, the remaining constituents of the replication machinery are in turn recruited, thus starting, and driving, DNA synthesis [3]. Much of the diversity of the eukaryotic domain resides in microbes, many of which display dramatic deviation in core biology from model, primarily opisthokont, organisms. 
Such diversity extends to nuclear biology, including variations in mitosis [12], gene expression [13], and genome organisation and stability [14]. Though most of our understanding of the machinery and execution of DNA replication in eukaryotes has focused on the opisthokonts, a number of recent studies have begun to explore nuclear replication in kinetoplastids, suggesting that here too there may be surprising variation [15, 16].

The kinetoplastid origin recognition complex

Sequencing of the TriTryp genomes (Trypanosoma brucei, Leishmania major and Trypanosoma cruzi) [17-19] in 2005 provided a spur for the field of kinetoplastid biology [20], including nuclear DNA replication. Prior to genome sequencing, attempts to examine replication dynamics could not consider the whole genome (see below) and little work had examined the replication machinery. Surprisingly, sequence similarity searches were unable to identify various core DNA replication proteins, in particular those involved in the initiation steps of the process: while orthologues of most factors involved in replication fork assembly and DNA synthesis were readily identified, only one ORC-related factor (a putative orthologue of Orc1) could be found [19] (Fig. and ). Like in other eukaryotes, this putative Orc1 subunit also shared homology with Cdc6 (an essential factor that interacts with ORC to allow loading of the replicative helicase onto origins of replication) [11]. The lack of identifiable orthologues of the other ORC subunits, Cdc6, and a further helicase loader (Cdt1), suggested that initiation of DNA replication in these parasites might be mechanistically more similar to archaeal organisms than model eukaryotes [19]. In line with this hypothesis, and supported by experimental evidence, Orc1 in T. cruzi and T. brucei was re-named ORC1/CDC6 [21] (Fig. ), reflecting the combined ORC and Cdc6 functions that reside in the single protein that binds each origin in archaea, while other studies analysed Orc1 in various Leishmania species [22, 23]. A 2011 study [24], however, questioned whether the kinetoplastid initiator is just a single protein. While focusing on the functional analysis of components of the replicative helicase in T. brucei - the heterohexameric minichromosome maintenance complex (MCM2-7), GINS complex and CDC45 - the authors identified a second, highly divergent orthologue of Orc1 [24] (Fig. and ). 
Labelled as TbORC1B, this putative initiator was shown to interact with TbORC1/CDC6 and TbMCM3, but its role in DNA replication was not explored. Nonetheless, further evidence that T. brucei might not have a single protein initiator factor, but an ORC-like complex, was provided by a study the following year, in which three more TbORC1/CDC6-interacting factors were identified [25] (Fig. and ): a highly divergent Orc4-like subunit (TbORC4), and two apparently kinetoplastid-specific factors, Tb7980 and Tb3120, with very limited homology with Orc proteins. More recent experiments, thus far limited to T. brucei, provide strong evidence of an ORC-like complex. RNA interference (RNAi) shows that loss of TbORC1/CDC6, TbORC1B, TbORC4 or Tb3120 impedes DNA replication [21, 25-27] and leads to similar growth and cell cycle defects [27]. Though loss of Tb7980 also results in the same proliferation defects [25], whether it has a role in DNA replication remains to be assessed. TbORC1/CDC6 depletion additionally leads to the expression of silent Variant Surface Glycoproteins (VSGs) in bloodstream form cells [26] and in procyclic cells [28], suggesting that, similar to other eukaryotic Orc1 subunits, and despite lacking a bromo adjacent homology (BAH) domain (Fig. ), TbORC1/CDC6 has a role in gene silencing. However, whether these activities are executed within the putative ORC is unknown. TbORC1/CDC6-binding sites have also been mapped genome-wide [28]. Within the core of the chromosomes TbORC1/CDC6 binds mainly at the boundaries of the multi-gene transcription units, where it overlaps with mapped origins of replication (see below for full explanation), supporting a role in replication initiation. In addition, ~60% of TbORC1/CDC6 binding sites locate to the chromosomes’ subtelomeric regions [28]. 
Whether this dense binding in the subtelomeres is functionally related to binding directly to the telomeres [26], and whether this is related to TbORC1/CDC6’s silencing function(s), is not clear. Finally, gel filtration analysis suggests TbORC1/CDC6 and TbORC4 are present, most likely together, in a high molecular weight complex (~1011 to 530 kDa) that also seems to include the helicase subunit TbMCM3 [27]. As TbMCM3 has been shown to interact with TbORC1/CDC6 and TbORC1B [24], and its orthologue in yeast mediates the recruitment of the MCM2-7 helicase to the DNA-bound ORC [29], the accumulated data indicate the presence of an ORC in T. brucei, albeit one that defied simple sequence-based identification. As each putative ORC component is syntenically conserved in T. cruzi and Leishmania, there is a strong likelihood that the diverged composition of T. brucei ORC is a feature of all kinetoplastids. What is less clear is why kinetoplastid ORC might be diverged, and whether this extends to changes in function or regulation. Most of our knowledge of the processes and molecular machineries involved in the eukaryotic cell cycle has been inferred from studies in a limited group of model organisms, including Saccharomyces cerevisiae, Schizosaccharomyces pombe, Drosophila melanogaster, Xenopus laevis, mouse and human, and extrapolated across the eukaryotic tree. However, because these organisms belong to a single supergroup of the eukaryotic domain, termed the Opisthokonta, it is less clear how well details are conserved in the five other eukaryotic supergroups, and thus, the extent of diversity in replication processes and molecular machineries across the eukaryotic domain is poorly explored [30, 31]. From studies in model eukaryotes, Orc1-5 each appear to be structurally similar, with each subunit containing a central or N-terminal AAA+ ATPase domain and at least one C-terminal winged-helix (WH) DNA-binding domain [6, 11] (Fig. , insert box). 
Within the AAA+ ATPase domain, conserved Walker A and B motifs (fundamental components of the ATP-binding site and essential for ATPase activity) and a signature arginine finger (involved in ATP hydrolysis of the adjacent AAA+ ATPase subunit in the protein complex) are needed to modulate ORC activity [32, 33]. In contrast, Orc6 appears to have evolved independently and does not possess any of these Orc-characteristic domains [11]. Recently, the ever-growing availability of sequenced genomes from a wide range of organisms, spanning all eukaryotic supergroups, has allowed sequence-based analysis of ORC-orthologues across the eukaryotic domain [30]. Reflecting the putative kinetoplastid ORC composition, not all six subunits of ORC were found across the analysed genomes, suggesting there is variability in the composition of ORC between eukaryotes. However, whether this reflects the existence of simpler ORCs, high divergence at the protein sequence level, less functional conservation of some ORC subunits, or the existence of unrelated factors performing the role of Orc-subunits in certain organisms is unknown, and requires experimental investigation. Recently, the structure of D. melanogaster ORC was solved [34]. Prior to this elegant study, only archaeal ORC1/CDC6 protein structures, and a fragment of the human Orc6 subunit [35], were solved and available on the RCSB Protein Data Bank. This meant that protein structure modelling software could only model the structure of eukaryotic ORC subunits based on the archaeal ORC1/CDC6 structures. While this had the potential to model Orc4 and Orc5 sequences, as these subunits possess a conserved AAA+ ATPase domain [6, 11], modelling of Orc2 and Orc3 was particularly problematic, as these subunits have highly divergent AAA+ folds containing non-canonical Walker A and B motifs that are, however, conserved amongst different organisms’ Orc2 and Orc3 subunits [36, 37]. 
Because Orc6 is the least conserved of the Orc subunits in both sequence and function [35, 38-40], it was unlikely that other organisms’ Orc6 subunits could be modelled based on the available human structure. The availability of structures of all D. melanogaster ORC subunits [34], and very recently, of the human counterparts [41], has therefore opened the door for the modelling of Orc2-Orc5 related sequences. From this approach T. brucei Tb7980 and Tb3120 emerge with potential structural similarity to D. melanogaster Orc5 and Orc2 subunits, respectively [27]. Nevertheless, whether these act as functional TbORC5 and TbORC2 orthologues is unknown, and requires further study. Very recently, we have further identified another TbORC1/CDC6-interacting factor, which is a hypothetical zinc-finger protein [42, 43] that appears to possess feeble homology with D. melanogaster Orc3 (unpublished). Again, however, functional analysis is needed to verify such potential orthology. Such functional studies may be of value, since little work has explored if individual ORC subunits provide conserved, discrete functions across eukaryotes. To our knowledge, there is currently no hint of the presence of an Orc6 subunit in kinetoplastids. Nonetheless, if the conformation of kinetoplastid ORC follows that of D. melanogaster, where the subunits are arranged in an Orc1-Orc4-Orc5-Orc3-Orc2 ring, with Orc6 interacting with Orc3 [34], then the greater conservation of Orc1 and Orc4 would suggest these components represent a functionally constrained ‘core’, with greater functional flexibility amongst the other subunits, an arrangement consistent with eukaryote-wide ORC sequence comparisons [30]. Structural analysis is needed to test the above prediction of kinetoplastid ORC architecture, but sequence and experimental analyses provide further clues about subunit activity. 
Reflecting its early identification [19], ORC1/CDC6 is the subunit showing the greatest conservation with other eukaryotic Orc proteins. Analyses of L. major, T. brucei and T. cruzi ORC1/CDC6 protein sequences [21, 22] were able to identify a putative N-terminal AAA+ ATPase domain, with clearly identifiable Walker A, Walker B and arginine finger motifs, as well as a putative WH domain at the C-terminus [21, 22] (Fig. ). Functionality of these domains has been confirmed in T. brucei and T. cruzi ORC1/CDC6 proteins [21], while a Nuclear Localisation Signal (NLS) at the N-terminus has been experimentally confirmed in the L. donovani counterpart [23]. Nonetheless, despite their sequence homology with Orc1 subunits, the kinetoplastid ORC1/CDC6 proteins are atypical, as none possesses an identifiable BAH domain at their N-terminus [21, 22], a feature thought to be universal across Orc1 subunits and shown to be involved in origin recognition and transcriptional silencing [44-49]. Despite being larger than TbORC1/CDC6, TbORC1B (and its L. major and T. cruzi counterparts) also lacks a BAH domain [24]. An N-terminal AAA+ ATPase domain is seen in TbORC1B, with conserved but widely spaced Walker A and B motifs [24] (Fig. ), but no arginine finger signature is found [24]. Whether this explains experimental data suggesting TbORC1B is devoid of ATPase activity [24] is unclear. Finally, sequence alignments with various eukaryotic Orc1 and Cdc6 proteins allow a weak prediction of a WH domain at the C-terminus of TbORC1B [43], but whether it binds to DNA has not been examined. Taken together, it remains unclear if TbORC1/CDC6 and TbORC1B are orthologues of Orc1 and Cdc6 and, since functional data suggest TbORC1B is highly unorthodox (see below), it remains premature to assign such orthology. 
TbORC4 also seems to possess an AAA+ ATPase domain; although this includes an Orc4 conserved arginine finger motif, both Walker A and B motifs appear degenerate [25] (Fig. ), suggesting that TbORC4 is, most likely, devoid of ATPase activity. Nevertheless, Orc4 subunits in other eukaryotes do not possess ATPase activity themselves, but their conserved arginine finger appears to be necessary for Orc1 ATP hydrolysis, by supplying the arginine finger to Orc1’s ATP-binding site [34, 50-52]. Analogously, it is thus possible that TbORC4 might stimulate TbORC1/CDC6 ATPase activity. It has not been possible to assess whether TbORC4 possesses a WH domain [25], and its binding to DNA has not been tested, though S. pombe Orc4 clearly provides such a function through AT-hook motifs [53]. Sequence analysis of Tb7980 suggests the presence of an AAA+ ATPase domain with relatively conserved Walker A and Walker B motifs, but no arginine finger or WH signatures [15, 25, 27] (Fig. ). Like TbORC4, Tb7980 ATPase and DNA binding activities have not been assessed. Structural modelling of Tb3120 hints at homology with D. melanogaster Orc2 [27], but this is mainly restricted to their C-termini (where the WH domain of D. melanogaster Orc2 is present) [27] (Fig. ). Further alignment of Tb3120 with multiple Orc2 protein sequences revealed that Tb3120 contains non-canonical Walker A and B motifs with characteristic signatures of Orc2 proteins. These are, however, separated by a large insertion (also observed in L. major and T. cruzi Tb3120 orthologues), suggesting that Tb3120 does not possess an intact AAA+ ATPase domain and is, therefore, unlikely to possess ATPase activity. In addition, Tb3120 and its orthologues in L. major and T. cruzi are considerably larger than model eukaryotes’ Orc2 subunits, and appear to have evolved an N-terminal extension with no detectable features or predicted function [27]. Further experimental analysis of Tb3120 will be essential to test whether this protein is the kinetoplastid equivalent of Orc2, and to better understand the functional implications of the sequence features highlighted above. 
TbORC1/CDC6 [21], TbORC4, Tb3120 and Tb7980 all localise to the T. brucei nucleus throughout the cell cycle [27]. In striking contrast, TbORC1B localisation and expression are cell cycle dependent, with TbORC1B being detected only in the nucleus of cells in late G1 to late S or G2 phase [27]. Together with the rapid impairment of DNA replication and cell cycle progression upon depletion by RNAi [27], the available data suggest that TbORC1B might act as a positive regulator of DNA replication in T. brucei, rather than being a static member of the putative ORC-like complex. However, TbORC1B expression and localisation dynamics do not resemble any regulatory DNA replication factor described in other eukaryotes, including Cdc6. Indeed, TbORC1B’s lack of ATPase activity renders it an unlikely candidate to provide Cdc6 function, which acts by ATPase-dependent remodelling of ORC [51]. Thus, TbORC1B may represent a truly kinetoplastid-specific adaptation and it remains possible that T. brucei ORC-MCM interactions are archaeal-like in lacking Cdc6/Cdt1 mediation [54], consistent with TbORC1/CDC6 having been shown to be able to complement S. cerevisiae cdc6 temperature-sensitive mutants [21]. Whether or not the emerging data on the putative ORC in kinetoplastids can provide potential targets for drug development against these parasites remains to be seen [15, 16, 55], but unexpected levels of divergence in a core component of genome maintenance are emerging. It is intriguing to note that characterisation of the kinetoplastid kinetochore complex has also required protein interaction-based recovery of novel diverged components [56, 57], with an emerging picture of greatest variation in the subunits that interact with the genome [58]. Perhaps these parallel sets of studies hint that the unusual organisation of the kinetoplastid genome has necessitated widespread innovation in the protein factories that act in chromosome biology.

DNA replication dynamics in Trypanosoma brucei and Leishmania

In order to be completely and accurately duplicated within the S phase of the cell cycle, the large linear eukaryotic genomes are replicated from many origins of replication distributed throughout their multiple chromosomes. Increased numbers of origins necessitated the evolution of DNA replication controls to ensure that the genome is duplicated exactly once during the cell cycle, and that under-replication and/or re-replication of parts of the genome is prevented. To this end, the process of eukaryotic DNA replication is divided into two non-overlapping phases: origin licensing and origin firing. Origin licensing takes place from late mitosis to the end of G1 phase and consists of the loading of the replicative helicase (MCM2-7 double hexamers), in an inactive state, to every potential origin of replication in the genome, which are already demarcated by the binding of ORC [2-5]. During S phase, only a subset of origins is fired, not simultaneously but according to a DNA replication-timing programme [59], by the recruitment of additional replication complexes to the potential origins, establishing bidirectional replication forks at sites distributed across the genome [2-5]. By licensing origins before entry in S phase, already replicated origins cannot be re-licensed during S or G2 phases of the cell cycle, thus avoiding re-replication and the risk of genomic instability [2-5]. However, this implies that before entering S phase, the number of licensed origins must be sufficient to allow the complete replication of the genome, including in the event of replication stress from internal and external sources [1]. Therefore, the licensing of more origins than are needed to fire during S phase provides dormant origins that can be activated as a failsafe to prevent under-replication of regions of the genome and, thus, ensure genomic integrity [60]. 
If not activated, the excess licensed origins are passively replicated by replication forks established at neighbouring origins. Origin usage in eukaryotes is, therefore, very flexible [4]. With the exception of S. cerevisiae [61], eukaryotic origins of replication are not defined by specific DNA sequences. Instead, ORC-binding to origins and their later activation during S phase seems to depend on the combination of an assortment of less defined markers [2, 4, 7, 62, 63], such as: sequence features, including AT-rich regions [53, 64, 65] and CpG islands [66-71]; DNA topology, like G-quadruplexes [71-76] and negatively supercoiled DNA [77]; chromatin structure and status, such as nucleosome-free chromatin [78-80] and DNase-sensitive sites [76]; histone modifications [76, 81-84]; transcription [85, 86], with regions containing transcriptionally active genes being replicated earlier [87, 88], and some origins found at promoters (including RNA Polymerase II-binding sites) [65, 68, 78]; and the positioning of origin DNA in the nuclear space [89-92], with early fired origins localising to the nuclear interior, and late activated ones locating to the nuclear periphery. Due to their elusive characteristics, identification and mapping of origins of replication in any eukaryotic genome has not been straightforward. However, the development of high throughput sequencing methodologies [93] has increased the availability of sequenced genomes and allowed the development of techniques to map and correlate processes genome-wide, including, amongst others, the chromatin landscape [94-97], gene expression [98-100], and DNA replication dynamics [63, 101] - including origin localisation and usage. Recent studies in kinetoplastids provide an example of how these developments rapidly improved our understanding of DNA replication in these parasites. 
Prior to genome sequencing, origin mapping relied upon evaluating the ability of genome sequences to enhance episome stability [102] or maintain stability of fragmented chromosomes [103], or on analysis of small T. brucei minichromosomes [104]. The availability of near wholly compiled genomes has, to date, provided a genome-wide view of replication in T. brucei and two Leishmania species. The T. brucei [17] and L. major [18] genomes (~50-66 Mbp, diploid) are organised in an unconventional architecture: unlike most other eukaryotes, the genomes of these closely related parasites are structured into ~200 multi-gene clusters, each of which is transcribed from its own RNA polymerase (Pol) II transcription start site [105-107]. Genome-wide multigenic transcription means most of the genome is traversed by RNA Pol II and gene expression controls are mainly post-transcriptional [108, 109]. Moreover, the core genomes of these parasites are extensively syntenic [105], despite being organised in strikingly different numbers of chromosomes: 11 stably diploid megabase chromosomes in T. brucei [17], and 36 chromosomes of variable ploidy in L. major [18, 110, 111]. In T. brucei aneuploidy appears much less pervasive than in Leishmania, being limited to the subtelomeres of the megabase chromosomes and to mini- and intermediate chromosomes, all of which act as stores for VSG genes [112]. Genome-wide mapping of origins of replication was first performed in T. brucei [28] using deep sequencing marker frequency analysis (MFA-seq), also known as Sort-seq in yeast [113, 114]. Briefly, starting from an unsynchronised population, cells in S and G2 phase are isolated based on their DNA content by fluorescence activated cell sorting (FACS), and their DNA is then purified and sequenced. The resulting reads are mapped to the reference genome, allowing the ratio between S (replicating) and G2 (non-replicating) phases to be calculated. 
Representing the S/G2 read depth ratios across the chromosomes (Fig. ) reveals ‘peaks’ that emanate from active origins, while “valleys” denote zones of replication termination [114], thus allowing assessment of origin location and strength genome-wide. In yeast, Sort-seq shows remarkably precise correlation with predicted sequence-conserved origins [113] and with recent replication mapping by Okazaki fragment sequencing (OK-seq) [115]. MFA-seq in T. brucei insect stage form (procyclic forms) cells in early-mid S phase [28] revealed that these parasites comply with the general principles of the eukaryotic DNA replication dynamics model: multiple origins were mapped per chromosome; the number of origins per chromosome correlates with chromosome size; all origins overlap with a subset (~4.4%) of TbORC1/CDC6-binding sites; differences in MFA-peak height suggest origins are activated at different times during S phase, with the earliest activated origins co-localising with mapped centromeres (chromosomes 1 to 8 [116]); and neither active origins nor TbORC1/CDC6-binding sites appear to be defined by specific sequences. Though MFA-seq cannot localise origins with great accuracy, all peaks centred on the so-called strand-switch regions (SSRs) that separate multi-gene transcription units and contain transcription start (divergent SSRs) or end (convergent SSRs) locations, or both (head-to-tail SSRs) [28]. In fact, evidence for DNA replication and transcription functionally intersecting could be seen after RNAi depletion of TbORC1/CDC6, which results in increased transcript levels at the SSRs, suggesting TbORC1/CDC6 might contribute to the outlining of transcription boundaries [28]. However, not all of the putative ORC-like components have been localised in the genome, the manner of ORC recruitment to DNA remains unclear, and MFA-seq could not map origin activity in the chromosome subtelomeres or in mini- and intermediate chromosomes. 
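The read-depth calculation behind MFA-seq can be illustrated with a minimal sketch. It assumes reads have already been counted in equal-sized bins along a chromosome; the function names, the total-read scaling and the toy counts are illustrative assumptions, not the pipeline used in [28]. Bins whose S/G2 ratio exceeds both neighbours are reported as candidate origin peaks.

```python
def mfa_seq_ratio(s_counts, g2_counts):
    """Per-bin S/G2 read-depth ratio, after scaling each sample by its total reads."""
    s_total = sum(s_counts)
    g2_total = sum(g2_counts)
    ratios = []
    for s, g2 in zip(s_counts, g2_counts):
        if g2 == 0:
            ratios.append(None)  # no G2 coverage: ratio undefined for this bin
        else:
            ratios.append((s / s_total) / (g2 / g2_total))
    return ratios


def local_maxima(ratios):
    """Indices of bins whose ratio exceeds both neighbours (candidate origin peaks)."""
    peaks = []
    for i in range(1, len(ratios) - 1):
        left, mid, right = ratios[i - 1], ratios[i], ratios[i + 1]
        if None in (left, mid, right):
            continue
        if mid > left and mid > right:
            peaks.append(i)
    return peaks


# Toy read counts for eight bins along one chromosome (invented numbers).
s_phase = [100, 120, 180, 130, 100, 110, 150, 105]
g2_phase = [100, 100, 100, 100, 100, 100, 100, 100]

ratios = mfa_seq_ratio(s_phase, g2_phase)
peaks = local_maxima(ratios)
print(peaks)
```

In this toy profile the higher ratio at bin 2 than at bin 6 would be read as an earlier or more frequently used origin, in the same way that differences in MFA-peak height are interpreted in the text. Real data would need smoothing and a detection threshold, which this sketch omits.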
Despite localisation of both origins and TbORC1/CDC6-binding sites to SSRs, no common sequence elements have to date been identified, suggesting T. brucei origin demarcation is dissimilar to yeast. Moreover, the number of detectable origins (42 MFA-seq peaks in the ~26 Mbp haploid genome) suggests an inter-origin distance (IOD) of ~600 kbp, which is considerably greater than the predicted IOD of ~42 kbp in S. cerevisiae (based on ~280 Sort-seq peaks in a haploid genome of ~12 Mbp) [114]. More recently, MFA-seq was used to map origins in early and late S phase T. brucei cells, in two different strains (TREU927 and Lister 427), and in the two culturable life cycle stages of the parasite (procyclic and bloodstream forms) [43, 117]. In all these settings, essentially the same MFA-seq peaks are observed in the core genome, suggesting that origin location and usage is relatively rigid, though what distinguishes an origin-active SSR from an origin-inactive SSR is unknown. However, one locus shows pronounced changes in replication usage dependent on transcription activity: the single telomeric bloodstream VSG expression site (BES) that is transcriptionally active in bloodstream form cells is replicated early, whereas the remaining ~15 silent BES replicate in late S phase; moreover, all VSG BES are late-replicating in procyclic cell forms, when transcription is suppressed at all the sites [117]. These data strengthen the link between T. brucei DNA replication and transcription, and suggest a potential exploitation of DNA replication to drive antigenic variation, a topic recently reviewed elsewhere [118]. 
The wide spacing of T. brucei origins and pronounced rigidity in replication dynamics across the chromosomes’ core may result from the parasite’s odd genome organisation in well-defined multi-gene transcription units, meaning there is little flexibility in sites of replication initiation, as origins sited within the transcription units could lead to catastrophic clashes between RNA Pol II and the replisome. Nevertheless, single molecule analysis of DNA replication in chromosome 1 suggests at least some SSRs might be activated after hydroxyurea-induced replicative stress, and some replication might initiate from undefined subtelomeric sites [119]. Whether there is a genome-wide flexibility in origin usage under replicative stress or other conditions requires further investigation. Given that T. brucei and Leishmania share an unconventional genome architecture and possess high levels of gene synteny, it might be predicted that DNA replication dynamics are comparable in the two parasites. However, MFA-seq in Leishmania insect stage cells (promastigotes) confounds this expectation (Fig. ) [120]. MFA-seq has so far been performed in two species of Leishmania, L. major (old world) and L. mexicana (new world), whose genomes are, respectively, distributed into 36 and 34 chromosomes [121, 122] of variable ploidy [110]. Strikingly, in both Leishmania species, and in both early and late S cells, only one origin could be identified by MFA-seq per chromosome [120]. If confirmed, origin singularity in Leishmania is unprecedented in eukaryotes studied to date, as this was thought to be exclusive to the smaller genomes of bacteria and some archaea. L. mexicana chromosomes 8 and 20 are each syntenic not to one, but to two L. major chromosomes (29 plus 8, and 36 plus 20, respectively) [110]. Despite this genome reorganisation, only one origin could be mapped to each L. mexicana chromosome, suggesting that a single origin per chromosome is retained in the face of pronounced chromosome rearrangements [120]. 
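For comparison with the single-origin result in Leishmania, the inter-origin distances quoted earlier for T. brucei and S. cerevisiae can be reproduced as haploid genome size divided by the number of mapped peaks. The short sketch below uses only the approximate figures given in the text; the function name is an illustrative assumption, and the calculation treats origins as evenly spaced, which is a simplification.

```python
def inter_origin_distance_kbp(genome_mbp, n_origins):
    """Mean inter-origin distance in kbp, assuming evenly spaced origins."""
    return genome_mbp * 1000 / n_origins


# Approximate values quoted in the text.
tb_iod = inter_origin_distance_kbp(26, 42)    # T. brucei: 42 MFA-seq peaks, ~26 Mbp
sc_iod = inter_origin_distance_kbp(12, 280)   # S. cerevisiae: ~280 Sort-seq peaks, ~12 Mbp

print(f"T. brucei IOD: ~{tb_iod:.0f} kbp")
print(f"S. cerevisiae IOD: ~{sc_iod:.0f} kbp")
print(f"Fold difference: ~{tb_iod / sc_iod:.0f}")
```

The two values come out close to the ~600 kbp and ~42 kbp figures cited above, a difference of roughly an order of magnitude.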
Additionally, all detected MFA-seq peaks have a similar height and width, suggesting that all origins are of similar strength/usage and that there is no replication timing programme in Leishmania [120]. Nonetheless, like in T. brucei, all detected origins overlap exclusively with SSRs and, remarkably, ~40% of the identified origins were conserved in location in the two genomes [120]. Though such conservation might indicate a commonality in kinetoplastid origins, no origin-defining sequence, motif or pattern could be identified [120]. Nonetheless, detailed analysis of SSRs with and without origin activity revealed that in Leishmania, but not in T. brucei, the distance between the two most proximal genes to the SSR was significantly larger in SSRs with origin activity than in non-origin active SSRs [120]. This difference even extended to the origin-active SSRs of L. major chromosomes 29 and 36 and their syntenic, but non-origin active SSRs in L. mexicana chromosomes 8 and 20, respectively [120]. These data support the idea that in each Leishmania chromosome a specific SSR is associated with pronounced origin activity, though what causes this effect (e.g. the accumulation of specific factors) has not been investigated. One possibility is that, as in T. brucei, the Leishmania origin-active SSRs overlap with centromeres, but to date centromeres in Leishmania have not been successfully mapped. Many of the Leishmania SSRs, which provide origins of equal strength, are conserved with SSRs that in T. brucei map as origins of variable strength [28, 117], a change in function that also deserves further examination. The MFA-seq prediction of a single origin in each Leishmania chromosome appears inadequate to explain complete genome replication. 
Estimates of a 3-4 hour S phase [123] and a DNA replication rate of ~2.5 kbp/min [124] suggest that while a single origin might be enough to drive the complete replication of the parasite’s smaller chromosomes (0.28-0.84 Mbp, ~66% of the genome), it is insufficient to fully replicate the larger ones (up to ~3.3 Mbp) [120]. How, then, Leishmania parasites completely replicate their genome is unclear. It is possible that less efficient or less frequently used origins might have escaped MFA-seq detection (detection threshold of ~25% of the activity of the mapped origins) [120]. However, because the detected origins in Leishmania and T. brucei localise exclusively to SSRs, one would predict that other origins would also localise to SSRs. For instance, chromosome 32 (~1.6 Mbp) has only four SSRs and, therefore, only four potential origins, yet only one displays an MFA-seq peak. Origin activity at the other SSRs, if based on stochastic firing in the population, would therefore be expected to exceed the MFA-seq detection threshold. These considerations suggest that other events, perhaps not used in T. brucei, may promote complete Leishmania genome replication (see below).

Two more recent studies have investigated DNA replication in Leishmania using approaches other than MFA-seq, and the conflicting results may provide clues to how genome replication occurs in these parasites [124, 125]. One study, using the highly sensitive mapping methodology of short nascent leading strand purification coupled with sequencing (SNS-seq) [72], predicts ~5100 DNA replication initiation sites across the L. major genome (Fig. ), with no preferential localisation at SSRs [125]. This prediction is >100-fold more than the origins mapped by MFA-seq and suggests one potential origin every other gene (~8300 total genes), meaning origins are found throughout the polycistronic transcription units.
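
The two back-of-envelope comparisons in the preceding passage (whether one origin can copy a chromosome within S phase, and how the SNS-seq site count compares with the MFA-seq origins and the gene number) can be checked with a few lines of Python. This is only a sketch: the function and constant names are illustrative, and it assumes, beyond what the cited studies state, that the origin fires at the very start of S phase, lies mid-chromosome, and drives two forks at a constant rate.

```python
# Figures quoted in the text; all names here are illustrative, not from the cited work.
S_PHASE_HOURS = (3.0, 4.0)      # estimated S phase duration [123]
FORK_RATE_KBP_PER_MIN = 2.5     # estimated replication rate [124]
SNS_SITES = 5100                # SNS-seq initiation sites in L. major [125]
MFA_ORIGINS = 36                # one MFA-seq origin per L. major chromosome [120]
TOTAL_GENES = 8300              # approximate gene number quoted in the text


def max_span_kbp(s_phase_hours, fork_rate=FORK_RATE_KBP_PER_MIN, forks=2):
    """Longest stretch a single origin could copy in one S phase.

    Assumption: the origin fires at the start of S phase, sits in the
    middle of the chromosome, and both forks move at a constant rate.
    """
    return s_phase_hours * 60.0 * fork_rate * forks


def single_origin_suffices(chromosome_kbp, s_phase_hours):
    """True if one origin could cover the chromosome under the assumptions above."""
    return chromosome_kbp <= max_span_kbp(s_phase_hours)


if __name__ == "__main__":
    for hours in S_PHASE_HOURS:
        print(f"{hours:.0f} h S phase: up to {max_span_kbp(hours):.0f} kbp per origin")

    # Chromosome sizes mentioned in the text, in kbp.
    for size_kbp in (280, 840, 1600, 3300):
        covered = single_origin_suffices(size_kbp, max(S_PHASE_HOURS))
        print(f"{size_kbp} kbp, 4 h S phase: {'covered' if covered else 'not covered'}")

    print(f"SNS-seq sites per MFA-seq origin: {SNS_SITES / MFA_ORIGINS:.0f}")
    print(f"Genes per SNS-seq site: {TOTAL_GENES / SNS_SITES:.1f}")
```

Under these idealised assumptions a single origin reaches roughly 0.9-1.2 Mbp, which covers the 0.28-0.84 Mbp chromosomes but not chromosome 32 (~1.6 Mbp) or the largest chromosomes (~3.3 Mbp). An off-centre origin or any fork stalling would shrink the reachable span further.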
The huge number of SNS-seq predicted origins may suggest the technique’s sensitivity identifies origins used by only a few cells in the population, perhaps indicating flexibility in origin usage across the cellular population, which contrasts with the rigidity in origin usage and timing in T. brucei [28, 117]. SNS-seq has not to date been applied to T. brucei, but if similar numbers of origins of replication were detected, this would represent ~14-fold more origins than TbORC1/CDC6-binding sites (~360) [28].

DNA fibre stretching analysis, in two independent studies [124, 125], provides evidence that MFA-seq and SNS-seq respectively underestimate and overestimate DNA replication initiation events in Leishmania. Both studies could detect molecules with >1 site of replication initiation, which can be extrapolated to IODs of ~60-300 kbp (with considerably greater variation than fibre analysis predicts in T. brucei) [119]. Such IODs are 10-30 fold greater than the potential IOD of ~6.5 kbp that SNS-seq predicts [125]. Moreover, further extrapolation from the fibre analysis predicts ~150-400 origins, equalling or exceeding the number of mapped SSRs. It is currently impossible to correlate the Leishmania fibre data with MFA-seq, SNS-seq or genomic landmarks, as the labelling cannot localise replication sites within or between chromosomes, or indeed exclude that episomes have been examined. In contrast, fibre analysis linked to fluorescence in situ hybridisation (FISH) in chromosome 1 of T. brucei provides good correspondence with MFA-seq [119].

Can the apparent discord in current Leishmania replication data be explained? The above considerations suggest replication dynamics differ dramatically between T. brucei and Leishmania, despite the common use of SSRs as origins (as revealed by MFA-seq; Fig. ). Though there are gaps in our understanding in both parasites (e.g. ORC1/CDC6-binding sites have not been mapped in Leishmania; SNS-seq has not been applied to T.
brucei), it is attractive to speculate that differing replication strategies are found in the two parasites and that these might explain their differing genome organisations (larger, stable chromosomes in T. brucei and smaller, frequently aneuploid chromosomes in Leishmania). At least two hypotheses for differing replication dynamics might be considered.

First, it is possible that MFA-seq hugely underestimates the number of origins in the Leishmania genome and the single peaks observed in each chromosome arise from dense clustering of multiple initiation sites, as revealed by SNS-seq [125, 126]. This model suggests Leishmania adopts a replication strategy similar to metazoans [72, 74, 127], with domains of early and late firing clusters of origins. This may be plausible, as SNS-seq has been widely used in metazoans, although it shows limited overlap with other established origin-mapping methods, such as ORC chromatin immunoprecipitation (ChIP-seq) or OK-seq mapping [128], and has an apparent bias towards CpG islands and G-quadruplex motifs [63]. However, if this model is correct, it is not clear why only a single, early-replicating origin cluster (MFA-seq peak) is seen per chromosome in Leishmania, whereas multiple putative clusters (MFA-seq peaks) are seen in each T. brucei chromosome. In addition, the model does not readily explain why the origins cluster so precisely around an SSR in each chromosome. In T. brucei, we know that SSRs are bound by ORC1/CDC6 and therefore the MFA-seq peaks, like similar analyses in yeast [113, 115], detect ORC-defined origins and not origin clusters [28]. In considering any such genome-wide mapping approaches it is important to take account of the very different genome sizes being examined in single-celled eukaryotes relative to metazoans [59]. Finally, if all SNS-seq peaks represent replication initiation at a conventional ORC-defined origin, then it is legitimate to ask how Leishmania (and perhaps T.
brucei) copes with the likely pronounced impediment to efficient RNA Pol II progression through the multigene transcription units. Given all these questions, it is possible that DNA replication initiation in Leishmania does not rely on ORC, but on transcription-associated events, as postulated by Lombraña et al. (2016), who showed that 87-90% of SNS-seq peaks localise to gene trans-splicing regions, suggesting they correlate with genomic regions where transcription is predicted to decelerate or stall. In contrast, <4% of SNS-seq peaks map to convergent and head-to-tail SSRs, which correspond with ~81-85% of all transcription termination sites, and may be where ORC1/CDC6 localises (by analogy with T. brucei) [125].

A second hypothesis is that Leishmania uses a bi-modal DNA replication strategy. In this model, the MFA-seq peaks represent constitutive activation of a single ORC-bound origin, localised at an SSR, in each chromosome. These initiation events are therefore comparable with the MFA-seq mapping in T. brucei, explaining origin location conservation [129]. The second arm of the bi-modal strategy, necessary for complete duplication of the genome, is stochastic, with DNA replication being randomly initiated at ORC-independent (thus “origin”-independent) sites throughout the chromosomes. What these putative origin-independent replication initiation events are is unclear, but two possibilities can be envisaged, both of which might explain the SNS-seq data. The first possibility was suggested by Lombraña et al. [125], with the SNS-seq peaks representing sites where the MCM helicase accumulates and recruits the replisome without ORC [130]. An alternative, and more radical, explanation is that stochastic, ORC-independent initiation of DNA replication relies on DNA recombination.
This suggestion is also compatible with the SNS-seq data, since, as in mouse and human cells, ~74% of the SNS-seq peaks appeared to be significantly associated with G-quadruplex motifs [125], structures that can cause DNA polymerase slowing or stalling and, consequently, DNA damage [75]. In this model, ORC does not recruit the rest of the replisome; instead, recombination factors perform this role, as occurs during break-induced replication [131, 132]. Recombination-directed initiation of DNA replication, while unprecedented on such a scale, has the attraction of explaining the pervasive genome plasticity of Leishmania, including the role of recombination in the formation of genome-wide episomes [133-135]. Furthermore, recombination is needed to support continued replication in at least some polyploid archaea in which origins have been deleted [136]. Testing all these models will require genome-wide mapping of origins of replication in Leishmania using techniques other than MFA-seq and SNS-seq, allied to ChIP-seq mapping of key replication and recombination factors. In addition, it will be illuminating to ask if putative ORC-independent events are found in some circumstances in the T. brucei genome, and in other kinetoplastids.

CONCLUSION

Our understanding of DNA replication in kinetoplastids remains limited, but is growing. Most of what we know about this vital cellular process has been uncovered in the last ten years. In this period, we have moved from an archaeal-like model of the DNA replication initiator factor to a eukaryotic ORC-like complex, though a highly divergent one in which some subunits remain to be identified or may be absent, and some appear to display pronounced sequence divergence. Because of this divergence from model eukaryotic ORCs, further analysis of kinetoplastid ORC structure and of the function of the most diverged subunits, such as Tb3120, may illuminate ORC evolution and activity, as well as the potential of these subunits as targets for drug development. Presently, the emerging data on DNA replication in Leishmania are confusing, but we envisage that the current discord will be reconciled in time. Nonetheless, it seems clear that the dynamics of origin usage, replication timing and potentially replication execution in Leishmania are drastically different from those in T. brucei and other characterised microbial eukaryotes. Comparison of these two related kinetoplastids, and analysis of further organisms in this grouping of the eukaryotic tree, may reveal fundamental features of the evolution of eukaryotic chromosome replication, while uncovering the processes that shape the biology of Leishmania.

CONSENT FOR PUBLICATION

Not applicable.
  132 in total

1.  DNA topology, not DNA sequence, is a critical determinant for Drosophila ORC-DNA binding.

Authors:  Dirk Remus; Eileen L Beall; Michael R Botchan
Journal:  EMBO J       Date:  2004-02-05       Impact factor: 11.598

Review 2.  Nuclear architecture, genome and chromatin organisation in Trypanosoma brucei.

Authors:  Klaus Ersfeld
Journal:  Res Microbiol       Date:  2011-03-21       Impact factor: 3.992

Review 3.  Motors and switches: AAA+ machines within the replisome.

Authors:  Megan J Davey; David Jeruzalmi; John Kuriyan; Mike O'Donnell
Journal:  Nat Rev Mol Cell Biol       Date:  2002-11       Impact factor: 94.444

Review 4.  Transcription-replication encounters, consequences and genomic instability.

Authors:  Anne Helmrich; Monica Ballarino; Evgeny Nudler; Laszlo Tora
Journal:  Nat Struct Mol Biol       Date:  2013-04       Impact factor: 15.369

5.  The Leishmania genome comprises 36 chromosomes conserved across widely divergent human pathogenic species.

Authors:  P Wincker; C Ravel; C Blaineau; M Pages; Y Jauffret; J P Dedet; P Bastien
Journal:  Nucleic Acids Res       Date:  1996-05-01       Impact factor: 16.971

6.  A cytokinetic function of Drosophila ORC6 protein resides in a domain distinct from its replication activity.

Authors:  Igor N Chesnokov; Olga N Chesnokova; Michael Botchan
Journal:  Proc Natl Acad Sci U S A       Date:  2003-07-23       Impact factor: 11.205

7.  Cdc6-induced conformational changes in ORC bound to origin DNA revealed by cryo-electron microscopy.

Authors:  Jingchuan Sun; Hironori Kawakami; Juergen Zech; Christian Speck; Bruce Stillman; Huilin Li
Journal:  Structure       Date:  2012-03-07       Impact factor: 5.006

8.  Trypanosome prereplication machinery: a potential new target for an old problem.

Authors:  Simone Guedes Calderano; Patricia Diogo de Melo Godoy; Julia Pinheiro Chagas da Cunha; Maria Carolina Elias
Journal:  Enzyme Res       Date:  2011-05-25

9.  Genome-wide mapping reveals single-origin chromosome replication in Leishmania, a eukaryotic microbe.

Authors:  Catarina A Marques; Nicholas J Dickens; Daniel Paape; Samantha J Campbell; Richard McCulloch
Journal:  Genome Biol       Date:  2015-10-19       Impact factor: 13.583

Review 10.  Does DNA replication direct locus-specific recombination during host immune evasion by antigenic variation in the African trypanosome?

Authors:  Rebecca Devlin; Catarina A Marques; Richard McCulloch
Journal:  Curr Genet       Date:  2016-11-07       Impact factor: 3.886

