
High-quality reference genome of Fasciola gigantica: Insights into the genomic signatures of transposon-mediated evolution and specific parasitic adaption in tropical regions.

Xier Luo1,2, Kuiqing Cui2, Zhiqiang Wang2, Zhipeng Li2, Zhengjiao Wu2, Weiyi Huang2, Xing-Quan Zhu3, Jue Ruan1,2, Weiyu Zhang2, Qingyou Liu1,2.   

Abstract

Fasciola gigantica and Fasciola hepatica are causative pathogens of fascioliasis, with the widest latitudinal, longitudinal, and altitudinal distribution; however, among parasites, they have the largest sequenced genomes, hindering genomic research. In the present study, we used various sequencing and assembly technologies to generate a new high-quality Fasciola gigantica reference genome. We improved the integration of gene structure prediction, and identified two independent transposable element expansion events contributing to (1) the speciation between Fasciola and Fasciolopsis during the Cretaceous-Paleogene boundary mass extinction, and (2) the habitat switch to the liver during the Paleocene-Eocene Thermal Maximum, accompanied by gene length increment. Long interspersed element (LINE) duplication contributed to the second transposon-mediated alteration, showing an obvious trend of insertion into gene regions, regardless of strong purifying effect. Gene ontology analysis of genes with long LINE insertions identified membrane-associated and vesicle secretion process proteins, further implicating the functional alteration of the gene network. We identified 852 predicted excretory/secretory proteins and 3300 protein-protein interactions between Fasciola gigantica and its host. Among them, copper/zinc superoxide dismutase genes, with specific gene copy number variations, might play a central role in the phase I detoxification process. Analysis of 559 single-copy orthologs suggested that Fasciola gigantica and Fasciola hepatica diverged at 11.8 Ma near the Middle and Late Miocene Epoch boundary. We identified 98 rapidly evolving gene families, including actin and aquaporin, which might explain the large body size and the parasitic adaptive character resulting in these liver flukes becoming epidemic in tropical and subtropical regions.

Year:  2021        PMID: 34610021      PMCID: PMC8519440          DOI: 10.1371/journal.pntd.0009750

Source DB:  PubMed          Journal:  PLoS Negl Trop Dis        ISSN: 1935-2727


Introduction

Fasciola gigantica and Fasciola hepatica, known as liver flukes, are two species in the genus Fasciola that commonly cause fascioliasis in domestic and wild ruminants and are also causal agents of fascioliasis in humans. Fascioliasis reduces the productivity of animal industries, imposes an economic burden of at least 3.2 billion dollars annually worldwide, and is a neglected zoonotic tropical disease of humans, according to the World Health Organization’s list [1]. F. gigantica, the major fluke infecting ruminants in Asia and Africa, is a serious threat to the farming of domesticated animals, such as cows and buffaloes, and dramatically reduces their feed conversion efficiency and reproduction [2]. The prevalence of F. gigantica infection has greatly affected subsistence farmers, who have limited resources to treat their herds, and has hindered economic development and health levels, especially in developing countries. The various omics technologies provide powerful tools to advance our understanding of the molecules that act at the host-parasite interface, and allow the identification of new therapeutic targets against fascioliasis [3]. To date, four assemblies for F. hepatica and two assemblies for F. gigantica have been deposited at the NCBI [4-7]. These assemblies revealed a large genome with a high percentage of repeat regions in Fasciola species, and provided valuable insights into features of adaptation and evolution. However, these assemblies are based on short-read Illumina sequencing or hybrid sequencing methods, with limited ability to span large families of repeats. These limitations have left the current assemblies in the genus Fasciola fragmented (8 kb to 33 kb and 128 kb to 1.9 Mb for contig and scaffold N50s, respectively).
Subsequent gene annotation using the current assemblies was also challenging: abundant transposition events over evolutionary history significantly increased the repeat content of intronic regions, resulting in considerable fragmentation of gene annotations. Infection by Fasciola causes extensive damage to the liver, and excretory/secretory (E/S) proteins play an important role in host-parasite interactions. Parasite-derived molecules interact with proteins from the host cell to generate a protein interaction network, and these proteins partly contribute to Fasciola’s striking ability to avoid and modulate the host’s immune response [8]. Previous proteomic studies of E/S proteins have highlighted the importance of secreted extracellular vesicles (EVs) and detoxification enzymes in modulating host immunity through internalization by host immune cells [9,10]. The anthelmintic drug triclabendazole (TCBZ), which acts by disrupting β-tubulin polymerization [11], is currently the major drug available to treat fascioliasis at the early and adult stages; however, over-reliance on TCBZ to treat domesticated ruminants has resulted in selection for resistance in liver flukes [12]. Drug and vaccine targets for molecules associated with reactive oxygen species (ROS)-mediated apoptosis have recently been validated as effective tools in multiple helminth parasites [13]. Increased understanding of host-parasite and drug-parasite interactions would facilitate the development of novel strategies to control fascioliasis. In recent years, the number of human cases of fascioliasis has increased, and the disease has become a major public health concern in many regions [14,15]. However, high-quality genome assemblies for liver flukes are still insufficient. In the present study, we combined multiple sequencing technologies to assemble a chromosome-level genome for F. gigantica and provided integrated gene annotation.
Protein-protein interactions were analyzed between the predicted F. gigantica secretome and host proteins expressed in the small intestine and liver. In addition, gene family analysis identified a series of gene expansions in F. gigantica. Interestingly, the distribution of repeat sequences in the genome exhibits an excess of long interspersed element (LINE) duplications inserted into intronic regions, which may help to explain how transposable element (TE) duplications made gene structures more plastic and possibly acted as long-term agents in the speciation of Fasciola.

Results

PacBio long-read-based de novo assembly and gene annotation

The F. gigantica genome contains abundant repeat sequences that are difficult to span using short-read assembly methods, and these complex regions also hinder integrated gene annotation of the genome. Therefore, in the present study, multiple sequencing technologies were applied: (1) single-molecule long-read sequencing (~91× depth) using the PacBio Sequel II platform; (2) paired-end sequencing (~66× depth) using the Illumina platform; and (3) chromosome conformation capture sequencing (Hi-C) (~100× depth) (S1 Table). The initial assembly was built from the PacBio long reads and then polished by mapping both the single-molecule and the Illumina reads to correct assembly and sequencing errors, resulting in a contig N50 size of 4.89 Mb (Fig 1A). The Hi-C data were used to build the final super-scaffolds, resulting in a total length of 1.35 Gb with a scaffold N50 size of 133 Mb (Fig 1B, S1 Fig, Table 1, and S1–S3 Tables). The final assembly consists of 10 pseudo-chromosomes covering more than 99.9% of the F. gigantica genome, and their length distribution was approximately equal to that estimated from the karyotype in previous research (S2 Fig and S4 Table) [16]. The assessment of nucleotide accuracy showed an error rate of 5.7 × 10^-6 in the genome. QUAST analysis [17] showed high mapping and coverage rates using both Illumina short reads and PacBio long reads, with 99.73% of reads mapping to 99.85% of the genome at more than 10× depth (S5 Table).
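
The contig and scaffold N50 values quoted above follow the usual definition: the length of the sequence at which the cumulative sum, taken from longest to shortest, first reaches half of the total assembly length. L50 is the number of sequences needed to get there. A minimal sketch of that calculation is given below; the function name and the toy lengths are hypothetical and are not taken from the F. gigantica assembly.

```python
from typing import Iterable, Tuple


def n50_l50(lengths: Iterable[int]) -> Tuple[int, int]:
    """Return (N50, L50) for a collection of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one sequence length is required")
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # The first sequence that pushes the running sum past half of the
        # assembly defines N50; its rank in the sorted list is L50.
        if running >= half_total:
            return length, count
    raise AssertionError("unreachable: the running sum always reaches half")


# Hypothetical toy lengths (arbitrary units), for illustration only.
toy_n50, toy_l50 = n50_l50([50, 40, 30, 20, 10])
print(toy_n50, toy_l50)
```

Applied to the real scaffold lengths, the same function would be expected to return the scaffold N50 and L50 listed in Table 1, provided the assembly statistics were computed with this standard definition.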
Fig 1

Landscape of the Fasciola gigantica genome.

(A) Comparison of assembled contig and scaffold lengths (y-axis) and tallies (x-axis) in Fasciola species. (B) Hi-C interaction heatmap of the genome-wide organization. The number of effectively mapped read pairs between two bins was used as the signal of interaction strength between those bins. (C) Integration of genomic and annotation data using 1 Mb bins in the 10 Hi-C assembled chromosomes: (a) distribution of the GC content (> 39% and < 52%); (b) distribution of the long interspersed element (LINE) percentage (> 0% and < 50%); (c) distribution of the long terminal repeat (LTR) percentage (> 0% and < 50%); (d) distribution of the gene percentage (> 0% and < 70%); (e) distribution of the heterozygosity density of our sample (> 0% and < 1%); (f) distribution of the heterozygosity density of SAMN03459319 in the NCBI database. Hi-C, chromosome conformation capture sequencing.

Table 1

Summary statistics for the genome sequences and annotation.

                                                  F. gigantica
Genome
  Total Genome Size (Mb)                          1,348
  Chromosome Number                               10
  Scaffold Number (a)                             10 + 24
  Scaffold N50 (Mb)                               133
  Scaffold L50                                    4
  Contig Number                                   1,022
  Contig N50 (Mb)                                 4.89
  Heterozygosity Rate (%)                         1.9 × 10^-3
Annotation
  Total Gene Number                               12,503
  Average CDS Length (bp)                         1552.7
  Average Gene Length (kb)                        28.8
  Percentage of Genome Covered by CDSs (%)        1.5
  BUSCO Assessment (%)                            90.4
  Repeat Content (%)                              70.0

(a) Number of chromosome-level scaffolds and unplaced scaffolds. CDS, coding sequence.

Combining de novo, homology-based, and RNA-seq-based prediction, a total of 12,503 protein-coding genes were annotated in the F. gigantica genome. BUSCO assessment [18] indicated that the genome is 90.4% complete and 5.6% fragmented, underscoring the significant improvement in genome continuity and gene-structure prediction compared with previous assemblies (S6 Table). Specifically, the average gene length in the annotation is 28.8 kb, nearly twice that in other digenean species, whereas the average length of the coding sequences (CDSs) is similar. Through functional annotation, we found that 8569 of the genes could be characterized in the InterPro database [19,20], 7892 were mapped to gene ontology (GO) terms, and 5353 were assigned to Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways (S3–S4 Figs and S7 Table).
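
The contrast between long genes and ordinary-length CDSs can be checked with simple arithmetic on the Table 1 values. The sketch below is only a back-of-the-envelope check and assumes one CDS per gene and no overlap between gene models, which the source does not state. Under those assumptions the CDSs span about 1.4% of the genome, in line with the roughly 1.5% reported, while the gene bodies span about 27%.

```python
# Values copied from Table 1 of the text.
GENOME_SIZE_MB = 1348
GENE_COUNT = 12503
MEAN_CDS_BP = 1552.7
MEAN_GENE_KB = 28.8


def percent_of_genome(count: int, mean_length_bp: float, genome_mb: float) -> float:
    """Percentage of the genome spanned by `count` features of a given mean length.

    Assumes the features do not overlap (an assumption made for this sketch).
    """
    return 100 * count * mean_length_bp / (genome_mb * 1e6)


cds_percent = percent_of_genome(GENE_COUNT, MEAN_CDS_BP, GENOME_SIZE_MB)
gene_percent = percent_of_genome(GENE_COUNT, MEAN_GENE_KB * 1e3, GENOME_SIZE_MB)
print(f"CDS: {cds_percent:.2f}% of genome; gene bodies: {gene_percent:.1f}%")
```

The gap between the two percentages is almost entirely intronic sequence, which is where the repeat analysis in the next section looks for the extra length.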

The unique repeat duplications in Fasciola

TEs are insertional mutagens and major drivers of genome evolution in eukaryotes, and the replication of these sequences, resulting in variation of gene structure and expression, has been extensively documented [21,22]. In addition, TEs are molecular fossils, being remnants of past mobilization waves that occurred millions of years ago [23]. In the present study, we identified repeat sequences by combining the analyses from RepeatModeler [24] and RepeatMasker [25], and detected a significant proportion that had been missed by previous studies. In the F. gigantica genome, we identified 945 Mb of repeat sequences, approximately 20% more than identified in other assemblies of Fasciola species, while the lengths of non-repeat sequences were nearly identical. The most convincing explanation for the additional assembled repeat sequences is that the contigs constructed from PacBio long reads spanned longer repeat regions, which were collapsed in previous assemblies. Among these repeat sequences, there were 408 Mb of LINEs (30.3% of the assembled genome), 285 Mb of long terminal repeats (LTRs, 21.2% of the assembled genome), and 162 Mb of unclassified interspersed repeats (12.0% of the assembled genome) (S5 Fig and S8 Table). According to the repeat landscapes, there were two shared expansion events for LINEs and LTRs, approximately 12 million years ago (Ma) and 65 Ma, and an additional expansion event for LTRs at 33 Ma (S6–S7 Figs). Our result confirmed a previous study of the family Fasciolidae [6], and the abundance of repeat sequences in the Fasciola genomes raises the question of the role of repeats in evolution (Fig 2A). This suggested the hypothesis that the expansion of TEs enlarged the genome of an ancestor of Fasciola, which gained a new advantage through rewired gene networks. To test this hypothesis, we focused on the genome-wide distribution of repeats and tested for signatures of selection.
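
Expansion times such as 12 Ma and 65 Ma are obtained by converting the sequence divergence of repeat copies into time with a fixed mutation rate (1.73 × 10^-9 per year, as given in the Fig 2 legend). The sketch below illustrates one common way of doing this and rests on two assumptions that the text does not confirm: that divergence is measured as a Kimura two-parameter distance between each copy and its family consensus, and that age equals divergence divided by the rate (a comparison between two copies would divide by twice the rate instead). The function names are hypothetical.

```python
import math

# Substitutions per site per year, taken from the Fig 2 legend.
MUTATION_RATE = 1.73e-9


def kimura_distance(transitions: float, transversions: float) -> float:
    """Kimura two-parameter distance from transition (P) and transversion (Q) proportions."""
    p, q = transitions, transversions
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def insertion_age_ma(divergence: float, rate: float = MUTATION_RATE) -> float:
    """Age in Ma, ASSUMING age = divergence / rate (copy versus consensus)."""
    return divergence / rate / 1e6


def divergence_for_age(age_ma: float, rate: float = MUTATION_RATE) -> float:
    """Inverse of insertion_age_ma: the divergence expected after age_ma million years."""
    return age_ma * 1e6 * rate


for age in (12, 33, 65):
    print(f"{age} Ma corresponds to about {100 * divergence_for_age(age):.1f}% divergence")
```

Under these assumptions, the two shared expansion events correspond to repeat copies that differ from their consensus by roughly 2% and 11%, respectively.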
Fig 2

Identification of repeat expansion and alternative gene networks in the Fasciola gigantica genome.

(A) The distribution of repetitive sequence length among the genomes of six flatworms and the human genome. (B) Landscape of the LINE and LTR distribution in the Fasciola gigantica genome. The x-axis shows the expansion time of TEs calculated from the divergence between repeat sequences. The mutation rate was set to 1.73 × 10^-9 per year. The orange line represents the repeat length ratio, used to estimate the signatures of selection, which was corrected by the historical total length of intronic and intergenic regions. (C) The functional enrichment of genes with more than 10 kb of LINE insertions dated between 41 Ma and 62 Ma, by Gene Ontology (GO) classification. The GO terms related to vesicle secretion are marked in red. (D) TMED10 gene structure map. LINEs originating between 41 Ma and 62 Ma and longer than 500 bp, as identified by RepeatMasker, are plotted, together with LTRs longer than 500 bp. LINE, long interspersed element; LTR, long terminal repeat; TE, transposable element; TMED10, transmembrane P24 trafficking protein 10.

For new TE insertions to persist through vertical inheritance, transposition events within gene loci must be under a strong purifying effect so that they do not disturb biological function. However, we observed many intronic repeat elements in Fasciola, resulting in a larger intron size per gene. If selection acted equally on newly inserted TEs in intronic and intergenic regions, the distributions of insertion time and retained TE length would be highly correlated between the two regions. By contrast, under a purifying effect, fewer repeat sequences would accumulate. In this study, we used the relative proportion of TEs between intronic and intergenic regions as a simple indicator, and used the inferred size of intronic and intergenic regions over evolutionary history as a control to estimate the signatures of selection.
The results showed that TE insertions into intronic regions are under persistent intense purifying effect, except for LINEs. There was an excess of persistent LINE insertions into intronic regions between 41 Ma and 62 Ma, indicating different modes of accumulating LINEs into intronic regions compared with that in other periods (Fig 2B). Specifically, the time of the ancient intronic LINE expansion (~51.5 Ma) was different to the genome-wide LINE expansion time (~68.0 Ma), whereas the time was coincident with two important environmental change events, the Cretaceous-Paleogene boundary (KPB) mass extinction (~66.0 Ma) and the Paleocene-Eocene Thermal Maximum (PETM) (~55.8 Ma). Both the PETM and KPB events recorded extreme and rapid warming climate changes; however, rapid evolutionary diversification followed the PETM event, as opposed to near total mass extinction at the KPB [26]. Therefore, we selected genes with different LINE lengths, derived between 41 Ma and 62 Ma, and expected to identify a transposon-mediated alterative gene network contributing to the host switch and the shift from intestinal to hepatic habitats.
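
The indicator described above can be sketched as a ratio of ratios: for each insertion-age bin, the retained TE length in introns relative to intergenic regions, divided by the inferred size of introns relative to intergenic regions at that time. The code below is one possible reading of that description, not the authors' implementation. The class, the function, and all numbers are hypothetical. A value near or below 1 suggests that intronic insertions were retained no more often than intergenic ones, whereas a value above 1 marks an excess of retained intronic insertions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AgeBin:
    """Retained TE length and inferred region size for one insertion-age bin."""
    start_ma: float
    intronic_te_bp: float
    intergenic_te_bp: float
    intronic_size_bp: float    # inferred intron space at that time
    intergenic_size_bp: float  # inferred intergenic space at that time


def corrected_ratio(b: AgeBin) -> float:
    """Intronic/intergenic TE length ratio, corrected by the region size ratio."""
    te_ratio = b.intronic_te_bp / b.intergenic_te_bp
    size_ratio = b.intronic_size_bp / b.intergenic_size_bp
    return te_ratio / size_ratio


# Hypothetical bins, for illustration only.
bins: List[AgeBin] = [
    AgeBin(10, 2e6, 10e6, 300e6, 900e6),
    AgeBin(50, 6e6, 12e6, 200e6, 800e6),
    AgeBin(70, 1e6, 8e6, 150e6, 750e6),
]
excess = [b.start_ma for b in bins if corrected_ratio(b) > 1]
print(excess)
```

In this toy input only the 50 Ma bin exceeds 1, mimicking the kind of signal reported for LINEs between 41 Ma and 62 Ma.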

LINE-mediated alterative gene network

We identified a substantial proportion of genes with LINE insertions derived between 41 Ma and 62 Ma, indicating a widespread effect on the gene network. We selected 1288 genes with LINE insertions of more than 10 kb, representing more than one third of the average gene length, and annotated them using Gene Ontology (GO) terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways (Fig 2C and S9–S11 Tables). These genes encode molecules that internalize substances from the external environment, including membrane-associated and vesicle secretion process proteins. Meanwhile, the gene network was likely adapted through the evolution of protein biosynthesis and histone modification. Enrichment analysis of GO terms showed that membrane and membrane-associated proteins are over-represented, involving “synaptic membrane” (P = 3.52E-04), “clathrin-coated vesicle membrane” (P = 1.08E-03), and “synaptic vesicle” (P = 3.02E-03), as well as vesicle secretion processes, such as “endocytosis” (P = 7.06E-06), “Golgi organization” (P = 7.45E-05), “COPII vesicle coating” (P = 2.72E-04), “intracellular signal transduction” (P = 5.16E-04), and “endosomal transport” (P = 2.47E-03). In addition, proteins related to phosphorylation and GTPase activation were also enriched, such as “protein phosphorylation” (P = 1.73E-03), “regulation of small GTPase mediated signal transduction” (P = 1.21E-03), and “GTPase activator activity” (P = 1.13E-10). The over-representation of genes involved in membrane transport and signal transduction was particularly interesting because helminth parasites interfere with the host immune system by secreting molecules from the surface tegument or gut. The TMED10 gene in F. gigantica (encoding transmembrane P24 trafficking protein 10) is used as an example.
TMED10 is a cargo receptor involved in protein vesicular trafficking along the secretory pathway [27,28], and the gene has an 11.1 kb LINE insertion in the third intron, resulting in a more than three-fold increase in gene length (Fig 2D). The enrichment suggests that the gene network related to secretion could have experienced adaptive evolution during LINE transposition events. We further compared our dataset with the proteome of F. hepatica extracellular vesicles (EVs) [9], and found 21 proteins that were also identified as surface molecules associated with EV biogenesis and vesicle trafficking (IST1, VPS4B, TSG101, MYOF, ATG2B, STXBP5L, and 15 Rho GTPase-activating related proteins). Specifically, IST1, VPS4B, and TSG101 are members of the endosomal sorting complex required for transport (ESCRT) pathway, which promotes the budding and release of EVs. TSG101, a crucial member of the ESCRT-I complex, has an important role in mediating the biogenesis of multi-vesicular bodies, cargo degradation, and recycling of membrane receptors. In addition, the ESCRT pathway promotes the formation of exosomal carriers for immune communication. During the formation of the immunological synapse between T cells and antigen-presenting B cells, TSG101 ensures the ubiquitin-dependent sorting of T-cell receptor (TCR) molecules to exosomes that undergo VPS4-dependent release into the synaptic cleft [29]. The most significant KEGG pathway was aminoacyl-tRNA biosynthesis (P = 7.16E-04), containing 15 of the 38 annotated aminoacyl-tRNA synthetases (AARSs). AARSs are the enzymes that catalyze the aminoacylation reaction by covalently linking an amino acid to its cognate tRNA in the first step of protein translation. The large-scale insertion of LINEs into AARS genes suggests that the ancestor of Fasciola may have profited from the effect of transposition, with changes to protein biosynthesis and several metabolic pathways for cell viability.
In addition, a significant number of genes are strongly associated with histone modulation, including “histone deacetylase complex” (P = 1.89E-03), “histone methyltransferase activity (H3-K36 specific)” (P = 1.08E-03), and “methylated histone binding” (P = 2.37E-03). Histone modifications play fundamental roles in the manipulation and expression of DNA. We found nine histone deacetylases and histone methyltransferases in the gene set (HDAC4, HDAC8, HDAC10, KMT2E, KMT2H, KMT3A, KDM8, NSD1, and NSD3). Histone modifications can exert their effects by influencing the overall structure of chromatin and by modifying and regulating the binding of effector molecules [30,31]; therefore, variation in these genes might have driven a shift from disturbed gene structures toward a mechanism of genome stabilization that copes with continuous genome amplification over evolutionary history.
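
The P values quoted for the GO and KEGG terms come from an over-representation test. The text does not say which test was used, so the sketch below assumes a one-sided hypergeometric test, a common choice for this kind of analysis, and applies no multiple-testing correction. The function name is hypothetical. The last call plugs in the aminoacyl-tRNA synthetase counts from the text against an assumed background of all 12,503 annotated genes; because the true background and any correction are unknown, its result should not be expected to reproduce the reported P value.

```python
from math import comb


def hypergeom_enrichment_p(population: int, annotated: int, sample: int, hits: int) -> float:
    """One-sided P(X >= hits) for a draw of `sample` genes without replacement.

    population: genes in the background set
    annotated:  background genes carrying the term
    sample:     genes in the selected set
    hits:       selected genes carrying the term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Small hand-checkable case: (6*6 + 4*1) / 120 = 1/3.
toy_p = hypergeom_enrichment_p(population=10, annotated=4, sample=3, hits=2)

# Counts from the text (15 of 38 AARSs among 1288 selected genes); the
# background of 12,503 genes is an ASSUMPTION made for this sketch.
aars_p = hypergeom_enrichment_p(population=12503, annotated=38, sample=1288, hits=15)
print(toy_p, aars_p)
```

Exact integer binomials are used so that the sketch needs only the standard library; a statistics package would be the usual choice for a full analysis with correction for multiple testing.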

Genome-wide host-parasite interaction analysis

In the Fasciola genome, we predicted genes encoding 268 proteases, 36 protease inhibitors (PIs), and 852 excretory/secretory (E/S) proteins, which are commonly involved in interacting with hosts and modulating host immune responses (S8 Fig). The largest class of proteases was the cysteine peptidases (n = 113), as was also found in the F. hepatica genome (Fig 3A and S12 Table). The largest PI family (n = 19, 52.8% of PIs) was the I02 family of Kunitz-BPTI serine protease inhibitors, which bind to cathepsin L with a possible immunoregulatory function [32] (S13 Table). GO enrichment analysis of E/S proteins showed that proteins related to “activation of cysteine-type endopeptidase activity” (P = 6.14E-19), “peroxidase activity” (P = 3.79E-07), and “protein disulfide isomerase activity” (P = 3.75E-06) are over-represented (Fig 3B and S14–S15 Tables). Indeed, 38 cysteine peptidases were identified as E/S proteins, including cathepsin L-like, cathepsin B-like, and legumain proteins, which participate in excystment, migration through the gut wall, and immune evasion [33].
Fig 3

Genome-wide host-parasite interaction analysis.

(A) Pie chart of proteases identified in Fasciola gigantica. (B) The interaction mode between adult Fasciola gigantica and the host. (C) The protein-protein interaction (PPI) network of redox-related pathways in Fasciola gigantica with host proteins. The genes shown belong to three significantly enriched gene ontology (GO) terms, and their encoded proteins have PPIs with excretory/secretory (E/S) proteins.

In parasites, as in mammalian cells, ROS are produced as a by-product of cell metabolism and from the metabolism of certain pharmacological agents. The ability of a parasite to survive in its host has been directly related to its antioxidant enzyme content [34]. To further analyze host-parasite interactions, we identified the protein-protein interactions (PPIs) between the F. gigantica secretome and human proteins expressed in the small intestine and liver [8]. In total, we identified 3300 PPIs, including many interactions that directly or indirectly participate in the two phases of detoxification pathways (Fig 3C). Superoxide dismutase [Cu-Zn] (SOD, PPIs = 49) was highlighted first because of its important role in phase I detoxification against ROS, in which it catalyzes the dismutation of the superoxide radical to molecular oxygen and hydrogen peroxide (H2O2) [35]. Gene family analysis identified six SOD paralogs in F. gigantica, two of which contained a signal peptide (Fig 4D). Previous enzyme activity assays also confirmed a significant difference in SOD activity and concentration between the E/S proteins of the two Fasciola species [36], suggesting a strong ability to resist superoxide radical toxicity. Meanwhile, the product of phase I, H2O2, can also damage parasites, and its removal requires detoxification enzymes, including the glutathione-dependent enzyme glutathione peroxidase (GPx), glutathione reductase, and other peroxidases.
Protein disulfide-isomerase (P4HB, PPIs = 132) and phospholipid hydroperoxide glutathione peroxidase (GPX4, PPIs = 28) were identified as functioning in phase II detoxification. GPx catalyzes the reduction of hydroperoxides (ROOH) to water, using glutathione (GSH) as the reductant. P4HB also participates in the process by mediating homeostasis of the antioxidant glutathione [37]. However, we did not identify E/S proteins of the cytochrome P450 (CYP450) family in phase III detoxification. Therefore, we speculated that the successful defense of F. gigantica within its host mainly depends on strong superoxide dismutase activity and efficient hydrogen peroxide detoxification.
Fig 4

Phylogenetic tree and gene family analysis.

(A) A phylogenetic tree generated using 559 single-copy orthologous genes. The numbers on the species names are the expanded (+) and contracted (-) gene families. The numbers on the nodes are the divergence times between species. (B) A phylogenetic tree of actin genes in flatworms and humans. All human homologous genes were selected as the outgroup. (C) A phylogenetic tree of aquaglyceroporin (AQP) family genes in flatworms and humans. The human homologous genes (AQP11, AQP12A, and AQP12B) were selected as the outgroup. (D) A phylogenetic tree of copper/zinc superoxide dismutase (SOD) genes in flatworms and humans. The midpoint was selected as the root node.

Gene family analysis

Gene family analysis was performed using eight taxa (F. gigantica, F. hepatica, Fasciolopsis buski [38], Clonorchis sinensis [39], Schistosoma mansoni [40], Taenia multiceps [41], swamp buffalo [42], and human [43]), which identified 17,992 gene families (Fig 4A). Phylogenetic analysis of 559 single-copy orthologs showed that F. gigantica and F. hepatica shared a common ancestor approximately 11.8 million years ago (2.2–22.5 Ma, 95% highest posterior density [HPD]), near the Middle and Late Miocene Epoch boundary. The Miocene warming began 21 million years ago and continued until 14 million years ago, when global temperatures dropped sharply at the Middle Miocene Climate Transition (MMCT). The divergence of the two Fasciola species may have resulted from the consequences of rapid climate change, such as host migration causing geographic isolation. Our estimate lies between the previously suggested dates of 5.3 Ma, based on 30 nuclear protein-coding genes [6], and 19 Ma, based on cathepsin L-like cysteine proteases [44]. Although we used a more comprehensive gene dataset, the wide HPD interval cannot be neglected; it may reflect uncertainty arising from the complex process of speciation or from inappropriate protein sequence alignment between members of the genus Fasciola. The distribution of gene family sizes among species was used to estimate which lineages underwent significant contractions or expansions. Compared with F. hepatica, F. gigantica shows more gene family expansion events (643 compared to 449) and a similar number of gene family contractions (713 compared to 672). The result emphasizes the general trend that, relative to the common ancestor of Fasciola, the ancestor of F. gigantica apparently underwent more gene expansion than did the ancestor of F. hepatica. Gene duplication is one of the primary contributors to the acquisition of new functions and physiology [45].
We identified 98 gene families, including 629 genes, as rapidly evolving families specific to F. gigantica. Family analysis showed a fascinating trend of gene duplication, with substantial enrichment for “structural constituent of cytoskeleton” (P = 3.52E-24), “sarcomere organization” (P = 2.29E-14), “actin filament capping” (P = 6.19E-13), and “spectrin” (P = 3.03E-11) in F. gigantica (S16 Table). There were 24 actin paralogs in F. gigantica, in contrast to 8 actin paralogs in F. hepatica. Actin is one of the most abundant proteins in most cells, and actin filaments, one of the three major cytoskeletal polymers, provide structure and support internal movements of organisms [46]. Actins are also highly conserved, varying by only a few amino acids between algae, amoebae, fungi, and animals [47]. We observed three types of actin proteins in flukes, according to their identity to the human actin family. Seventeen of the 24 actin proteins in F. gigantica are highly conserved (identity > 95%) (Fig 4B). Consistent with the accepted role of the epidermal actin cytoskeleton in embryonic elongation [48,49], we speculated that the significant expansion of actin and spectrin genes increased the body size of F. gigantica via cell elongation or proliferation during morphogenesis. Another rapidly evolving family is the aquaglyceroporin subfamily of the membrane water channel family. We found six aquaglyceroporin paralogs in F. gigantica, which were over-represented in the GO term “water transport” (P = 2.10E-06) (Fig 4C). Aquaglyceroporins are highly permeated by glycerol and other solutes, and variably permeated by water, as functionally validated by several studies [50,51]. The mammalian aquaglyceroporins regulate glycerol content in epidermal, fat, and other tissues, and appear to be involved in skin hydration, cell proliferation, carcinogenesis, and fat metabolism. A previous study showed that F. gigantica could withstand a wider range of osmotic pressures than F. hepatica [52], and we speculated that a higher aquaglyceroporin gene copy number might help explain this observation. It is worth mentioning that 57.6% of the rapidly evolving expanded genes specific to the F. gigantica genome arose by tandem duplication, such that the newly formed duplicates preserved nearly identical sequences to the original genes. Newly formed genes would either accumulate non-functionalizing mutations or develop new functions over time. We found only a few tandemly duplicated genes with non-functionalizing mutations, suggesting that adaptive evolution could have played an important role in the fate of these genes via a dosage effect or neo-functionalization.

Discussion

The genomes of Fasciola species contain a large percentage of repeat sequences, making them the largest parasite genomes sequenced to date. Since the first assembly of F. hepatica was submitted in 2015 [5], several studies have aimed to improve the quality of assembly and gene annotation [4,6,7]. With advances in long-read sequencing assembly and Hi-C scaffolding technologies, it is now viable to resolve the genomic “dark matter” of repetitive sequences and other complex structural regions at relatively low cost [53]. Here, we present the highest-quality genome and gene annotation for F. gigantica to date, and provide a long-awaited integrated genome annotation for fascioliasis research. In a previous study of the family Fasciolidae, Choi et al. discovered TE expansion in Fasciola, which also explained the large lineage-specific genome size and longer annotated genes [6]. We confirmed this result in the F. gigantica genome and further identified signatures of selection based on the unbalanced historical distribution of inserted TEs between intronic and intergenic regions. In particular, the strongest selection signal occurred at the speciation between Fasciola and Fascioloides (a habitat switch from the small intestine to the liver in the host) during the PETM, which was accompanied by LINE expansion biased toward intronic regions (Fig 5). This unexpected event provides new evidence of adaptive evolution driven by transposition events and will prompt investigations of how such differences contribute mechanistically to the morphological phenotypes of liver flukes and related species. Many studies in other species also support the hypothesis that TE invasions endured by organisms have catalyzed the evolution of gene-regulatory networks [54]. 
For example, Eutherian-specific TEs have the epigenetic signatures of enhancers, insulators, and repressors, and bind directly to transcription factors that are essential for pregnancy and coordinately regulate gene expression [55]. Similarly, the genes with large-scale TE insertions identified here in Fasciola species represent a signature of a Fasciola-specific evolutionary gene network that distinguishes them from other flukes of the family Fasciolidae. These genes overlap significantly with host-parasite interaction genes, including proteases and E/S proteins, and are enriched in the pathways of EV biogenesis and vesicle trafficking.
Fig 5

Schematic diagram of the process of Fasciola-specific repeat expansion during evolution.

Data from genomic, transcriptomic, and proteomic studies complement one another and further our understanding of helminth parasites and their interactions with their hosts. Previous studies have identified a rich source of stage-specific molecules of interest using transcriptomic and proteomic analyses [56,57]. Here, we provide a comprehensive list of predicted E/S proteins in F. gigantica and predict 3300 PPIs at the host-parasite interface, extending our understanding of how the phase I and phase II detoxification enzymes counteract the effect of ROS. The ability of Fasciola species to infect and survive in different tissue environments is underpinned by several key E/S protein gene duplications. Both Fasciola species share an expansion of the secreted papain-like cysteine peptidase family (Clan A, family C1) [5]. In addition, F. gigantica has a specific variation in SOD gene copy number, allowing it to regulate the catalytic breakdown of superoxide radicals released by the host. The effect of specific gene duplications can also be reflected in the increased body size of F. gigantica, which is an important morphometric character for distinguishing Fasciola species and has a decisive influence on the final host species [58], although gene-level studies of this phenotype have rarely been reported. Overall, our study demonstrates that the combination of long-read sequencing with Hi-C scaffolding produced a very high-quality liver fluke genome assembly and gene annotation. Additionally, identification of the repeat distribution among gene regions extends our understanding of the evolutionary process in Fasciola species. Further detailed functional studies of secretion might be of great scientific significance for exploring potential applications in fascioliasis treatment.

Materials and methods

Ethics statement

This study was approved by the Research Ethics Committee of Guangxi University (Permit code: GXU2019-029). In the present study, experiments were performed in accordance with the Principle Guidance for the Use and Care of Laboratory Animals.

Sample collection and de novo sequencing

All animal work was approved by the Guangxi University Institutional Animal Care and Use Committee. For reference genome sequencing, one adult-stage F. gigantica was obtained from an infected buffalo in the Guangxi Zhuang Autonomous Region. Nucleic acids were extracted using a QIAGEN DNeasy (DNA) kit (Qiagen, Hilden, Germany). Three de novo genome sequencing methods were applied to the liver fluke. We generated (1) 122.4 Gb (~88× depth) of PacBio Sequel II single-molecule long reads, with an average read length of 15.8 kb (PacBio, Menlo Park, CA, USA); (2) 89.5 Gb (~66× depth) of Illumina HiSeq PE150 paired-end reads for error correction (Illumina, San Diego, CA, USA); and (3) 134 Gb (~100× depth) of chromosome conformation capture sequencing (Hi-C) data (sequenced on the Illumina platform).
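As a quick consistency check on the yields and depths quoted above, each dataset implies a genome size equal to its yield divided by its depth. The sketch below back-calculates that value; the function name is illustrative, and the implied sizes are only approximate because the depths are rounded in the text.

```python
def implied_genome_size_gb(yield_gb: float, depth: float) -> float:
    """Genome size (Gb) implied by a sequencing yield (Gb) and a depth (x)."""
    if depth <= 0:
        raise ValueError("depth must be positive")
    return yield_gb / depth


# Yields and depths as quoted in the text.
datasets = {
    "PacBio Sequel II": (122.4, 88),
    "Illumina PE150": (89.5, 66),
    "Hi-C": (134.0, 100),
}

for name, (yield_gb, depth) in datasets.items():
    size = implied_genome_size_gb(yield_gb, depth)
    print(f"{name}: ~{size:.2f} Gb implied genome size")
```

The three datasets should agree to within rounding if the quoted depths were computed against the same genome size estimate.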

De novo assembly and assessment of the genome quality

A PacBio-only assembly was performed using Canu v2.0 [59,60], which uses new overlapping and assembly algorithms, including an adaptive overlapping strategy based on tf-idf weighted MinHash and a sparse assembly graph construction that avoids collapsing diverged repeats and haplotypes. To remove haplotigs and contig overlaps in the assembly, we used Purge_Dups, which relies on read depth [61]. Arrow (https://github.com/PacificBiosciences/GenomicConsensus) was initially used to reduce errors in the draft assembly, with an improved consensus model based on a more straightforward hidden Markov model approach. Pilon [62] was then run three times to improve the local base accuracy of the contigs by analyzing read alignment information from paired-end BAM files. The resulting initial assembly had an N50 of 4.89 Mb for the F. gigantica reference genome. ALLHiC was used to build chromosome-scale scaffolds for the initial genome using Hi-C paired-end reads containing putative restriction enzyme site information (S1 Text) [63]. The whole genome assembly (contig version) has been deposited in the Genome Warehouse in BIG Data Center under accession number GWHAZTT00000000 and in NCBI under BioProject PRJNA691688. Three methods were used to evaluate the quality of the genome. First, we used the QUality ASsessment Tool (QUAST) [64] to align the Illumina and PacBio raw reads to the F. gigantica reference genome to estimate the coverage and mapping rate. Second, all the Illumina paired-end reads were mapped to the final genome using BWA [65], and single nucleotide polymorphisms (SNPs) were called using Samtools and Bcftools. The predicted error rate was calculated as the number of homozygous substitutions divided by the length of the whole genome, which captures discrepancies between the assembly and the sequencing data. Third, we assessed the completeness of the genome assembly and of the annotated genes using BUSCO [18].
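Two of the quality metrics mentioned above are simple to compute once contig lengths and variant counts are available. The following is a minimal sketch rather than the pipeline used in the study: the function names are illustrative, and the inputs (a list of contig lengths, a count of homozygous substitutions) are assumed to have been extracted from the assembly and the variant calls beforehand.

```python
def n50(lengths):
    """Return the N50: the length L such that contigs of length >= L
    together cover at least half of the total assembly length."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("no contig lengths supplied")


def substitution_error_rate(homozygous_substitutions, genome_length):
    """Predicted per-base error rate: homozygous substitutions / genome length."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    return homozygous_substitutions / genome_length


# Toy example with made-up contig lengths (not data from the study).
toy_contigs = [2, 3, 4, 5, 6]
print(n50(toy_contigs))
```

Homozygous substitutions are used for the error rate because, in a read set drawn from the same individual as the assembly, they indicate positions where the assembled base disagrees with essentially all reads.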

Genome annotation

Three gene prediction methods, based on de novo prediction, homologous genes, and transcriptomes, were integrated to annotate protein-coding genes. RNA-seq data of F. gigantica were obtained from the NCBI Sequence Read Archive (SRR4449208) [66]. RNA-seq reads were aligned to the genome assembly using HISAT2 (v2.2.0) [67] and subsequently assembled using StringTie (v2.1.3) [68]. PASA (v2.4) [69] was also used to assemble RNA-seq reads and to generate gene models for training the de novo programs. Two de novo programs, Augustus (v3.0.2) [70] and SNAP (v2006-07-28) [71], were used to predict genes in the repeat-masked genome sequences. For homology-based prediction, protein sequences from UniRef100 [72] (Plagiorchiida-specific, n = 75,612) were aligned to the genome sequence using TBLASTn [73] (e-value < 10^-4), and GeneWise (version 2.4.1) [74] was used to identify accurate gene structures. All predicted genes from the three approaches were combined using MAKER (v3.1.2) [75] to generate a high-confidence gene set. To obtain gene function annotations, InterProScan (v5.45) [76] was used to identify features of the annotated genes, including protein families, domains, functional sites, and GO terms from the InterPro database. The SwissProt and TrEMBL protein databases were also searched using BLASTp [77] (e-value < 10^-4). The best BLASTp hits were used to assign homology-based gene functions. BlastKOALA [78] was used to search the KEGG ORTHOLOGY (KO) database. The subsequent enrichment analysis was performed using the “enricher” function of clusterProfiler, with all annotated genes as the background [79].
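The over-representation test behind this kind of enrichment analysis is a one-sided hypergeometric test. The sketch below illustrates that test in plain Python; it is not the clusterProfiler implementation, the function and parameter names are illustrative, and multiple-testing adjustment is omitted.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """One-sided over-representation p-value P(X >= k).

    N: number of annotated background genes
    K: background genes carrying the term
    n: genes in the query list
    k: query genes carrying the term
    """
    if not (0 <= K <= N and 0 <= n <= N and 0 <= k <= min(n, K)):
        raise ValueError("inconsistent counts")
    total = comb(N, n)
    # Sum the probability of observing k, k+1, ... term-carrying genes.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return tail / total


# Toy numbers (not from the study): 10 background genes, 4 with the term,
# 3 genes drawn, 2 of them with the term.
print(hypergeom_enrichment_p(k=2, n=3, K=4, N=10))
```

Using all annotated genes as the background, as described above, means N counts only genes that have at least one annotation, which avoids inflating significance with unannotated genes.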

Repeat annotation and analysis

We combined de novo and homology approaches to identify repetitive sequences in our assembly and in previously published assemblies, including F. gigantica, F. hepatica, and Fasciolopsis buski. RepeatModeler (v2.0.1) [24] was first used for de novo identification and compilation of sequence models representing all of the unique TE families dispersed in the genome. Then, RepeatMasker (v4.1.0) [25] was run on the genome using the combination of the de novo library and a library of known repeats (Repbase-20181026). The relative position between a repeat and a gene was identified using bedtools [80], and repeats were further classified as intronic or intergenic in origin. The repeat landscape was constructed using the sequence alignments and the complete annotation output from RepeatMasker, depicting the distribution of Kimura divergence (the Kimura genetic distance between identified repeat sequences and their consensus) for all repeat types. The most notable peak in the repeat landscape was taken as the most likely time of repeat duplication in that period. We inferred the time of LINE insertion by converting the Kimura divergence (d) from RepeatMasker to age (t) with t = d/(2mu), where mu is the mutation rate. The distributions of TEs were calculated with sliding windows (n = 50). In each sliding window, we calculated the relative proportion of TEs between intronic and intergenic regions, and then corrected it using the overall ratio between intronic and intergenic regions. To calculate the mutation rate, we used the multiple sequence alignment of 559 single-copy orthologs among the 8 species, produced in the gene family analysis described below, and estimated the mutation rate using MCMCtree with a global clock. A Markov chain Monte Carlo (MCMC) process was run for 2,000,000 iterations, with a sample frequency of 100 after a burn-in of 1,000 iterations. The median of the simulated data was selected as the mutation rate (mu = 1.73 × 10^-9 per base per year).
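The divergence-to-age conversion and the window correction described above can be sketched as follows. The age function follows the stated relation between Kimura divergence, mutation rate, and time. The correction function is one plausible reading of the text and is an assumption: it divides the intronic-to-intergenic TE ratio within a window by the overall intronic-to-intergenic ratio. All function and parameter names are illustrative, and divergence is expected as a fraction (a percentage would need dividing by 100 first).

```python
# Mutation rate reported in the text (substitutions per base per year).
MU = 1.73e-9


def divergence_to_age_ma(d, mu=MU):
    """Convert a Kimura divergence d (substitutions per site, as a fraction)
    into an age in millions of years using t = d / (2 * mu)."""
    if d < 0 or mu <= 0:
        raise ValueError("d must be non-negative and mu positive")
    return d / (2.0 * mu) / 1e6


def corrected_ratio(intronic_te, intergenic_te, intronic_total, intergenic_total):
    """Assumed correction: the window's intronic/intergenic TE ratio divided
    by the overall intronic/intergenic ratio. Values near 1 indicate no bias."""
    window_ratio = intronic_te / intergenic_te
    overall_ratio = intronic_total / intergenic_total
    return window_ratio / overall_ratio


# Illustrative divergences only (not values taken from the study).
for d in (0.05, 0.15, 0.20):
    print(f"d = {d:.2f} -> ~{divergence_to_age_ma(d):.1f} Ma")
```

The factor of 2 in the denominator reflects that divergence accumulates along both lineages being compared, so an observed divergence d corresponds to d/2 substitutions per site per lineage.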

Genome-wide host-parasite protein interaction analysis

In addition to the genome data that we generated for F. gigantica, we downloaded genome annotation information for human (GCA_000001405.28), swamp buffalo (GWHAAJZ00000000), F. hepatica (GCA_002763495.2), Fasciolopsis buski (GCA_008360955.1), Clonorchis sinensis (GCA_003604175.1), Schistosoma mansoni (GCA_000237925.2), and Taenia multiceps (GCA_001923025.3) from the NCBI database and BIG Sub (China National Center for Bioinformation, Beijing, China). Proteases and protease inhibitors were identified and classified into families using BLASTp (e-value < 10^-4) against the MEROPS peptidase database (merops_scan.lib; European Bioinformatics Institute [EMBL-EBI], Cambridge, UK), requiring matches to cover at least 80% of the amino acids of the database protein. These proteases were divided into five major classes (aspartic, cysteine, metallo, serine, and threonine proteases). E/S proteins (i.e., the secretome) were predicted using SignalP 5.0 [81], TargetP [82], and TMHMM [83]. Proteins with a signal peptide sequence but without a transmembrane region were identified as secretome proteins, excluding mitochondrial sequences. Genome-wide host-parasite protein interaction analysis was performed by constructing PPIs between the F. gigantica secretome and human proteins expressed in the tissues related to the liver fluke life cycle. For the host, we selected human proteins expressed in the small intestine and liver, and located in the plasma membrane and extracellular region. Gene expression and subcellular location information were obtained from the TISSUES [84] and UniProt (EMBL-EBI) databases, respectively. For F. gigantica, secretome molecules were mapped to the human proteome as the reference using the reciprocal best-hit BLAST method. These two gene datasets were used to construct host-parasite PPI networks. 
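The secretome rule and the reciprocal best-hit mapping described above can be sketched as follows. This is a minimal illustration under stated assumptions: the per-protein predictions are assumed to have been parsed already from the SignalP, TMHMM, and TargetP outputs, BLAST hits are assumed to be available as (query, subject, bit score) tuples, and all function names and protein identifiers are hypothetical.

```python
def is_predicted_secreted(has_signal_peptide, n_tm_helices, is_mitochondrial):
    """Secretome rule from the text: signal peptide present, no transmembrane
    region, and not predicted to be mitochondrial."""
    return has_signal_peptide and n_tm_helices == 0 and not is_mitochondrial


def best_hits(hits):
    """Map each query to its highest-scoring subject.
    hits: iterable of (query, subject, bit_score)."""
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(forward_hits, reverse_hits):
    """Return (parasite, human) pairs that are each other's best hit."""
    forward = best_hits(forward_hits)
    reverse = best_hits(reverse_hits)
    return sorted((q, s) for q, s in forward.items() if reverse.get(s) == q)


# Hypothetical identifiers and scores, for illustration only.
parasite_vs_human = [("fg1", "hs1", 200), ("fg1", "hs2", 150), ("fg2", "hs2", 90)]
human_vs_parasite = [("hs1", "fg1", 210), ("hs2", "fg1", 140), ("hs2", "fg2", 80)]
print(reciprocal_best_hits(parasite_vs_human, human_vs_parasite))
```

In the toy data, fg2 is not retained because its best human hit (hs2) has a different best parasite hit (fg1), which is the asymmetry that the reciprocal criterion is designed to filter out.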
We downloaded the interaction file (protein.links.v11.0) from the STRING database [85], and only highly credible PPIs were retained by excluding PPIs with confidence scores below 0.7. The final STRING network was plotted using Cytoscape [86]. We chose the longest transcript in the downloaded annotation dataset to represent each gene, and removed genes with open reading frames shorter than 150 bp. Gene family clustering was then performed using OrthoFinder (v2.3.12) [87], based on the predicted gene sets for eight genomes, including F. gigantica (our assembly), F. hepatica (NCBI: GCA_002763495.2), Fasciolopsis buski (NCBI: GCA_008360955.1), Clonorchis sinensis (NCBI: GCA_003604175.1), Schistosoma mansoni (NCBI: GCF_000237925.1), Taenia multiceps (NCBI: GCA_001923025.3), swamp buffalo (BIG sub: GWHAAJZ00000000), and human (NCBI: GCF_000001405.39). This analysis yielded 17,992 gene families. To identify gene families that had undergone expansion or contraction, we applied the CAFE (v5.0.0) program [88], which infers the rate and direction of changes in gene family size over a given phylogeny. Among the eight species, 559 single-copy orthologs were aligned using MUSCLE (v3.8.1551) [89], and we eliminated poorly aligned positions and divergent regions of the alignment using Gblocks 0.91b [90]. RAxML (v8.2.12) was then used with the PROTGAMMALGF model to estimate a maximum likelihood tree. Divergence times were estimated using PAML MCMCTREE [91]. A Markov chain Monte Carlo (MCMC) process was run for 2,000,000 iterations, with a sample frequency of 100 after a burn-in of 1,000 iterations under an independent rates model. Two independent runs were performed to check convergence. 
The root height of the species tree was set using the age of animals (602–661 Ma) estimated in a previous fossil-calibrated eukaryotic phylogeny [92], and the divergence time between Euarchontoglires and Laurasiatheria (95.3–113 Ma) [93] was used as an additional calibration. To enhance the reproducibility of the results, we deposited the laboratory protocols in protocols.io (protocol DOI: http://dx.doi.org/10.17504/protocols.io.bxatpien).
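The STRING confidence filter used in the host-parasite interaction analysis can be sketched as follows. This is a minimal sketch, assuming the links file is whitespace-delimited with a header row and that its combined score column is stored on a 0 to 1000 scale, so that a confidence of 0.7 corresponds to 700. The function name is illustrative and the protein identifiers in the example are hypothetical.

```python
import io


def filter_string_links(handle, min_score=700):
    """Keep interactions whose combined score is at least min_score.

    handle: text stream with a header line followed by rows of
            'protein1 protein2 combined_score'.
    Returns a list of (protein1, protein2, score) tuples.
    """
    handle.readline()  # skip the header row
    kept = []
    for line in handle:
        parts = line.split()
        if len(parts) != 3:
            continue  # ignore blank or malformed rows
        protein1, protein2, score = parts[0], parts[1], int(parts[2])
        if score >= min_score:
            kept.append((protein1, protein2, score))
    return kept


# Hypothetical rows, for illustration only.
example = io.StringIO(
    "protein1 protein2 combined_score\n"
    "9606.P1 9606.P2 950\n"
    "9606.P1 9606.P3 400\n"
    "9606.P2 9606.P3 700\n"
)
print(filter_string_links(example))
```

Reading the file as a stream rather than loading it whole keeps memory use low, which matters because the full links file for a single species contains millions of rows.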

Genome-wide all-by-all chromosome conformation capture sequencing (Hi-C) interaction in F. gigantica (Bins = 500 K). (TIF)

Comparison of chromosome length between the chromosome conformation capture sequencing (Hi-C) assembly and estimates from published karyotype data by Jae Ku Rhee. (TIF)

Boxplot of average gene length. (TIF)

Boxplot of average coding sequence (CDS) length per gene. (TIF)

Divergence distribution of classified families of transposable elements. The classified transposon families in F. gigantica. (TIF)

Expansion time of long terminal repeats (LTRs) and long interspersed elements (LINEs). The mutation rate was 1.73 × 10^-9. (TIF)

Estimation of F. gigantica genome size based on the expansion time of repeat sequences during evolution. The mutation rate was 1.73 × 10^-9. (TIF)

Overlapping E/S proteins between this study and proteomic study by Di Maggio LS et al [94]. (TIF)

Genome sequencing strategy for buffaloes. (XLSX)

Summary of the Fasciola gigantica genome assembly. (XLSX)

Summary of different assemblies in Fasciola species. (XLSX)

Summary of chromosome conformation capture sequencing (Hi-C) assembly of the chromosome length in Fasciola gigantica. (XLSX)

Assessment of the completeness and accuracy of the genome. (XLSX)

BUSCO assessment of the genome. (XLSX)

Number of genes with functional classification gained using various methods. (XLSX)

Transposable element content of Fasciola gigantica genome. (XLSX)

The list of genes with more than 10 kb of long interspersed element (LINE) insertion between 41 Ma and 62 Ma. (XLSX)

Gene ontology (GO) term category enrichment for genes with more than 10 kb of long interspersed element (LINE) insertion between 41 Ma and 62 Ma. (XLSX)

Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment for genes with more than 10 kb of long interspersed element (LINE) insertion between 41 Ma and 62 Ma. (XLSX)

Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment for genes with more than 10 kb of long interspersed element (LINE) insertion between 41 Ma and 62 Ma. (XLSX)

Protein inhibitors in the Fasciola gigantica genome. (XLSX)

Excretory/secretory (E/S) proteins in the Fasciola gigantica genome. (XLSX)

Gene ontology (GO) term category enrichment for excretory/secretory (E/S) proteins. (XLSX)

Gene ontology (GO) term category enrichment for rapidly evolving families specific to F. gigantica. (XLSX)

Function annotation based on human uniprot gene using blastp with E-value < 10^-4. (XLSX)

AGP file for Fasciola gigantica.txt. (DOC)

Transfer Alert

This paper was transferred from another journal. As a result, its full editorial history (including decision letters, peer reviews and author responses) may not be present. 24 May 2021 Dear Dr. Liu, Thank you very much for submitting your manuscript "High-quality reference genome of Fasciola gigantica: Insights into the genomic signatures of transposon-mediated evolution and specific parasitic adaption in tropical regions" for consideration at PLOS Neglected Tropical Diseases. As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. In light of the reviews (below this email), we would like to invite the resubmission of a significantly-revised version that takes into account the reviewers' comments. We cannot make any decision about publication until we have seen the revised manuscript and your response to the reviewers' comments. Your revised manuscript is also likely to be sent to reviewers for further evaluation. When you are ready to resubmit, please upload the following: [1] A letter containing a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out. [2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file). Important additional instructions are given below your reviewer comments. Please prepare and submit your revised manuscript within 60 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email. 
Please note that revised manuscripts received after the 60-day due date may require evaluation and peer review similar to newly submitted manuscripts. Thank you again for your submission. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments. Sincerely, Neil David Young Associate Editor PLOS Neglected Tropical Diseases Makedonka Mitreva Deputy Editor PLOS Neglected Tropical Diseases *********************** Reviewer's Responses to Questions Key Review Criteria Required for Acceptance? As you describe the new analyses required for acceptance, please consider the following: Methods -Are the objectives of the study clearly articulated with a clear testable hypothesis stated? -Is the study design appropriate to address the stated objectives? -Is the population clearly described and appropriate for the hypothesis being tested? -Is the sample size sufficient to ensure adequate power to address the hypothesis being tested? -Were correct statistical analysis used to support conclusions? -Are there concerns about ethical or regulatory requirements being met? Reviewer #1: 1. How many parasites were used for the DNA extraction and from which life cycle stage? If more than one parasite was used detail how the sequencing libraries were prepared and from which samples. 2. Lines 454-456 - detail genome/transcriptome assemblies used here, provide accession/identifier numbers. 3. Line 502 - include details of the 8 genomes used here - species names and genome identifiers. Reviewer #2: Major points: 1. Lines 469-470: The Authors should clarify which species were uses to produce single copy orthologs and where the underlying evidence to estimate the speciation events among these species was collected from? Did the author have for instance estimates from fossil data for calibration? 
Also, the authors should give more detailed description for the calculation of mutation rate because plain referral to MCMC and CDS alignments do not make the method repeatable for other researches. Reviewer #3: The idea behind the study is clear and sound. Methods in genome sequencing, assembly and annotation seem appropriate. Some of the downstream analysis would need a bit of improvement -------------------- Results -Does the analysis presented match the analysis plan? -Are the results clearly and completely presented? -Are the figures (Tables, Images) of sufficient quality for clarity? Reviewer #1: 1. Lines 42 and in the results section - the authors describe how they identified ES proteins that were used to investigate protein-protein interactions between host and parasite. a. How were these proteins identified/classified? b. Did the authors base their predictions solely on whether the proteins contained a signal peptide and therefore would be likely to be secreted? c. Analysis from F. hepatica ES proteome studies has shown that proteins that do not have signal peptides are also present in the ES products, particularly within the EVs. Did the authors check their predicted ES proteins to known ES proteome datasets for Fasciola to confirm this list? d. The authors should mention that these proteins represented the predicted secretome/ES proteins rather than stating these are ES proteins, unless confirmed. This data also impacts on how the PPI data should be interpreted. 2. Figures - all the figures are composite figures - the authors should modify the font size of the text so that it will be visible upon publication. Reviewer #3: The article provide a significant improvement in the knowledge of the genome of F.gigantica, that might be a very useful resource. In general the results are presented in a clear way, although some clarifications and improvements can be made as detailed below. 
In general the figures are of good quality, the supplementary information provided in tables need improvement, particularly by providing a reasonable annotation of individual recognizable genes. -------------------- Conclusions -Are the conclusions supported by the data presented? -Are the limitations of analysis clearly described? -Do the authors discuss how these data can be helpful to advance our understanding of the topic under study? -Is public health relevance addressed? Reviewer #1: (No Response) Reviewer #3: Most of the ideas behind the article are sound and supported by the data, although some analysis might need to be revised. The discussion in general stress relevant points, that are open for discussion The subject is relevant and the data generated might be a valuable resource -------------------- Editorial and Data Presentation Modifications? Use this section for editorial suggestions as well as relatively minor modifications of existing data that would enhance clarity. If the only modifications needed are minor and/or editorial, you may wish to recommend “Minor Revision” or “Accept”. Reviewer #1: 1. Line 28 and throughout the manuscript – fascioliasis should not be in italics, however, Fasciola should be in italics. 2. Line 59 – prevalence of F. gigantica infection. 3. Line 208 – sentence should read as ‘genes have’ or ‘gene has’ depending on the number of genes the authors are referring to. 4. Line 333 - of the rapidly evolving 5. Line 364 - networks 6. Lines 226-227 - rationale for abbreviation AAASs when the abbreviation for aminoacyl tRNA synthetases is typically AARSs- also why is the abbreviation in italics? The AARSs abbreviation is used on line 227 but is not defined. 7. References: a. Lines 85-86 – Reference [1] is not the correct reference for this statement relating to the mode of action of TCBZ in disrupting beta tubulin polymerization. b. References 7 and 45 are the same; References 10 and 29 are the same. Reviewer #2: Minor points 1. 
Line 165: would -> would be 2. Line 281: is mainly -> mainly 3. Line 304: emphasize -> emphasizes 4. Line 469: 173x10-9 -> 173x10-9 per basepair per year Reviewer #3: A minor suggestion, I feel that swapping some paragraphs in the intro might improve the understanding without changing the meaning. Fisrt sentence of last paragraph (lines 92-93) would be better after line 62, that can be followed by the 3rd paragraph of the intro (lines 77 -91). This would join all the info on the biology of the parasite, before presenting the status on trematode omics and particularly Fasciolidae data (lines 63-76). -------------------- Summary and General Comments Use this section to provide overall comments, discuss strengths/weaknesses of the study, novelty, significance, general execution and scholarship. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. If requesting major revision, please articulate the new experiments that are needed. Reviewer #1: The manuscript by Luo and colleagues describes a high quality genome assembly for Fasciola gigantica. This genome dataset represents an improved dataset for the tropical liver fluke, which has been assembled into 10 potential chromosomes. This study paves the way for improvements for the other Fasciola species genome allowing future comparatative analyses and provides novel insight for liver fluke biology. The manuscript is well written and is suitable for publication in PLoS NTD. Reviewer #2: This manuscript introduces reference genome for a socio-economically important trematode Fasciola gigantica and hypothesizes both the role of both transposon expansion event and uses single copy gene to infer speciation events between Fasciola and Fasciolopsis and between Fasciola gigantica and Fasciola hepatica, respectively. 
Moreover, the authors hypothesize the role of gene family expansions to the size of the trematode and its development to epidemic in tropical and subtropical regions. The assembled genome for F. gigantiga described in this manuscript clearly has major importance to the community studying parasites and the diseases they cause. The manuscript is well written, the methods for the assembly and annotation follow currently known best practices (I could only slightly criticize the use of SNAP in de novo prediction). Apart from the required clarification to the calculation of the mutation rate, and few minor enhancements, there are not objections to publish the manuscript. Reviewer #3: The manuscript presented by Luo et al provides a novel assembly of the genome of the liver fluke Fasciola gigantica, and analyze the plausible role of transposons in driving adaptation in this species. The work is well written and contributes a good wealth of data, providing an almost chromosome level assembly of this relevant species, with a good deal of info on supplementary data, that would undoubtedly move forward the knowledge of Fasciolidae biology. I feel the article has a lot of strengths, but also some weakness that might be improved a bit in order to get it published. These lay essentially in the analysis of the repeats and the prediction of interactions between parasite and host genes, and the absence of an annotated list of genes that should be provided. Anyway I provide below a more detailed account of doubts and questions that I feel can clarify the findings and their relevance. 1. The first section describes the novel assembly that integrates long reads, paired reads and HiC data. This results in a slightly larger genome than previously published (refs 5-8), consistent with a better resolution of probably collapsed repeat sequences that here are resolved. 
Gene content is similar to previously described genomes of Fasciolidae, although functional annotation results in roughly 2/3 of genes with assigned function (either on Interpro, GO or KEGG). I wonder from table S7 how many genes have no associated function whatsoever, is a minor point, but would stress the size of the unknown bin. 2. When looking at all the supplementary data, I noticed that an annotation file with the identified genes and their putative annotation is missing. This is a relevant tool that should be included, either as a supplementary table and/or as a gff annotation file with the genome assembly. 3. The notion that genes were larger in Fasciolidae than in other trematodes (lines 129-131) have already been advanced, and figures S3 and s4 are almost identical to figure 1C from Choi et al, ref 7, that also includes a comparison of intron lengths. This should be properly referred and discussed. By the way ref 7 and 45 are the same…please correct. 4. The second section focus on the repeats expansions that resulted in larger genomes. The authors estimate two expansions events and time them to 12 and 65 Ma (lines 153-155). I feel that the procedures for the estimation of these times need to be further clarified. 5. The authors made the reasonable hypothesis that they might be underlying the increase in genome size, and probably providing evolutive advantages (lines 156-160). This idea, and the distribution of repeats between intergenic and intronic regions (next point) were also previously advanced by Choi et al, and the similarities and differences should be acknowledged. 6. The comparison of the distribution of repeats between intronic and intergenic regions (lines 161-185) is a reasonable thing to do, but I feel that the description of the procedures and results are not fully clear. I assume that “the relative proportion of TEs between intronic or intergenic regions” is calculated in all intronic vs all intergenic repeats. 
What is the “inferred size” of these regions? What is the corrective repeat ratio that appears in Fig 2B, and how is it calculated?

7. The idea of purifying selection acting on intronic regions, except for LINEs (line 172), would imply that a statistically significant difference in TE types exists within introns in comparison with intergenic repeats. I feel this is missing, or I cannot work it out from Fig 2B.

8. While this conveys the idea of LINEs being enriched in intronic sequences, a different picture has been presented previously by Choi et al., showing LINE enrichment in both intronic and intergenic regions (see Fig 2B of ref 7), but possibly consisting of different LINE types. Are the results consistent with this? It would be good to discuss it.

9. The idea of selecting a set of enlarged Fasciola introns and comparing them to other trematode species is a good one, and searching for functions through GO enrichment is also reasonable (lines 187-197), but I feel that the functions highlighted do not reflect the diversity described in Fig 2C or in the supplementary tables (by the way, these are slightly different; there are missing categories in one or the other).

10. Although it is a good proxy for what might be happening, I found it strange that some highly represented categories, such as signal transduction, were not highlighted, especially considering that components such as protein phosphorylation or GTPase activators are also enriched.

11. Several questions come to mind on the finding of particular groups of genes, like tRNA synthetases (lines 225-231) or histone modifiers (lines 234-240), as enriched in long introns. For example, are there other copies of these genes in the genome that were not affected by the inclusion of long introns? Are the large introns in a particular position (more 5’ or 3’) within the genes? Interesting…

12. The fourth section attempts to find genes that interact with those of the host, and here is where I have more concerns.
First, proteases and inhibitors are identified by comparison with the MEROPS database (lines 244-257); however, the annotation provided does not allow one to assess whether these results have, for example, improved the annotation of these relevant gene families in relation to previously published assemblies (refs 5-8). By analyzing the supplementary tables I noticed that the gene annotation based on this assembly is not provided. The tables provide access to MEROPS entries, but these are not correlated with annotated genes in FGIG. I assume that repeated MEROPS entries that appear in the table correspond to diverse genes in the FGIG genome that best match these MEROPS entries, but this needs to be clarified. Gene names should be corrected by providing a unique identifier for each gene as well as its putative annotation. This is particularly relevant in gene families such as those that appear in Suppl. Table 15. It seems that approximately 17 legumain genes have been found, but if all share the same name it is impossible to evaluate, for example, whether they are differentially expressed in diverse life stages or tissues. A complete annotation of the gene list should be provided, as previously stated (in point 2).

13. The major concern, however, is the assumption of interactions based on the protein-protein interaction analysis (lines 260-280). I feel this analysis is essentially incorrect. Interactions were inferred by selecting host intestinal and liver surface and secreted proteins, and those also expressed by the parasite. I wonder how many of the FGIG surface proteins mapped to human counterparts. Besides this, the interactions reported seem to be based on those described for the mammalian counterparts of the FGIG proteins. These are proven interactions within mammals, but there is no line of evidence so far that the same interactions would take place with heterologous proteins from the parasite.
Extreme care should be taken in inferring biologically significant interactions without further experimental evidence. I feel this whole section is overinterpreting the results.

14. I also have some doubts about the gene family analysis, since some of the F. gigantica-specific families highlighted correspond to quite conserved GO functions that surely are present in F. hepatica and other trematodes. Consequently, they should represent duplication events exclusive to F. gigantica, which can easily be analyzed in more detail. Unfortunately, the poor annotation of genes does not allow this to be tested since, rather than a list of the genes that constitute the family, we have a database hit name repeated as many times as there are copies. As already mentioned, good annotation would allow a better understanding of the results.

In general, I feel that the article has interesting novel data but has some weaknesses in the analysis, most of which can reasonably be sorted out before publication.

--------------------

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No
Reviewer #2: No
Reviewer #3: No

Figure Files: While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool.
If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

Data Requirements: Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript. Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms etc. For an example see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5.

Reproducibility: To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

21 Jun 2021

Submitted filename: Response to reviewers.doc

28 Jul 2021

Dear Dr. Liu,

Thank you very much for submitting your manuscript "High-quality reference genome of Fasciola gigantica: Insights into the genomic signatures of transposon-mediated evolution and specific parasitic adaption in tropical regions" for consideration at PLOS Neglected Tropical Diseases. As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. The reviewers appreciated the attention to an important topic. Based on the reviews, we are likely to accept this manuscript for publication, providing that you modify the manuscript according to the review recommendations.

Thank you for submitting your rejoinder.
Please submit a word document with changes in the document highlighted or using tracked changes. Please make sure that questions raised by each reviewer are addressed in the manuscript. Also, please include a table with a summary of the annotation linked to each gene. Please see other minor comments from the two reviewers.

Please prepare and submit your revised manuscript within 30 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email.

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to all review comments, and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Thank you again for your submission to our journal. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Neil David Young
Associate Editor
PLOS Neglected Tropical Diseases

Makedonka Mitreva
Deputy Editor
PLOS Neglected Tropical Diseases

***********************

Reviewer's Responses to Questions

Key Review Criteria Required for Acceptance?

As you describe the new analyses required for acceptance, please consider the following:

Methods
-Are the objectives of the study clearly articulated with a clear testable hypothesis stated?
-Is the study design appropriate to address the stated objectives?
-Is the population clearly described and appropriate for the hypothesis being tested?
-Is the sample size sufficient to ensure adequate power to address the hypothesis being tested?
-Were correct statistical analysis used to support conclusions?
-Are there concerns about ethical or regulatory requirements being met?

Reviewer #3: Some methodological issues are still present; see below.

--------------------

Results
-Does the analysis presented match the analysis plan?
-Are the results clearly and completely presented?
-Are the figures (Tables, Images) of sufficient quality for clarity?

Reviewer #3: Please see below in general comments.

--------------------

Conclusions
-Are the conclusions supported by the data presented?
-Are the limitations of analysis clearly described?
-Do the authors discuss how these data can be helpful to advance our understanding of the topic under study?
-Is public health relevance addressed?

Reviewer #3: Fine. See below.

--------------------

Editorial and Data Presentation Modifications?

Use this section for editorial suggestions as well as relatively minor modifications of existing data that would enhance clarity. If the only modifications needed are minor and/or editorial, you may wish to recommend “Minor Revision” or “Accept”.

Reviewer #3: Some changes in the way data are presented are still needed. See below.

--------------------

Summary and General Comments

Use this section to provide overall comments, discuss strengths/weaknesses of the study, novelty, significance, general execution and scholarship.
You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. If requesting major revision, please articulate the new experiments that are needed.

Reviewer #1: The authors have addressed the comments raised by the previous review; however, several of the points included in their response have not been incorporated into the manuscript (see below). Following the edits/comments below, the manuscript will be suitable for publication.

1. For example, the authors stated that only one adult fluke was used for the sequencing. This needs to be included in the methods. Please include, where relevant, the points raised in the review in the text.

2. Also, it was difficult to see where the changes were made, as the authors used the comment box on the PDF to add in their changes. The edits apparently made in the highlighted manuscript, such as lines 161-172 relating to the purifying selection, were not incorporated in the final version. In the next review please include a highlighted manuscript using tracked changes or highlighting the text in colour so that the changes can be followed.

3. Line 105 - The reference included for the beta tubulin polymerization is still not right. Include a relevant reference.

4. Lines 221-224 - Although the authors state that they checked their predicted secreted sequences against the EV proteins published by de la Torre et al., the authors should also check their data against the available secretome data, not just the EV proteins, to identify the actual proteins present in the ES products.

5. Line 309-310 - How did the authors perform this analysis? Is this based on their predicted secretome data or on proteomic analysis?

6. Line 383 - amend the sentence to: Choi et al.

7. Lines 544-547 - The species names in this section should be in italics.

Reviewer #3: COMMENTS ON REVISED VERSION OF LUO ET AL.
While most of the issues and questions have been considered, some of the main issues are still there, and need to be corrected in order to make the manuscript acceptable for publication.

1. The main issue is still the poor annotation, devoid of unique identifiers for the genes, which makes most of the data presented in tables and figures impossible to follow. While the manuscript is an excellent effort in assembling a chromosome-level genome, the usefulness of this resource for the community upon publication would be limited by this issue, making it not comparable to other available assemblies. Following one of the suggestions, a GFF table is included (A23). While that provides detailed information for those more interested, it is not a solution for the ordinary reader interested in looking up the information for any one of the genes that appear in Tables 9-16. There is no place to go, and since genes are named based on their hits, when these are repeated (as frequently occurs) there is no way of identifying a particular gene and differentiating it from other members of the same family. It is unreasonable to expect that each reader interested in a particular gene should download and filter a GFF table (as suggested in responses A33 and A35). Providing appropriate and clear information is a task for the authors, not the readers.

2. A very simple solution for this is a table with the 12503 identified genes, each with a unique ID and a definition. Besides this, all tables should include gene IDs (these can be combined with names, but unique identifiers are mandatory) in order to make them useful and verifiable. This is a main issue and should be solved before publication.
The work has achieved excellent results in assembling and annotating the genome, and it would be a pity if it were published in a way that requires other researchers to run complex filtering steps and/or reanalyze the data to find information that the authors already have and that should be made available in a form the community can easily use.

3. The annotation weakness also compromises the protein-protein interaction analysis. I agree, as stated in response A34, that this could be an approach to gain information based on the better-analyzed protein interactions in the host, but we have to be very careful in the design of the assay and in the interpretation of the results. In order to make this kind of analysis, we have to be sure that the proteins considered are true orthologues, i.e. that the actual parasite gene considered is the unique orthologue of the host one detected in the interaction. Several of the proteins included might be part of gene families, so the question of finding the correct orthologue (and not paralogues) is not trivial. While it is stated that reciprocal BLAST best hits were considered, the absence of proper identifiers makes this untestable. Here again, having good unique gene IDs is essential.

4. Similarly, in relation to the gene family analysis, it is not a question of whether it is reasonable to believe the results, as stated in answer 33, but rather of whether the information provided sustains the results. I do believe that the authors have performed the experiments well and have sound results, but I cannot find useful information in what is published, since I cannot identify different genes of the same family. Questions related to these families, their origins and their similarities with those from other species remain obscure if we cannot pick the appropriate genes to compare and advance our knowledge of them.

5.
Other issues related to the repeats, and to the acknowledgement and discussion of previous work along the same lines, have been taken into account, with minor modifications made in the discussion (responses A24, A26 and A29). Since some of the results here mirror those already published, I would have expected them to be commented on earlier than in the discussion, but this is a matter of opinion.

6. Other questions related to repeat expansion analysis and selection (A25-28) were answered or clarified appropriately. Thanks for that.

References

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article's retracted status in the References list and also include a citation and full reference for the retraction notice.

8 Aug 2021

Submitted filename: 2nd-Response to reviewers.doc

10 Aug 2021

Dear Dr. Liu,

Thank you very much for submitting your manuscript "High-quality reference genome of Fasciola gigantica: Insights into the genomic signatures of transposon-mediated evolution and specific parasitic adaption in tropical regions" for consideration at PLOS Neglected Tropical Diseases. As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. The reviewers appreciated the attention to an important topic. Based on the reviews, we are likely to accept this manuscript for publication, providing that you modify the manuscript according to the review recommendations.

Thank you for addressing comments from each reviewer. I am satisfied that you have addressed their comments and improved the quality of the manuscript. Please format the references correctly and resubmit the manuscript. Please check the format suggested for PLoS NTD and then consider the specific modifications:

The species name should be italicised.
A decision should be made on the use of sentence case or capitalising every word in the title.
Ensure consistency of information displayed (e.g.
doi numbers, PubMed IDs).

Please prepare and submit your revised manuscript within 30 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email.

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to all review comments, and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Thank you again for your submission to our journal. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Neil David Young
Associate Editor
PLOS Neglected Tropical Diseases

Makedonka Mitreva
Deputy Editor
PLOS Neglected Tropical Diseases

12 Aug 2021

Submitted filename: Response to reviewers.doc

18 Aug 2021

Dear Dr. Liu,

Thank you very much for submitting your manuscript "High-quality reference genome of Fasciola gigantica: Insights into the genomic signatures of transposon-mediated evolution and specific parasitic adaption in tropical regions" for consideration at PLOS Neglected Tropical Diseases. As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. The reviewers appreciated the attention to an important topic. Based on the reviews, we are likely to accept this manuscript for publication, providing that you modify the manuscript according to the review recommendations.

In the R2 version of this manuscript, I can see references that are still incorrectly formatted. There are several discrepancies with the format of your reference list. Reference format needs to be unified and the following addressed. Check the format suggested for PLoS NTD and then consider the specific modifications:

The species names must be italicised.
A decision should be made on the use of sentence case or to capitalise every word or letter. There is a mix of all styles at the moment.
Ensure consistency of information displayed (e.g. doi numbers, PubMed IDs).

Please prepare and submit your revised manuscript within 30 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email.

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to all review comments, and a description of the changes you have made in the manuscript.
Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Thank you again for your submission to our journal. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Neil David Young
Associate Editor
PLOS Neglected Tropical Diseases

Makedonka Mitreva
Deputy Editor
PLOS Neglected Tropical Diseases

19 Aug 2021

Submitted filename: Response to reviewers.doc

23 Aug 2021

Dear Dr.
Liu,

We are pleased to inform you that your manuscript 'High-quality reference genome of Fasciola gigantica: Insights into the genomic signatures of transposon-mediated evolution and specific parasitic adaption in tropical regions' has been provisionally accepted for publication in PLOS Neglected Tropical Diseases.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests. Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Neglected Tropical Diseases.

Best regards,

Neil David Young
Associate Editor
PLOS Neglected Tropical Diseases

Makedonka Mitreva
Deputy Editor
PLOS Neglected Tropical Diseases

***********************************************************

1 Oct 2021

Dear Dr. Liu,

We are delighted to inform you that your manuscript, "High-quality reference genome of Fasciola gigantica: Insights into the genomic signatures of transposon-mediated evolution and specific parasitic adaption in tropical regions," has been formally accepted for publication in PLOS Neglected Tropical Diseases.
We have now passed your article onto the PLOS Production Department who will complete the rest of the publication process. All authors will receive a confirmation email upon publication. The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any scientific or type-setting errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript. Note: Proofs for Front Matter articles (Editorial, Viewpoint, Symposium, Review, etc...) are generated on a different schedule and may not be made available as quickly. Soon after your final files are uploaded, the early version of your manuscript will be published online unless you opted out of this process. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers. Thank you again for supporting open-access publishing; we are looking forward to publishing your work in PLOS Neglected Tropical Diseases. Best regards, Shaden Kamhawi co-Editor-in-Chief PLOS Neglected Tropical Diseases Paul Brindley co-Editor-in-Chief PLOS Neglected Tropical Diseases