
De novo transcriptome assembly and annotation of the third stage larvae of the zoonotic parasite Anisakis pegreffii.

Marialetizia Palomba1, Pietro Libro1, Simonetta Mattiucci2,3, Tiziana Castrignanò1, Jessica Di Martino1, Aurelia Rughetti4, Mario Santoro5.   

Abstract

OBJECTIVES: Anisakis pegreffii is a zoonotic parasite that requires marine organisms to complete its life history. Human infection (anisakiasis) occurs when third-stage larvae (L3) are accidentally ingested with raw or undercooked infected fish or squid. A new de novo transcriptome of A. pegreffii was generated here to provide a robust, comprehensive, "ready-to-use" resource for functional studies on the genes and gene products of A. pegreffii involved in the molecular mechanisms of parasite-host interaction. DATA DESCRIPTION: An RNA-seq library of A. pegreffii L3 was newly generated using the Illumina TruSeq platform. It was combined with five other RNA-seq datasets previously obtained from L3 of the same species and stored in the NCBI SRA. The final dataset was analyzed with three assembler programs and two validation tools. This pipeline produced a high-confidence protein-coding transcriptome of A. pegreffii. These data represent a more robust and complete transcriptome of this species than the currently existing resources. This is important for understanding the adaptive and immunomodulatory genes implicated in the "cross talk" between the parasite and its hosts, including the accidental one (humans).
© 2022. The Author(s).

Keywords:  Anisakis pegreffii; De novo assembly; Gene annotation; Transcriptome; Zoonotic parasite

Year:  2022        PMID: 35752825      PMCID: PMC9233829          DOI: 10.1186/s13104-022-06099-9

Source DB:  PubMed          Journal:  BMC Res Notes        ISSN: 1756-0500


Objective

Anisakis pegreffii is a parasitic nematode belonging to the A. simplex (s.l.) species complex [1, 2]. It has a heteroxenous life cycle involving mainly cetaceans as definitive hosts, crustaceans as first intermediate hosts, and fish and squid as intermediate/paratenic hosts. Its geographical distribution includes the Mediterranean Sea, the Iberian Atlantic coast waters, and the Austral region waters between 30°S and 60°S. In humans, the accidental ingestion of third-stage larvae (L3) through the consumption of infected raw, undercooked, or improperly processed fish causes a zoonosis known as anisakiasis. Of the nine currently recognized biological species of the genus, only A. pegreffii and A. simplex (s.s.) are so far known to cause anisakiasis [1, 3, 4]. The investigation of the genes and proteins of A. pegreffii is crucial for understanding the parasite's biological functions and its adaptation to abiotic and biotic conditions. It is also fundamental for improving knowledge of the molecular mechanisms involved in the evolutionary host-parasite interaction. Additionally, the molecules involved in the interaction between A. pegreffii and humans have not yet been elucidated. Finally, the absence of a suitable reference genome for this parasite species makes those goals difficult to achieve. Although several RNA-seq analyses of A. pegreffii L3 under different experimental conditions and from different larval tissues have been carried out [5-9], a complete "ready to use" transcriptome is missing. The objective of this research was to provide a robust, high-confidence protein-coding transcriptome of the L3 stage of A. pegreffii by assembling the data newly generated in the present study together with those previously stored. The result is intended to be a more accurate de novo reference transcriptome of A. pegreffii that will help shed light on the genes implicated in the "cross-talk" between the parasite and its natural and accidental hosts.

Data description

The input dataset for the de novo assembly of A. pegreffii L3 was composed of six RNA-seq datasets (Table 1, Data files 1 and 2): one obtained in the present study (PRJNA752284) (Table 1, Data file 2) and five retrieved from the NCBI SRA (PRJNA589243, PRJNA602791, PRJNA374530, PRJNA316941, PRJNA312925). To obtain the RNA-seq dataset of this study, A. pegreffii L3 collected from the viscera of fish from the Mediterranean Sea were maintained in in vitro culture for 24 h. RNA and DNA were extracted from nine L3 using TRIzol reagent, as previously described [10, 11]. The RNA extracted from each set of three L3 was pooled, and the quantity check was performed using an Agilent 2100 Bioanalyzer. The cDNA library was prepared using the TruSeq Stranded mRNA kit (Illumina). Ligated products of 200 bp were excised from agarose gels and PCR amplified. Products were single-end sequenced on an Illumina TruSeq platform. Genetic/molecular identification of the A. pegreffii L3 was performed by sequence analysis of mitochondrial (mtDNA cox2) and nuclear (EF1 α-1 nDNA, nas 10 nDNA) gene loci, as previously described [12].
Table 1

Overview of data files/data sets

| Label | Name of data file/data set | File types (file extension) | Data repository and identifier (DOI or accession number) |
|---|---|---|---|
| Data file 1 | RNA-seq datasets from NCBI | Table file (.doc) | Figshare https://doi.org/10.6084/m9.figshare.19174214 [24] |
| Data file 2 | RNA-seq dataset obtained in this study | Fastq file (.fastq) | NCBI https://identifiers.org/ncbi/bioproject:PRJNA752284 [25] |
| Data file 3 | MultiQC reads quality results | Image file (.jpg) | Figshare https://doi.org/10.6084/m9.figshare.18480635 [26] |
| Data file 4 | Trinity RNA-seq de novo transcriptome assembly | Fasta file (.fasta) | Figshare https://doi.org/10.6084/m9.figshare.18300896 [27] |
| Data file 5 | rnaSPAdes RNA-seq de novo transcriptome assembly | Fasta file (.fasta) | Figshare https://doi.org/10.6084/m9.figshare.18301337 [28] |
| Data file 6 | Oases RNA-seq de novo transcriptome assembly | Fasta file (.fasta) | Figshare https://doi.org/10.6084/m9.figshare.18480689 [29] |
| Data file 7 | Anisakis pegreffii RNA-seq de novo transcriptome assembly | Fastq file (.fastq) | ENA https://identifiers.org/ena.embl:ERZ5400090 [30] |
| Data file 8 | Unigenes | Fasta file (.fasta) | Figshare https://doi.org/10.6084/m9.figshare.18301772 [31] |
| Data file 9 | Open reading frames (ORFs) prediction | Fasta file (.fasta) | Figshare https://doi.org/10.6084/m9.figshare.18302102 [32] |
| Data file 10 | Functional annotation from non-redundant (nr) NCBI | Text file (.txt) | Figshare https://doi.org/10.6084/m9.figshare.18295190 [33] |
| Data file 11 | Functional annotation from Swiss-Prot | Text file (.txt) | Figshare https://doi.org/10.6084/m9.figshare.18295970 [34] |
| Data file 12 | Functional annotation from TrEMBL UniProt | Text file (.txt) | Figshare https://doi.org/10.6084/m9.figshare.18296603 [35] |
| Data file 13 | Functional annotation from non-redundant (nr) protein NCBI | Text file (.txt) | Figshare https://doi.org/10.6084/m9.figshare.18296933 [36] |
| Data file 14 | Functional annotation from Swiss-Prot Protein | Text file (.txt) | Figshare https://doi.org/10.6084/m9.figshare.18297410 [37] |
| Data file 15 | Functional annotation from TrEMBL UniProt Protein | Text file (.txt) | Figshare https://doi.org/10.6084/m9.figshare.18297938 [38] |
| Data file 16 | InterProScan results | Text file (.txt) | Figshare https://doi.org/10.6084/m9.figshare.18298319 [39] |

Bioinformatic analysis was performed on a High-Performance-Computing platform [13]. For each bioproject, read quality was checked with FastQC v.0.11.2 before and after the trimming step (Trimmomatic v.0.39 [14]). The quality assessment metrics for all trimmed data were aggregated with MultiQC v.1.9 [15]. Data file 3 (Table 1) shows the mean read counts per quality score and the mean quality scores at each base position, with scores higher than 35, for all the samples in the six analyzed bioprojects. A total of 393,512,048 cleaned reads (97% of all raw reads) were obtained after removal of the low-quality reads. To construct a robust de novo transcriptome, three assembly tools with a multi-k-mer approach were adopted: Trinity v.2.11.0 [16] (Table 1, Data file 4), rnaSPAdes v.3.14.1 [17] (Table 1, Data file 5) and Oases v.0.2.09 [18] (Table 1, Data file 6). The results of each assembler were merged with Trans-ABySS v.2.0.1 [19] (Table 1, Data file 7). The merged assembly of A. pegreffii showed an average length of 939 bp and an N50 of 2859 bp. The assembly was validated with two tools: BUSCO v.4.1.4 [20] and TransRate v.1.0.3 [21]. CD-HIT-EST v.4.8.1 was run on the merged assembly to remove redundant transcripts. A total of 394,635 unigenes were obtained (Table 1, Data file 8) and the quality check was repeated. A total of 260,872 ORFs were predicted using TransDecoder v.5.5.0 [22] (Table 1, Data file 9). The functional annotation of contigs was performed using DIAMOND v.2.0.11 [23], calling both the blastp and blastx functions against three databases (Nr, SwissProt and TrEMBL). The blastx results comprised 86,982 (88.93%), 56,997 (58.47%) and 87,134 (89.39%) sequences against Nr (Table 1, Data file 10), SwissProt (Table 1, Data file 11) and TrEMBL (Table 1, Data file 12), respectively. The mapped transcripts listed in Data file 10 yielded 38,972 matches (hits) with A. simplex.
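The average length and N50 reported for the merged assembly are standard contiguity statistics. As an illustration only, the following minimal Python sketch shows how such values can be computed from a FASTA file. The function names `fasta_lengths` and `n50` and the toy records are hypothetical and are not part of the published pipeline, which does not state which tool produced these figures.

```python
def fasta_lengths(lines):
    """Yield the sequence length of each record in FASTA-formatted lines."""
    current = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if current is not None:
                yield current
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        yield current


def n50(lengths):
    """Return the N50: the length L such that contigs of length >= L
    together cover at least half of the total assembly length."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contig lengths supplied")


# Toy records (hypothetical); a real run would iterate over an open FASTA file.
toy = [">c1", "ACGTACGTAC", ">c2", "ACGT", "AC", ">c3", "ACG"]
lengths = list(fasta_lengths(toy))
mean_length = sum(lengths) / len(lengths)
print(lengths, n50(lengths), round(mean_length, 2))
```

For a real assembly, the same functions could be applied to the lines of a file such as Data file 7 or Data file 8 opened in text mode.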
Blastp results are also available for Nr (Table 1, Data file 13), SwissProt (Table 1, Data file 14) and TrEMBL (Table 1, Data file 15). The InterProScan output used to annotate protein signatures is available in Data file 16 (Table 1). In detail, 18,976 contigs were annotated, including 5099 that were GO-annotated and 2800 that were KEGG-annotated.
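Counts of hits per species, such as the matches with A. simplex reported above, can be derived from a tabular annotation file. The sketch below is a hedged illustration, not the authors' procedure: it assumes a tab-separated layout in which the last column holds an NCBI-style subject title ending in the organism name in square brackets. The actual column layout of Data files 10 to 15 is not described in the text, and the function name `species_hit_counts` and the toy rows are hypothetical.

```python
import re
from collections import Counter

# NCBI nr titles conventionally end with the organism name in brackets.
ORGANISM = re.compile(r"\[([^\[\]]+)\]\s*$")


def species_hit_counts(rows, title_col=-1):
    """Count hits per organism in tab-separated annotation rows.

    Assumption: the column at `title_col` holds the subject title.
    Rows without a bracketed organism name are skipped.
    """
    counts = Counter()
    for row in rows:
        fields = row.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue
        match = ORGANISM.search(fields[title_col])
        if match:
            counts[match.group(1)] += 1
    return counts


# Toy rows (hypothetical identifiers and values).
toy_rows = [
    "t1\tsubj1\t95.0\t1e-50\tunnamed protein product [Anisakis simplex]",
    "t2\tsubj2\t80.0\t1e-20\thypothetical protein [Toxocara canis]",
    "t3\tsubj3\t90.0\t1e-30\tunnamed protein product [Anisakis simplex]",
]
counts = species_hit_counts(toy_rows)
print(counts.most_common())
```

If the annotation files use a different column order, only the `title_col` argument would need to change.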

Limitations

The A. pegreffii transcriptome obtained here was assembled only from RNA-seq datasets of the third larval stage of the parasite. The single transcriptome available from the fourth-stage larva of A. pegreffii [8] was not included because the main aim of this analysis was to provide a robust and "ready to use" transcriptome of the infective stage (third larval stage) of the parasite, which is also the stage that causes the zoonotic disease (anisakiasis) in humans.
  22 in total

1.  De novo assembly and analysis of RNA-seq data.

Authors:  Gordon Robertson; Jacqueline Schein; Readman Chiu; Richard Corbett; Matthew Field; Shaun D Jackman; Karen Mungall; Sam Lee; Hisanaga Mark Okada; Jenny Q Qian; Malachi Griffith; Anthony Raymond; Nina Thiessen; Timothee Cezard; Yaron S Butterfield; Richard Newsome; Simon K Chan; Rong She; Richard Varhol; Baljit Kamoh; Anna-Liisa Prabhu; Angela Tam; YongJun Zhao; Richard A Moore; Martin Hirst; Marco A Marra; Steven J M Jones; Pamela A Hoodless; Inanc Birol
Journal:  Nat Methods       Date:  2010-10-10       Impact factor: 28.547

2.  BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs.

Authors:  Felipe A Simão; Robert M Waterhouse; Panagiotis Ioannidis; Evgenia V Kriventseva; Evgeny M Zdobnov
Journal:  Bioinformatics       Date:  2015-06-09       Impact factor: 6.937

3.  IgE sensitization to Anisakis pegreffii in Italy: Comparison of two methods for the diagnosis of allergic anisakiasis.

Authors:  S Mattiucci; A Colantoni; B Crisafi; F Mori-Ubaldini; L Caponi; P Fazii; G Nascetti; F Bruschi
Journal:  Parasite Immunol       Date:  2017-07       Impact factor: 2.280

4.  Molecular Epidemiology of Anisakis and Anisakiasis: An Ecological and Evolutionary Road Map. (Review)

Authors:  Simonetta Mattiucci; Paolo Cipriani; Arne Levsen; Michela Paoletti; Giuseppe Nascetti
Journal:  Adv Parasitol       Date:  2018-03-06       Impact factor: 3.870

5.  De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis.

Authors:  Brian J Haas; Alexie Papanicolaou; Moran Yassour; Manfred Grabherr; Philip D Blood; Joshua Bowden; Matthew Brian Couger; David Eccles; Bo Li; Matthias Lieber; Matthew D MacManes; Michael Ott; Joshua Orvis; Nathalie Pochet; Francesco Strozzi; Nathan Weeks; Rick Westerman; Thomas William; Colin N Dewey; Robert Henschel; Richard D LeDuc; Nir Friedman; Aviv Regev
Journal:  Nat Protoc       Date:  2013-07-11       Impact factor: 13.491

6.  MultiQC: summarize analysis results for multiple tools and samples in a single report.

Authors:  Philip Ewels; Måns Magnusson; Sverker Lundin; Max Käller
Journal:  Bioinformatics       Date:  2016-06-16       Impact factor: 6.937

7.  Tissue-specific transcriptomes of Anisakis simplex (sensu stricto) and Anisakis pegreffii reveal potential molecular mechanisms involved in pathogenicity.

Authors:  Serena Cavallero; Fabrizio Lombardo; Xiaopei Su; Marco Salvemini; Cinzia Cantacessi; Stefano D'Amelio
Journal:  Parasit Vectors       Date:  2018-01-10       Impact factor: 3.876

8.  Functional insights into the infective larval stage of Anisakis simplex s.s., Anisakis pegreffii and their hybrids based on gene expression patterns.

Authors:  C Llorens; S C Arcos; L Robertson; R Ramos; R Futami; B Soriano; S Ciordia; M Careche; M González-Muñoz; Y Jiménez-Ruiz; N Carballeda-Sangiao; I Moneo; J P Albar; M Blaxter; A Navas
Journal:  BMC Genomics       Date:  2018-08-07       Impact factor: 3.969

9.  De novo transcriptome sequencing and analysis of Anisakis pegreffii (Nematoda: Anisakidae) third-stage and fourth stage larvae.

Authors:  U-Hwa Nam; Jong-Oh Kim; Jeong-Ho Kim
Journal:  J Nematol       Date:  2020       Impact factor: 1.402

10.  ELIXIR-IT HPC@CINECA: high performance computing resources for the bioinformatics community.

Authors:  Tiziana Castrignanò; Silvia Gioiosa; Tiziano Flati; Mirko Cestari; Ernesto Picardi; Matteo Chiara; Maddalena Fratelli; Stefano Amente; Marco Cirilli; Marco Antonio Tangaro; Giovanni Chillemi; Graziano Pesole; Federico Zambelli
Journal:  BMC Bioinformatics       Date:  2020-08-21       Impact factor: 3.169
