Xingyu Zhu1, Shuangfei Li1,2,3, Liangxu Liu1, Siting Li1, Yanqing Luo1, Chuhan Lv1, Boyu Wang1,2,3, Christopher H K Cheng4, Huapu Chen5, Xuewei Yang1,2,3.
Abstract
Thraustochytriidae sp. have gained broad attention as a prospective resource for producing omega-3 fatty acids in significant quantities. In this study, the whole genome of Thraustochytriidae sp. SZU445, which produces high levels of docosapentaenoic acid (DPA) and docosahexaenoic acid (DHA), was sequenced and subjected to protein annotation. The obtained clean reads (63.55 Mb in total) were assembled into 54 contigs and 25 scaffolds, with maximum and minimum contig lengths of 4.00 and 0.0054 Mb, respectively. A total of 3513 genes (24.84%) were identified, which could be classified into six pathways and 44 pathway groups, of which 68 genes (1.93%) were involved in lipid metabolism. In the Gene Ontology database, 22,436 genes were annotated as cellular component (8579 genes, 38.24%), molecular function (5236 genes, 23.34%), and biological process (8621 genes, 38.42%). Four enzymes corresponding to the classic fatty acid synthase (FAS) pathway and three enzymes corresponding to the classic polyketide synthase (PKS) pathway were identified in Thraustochytriidae sp. SZU445. Although PKS pathway-associated dehydratase and isomerase enzymes were not detected in Thraustochytriidae sp. SZU445, a putative DHA- and DPA-specific fatty acid pathway was identified.
Keywords: docosahexaenoic acid (DHA); fatty acid synthesis pathway; polyketide synthase pathway; polyunsaturated fatty acid; whole-genome sequencing
Year: 2020 PMID: 32085426 PMCID: PMC7073664 DOI: 10.3390/md18020118
Source DB: PubMed Journal: Mar Drugs ISSN: 1660-3397 Impact factor: 5.118
Summary of clean data assembly for Thraustochytriidae sp. SZU445.
| Sample Name | Seq Type (#) | Total Number (#) | Total Length (Mb) | N50 Length (Mb) | N90 Length (Mb) | Max Length (Mb) | Min Length (Mb) | Gap Number (Mb) | GC Content (%) |
|---|---|---|---|---|---|---|---|---|---|
| SZU445 | Scaffold | 25 | 61.97 | 5.98 | 2.41 | 13.75 | 0.0054 | 0.091 | 45.04 |
| SZU445 | Contig | 54 | 61.88 | 2.55 | 1.39 | 4.00 | 0.0054 | - | 45.04 |
Seq Type (#): sequence type (scaffold or contig). Total Number (#): total number of contigs or scaffolds. Total Length (Mb): total length of the assembly. N50 Length (Mb): a measure of assembly continuity (the higher the better); N50 is a length-weighted median such that 50% of the total assembly length is contained in sequences equal to or longer than this value. N90 Length (Mb): defined analogously to N50, using 90% of the total length. Max Length (Mb): length of the longest scaffold or contig. Min Length (Mb): length of the shortest scaffold or contig. Gap Number (Mb): number of gaps in the sequence. GC Content (%): percentage of G and C bases in the assembled sequence.
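The N50 and N90 definitions above can be illustrated with a minimal Python sketch. The function name `nx_length` and the toy lengths are hypothetical and are not taken from the SZU445 assembly; the sketch only shows how a length-weighted statistic of this kind is commonly computed.

```python
def nx_length(lengths, fraction=0.5):
    """Return the Nx length: the length L of the sequence at which the
    cumulative length of sequences sorted from longest to shortest first
    reaches `fraction` of the total assembly length."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    threshold = fraction * sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= threshold:
            return length


# Hypothetical toy assembly (lengths in Mb), not the SZU445 data.
toy_lengths = [8, 6, 4, 3, 2, 1]
n50 = nx_length(toy_lengths, 0.5)  # fraction 0.5 gives N50
n90 = nx_length(toy_lengths, 0.9)  # fraction 0.9 gives N90
print(n50, n90)
```

Because N90 requires covering a larger share of the assembly, it is never larger than N50 for the same set of sequences.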
Figure 1. Correlation analysis of GC content and sequencing depth for Thraustochytriidae sp. SZU445. The abscissa is the GC content, and the ordinate is the average depth. The scatter plot approximates a Poisson distribution, indicating that the sequencing data have low GC bias.
Figure 2. Genomic circle diagram of Thraustochytriidae sp. SZU445. From the outer to the inner rings: Genome (sorted by length), gene density (gene number in 50,000 bp nonoverlapping windows), ncRNA density (ncRNA number in 100,000 bp nonoverlapping windows), GC (GC rate in 20,000 bp nonoverlapping windows), GC_skew (GC skew in 20,000 bp nonoverlapping windows).
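As a rough illustration of the windowed GC rate and GC skew tracks described in this caption, the following Python sketch computes both per nonoverlapping window. It assumes the conventional GC skew definition, (G − C)/(G + C); the record does not state which formula or tool the authors used, and the function name `gc_windows` and the toy sequence are hypothetical.

```python
def gc_windows(seq, window):
    """Yield (start, gc_fraction, gc_skew) for nonoverlapping windows.

    gc_fraction = (G + C) / (A + C + G + T)
    gc_skew     = (G - C) / (G + C)   # conventional definition, assumed here
    """
    seq = seq.upper()
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        acgt = g + c + chunk.count("A") + chunk.count("T")
        gc = (g + c) / acgt if acgt else 0.0
        skew = (g - c) / (g + c) if (g + c) else 0.0
        yield start, gc, skew


# Hypothetical 16 bp toy sequence with 8 bp windows; the figure uses 20,000 bp windows.
toy_seq = "GGGCATATGCCCATAT"
rows = list(gc_windows(toy_seq, 8))
for start, gc, skew in rows:
    print(start, gc, skew)
```

Ambiguous bases such as N are excluded from the denominator of the GC fraction, which is one common choice among several.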
Genetic components of Thraustochytriidae sp. SZU445.
| Sample Name (#) | Type (#) | Total Number (#) | Total Length (bp) | Average Length (bp) | Length/Genome Length (%) |
|---|---|---|---|---|---|
| SZU445 | Gene Stat | 14,145 | 26,947,341 | 1905.08 | 43.48 |
| SZU445 | Exons Stat | 18,768 | 25,518,500 | 1359.68 | 41.18 |
| SZU445 | CDS Stat | 14,145 | 25,518,500 | 1804.07 | 41.18 |
| SZU445 | Intron Stat | 4623 | 1,428,841 | 309.07 | 2.31 |
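The two derived columns of this table (average length and share of genome length) follow from the counts and total lengths. The Python sketch below recomputes them as a consistency check. It assumes the genome length is the 61.97 Mb scaffold total reported in the assembly summary table; the record does not state which denominator the authors used, and the names `GENOME_LENGTH_BP` and `summarize` are hypothetical.

```python
# Assumption: genome length taken as the 61.97 Mb scaffold total from the assembly table.
GENOME_LENGTH_BP = 61_970_000

# (total number, total length in bp) as listed in the table above.
features = {
    "Gene": (14_145, 26_947_341),
    "Exons": (18_768, 25_518_500),
    "CDS": (14_145, 25_518_500),
    "Intron": (4_623, 1_428_841),
}


def summarize(count, total_bp, genome_bp=GENOME_LENGTH_BP):
    """Return (average length in bp, percent of genome), rounded to 2 decimals."""
    avg = round(total_bp / count, 2)
    pct = round(100 * total_bp / genome_bp, 2)
    return avg, pct


summary = {name: summarize(*values) for name, values in features.items()}
for name, (avg, pct) in summary.items():
    print(f"{name}: average {avg} bp, {pct}% of genome")
```

Under that assumption, the recomputed values agree with the Average Length and Length/Genome Length columns to two decimal places.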
Figure 3. Gene length distribution of Thraustochytriidae sp. SZU445. The abscissa is the length of the gene, and the ordinate is the number of genes corresponding to the length of the gene.
Noncoding RNA statistics of Thraustochytriidae sp. SZU445.
| Sample Name (#) | Type | Copy# | Avg_Len | Total_Len | % in Genome |
|---|---|---|---|---|---|
| SZU445 | tRNA | 493 | 77.81 | 38,362 | 0.0619 |
| SZU445 | rRNA | 235 | 1683.92 | 395,723 | 0.6385 |
| SZU445 | snRNA | 77 | 70.15 | 5402 | 0.0087 |
Type: ncRNA type. Copy: The number of ncRNA type copies. Avg_Len: The average length of ncRNA. Total_Len: The total length of ncRNA. % in Genome: ncRNA types as a percentage of the genome.
Gene annotation results of Thraustochytriidae sp. SZU445 according to the database.
| Total | CAZY | TCDB | IPR | SWISS-PROT | GO | KEGG | KOG | COG | P450 | TF | EKPD | NOG | CARD | CWDE | NR | DBCAN | PHI | PHOSPHATASE | Overall |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 14,145 | 34 (0.24%) | 232 (1.64%) | 9707 (68.62%) | 2122 (15%) | 7255 (51.29%) | 1625 (11.48%) | 1866 (13.19%) | 1324 (9.36%) | 740 (5.23%) | 364 (2.57%) | 369 (2.60%) | 3550 (25.09%) | 7 (0.04%) | 1 (0.007%) | 2629 (18.58%) | 207 (1.46%) | 412 (2.91%) | 85 (0.60%) | 9852 (69.65%) |
CAZY: Carbohydrate-Active enZYmes Database. TCDB: Transporter Classification Database. IPR: InterPro Database. GO: Gene Ontology Database. KEGG: Kyoto Encyclopedia of Genes and Genomes. KOG: EuKaryotic Orthologous Groups. COG: Clusters of Orthologous Groups. P450: Fungal Cytochrome P450 Database. TF: Transcription Factor database. EKPD: Eukaryotic Protein Kinases and Protein Phosphatases. NOG: Evolutionary genealogy of genes: Non-supervised Orthologous Groups. CARD: The Comprehensive Antibiotic Resistance Database. CWDE: Cell Wall Degrading Enzyme. NR: Non-Redundant Protein Database. DBCAN: a web server and Database for automated Carbohydrate-active enzyme ANnotation. PHI: Pathogen–Host Interactions. PHOSPHATASE: a phosphatases database of EKPD.
Figure 4. Distribution of GO database functional annotations. The ordinate is the annotation item, and the abscissa is the number of corresponding genes.
Figure 5. Distribution of KEGG database functional annotations. The ordinate is the annotation item, and the abscissa is the number of corresponding genes.
Figure 6. Phylogenetic tree inferred using the neighbor-joining method. The optimal tree with the sum of branch length = 0.01309240 is shown. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (1000 replicates) is shown next to the branches. The evolutionary distances were computed using the maximum composite likelihood method and are in units of the number of base substitutions per site. The analysis involved six nucleotide sequences. Codon positions included were 1st + 2nd + 3rd + noncoding. All ambiguous positions were removed for each sequence pair. There were a total of 1739 positions in the final dataset. Evolutionary analyses were conducted in MEGA7.
Enzymes involved in fatty acid biosynthesis and metabolism identified by annotation of the Thraustochytriidae sp. SZU445 genome.
| Enzyme | EC Number | Number of Transcripts |
|---|---|---|
| | | |
| delta7-sterol 5-desaturase | 1.14.19.20 | 1 |
| sphingolipid 8-(E)-desaturase | 1.14.19.18 | 1 |
| sphingolipid 4-desaturase | 1.14.19.17 1.14.18.5 | 1 |
| aldehyde dehydrogenase (NAD+) | 1.2.1.3 | 4 |
| 17beta-estradiol 17-dehydrogenase | 1.1.1.62 1.1.1.330 | 1 |
| acyl-CoA dehydrogenase | 1.3.8.7 | 145 |
| glycerol-3-phosphate dehydrogenase | 1.1.5.3 | 26 |
| S-(hydroxymethyl)glutathione dehydrogenase/alcohol dehydrogenase | 1.1.1.284 1.1.1.1 | 2 |
| glycerol-3-phosphate dehydrogenase | 1.1.5.3 | 26 |
| glutaryl-CoA dehydrogenase | 1.3.8.6 | 5 |
| glycerol-3-phosphate dehydrogenase (NAD+) | 1.1.1.8 | 1 |
| alcohol dehydrogenase (NADP+) | 1.1.1.2 | 9 |
| aldehyde dehydrogenase family 7 member A1 | 1.2.1.31 1.2.1.8 1.2.1.3 | 2 |
| 3-hydroxyacyl-CoA dehydrogenase | 1.1.1.35 | 29 |
| glycerol 2-dehydrogenase (NADP+) | 1.1.1.156 | 1 |
| S-(hydroxymethyl)glutathione dehydrogenase/alcohol dehydrogenase | 1.1.1.284 1.1.1.1 | 2 |
| 17beta-estradiol 17-dehydrogenase/very-long-chain 3-oxoacyl-CoA reductase | 1.1.1.62 1.1.1.330 | 1 |
| delta14-sterol reductase | 1.3.1.70 | 1 |
| | | |
| acetyl-CoA acyltransferase 2 | 2.3.1.16 | 2 |
| acetyl-CoA acyltransferase 1 | 2.3.1.16 | 2 |
| hydroxymethylglutaryl-CoA synthase | 2.3.3.10 | 6 |
| fatty acid synthase subunit alpha | 2.3.1.86 | 2 |
| 3-oxoacyl-[acyl-carrier-protein] synthase II | 2.3.1.179 | 1 |
| acetyl-CoA carboxylase/biotin carboxylase 1 | 6.4.1.2 6.3.4.14 2.1.3.15 | 1 |
Figure 7. Putative fatty acid synthesis pathway of Thraustochytriidae sp. SZU445, shown alongside the classical fatty acid synthase (FAS) pathway and the polyketide synthase (PKS) pathway. The enzymes colored blue are present in the classical FAS and PKS pathways. The enzymes colored green are present in Thraustochytriidae sp. SZU445 and correspond to the classical FAS and PKS pathways. The enzymes colored red are isozymes of the PKS pathway dehydratase and isomerase that exist in Thraustochytriidae sp. SZU445.
The National Center for Biotechnology Information (NCBI) accession numbers of the reference strains for the phylogenetic analysis.
| Reference Strains | NCBI Accession |
|---|---|
| | MH319338.1 |
| | JQ982490.1 |
| | JX847377.1 |
| | KX379459.1 |
| | AY705769.1 |