Kai Yu, Yang Yu, Xiaoyan Tang, Huimin Chen, Junyu Xiao, Xiao-Dong Su.
Abstract
The High Five cell line (BTI-TN-5B1-4), isolated from the cabbage looper, Trichoplusia ni, is an insect cell line widely used for baculovirus-mediated recombinant protein expression. Despite its widespread application in industry and academic laboratories, the genomic background of this cell line remains unclear. Here we sequenced the transcriptome of High Five cells and assembled 25,234 transcripts. Codon usage analysis showed that High Five cells have a robust codon usage capacity and are therefore suitable for expressing proteins of both eukaryotic and prokaryotic origin. Genes involved in glycosylation were profiled in our study, providing guidance for engineering glycosylated proteins in insect cells. We also predicted signal peptides for transcripts with high expression abundance in both High Five and Sf21 cell lines; these results have important implications for optimizing the expression level of some secretory and membrane proteins.
Keywords: High Five cell line; baculovirus-insect cell system; codon usage; glycosylation; signal peptide
Year: 2016 PMID: 27017378 PMCID: PMC4853316 DOI: 10.1007/s13238-016-0260-y
Source DB: PubMed Journal: Protein Cell ISSN: 1674-800X Impact factor: 14.870
Assembly statistics information
| | Raw assembly | Duplicate-removed assembly |
|---|---|---|
| Trinity ‘genes’ | 27,389 | 24,000 |
| Trinity transcripts | 31,068 | 25,234 |
| GC content (%) | 40.84 | 40.71 |
| Median contig length (bp) | 722 | 622 |
| Average contig length (bp) | 1269.6 | 1160.9 |
| Total assembled bases (bp) | 39,444,068 | 29,294,166 |
Figure 1. Assembly quality assessment. (A) Full-length transcript assessment. Each bin on the x-axis is the percentage of the hit's length covered by its alignment to the Trinity transcript. The left y-axis (bar plot) is the transcript count in each bin; the right y-axis (point plot) is the cumulative count below that bin. (B) N50 of transcript subsets ordered by decreasing expression level. Ex is the set of most highly expressed transcripts that together represent x% of the data. ExN50 is the N50 of that set: the transcript length at which transcripts of that length or longer account for at least 50% of the set's total length. (C) Transcript count at each minimum TPM threshold, plotted against the negative of the threshold value.
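The N50 and ExN50 statistics described in the Figure 1 caption can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names `n50` and `exn50` and the `(length, expression)` tuple layout are assumptions made for illustration, and the expression values are taken to be already normalized (for example, TPM).

```python
def n50(lengths):
    """Return the length L such that transcripts of length >= L
    hold at least half of the total assembled bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("n50 is undefined for an empty set of lengths")


def exn50(transcripts, x_percent):
    """N50 restricted to the most highly expressed transcripts that
    together account for x_percent of the total expression.

    transcripts: iterable of (length, expression) tuples (hypothetical layout).
    """
    ranked = sorted(transcripts, key=lambda t: t[1], reverse=True)
    target = sum(expr for _, expr in ranked) * x_percent / 100.0
    subset, accumulated = [], 0.0
    for length, expr in ranked:
        if accumulated >= target:
            break
        subset.append(length)
        accumulated += expr
    return n50(subset)
```

Calling `exn50` over a range of `x_percent` values gives the kind of curve plotted in panel (B): at low x only a few abundant transcripts contribute, and the set grows toward the full assembly as x approaches 100.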
Figure 2. Gene Ontology of the High Five and Sf21 transcriptomes, summarized in three main GO categories: cellular component, molecular function and biological process. The right y-axis is the transcript count for each function item; the left y-axis is the corresponding percentage of transcripts.
Figure 3. Transcript number in each EggNOG functional class. Classes are divided into three groups by color: (red) information storage and processing; (blue) cellular processes and signaling; (purple) metabolism.
Figure 4. Codon usage of 20 amino acids across 10 different species. Each subplot is one amino acid; the x-axis is the species and the y-axis is the RSCU (relative synonymous codon usage) value of each codon encoding that amino acid.
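As a rough illustration of the RSCU values plotted in Figure 4, the sketch below computes RSCU as the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. It assumes the standard genetic code and in-frame coding sequences; the names `CODON_TABLE` and `rscu` are hypothetical and do not come from the paper.

```python
from collections import Counter
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
_BASES = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def rscu(cds_sequences):
    """Return {codon: RSCU} for the sense codons of every amino acid
    observed at least once in the given in-frame coding sequences."""
    counts = Counter()
    for seq in cds_sequences:
        seq = seq.upper().replace("U", "T")
        usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
        for i in range(0, usable, 3):
            codon = seq[i:i + 3]
            if codon in CODON_TABLE:  # skips codons with ambiguous bases
                counts[codon] += 1

    # Group codons into synonymous families, one per amino acid.
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    values = {}
    for aa, codons in families.items():
        if aa == "*":  # stop codons are excluded
            continue
        total = sum(counts[c] for c in codons)
        if total == 0:  # amino acid never observed
            continue
        expected = total / len(codons)  # count under equal usage
        for c in codons:
            values[c] = counts[c] / expected
    return values
```

An RSCU of 1 means a codon is used as often as expected under equal usage; values above 1 indicate preference and values below 1 indicate avoidance. Single-codon amino acids (Met, Trp) always score 1.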
Figure 5. Glycogene profile of the High Five and Sf21 cell lines. (A) Heatmap representing the glycogene content of each species. Blue marks for Sf21 and High Five show the expression value of each gene; red marks indicate only that the gene is present. (B) Bar plot of glycogene categories. (C) Ring plot representing the properties of glycogenes in the High Five and Sf21 cell lines.
Figure 6. Highly expressed predicted signal peptides. The plot on the left shows the expression values of transcripts that have a predicted signal peptide. The sequences on the right are the top 100 signal peptide sequences, with the SignalP score in brackets.