Could protein tertiary structure influence mammary transgene expression more than tissue specific codon usage?

Zuyong He, Yiqiang Zhao, Gui Mei, Ning Li, Yaosheng Chen

Abstract

Animal mammary glands have been successfully employed to produce therapeutic recombinant human proteins. However, considerable variation in animal mammary transgene expression efficiency has been reported. We now consider whether aspects of codon usage and/or protein tertiary structure underlie this variation in mammary transgene expression.

Year: 2010; PMID: 20563642; PMCID: PMC2902731; DOI: 10.1007/s11248-010-9411-8

Source DB: PubMed; Journal: Transgenic Res; ISSN: 0962-8819; Impact factor: 2.788


Introduction

The expression of functional proteins in heterologous hosts is a cornerstone of modern biopharming. However, many human proteins are difficult to express in unicellular organisms such as bacteria and yeast. The underlying problem is that codon bias has a profound impact on the heterologous expression of human proteins in these organisms; codon usage has been found to be the single most important factor in prokaryotic gene expression. Laborious and time-consuming codon optimization is therefore often necessary to achieve successful expression in unicellular organisms (Gustafsson et al. 2004). In contrast, previous studies have observed no obvious codon bias between human and several other mammals. The animal mammary gland is thus considered an ideal bioreactor for producing functional human proteins without codon optimization. With this concept in mind, a number of transgenic livestock have been created to produce different recombinant human proteins in their milk, and in 2006 a recombinant human protein purified from a transgenic goat was approved for clinical use in Europe by the European Medicines Agency (Houdebine 2009). However, considerable variation in expression efficiency has been found in the heterologous expression of human proteins in the milk of transgenic animals produced in our lab and by other groups. Generally, a number of proteins that are abundant in human tissues other than the mammary gland, such as serpin peptidase inhibitors and immunoglobulins, tend to exhibit higher expression levels in transgenic milk, while many other human proteins, such as interleukin-2, coagulation factor VIII and catalase, are difficult to express in milk (Wright et al. 1991; Tang et al. 2008; Buhler et al. 1990; Niemann et al. 1999; He et al. 2008a, b). These examples lead us to ask whether tissue-specific codon usage patterns affect translational regulation during the heterologous expression of human proteins in the animal mammary gland.
Recently, several independent studies concluded that codon bias might be a factor involved in translational regulation in humans. One study concluded that genes selectively expressed in one human tissue can often be discriminated from genes expressed in another tissue on the basis of their synonymous codon usage (Plotkin et al. 2004). Another study reported, based on microarray results, that the amount of tRNA varies widely among human tissues, and further showed that relative tRNA abundance correlates significantly with the codon usage of tissue-specific genes (Dittmar et al. 2006). The effective number of codons (ENC) is one of the most commonly used indices of codon bias and is analogous to the effective number of alleles in population genetics. However, ENC cannot reveal which codons are more frequent than others; it only indicates the overall departure from random synonymous codon choice. As a result, two genes may exhibit the same degree of overall bias but differ dramatically in their particular choice of synonymous codons. In this study we therefore used a two-tailed Fisher exact test to measure the distance in synonymous codon usage between two genes (Plotkin et al. 2004). Unlike metrics such as "relative synonymous codon usage", which are noisy when applied to individual genes, a codon usage measure based on the Fisher exact test, which is suited to small sample sizes, can be applied to genes that contain only a few examples of each amino acid. We propose two scenarios: first, the different expression levels of recombinant human proteins in the milk of transgenic animals are due to variation in synonymous codon usage patterns between the mammary gland and other human tissues; or, second, protein tertiary structure influences mammary transgene expression.

Human and mouse tissue specific codon usage

We made pairwise comparisons of codon usage among seven human tissues. When comparing heart to kidney (Fig. 1), virtually all kidney-associated genes cluster in a middle clade that is separate from the heart-associated genes. The observed separation between these two classes of genes would not have occurred by random chance (P < 0.001); the clustering is the result of systematic differential codon usage between heart- and kidney-specific genes. Fig. 1 indicates that we can generally discriminate between heart- and kidney-expressed genes on the basis of their codon usage alone. Similarly, kidney-specific genes can be discriminated from lung- and pancreas-specific genes (supplementary Fig. 1). However, many pairs of tissue-specific gene sets do not exhibit significantly different codon usage patterns (e.g., brain versus pancreas, P = 0.384; supplementary Fig. 2). Unexpectedly, among the six mouse tissues tested, we could not find any pair of tissues that can be separated from each other on the basis of codon usage at a statistically significant level. Only heart-specific genes can nearly be discriminated from liver-specific genes (P = 0.072; supplementary Fig. 3).
Fig. 1

A dendrogram reflecting the codon usage of 22 genes selectively expressed in heart (red) and 17 genes selectively expressed in human kidney (black). The pairwise distances underlying this tree reflect the degree to which the genes differ in their codon usage. As this tree demonstrates, heart-specific genes can generally be distinguished from kidney-specific genes purely on the basis of their synonymous codon usage. The observed separation between these two classes of genes would not have occurred by random chance (P < 0.001)

When comparing human mammary gland to six other tissues, only heart-, lung- and pancreas-specific genes can be discriminated from mammary gland specific genes on the basis of codon usage (heart versus mammary gland, P < 0.001; lung versus mammary gland, P = 0.002; pancreas versus mammary gland, P < 0.001; supplementary Fig. 4); the other three tissues tested cannot be distinguished from mammary gland (brain versus mammary gland, P = 0.368; kidney versus mammary gland, P = 0.368; liver versus mammary gland, P = 0.536; supplementary Fig. 5). Among the six mouse tissues tested, only genes selectively expressed in heart and pancreas can be distinguished from mouse mammary gland specific genes (heart versus mammary gland, P = 0.002; pancreas versus mammary gland, P = 0.038; supplementary Fig. 6). Thus there do appear to be codon usage differences between mammalian tissues.
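The P values quoted in this section come from the label permutation test described under "Codon usage analysis". The following Python sketch illustrates the idea only and is not the authors' program. As a simplifying assumption it sums raw pairwise distances between genes of the same tissue instead of path lengths along the neighbor-joining tree, and the function names, `n_perm`, `seed` and the toy distances are all hypothetical.

```python
import random


def within_group_sum(dist, labels):
    """Sum of pairwise distances between genes carrying the same tissue label."""
    n = len(labels)
    return sum(
        dist[i][j]
        for i in range(n)
        for j in range(i + 1, n)
        if labels[i] == labels[j]
    )


def permutation_p_value(dist, labels, n_perm=999, seed=0):
    """One-sided P: how often shuffled labels cluster at least as tightly."""
    rng = random.Random(seed)
    observed = within_group_sum(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if within_group_sum(dist, shuffled) <= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Toy example (hypothetical distances): two tight groups of five genes each.
labels = ["heart"] * 5 + ["kidney"] * 5
dist = [[0 if labels[i] == labels[j] else 5 for j in range(10)] for i in range(10)]
p = permutation_p_value(dist, labels)
```

A small P means that random relabelling rarely groups same-tissue genes as tightly as the observed labels do, which is how a statement such as "P < 0.001" for heart versus kidney should be read.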

Do expression levels of recombinant human proteins in transgenic milk correlate with mammary gland specific codon usage?

Successful examples of expressing recombinant human proteins in transgenic animals to date are summarized in Table 1, and the highest reported expression level of each of the 31 recombinant proteins in Table 1 is shown in Fig. 2. When comparing the codon usage of the 31 recombinant proteins expressed in the milk of transgenic animals with that of human mammary gland specific genes and milk proteins, we found that several of the most efficiently expressed recombinant human proteins (SERPINC1, ATCD20IgL, REG3A, LTF, FGA, FGB and FGG) clustered close to the mammary gland specific genes and milk proteins. However, the moderately expressed FIX and the less efficiently expressed recombinant proteins (LYZ, CAT, IL2 and mCol18a1) also clustered close to most human mammary gland specific genes and milk proteins. The three classes of genes cannot be discriminated from each other (mammary gland, P = 0.574; milk proteins, P = 0.900; supplementary Fig. 7). Similar results were obtained when comparing the recombinant proteins with mouse mammary gland specific genes and milk proteins (mammary gland, P = 0.952; milk proteins, P = 0.974; supplementary Fig. 8). A further comparison of the codon usage of milk proteins among 19 different mammals showed that most milk proteins tend to use similar codon usage patterns across mammalian species (supplementary Fig. 9). We therefore compared the codon usage of the recombinant proteins with that of milk proteins in the five main livestock species: cow, sheep, goat, rabbit and pig. In each animal, except for certain proteins, the clustering result was quite similar to that for human and mouse (Fig. 3). We thus dismiss our proposal that expression levels of recombinant human proteins in the milk of transgenic animals correlate with mammary gland specific codon usage patterns.
Table 1

Expression of recombinant human proteins in the milk of transgenic animals

| Protein | Promoter/vector | Gene structure | Transgenic animal | Expression level | Company | Reference |
|---|---|---|---|---|---|---|
| Anti-CD20 mAb | Goat beta-casein | cDNA | Mice | 17.0 mg/mL | | Tang et al. 2008 |
| SERPINA1 | Goat beta-casein | | Goats | 14.0 mg/mL | GTC Biotherapeutics | GTC literature (1996) |
| SERPINA1 | Goat beta-casein | | Rabbits | 4.0 mg/mL | GTC Biotherapeutics | GTC literature (1996) |
| SERPINA1 | Ovine BLG | Genomic DNA | Sheep | 35.0 mg/mL | | Wright et al. 1991 |
| AFP | Goat beta-casein | Genomic DNA | Goats | 0.6–1.1 mg/mL | GTC Biotherapeutics | Parker et al. 2004 |
| GAA | Bovine αS1-casein | Genomic DNA | Mice | 2.0 mg/mL | | Bijvoet et al. 1998 |
| GAA | Bovine αS1-casein | Genomic DNA | Rabbits | 8.0 mg/mL | | Bijvoet et al. 1999 |
| LALBA | | | Cattle | 2.4 mg/mL | | Krimpenfort et al. 1991 |
| LALBA | Human α-LA | Genomic DNA | Cattle | 1.55 mg/mL | | Wang et al. 2008 |
| LALBA | Human α-LA | | Cattle | 2.4 mg/mL | PPL Therapeutics | PPL literature |
| SERPINC1 | Goat beta-casein | | Goats | 20.0 mg/mL | GTC Biotherapeutics | GTC literature (1996) |
| SERPINC1 | Goat beta-casein | | Goats | 5.8 mg/mL | | Baguisi et al. 1999 |
| REG3A | Rabbit WAP | Genomic DNA | Mice | 11.2 mg/mL | | Christa et al. 2000 |
| BCHE | | cDNA | Mice | 1.0–5.0 mg/mL | | Baldassarre et al. 2008 |
| BCHE | | cDNA | Mice | 0.1–5.0 mg/mL | | Huang et al. 2007 |
| CEL | Mouse WAP | cDNA | Mice | 0.5–1.0 mg/mL | | Stromqvist et al. 1996 |
| SERPING1 | Bovine αS1-casein | | Rabbits | 12.0 mg/mL | Pharming | Pharming literature online |
| CAT | Goat beta-casein | cDNA | Mice | 145.3 µg/mL | | He et al. 2008a, b |
| SOD3 | Murine WAP | cDNA | Rabbits | 2.9 mg/mL | | Stromqvist et al. 1997 |
| EPO | Bovine BLG | cDNA | Mice | 0.3 mg/mL | | Korhonen et al. 1997 |
| EPO | Rabbit WAP | cDNA | Rabbits | 50.0 µg/mL | | Massoud et al. 1996 |
| EPO | Bovine BLG | cDNA | Rabbits | 0.5 mg/mL | | Korhonen et al. 1997 |
| EPO | Adenovirus | | Goats | 2.0 mg/mL | | Toledo et al. 2006 |
| FIB | Ovine BLG | Genomic DNA | Mice | 2.0 mg/mL | | Prunkard et al. 1996 |
| FIB | Ovine BLG | | Sheep | 5.0 mg/mL | PPL Therapeutics | PPL literature (1998) |
| FIB | Ovine BLG | | Sheep | 5.0 mg/mL | | Butler et al. 1997 |
| FIB | Bovine αS1-casein | | Cattle | 3.0 mg/mL | Pharming | |
| FIX | Mouse WAP | cDNA | Pigs | 2.0–3.0 mg/mL | | Lindsay et al. 2004 |
| FIX | Ovine BLG | | Sheep | 25.0 ng/mL | | Simons et al. 1987 |
| FIX | Ovine BLG | | Sheep | 5.0 ng/mL | | Clark 1998 |
| FIX | Ovine BLG | Genomic DNA | Sheep | 1.0 mg/mL | PPL Therapeutics | Schnieke et al. 1997 |
| FVII | Ovine BLG | | Sheep | 2.0 mg/mL | PPL Therapeutics | PPL literature |
| FVIII | Mouse WAP | cDNA | Pigs | 2.7 µg/mL | | Paleyanda et al. 1997 |
| FVIII | Ovine BLG | cDNA | Sheep | 6.0 ng/mL | | Niemann et al. 1999 |
| GH | Rat beta-casein | Genomic DNA | Mice | 19.0–5500.0 µg/mL | | Lee et al. 1996 |
| GH | Adenovirus | Genomic DNA | Mice | 2.8 mg/mL | | Sanchez et al. 2004 |
| GH | Mouse WAP | Genomic DNA | Rabbits | 50.0 µg/mL | | Limonta et al. 1995 |
| GH | Retrovirus | cDNA | Goats | 60.0 ng/mL | | Archer et al. 1994 |
| GH | Adenovirus | Genomic DNA | Goats | 0.3 mg/mL | | Sanchez et al. 2004 |
| GH | Bovine αS1-casein | | Cattle | 5.0 mg/mL | Bio Sidus SA | Salamone et al. 2006 |
| IGF1 | Bovine α-LA | cDNA | Pigs | 228.0–1600.0 µg/mL | | Monaco et al. 2005 |
| IGF1 | Bovine αS1-casein | cDNA | Rabbits | 1.0 mg/mL | | Brem et al. 1994 |
| IL2 | Rabbit beta-casein | Genomic DNA | Rabbits | 430.0 ng/mL | | Buhler et al. 1990 |
| LTF | Human LTF | Genomic DNA | Mice | 8.02 mg/mL | | Liu et al. 2004 |
| LTF | Bovine αS1-casein | cDNA | Mice | 36.0 µg/mL | | Platenburg et al. 1994 |
| LTFs | | Genomic DNA | Mice | 6.6 mg/mL | | Kim et al. 1999 |
| LTF | Goat beta-casein | cDNA | Rabbits | 153.0 µg/mL | | Li et al. 2006 |
| LTF | Adenovirus | cDNA | Goats | 2.6 mg/mL | | Han et al. 2007 |
| LTF | Goat beta-casein | cDNA | Goats | 0.765 mg/mL | | Zhang et al. 2008 |
| LTF | Bovine αS1-casein | | Cattle | 2.8 mg/mL | Pharming | van Berkel et al. 2002 |
| LTF | Human LF | Genomic DNA | Cattle | 3.4 mg/mL | | Yang et al. 2008 |
| LTF | | | Cattle | 2.9 mg/mL | | Hyvonen et al. 2006 |
| LYZ | Goat beta-casein | Genomic DNA | Mice | 1.405 mg/mL | | Yu et al. 2006 |
| NGF | Bovine αS1-casein | cDNA | Rabbits | 50.0–250.0 µg/mL | | Coulibaly et al. 1999 |
| PROC | Ovine BLG | | Rabbits | 0.7 mg/mL | GTC Biotherapeutics | GTC literature (1996) |
| PROC | Mouse WAP | Genomic DNA | Rabbits | 0.109–0.301 µg/mL | | Dragin et al. 2005 |
| PROC | Mouse WAP | cDNA | Pigs | 1.0 mg/mL | GTC Biotherapeutics | GTC literature (1992) |
| PROC | Mouse WAP | cDNA | Pigs | 40.0–450.0 µg/mL | | Van Cott et al. 2001 |
| PROC | Mouse WAP | Genomic DNA | Pigs | 160.0–1200.0 µg/mL | | Van Cott et al. 2001 |
| PROC | Ovine BLG | | Pigs | 0.75 mg/mL | PPL Therapeutics | PPL literature |
| PROC | Ovine BLG | | Sheep | 0.3 mg/mL | PPL Therapeutics | PPL literature (1998) |
| ALB | Bovine BLG | cDNA | Mice | 2.5 mg/mL | | Shani et al. 1992 |
| tPA | Bovine αS1-casein | cDNA | Rabbits | 8.0–50.0 ng/mL | | Riego et al. 1993 |
| tPA | Murine WAP | cDNA | Goats | 3.0 µg/mL | | Ebert et al. 1991 |
| tPA | Goat beta-casein | | Goats | 3.0 mg/mL | | Ebert et al. 1994 |
| mCol18a1 (a) | Bovine αS1-casein | cDNA | Mice | 70.0–300.0 ng/mL | | Zavadskaia et al. 2001 |
| Calc1 (b) | Ovine BLG | cDNA | Rabbits | 1.0–2.1 mg/mL | PPL Therapeutics | McKee et al. 1998 |

(a) mCol18a1: mouse collagen, type XVIII, alpha 1

(b) Calc1: salmon calcitonin. Human calcitonin aggregates; therefore, the piscine equivalent was produced
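Table 1 reports expression levels in mg/mL, µg/mL and ng/mL, sometimes as ranges, so rows are only comparable after conversion to a common unit. The Python sketch below is an illustration rather than the authors' procedure: it converts an entry to mg/mL using the upper end of any range and then keeps the highest value per protein, which is how the caption of Fig. 2 describes the plotted values. The helper names are hypothetical.

```python
import re

# Conversion factors to mg/mL.
FACTOR = {"mg": 1.0, "ug": 1e-3, "ng": 1e-6}


def upper_mg_per_ml(text):
    """Return the largest number in a Table 1 entry, converted to mg/mL."""
    # Accept both the micro sign and the Greek letter mu as "u".
    text = text.replace("\u00b5", "u").replace("\u03bc", "u")
    numbers = [float(x) for x in re.findall(r"\d+(?:\.\d+)?", text)]
    unit = re.search(r"(mg|ug|ng)\s*/\s*mL", text)
    if not numbers or unit is None:
        raise ValueError(f"cannot parse expression level: {text!r}")
    return max(numbers) * FACTOR[unit.group(1)]


def highest_per_protein(rows):
    """Map each protein to its highest reported level in mg/mL."""
    best = {}
    for protein, level in rows:
        value = upper_mg_per_ml(level)
        best[protein] = max(value, best.get(protein, 0.0))
    return best


# Three entries copied from Table 1.
rows = [
    ("FVIII", "2.7 ug/mL"),
    ("FVIII", "6.0 ng/mL"),
    ("AFP", "0.6–1.1 mg/mL"),
]
best = highest_per_protein(rows)
```

For these rows the highest FVIII value is 2.7 µg/mL, that is 0.0027 mg/mL, which agrees with the hFVIII value given in the caption of Fig. 2.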

Fig. 2

Expression levels of recombinant human proteins in the milk of transgenic animals. The expression level shown for each protein is the highest among the studies summarized in Table 1; ATCD20 IgH and ATCD20 IgL denote the heavy and light chains of the human anti-CD20 monoclonal antibody; FGA, FGB and FGG denote the alpha, beta and gamma chains of fibrinogen (FIB), respectively; mCol18a1 stands for mouse collagen, type XVIII, alpha 1; a star marks the expression level of a recombinant protein obtained with a cDNA-based expression construct; an open circle marks the expression levels of hFVIII (0.0027 mg/mL), hIL-2 (0.000043 mg/mL) and mCol18a1 (0.00003 mg/mL), which are all too low to be shown as bars

Fig. 3

Dendrograms reflecting the codon usage of milk proteins of the five main livestock species (red) and of recombinant proteins. The top 13 highly expressed recombinant proteins (≥5 mg/mL) are shown in blue, and moderately and weakly expressed recombinant proteins are shown in black. The three classes of genes (shown in three different colours) cannot be clearly discriminated from each other (cow, P = 0.992; goat, P = 0.982; sheep, P = 0.992; rabbit, P = 0.962; pig, P = 0.994)

Recombinant human proteins with greater expression levels in transgenic milk share similar protein domains with milk proteins

The main domains, as classified by CATH, of the recombinant proteins expressed in the milk of transgenic animals and of mammalian milk proteins are summarized in Table 2. Casein is the main component of milk protein; in bovine milk it accounts for as much as 82% of total milk protein (Jensen 1995). When we examined the main domains of the four main caseins [casein alpha S1 (CSN1S1), casein alpha S2 (CSN1S2), casein beta (CSN2), and casein kappa (CSN3)], we observed that all four share a similar alpha–beta based major domain (Fig. 4). With the exception of CSN1S2, which is composed of an alpha–beta barrel domain, the caseins are composed of a 3-layer (aba) sandwich shaped major domain. Interestingly, among the top 13 highly expressed recombinant proteins in the milk of transgenic animals (expression level ≥5 mg/mL), 10 share a similar alpha–beta based major domain with the caseins. This is especially true of those with the highest expression levels, such as SERPINA1, SERPINC1, SERPING1, REG3A, LTF and BCHE, which all possess a 2-layer or 3-layer (aba) sandwich shaped major domain similar to that of the most abundant milk proteins CSN1S1, CSN2 and CSN3. Besides these 10 highly expressed proteins, two further proteins, CEL (1.00 mg/mL) and Calc1 (2.10 mg/mL), have a CSN2-like 3-layer (aba) sandwich domain and a mainly alpha beta domain, respectively, although they do not reach very high expression levels in transgenic milk. Their moderate expression levels are probably due to their cDNA-based expression constructs, because the gene structure used in an expression construct seems to have a significant impact on expression level: a genomic DNA sequence generally results in an expression level several orders of magnitude greater than that of a cDNA sequence (Whitelaw et al. 1991).
Table 2

The main domains of recombinant proteins expressed in transgenic milk

| Protein | CATH code | Domain |
|---|---|---|
| IGF1 | 1.10.100.10 | Mainly alpha; orthogonal bundle; insulin-like, subunit E; insulin-like, subunit E |
| AFP | 1.10.246.10 | Mainly alpha; orthogonal bundle; serum albumin; chain A, domain 1 |
| ALB | 1.10.246.10 | Mainly alpha; orthogonal bundle; serum albumin; chain A, domain 1 |
| ***LALBA*** | 1.10.530.10 | Mainly alpha; orthogonal bundle; lysozyme |
| LYZ | 1.10.530.10 | Mainly alpha; orthogonal bundle; lysozyme |
| EPO | 1.20.1250.10 | Mainly alpha; up-down bundle; growth hormone; chain A |
| GH (a) | 1.20.1250.10 | Mainly alpha; up-down bundle; growth hormone; chain A |
| IL2 | 1.20.1250.10 | Mainly alpha; up-down bundle; growth hormone; chain A |
| NGF | 2.10.90.10 | Mainly beta; ribbon; cystine knot cytokines, subunit B; cystine-knot cytokines |
| FIX | 2.40.10.10 | Mainly beta; beta barrel; thrombin, subunit H; trypsin-like serine proteases |
| FVII | 2.40.10.10 | Mainly beta; beta barrel; thrombin, subunit H; trypsin-like serine proteases |
| FVIII | 2.40.10.10 | Mainly beta; beta barrel; thrombin, subunit H; trypsin-like serine proteases |
| PROC | 2.40.10.10 | Mainly beta; beta barrel; thrombin, subunit H; trypsin-like serine proteases |
| tPA | 2.40.10.10 | Mainly beta; beta barrel; thrombin, subunit H; trypsin-like serine proteases |
| ***BLG*** | 2.40.128.20 | Mainly beta; beta barrel; lipocalin |
| CAT | 2.40.180.10 | Mainly beta; beta barrel; catalase HpII, chain A, domain 1 |
| ATCD20IgH (a) | 2.60.40.10 | Mainly beta; sandwich; immunoglobulin-like; immunoglobulins |
| ATCD20IgL (a) | 2.60.40.10 | Mainly beta; sandwich; immunoglobulin-like; immunoglobulins |
| SOD3 | 2.60.40.200 | Mainly beta; sandwich; immunoglobulin-like |
| mCol18a1 (b) | 3.10.100.10 | Alpha beta; roll; mannose-binding protein A, chain A; mannose-binding protein A, subunit A |
| REG3A (a, b) | 3.10.100.10 | Alpha beta; roll; mannose-binding protein A, chain A; mannose-binding protein A, subunit A |
| ***CSN1S2*** | 3.20.20.70 | Alpha beta; alpha–beta barrel; TIM barrel; aldolase class I |
| GAA (a, b) | 3.20.20.80 | Alpha beta; alpha–beta barrel; TIM barrel; glycosidases |
| SERPINA1 (a, b) | 3.30.497.10 | Alpha beta; 2-layer sandwich; antithrombin; chain I, domain 2; antithrombin, subunit I, domain 2 |
| SERPINC1 (a, b) | 3.30.497.10 | Alpha beta; 2-layer sandwich; antithrombin; chain I, domain 2; antithrombin, subunit I, domain 2 |
| SERPING1 (a, b) | 3.30.497.10 | Alpha beta; 2-layer sandwich; antithrombin; chain I, domain 2; antithrombin, subunit I, domain 2 |
| LTF (a, b) | 3.40.190.10 | Alpha beta; 3-layer (aba) sandwich; D-maltodextrin-binding protein; domain 2; periplasmic binding protein-like II |
| ***CSN2*** | 3.40.50.10150 | Alpha beta; 3-layer (aba) sandwich; Rossmann fold; diol dehydratase; chain B |
| BCHE (a, b) | 3.40.50.1820 | Alpha beta; 3-layer (aba) sandwich; Rossmann fold |
| CEL (b, c) | 3.40.50.1820 | Alpha beta; 3-layer (aba) sandwich; Rossmann fold |
| ***CSN1S1*** | 3.40.630.10 | Alpha beta; 3-layer (aba) sandwich; aminopeptidase; Zn peptidases |
| ***CSN3*** | 3.40.710.10 | Alpha beta; 3-layer (aba) sandwich; beta-lactamase; DD-peptidase/beta-lactamase superfamily |
| FGA (a, b) | 3.90.215.10 | Alpha beta; alpha–beta complex; gamma fibrinogen; chain A, domain 1 |
| FGB (a, b) | 3.90.215.10 | Alpha beta; alpha–beta complex; gamma fibrinogen; chain A, domain 1 |
| FGG (a, b) | 3.90.215.10 | Alpha beta; alpha–beta complex; gamma fibrinogen; chain A, domain 1 |
| Calc1 (b, c) | 3.90.320.20 | Alpha beta; alpha–beta complex; lambda exonuclease; chain A |
| ***WAP*** | 4.10.75.10 | Few secondary structures; irregular; R-elafin; R-elafin |

Milk proteins are shown in bold italic

(a) The top 13 highly expressed recombinant proteins in the milk of transgenic animals (expression level ≥5 mg/mL)

(b) The 13 recombinant proteins that share similar domains with casein proteins

(c) The expression level was based on a cDNA construct

Fig. 4

The tertiary structures of the main domains of the casein proteins, of 10 recombinant human proteins highly expressed in the milk of transgenic animals, and of two moderately expressed proteins, CEL (1.00 mg/mL) and Calc1 (2.10 mg/mL), whose expression was based on cDNA constructs. Figures were made using VMD (http://www.ks.uiuc.edu/Research/vmd/) and rendered using Snapshot

Promoters play a critical role in the transcriptional regulation of transgene expression. Milk protein gene promoters must be used when expressing recombinant proteins in the animal mammary gland. Promoters derived from κ-casein and αS2-casein are particularly weak (Houdebine 2000), and none of the studies listed in Table 1 used these two kinds of promoter. Promoters derived from bovine αS1-casein, goat β-casein, mouse whey acidic protein and ovine β-lactoglobulin were the most widely used in these studies. Under the regulation of the ovine β-lactoglobulin promoter and using genomic DNA of the foreign gene, transgenic sheep expressed high levels of SERPINA1 (35 mg/mL) and FIB (5 mg/mL), both of which possess major domains similar to those of caseins, but a lower level of FIX (1 mg/mL), whose major domain is distinct from those of caseins. Similarly, under the control of the goat β-casein promoter, transgenic goats expressed high levels of SERPINC1 (20 mg/mL) and SERPINA1 (14 mg/mL) but lower levels of AFP (1.1 mg/mL) and tPA (3 mg/mL); furthermore, under the control of the bovine αS1-casein promoter, transgenic rabbits expressed high levels of SERPING1 (12 mg/mL) and GAA (8 mg/mL) but lower levels of tPA (0.05 mg/mL), NGF (0.25 mg/mL) and IGF1 (1 mg/mL).
These cases indicate that protein structure may be an important factor affecting the expression level of a transgene in the mammary gland. Taken together, we suggest that recombinant proteins that share major domains similar to those of casein proteins have the potential to achieve higher expression levels in the milk of transgenic animals when they are under similar transcriptional regulation.

Materials and methods

Gene sequence analysis

Coding sequences of the genes in this study were obtained from GenBank. On the basis of BioGPS (http://biogps.gnf.org/#goto=welcome) and several other extensive mRNA expression microarray studies (Warrington et al. 2000; Hsiao et al. 2001; Liang et al. 2006; Shyamsundar et al. 2005; Saito-Hisaminato et al. 2002), we identified genes that are selectively expressed in the human and mouse mammary gland at the lactation stage and in six other tissues: human [mammary gland (13 genes), brain (25 genes), heart (22 genes), kidney (17 genes), liver (26 genes), lung (19 genes) and pancreas (27 genes; S1)] and mouse [mammary gland (15 genes), brain (24 genes), heart (22 genes), kidney (13 genes), liver (24 genes), lung (17 genes) and pancreas (21 genes; S2)]. The sequences of animal milk protein genes and recombinant proteins are summarized in supporting materials Table S3 and Table S4, respectively.

Codon usage analysis

For this study, the distance in synonymous codon usage between two genes was measured with a method based on the two-tailed Fisher exact test (Plotkin et al. 2004). Briefly, the method is not concerned with the degree of codon bias in the usual sense, i.e. the departure from random synonymous codon choice, but rather with the degree to which genes differ in their encoding of amino acids. Given the coding sequences for a pair of genes, the absolute frequency of each codon in each gene is first tabulated with CodonW (http://codonw.sourceforge.net//). For each amino acid, a two-tailed Fisher exact test is performed on the n × 2 contingency table given by the frequencies of the amino acid's synonymous codons in the two genes (e.g., for Ser n = 4: TCA, TCC, TCG, and TCT). This yields, for each amino acid, a P-value indicating whether or not the two genes use significantly different codons to encode that amino acid. The distance between two genes is the number of amino acids that exhibit significantly different (P < 0.01) codon usage, as defined above (the detailed SAS program is presented in supplementary materials S5). For example, when comparing human brain (25 genes) to heart (22 genes), the distance between the codon usage of every pair of genes (including pairs from the same tissue) is calculated, giving a 47-by-47 symmetric matrix of pairwise distances. A dendrogram that graphically represents the measured pairwise distances between the codon usage of the study genes is then produced with the neighbor-joining method (PHYLIP v3.66; http://evolution.gs.washington.edu/phylip.html). To test whether the observed clustering of genes in a dendrogram is nonrandom, a P value is calculated by comparing the observed summed distances along the tree between genes of the same tissue against a null distribution produced by randomly permuting the labels of the leaves.
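The pairwise distance described above can be sketched in a few lines of Python. This is an illustrative reimplementation under stated assumptions, not the SAS program of supplementary materials S5 and not CodonW: codons are counted directly from the sequence, the exact test is computed by enumerating all n × 2 tables with the observed margins, and, to match the Ser example (n = 4), six-codon amino acids are split into a four-codon and a two-codon family, so the distance counts codon families. All function and variable names are hypothetical.

```python
from collections import Counter
from itertools import product
from math import comb

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Synonymous families keyed by (amino acid, first two bases); assumption:
# this splits Ser, Leu and Arg into a 4-codon and a 2-codon family each.
FAMILIES = {}
for codon, aa in CODON_TABLE.items():
    if aa != "*":
        FAMILIES.setdefault((aa, codon[:2]), []).append(codon)
FAMILIES = {key: codons for key, codons in FAMILIES.items() if len(codons) > 1}


def codon_counts(cds):
    """Absolute codon frequencies of an in-frame coding sequence."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    codons = (cds[i:i + 3] for i in range(0, usable, 3))
    return Counter(c for c in codons if c in CODON_TABLE)


def fisher_exact_nx2(rows):
    """Two-tailed exact P for an n x 2 table given as [(a_i, b_i), ...]."""
    rows = [(a, b) for a, b in rows if a + b > 0]
    totals = [a + b for a, b in rows]
    col1 = sum(a for a, _ in rows)
    n = sum(totals)
    if len(rows) < 2 or col1 == 0 or col1 == n:
        return 1.0  # nothing to compare
    observed = 1
    for (a, _), t in zip(rows, totals):
        observed *= comb(t, a)
    p_sum = 0

    def walk(i, remaining, weight):
        # Enumerate first-column counts row by row; margins stay fixed.
        nonlocal p_sum
        if i == len(totals) - 1:
            if remaining <= totals[i]:
                w = weight * comb(totals[i], remaining)
                if w <= observed:  # table at least as extreme as observed
                    p_sum += w
            return
        for a in range(min(totals[i], remaining) + 1):
            walk(i + 1, remaining - a, weight * comb(totals[i], a))

    walk(0, col1, 1)
    return p_sum / comb(n, col1)


def codon_distance(cds_a, cds_b, alpha=0.01):
    """Number of codon families used significantly differently (P < alpha)."""
    ca, cb = codon_counts(cds_a), codon_counts(cds_b)
    return sum(
        fisher_exact_nx2([(ca[c], cb[c]) for c in codons]) < alpha
        for codons in FAMILIES.values()
    )
```

Applying `codon_distance` to every pair of genes gives the symmetric distance matrix from which the dendrogram is built. The table weights are kept as integers, so the comparison with the observed table is exact and needs no floating-point tolerance.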

Protein main domain analysis

Domains of the recombinant proteins expressed in the milk of transgenic animals and of mammalian milk proteins were evaluated with CATH (http://www.cathdb.info/), a hierarchical classification of protein domain structures that clusters proteins at four major levels: Class (C), Architecture (A), Topology (T) and Homologous superfamily (H). Class is determined according to the secondary structure composition and packing within the structure. Three major classes are recognized: mainly-alpha, mainly-beta and alpha–beta. Architecture describes the overall shape of the domain structure as determined by the orientations of the secondary structures, but ignores the connectivity between them. It is currently assigned manually using a simple description of the secondary structure arrangement, e.g. barrel or 3-layer sandwich. At the Topology level, structures are grouped according to whether they share the same topology or fold in the core of the domain, that is, whether they share the same overall shape and connectivity of the secondary structures in the domain core. Homologous superfamily groups together protein domains that are thought to share a common ancestor and can therefore be described as homologous. Similarities are identified either by high sequence identity or by structure comparison using SSAP. Boundaries and assignments for each protein domain are determined using a combination of automated and manual procedures, which include computational techniques, empirical and statistical evidence, literature review and expert analysis. Generally, if a given protein chain has sufficiently high sequence identity and structural similarity (i.e. 80% sequence identity, SSAP score ≥80) with a chain that has previously been chopped, the domain boundary assignment is performed automatically by inheriting the boundaries from the other chain (ChopClose).
Otherwise, the domain boundaries are assigned manually, based on an analysis of results derived from a range of algorithms, which include structure-based methods [CATHEDRAL, SSAP, DETECTIVE (Swindells, 1995), PUU (Holm and Sander, 1994), DOMAK (Siddiqui and Barton, 1995)], sequence-based methods (profile HMMs) and relevant literature.
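Because a CATH code is simply the four levels written as dot-separated numbers, the similarity between two domains can be read off by comparing codes from left to right. The Python sketch below is an illustration under that reading and is not part of the CATH software; the function names are hypothetical and the example codes are taken from Table 2.

```python
LEVELS = ("class", "architecture", "topology", "homologous superfamily")


def parse_cath(code):
    """Split a CATH code such as '3.40.50.1820' into its four integer levels."""
    parts = code.strip().split(".")
    if len(parts) != 4 or not all(p.isdigit() for p in parts):
        raise ValueError(f"not a four-level CATH code: {code!r}")
    return tuple(int(p) for p in parts)


def deepest_shared_level(code_a, code_b):
    """Name of the deepest level at which two codes still agree, or None."""
    shared = None
    for name, a, b in zip(LEVELS, parse_cath(code_a), parse_cath(code_b)):
        if a != b:
            break
        shared = name
    return shared


# Codes from Table 2.
CSN2 = "3.40.50.10150"
BCHE = "3.40.50.1820"
LTF = "3.40.190.10"
SERPINA1 = "3.30.497.10"
FIX = "2.40.10.10"

levels = {
    "BCHE": deepest_shared_level(BCHE, CSN2),
    "LTF": deepest_shared_level(LTF, CSN2),
    "SERPINA1": deepest_shared_level(SERPINA1, CSN2),
    "FIX": deepest_shared_level(FIX, CSN2),
}
```

Read this way, BCHE agrees with CSN2 down to the topology level, LTF down to architecture, SERPINA1 only at class, and FIX not at all.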

How to improve mammary transgene expression?

We suspect that it is not necessary to consider codon usage optimization when targeting a human gene to express in the animal mammary gland. So we investigated the codon usage pattern between mammary gland and other tissues. Significant differences were found between mammary gland with heart and pancreas tissues both in human and mouse. However, no significant correlation was found between expression levels and codon usage of recombinant human proteins expressed in the milk of transgenic animals. This may indicate that even though a human gene shares a similar codon usage with a mammary gland specific gene, especially milk protein genes, does not guarantee an effective expression in the mammary gland. In contrast, those proteins which share a similar domain with the four caseins are capable of achieving higher expression levels in animal mammary gland. We suppose that the tertiary structure of a recombinant protein adapts to the synthesis and secretion process of milk proteins especially caseins may permit it to be efficiently expressed in animal mammary gland. In the secretory epithelial cell of mammary gland, milk protein precursors are assembled on the ribosomes of the highly developed rough endoplasmic reticulum (ER). All milk proteins have conserved secretory signal peptide sequences, which lead the growing nascent peptides to insert into the lumen of the ER. The proteins are then transported to the Golgi apparatus. The caseins then gradually intercalate with each other, calcium and phosphate to form a submicellar structures which lead to the formation of casein micelles, finally secreted by reverse pinocytosis (Farrell et al. 2006). Protein folding in the ER is monitored by ER quality control (ERQC) mechanisms. 
Proteins that pass ERQC criteria traffic to their final destinations through the secretory pathway, whereas non-native and unassembled subunits of multimeric proteins are degraded by the ER-associated degradation (ERAD) pathway (Vembar and Brodsky 2008). Overall, the environment of the ER lumen would be conducive to the proper casein-casein association, which helps them to escape ERAD and move onto the Golgi for processing and secretion. Thus the recombinant proteins as SERPINA1, SERPINC1, SERPING1 and BCHE, which have compact spherical structures composed of mainly alpha beta, 2-lalyer or 3-layer sandwich like domains, presumably form self-association or interact with caseins through the conserved structural elements, and escape the ERAD, and are efficiently transported to the Golgi apparatus to assemble micelles, and ultimately efficiently secreted into milk, lead to a high expression level in the transgenic milk. In contrast, AFP, CAT, FIX, IL2, NGF, and tPA all have a mainly alpha or beta barrel based domain which is quite different from casein (Supplementary Fig. 10). They may not interact with caseins properly and are aggregated to evoke the ERAD, thus fail to traffic to the Golgi apparatus, and ultimate lead to less efficient expression in the milk of transgenic animals. Today it is still difficult to exactly predict the expression efficiency of one transgenic animal bioreactor. Because the expression efficiency of mammary transgene relies on a series of complex molecular regulation and cellular process. And it may be animal species dependent. Protein tertiary structure may be an important factor affecting the expression of animal mammary transgene. We doubt this is an absolute factor but, for example, for those proteins failing to secrete into transgenic milk even after transcriptional optimization, we might need to pay more attention to their tertiary structures. 
To verify this hypothesis, we may need to generate a group of carefully designed gene-targeted mice expressing structurally distinct proteins under similar transcriptional regulation. Furthermore, if mammary transgenes encode proteins that are structurally distinct from caseins, and may therefore pass through the ER/Golgi-dependent classical secretion pathway inefficiently, engineering the non-classical secretory pathway (He et al. 2008a, b; Nickel 2003) becomes an intriguing option.
Below is the link to the electronic supplementary material. Supplementary material 1 (PDF 929 kb)
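The comparison of codon usage with expression level discussed above can be illustrated with a minimal sketch. This section does not describe the statistical procedure actually used, so everything below is an assumption for illustration only: the function names (codon_frequencies, usage_distance, pearson) are hypothetical, Euclidean distance and Pearson correlation are stand-ins for whatever measures were applied, and the sequences and expression values are made-up placeholders, not data from this study.

```python
from collections import Counter
from math import sqrt
from typing import Dict, Sequence


def codon_frequencies(cds: str) -> Dict[str, float]:
    """Relative frequency of each codon in a coding sequence (reading frame 0)."""
    cds = cds.upper().replace("U", "T")
    # Use complete in-frame triplets only; a trailing partial codon is ignored.
    usable = len(cds) - len(cds) % 3
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()} if total else {}


def usage_distance(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Euclidean distance between two codon frequency profiles."""
    codons = set(a) | set(b)
    return sqrt(sum((a.get(c, 0.0) - b.get(c, 0.0)) ** 2 for c in codons))


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((p - mx) * (q - my) for p, q in zip(x, y))
    sx = sqrt(sum((p - mx) ** 2 for p in x))
    sy = sqrt(sum((q - my) ** 2 for q in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / (sx * sy)


if __name__ == "__main__":
    # Placeholder sequences and expression values (hypothetical, not real data).
    reference = codon_frequencies("ATGAAGGTCCTGATCCTG")  # stand-in milk protein gene
    transgenes = {
        "toy_a": ("ATGAAGGTCCTGATCCTG", 3.0),
        "toy_b": ("ATGAAAGTTTTAATATTA", 1.0),
        "toy_c": ("ATGAAGGTTCTGATATTA", 2.0),
    }
    distances = [usage_distance(codon_frequencies(seq), reference)
                 for seq, _ in transgenes.values()]
    expression = [level for _, level in transgenes.values()]
    print("distance vs expression, r =", pearson(distances, expression))
```

With real data, each transgene would contribute one point (its codon usage distance to a mammary reference profile and its reported expression level in milk), and a correlation coefficient near zero would be consistent with the lack of association reported here.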
  67 in total

Review 1.  The mystery of nonclassical protein secretion. A current view on cargo proteins and potential export routes.

Authors:  Walter Nickel
Journal:  Eur J Biochem       Date:  2003-05

2.  Human lactoferrin transgenic rabbits produced efficiently using dimethylsulfoxide-sperm-mediated gene transfer.

Authors:  Lan Li; Wei Shen; Lingjiang Min; Huansheng Dong; Yujiang Sun; Qingjie Pan
Journal:  Reprod Fertil Dev       Date:  2006       Impact factor: 2.311

3.  Transgenic rabbits as bioreactors for the production of human growth hormone.

Authors:  J M Limonta; F O Castro; R Martínez; P Puentes; B Ramos; A Aguilar; R L Lleonart; J de la Fuente
Journal:  J Biotechnol       Date:  1995-05-15       Impact factor: 3.307

4.  Production of goats by somatic cell nuclear transfer.

Authors:  A Baguisi; E Behboodi; D T Melican; J S Pollock; M M Destrempes; C Cammuso; J L Williams; S D Nims; C A Porter; P Midura; M J Palacios; S L Ayres; R S Denniston; M L Hayes; C A Ziomek; H M Meade; R A Godke; W G Gavin; E W Overström; Y Echelard
Journal:  Nat Biotechnol       Date:  1999-05       Impact factor: 54.908

5.  Mammary specific transgenic over-expression of insulin-like growth factor-I (IGF-I) increases pig milk IGF-I and IGF binding proteins, with no effect on milk composition or yield.

Authors:  Marcia H Monaco; Derek E Gronlund; Gregory T Bleck; Walter L Hurley; Matthew B Wheeler; Sharon M Donovan
Journal:  Transgenic Res       Date:  2005-10       Impact factor: 2.788

6.  The deleterious effects of human erythropoietin gene driven by the rabbit whey acidic protein gene promoter in transgenic rabbits.

Authors:  M Massoud; J Attal; D Thépot; H Pointu; M G Stinnakre; M C Théron; C Lopez; L M Houdebine
Journal:  Reprod Nutr Dev       Date:  1996

7.  A compendium of gene expression in normal human tissues.

Authors:  L L Hsiao; F Dangond; T Yoshida; R Hong; R V Jensen; J Misra; W Dillon; K F Lee; K E Clark; P Haverty; Z Weng; G L Mutter; M P Frosch; M E MacDonald; E L Milford; C P Crum; R Bueno; R E Pratt; M Mahadevappa; J A Warrington; G Stephanopoulos; G Stephanopoulos; S R Gullans
Journal:  Physiol Genomics       Date:  2001-12-21       Impact factor: 3.107

8.  Nonclassical secretion of human catalase on the surface of CHO cells is more efficient than classical secretion.

Authors:  Zuyong He; Xiuzhu Sun; Gui Mei; Shengli Yu; Ning Li
Journal:  Cell Biol Int       Date:  2007-12-23       Impact factor: 3.612

9.  High expression of the human hepatocarcinoma-intestine-pancreas/pancreatic-associated protein (HIP/PAP) gene in the mammary gland of lactating transgenic mice. Secretion into the milk and purification of the HIP/PAP lectin.

Authors:  L Christa; A Pauloin; M T Simon; M G Stinnakre; M L Fontaine; S Delpal; M Ollivier-Bousquet; C Bréchot; E Devinoy
Journal:  Eur J Biochem       Date:  2000-03

Review 10.  Transgenic animal bioreactors.

Authors:  L M Houdebine
Journal:  Transgenic Res       Date:  2000       Impact factor: 2.788

