
Predicting the nuclear localization signals of 107 types of HPV L1 proteins by bioinformatic analysis.

Jun Yang, Yi-Li Wang, Lü-Sheng Si.   

Abstract

In this study, the L1 protein sequences of 107 types of human papillomavirus (HPV) were obtained from available databases, and the nuclear localization signals (NLSs) of these HPV L1 proteins were analyzed and predicted by bioinformatic analysis. Out of the 107 types, the NLSs of 39 types were predicted by PredictNLS software (35 types with bipartite NLSs and 4 types with monopartite NLSs). The NLSs of the remaining HPV types were predicted according to the characteristics and the homology of the already predicted NLSs as well as the general rules of NLSs. According to the results, the NLSs of the 107 types of HPV L1 proteins were classified into 15 categories. Different types of HPV L1 proteins in the same NLS category may share a similar or identical nucleocytoplasmic transport pathway, and they might serve as a common target for preventing and treating different types of HPV infection. The results also showed that bioinformatic technology can be used to analyze and predict the NLSs of proteins.

Year:  2006        PMID: 16689700      PMCID: PMC5054033          DOI: 10.1016/S1672-0229(06)60014-4

Source DB:  PubMed          Journal:  Genomics Proteomics Bioinformatics        ISSN: 1672-0229            Impact factor:   7.691


Introduction

Human papillomaviruses (HPVs) are small, non-enveloped DNA viruses. HPV infection is associated with more than 90% of all cases of cervical cancer, which is the second leading cause of cancer death among women worldwide (1, 2). HPVs have been classified into more than 100 types based on the nucleotide sequence homology of their genome, a single molecule of 8-kb double-stranded circular DNA. Each HPV type has a different specificity for infection of skin or mucosa. The HPV virion (55–60 nm in diameter) has an icosahedral capsid, which comprises the L1 major and L2 minor capsid proteins. L1 proteins form pentamers (capsomeres), and 72 capsomeres assemble into a T = 7d icosahedral lattice (5, 6). An HPV capsid therefore comprises 360 molecules of L1 protein. L2 proteins interact with L1 pentamers. The molar ratio of L1 to L2 proteins is estimated to be 30:1 (7, 8). L1 proteins can self-assemble into virus-like particles, which are similar in size, shape, and conformational epitopes to native virions, although L2 proteins increase the efficiency of DNA encapsidation by at least 50-fold (8, 9, 10). HPVs infect basal cells of the epithelium through microlesions and replicate only in differentiating cells. These cells are difficult to culture in vitro; hence, no tissue culture system for the large-scale propagation of HPV virions in vitro is available at present. As a result, the study of the viral structural proteins lags behind that of the viral oncoproteins. Consequently, little is known about the cellular and viral factors that control the switch and process of papillomavirus genome replication and viral protein expression. Many events in the papillomavirus life cycle have not been elucidated, and the nuclear transport of the viral genome and structural proteins in particular is poorly understood. Nevertheless, the L1 proteins of HPVs are understood at the molecular level to a certain extent.
During the virus life cycle, L1 proteins seem to enter the nuclei of host cells twice. In the initial stage of HPV infection, immediately after the virions infect the undifferentiated proliferating epithelial cells, L1 proteins together with the viral genome are transported into the nuclei of these cells. During the late stage of HPV infection, the newly synthesized L1 proteins in the cytoplasm are transported into the nuclei of terminally differentiated keratinocytes, where, together with L2 proteins, they package the replicated HPV genomic DNA and assemble into infectious virions. This suggests that the nuclear import of L1 proteins plays a very important role in HPV infection and virion production. Nuclear import is determined by the nuclear localization signal (NLS) in the C-terminal region of HPV L1 proteins, so it is important to investigate the NLSs of HPVs. To date, more than 120 HPV types have been isolated and partially characterized, and about 100 distinct HPV types have been identified and fully sequenced, but only a few NLSs of HPV L1 proteins have been experimentally determined. Identifying the NLSs of all HPV types by experiment is therefore difficult and impractical. In this paper, we attempt to analyze and predict the NLSs of 107 types of HPV L1 proteins by bioinformatic analysis.

Results

The full sequences of 107 types of HPV L1 proteins were obtained from available databases (see Materials and Methods). Out of the 107 types, the NLSs of 39 types were predicted by PredictNLS software (http://cubic.bioc.columbia.edu/predictNLS/). Among them, 35 types contain bipartite NLSs, in which two tight clusters of basic residues (one being KRKR, KRKRK, or KRKKRK, the other KR, RKR, or KRK) are separated by a spacer of 10–14 amino acids. The other four types (HPV22, HPV34, HPV48, and HPV73) were predicted to contain monopartite NLSs, in which arginines and/or lysines form a single tight cluster of basic residues, as typified by the simian virus 40 large T antigen (SV40 T). The NLSs of the remaining HPV types were predicted according to the characteristics and the homology of the already predicted NLSs as well as the general rules of NLSs. According to the results, the NLSs of the 107 types of HPV L1 proteins were classified into 15 categories (Table 1), among which categories XIV and XV contain monopartite NLSs. In addition, the NLSs of the L1 proteins of HPV1, 6, 11, 16, 31, 33, 35, and 45 are available from the literature (12, 13, 14, 15).
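
The bipartite pattern described above (two basic clusters separated by a 10–14 residue spacer) can be expressed as a regular expression. The following is a minimal sketch only and is not the PredictNLS algorithm. It assumes that the short cluster (KR, RKR, or KRK) lies upstream of the long cluster, which the text does not state explicitly, and the function name and the toy sequence are hypothetical.

```python
import re

# Assumption: short basic cluster upstream, 10-14 residue spacer,
# long basic cluster downstream. The cluster lists come from the text;
# the ordering is an illustrative assumption.
BIPARTITE = re.compile(r"(KRK|RKR|KR)([A-Z]{10,14})(KRKKRK|KRKRK|KRKR)")


def find_bipartite_nls(seq):
    """Return (start, upstream, spacer_length, downstream) for the first
    bipartite-like match in seq, or None if there is no match."""
    match = BIPARTITE.search(seq.upper())
    if match is None:
        return None
    return match.start(), match.group(1), len(match.group(2)), match.group(3)


# Hypothetical toy fragment (not a real HPV L1 sequence), built piecewise
# so that the spacer length is explicit: 1 + 2*5 + 1 = 12 residues.
SPACER = "A" + "ST" * 5 + "A"
TOY = "GSG" + "KRK" + SPACER + "KRKRK" + "V"
TOO_SHORT = "GSG" + "KRK" + "ASTST" + "KRKRK" + "V"  # 5-residue spacer

if __name__ == "__main__":
    print(find_bipartite_nls(TOY))
    print(find_bipartite_nls(TOO_SHORT))
```

A pattern of this kind only flags candidates; whether a matched segment actually directs nuclear import still has to be established experimentally.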
Table 1

Homology Analysis of the NLSs of 107 Types of HPV L1 Proteins

Discussion

In eukaryotic cells, the nucleus has a highly specialized structure that participates in the regulation of cell processes, including the regulation of the cell cycle and the induction of antiviral responses. The nuclear pore complex (NPC) is a large supramolecular structure with a mass of about 125 MDa in vertebrates, which is embedded in the nuclear envelope as the only gateway between nucleus and cytoplasm (17, 18, 19, 20). Over the past years, a consensus model of the three-dimensional (3D) architecture of the NPC has emerged, in which it is composed of an eight-fold symmetric central framework. In the course of biological evolution, the NPC has remained highly conserved in eukaryotic cells, which share a similar nuclear transport mechanism (19, 20). The nuclear import of proteins typically requires the presence of NLSs, which are characteristically rich in basic amino acids (22, 23, 24). NLS motifs play a key role in the nuclear transport mechanism. To enter the nucleus, proteins with a molecular weight (MW) of 45–60 kDa or more must pass through the NPC via an NLS or in association with another protein via a piggyback mechanism, whereas small proteins (MW < 40 kDa) cross the NPC by passive diffusion (25, 26, 27). NLSs have since been found in numerous viral and nuclear proteins of eukaryotic cells. At present, they can be classified into two major categories. The first category comprises the monopartite (single type) NLSs, which contain 3–5 basic amino acids with the weak consensus Lys-Arg/Lys-X-Arg/Lys, preceded by a helix-breaking residue, and which are similar to the SV40 T NLS (pKKKRKv). They are now referred to as classical NLSs. The second category comprises the bipartite NLSs, which contain two clusters of basic residues: a basic dipeptide upstream and a simple basic sequence of 3–4 residues downstream, separated by approximately 10 amino acids, as in the nucleoplasmin NLS (KRpaatkkagqaKKKKldk) (29, 30).
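
The two prototype sequences quoted above can be used to illustrate the definitions. The sketch below encodes the weak monopartite consensus Lys-Arg/Lys-X-Arg/Lys as a regular expression and lists runs of two or more basic residues in a sequence. The function names are hypothetical, the minimum run length of two is an assumption chosen for illustration, and the requirement for a preceding helix-breaking residue is not checked.

```python
import re

# Weak monopartite consensus from the text: K, then K or R, then any
# residue, then K or R.
MONOPARTITE = re.compile(r"K[KR].[KR]")

# Assumption: a "cluster" is a run of at least two consecutive K/R residues.
BASIC_RUN = re.compile(r"[KR]{2,}")

SV40_T = "pKKKRKv"                    # prototype quoted in the text
NUCLEOPLASMIN = "KRpaatkkagqaKKKKldk"  # prototype quoted in the text


def has_monopartite_consensus(seq):
    """True if seq contains a segment matching the weak consensus."""
    return MONOPARTITE.search(seq.upper()) is not None


def basic_clusters(seq):
    """Return (start, residues) for every run of two or more K/R residues."""
    return [(m.start(), m.group()) for m in BASIC_RUN.finditer(seq.upper())]


if __name__ == "__main__":
    print(has_monopartite_consensus(SV40_T))
    print(basic_clusters(SV40_T))
    print(basic_clusters(NUCLEOPLASMIN))
```

For the nucleoplasmin prototype, the first cluster (KR at position 0) and the last cluster (KKKK at position 12) are separated by ten residues, which matches the spacing given in the text; the intervening KK pair lies inside the spacer.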
The sequences found in SV40 T (pKKKRKv) and nucleoplasmin (KRpaatkkagqaKKKKldk) are the prototypes for monopartite and bipartite NLSs, which are now known to be present in many, probably thousands of, different proteins. NLSs are capable of directing a non-karyophilic protein into the nucleus when conjugated to it genetically or chemically. However, not all experimentally known NLSs comply with the above rules (31, 32, 33). Several other NLS sequences have been identified that are quite different from classical NLSs, such as the NLSs discovered in hnRNP proteins, ribosomal proteins, and UsnRNPs (34, 35, 36, 37). Recent studies have identified several proteins that contain more than one NLS, including nuclear factor 1-A, the cell division control protein mcm10, the herpes simplex virus gene product ICP22, the HIV preintegration complex, the Epstein-Barr virus DNase, the papillomavirus oncoprotein E6, BRCA2, and the Rep68/78 proteins. The enzyme 5-lipoxygenase (5-LO) has three NLSs that contain dispersed basic residues, unlike the tight cluster of basic residues of the classical SV40 T NLS. It is not clear why some proteins contain multiple NLSs. One explanation is that multiple NLSs may cooperate with one another to allow more efficient nuclear import, or may provide an alternative entry mechanism, affording redundancy in proteins that depend on successful nuclear import, such as cell cycle proteins or viral integration proteins (42, 47). Traditionally, two criteria are applied to identify an NLS experimentally. First, deleting the candidate sequence should disrupt the nuclear import of the protein; second, a non-nuclear protein should be imported into the nucleus when fused to the candidate NLS. It is very difficult and impractical to identify the NLS motifs of more than 120 types of HPVs by experiment, and only a few NLSs of HPV L1 proteins have been experimentally determined to date.
Bioinformatic techniques could be applied to analyze and predict new NLSs, which would remedy this situation. Cokol et al. have found some upper boundaries. Their method comprises two steps: (1) data collection: collect experimental NLS motifs from the literature, and extend the motifs through close homologues; (2) generalization: refine the motifs found by shortening (for those too specific) or lengthening (for those not specific enough), and test new motifs that are conceptually similar to the known motifs found in nuclear protein families. The crucial component of both steps is to accept a motif only if it is NOT found in non-nuclear proteins. It is therefore feasible to discover new NLSs in HPV L1 proteins by comparing the homologues of different types. Following Cokol's method, we analyzed the HPV L1 protein sequences for NLSs using PredictNLS software. Out of the 107 types of HPV L1 proteins, the NLSs of 39 types were predicted by PredictNLS. Neither PSORTII nor PredictNLS revealed any typical NLS in the remaining 68 types. In general, two naturally evolved proteins with more than 30% identical residues are likely to share similar 3D structures, whereas the sequence similarity required to infer function is much higher. It is nevertheless possible to infer NLSs by comparing the homologues of more than one protein. Owing to the high homology of the NLSs among the L1 proteins of all HPV types, we subsequently found sequences in the C-terminal region of all the remaining 68 types of HPV L1 proteins that were similar to the NLSs already predicted by PredictNLS or reported in experimental data. According to the consensus rules of NLSs and this high homology, the NLSs of the 107 types of HPV L1 proteins were classified into 15 categories (Table 1). Among them, categories I to XIII contain classical bipartite NLSs, while categories XIV and XV contain classical monopartite NLSs. However, few of the NLSs predicted in this paper have so far been supported by experimental data.
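
The idea of transferring an NLS by homology can be illustrated with an ungapped sliding-window comparison: a known NLS is slid along the C-terminal region of a query protein, and the window with the highest percentage of identical residues is reported. This is a simplified sketch under stated assumptions, not the procedure used in this study, which relied on ENTREZ, BLAST, and DNAClub. The function names and both sequences are hypothetical, and no identity cutoff is implied.

```python
def percent_identity(a, b):
    """Percentage of identical residues between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def best_window(query, motif):
    """Slide motif along query without gaps and return
    (start, window, identity) for the best-scoring window.
    Ties are resolved in favour of the leftmost window."""
    if len(query) < len(motif):
        raise ValueError("query is shorter than the motif")
    best = None
    for start in range(len(query) - len(motif) + 1):
        window = query[start:start + len(motif)]
        identity = percent_identity(window, motif)
        if best is None or identity > best[2]:
            best = (start, window, identity)
    return best


# Hypothetical example data (not real HPV sequences).
KNOWN_NLS = "KRKAASKRKR"
QUERY_C_TERMINUS = "GG" + "KRRAASKRKR" + "LL"

if __name__ == "__main__":
    print(best_window(QUERY_C_TERMINUS, KNOWN_NLS))
```

A real analysis would use gapped alignment and a substitution matrix, since bipartite NLSs of different HPV types differ in spacer length.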
This classification is not always consistent with experimental results. The cluster of basic residues RRR upstream of the bipartite NLS (RRRptigpRKRpaast-stastasRpaKRvRiRsKK) of HPV45 has been shown experimentally to have nuclear localization ability. The discontinuous basic amino acids K and R upstream of the NLSs of the 107 types of HPV L1 proteins may therefore also possess nuclear localization ability. Moreover, the NLS of HPV33 was predicted to be bipartite, whereas experimental results indicate that HPV33 possibly contains a monopartite NLS. In addition, although experimental results have shown that the L1 proteins of many HPV types (for example, HPV1, 6, 31, 33, and 35) contain NLSs, these NLSs cannot be found by PredictNLS. Conversely, not all clusters of basic residues in predicted NLSs have nuclear localization ability, as in the case of the NLS of IL-1β (53, 54), and this may also occur in HPV L1 proteins. The classification therefore needs to be amended and supplemented. In conclusion, this classification should play an important role in the study of the NLSs of HPV L1 proteins. The results of this paper suggest that different HPV types classified in the same category may share a similar or identical nucleocytoplasmic transport pathway. The NLSs in the same category could serve as a common, realistic, and feasible target for preventing and treating different types of HPV infection. The results also show that bioinformatic technology can be used to analyze and predict the NLSs of proteins.

Materials and Methods

The HPV L1 protein sequences were retrieved from the following databases:
http://cubic.bioc.columbia.edu/db/
http://www.ncbi.nlm.nih.gov/
http://www.stdgen.lanl.gov/stdgen/virus/
http://ca.expasy.org/
Firstly, the initial sets of experimentally determined NLSs were collected from the literature. Secondly, ENTREZ, BLAST, and DNAClub software tools were used to analyze the homology of all the HPV L1 protein sequences obtained. The PredictNLS web server for identifying potential NLSs in protein sequences, available at http://cubic.bioc.columbia.edu/predictNLS/, was used to analyze and predict the NLSs of HPV L1 proteins. According to the characteristics and the homology of the NLSs predicted by PredictNLS, as well as the general rules of NLSs, the HPV L1 proteins were classified into 15 categories. The program also allows experimentalists to test the accuracy and coverage of new NLS motifs that they may find or suspect. This feature has already helped to experimentally unravel a novel NLS in the hairless protein.
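
Testing the accuracy and coverage of a candidate motif can be sketched as follows. This is an illustration under stated assumptions, not the PredictNLS implementation: coverage is taken here as the fraction of known nuclear proteins that contain the motif, accuracy as the fraction of all matched proteins that are nuclear, and a motif is accepted only if it matches at least one nuclear protein and no non-nuclear protein. The function name and the toy protein sets are hypothetical; two of the toy nuclear sequences simply embed the SV40 T and nucleoplasmin prototypes.

```python
import re


def motif_stats(pattern, nuclear, non_nuclear):
    """Score a candidate NLS motif (a regular expression) against
    labelled nuclear and non-nuclear protein sequences."""
    regex = re.compile(pattern)
    true_hits = sum(1 for seq in nuclear if regex.search(seq))
    false_hits = sum(1 for seq in non_nuclear if regex.search(seq))
    total_hits = true_hits + false_hits
    return {
        "true_hits": true_hits,
        "false_hits": false_hits,
        # Assumed definition: share of nuclear proteins carrying the motif.
        "coverage": true_hits / len(nuclear) if nuclear else 0.0,
        # Assumed definition: share of matched proteins that are nuclear.
        "accuracy": true_hits / total_hits if total_hits else 0.0,
        # Acceptance rule: never found in non-nuclear proteins.
        "accepted": true_hits > 0 and false_hits == 0,
    }


# Hypothetical toy data sets.
NUCLEAR = ["MAPKKKRKVGG", "MSTKRPAATKKAGQAKKKKLD", "MGASTPLVAA"]
NON_NUCLEAR = ["MGGSLLAVTE", "MAKRLLTTSG"]

if __name__ == "__main__":
    print(motif_stats(r"K[KR].[KR]", NUCLEAR, NON_NUCLEAR))
    print(motif_stats(r"KR", NUCLEAR, NON_NUCLEAR))
```

In this toy data the four-residue consensus is accepted, while the bare dipeptide KR is rejected because it also occurs in a non-nuclear sequence, which mirrors the shortening and lengthening of motifs described in the Discussion.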
  54 in total

1.  Nuclear import of HPV11 L1 capsid protein is mediated by karyopherin alpha2beta1 heterodimers.

Authors:  E Merle; R C Rose; L LeRoux; J Moroianu
Journal:  J Cell Biochem       Date:  1999-09-15       Impact factor: 4.429

2.  Twilight zone of protein sequence alignments.

Authors:  B Rost
Journal:  Protein Eng       Date:  1999-02

3.  The bipartite nuclear localization sequence of Rpn2 is required for nuclear import of proteasomal base complexes via karyopherin alphabeta and proteasome functions.

Authors:  Petra Wendler; Andrea Lehmann; Katharina Janek; Sabine Baumgart; Cordula Enenkel
Journal:  J Biol Chem       Date:  2004-06-21       Impact factor: 5.157

4.  A nuclear localization signal in the matrix of spleen necrosis virus (SNV) does not allow efficient gene transfer into quiescent cells with SNV-derived vectors.

Authors:  Marie-Christine Caron; Manuel Caruso
Journal:  Virology       Date:  2005-08-01       Impact factor: 3.616

5.  Epstein-Barr virus DNase contains two nuclear localization signals, which are different in sensitivity to the hydrophobic regions.

Authors:  M T Liu; T Y Hsu; J Y Chen; C S Yang
Journal:  Virology       Date:  1998-07-20       Impact factor: 3.616

6.  The L1 major capsid protein of human papillomavirus type 11 interacts with Kap beta2 and Kap beta3 nuclear import receptors.

Authors:  Lisa M Nelson; Robert C Rose; Junona Moroianu
Journal:  Virology       Date:  2003-02-01       Impact factor: 3.616

7.  Structure of small virus-like particles assembled from the L1 protein of human papillomavirus 16.

Authors:  X S Chen; R L Garcea; I Goldberg; G Casini; S C Harrison
Journal:  Mol Cell       Date:  2000-03       Impact factor: 17.970

8.  Nuclear import strategies of high risk HPV16 L1 major capsid protein.

Authors:  Lisa M Nelson; Robert C Rose; Junona Moroianu
Journal:  J Biol Chem       Date:  2002-04-23       Impact factor: 5.157

9.  Characterization of a nuclear localization signal in the C-terminus of the adeno-associated virus Rep68/78 proteins.

Authors:  Geoffrey D Cassell; Matthew D Weitzman
Journal:  Virology       Date:  2004-10-01       Impact factor: 3.616

10.  Importin beta-depending nuclear import pathways: role of the adapter proteins in the docking and releasing steps.

Authors:  Christiane Rollenhagen; Petra Mühlhäusser; Ulrike Kutay; Nelly Panté
Journal:  Mol Biol Cell       Date:  2003-02-06       Impact factor: 4.138
