Michelle S Scott, Peter V Troshin, Geoffrey J Barton.
Abstract
BACKGROUND: Nucleolar localization sequences (NoLSs) are short targeting sequences responsible for the localization of proteins to the nucleolus. Given the large number of proteins experimentally detected in the nucleolus and the central role of this subnuclear compartment in the cell, NoLSs are likely to be important regulatory elements controlling cellular traffic. Although many proteins have been reported to contain NoLSs, the systematic characterization of this group of targeting motifs has only recently been carried out.
Year: 2011 PMID: 21812952 PMCID: PMC3166288 DOI: 10.1186/1471-2105-12-317
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.307
Figure 1. Example of NoLS prediction returned by NoD. If at least one NoLS is predicted in a protein, NoD returns an output page that displays the sequence and position of the predicted NoLSs, the full-length protein sequence as entered by the user with the NoLSs in red, and a graph showing the average NoLS prediction score for every 20-residue window in the protein. The region shown in pink in this graph is the NoLS candidate segment region and represents the range of scores within which a 20-residue segment is predicted to be a NoLS.
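The per-window averaging described in the Figure 1 caption can be sketched as follows. This is a minimal illustration and not NoD's own implementation: the function names, the per-residue `scores` list, and the `threshold` parameter are all assumptions made for the example, and the actual score range that defines the candidate region is not stated here.

```python
def window_averages(scores, window=20):
    """Return the mean score of every full `window`-residue segment."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def candidate_windows(scores, threshold, window=20):
    """Return 1-based (start, end) positions of windows whose mean score
    reaches `threshold` (a hypothetical cutoff, not NoD's real value)."""
    return [
        (i + 1, i + window)
        for i, avg in enumerate(window_averages(scores, window))
        if avg >= threshold
    ]
```

A protein shorter than the window yields no segments, so both functions return an empty list in that case.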
Clinod output formats
| Format name | Format description | Example output^a |
|---|---|---|
| MINIMAL | Sequence name and number of NoLSs predicted | > NOL12 |
| SHORT | Same as MINIMAL, plus the start and end positions of each NoLS | > NOL12 |
| MEDIUM (default) | Same as SHORT, plus the sequences of all NoLSs | > NOL12 |
| FULL | Same as MEDIUM, plus the predictor score for each residue in the sequence | > NOL12 |
| COMPLETE | Same as FULL, plus the input sequences | > NOL12 |
a The sequences and scores are truncated in the Table for clarity of presentation.
Details of NoD predictions on the assembled multi-organism testing dataset
| Organism | Protein accession | Name | Experimentally determined NoLS position | NoD prediction | Ref^a |
|---|---|---|---|---|---|
| Homo sapiens | NP_001012333 | Midkine | 129-143 | 120-143 | [ |
| Homo sapiens | NP_055701 | NSA2 | 10-41 | No NoLS | [ |
| Homo sapiens | NP_055701 | NSA2 | 131-154 | 133-155 | [ |
| Homo sapiens | NP_872604 | RASSF5 | 51-100 | 78-98 | [ |
| Homo sapiens | NP_037541 | follistatin | 93-116^b | 98-121 | [ |
| Homo sapiens | CAA41051 | histone H2B | 28-35 | 15-42 | [ |
| Mus musculus | NP_001012495 | Cxcl12 | 98-118 | 92-119 | [ |
| Mus musculus | NP_081208 | NoBP | 220-262 | 230-255 and 276-306 | [ |
| Mus musculus | NP_082355 | aminopeptidase O | 688-725 | 682-712 | [ |
| Dictyostelium discoideum | XP_002649205 | eIF6 | 31-64 | 27-49 | [ |
| Dictyostelium discoideum | XP_002649205 | eIF6 | 246-252 | 295-320 | [ |
| Aplysia kurodai | B0FRH7 | ApLLP | 1-19 | 1-21 | [ |
| Aplysia kurodai | B0FRH7 | ApLLP | 90-120 | 96-120 | [ |
| Trypanosoma brucei | CAD21884 | ESAG8 | 48-79 | No NoLS | [ |
| Trypanosoma cruzi | XP_817097 | Met-III | 1-19 | No NoLS | [ |
| Trypanosoma cruzi | XP_817097 | Met-III | 146-191 | No NoLS | [ |
| Solanum lycopersicum | Q944N1 | LHP1 | 141-171 | 141-165 and 276-296 | [ |
| Arabidopsis thaliana | NP_001078269 | HMGB1 | 1-47 | 22-60 | [ |
| Bovine herpesvirus 1 | CAA90914 | BICP27 | 86-97 | 75-108 | [ |
| Human Adenovirus C | YP_001551773 | E4orf4 | 66-75 | 61-82 | [ |
| SARS | P59633 | Non-structural protein 3b | 134-154 | No NoLS | [ |
| HTLV-1 | BAH85789 | Tof | 71-98 | No NoLS | [ |
| Human herpes simplex | P08353 | Gamma-1 34.5 protein | 1-16 | 1-22 | [ |
| Human adenovirus 2 | P68950 | protein VII | 93-112 | 90-117 | [ |
| African Swine Fever Virus | AAA87288 | I14L | 1-14 | 1-26 | [ |
| PRRSV (porcine) | AAD00244 | N protein | 41-48 | 1-21 and 32-59 | [ |
| Tomato Leaf Curl Java Virus | BAD90868 | Capsid protein | 1-30 | No NoLS | [ |
| Potato leafroll virus | P11624 | Capsid protein | 17-31 | 10-64 | [ |
| Marek's disease virus type 1 | AAS01627 | MEQ protein | 62-78 | 22-47 and 52-81 | [ |
| Avian Infectious Bronchitis Virus | CAC39307 | N protein | 71-78 | 347-377 | [ |
| Betanodavirus GGNNV | NP_689432 | Protein alpha | 23-31 | 10-40 | [ |
a Ref: Reference reporting the experimental NoLS identification
b In [22], the NoLS for follistatin is reported at positions 64-87. These correspond to the positions in the protein once the signal peptide has been removed.
Accuracy of NoD predictions in all organisms investigated
| Organism | Distinct protein count | NoLS count | TP^a | FP^b | Sensitivity | PPV^c | Specificity^d |
|---|---|---|---|---|---|---|---|
| (subtotal) | 8 | 9 | 8 | 1 | 0.89 | 0.89 | 0.88 |
| H. sapiens | 5 | 6 | 5 | 0 | 0.83 | 1.0 | 1.0 |
| M. musculus | 3 | 3 | 3 | 1 | 1.0 | 0.75 | 0.67 |
| (subtotal) | 1 | 2 | 1 | 1 | 0.5 | 0.5 | 0.0 |
| Dictyostelium discoideum | 1 | 2 | 1 | 1 | 0.5 | 0.5 | 0.0 |
| (subtotal) | 1 | 2 | 2 | 0 | 1.0 | 1.0 | 1.0 |
| A. kurodai | 1 | 2 | 2 | 0 | 1.0 | 1.0 | 1.0 |
| (subtotal) | 2 | 3 | 0 | 0 | 0 | N/A | 1.0 |
| T. brucei | 1 | 1 | 0 | 0 | 0 | N/A | 1.0 |
| T. cruzi | 1 | 2 | 0 | 0 | 0 | N/A | 1.0 |
| (subtotal) | 2 | 2 | 2 | 1 | 1.0 | 0.67 | 0.5 |
| S. lycopersicum | 1 | 1 | 1 | 1 | 1.0 | 0.50 | 0.0 |
| A. thaliana | 1 | 1 | 1 | 0 | 1.0 | 1.0 | 1.0 |
| Mammalian host | 8 | 8 | 6 | 1 | 0.75 | 0.86 | 0.88 |
| Plant host | 2 | 2 | 1 | 0 | 0.5 | 1.0 | 1.0 |
| Avian host | 2 | 2 | 1 | 2 | 0.5 | 0.33 | 0.0 |
| Fish host | 1 | 1 | 1 | 0 | 1.0 | 1.0 | 1.0 |
a TP: true positive
b FP: false positive
c PPV: positive predictive value
d The specificity was calculated as the number of proteins considered for which no FP was identified, divided by the number of proteins considered (this defines all non-NoLS regions as negatives).
e For each of the count columns, the top row of each of the subsections in the Eukaryotes section represents the sum of the rows below it belonging to this subsection.
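The metrics in the accuracy table can be recomputed from its count columns, as in the short sketch below. The specificity function follows footnote d directly. The sensitivity and PPV functions use the standard definitions (TP over the number of known NoLSs, and TP over all predictions), which reproduce the tabulated values but are an assumption, since the formulas are not spelled out here. The function names are illustrative, and `proteins_without_fp` is not a table column: it has to be read off the detailed prediction table.

```python
from typing import Optional


def sensitivity(tp: int, nols_count: int) -> float:
    """Fraction of experimentally determined NoLSs that were predicted."""
    return tp / nols_count


def ppv(tp: int, fp: int) -> Optional[float]:
    """Fraction of predictions that are true positives.

    Returns None (shown as N/A in the table) when nothing was predicted.
    """
    predicted = tp + fp
    return tp / predicted if predicted else None


def specificity(proteins_without_fp: int, protein_count: int) -> float:
    """Footnote d: proteins with no FP divided by proteins considered."""
    return proteins_without_fp / protein_count


# M. musculus row: 3 proteins, 3 NoLSs, TP = 3, FP = 1.
# Assumption from the detail table: the single FP falls in one protein,
# so 2 of the 3 proteins have no false positive.
print(round(sensitivity(3, 3), 2))   # 1.0
print(round(ppv(3, 1), 2))           # 0.75
print(round(specificity(2, 3), 2))   # 0.67
```

Returning `None` rather than 0 for the PPV keeps the trypanosome rows, where no NoLS was predicted at all, distinguishable from rows where every prediction was wrong.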