Abstract
Osteopontin (OPN) is important for tissue remodeling, cellular immune responses, and calcium homeostasis in milk and urine. In pathophysiology, the biomolecule contributes to the progression of multiple cancers. Phylogenetic analysis of 202 osteopontin protein sequences identifies a well-conserved core block of integrin-binding sites in the center of the protein. Remarkably, the length of this block varies among species, resulting in differing distances between the motifs within it. The amino acid sequence SSEE is a candidate phosphorylation site. Two copies of it reside in the far N-terminus and are variably affected by alternative splicing in humans. Between those motifs, birds and reptiles have a histidine-rich domain, which is absent from other species. Just downstream from the thrombin cleavage site, the common motif (Q/I)(Y/S/V)(P/H/Y)D(A/V)(T/S)EED(L/E)(-/S)T has been hitherto unrecognized. While well preserved, it has no assigned function as yet. The far C-terminus, although very different between Reptilia/Aves on the one hand and Mammals on the other, is highly conserved within each group of species, suggesting important functional roles that remain to be mapped. Taxonomic variations in the osteopontin sequence include a lack of about 20 amino acids in the downstream portion, a small unique sequence stretch C-terminally, a lack of six amino acids just upstream of the RGD motifs, and variable-length insertions far C-terminally.
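The bracketed motif notation used in the abstract can be turned into a regular expression for scanning sequences. The following is a minimal sketch, not the authors' method: the helper name `motif_to_regex` is hypothetical, and it assumes that a parenthesised group lists the alternatives at one position, that "-" inside a group means the position may be absent, and that "X" stands for any residue. The test sequences are invented for illustration.

```python
import re


def motif_to_regex(motif: str) -> str:
    """Translate a motif such as (Q/I)(Y/S/V)D into a regular expression.

    Assumptions (not stated in the source): a parenthesised group lists the
    alternatives at one position, "-" inside a group marks an optional
    position, and a bare "X" matches any residue.
    """
    parts = []
    # Each token is either a parenthesised group or a single letter.
    for group, single in re.findall(r"\(([^)]*)\)|([A-Za-z])", motif):
        if group:
            options = group.split("/")
            residues = "".join(o for o in options if o != "-")
            piece = f"[{residues}]"
            if "-" in options:
                piece += "?"  # the position may be missing
            parts.append(piece)
        else:
            parts.append("." if single.upper() == "X" else single.upper())
    return "".join(parts)


# Consensus motif quoted in the abstract.
TEED_MOTIF = "(Q/I)(Y/S/V)(P/H/Y)D(A/V)(T/S)EED(L/E)(-/S)T"
pattern = re.compile(motif_to_regex(TEED_MOTIF))

print(motif_to_regex(TEED_MOTIF))  # [QI][YSV][PHY]D[AV][TS]EED[LE][S]?T

# Invented test sequence, used only to show the scan.
match = pattern.search("AAAQYPDATEEDLTKK")
print(match.start(), match.group())  # 3 QYPDATEEDLT
```

A regular expression only reports exact matches to the listed alternatives; it does not score partial similarity the way MEME or GLAM2 do.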
Keywords: evolution; osteopontin; protein; sequence; taxonomy
Year: 2018 PMID: 30154395 PMCID: PMC6164354 DOI: 10.3390/ijms19092557
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 5.923
Figure 1. The domain structure of osteopontin. Known subunits are displayed as colored blocks. The vertical red lines in the primates reflect the splice sites in human osteopontin. The unmarked yellow boxes show the newly identified conserved domain without a known function. The numbers indicate the number of amino acids in the canonical sequence. While an effort was made to accurately reflect the size differences across taxonomic groups within each domain, the model is not precisely drawn to scale. Artiod. = Artiodactyla, Lago (et al.) = Lagomorpha and similar species, (v) = (various), Afrot, Xen = Afrotheria and Xenarthra, Perissod. = Perissodactyla.
Homologies at the osteopontin-c splice junction.
| (A) Splice variant | Perijunctional sequence |
|---|---|
| OPNa | |
| OPNb | |
| OPNc | |

| (B) Source | Accession / motif | Description | Nuclear localization |
|---|---|---|---|
| BLASTP | EJP73436.1 | ribosomal protein L25 | |
| BLASTP | XP_016425206.1 | G2 and S phase-expressed protein 1 | |
| BLASTP | XP_016313011.1 | G2 and S phase-expressed protein 1-like | |
| BLASTP | WP_035346390.1 | carbohydrate ABC transporter substrate-binding protein | |
| BLASTP | WP_092544273.1 | DUF1343 domain-containing protein | |
| BLASTP | XP_016350400.1 | protein P200-like | |
| BLASTP | PNY04530.1 | GDSL esterase/lipase | |
| BLASTP | EJP72445.1 | dihydrolipoyllysine-residue acetyltransferase | |
| BLASTP | XP_018304503.1 | putative inhibitor of apoptosis | |
| BLASTP | XP_010124132.1 | MAX gene-associated protein | + |
| BLASTP | XP_010079896.1 | MAX gene-associated protein | + |
| ELM | DOC_USP7_MATH_1 | USP7 MATH domain binding motif variant (MDM2 and p53 interactions) | + |
| ELM | MOD_CK2_1 | CK2 phosphorylation site | (+) |
| ELM | MOD_GSK3_1 | GSK3 phosphorylation recognition site | (+) |
| ELM | MOD_CK1_1 | CK1 phosphorylation site | (+) |
| ELM | MOD_CK2_1 | CK2 phosphorylation site | (+) |
| ELM | LIG_TRAF2_1 | Major TRAF2-binding consensus motif | |
| ELM | DEG_Nend_UBRbox_2 | N-terminal motif that initiates protein degradation | |
| ELM | MOD_GlcNHglycan | Glycosaminoglycan attachment site | |
(A) Perijunctional sequences for osteopontin-a compared to the splice variants-b and -c. (B) Homologies to the splice junction of osteopontin-c according to BLASTP and to ELM. The top row shows the search sequence. In the last column, + indicates nuclear localization, (+) indicates that the location of the match may be nuclear or cytosolic.
Figure 2. Integrin-binding block. The highest-scoring motif (score 23718.7) in a gapped local alignment with glam2scan (MEME Suite) covers the downstream portion of the poly-aspartate sequence through GRGDSV. The glam2scan sequence alignment also confirms the varying block sizes shown in Figure 3.
Figure 3. Sequences proximal to the RGD motif. (A) Select characteristic sequences for two manifestations of the motif spanning upstream of the canonical RGD domain. The upper block represents Aves and the subgroup of Reptilia that harbor two adjacent RGD sequences. The lower block represents all others. Preserved stretches of amino acids are highlighted with a colored background; matching motifs share the same background color. (B) The common sequence motifs derived from the two groups of patterns are shown.
Figure 4. (T/S)EED motif. The overlap between motifs identified in MEME Suite by a Multiple Em for Motif Elicitation (MEME) search (top) or a GLAM2 search (bottom) concurs with the sequence characterized here as well conserved. The horizontal bar over the bottom sequence marks a portion of the integrin-binding block through the thrombin-cleavage/heparin-binding site, which belongs to a separate motif.
Homologies at the (T/S)EED domain of osteopontin.
| (A) Taxonomic group | Canonical sequence |
|---|---|
| Primate | |
| Artiodactyla b | |
| Carnivora | |
| Chiroptera | |
| Marsupialia | |
| Prototheria | |
| Reptilia a | |
| Reptilia b | |
| Aves 1 | |
| Aves 4 | |
| Fish | |

| (B) Accession | Organism | Protein | Function |
|---|---|---|---|
| WP_010917268.1 | Thermoplasma | radical SAM protein | generates radicals by close proximity of a 4Fe-4S cluster and S-adenosylmethionine |
| WP_077813583.1 | Acetobacter | (2Fe-2S)-binding protein | Ferredoxins are iron-sulphur proteins that mediate electron transfer |
| WP_101432357.1 | Bifidobacterium | NAD-dependent succinate-semialdehyde dehydrogenase | oxidoreductase acting on donor aldehyde or oxo group with NAD+ or NADP+ as acceptor |
| WP_089241263.1 | Belliella | OsmC family peroxiredoxin | osmotically induced, preferentially metabolizes organic over inorganic hydrogen peroxide |
| APR84224.1 | Minicystis | Oxidoreductase, short chain dehydrogenase/reductase | NAD(P)(H)-dependent oxidoreductase |
| ODM17387.1 | Aspergillus | Delta-1-pyrroline-5-carboxylate dehydrogenase | oxidoreductase, acting on the CH-NH group of donors with NAD+ or NADP+ as acceptor |
| WP_069109061.1 | Jiangella | helix-turn-helix domain-containing protein | helix-turn-helix motif, contained in DNA binding proteins that regulate gene expression |
| KYN05282.1 | Cyphomyrmex | X-ray repair cross-complementing protein 5 | single-stranded DNA-dependent ATP-dependent helicase |
| XP_017256983.1 | Daucus | mitotic spindle checkpoint protein BUBR1 | control of cell division |
| WP_100323843.1 | Xanthomonadaceae | 3-oxoacyl-ACP synthase III | acyl-transferase that participates in fatty acid biosynthesis |
| WP_019509474.1 | Pleurocapsa | 1-acyl-sn-glycerol-3-phosphate acyltransferase | converts lysophosphatidic acid into phosphatidic acid by incorporating an acyl moiety |
| WP_056956354.1 | Lactobacillus | aryl-phospho-beta- | catalyzes the hydrolysis of aryl-phospho-beta- |
| WP_101098638.1 | Stenotrophomonas | VacJ family lipoprotein | contributes to virulence, affects outer membrane and contributes to serum resistance |
| WP_018890097.1 | Streptomyces | ABC transporter ATP-binding protein | ATPase activity, coupled to transmembrane movement of substances |
| XP_011594169.1 | Aquila | unconventional myosin-IXb | intracellular movements, binds actin, inhibited by calcium, GTPase activator for RHOA |
| WP_056534355.1 | Bacillus | DUF1836 domain-containing protein | domain of unknown function |
| WP_072744531.1 | Sporanaerobacter | Stk1 family PASTA domain-containing Ser/Thr kinase | StkP activation and substrate recognition depend on the PASTA domain |
| XP_011141917.1 | Harpegnathos | UDP-glucuronosyltransferase 2B15 | glucuronidation of various xenobiotics and endogenous estrogens and androgens |
| XP_019462595.1 | Lupinus | ubiquitin carboxyl-terminal hydrolase 27 | Deubiquitinase, reduces BCL2L11/BIM ubiquitination and stabilizes BCL2L11 |
| XP_020978497.1 | Arachis | glutamate receptor 3.6 | cell surface receptor |
(A) Canonical sequences for the taxonomic subgroups analyzed. (B) Homologies in BLASTP searches. The upper row shows the consensus motif.
Homologies at C-terminus of osteopontin.
| (A) Group | Canonical sequence |
|---|---|
| Reptilia/Aves | SNQTLESAEDXQD(R/H)HSIEXNEVT(R/L/I) |
| Mammals | D(P/H/R)KS(K/E/V)EEDK(H/Y)LKFR(I/V)SHEL(D/E)SASSEVN |

| (B) Source | Accession / motif | Description |
|---|---|---|
| BLASTP | OBT50472.1 | membrane proton-efflux P-type ATPase |
| BLASTP | WP_083469460.1 | right-handed parallel beta-helix repeat-containing protein |
| BLASTP | WP_037618789.1 | ribonuclease E/G |
| BLASTP | XP_011461236.1 | inner membrane protein PPF-1, chloroplastic |
| BLASTP | XP_018138390.1 | response regulator |
| ELM | LIG_BIR_II_1 | abrogation of caspase inhibition by IAPs in apoptotic cells |
| ELM | MOD_CK1_1 | CK1 phosphorylation site |
| ELM | MOD_GlcNHglycan | Glycosaminoglycan attachment site |
| ELM | MOD_N-GLC_1 | Generic motif for N-glycosylation |
| ELM | MOD_Plk_1 | Ser/Thr residue phosphorylated by the Plk1 kinase |
| ELM | MOD_PKA_2 | Secondary preference for PKA-type AGC kinase phosphorylation |
| ELM | MOD_Plk_2-3 | Ser/Thr residue phosphorylated by Plk2 and Plk3 |

| (C) Source | Accession / motif | Description |
|---|---|---|
| BLASTP | WP_034747182.1 | AraC family transcriptional regulator |
| BLASTP | KUG41143.1 | Methyl-accepting chemotaxis protein |
| BLASTP | AAP88241.1 | UL74 protein |
| BLASTP | WP_077317536.1 | acetate-CoA ligase |
| BLASTP | WP_039995451.1 | ATP-dependent protease |
| BLASTP | WP_094346596.1 | ATP-binding protein |
| BLASTP | WP_104809105.1 | DUF2867 domain-containing protein |
| ELM | CLV_PCSK_SKI1_1 | Subtilisin/kexin isozyme-1 (SKI1) cleavage site |
| ELM | DOC_CYCLIN_1 | interacts with cyclin, increases phosphorylation by cyclin/cdk complexes |
| ELM | DOC_PP1_RVXF_1 | Protein phosphatase 1 catalytic subunit (PP1c) interacting motif |
| ELM | DOC_USP7_UBL2_3 | USP7 CTD domain binding motif variant |
| ELM | LIG_TRAF2_1 | Major TRAF2-binding consensus motif |
| ELM | MOD_SUMO_rev_2 | Inverted version of SUMOylation motif recognized for modification by SUMO-1 |
| ELM | MOD_CK2_1 | CK2 phosphorylation site |
| ELM | MOD_CK1_1 | CK1 phosphorylation site |
| ELM | MOD_PKA_2 | Secondary preference for PKA-type AGC kinase phosphorylation |
| ELM | MOD_GlcNHglycan | Glycosaminoglycan attachment site |
(A) Canonical sequences for Aves/Reptilia and for Mammals. (B) Homologies to the avian C-terminus of osteopontin according to BLASTP and to ELM. The top row shows the search sequence. (C) Homologies to the primate C-terminus of osteopontin according to BLASTP and to ELM. The top row shows the search sequence.
Figure 5. Phylogenetic tree analysis of osteopontin. (A) Individual osteopontin sequences. The color coding reflects the taxonomic affiliation as displayed in (B). The tree with the highest log likelihood (−1039.72) is shown. The tree is drawn to scale, with branch lengths measured in the number of substitutions per site. The analysis involved 202 amino acid sequences. All positions containing gaps and missing data were eliminated. There were a total of 17 positions in the final dataset. (B) Canonical osteopontin sequences. The tree with the highest log likelihood (−7192.43) is shown. The analysis involved 19 amino acid sequences. All positions containing gaps and missing data were eliminated. There were a total of 156 positions in the final dataset. (Reptilia A = Crocodilia, Testudines; Reptilia B = Squamata; Artiodactyla a = Camelidae, Suidae, Cetacea; Artiodactyla b = Cervidae, Bovidae). (C) Evolutionary relationships of major vertebrate groups as a reference point. Adapted from the University of California Museum of Paleontology’s Understanding Evolution (https://evolution.berkeley.edu/evolibrary/search/imagedetail.php?id=251&topic_id=&keywords=phylogeny). (D) An evolutionary tree of Mammals as a reference point. The tree depicts historical divergence relationships among the living orders of Mammals. The phylogenetic hierarchy is a consensus view of several decades of molecular genetic, morphological and fossil inference. Double rings indicate mammalian supertaxa; numbers indicate the possible time of divergence [23]. This file has been reproduced from https://commons.wikimedia.org/wiki/File:An_evolutionary_tree_of_mammals.jpeg under the Creative Commons Attribution 2.0 Generic license.
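The drop from full-length sequences to 17 positions (202 sequences) or 156 positions (19 canonical sequences) follows from eliminating every alignment column in which any sequence has a gap or missing data: the more sequences are aligned, the fewer columns survive. A minimal sketch of this filtering step is given here. The function name `complete_deletion`, the gap and missing-data symbols "-" and "?", and the toy alignment are assumptions for illustration and are not taken from the source.

```python
def complete_deletion(alignment: dict[str, str], missing: str = "-?") -> dict[str, str]:
    """Drop every alignment column that has a gap or missing symbol in any row.

    `alignment` maps sequence names to aligned sequences of equal length.
    The symbols treated as gap/missing ("-" and "?") are an assumption.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")

    # Indices of columns where no sequence carries a gap or missing symbol.
    keep = [
        i
        for i, column in enumerate(zip(*alignment.values()))
        if not any(symbol in missing for symbol in column)
    ]
    return {
        name: "".join(seq[i] for i in keep)
        for name, seq in alignment.items()
    }


# Invented toy alignment: three of seven columns contain a gap or "?".
toy = {
    "seq_a": "MR-AVIC",
    "seq_b": "MRIAV-C",
    "seq_c": "MKIA?LC",
}
filtered = complete_deletion(toy)
print(filtered)  # {'seq_a': 'MRAC', 'seq_b': 'MRAC', 'seq_c': 'MKAC'}
```

Because a single gapped sequence is enough to remove a column, this strategy is conservative and can discard most of a long alignment when many divergent sequences are included.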
Extent of osteopontin sequence homologies across species.
| Taxonomic group (level 1) | Taxonomic group (level 2) | Taxonomic group (level 3) | N | Refined alignment: power | Refined alignment: homology (%) | Draft source alignment: power | Draft source alignment: homology (%) | Phylogenetic tree: cluster algorithm |
|---|---|---|---|---|---|---|---|---|
| Fish | | | 9 | 154.94 | 42.30 | 51.78 | 14.80 | 0.999999 |
| Aves and Reptilia | | | 79 | 1252.13 | 38.60 | 784.90 | 24.80 | 0.999999 |
| | Aves | | 64 | 1348.34 | 51.00 | 815.27 | 31.50 | 0.744865 |
| | | Aves 1 | 4 | 109.38 | 92.70 | 109.30 | 92.60 | 0.000001 |
| | | Aves 2 | 3 | 62.42 | 69.50 | 61.93 | 68.70 | 0.000001 |
| | | Aves 3 | 11 | 296.96 | 68.80 | 292.26 | 67.10 | 0.000001 |
| | | Aves 4 | 46 | 1239.51 | 72.50 | 825.87 | 49.80 | 0.508657 |
| | Reptilia | | 15 | 205.90 | 35.30 | 110.73 | 19.50 | 0.999999 |
| | | Reptilia C + T | 8 | 166.91 | 66.40 | 159.88 | 63.20 | 0.212424 |
| | | Reptilia S | 7 | 147.30 | 54.20 | 96.46 | 35.40 | 0.931465 |
| Mammalia | | | | | | | | |
| | Rodentia | | 20 | 374.84 | 44.30 | 278.71 | 33.90 | 0.999999 |
| | Chiroptera | | 11 | 320.96 | 80.90 | 315.53 | 79.60 | 0.000001 |
| | Marsupialia | | 3 | 72.70 | 79.70 | 69.58 | 75.70 | 0.000001 |
| | Perissodactyla | | 4 | 118.78 | 92.40 | 118.78 | 92.40 | 0.000001 |
| | Artiodactyla | | 24 | 404.76 | 42.30 | 389.37 | 40.90 | 0.997109 |
| | | Artiodactyla CCS | 11 | 282.10 | 66.50 | 270.16 | 63.60 | 0.000001 |
| | | Artiodactyla CB | 13 | 390.29 | 88.90 | 383.18 | 87.30 | 0.000001 |
| | Afrotheria/Xenarthra | | 5 | 97.01 | 57.90 | 85.01 | 51.20 | 0.833609 |
| | Carnivora | | 13 | 353.49 | 84.70 | 324.64 | 73.10 | 0.000001 |
| | Primates | | 25 | 785.85 | 84.20 | 659.82 | 71.00 | 0.101152 |
| (all) | (canonical) | | 19 | 263.52 | 32.70 | 175.98 | 22.80 | 0.999999 |
The data were generated in GeneBee. Two alignment algorithms (refined alignment, draft source alignment) were applied. The cluster algorithm for an unrooted tree with scaled branches had the max/min factor set to 8. The three columns on the left show the taxonomic groups analyzed in hierarchical order. N = number of sequences within the group.
Figure 6. Physico-chemical properties of osteopontin in various orders of species. The graph shows isoelectric point versus molecular mass for each available member of nine orders of species. Fish osteopontin differs markedly from the others, as reflected in its separation on the plot. In general, higher order organisms (including Carnivora, Primates) cluster more tightly than lower orders (such as Rodentia).
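The two quantities plotted in Figure 6 can be estimated from an amino acid sequence alone. The source does not state how its values were computed, so the following is only a sketch under stated assumptions: molecular mass is taken as the sum of average residue masses plus one water, and the isoelectric point is found by bisection on the Henderson-Hasselbalch net charge using one assumed set of pKa values (published sets differ). Post-translational modifications such as phosphorylation are ignored, although they would shift the isoelectric point of the mature protein. The function names and test peptides are hypothetical.

```python
# Average residue masses in daltons (amino acid minus one water).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528

# Assumed pKa values; other published sets give slightly different results.
PKA_POSITIVE = {"K": 10.8, "R": 12.5, "H": 6.5}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
PKA_N_TERMINUS = 8.6
PKA_C_TERMINUS = 3.6


def molecular_mass(sequence: str) -> float:
    """Average molecular mass of an unmodified polypeptide in daltons."""
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in sequence) + WATER_MASS


def net_charge(sequence: str, ph: float) -> float:
    """Net charge at a given pH from the Henderson-Hasselbalch equation."""
    positive = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERMINUS))
    for aa, pka in PKA_POSITIVE.items():
        positive += sequence.count(aa) / (1.0 + 10 ** (ph - pka))
    negative = 1.0 / (1.0 + 10 ** (PKA_C_TERMINUS - ph))
    for aa, pka in PKA_NEGATIVE.items():
        negative += sequence.count(aa) / (1.0 + 10 ** (pka - ph))
    return positive - negative


def isoelectric_point(sequence: str, tolerance: float = 1e-4) -> float:
    """pH at which the net charge is zero, located by bisection."""
    low, high = 0.0, 14.0
    while high - low > tolerance:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid   # still positively charged: the pI lies higher
        else:
            high = mid  # negatively charged (or neutral): the pI lies lower
    return (low + high) / 2.0


# Invented peptides: an acidic one and a basic one.
for peptide in ("DDDDEEEE", "KKKKRRRR"):
    print(peptide, round(molecular_mass(peptide), 2), round(isoelectric_point(peptide), 2))
```

Bisection works here because the net charge decreases monotonically with pH, so there is a single crossing of zero between pH 0 and pH 14.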