Karlheinz Mann, Eric Edsinger-Gonzales, Matthias Mann.
Abstract
BACKGROUND: Invertebrate biominerals are characterized by their extraordinary functionality and physical properties, such as strength, stiffness and toughness, which far exceed those of the pure mineral component of such composites. This is attributed to the organic matrix, secreted by specialized cells, which pervades and envelops the mineral crystals. Despite the obvious importance of the protein fraction of the organic matrix, only a few in-depth proteomic studies have been performed, owing to the lack of comprehensive protein sequence databases. The recent public release of the gastropod Lottia gigantea genome sequence and the associated protein sequence database provides, for the first time, the opportunity to perform a state-of-the-art, in-depth proteomic analysis of the organic matrix of a mollusc shell.
Year: 2012 PMID: 22540284 PMCID: PMC3374290 DOI: 10.1186/1477-5956-10-28
Source DB: PubMed Journal: Proteome Sci ISSN: 1477-5956 Impact factor: 2.480
Figure 1. PAGE comparison of acid-soluble matrices from shells. Molecular weight markers are indicated at the left. Each lane was loaded with 200 μg of matrix in a volume of 30 μl. A, matrices of shells cleaned with different sodium hypochlorite protocols. Lane A, 2 h hypochlorite at room temperature; lane B, 2 h hypochlorite with 2 x 5 min ultrasound treatment at the start of each hour; lane C, cleaned with hypochlorite for 24 h with 2 x 5 min ultrasound bursts as before and one after 24 h. B, matrices of different shells, all cleaned with hypochlorite according to protocol B (2 h hypochlorite, 2 x 5 min ultrasound).
Figure 2. PAGE comparison of acid-soluble and acid-insoluble matrix. Molecular weight markers are indicated at the left. S, acid-soluble matrix; I, acid-insoluble matrix. The sections for in-gel digestion are indicated at the right of each lane. With longer exposure times, sections 1–8 of the acid-insoluble sample became a featureless smear, while faint bands became apparent in sections 9–12.
Figure 3. Venn diagrams of protein identifications in different samples. A, matrix isolated after sodium hypochlorite treatment of the shells for 2 h at room temperature. B, 2 h hypochlorite cleaning with 2 x 5 min ultrasound at the start of each hour. C, 24 h hypochlorite with 2 x 5 min ultrasound bursts as before and one after 24 h. The consensus proteome comprises all identifications occurring in all three types of samples. Venn diagrams were prepared using the Venn Diagram Plotter of http://omics.pnl.gov/software/VennDiagramPlotter.php.
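The consensus proteome described in this caption is a three-way set intersection. A minimal Python sketch, assuming each cleaning protocol yields a collection of accession strings; the function name and the accessions below are hypothetical placeholders, not the actual identifications:

```python
def consensus_proteome(*samples):
    """Return the accessions that occur in every sample (set intersection)."""
    if not samples:
        return set()
    return set.intersection(*(set(s) for s in samples))


# Hypothetical identification lists for cleaning protocols A, B and C.
protocol_a = {"P1", "P2", "P3", "P4"}
protocol_b = {"P2", "P3", "P4", "P5"}
protocol_c = {"P3", "P4", "P6"}

consensus = consensus_proteome(protocol_a, protocol_b, protocol_c)
print(sorted(consensus))
```

Using sets rather than lists means duplicate identifications within one sample do not affect the result.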
Previously uncharacterized major shell matrix proteins with unusual primary sequence features
| Lotgi1|115147 | 14% P, 11% T, 6 repeats of ~30aa, starting with MITPE; pI: 4.7; 319aa |
| Lotgi1|142790 | 25% Q, 10% E, 17% P, 12% V, 10% N 10% L; 6 short repeats: k/qQQPxVELNKQQP; pI 5.2; 182aa |
| Lotgi1|142814 | 38% Q, 11% L, 10% P; 5 ~70aa repeats containing shorter repeat motifs like NQQQ and KQQQ; pI: 10.5; 322aa |
| Lotgi1|152688 | 20% G, 12% P; pI: 9.7; 137aa |
| 11% P; Q-rich C-term (aa210–240); pI: 9.7; 258aa | |
| Lotgi1|159331 | 26% E, 13% L, 12% T; pI: 4; starting with aa156 8x SNLLQQPDa/tTQqLa/tTNeQQQ; (Figure |
| Lotgi1|163637 | 17% D, 16% A; EFh, pI: 3.8; 643aa; 12 ca30aa repeats similar to AxVDNxxMADMIDTxQDxxEDAADNMADNIDTAQDAQ between aa32–453 |
| Lotgi1|171084 | 13% S; frequent doublets (SS, QQ, TT, YY, NN); G/E block aa322–337; pI: 4.4; 357aa |
| Lotgi1|172698 | 23% Q, 13% N, 13% S; aa130–702: 31 x 14aa repeats similar to QSNQQFNxxQSNQQF; pI: 7.1; 1184aa |
| ~10% of P, N and G; in aa107–170 10x GAMP/GSMP; pI: 9.6; 563aa | |
| Lotgi1|174003 | 19% P in aa50–400 and 35% P in aa778–882; pI: 9.5; 882aa |
| aa17–126: 17% R + K, 12% P, 11% L; pI: 11; 126aa | |
| Lotgi1|228385 | 16% R, 11% S; pI: 11.7; 160aa; R/H-rich from aa103–150 |
| Lotgi1|231186 | 19% G, 12% P; aa433–481: 27% M; pI: 4.6; 481aa; R/H-rich C-term half |
| Lotgi1|231509 | aa26–230: 18% P; pI: 4.2; 230aa; acidic blocks in N-term half |
| A/P-rich motif aa150–170; H-rich motif aa171–185; pI: 8.8; 219aa | |
| 31% D, 10% E; pI: 3.6; similar to aspein? | |
| Lotgi1|234884 | 42% Q in aa281–630; G/L/A-rich region aa631–928; pI: 9.2; 928aa |
| aa120–247: 20% P, 16% A, 10% Q; pI: 9.7; 247aa | |
| 15% P, 15% T; pI: 5.7; 557aa | |
| aa171–270: 33% G, 25% T, 15% P, 14% Q; 16 x GGQPs/tT; pI: 5.4; 303aa | |
| Lotgi1|235812 | 24% P, 18% Q, 10% N; pI: 8.9; 729aa; aa57–376: 17 repeats of 16aa, similar to NNxa/vQPPxxQxxYQPt/p |
| Lotgi1|236689 | 19% P, 10% A, 10% V, 10% R; pI: 10; 317aa |
| Lotgi1|236690 | 21% Q, 18% P; aa268–356: 4 x AQPGAYQQP(x)2–4 GAYxQQP repeats; pI: 8.4; 440aa |
| Lotgi1|236691 | 22% P, 13% Q, 10% A; Q-rich regions: ~aa61–160 and ~aa721–990; P-rich: ~aa280–600 and ~aa780–970; pI: 8.8; 1035aa |
| Lotgi1|238358 | aa61–232: 32% D + E, 12% N; pI: 3.7; 323aa; (Figure |
| 13% A, 11% R, 11% L; K/R/A-rich C-terminus (aa185–219); pI: 10.3; 219aa | |
| 16% G, 12% M, 10% Q; G blocks in N-term half; pI: 9.9; 145aa | |
| 20% G, 18% M, 12% A, 10% L; pI: 11.2; 186aa; some similarity to shematrins | |
| Lotgi1|239339 | 13% T, 12% S, 10% P; blocks of T from aa185–240; pI: 9.7; 609aa |
| 22% G, 12% N; pI: 9.5; 191aa; some similarity to GAAP_HALAI (Figure | |
| Lotgi1|77105 | 19% P, 15% S; 12% G; 9 x g/dSQPGIYP and 4 x imperfect; pI: 4.5; 173aa |
| Lotgi1|84059 | 23% N, 15% P, 15% T, 11% S; 7 repeats similar to TPxxxNNVNPGSETPxTxNNVNPGSE and 2 incomplete; pI: 3.8; 234aa |
For complete lists of matrix proteins see Additional file 1 and Additional file 2. Accessions in bold belong to the 26 most abundant proteins with average emPAI > 1000 (Additional file 1 and Additional file 2).
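The composition figures in this table (for example "38% Q" or "42% Q in aa281–630") are residue counts expressed as a percentage of the sequence or region length. A minimal Python sketch of that calculation, assuming one-letter amino acid codes and 1-based, inclusive residue ranges; the helper names and the toy sequence are hypothetical and not taken from the Lottia database:

```python
from collections import Counter


def composition_percent(sequence):
    """Return {residue: percent of sequence length} for a one-letter sequence."""
    seq = sequence.upper()
    if not seq:
        return {}
    counts = Counter(seq)
    return {aa: 100.0 * n / len(seq) for aa, n in counts.items()}


def region_composition(sequence, start, end):
    """Composition of residues start..end (1-based, inclusive), e.g. aa281-630."""
    return composition_percent(sequence[start - 1:end])


# Hypothetical toy sequence used only to exercise the functions.
toy = "MQQQPNQQQLKQQQPVEL"
comp = composition_percent(toy)
print({aa: round(pct, 1) for aa, pct in sorted(comp.items())})
```

Reporting a region separately, as several rows do, avoids diluting a locally enriched residue across the full-length sequence.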
Figure 4. The amino acid sequence of a very acidic protein, Lotgi1|238358. Entry Lotgi1|238358 contains the sequence of a predicted transmembrane protein with a short intracellular domain (aa2–20), the predicted transmembrane segment (underlined) and a very acidic extracellular domain (theoretical pI 3.6) with Asp and Glu adding up to 30% of the amino acid composition. This protein was more abundant in the acid-insoluble than in the acid-soluble fraction. Sequences covered by MS/MS spectra are shown in red. The lower part shows the spectrum of one of the acidic, doubly charged peptides (shown in bold italics and underlined in the complete sequence) with m/z 831.3731, a mass error of 1.4 ppm and a PEP of 1.1E-12.
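To put the quoted mass accuracy in context, a relative error in ppm can be converted into an absolute m/z deviation, and the m/z of a multiply charged ion into a neutral peptide mass. A minimal Python sketch, assuming the usual definitions (ppm = (observed − theoretical) / theoretical × 10^6; neutral mass = z × (m/z − proton mass)); the function names are illustrative, and only the m/z value, charge and ppm figure come from the caption:

```python
PROTON_MASS = 1.007276  # Da, mass of a proton (rounded)


def ppm_error(observed_mz, theoretical_mz):
    """Relative mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def ppm_to_mz(mz, ppm):
    """Absolute m/z deviation that corresponds to a given ppm error."""
    return mz * ppm * 1e-6


def neutral_mass(mz, charge):
    """Neutral (uncharged) mass of a protonated ion with the given charge."""
    return charge * (mz - PROTON_MASS)


# Values from the caption: doubly charged peptide, m/z 831.3731, 1.4 ppm.
delta = ppm_to_mz(831.3731, 1.4)
mass = neutral_mass(831.3731, 2)
print(f"deviation: {delta:.4f} m/z units, neutral mass: {mass:.4f} Da")
```

At this m/z, a 1.4 ppm error corresponds to a deviation of roughly one thousandth of an m/z unit.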
Figure 5. The amino acid sequence of the Gly/Asn-rich protein in Lotgi1|239447. This was one of the most abundant proteins in the acid-soluble matrix. The sequence contained a Gly/Asn-rich domain (aa41–105; shaded yellow) consisting of 55% Gly and 28% Asn. This is followed by a cysteine-containing domain (cysteines shaded green) that can be presumed to have a more rigid structure stabilized by disulfide bonds. The Gly/Asn-rich domain did not yield a peptide because it lacks tryptic cleavage sites. However, it is framed by MS/MS-sequenced peptides. A very similar G/N-rich sequence region was found in the otherwise unrelated shell protein GAAP_HALAI, identified in Haliotis asinina [6], and in nacrein-like proteins [7,46]. Sequences covered by MS/MS are in red; the peptide giving rise to the spectrum is in bold italics and underlined. The doubly charged peptide with m/z 994.4501 and a deviation from the calculated value of 0.1 ppm had a PEP of 4.7E-13. As is typical, the most intense fragments, y8 and y10, were produced by preferential fragmentation N-terminal to Pro and in the +1 position of Pro.
Figure 6. The amino acid sequence of Lotgi1|159331, an acidic Gln-rich protein with multiple sequence repeats. The predicted secretion signal sequence (aa1–19) is underlined. Sequences covered by MS/MS are in red; the peptide giving rise to the spectrum below is in bold italics and underlined. The theoretical pI for this sequence is 4.0, and the amino acid composition includes 27% Gln, 13% Leu and 12% Thr. Eight 21aa-long Gln-rich sequence repeats are alternately shaded grey and yellow. No peptides from the repeat region were obtained because it lacks tryptic cleavage sites. The doubly charged peptide with m/z 642.80 and a mass deviation of 0.6 ppm had a PEP of 6.2E-09.
Matrix proteins with possible sequence homologs in other shells
| BMSP (fragment) | [ | 44% (5.0E–30) | |||
| BMSP (fragment) | [ | 37% (1.6E–33) | |||
| BMSP 100 | [ | 21% (4.0E–7) | |||
| Lotgi1|133595 | dermatopontin | [ | 31% (6.6E–17) | Figure | |
| Lotgi1|233583 | ependymin-related protein | [ | 27% (6.5E–9) | ||
| Lotgi1|235548 | gigasin-2 | [ | 26% (8.6E–4) | ||
| Lotgi1|132911 | Kunitz-type protease inhibitor KCP_HALAI | [ | 56% (3.6E–18) | | |
| Lotgi1|233461 | nacrein B4/B3/A1/B2 | [ | 36–38% (1.6E–9 – 5.2E-6) | | |
| nacrein-like protein | [ | 25% (4.1E–13) | |||
| Lotgi1|239188 | nacrein B2/B3/A1/B4; aa421–633 very acidic, with similarity to such proteins as aspein | [ | 27–33% | | |
| Lotgi1|229175 | perlucin_like | [ | 26% (1.3E-4) | Figure | |
| Lotgi1|235529 | perlucin_like | [ | 31% (1.0E-4) | Figure | |
| perlustrin | [ | 33% (0.076) | Figure | ||
| perlustrin | [ | 39% (1.1E-7) | Figure | ||
| Lotgi1|143247 | perlwapin | [ | 31% (0.003) | | |
| Lotgi1|201804 | perlwapin | [ | 35% (1.2E-5) | | |
| Lotgi1|239125 | perlwapin | [ | 40% (4.3E-9) | | |
| Lotgi1|228264 | Pif (fragment) | [ | 28% (5.8E-5) | ||
| Lotgi1|232022 | Pif | [ | 24% (3.3E-15) | ||
| Lotgi1|239574 | BMSP | [ | 22% (5.9E-9) | ||
| Lotgi1|237510 | P86860 | [ | 28% (2.0E-9) | | |
| Lotgi1|166196 | tyrosinase | [ | 35% (5.7E-5) | ||
| UP2 | [ | 28% (2.9) |
For complete lists of matrix proteins see Additional file 1 and Additional file 2. 1, identified in database searches against complete databases (UniProt Knowledgebase, NCBI non-redundant protein sequences); the suggested homolog was usually not the best match overall, but the best mollusc shell match. 2, sequence identity in regions of sequence similarity identified by database searches; E values for the FASTA results are shown in brackets. Accessions in bold belong to the 26 most abundant proteins with average emPAI > 1000 (Additional file 1 and Additional file 2).
Figure 7. Comparison of Lotgi1|133595 to dermatopontin. The sequence of Lotgi1|133595 is compared to the sequence of Biomphalaria glabrata dermatopontin [49] and to the unpublished sequence of Haliotis discus dermatopontin submitted to EMBL by H.-S. Kang, M. De Zoysa and J. Lee. Peptides sequenced by MS/MS are shown in red. The N-glycosylation site of B. glabrata dermatopontin is shaded green. The Biomphalaria sequence is the sequence of the mature protein determined by Edman degradation and therefore lacks a secretion signal peptide.
Other proteins with a possible or established link to biomineralization
| Similar to calcineurin | 30% identity in a 120aa overlap (Fasta E value: 0.37) with | |
| Lotgi1|205401 | Carbonic anhydrase | Minor protein; possibly intracellular |
| Lotgi1|66515 | Carbonic anhydrase | Major protein in acid-soluble shell proteome; possibly intracellular |
| Lotgi1|159694 | Chitin-binding | Minor protein, 4 chitin-binding peritrophin A domains and 4–6 SRCR (scavenger receptor-related) domains |
| Lotgi1|160173 | Chitin-binding | Major protein, secreted; 2–3 chitin-binding peritrophin A domains |
| Lotgi1|231395 | Chitin-binding | Sequence contains predicted secretion signal sequence followed by two chitin-binding peritrophin A domains |
| Lotgi1|226726 | Chitin-binding | Major protein in acid-soluble, minor in acid-insoluble consensus proteome; chitin-binding_3 domain |
| Lotgi1|231869 | Chitin-binding | Major protein in acid-soluble proteome; 10 chitin-binding peritrophin A domains organized in two blocks separated by four Pro-rich extensin-like motifs (aa470–600; 29% Pro, 16% Thr, 12% Gln, 12% Asn) |
| Lotgi1|232880 | Chitin-binding/chitinase | Major protein in acid-insoluble proteome; several SEA domains; chitin-binding peritrophin domain (aa2140–2200) with some similarity to chitinases |
| Lotgi1|234405 | Chitin-binding | Major protein in acid-soluble proteome; four chitin-binding peritrophin A domains preceded by a predicted secretion signal sequence |
| Lotgi1|238400 | Chitin-binding | Major protein in acid-insoluble proteome; predicted secretion signal sequence, VWA domain and chitin-binding peritrophin A domain |
| Lotgi1|209107 | Chitinase | Lysosomal; chitin degradation; major protein |
| Lotgi1|181237 | Chitin deacetylase | Minor secreted protein |
| Lotgi1|156599 | FAM20C/DMP4 | Extracellular matrix protein; minor |
| Lotgi1|109908 | Osteonectin/SPARC/BM-40 | Overlapping fragments; extracellular matrix protein; major in acid-soluble matrix, minor in acid-insoluble matrix |
Accessions in bold belong to the 26 most abundant proteins with average emPAI > 1000 (Additional file 1 and Additional file 2). For complete lists of matrix proteins see Additional file 1 and Additional file 2.
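The abundance cut-off used in these tables can be illustrated with the commonly used definition of the exponentially modified protein abundance index, emPAI = 10^(N_observed / N_observable) − 1. This record does not state how the emPAI values were computed or averaged, so the Python sketch below is an assumption-laden illustration: the function names are hypothetical and the arithmetic mean across samples is assumed.

```python
def empai(n_observed, n_observable):
    """emPAI = 10 ** (observed peptides / observable peptides) - 1."""
    if n_observable <= 0:
        raise ValueError("n_observable must be positive")
    return 10 ** (n_observed / n_observable) - 1


def is_major(empai_values, threshold=1000.0):
    """True if the mean emPAI across samples exceeds the threshold.

    Assumption: 'average emPAI' is taken as the arithmetic mean.
    """
    return sum(empai_values) / len(empai_values) > threshold


# Hypothetical per-sample emPAI values for two proteins.
print(is_major([1500.0, 800.0, 900.0]))
print(is_major([500.0, 600.0]))
```

Because the index is exponential in the peptide ratio, small differences in that ratio translate into large differences in emPAI, which is why a threshold as high as 1000 can still separate a small group of major proteins.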
Figure 8. Sequence comparison of perlucin-like proteins. Peptides sequenced by MS/MS are shown in red. The sequence of PLCL_MYTGA is from [15] (P86854), PLC_HALLA is from [62] (P82596). The latter sequence was determined by Edman degradation of the isolated mature protein and therefore lacks the secretion signal sequence present in the other sequences.
Figure 9. Sequence comparison of perlustrin-like proteins. Peptides sequenced by MS/MS are shown in red. Unlike the Lottia proteins, Haliotis laevigata perlustrin has no secretion signal sequence because the mature protein had been sequenced by Edman degradation [50] (P82595).
Figure 10. Domain organization of WAP-containing proteins of the shell matrix. WAP (whey acidic protein) domains are shown in green, antistasin-like protease inhibitor domains are shown in blue. Lotgi1|143274 starts with a partial WAP domain. Perlwapin is the Haliotis laevigata protein [51]. Domain borders were determined with Prosite (http://prosite.expasy.org/); the drawing was prepared with the help of Prosite MyDomains (http://prosite.expasy.org/cgi-bin/prosite/mydomains/).