Panga Jaipal Reddy, Nevil Pinto, Mehar Un Nissa, Zhi Sun, Biplab Ghosh, Robert L Moritz, Mukunda Goswami, Sanjeeva Srivastava.
Abstract
Labeo rohita (Rohu) is one of the most important fish species produced in world aquaculture. Integrative omics research provides a strong platform for understanding its basic biology and translating this knowledge into sustainable solutions for tackling disease outbreaks, increasing productivity and ensuring food security. Mass spectrometry-based proteomics has opened new directions for understanding this biology. Very little proteomics work has been done on Rohu, which limits such resources for the aquaculture community. Here, we utilised extensive mass spectrometry-based proteomic profiling data from 17 histologically normal tissues, plasma and embryo of Rohu to develop an open-source PeptideAtlas. The current build of the "Rohu PeptideAtlas" has mass-spectrometric evidence for 6015 high-confidence canonical proteins at a 1% false discovery rate, 2.9 million PSMs and ~150 thousand peptides. This is the first open-source proteomics repository for an aquaculture species. The "Rohu PeptideAtlas" would promote basic and applied aquaculture research to address the most critical challenge of ensuring nutritional security for a growing population.
Year: 2022 PMID: 35418183 PMCID: PMC9008064 DOI: 10.1038/s41597-022-01259-9
Source DB: PubMed Journal: Sci Data ISSN: 2052-4463 Impact factor: 6.444
Tissue types and sampling details.
| S. no. | Sample | Collection stage |
|---|---|---|
| Fingerling (10 ± 2 g) | ||
| Adult female (1000 ± 100 g) | ||
| Adult male (1000 ± 100 g) | ||
| 4-day post fertilisation | ||
Fig. 1 An overview of the experimental design and analysis workflow. (a) Fish were dissected to collect the tissues/samples, followed by protein extraction and SDS-PAGE. Gel slices were excised and processed by in-gel tryptic digestion, followed by liquid chromatography tandem mass spectrometry (LC-MS/MS) and analysis in the Trans-Proteomic Pipeline (TPP). (b) Raw data obtained from DDA-MS were processed along the pipeline for building the PeptideAtlas. Raw files were first converted to mzML, followed by a Comet search and an analysis pipeline including PeptideProphet, reSpect, iProphet, ProteinProphet, and final filtering and validation to compile the atlas.
Fig. 2 An overview of phylogenetically annotated orthologs for the canonical proteins. The distribution of identified proteins mapped against each ortholog group is presented here (ortholog details in Table 2).
Distribution of identified canonical proteins across various orthologs*.
| Groups | Description | No. of proteins |
|---|---|---|
| | Cell cycle control, cell division, chromosome partitioning | 62 |
| | Cell wall/membrane/envelope biogenesis | 38 |
| | Cell motility | 5 |
| | Post-translational modification, protein turnover, and chaperones | 505 |
| | Signal transduction mechanisms | 982 |
| | Intracellular trafficking, secretion, and vesicular transport | 318 |
| | Defense mechanisms | 50 |
| | Extracellular structures | 119 |
| | Nuclear structure | 4 |
| | Cytoskeleton | 249 |
| | RNA processing and modification | 226 |
| | Chromatin structure and dynamics | 40 |
| | Translation, ribosomal structure and biogenesis | 232 |
| | Transcription | 158 |
| | Replication, recombination and repair | 74 |
| | Energy production and conversion | 195 |
| | Amino acid transport and metabolism | 163 |
| | Nucleotide transport and metabolism | 96 |
| | Carbohydrate transport and metabolism | 187 |
| | Coenzyme transport and metabolism | 43 |
| | Lipid transport and metabolism | 213 |
| | Inorganic ion transport and metabolism | 120 |
| | Secondary metabolites biosynthesis, transport, and catabolism | 129 |
| | General function prediction only | 0 |
| | Function unknown | 1289 |
*These data continue the data presented in Fig. 2.
Organ-wise numerical summary of the data in the Labeo rohita PeptideAtlas.
| Dataset | Experiment Tag | MS Runs | Spectra Searched | Distinct Peptides | Unique Peptides | Cumulative Peptides | Distinct Canonical Proteins | Unique Canonical Proteins | Unique All Proteins | Cumulative Canonical Proteins |
|---|---|---|---|---|---|---|---|---|---|---|
| PrePX245 | Air bladder | 18 | 1073739 | 21148 | 1754 | 21147 | 2428 | 10 | 66 | 2428 |
| PrePX245 | Embryo | 6 | 342968 | 18801 | 1063 | 31710 | 2813 | 14 | 108 | 3360 |
| PrePX245 | Female gonad | 24 | 1299928 | 30777 | 6364 | 49703 | 2828 | 53 | 194 | 4038 |
| PrePX245 | Fin | 6 | 421953 | 9329 | 274 | 51602 | 2074 | 2 | 44 | 4188 |
| PrePX245 | Gallbladder | 9 | 489743 | 16708 | 690 | 55308 | 2953 | 7 | 77 | 4478 |
| PrePX245 | Gill | 18 | 1999739 | 32801 | 2706 | 64160 | 3552 | 17 | 222 | 4808 |
| PrePX245 | Gut | 18 | 1005859 | 28610 | 2781 | 70910 | 3105 | 12 | 147 | 4929 |
| PrePX245 | Female plasma | 8 | 430788 | 6497 | 1743 | 74623 | 562 | 0 | 19 | 4954 |
| PrePX245 | Scales | 15 | 710427 | 1225 | 123 | 74888 | 323 | 0 | 7 | 4956 |
| PrePX245 | Skin | 13 | 776257 | 18838 | 944 | 77876 | 2164 | 1 | 17 | 4970 |
| PrePX245 | Spinal cord | 18 | 1093689 | 43051 | 4862 | 92280 | 3920 | 28 | 169 | 5380 |
| PrePX245 | Brain | 18 | 1113115 | 53736 | 13205 | 109378 | 4343 | 190 | 868 | 5698 |
| PrePX245 | Eye | 18 | 797200 | 29665 | 4328 | 115355 | 2727 | 35 | 116 | 5757 |
| PrePX245 | Kidney | 18 | 1131662 | 34743 | 2527 | 119636 | 3467 | 15 | 107 | 5803 |
| PrePX245 | Liver | 18 | 1203630 | 47487 | 10469 | 132274 | 3544 | 40 | 282 | 5887 |
| PrePX245 | Muscle | 21 | 1034502 | 15167 | 2288 | 134681 | 1697 | 5 | 37 | 5894 |
| PrePX245 | Spleen | 16 | 817942 | 22059 | 1999 | 136880 | 3036 | 11 | 145 | 5916 |
| PrePX245 | Heart | 18 | 1042352 | 36798 | 3751 | 140791 | 3554 | 16 | 160 | 5936 |
| PrePX245 | Male gonad | 18 | 847818 | 40259 | 9990 | 150781 | 3501 | 79 | 277 | 6015 |
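The distinct, unique and cumulative columns in the table above can be read as set operations over the identifications from each experiment. The following is a minimal sketch under that reading, assuming "distinct" counts identifiers seen in one experiment, "unique" counts identifiers seen in that experiment and no other, and "cumulative" counts the running union in build order. The function name `contribution_summary` and the toy identifiers are hypothetical and are not taken from the atlas.

```python
from typing import Dict, List, Set, Tuple


def contribution_summary(
    experiments: List[Tuple[str, Set[str]]]
) -> List[Dict[str, object]]:
    """Summarise per-experiment contributions, in the order given.

    Each experiment is a (name, set of identifiers) pair. The identifiers
    can be peptide sequences or protein accessions (hypothetical toy data
    is used below).
    """
    rows = []
    seen: Set[str] = set()  # running union across experiments so far
    for i, (name, ids) in enumerate(experiments):
        # Union of every other experiment, used to find identifiers
        # observed only in the current one.
        others: Set[str] = set().union(
            *(s for j, (_, s) in enumerate(experiments) if j != i)
        )
        seen |= ids
        rows.append(
            {
                "experiment": name,
                "distinct": len(ids),
                "unique": len(ids - others),
                "cumulative": len(seen),
            }
        )
    return rows


# Toy input: three experiments with made-up identifiers.
toy = [
    ("Air bladder", {"P1", "P2", "P3"}),
    ("Embryo", {"P2", "P3", "P4"}),
    ("Brain", {"P3", "P4", "P5", "P6"}),
]
summary = contribution_summary(toy)
for row in summary:
    print(row)
```

Under these assumptions the cumulative value in the last row equals the total number of distinct identifiers in the build, which matches the way the final row of the table reaches 6015 canonical proteins.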
Fig. 3 An overview of the Labeo rohita PeptideAtlas build. (a,b) Plots showing the cumulative number of peptides and canonical proteins, respectively, contributed by each experiment. The height of the blue/navy blue bar represents the cumulative number of peptides/proteins, the height of the orange/red bar represents the number of peptides/proteins identified in each experiment, and the width of the bar (x-axis) represents the number of spectra identified (PSMs) for each experiment. (c) Distribution of peptide spectral matches against the peptide charge. (d) Graph showing the spectral count for peptides of different lengths. (e) Bar plot representing the number of unique peptides (distinct peptides) per canonical protein, where the x-axis shows the bins for the number of unique peptides and the y-axis shows the number of respective canonical proteins. (f) Distribution of canonical proteins based on percentage sequence coverage [Fig. 3a–e are taken from the 'Experiment Contribution Plots' section of the first page of the Labeo rohita PeptideAtlas].
Fig. 4 Example of a protein search and peptide search in Rohu PeptideAtlas. (a) Out of several collapsible sections for protein search, three are shown to provide an overview of protein information, observed peptides highlighted in red font and additional information for each observed peptide, respectively. (b) Under peptide view, two sections for one of the observed peptides of the same protein are shown, representing general information about the peptide and the respective annotated MS2 spectrum, where the x-axis represents the m/z and the y-axis shows the intensity.
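The percentage sequence coverage mentioned in the Fig. 3 caption can be illustrated with a short sketch: mark every residue of a protein that falls inside at least one observed peptide, then divide by the protein length. This is an assumption about how such a value is typically computed, not a description of the PeptideAtlas implementation. The function name `sequence_coverage` and the example sequences are hypothetical.

```python
from typing import List


def sequence_coverage(protein: str, peptides: List[str]) -> float:
    """Return the percentage of residues covered by at least one peptide.

    Peptides are matched by exact substring search, and overlapping
    peptides are counted once per residue.
    """
    if not protein:
        return 0.0
    covered = [False] * len(protein)
    for pep in peptides:
        if not pep:
            continue  # skip empty strings
        start = protein.find(pep)
        while start != -1:
            for i in range(start, start + len(pep)):
                covered[i] = True
            # Look for further occurrences of the same peptide.
            start = protein.find(pep, start + 1)
    return 100.0 * sum(covered) / len(protein)


# Made-up 16-residue sequence with two non-overlapping peptides.
example_protein = "MKTAYIAKQRQISFVK"
coverage = sequence_coverage(example_protein, ["TAYIAK", "QISFVK"])
print(coverage)  # 12 of 16 residues covered -> 75.0
```

Counting covered residues rather than summing peptide lengths keeps the value at or below 100% when peptides overlap.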
List of peptides selected for SRM based verification along with some details from PeptideAtlas and match score (dotp*) with spectral library.
| Sequence | Accession | ESS | EOS | dotp (+2) | dotp (+3) |
|---|---|---|---|---|---|
| VFVDSCVATQAPDVNSLPR | PAp07598395 | 0.89 | 1.00 | 0.85 | 0.82 |
| ALWSPMGMASALQSPFGVQEK | PAp07599055 | 0.78 | 0.33 | 0.85 | 0.77 |
| QLLQGPVKPLDWR | PAp07598382 | 0.78 | 0.67 | 0.75 | 0.84 |
| ADGAIVGVQCHYPR | PAp07598197 | 0.76 | 0.67 | 0.79 | 0.81 |
| NMITGTSQADAALLIVSAAK | PAp04190051 | 0.75 | 0.26 | 0.76 | 0.76 |
| YSFIENHGCFVDAK | PAp04184446 | 0.74 | 0.67 | 0.80 | 0.84 |
| QPVTPSSVAVQCSEDR | PAp07601931 | 0.72 | 0.67 | 0.76 | 0.80 |
| FPLVPEVQR | PAp07604998 | 0.69 | 0.83 | 0.88 | NA |
| EVAVDFQMR | PAp07599266 | 0.68 | 0.50 | 0.93 | NA |
| IETGVLKPGMVLTFSPAK | PAp04175036 | 0.62 | 0.32 | 0.79 | 0.85 |
| SIEMHHQGLQTALPGHNVGFNIK | PAp04186345 | 0.60 | 0.26 | NA | 0.65 |
| FMPQTQPEK | PAp07599143 | 0.59 | 0.50 | 0.82 | NA |
| VGYSPVLDCHTTHVSCR | PAp04189673 | 0.55 | 0.37 | 0.56 | 0.70 |
| ATFASVPSDAGR | PAp07604862 | 0.55 | 0.50 | 0.87 | NA |
| TLLEVLDSLLPPVR | PAp07588809 | 0.54 | 0.42 | 0.80 | 0.85 |
| IGGVGTVPVGK | PAp04171392 | 0.54 | 0.42 | 0.85 | NA |
| LVPNKPLCVESFFHYPPLGR | PAp04189456 | 0.53 | 0.26 | 0.65 | 0.77 |
| IHINLVIIGHVDSGK | PAp07601820 | 0.49 | 0.16 | 0.64 | 0.71 |
| YTFTIIDAPGHR | PAp04188238 | 0.49 | 0.32 | 0.70 | 0.87 |
| VYNHVPLR | PAp07604851 | 0.49 | 0.33 | 0.88 | NA |
| MDLTEPPFSQK | PAp04189675 | 0.47 | 0.26 | 0.87 | NA |
| STTTGHLVYK | PAp04190655 | 0.45 | 0.26 | 0.75 | NA |
| GDVAGNAQQDPPSDVSSFIAQIIMLNHPGK | PAp04190359 | 0.44 | 0.16 | NA | 0.52 |
| LEDWPQYLMSGDGATVK | PAp07599462 | 0.44 | 0.11 | 0.69 | 0.46 |
| GEFEAGISR | PAp04185068 | 0.44 | 0.21 | 0.85 | NA |
| IGFEIGAVPFIPVSGWSGENMIAPSQK | PAp07598582 | 0.43 | 0.11 | NA | 0.52 |
| LMLDDWSYERPSNYYFLGNVFNLEASVK | PAp07599621 | 0.41 | 0.17 | NA | 0.47 |
| VQFQLEAFMFQEGQSPSIYITCLLK | PAp07605170 | 0.40 | 0.17 | NA | 0.27 |
| QLMVCVNK | PAp07602504 | 0.40 | 0.11 | 0.88 | NA |
| GITIDISLLK | PAp04189685 | 0.39 | 0.16 | 0.82 | NA |
*dotp represents the measure of similarity between spectral library and experimental data.
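As an illustration of the dotp idea, the sketch below computes a normalised dot product (cosine similarity) between library and measured transition intensities. It assumes both intensity lists are given in the same transition order. The exact normalisation used to produce the values in the table is not stated here, so this is a generic formulation rather than the authors' method. The function name `dotp` and the intensity values are hypothetical.

```python
import math
from typing import Sequence


def dotp(library: Sequence[float], observed: Sequence[float]) -> float:
    """Cosine similarity between two intensity vectors.

    Returns a value between 0 and 1 for non-negative intensities;
    1 means the relative intensities match exactly.
    """
    if len(library) != len(observed):
        raise ValueError("intensity vectors must have the same length")
    dot = sum(a * b for a, b in zip(library, observed))
    norm = math.sqrt(sum(a * a for a in library)) * math.sqrt(
        sum(b * b for b in observed)
    )
    # An all-zero vector carries no intensity pattern to compare.
    return dot / norm if norm else 0.0


# Made-up intensities for four transitions of one peptide.
library_intensities = [100.0, 60.0, 30.0, 10.0]
measured_intensities = [950.0, 610.0, 280.0, 120.0]
score = dotp(library_intensities, measured_intensities)
print(round(score, 3))
```

Because the score depends only on relative intensities, a measured peak that is ten times more intense than the library entry can still score close to 1 when the fragment ratios agree.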
Fig. 5 Targeted proteomic verification using spectral libraries. The left panel shows the peak view for the spectral information obtained for the peptide after performing the SRM experiment, and the right panel shows the peak area view of the replicate runs along with the match with the spectral library. (a,b) Spectral information for two peptides showing a single, consistent peak with a good match with the library. (c) Wrongly annotated peak for the given peptide at 5.9 min with a dotp of 0.34 in both replicate runs (right panel). (d) Correctly annotated peak (4.6 min) based on the match with the library (0.85/0.84) in both replicates. [TL1 and TL2 represent the two transition lists; R1 and R2 represent the duplicate runs for the same sample].
| Measurement(s) | Proteins and Peptides |
| Technology Type(s) | Mass Spectrometry |
| Sample Characteristic - Organism | Labeo rohita |
| Sample Characteristic - Location | India |