Robert S Nash, Shuai Weng, Kalpana Karra, Edith D Wong, Stacia R Engel, J Michael Cherry.
Abstract
The identification and accurate quantitation of protein abundance have been a major objective of proteomics research. Abundance studies have the potential to provide users with data that can be used to gain a deeper understanding of protein function and regulation and can also help identify cellular pathways and modules that operate under various environmental stress conditions. One of the central missions of the Saccharomyces Genome Database (SGD; https://www.yeastgenome.org) is to work with researchers to identify and incorporate datasets of interest to the wider scientific community, thereby enabling hypothesis-driven research. A large number of studies have detailed efforts to generate proteome-wide abundance data, but deeper analyses of these data have been hampered by the inability to compare results between studies. Recently, a unified protein abundance dataset was generated through the evaluation of more than 20 abundance datasets, which were normalized and converted to common measurement units, in this case molecules per cell. We have incorporated these normalized protein abundance data and associated metadata into the SGD database, as well as the SGD YeastMine data warehouse, resulting in the addition of 56 487 values for untreated cells grown in either rich or defined media and 28 335 values for cells treated with environmental stressors. Abundance data for protein-coding genes are displayed in a sortable, filterable table on Protein pages, available through Locus Summary pages. A median abundance value was incorporated, and a median absolute deviation was calculated for each protein-coding gene and incorporated into SGD. These values are displayed in the Protein section of the Locus Summary page. The inclusion of these data has enhanced the quality and quantity of protein experimental information presented at SGD and provides opportunities for researchers to access and utilize the data to further their research.
Year: 2020 PMID: 32128557 PMCID: PMC7054198 DOI: 10.1093/database/baaa008
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 3.451
Figure 1. Experimental Data Section of the Protein Page. This section of the Protein page contains two tables, one containing protein half-life data and the second containing the protein abundance data, and associated metadata, along with the original reference and the reference for the combined unified dataset. This table is both sortable and filterable.
Figure 2. Protein Section of the Locus Summary Page. The protein section of the Locus Summary pages, located between the Sequence and Gene Ontology sections, contains the calculated median and MAD for the protein of interest expressed in molecules/cell in addition to basic sequence-derived information (length, molecular weight and isoelectric point). Median was calculated based on all values for a given protein from untreated cells, and MAD was calculated using the same values. When the median value was generated based on a single value, a MAD could not be calculated.
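
The median and MAD summary described in the Figure 2 caption can be illustrated with a minimal Python sketch. This is not SGD's actual code: the function name `median_and_mad` and the example abundance values are hypothetical, and the sketch assumes the plain (unscaled) median absolute deviation, since the text does not say whether a scaling constant was applied. A single input value yields no MAD, mirroring the caption.

```python
from statistics import median
from typing import Optional, Sequence, Tuple


def median_and_mad(values: Sequence[float]) -> Tuple[float, Optional[float]]:
    """Summarize abundance values (molecules/cell) for one protein.

    Returns (median, MAD). MAD is None when only one value is available,
    following the caption's statement that it could not be calculated.
    Assumption: unscaled MAD, i.e. median(|x_i - median(x)|).
    """
    if not values:
        raise ValueError("at least one abundance value is required")
    med = median(values)
    if len(values) == 1:
        return med, None
    # Median of the absolute deviations from the median.
    mad = median([abs(v - med) for v in values])
    return med, mad


# Hypothetical values from untreated cells, not taken from SGD.
print(median_and_mad([1000, 1200, 1500, 5000]))  # (1350.0, 250.0)
print(median_and_mad([42.0]))                    # (42.0, None)
```

Both statistics are robust to outliers: the single high value of 5000 in the hypothetical example shifts the median and MAD far less than it would shift a mean and standard deviation.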