Liya Kondratyeva, Irina Alekseenko, Igor Chernov, Eugene Sverdlov.
Abstract
In this brief review, we attempt to demonstrate that the incompleteness of data, as well as the intrinsic heterogeneity of biological systems, may form very strong and possibly insurmountable barriers for researchers trying to decipher how living systems function. We illustrate this challenge using the two most studied organisms: E. coli, in which 34.6% of genes lack experimental evidence of function, and C. elegans, in which proteins have been identified for approximately 50% of genes. Another striking example is JCVI-syn3.0, an artificial unicellular entity with a minimal set of genes: 31.5% of its genes cannot be ascribed a specific biological function. The human interactome mapping project identified only 5-10% of all protein interactions in humans. In addition, most of the available data are static snapshots, so it is barely possible to generate realistic models of the dynamic processes within cells. Moreover, the existing interactomes reflect the fact of an interaction but not its functional result, which is an unpredictable emergent property. Perhaps the completeness of molecular data on any living organism is beyond our reach and represents an unsolvable problem in biology.
Keywords: big data; bioinformatics; complexity; genome; systems biology
Year: 2022 PMID: 36009835 PMCID: PMC9404739 DOI: 10.3390/biology11081208
Source DB: PubMed Journal: Biology (Basel) ISSN: 2079-7737
Figure 1. A simple illustration of the data incompleteness problem. Achieving “n = all” for many kinds of biological data may be an unreachable or unrealistic goal. The same set of incomplete data can support different versions of the organization of any biological process.
Figure 2. Illustration of a protein–protein interaction network. Colored lines indicate interactions between pairs of proteins, some of which are possibly spurious. The color of a line indicates the strength of the interaction: the closer the color is to red, the stronger the interaction. Translucent colored clouds grouping proteins into clusters symbolize common localization or function.