Qian Li1, Xiaojun Peng1, Yuanqing Li1, Wenqin Tang1, Jia'an Zhu1, Jing Huang1, Yifei Qi2, Zhuqing Zhang1.
Abstract
Liquid-liquid phase separation (LLPS) converts a homogeneous solution into a dense phase, which often resembles liquid droplets, and a dilute phase. A growing number of studies have shown that biomolecular condensates formed by LLPS play important roles in both physiology and pathology. It has been suggested that the phase behavior of proteins is determined not only by their sequences but also by micro-environmental conditions. Here, we introduce LLPSDB (http://bio-comp.ucas.ac.cn/llpsdb or http://bio-comp.org.cn/llpsdb), a web-accessible database providing a comprehensive, carefully curated collection of proteins involved in LLPS, together with the corresponding in vitro experimental conditions, drawn from the published literature. The current release of LLPSDB incorporates 1182 entries with 273 independent proteins and 2394 specific conditions. The database provides a variety of data including biomolecular information (protein sequence, protein modification, nucleic acid, etc.), specific phase separation information (experimental conditions, phase behavior description, etc.) and comprehensive annotations. To our knowledge, LLPSDB is the first available database designed specifically for LLPS-related proteins. It offers valuable resources for exploring the relationship between protein sequence and phase behavior and will support the development of phase separation prediction methods, which may in turn provide further insight into the role of LLPS in cellular function and related diseases.
Year: 2020 PMID: 31906602 PMCID: PMC6943074 DOI: 10.1093/nar/gkz778
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Data structure of LLPSDB. Most of the data are curated from the literature, including information on the main components (proteins and nucleic acids, listed in the left column of the top box), experimental conditions and phase behavior descriptions (listed in the right column of the top box). Some protein sequences and important annotations of natural proteins are taken from UniProt/NCBI (26) and MobiDB (28) (boxes with yellow frames). Other annotations are added to organize the data methodically (box with green frame).
Overview of the data subsets in LLPSDB
| Classification | Data type | Number of entries | Proportion (%) |
|---|---|---|---|
| Protein type | Natural protein | 1054 | 85 |
| | Designed protein | 191 | 15 |
| Main components type | Protein(s) | 913 | 77 |
| | Protein(s) + RNA | 183 | 16 |
| | Protein(s) + DNA | 86 | 7 |
| Main components number | One component | 542 | 46 |
| | Two components | 502 | 43 |
| | More components | 128 | 11 |
Figure 2. Statistical analysis of protein sequences in simple LLPS entries. ‘Protein(1)’ means that the LLPS system contains only one IDR-containing protein, and ‘Protein(1)+Nucleic acid’ means that the system also contains DNA or RNA. For the two subsets, the amino acid compositions of the proteins (A), the LCRs (B), and the disordered (top panel of C) and folded (bottom panel of C) regions are presented on the left side. On the right side, the sequence length distributions of the proteins and LCRs in the subsets are shown in (D) and (E), respectively; (F) shows the distribution of SCD (sequence charge decoration) for the whole protein sequences in the subsets. The corresponding statistics for the human proteome are shown for comparison (black bars), where all sequences were used in (A), (B) and (D), and disordered and folded regions were used in the top and bottom panels of (C), respectively.
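For readers unfamiliar with the SCD metric in panel (F), the following is a minimal sketch of how such a value might be computed for a single sequence. It assumes the commonly used definition SCD = (1/N) Σ_{i>j} q_i q_j |i − j|^{1/2} and a simple charge model (K/R = +1, D/E = −1, all other residues neutral); the record does not state which charge assignments or implementation the authors used, and the function name `sequence_charge_decoration` is hypothetical.

```python
from math import sqrt

# Assumed charge model (not taken from the source): Lys/Arg = +1,
# Asp/Glu = -1, every other residue (including His) treated as neutral.
CHARGES = {"K": 1.0, "R": 1.0, "D": -1.0, "E": -1.0}


def sequence_charge_decoration(seq: str) -> float:
    """Return SCD = (1/N) * sum over pairs i>j of q_i * q_j * sqrt(i - j)."""
    n = len(seq)
    if n == 0:
        raise ValueError("sequence must not be empty")

    # Keep only charged positions; neutral residues contribute nothing
    # to the pair sum, so skipping them shortens the double loop.
    charged = [
        (i, CHARGES[aa]) for i, aa in enumerate(seq.upper()) if aa in CHARGES
    ]

    total = 0.0
    for a in range(1, len(charged)):
        i, q_i = charged[a]
        for b in range(a):
            j, q_j = charged[b]
            total += q_i * q_j * sqrt(i - j)

    # Normalise by the full sequence length N, not the number of charges.
    return total / n


if __name__ == "__main__":
    # Same composition, different charge patterning.
    print("KEKE:", sequence_charge_decoration("KEKE"))
    print("KKEE:", sequence_charge_decoration("KKEE"))
```

Under these assumptions, two sequences with identical composition can give different SCD values: the block-segregated `KKEE` yields a more negative value than the alternating `KEKE`, which is why the metric is used to describe charge patterning rather than net charge.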
Figure 3. The Browse module in LLPSDB. (A) The dataset can be browsed by three distinct classifications. (B) An extra ‘Protein List page’ is displayed when browsing natural or designed proteins. (C) The ‘Protein Details page’ consists of two parts: protein information and the related ‘Table of Entries’. (D) The ‘Table of Entries’ displays brief information on the main components of each entry. (E) The expanded ‘Entry page’ displays comprehensive information on the entry, including general information on the main components and the phase separation conditions.