Chao Wang1, Haicang Zhang1, Wei-Mou Zheng2, Dong Xu3, Jianwei Zhu4, Bing Wang4, Kang Ning5, Shiwei Sun4, Shuai Cheng Li6, Dongbo Bu4.
1. Key Lab of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China.
2. Institute of Theoretical Physics, Chinese Academy of Sciences, Beijing, China.
3. Department of Computer Science and Christopher S. Bond Life Sciences Center, University of Missouri, MO 65211, USA.
4. Key Lab of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China.
5. College of Life Science, Huazhong University of Science and Technology, Wuhan, China.
6. Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong.
Abstract
SUMMARY: Protein structure prediction approaches fall into two categories: template-based modeling (including homology modeling and threading) and free modeling. Existing threading tools perform poorly on remote homologous proteins, so improving fold recognition for these proteins remains a challenge. Proteome-wide structure prediction poses a further challenge: increasing prediction throughput. In this study, we present FALCON@home, a protein structure prediction server focusing on remote homologue identification. Its design is based on the observation that a structural template, especially for remote homologous proteins, consists of conserved regions interwoven with highly variable regions, and that the highly variable regions lead to vague alignments in threading approaches. FALCON@home therefore first extracts conserved regions from each template and then aligns a query protein with the conserved regions only, rather than with the full-length template. This avoids the vague alignments rooted in highly variable regions and thereby improves remote homologue identification. We implemented FALCON@home using the Berkeley Open Infrastructure for Network Computing (BOINC) volunteer computing protocol. With computing power donated by over 20,000 volunteer CPUs, FALCON@home achieves a throughput of over 1000 proteins per day. In the Critical Assessment of protein Structure Prediction (CASP11), the FALCON@home-based prediction ranked 12th in the template-based modeling category. As an application, we predicted the structures of 880 mouse mitochondrial proteins, which revealed a significant correlation between protein half-lives and protein structural factors.
AVAILABILITY AND IMPLEMENTATION: FALCON@home is freely available at http://protein.ict.ac.cn/FALCON/.
CONTACT: shuaicli@cityu.edu.hk, dbu@ict.ac.cn
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
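The two-step idea in the summary (extract conserved regions from a template, then compare the query against those regions only) can be illustrated with a toy Python sketch. This is not the FALCON@home algorithm: the per-residue conservation scores, the `threshold` and `min_len` parameters, and the ungapped identity count used as a stand-in for threading are all assumptions made for illustration.

```python
from typing import List, Tuple


def conserved_regions(scores: List[float],
                      threshold: float = 0.7,
                      min_len: int = 3) -> List[Tuple[int, int]]:
    """Return half-open (start, end) runs where every score >= threshold.

    `scores` is a hypothetical per-residue conservation profile of a
    template; runs shorter than `min_len` are discarded.
    """
    regions: List[Tuple[int, int]] = []
    start = None
    for i, s in enumerate(scores):
        if s >= threshold:
            if start is None:
                start = i          # a new conserved run begins here
        else:
            if start is not None and i - start >= min_len:
                regions.append((start, i))
            start = None           # variable residue ends any open run
    if start is not None and len(scores) - start >= min_len:
        regions.append((start, len(scores)))
    return regions


def best_ungapped_match(query: str, block: str) -> int:
    """Slide `block` along `query` and return the best identity count.

    A deliberately crude stand-in for a real threading score.
    """
    best = 0
    for off in range(len(query) - len(block) + 1):
        window = query[off:off + len(block)]
        best = max(best, sum(q == b for q, b in zip(window, block)))
    return best


def score_against_conserved(query: str, template: str,
                            scores: List[float]) -> int:
    """Sum the best per-block matches, ignoring variable regions entirely."""
    blocks = [template[a:b] for a, b in conserved_regions(scores)]
    return sum(best_ungapped_match(query, blk) for blk in blocks)


if __name__ == "__main__":
    template = "ACDEFGHIKL"
    profile = [0.9, 0.8, 0.9, 0.2, 0.1, 0.95, 0.9, 0.85, 0.9, 0.3]
    print(conserved_regions(profile))
    print(score_against_conserved("MMACDWWGHIKPP", template, profile))
```

In this sketch the variable residues of the template never contribute to the score, which mirrors the motivation stated in the summary; a real threading method would use gapped alignment and structure-aware scoring instead of identity counts.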
Authors: Agnieszka S Karczyńska; Karolina Ziȩba; Urszula Uciechowska; Magdalena A Mozolewska; Paweł Krupa; Emilia A Lubecka; Agnieszka G Lipska; Celina Sikorska; Sergey A Samsonov; Adam K Sieradzan; Artur Giełdoń; Adam Liwo; Rafał Ślusarz; Magdalena Ślusarz; Jooyoung Lee; Keehyoung Joo; Cezary Czaplewski Journal: J Chem Inf Model Date: 2020-02-11 Impact factor: 4.956
Authors: Lellys M Contreras; Paz Sevilla; Ana Cámara-Artigas; José G Hernández-Cifre; Bruno Rizzuti; Francisco J Florencio; María Isabel Muro-Pastor; José García de la Torre; José L Neira Journal: Int J Mol Sci Date: 2018-06-24 Impact factor: 5.923