| Literature DB >> 33426353 |
Abstract
Protein sequence similarity networks (SSNs) constitute a convenient approach to analyze large polypeptide sequence datasets, and have been successfully applied to study a number of protein families over the past decade. SSN analysis is herein combined with traditional cladistic and phenetic phylogenetic analysis (respectively based on multiple sequence alignments and all-against-all three-dimensional protein structure comparisons) in order to assist the ancestral reconstruction and integrative revision of the superfamily of metallo-β-lactamases (MBLs). It is shown that only 198 out of 15,292 representative nodes contain at least one experimentally obtained protein structure in the Protein Data Bank or a manually annotated SwissProt entry, that is to say, only 1.3 % of the superfamily has been functionally and/or structurally characterized. Besides, neighborhood connectivity coloring, which measures local network interconnectivity, is introduced for detection of protein families within SSN clusters. This approach provides a clear picture of how many families remain unexplored in the superfamily, while most MBL research is heavily biased towards a few families. Further research is suggested in order to determine the SSN topological properties, which will be instrumental for the improvement of automated sequence annotation methods.Entities:
Keywords: Metallo-lactamase; Neighborhood connectivity; Protein superfamily; Sequence similarity network; Tanglegram
Year: 2021 PMID: 33426353 PMCID: PMC7785958 DOI: 10.1016/j.heliyon.2020.e05867
Source DB: PubMed Journal: Heliyon ISSN: 2405-8440