Douglas M Fowler, Carlos L Araya, Wayne Gerard, Stanley Fields.
Abstract
SUMMARY: Measuring the consequences of mutation in proteins is critical to understanding their function. These measurements are essential in such applications as protein engineering, drug development, protein design and genome sequence analysis. Recently, high-throughput sequencing has been coupled to assays of protein activity, enabling the analysis of large numbers of mutations in parallel. We present Enrich, a tool for analyzing such deep mutational scanning data. Enrich identifies all unique variants (mutants) of a protein in high-throughput sequencing datasets and can correct for sequencing errors using overlapping paired-end reads. Enrich uses the frequency of each variant before and after selection to calculate an enrichment ratio, which is used to estimate fitness. Enrich provides an interactive interface to guide users. It generates user-accessible output for downstream analyses as well as several visualizations of the effects of mutation on function, thereby allowing the user to rapidly quantify and comprehend sequence-function relationships.
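The enrichment-ratio calculation described in the summary can be illustrated with a minimal Python sketch. This is not Enrich's code or interface. It assumes the ratio is the variant's frequency after selection divided by its frequency before selection, reported on a log2 scale, and that variants missing from either library are skipped. The function name `enrichment_ratios` and the variant labels and counts are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter


def enrichment_ratios(before_counts, after_counts):
    """Return {variant: log2(freq_after / freq_before)}.

    Assumption (not taken from Enrich itself): variants with zero counts
    in either library are skipped because the ratio is undefined for them.
    """
    total_before = sum(before_counts.values())
    total_after = sum(after_counts.values())
    ratios = {}
    for variant, n_before in before_counts.items():
        n_after = after_counts.get(variant, 0)
        if n_before == 0 or n_after == 0:
            continue
        # Convert raw read counts to within-library frequencies.
        freq_before = n_before / total_before
        freq_after = n_after / total_after
        ratios[variant] = math.log2(freq_after / freq_before)
    return ratios


# Hypothetical read counts per variant in the input and selected libraries.
before = Counter({"WT": 500, "A12S": 300, "G45D": 200})
after = Counter({"WT": 500, "A12S": 400, "G45D": 100})

log2_ratios = enrichment_ratios(before, after)
for variant, value in sorted(log2_ratios.items()):
    print(f"{variant}\t{value:+.3f}")
```

In this sketch a positive value means the variant became more frequent after selection and a negative value means it was depleted. Normalizing by library totals keeps the ratio comparable when the two libraries were sequenced to different depths.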
Year: 2011 PMID: 22006916 PMCID: PMC3232369 DOI: 10.1093/bioinformatics/btr577
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Enrich visualizations. Enrich produces three visualizations; examples from the dataset included with Enrich are shown here. (a) The diversity within a library is illustrated by a heatmap of the frequency of each position–mutation combination. (b) The position-averaged change in mutational frequency between two libraries is shown. (c) The log2-scaled enrichment ratio for each position–mutation combination is plotted, individually organized both by position and by amino acid (a single amino acid, serine, is shown). Blue dots indicate the enrichment or depletion of substitutions. Red squares correspond to wild-type residues. Grey squares correspond to unobserved mutations.