Benjamin D Lee, Michael A Timony, Pablo Ruiz.
Abstract
Raw DNA sequences contain an immense amount of meaningful biological information. However, these sequences are hard for humans to intuitively interpret. To solve this problem, a number of methods have been proposed to transform DNA sequences into two-dimensional visualizations. DNAvisualization.org implements several of these methods in a cost-effective and performant manner via a novel, entirely serverless architecture. By taking advantage of recent developments in serverless parallel computing and selective data retrieval, the website is able to offer users the ability to visualize up to thirty 4.5 Mb DNA sequences simultaneously using one of five supported methods and to export these visualizations in a variety of publication-ready formats.
Year: 2019 PMID: 31170285 PMCID: PMC6602497 DOI: 10.1093/nar/gkz404
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Sequence mode.
Figure 2. File mode.
Figure 3. A diagram displaying the flow of information during initial sequence transformation (1) and sequence querying (2). For initial sequence transformation, FASTA files (1A) are parsed in the user’s browser and submitted asynchronously in parallel to the serverless Lambda functions (1B). If the sequence does not already have an existing transformation using the specified visualization method, the functions transform the sequence and store the data in AWS S3 (1C) in the binary Parquet format. The functions then downsample the transformed data and return it to the client’s browser in JSON format (1E) to be plotted (1F). If a user wishes to see more detail of a particular region (2A), the browser sends an asynchronous query containing the location of the region for each plotted DNA sequence to a Lambda function (not shown). In parallel, each function converts the query into a SQL statement and submits the query to S3 Select (2B). S3 Select scans the transformed DNA sequence and returns only the data in the region to the Lambda function, which in turn downsamples the data, converts it to JSON (2C) and returns it to the user’s browser for plotting (2D).
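The query path in Figure 3 (steps 2A to 2C) can be illustrated with a minimal Python sketch. It is not the site's actual code. The column names `x` and `y`, the function names `region_query` and `downsample`, and the stride-based decimation are all assumptions made for illustration. The record only states that the query is converted to a SQL statement and that the result is downsampled before being returned as JSON.

```python
import json
from typing import List, Tuple

Point = Tuple[float, float]


def region_query(x_min: float, x_max: float) -> str:
    """Build an S3 Select-style SQL statement for one x-axis window.

    The column names 'x' and 'y' are hypothetical; the real schema of the
    stored Parquet files is not given in the source.
    """
    if x_min > x_max:
        raise ValueError("x_min must not exceed x_max")
    return (
        "SELECT s.x, s.y FROM S3Object s "
        f"WHERE s.x >= {float(x_min)} AND s.x <= {float(x_max)}"
    )


def downsample(points: List[Point], max_points: int) -> List[Point]:
    """Return at most max_points points by keeping every k-th point.

    Simple stride decimation, chosen here only as an example; the source
    does not say which downsampling method is used.
    """
    if max_points < 1:
        raise ValueError("max_points must be at least 1")
    if len(points) <= max_points:
        return list(points)
    stride = -(-len(points) // max_points)  # ceiling division
    return points[::stride]


def to_json(points: List[Point]) -> str:
    """Serialize points as a JSON object of parallel x and y arrays."""
    return json.dumps({"x": [p[0] for p in points], "y": [p[1] for p in points]})


if __name__ == "__main__":
    # Stand-in for the rows S3 Select would return for the requested region.
    rows = [(float(i), float(i % 5)) for i in range(100, 201)]
    print(region_query(100, 200))
    print(to_json(downsample(rows, 10)))
```

In this sketch the SQL string would be passed to S3 Select by the Lambda function, and only the downsampled JSON would travel back to the browser, which keeps the response small regardless of sequence length.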