Celia Blanco, Samuel Verbanic, Burckhard Seelig, Irene A Chen.
Abstract
In vitro evolution is a well-established technique for the discovery of functional RNA and peptides. Increasingly, these experiments are analyzed by high-throughput sequencing (HTS) for both scientific and engineering objectives, but computational analysis of HTS data, particularly for peptide selections, can present a barrier to entry for experimentalists. We introduce EasyDIVER (Easy pre-processing and Dereplication of In Vitro Evolution Reads), a simple, user-friendly pipeline for processing high-throughput sequencing data from in vitro selections and directed evolution experiments. The pipeline takes as input raw, paired-end, demultiplexed Illumina read files. For each sample provided, EasyDIVER outputs a dereplicated list of unique nucleic acid and/or peptide sequences and their read counts.
Keywords: Bioinformatics; High-throughput sequencing; In vitro evolution; SELEX; mRNA display
Year: 2020 PMID: 32529275 PMCID: PMC7324411 DOI: 10.1007/s00239-020-09954-0
Source DB: PubMed Journal: J Mol Evol ISSN: 0022-2844 Impact factor: 2.395
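
The dereplication and translation steps named in the abstract can be illustrated with a minimal Python sketch. This is not EasyDIVER's own code: the function names `dereplicate` and `translate` are hypothetical, the input is assumed to be already-joined DNA reads, and the handling of incomplete or unrecognized codons (dropped or written as `X`) is an assumption made for the example.

```python
from collections import Counter

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def dereplicate(sequences):
    """Collapse identical sequences into (sequence, count) pairs, most abundant first."""
    return Counter(sequences).most_common()


def translate(dna):
    """Translate DNA in frame 1; trailing bases that do not fill a codon are dropped.

    Codons containing characters other than A, C, G, T (for example N) become 'X'.
    """
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


if __name__ == "__main__":
    reads = ["ATGGCC", "ATGGCC", "ATGGCT", "ATGTAA"]
    print(dereplicate(reads))                          # unique DNA sequences with counts
    print(dereplicate(translate(r) for r in reads))    # unique peptides with counts
```

Note that two distinct DNA sequences can encode the same peptide, so the peptide list is generally shorter than the nucleic acid list.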
Fig. 1 EasyDIVER flow chart. Input and output files are represented by white rectangles, subdirectory names by dashed gray rectangles, processes by rounded gray rectangles, and additional requirements by black ovals. Letters enclosed in diamond shapes represent flag variables (Table 1). The gray outline rectangle, together with the enclosed flags, represents the overall PANDAseq process.
Table 1 Flag variables
| Flag | Description | Comments |
|---|---|---|
| -i | Input directory path and name | Required |
| -o | Output directory path and name | Optional. Default value: /pipeline.output |
| -p | Extraction forward DNA primer | Optional |
| -q | Extraction reverse DNA primer | Optional |
| -T | Number of threads used for computation | Optional. Default value: 1 |
| -a | Translation into amino acids is performed | Optional. Default value: FALSE |
| -r | Files for individual lanes are retained | Optional. Default value: FALSE |
| -e | Additional internal PANDAseq flags | Optional. Must be entered in quotation marks (e.g., -e "-L 50"). Default value: "-l 1 -d rbfkms" |
| -h | Help message | Optional |
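
As an illustration of how the flags in the table combine, the following Python sketch assembles an argument list for one run. Several details are assumptions not stated in this record: the script name `easydiver.sh`, the helper name `build_command`, and the treatment of `-a` and `-r` as bare switches that take no value. The directory name in the example is a placeholder.

```python
import shlex


def build_command(input_dir, output_dir=None, fwd_primer=None, rev_primer=None,
                  threads=None, translate=False, retain_lanes=False,
                  pandaseq_flags=None, script="easydiver.sh"):
    """Build an argument list from the flag table; only -i is required."""
    cmd = [script, "-i", input_dir]
    if output_dir is not None:
        cmd += ["-o", output_dir]
    if fwd_primer is not None:
        cmd += ["-p", fwd_primer]
    if rev_primer is not None:
        cmd += ["-q", rev_primer]
    if threads is not None:
        cmd += ["-T", str(threads)]
    if translate:
        cmd.append("-a")      # assumed to be a bare switch
    if retain_lanes:
        cmd.append("-r")      # assumed to be a bare switch
    if pandaseq_flags is not None:
        # Passed as ONE argument, which is what the quotation marks achieve in a shell.
        cmd += ["-e", pandaseq_flags]
    return cmd


if __name__ == "__main__":
    cmd = build_command("raw_reads", threads=4, translate=True, pandaseq_flags="-L 50")
    print(shlex.join(cmd))    # shell-quoted form of the command
```

Keeping the command as a list avoids quoting mistakes with `-e`, whose value contains spaces; `shlex.join` is used only to display the shell-quoted form.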
Fig. 2 Peptide length histogram. Normalized length distribution of translated sequences for the two samples in the test dataset, (a) test1_S1 and (b) test2_S2, using a bin size of 10. See Supporting Figure S2 for the DNA length distributions.
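
A normalized length distribution of the kind shown in Fig. 2 could be computed from a dereplicated peptide list roughly as follows. This is a sketch under stated assumptions: the function name `length_histogram` is hypothetical, bins are taken as half-open intervals starting at 0 (0 to 9, 10 to 19, and so on), and each sequence is weighted by its read count. The source does not say whether the published figure weights by reads or by unique sequences.

```python
from collections import Counter


def length_histogram(seq_counts, bin_size=10):
    """Return {bin_start: fraction of reads} for (sequence, count) pairs.

    Each sequence contributes its read count to the bin containing its length;
    fractions sum to 1. An empty input gives an empty dict.
    """
    totals = Counter()
    for seq, count in seq_counts:
        bin_start = (len(seq) // bin_size) * bin_size
        totals[bin_start] += count
    total = sum(totals.values())
    if total == 0:
        return {}
    return {start: totals[start] / total for start in sorted(totals)}


if __name__ == "__main__":
    peptides = [("MA", 3), ("M" * 12, 1)]   # toy data, not from the test dataset
    for start, fraction in length_histogram(peptides).items():
        print(f"{start}-{start + 9}: {fraction:.2f}")
```

To count unique sequences instead of reads, pass a count of 1 for every sequence.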