Andrew J Miles, Sergio G Ramalli, B A Wallace.
Abstract
Circular dichroism (CD) spectroscopy is a widely used method for characterizing the secondary structures of proteins. The well‐established and highly used analysis website, DichroWeb (located at: http://dichroweb.cryst.bbk.ac.uk/html/home.shtml), enables the facile quantitative determination of helix, sheet, and other secondary structure contents of proteins based on their CD spectra. DichroWeb includes a range of reference datasets and algorithms, plus graphical and quantitative methods for determining the quality of the analyses produced. This article describes the current website content, usage and accessibility, as well as the many upgraded features now present in this highly popular tool that was originally created nearly two decades ago.
Keywords: bioinformatics; calculations; circular dichroism spectroscopy; data analyses; disordered structure; protein secondary structure; reference datasets; soluble and membrane proteins; α-helix; β-sheet
Year: 2021 PMID: 34216059 PMCID: PMC8740839 DOI: 10.1002/pro.4153
Source DB: PubMed Journal: Protein Sci ISSN: 0961-8368 Impact factor: 6.725
FIGURE 1. The DichroWeb Server. (a) The “Landing Page” for the DichroWeb server, located at: http://dichroweb.cryst.bbk.ac.uk/html/home.shtml. It indicates details of how to obtain an account, the link to data input/analysis sections, links to associated video guides and other related software, and usage statistics. (b) The DichroWeb server “Data Input” page
Characteristics of the algorithms available in DichroWeb (and references to the original articles describing them)
| Algorithm [reference] | Method | Reference datasets | Notes |
|---|---|---|---|
| SELCON3 | SVD plus variable selection | | A suitable NRMSD value is ≤0.1 |
| CONTINLL | RR plus variable selection | | A suitable NRMSD value is ≤0.1 |
| CDSSTR | SVD plus variable selection | | A suitable NRMSD value is ≤0.01. Many iterations are done, so this can take longer than other methods |
| VARSLC | SVD plus variable selection | Built‐in dataset of 33 spectra | Requires data from 260–178 nm. No NRMSD (or other goodness‐of‐fit parameter), nor back‐calculated spectrum produced |
| K2D | NN | Built‐in weightings | Requires data from 241 to 200 nm |
Abbreviations: NN, neural network; RR, ridge regression; SVD, singular value decomposition.
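The NRMSD thresholds in the table above can be applied programmatically once a back‐calculated spectrum is available. The sketch below assumes the commonly cited definition NRMSD = sqrt(Σ(θ_exp − θ_calc)² / Σ θ_exp²); that formula is not stated in this text, and the function and variable names are hypothetical, not part of DichroWeb.

```python
import math

# Thresholds copied from the table above (algorithm name -> maximum suitable NRMSD).
NRMSD_LIMITS = {"SELCON3": 0.1, "CONTINLL": 0.1, "CDSSTR": 0.01}


def nrmsd(experimental, calculated):
    """Normalized root-mean-square deviation between two spectra.

    Assumed definition: sqrt(sum((exp - calc)^2) / sum(exp^2)),
    with both spectra sampled at the same wavelengths.
    """
    if len(experimental) != len(calculated):
        raise ValueError("spectra must have the same number of points")
    numerator = sum((e - c) ** 2 for e, c in zip(experimental, calculated))
    denominator = sum(e ** 2 for e in experimental)
    if denominator == 0:
        raise ValueError("experimental spectrum is all zeros")
    return math.sqrt(numerator / denominator)


def fit_is_suitable(algorithm, value):
    """Compare an NRMSD value with the threshold listed for the algorithm."""
    return value <= NRMSD_LIMITS[algorithm]
```

With these thresholds, a CONTINLL value of 0.034 would pass and a value of 0.219 would not. VARSLC and K2D are deliberately absent from the lookup, because the table lists no NRMSD threshold for them.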
Characteristics of the reference datasets available for use in DichroWeb, and the secondary structure classifications they produce (αR, regular helix; αD, disordered (end of) helix; βR, regular sheet; βD, disordered (end of) sheet; T, turn (any type); PP2, polyproline II; and U, unordered/other)
| Reference dataset | Types of proteins in dataset | Wavelength range (nm) | Number of proteins | Structure assignments |
|---|---|---|---|---|
| SET1 | Soluble globular | 178–260 | 29 | αR, αD, βR, βD, T, U |
| SET2 | Soluble globular | 178–260 | 22 | α‐helix, 3₁₀ helix, β, T, PP2, U |
| SET3 | Soluble globular | 185–240 | 37 | αR, αD, βR, βD, T, U |
| SET4 | Soluble globular | 190–240 | 43 | αR, αD, βR, βD, T, U |
| SET5 | Soluble globular | 178–260 | 17 | helix, β, turn, PP2, U |
| SET6 | Soluble globular and denatured proteins | 185–240 | 42 | αR, αD, βR, βD, T, U |
| SET7 | Soluble globular and denatured proteins | 190–240 | 48 | αR, αD, βR, βD, T, U |
| SP175 | Soluble globular | 175–240 | 71 | αR, αD, βR, βD, T, U |
| SP175t | Soluble globular | 190–240 | 71 | αR, αD, βR, βD, T, U |
| SMP180 | Membrane and soluble proteins | 180–240 | 128 | αR, αD, βR, βD, T, U |
| SMP180t | Membrane and soluble proteins | 190–240 | 128 | αR, αD, βR, βD, T, U |
| Cryst175 | Proteins with crystallin‐type folds | 175–240 | 9 | αR, αD, βR, βD, T, U |
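Because each reference dataset covers a fixed wavelength range, the range of a measured spectrum limits which datasets can be chosen. The sketch below encodes the ranges from the table above and lists the datasets whose full range is covered by the data. The function name and the strict "data must span the whole range" rule are assumptions made for illustration; the server's own validation may differ.

```python
# Wavelength ranges (nm) copied from the reference dataset table above.
REFERENCE_DATASETS = {
    "SET1": (178, 260),
    "SET2": (178, 260),
    "SET3": (185, 240),
    "SET4": (190, 240),
    "SET5": (178, 260),
    "SET6": (185, 240),
    "SET7": (190, 240),
    "SP175": (175, 240),
    "SP175t": (190, 240),
    "SMP180": (180, 240),
    "SMP180t": (190, 240),
    "Cryst175": (175, 240),
}


def usable_datasets(data_low_nm, data_high_nm):
    """Return the datasets whose wavelength range lies within the measured data."""
    return [
        name
        for name, (low, high) in REFERENCE_DATASETS.items()
        if data_low_nm <= low and data_high_nm >= high
    ]


# A spectrum collected from 240 nm down to 190 nm:
print(usable_datasets(190, 240))
```

For data that extend only to 190 nm, this leaves SET4, SET7, SP175t, and SMP180t, which is why truncated ("t") versions of the larger datasets are useful for conventional instruments.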
Data input parameters showing the available options
| Section | Input | Options available |
|---|---|---|
| Information about the input data and analysis parameters | File format | Free (2 column); Free (2 column) (with preview); DRS (Daresbury synchrotron format); yy (2 column); BP (bitpad scanned) (2 column); Applied Photophysics; Aviv v4.1i; Aviv:CDS; Aviv v2.86; Jasco: v.1.30; Jasco: v1.50 |
| | Input units | Delta epsilon; Mean residue ellipticity; mdeg/theta (machine units); DRS (yy units) |
| | Initial wavelength in the data file | First wavelength listed in data file (regardless of whether ordered from high to low, or low to high wavelengths) |
| | Final wavelength in the data file | Last wavelength listed in data file |
| | Wavelength step (interval) in the data file | 1, 0.5, 0.2, 0.1 (all in nm) |
| | Lowest wavelength to use in analysis | (in nm) (subject to constraints of the method and database used, and quality of the data) |
| Choice of analysis methods | Analysis programs | SELCON3; CONTINLL; VARSLC; CDSSTR; K2D |
| | Reference datasets (minimum wavelength range required) | Set1 (178–260 nm); Set2 (178–260 nm); Set3 (185–240 nm); Set4 (190–240 nm); Set5 (178–260 nm); Set6 (185–240 nm); Set7 (190–240 nm); SP175 (175–240 nm); SP175t (190–240 nm); SMP180 (180–240 nm); SMP180t (190–240 nm); CRYST175 (175–240 nm) |
| Advanced option | Optional scaling factor (use with caution!) | 0.5–1.5× |
| Output options | Output units | Delta epsilon; Mean Residue Ellipticity (MRE); mdeg (theta, machine units); DRS (yy units) |
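The input and output unit options listed above are related by standard conversions. The sketch below uses the textbook relations MRE = θ(mdeg) × MRW / (10 × pathlength(cm) × concentration(mg/mL)) and Δε = MRE / 3298, then formats the result as two‐column text. These relations are not given in this text, the function names are hypothetical, and the tab‐separated layout is only an assumption about what a "Free (2 column)" file might look like.

```python
def mdeg_to_mre(theta_mdeg, mean_residue_weight, path_cm, conc_mg_per_ml):
    """Convert machine units (mdeg) to mean residue ellipticity.

    Result is in deg cm^2 dmol^-1. mean_residue_weight is the protein
    molecular mass divided by (number of residues - 1), in Da.
    """
    return theta_mdeg * mean_residue_weight / (10.0 * path_cm * conc_mg_per_ml)


def mre_to_delta_epsilon(mre):
    """Convert mean residue ellipticity to delta epsilon (M^-1 cm^-1)."""
    return mre / 3298.0


def format_two_column(wavelengths_nm, values):
    """Format a spectrum as tab-separated 'wavelength value' lines (assumed layout)."""
    if len(wavelengths_nm) != len(values):
        raise ValueError("wavelengths and values must have the same length")
    return "\n".join(f"{w:.1f}\t{v:.4f}" for w, v in zip(wavelengths_nm, values))


# Example: two points measured in a 0.01 cm cell at 1 mg/mL, MRW of 110 Da.
wavelengths = [190.0, 191.0]
delta_eps = [
    mre_to_delta_epsilon(mdeg_to_mre(t, 110.0, 0.01, 1.0)) for t in (12.0, 10.5)
]
print(format_two_column(wavelengths, delta_eps))
```

Converting before upload is optional, since the server accepts machine units directly; the sketch is mainly useful for checking that the chosen input unit matches the numbers in the file.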
FIGURE 2. Example of DichroWeb Results Pages for a “Good” Analysis. Top) Example of a results page obtained using the DichroWeb server for a “good” quality analysis. The protein name [Navpore (PCDDBid CD0006226000)] is displayed along with the analysis method used [Contin‐LL] and the reference set used [SMP180]. Below these is the NRMSD “goodness‐of‐fit parameter”; the low value in this example (0.034) indicates this is a good analysis, based on the close correspondence between the back‐calculated and measured spectra. Below that is other potentially indicative information calculated for the protein (its use is not recommended, and then only with caution). Below these are tables (in yellow overlay) which display the calculated secondary structure results obtained for (1) the closest matching solution with all proteins, and (2) the average values of all matching solutions, followed by details of all matching solutions. It is recommended that solution 1 in the top table be used, as it represents the best fit to the data. Bottom) Example of the graphical output of the DichroWeb server for a “good fit,” showing the experimental spectrum (green line with crosses), the back‐calculated closest match spectrum (blue line with stars), and the difference spectrum (red vertical bars) between the experimental and back‐calculated spectra. Text files for these plots can be obtained by clicking the icons at the top of the plot section
FIGURE 3. Example of DichroWeb Results Pages for a “Poor” Analysis. Top) Example of a results page obtained using the DichroWeb server for a “poor” quality analysis. This was obtained for an intrinsically disordered protein [MEG14 (PCDDBid CD0004055000)]. Features are as described in the legend to Figure 2, except in this case the high NRMSD value (0.219; see Table 1) and the poor correspondence between the calculated and experimental spectra suggest that the best solution is not an accurate reflection of the protein secondary structure. Bottom) Example of the graphical output of the DichroWeb server for a “poor fit,” showing the experimental spectrum (green line with crosses), the back‐calculated closest match spectrum (blue line with stars), and the difference spectrum (red vertical bars) between the experimental and back‐calculated spectra