
Distribution of evaluation scores for the models submitted to the second cryo-EM model challenge.

Andriy Kryshtafovych¹, Bohdan Monastyrskyy¹, Paul D Adams²,³, Catherine L Lawson⁴, Wah Chiu⁵

Abstract

A total of 142 protein structure models were submitted to the second cryo-EM model challenge (2015-2016). The accuracy of the models was evaluated with 54 evaluation scores. This article provides the results of a descriptive statistical analysis of these scores.


Year: 2018 PMID: 30263915 PMCID: PMC6157618 DOI: 10.1016/j.dib.2018.08.214

Source DB: PubMed Journal: Data Brief ISSN: 2352-3409

Specifications table

Value of the data

The data reveal the ranges of evaluation scores for models submitted to the second cryo-EM model challenge and provide descriptive statistics. The data show differences in the accuracy of cryo-EM models generated with different modeling techniques. The data can serve as a benchmark for future cryo-EM modeling challenges. The data can be compared with data from other structure modeling experiments (e.g., CASP [1]).

Data

A total of 142 protein structure models were submitted for eight modeling targets of the second cryo-EM model challenge (2015–2016) [2]. A computational system was developed to estimate the accuracy of the models [3]. Each model was evaluated using a suite of 15 software packages, which generated 54 accuracy scores per model. All scores are presented in tables and graphs on the dedicated web infrastructure (http://model-compare.emdatabank.org). Some scores were analyzed in the accompanying paper [3]. This article provides the results of a descriptive statistical analysis of the complete set of evaluation scores.

Materials and methods

Model types

Each model submitted to the second cryo-EM model challenge was accompanied by basic information about the modeling technique used. This information indicated whether the model was built ab initio by fitting coordinates to density maps or by optimizing already available related structures. Based on this information, models were binned into two categories: ab initio or optimized. Figs. 1-3 show the distributions of the evaluation scores separately for ab initio and optimized models, while Figs. 4 and 5 show the distributions for all models in one dataset (see the description of the data presented in the figures in the "Score distributions" section below).

Fig. 1

Distribution of the evaluation scores calculated exclusively from the model coordinates for different types of models. For each measure (specified in the x-axis title), a blue boxplot shows the score distribution for models built starting from a reference structure, and a red boxplot shows it for models built ab initio. Panel (A) shows evaluation scores for the models' subunits (monomeric evaluation mode), and panel (B) shows them for whole multimeric models.

Fig. 2

Distribution of the evaluation scores calculated by comparing models with reference structures. For each measure (specified in the x-axis title), a blue boxplot shows the score distribution for models built starting from a reference structure, and a red boxplot shows it for models built ab initio. Panel (A) shows evaluation scores for the models' subunits (monomeric evaluation mode), and panel (B) shows them for whole multimeric models.

Fig. 3

Distribution of the evaluation scores estimating the fit of models to density maps (panel A) and the similarity of models to other submitted models (panel B). For each measure (specified in the x-axis title), a blue boxplot shows the score distribution for models built starting from a reference structure, and a red boxplot shows it for models built ab initio.

Fig. 4

Distribution of the evaluation scores shown in Fig. 8 of Ref. [3] for all submitted models (optimization and ab initio models pooled together). The left set of boxplots shows scores from multimeric evaluations, and the right set shows scores from monomeric ones.

Fig. 5

Distribution of the evaluation scores not included in Fig. 4 for all submitted models.

Box plots

Data in Figs. 1-5 are presented as box plots. The box boundaries correspond to the 25th (Q1, bottom) and 75th (Q3, top) percentiles of the data; the horizontal line inside the box corresponds to the median (Q2). The height of the box is the interquartile range (IQR = Q3 - Q1). The whiskers extend to the most extreme values that lie outside the box but within 1.5 IQR of its boundaries. The dots mark outliers, i.e., values more than 1.5 IQR beyond the box boundaries.
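The box-plot quantities described above can be sketched in a few lines of Python. This is only an illustration: the names `BoxStats`, `percentile`, and `box_stats` are hypothetical, and the percentile rule (linear interpolation between closest ranks) is an assumption, since the article does not state which quantile convention the plotting software used. Other conventions can give slightly different Q1 and Q3 values on small samples.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class BoxStats:
    q1: float
    median: float
    q3: float
    iqr: float
    whisker_low: float   # lowest value still within 1.5 IQR below Q1
    whisker_high: float  # highest value still within 1.5 IQR above Q3
    outliers: List[float]


def percentile(sorted_vals: Sequence[float], p: float) -> float:
    """Percentile by linear interpolation between closest ranks (assumed rule)."""
    if not sorted_vals:
        raise ValueError("no data")
    pos = (len(sorted_vals) - 1) * p / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)


def box_stats(values: Sequence[float]) -> BoxStats:
    """Quartiles, IQR, whisker ends, and outliers for one score."""
    data = sorted(values)
    q1, med, q3 = (percentile(data, p) for p in (25, 50, 75))
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    # Values inside the fences determine the whiskers; the rest are outliers.
    inside = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return BoxStats(q1, med, q3, iqr, inside[0], inside[-1], outliers)
```

Calling `box_stats` once per score and per model category (ab initio or optimized) would give the numbers behind one blue or red box in the figures.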

Evaluation tracks and packages

Submitted models were evaluated in four evaluation tracks:
2.3.1. Directly from the model coordinates, i.e., without referring to density maps or other available structures. Software packages used: MolProbity 4.4 [4], the phenix.model_vs_map module from the PHENIX 1.11.1-2575 distribution [5], DFIRE [6], ProQ3 [7], QMEAN [8].
2.3.2. Comparing the model coordinates to those of reference structures. Software packages used: LGA [9], TM [10], [11], LDDT [12], CAD [13], QS-score [14], IFaceCheck [15], the phenix.chain_comparison module [5].
2.3.3. Checking the fit of the model coordinates to the experimental 3DEM density maps. Software packages used: TEMPy 1.0 [16], [17], EMRinger [18], the phenix.model_vs_map module [5].
2.3.4. Comparing the coordinates of each model to those of other submitted models. Software package used: Davis-QAconsensus [19].

Score distributions

Detailed explanations of the evaluation scores are provided in the accompanying paper [3].

Fig. 1 illustrates distributions of evaluation scores calculated directly from model coordinates (evaluation track 2.3.1). Panel (A) shows scores calculated on representative model subunits, and panel (B) shows scores calculated on whole multimeric structures. The figure includes box plots for the following scores.

Panel (A):
Molprb(mon) – MolProbity's combined MPscore
Molprb(mon):rot_out – MolProbity's rotamer outlier score
Molprb(mon):clash – MolProbity's clash score
Molprb(mon):ram_fv – MolProbity's Ramachandran favored score
Molprb(mon):ram_out – MolProbity's Ramachandran outlier score
Log(-DFIRE) – logarithm of the negative DFIRE energy score
ProQ3 – score from the ProQ3 program run with default parameters
QMEAN – score from the QMEAN program run with default parameters

Panel (B):
Molprb(mult):ram_fv, Molprb(mult):clash, Molprb(mult):ram_out, Molprb(mult):rot_out – MolProbity's Ramachandran favored, clash, Ramachandran outlier, and rotamer outlier scores, respectively
PHENIX:bond_rmsd, PHENIX:angle_rmsd, PHENIX:planar_rmsd, PHENIX:chiral_rmsd, PHENIX:dihedr_rmsd – the RMSD of bond, angle, planarity, chirality, and dihedral angle deviations, respectively, calculated with the phenix.model_vs_map module
PHENIX:bond_max – the maximum deviation of bond distances from ideal values (in Å)
PHENIX:angle_max – the maximum deviation of angles from ideal values (in deg.)
PHENIX:planar_max – the maximum deviation of peptide bond planarity from ideal values (in deg.)
PHENIX:chiral_max – the maximum deviation of the chirality score from ideal values
PHENIX:dihedr_max – the maximum deviation of dihedral angles from ideal values (in deg.)

Fig. 2 illustrates distributions of evaluation scores calculated by comparing models with reference structures (evaluation track 2.3.2). Panel (A) shows scores calculated on representative model subunits, and panel (B) shows scores calculated on whole multimeric structures. The figure includes box plots for the following scores.

Panel (A):
GDT_TS, GDT_HA – scores from the LGA package run with the parameters ‘-3 -sda -d:4’
LGA_S – score from the LGA package run with the parameters ‘-4 -sia -d:4’
RMSD – root mean square deviation on Cα atoms of the representative chain (as reported by the LGA package)
TMscore, TMalign – scores from the TM package run with default parameters
CAD – CAD_aa variant of the CAD score calculated on all atoms
LDDT – score from the LDDT package run with a 15 Å inclusion radius

Panel (B):
QS-global, QS-best – Quaternary Structure score calculated on all interfaces and on the best interface, respectively, with the QS-score package
QS:LDDT, QS:RMSD – LDDT and Cα RMSD scores, respectively, calculated on all chains by the QS-score package
IFaceCheck:F1_max – the maximum F1 statistic among those calculated on all interfaces with the IFaceCheck package
IFaceCheck:Jd_min – the minimum Jaccard distance among those calculated on all interfaces
IFaceCheck:prec_max – the maximum precision among those calculated on all interfaces
IFaceCheck:recall_max – the maximum recall among those calculated on all interfaces
IFaceCheck:RMSD_min – the minimum RMSD on target interface atoms among those calculated on all interfaces
IFaceCheck:F1_avg, IFaceCheck:Jd_avg, IFaceCheck:prec_avg, IFaceCheck:recall_avg, IFaceCheck:RMSD_avg – the F1, Jaccard distance, precision, recall, and interface RMSD scores, respectively, averaged over all interfaces
PHENIX:CA-score, PHENIX:seq_match – scores generated with the phenix.chain_comparison module

Fig. 3 illustrates distributions of evaluation scores estimating the fit of models to density maps (panel A, evaluation track 2.3.3) and the similarity of models to other submitted models (panel B, evaluation track 2.3.4). The figure includes box plots for the following scores.

Panel (A):
PHENIX:overall_FSC – the overall Fourier Shell Correlation in reciprocal (Fourier) space, calculated with the phenix.model_vs_map module
PHENIX:boxCC – the per-chain box cross-correlation, calculated with the phenix.model_vs_map module
TEMPy:CCC, TEMPy:LAP, TEMPy:ENV, TEMPy:MI – TEMPy's cross-correlation coefficient, Laplacian-filtered cross-correlation, envelope, and mutual information scores, respectively
EMRinger – EMRinger score calculated using the phenix.emringer module

Panel (B):
Davis-QA – a model consensus score calculated by averaging the GDT_TS scores from pairwise comparisons of the model to all others

Figs. 4 and 5 illustrate distributions of the evaluation scores presented in Figs. 1-3 when all models (optimization and ab initio) are grouped in one dataset. Score names are as described above for Figs. 1-3.
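The consensus idea behind the Davis-QA score (averaging a model's GDT_TS values against all other submitted models) can be sketched as follows. This is a simplified illustration, not the Davis-QAconsensus implementation [19]: the function name `consensus_scores` and the input layout are hypothetical, and the sketch assumes the pairwise GDT_TS values were already computed by a structure-comparison tool and are stored for every ordered pair of models.

```python
from typing import Dict, List, Tuple


def consensus_scores(
    models: List[str],
    pairwise_gdt_ts: Dict[Tuple[str, str], float],
) -> Dict[str, float]:
    """Average each model's GDT_TS over comparisons with every other model.

    pairwise_gdt_ts[(a, b)] is assumed to hold the GDT_TS of model a
    compared with model b; self-comparisons are not used.
    """
    if len(models) < 2:
        raise ValueError("need at least two models for a consensus score")
    scores: Dict[str, float] = {}
    for model in models:
        others = [pairwise_gdt_ts[(model, other)]
                  for other in models if other != model]
        scores[model] = sum(others) / len(others)
    return scores
```

A model that agrees closely with most other submissions receives a high consensus score, while a structurally isolated model receives a low one, regardless of whether a reference structure is available.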

Subject area	Structural Biology
More specific subject area	Cryo-EM Models
Type of data	Figures
How data was acquired	Computational analysis
Data format	Analyzed
Experimental factors	None
Experimental features	None
Data source location	Rutgers University
Data accessibility	http://model-compare.emdatabank.org/data/scores; http://model-compare.emdatabank.org/em_score_boxplots.cgi


1. Scoring function for automated assessment of protein structure template quality.

2. Critical assessment of methods of protein structure prediction (CASP)-Round XII.

3. CAD-score: a new contact area difference-based function for evaluation of protein structural models.

4. CASP prediction center infrastructure and evaluation measures in CASP10 and CASP ROLL.

5. Evaluation system and web infrastructure for the second cryo-EM model challenge.

6. TM-align: a protein structure alignment algorithm based on the TM-score.

7. MolProbity: all-atom structure validation for macromolecular crystallography.

8. TEMPy: a Python library for assessment of three-dimensional electron microscopy density fits.

9. lDDT: a local superposition-free score for comparing protein structures and models using distance difference tests.

10. ProQ3: Improved model quality assessments using Rosetta energy terms.
