Yingchen Shi, Ke Yin, Xuecheng Tai, Hasan DeMirci, Ahmad Hosseinizadeh, Brenda G Hogue, Haoyuan Li, Abbas Ourmazd, Peter Schwander, Ivan A Vartanyants, Chun Hong Yoon, Andrew Aquila, Haiguang Liu.
Abstract
Using X-ray free-electron lasers (XFELs), it is possible to determine three-dimensional structures of nanoscale particles using single-particle imaging methods. Classification algorithms are needed to sort out the single-particle diffraction patterns from the large amount of XFEL experimental data. However, different methods often yield inconsistent results. This study compared the performance of three classification algorithms: convolutional neural network, graph cut and diffusion map manifold embedding methods. The identified single-particle diffraction data of the PR772 virus particles were assembled in the three-dimensional Fourier space for real-space model reconstruction. The comparison showed that these three classification methods lead to different datasets and subsequently result in different electron density maps of the reconstructed models. Interestingly, the common dataset selected by these three methods improved the quality of the merged diffraction volume, as well as the resolutions of the reconstructed maps.
Keywords: X-ray free-electron lasers (XFELs); classification algorithms; electron-density map reconstruction; single-particle imaging
Year: 2019 PMID: 30867930 PMCID: PMC6400180 DOI: 10.1107/S2052252519001854
Source DB: PubMed Journal: IUCrJ ISSN: 2052-2525 Impact factor: 4.769
Figure 1. Pre-treatment of scattering patterns. (a) The original pattern, (b) after ‘bad’-pixel fixation using Friedel symmetry and (c) after photon-count conversion. The intensities are shown on a logarithmic scale to display the details. The apparent contrast difference between (b) and (c) is caused by the removal of weak signals (negative values and analogue signals smaller than one photon were set to zero).
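The two pre-treatment steps named in this caption can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the `bad_mask` and `center` arguments, the `adu_per_photon` gain value and the rounding rule are all assumptions made for the example.

```python
import numpy as np

def fill_bad_pixels_friedel(pattern, bad_mask, center):
    """Replace masked pixels with their centro-symmetric (Friedel) partners.

    pattern  : 2D array of intensities
    bad_mask : 2D bool array, True where a pixel is unusable (assumed input)
    center   : (row, col) of the beam centre, assumed to fall on a pixel
    """
    fixed = pattern.astype(float)
    rows, cols = np.nonzero(bad_mask)
    # The Friedel partner of (r, c) about the centre is (2*cr - r, 2*cc - c).
    pr = 2 * center[0] - rows
    pc = 2 * center[1] - cols
    inside = (pr >= 0) & (pr < pattern.shape[0]) & (pc >= 0) & (pc < pattern.shape[1])
    rows, cols, pr, pc = rows[inside], cols[inside], pr[inside], pc[inside]
    usable = ~bad_mask[pr, pc]  # the partner must itself be a good pixel
    fixed[rows[usable], cols[usable]] = pattern[pr[usable], pc[usable]]
    return fixed

def to_photon_counts(pattern, adu_per_photon):
    """Convert analogue detector units to integer photon counts.

    Negative values and signals below one photon are set to zero, as the
    caption describes; rounding to the nearest integer is an assumption.
    """
    photons = pattern / adu_per_photon
    photons = np.where(photons < 1.0, 0.0, photons)
    return np.rint(photons).astype(int)
```

Bad pixels whose Friedel partner is also bad, or falls outside the detector, are left unchanged in this sketch and would need separate handling.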
Figure 2. Venn diagram for the three sets of single-particle scattering patterns. There are 10 016 commonly selected patterns between the CNN and DM, 11 124 patterns between the GC and CNN, and 11 389 patterns between the GC and DM. A total of 9 404 patterns were tagged as single-particle scattering patterns by all three methods.
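The overlap counts in this caption also fix how many patterns were selected by exactly two of the three methods. The short calculation below uses only the four numbers quoted in the caption; the variable names are illustrative.

```python
# Pairwise and triple overlaps quoted in the Figure 2 caption.
pairwise = {("CNN", "DM"): 10016, ("GC", "CNN"): 11124, ("GC", "DM"): 11389}
common_all_three = 9404

# Patterns picked by exactly two methods: pairwise overlap minus the triple overlap.
exactly_two = {pair: n - common_all_three for pair, n in pairwise.items()}
for pair, n in exactly_two.items():
    print(pair, n)
```

This gives 612 patterns shared only by the CNN and DM, 1720 shared only by the GC and CNN, and 1985 shared only by the GC and DM.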
Computing-speed comparison
| Algorithm | Hardware | Time |
|---|---|---|
| CNN | K80 GPU | ∼5 min |
| GC | Xeon CPU (ten cores) | ∼15 min |
| DM | Xeon CPU (ten cores) | ∼20 min |
Figure 3. Averaged intensity radial profiles. Here $q$ is calculated by $q = 2\sin(\theta)/\lambda$, where $2\theta$ is the scattering angle and $\lambda$ is the wavelength of the X-rays. The profiles are overlaid by matching the intensities in the low-$q$ region.
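A radial intensity profile of this kind can be computed by averaging pixels over rings around the beam centre. The sketch below is illustrative only: the function names and the geometry parameters (`pixel_size`, `distance`, `wavelength`) are hypothetical, and the convention $q = 2\sin(\theta)/\lambda$ with $2\theta$ the scattering angle is an assumption.

```python
import numpy as np

def radial_profile(pattern, center):
    """Average intensity over rings one pixel wide around `center`.

    The radius of each pixel is truncated to an integer to pick its ring.
    """
    rows, cols = np.indices(pattern.shape)
    radius = np.hypot(rows - center[0], cols - center[1]).astype(int)
    total = np.bincount(radius.ravel(), weights=pattern.ravel())
    count = np.bincount(radius.ravel())
    return total / np.maximum(count, 1)  # guard against empty rings

def radius_to_q(radius_px, pixel_size, distance, wavelength):
    """Assumed convention: q = 2 sin(theta) / lambda, 2*theta = scattering angle.

    pixel_size and distance share one length unit; q is returned in
    inverse units of `wavelength`.
    """
    two_theta = np.arctan(radius_px * pixel_size / distance)
    return 2.0 * np.sin(two_theta / 2.0) / wavelength
```

Averaging `radial_profile` over all patterns of one dataset, then rescaling so that the low-q bins agree between datasets, would give overlaid curves of the kind the caption describes.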
Figure 4. Self-consistency and cross-comparison of the merged results from the three datasets. Here q is calculated in the same way as in Fig. 3. The two graphs are (a) the R factors between the independent merged results from the same dataset and (b) the R factors for the merged results from the different datasets.
Figure 5. The distributions of the largest probability for each pattern. For randomly sampled orientations, the expected probability should be approximately $2 \times 10^{-5}$. For the CNN (a), the GC (b) and the DM (c) methods, the percentages of patterns whose largest probabilities are smaller than $10^{-4}$ are 7.19, 2.62 and 6.76%, respectively. In comparison, the common dataset (d) has 9404 patterns, 28 of which have a largest probability smaller than $10^{-4}$ (∼0.3%).
Figure 6. Contour display of the retrieved electron density maps, reconstructed from the datasets selected using (a) the CNN, (b) the GC and (c) the DM methods, and (d) from the common dataset.
The consistency levels between the reconstructed maps from four datasets
| Model A | Model B | q (nm⁻¹) | Real-space resolution (nm) |
|---|---|---|---|
| CNN | DM | 0.087 | 11.5 |
| CNN | GC | 0.087 | 11.5 |
| DM | GC | 0.102 | 9.8 |
| CNN | Common | 0.097 | 10.3 |
| DM | Common | 0.097 | 10.3 |
| GC | Common | 0.100 | 10.0 |
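
The two numeric columns of this table are reciprocals of each other if the third column is read as a spatial frequency q in nm⁻¹ (an interpretation assumed here, not stated in the table) and the resolution is taken as d = 1/q. The check below reproduces the reported resolutions from the tabulated values.

```python
# (model A, model B, q in nm^-1, reported resolution in nm), copied from the table.
rows = [
    ("CNN", "DM", 0.087, 11.5),
    ("CNN", "GC", 0.087, 11.5),
    ("DM", "GC", 0.102, 9.8),
    ("CNN", "Common", 0.097, 10.3),
    ("DM", "Common", 0.097, 10.3),
    ("GC", "Common", 0.100, 10.0),
]

for model_a, model_b, q, reported in rows:
    d = 1.0 / q  # resolution in nm under the assumed d = 1/q convention
    print(f"{model_a} vs {model_b}: 1/q = {d:.1f} nm (reported {reported} nm)")
```

Every row agrees with the reported resolution to one decimal place.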
Figure 7. The shape analysis of the reconstructed maps. (a) An illustration of the cross-section slicing procedure: a Fibonacci sampling algorithm was used to select the direction of the planes that pass through the model centre. (b) Ellipses with three eccentricity values to guide the understanding of the deviation from a perfect circle. (c) The distribution of eccentricity values of map cross-sections for four reconstructed maps.
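The two ingredients of this shape analysis, near-uniform plane directions and an eccentricity measure, can be sketched as follows. This is one common form of the Fibonacci sphere lattice and may differ from the variant the authors used; the function names are hypothetical, and the step that fits an ellipse to each cross-section is not shown because the caption does not describe it.

```python
import math

def fibonacci_directions(n):
    """Return n roughly evenly spread unit vectors on the sphere."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    directions = []
    for i in range(n):
        z = 1.0 - (2.0 * i + 1.0) / n   # evenly spaced in z, strictly inside (-1, 1)
        r = math.sqrt(1.0 - z * z)      # radius of the circle at height z
        phi = golden_angle * i          # rotate by the golden angle each step
        directions.append((r * math.cos(phi), r * math.sin(phi), z))
    return directions

def eccentricity(semi_major, semi_minor):
    """Eccentricity of an ellipse: 0 for a circle, approaching 1 when flattened."""
    return math.sqrt(1.0 - (semi_minor / semi_major) ** 2)
```

Each returned vector can serve as the normal of a plane through the model centre. For example, an ellipse with semi-axes 5 and 3 has an eccentricity of 0.8, so even a visibly elongated cross-section stays well below 1.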