Alexander D Cook1, Szymon W Manka1, Su Wang1, Carolyn A Moores1, Joseph Atherton2.
Abstract
Microtubules are polar filaments built from αβ-tubulin heterodimers that exhibit a range of architectures in vitro and in vivo. Tubulin heterodimers are arranged helically in the microtubule wall but many physiologically relevant architectures exhibit a break in helical symmetry known as the seam. Noisy 2D cryo-electron microscopy projection images of pseudo-helical microtubules therefore depict distinct but highly similar views owing to the high structural similarity of α- and β-tubulin. The determination of the αβ-tubulin register and seam location during image processing is essential for alignment accuracy that enables determination of biologically relevant structures. Here we present a pipeline designed for image processing and high-resolution reconstruction of cryo-electron microscopy microtubule datasets, based in the popular and user-friendly RELION image-processing package, Microtubule RELION-based Pipeline (MiRP). The pipeline uses a combination of supervised classification and prior knowledge about geometric lattice constraints in microtubules to accurately determine microtubule architecture and seam location. The presented method is fast and semi-automated, producing near-atomic resolution reconstructions with test datasets that contain a range of microtubule architectures and binding proteins.
Keywords: 3D reconstruction; Cryo-EM; Microtubule; Pseudo-helical symmetry; RELION
Year: 2019 PMID: 31610239 PMCID: PMC6961209 DOI: 10.1016/j.jsb.2019.10.004
Source DB: PubMed Journal: J Struct Biol ISSN: 1047-8477 Impact factor: 2.867
Dataset and data collection details. *K2 summit direct electron detector from Gatan Inc. CA, USA. †DE20 direct electron detector from Direct Electron, San Diego, CA. ‡With quantum post-column energy-filter (Gatan Inc. CA, USA), operated in zero-loss imaging mode with a 20-eV energy-selecting slit. §C-Flat 2/2-4C from Protochips Inc. ‖Lacey carbon grids from Agar Scientific.
| Decorator | Tubulin Ligands | Grid type | EM and energy filtration | Detector and mode | Pixel size (Å/pixel) | Defocus range | Dose (e-/Å2, weighted) | Total exposure time and frame number | EMPIAR and EMDB accession codes |
|---|---|---|---|---|---|---|---|---|---|
| CKK domain of CAMSAP1 | α-tubulin: GTP | 2 μm Holey Carbon§ | Polara (300 kV‡) | K2 summit* Counting at 5e-/pixel/sec | 1.39 | 0.5–3.5 μm | 42 dose weighted | 16 secs 64 frames | EMPIAR-465 |
| Motor domain of MKLP2 | α-tubulin: GTP | 2 μm Holey Carbon§ | Polara (300 kV) | DE20† (Direct Electron) | 1.53 | 0.5–3.0 μm | 50 dose weighted | 1.5 secs | EMPIAR-467 |
| N-DC domain of DCX | α-tubulin: GTP | Lacey Carbon‖ | Polara (300 kV‡) | K2 summit* | 1.39 | 0.5–2.5 μm | 24 unweighted | 9 secs | EMPIAR-10300 |
Details of the MiRP procedure. *If performing refinement with helical symmetry these options are set to ‘yes’. The appropriate helical parameters need to be optimised for each dataset. Example starting parameters that have worked well in our hands are listed below the table.
| Pipeline stage | RELION operation | Custom MT operation | Rough computing time | Details/parameters |
|---|---|---|---|---|
| 1. Pre-processing | a) Manual picking | | ~1 day/500 micrographs (CPU) | Input micrographs: Selected motion corrected and dose-weighted micrographs after CTF determination |
| | b) Particle extraction | | ~10 min for 100,000 particles (CPU, 40 MPI processors) | Input coordinates: Manual-picked coordinates |
| | | c) Segment average generation | ~2 h for 100,000 particles (CPU) | Script in C shell |
| 2. PF number sorting | a) 3D classification | | ~20 h for 100,000 particles (either CPU, 60 MPI processors or GPU, 5 MPI processors with 3 threads) | Input images STAR file: 4 × binned segment averages star file |
| | | b) Class Unification | <1 min (CPU) | Script in Perl and commands in R |
| | | c) Class extraction, zero shifts and ROT and reset PSI and TILT angles to priors | <1 min (CPU) | Script in C shell |
| 3. Global search | a) 3D auto-refine (1st) | | ~20 min for 30,000 particles (GPU, 5 MPI processors, 3 threads) | Input images STAR file: A final .star file generated above of a single PF number class |
| | | b) Zero shifts and ROT | <1 min (CPU) | Script in C shell |
| 4. Unique Phi angle Assignment | a) 3D auto-refine (2nd) | | ~10 min for 30,000 particles (GPU, 5 MPI processors, 3 threads) | Input images STAR file: The .star file generated in the previous step. |
| | | b) Phi Unification | <1 min (CPU) | Script in Python |
| | | c) Zero shifts and reset PSI and TILT angles to priors | <1 min (CPU) | Script in C shell |
| | d) 3D auto-refine (3rd) | | ~10 min for 30,000 particles (GPU, 5 MPI processors, 3 threads) | Input images STAR file: The .star file generated in the previous step, but with links to ‘raw’ particles (not segment averages). |
| 5. X/Y Shift Smoothing | | a) X/Y Shift Smoothing | <1 min (CPU) | Script in Python |
| | b) 3D auto-refine (4th) | | ~5 min for 30,000 particles (GPU, 5 MPI processors, 3 threads) | Input images STAR file: The .star file generated in the previous step (‘raw’ particles). |
| 6. Refined segment averages | a) Particle extraction | | ~4 min for 30,000 particles (CPU, 40 MPI processors) | Input Coordinates: |
| | | b) Refined segment average generation | <1 min (CPU) | Script in C shell |
| 7. Seam Finding | a) 3D classification | | ~10 min for 30,000 particles (CPU, 1 MPI processor, 9 threads) | Input images STAR file: 4 × binned refined segment averages star file |
| | | b) Class Unification | <1 min (CPU) | Script in Perl and commands in R |
| | | c) Class extraction | <1 min (CPU) | Script in C shell |
| | | d) Phi/XY Correction | <1 min (CPU) | Script in C shell |
| 8. High-resolution reconstruction | a) Particle extraction | | ~10 min for 30,000 particles (CPU, 40 MPI processors) | Input Coordinates: |
| | b) 3D auto-refine (5th) | | ~6 h for 30,000 particles (either CPU, 60 MPI processors or GPU, 5 MPI processors with 3 threads) | Input images STAR file: The .star file generated in the previous step. |
13 PF: Number of asymmetrical units: 13 or 12 (if 13 or 12 binding proteins in a helical turn, respectively). Initial twist (deg), rise (Å): −27.67, 9.46. Central Z length (%): 30. Twist search, Min, Max, Step (deg): −27, −28, 0.1. Rise search, Min, Max, Step (Å): 9.4, 9.7, 0.1.
14 PF: Number of asymmetrical units: 14 or 13 (if 14 or 13 binding proteins in a helical turn, respectively). Initial twist (deg), rise (Å): −25.71, 8.81. Central Z length (%): 30. Twist search, Min, Max, Step (deg): −25.2, −26.2, 0.1. Rise search, Min, Max, Step (Å): 8.6, 9, 0.1.
If this option does not stop the refinement at iteration 1, manually terminate the refinement after the first iteration and use the iteration 1 output star files.
Fig. 2. Pre-processing – manual picking and particle extraction strategy. a) “Helical” manual picking strategy is shown in an example micrograph from the CKK-MT dataset displayed using the RELION manual-picking GUI window. Start-end coordinates are selected (green circles) delineating desired MT lengths (connecting green lines) for extraction. Distorted or curved MT lengths (shown with blue and red arrows respectively) as well as contaminated areas (yellow arrows) are excluded. b) Particle extraction strategy illustrated on an MT diagram. Box size is set at roughly 600 Å with a box separation of 82 Å representing the dimer repeat distance along the helical axis. Seven example boxes are shown, where the central box (bold green) serves as the central particle for segment average generation. c) Segment average generation strategy. Three adjacent particles either side of a central particle along the helical axis are averaged with the central particle to create a new central particle with a higher signal-to-noise ratio.
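The segment averaging described in panel c can be illustrated with a minimal sketch. It is not the pipeline's C shell script: the flattened-list image representation, the function name `segment_averages` and the truncated windows at MT ends are all assumptions made for illustration.

```python
from typing import List, Sequence

# Hypothetical representation: each particle is a flattened image (list of pixel values),
# and `particles` holds the boxes of ONE microtubule in order along the helical axis.
Image = List[float]


def segment_averages(particles: Sequence[Image], half_window: int = 3) -> List[Image]:
    """Average each particle with up to `half_window` neighbours on either side.

    With half_window=3 the window holds 7 boxes, as in the caption.
    Assumption: windows are simply truncated at the ends of the microtubule.
    """
    n = len(particles)
    averaged: List[Image] = []
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        window = particles[lo:hi]
        # Pixel-wise mean over the boxes in the window.
        averaged.append([sum(pixels) / len(window) for pixels in zip(*window)])
    return averaged
```

Because the boxes are spaced by one dimer repeat (82 Å), neighbouring boxes show nearly the same view, which is what makes a plain pixel-wise mean a reasonable first approximation here.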
Helical parameters used for protofilament number references.
| Protofilament Number | Rise (Å) | Twist (°) |
|---|---|---|
| 11-3 | 11.1 | −32.5 |
| 12-3 | 10.2 | −29.9 |
| 13-3 | 9.4 | −27.7 |
| 14-3 | 8.7 | −25.8 |
| 15-4 | 10.8 | −23.8 |
| 16-4 | 10.2 | −22.4 |
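As a rough cross-check of the table, the rise and twist per subunit can be estimated from idealised lattice geometry. This is a sketch under stated assumptions, not how the authors derived their values: it takes the monomer repeat as 41 Å (half the ~82 Å dimer repeat), ignores protofilament skew, and uses a hypothetical helper name `ideal_helical_params`.

```python
MONOMER_REPEAT_A = 41.0  # Å; assumed monomer repeat (half the ~82 Å dimer repeat)


def ideal_helical_params(pf_number: int, start_number: int):
    """Idealised (rise in Å, twist in degrees) per subunit for an unskewed lattice.

    One full turn spans `pf_number` subunits and climbs `start_number` monomers,
    so rise = start * repeat / N and twist = -360 / N (negative sign as in the table).
    """
    rise = start_number * MONOMER_REPEAT_A / pf_number
    twist = -360.0 / pf_number
    return rise, twist


for pf, start in [(11, 3), (12, 3), (13, 3), (14, 3), (15, 4), (16, 4)]:
    rise, twist = ideal_helical_params(pf, start)
    print(f"{pf}-{start}: rise ~ {rise:.2f} Å, twist ~ {twist:.2f}°")
```

For the 13-3 architecture this gives roughly 9.46 Å and −27.69°, close to the tabulated 9.4 Å and −27.7°. The other architectures deviate slightly from the table, as expected when skew is neglected.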
Fig. 3. Sorting MTs in 3D by PF architecture. a) Examples of four MTs (MT1-4) from the CKK dataset, showing the PF number class assignment as a function of the particle number within each MT. * indicates the modal class for the microtubule. b) Histogram showing the overall confidence of the MT architecture assignment step (performed with segment averages, or standard particles), plotting the % of MTs within certain confidence values (for the CKK dataset). c) The distributions for 11-3, 12-3, 13-3, 14-3, 15-4, 16-4 PF MTs (% of particles with a certain PF architecture) calculated after MT PF number assignment. d) Central z-axis slices of the different PF number references used for PF number classification, and of the resulting reconstructions for different datasets.
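The idea of assigning each MT its modal class (panel a) can be sketched as follows. This is an illustration only, not the pipeline's Perl/R implementation: the input layout, the function name `unify_classes`, and the definition of confidence as the fraction of particles in the modal class are assumptions.

```python
from collections import Counter
from typing import Dict, List, Tuple


def unify_classes(class_per_particle: Dict[str, List[int]]) -> Dict[str, Tuple[int, float]]:
    """Map each MT id to (modal class, fraction of its particles in that class).

    `class_per_particle` is a hypothetical structure: MT id -> per-particle class
    labels in order along the MT. Ties go to the class encountered first.
    """
    result: Dict[str, Tuple[int, float]] = {}
    for mt_id, classes in class_per_particle.items():
        if not classes:
            continue  # nothing to unify for an empty MT
        modal_class, count = Counter(classes).most_common(1)[0]
        result[mt_id] = (modal_class, count / len(classes))
    return result


example = {"MT1": [13, 13, 14, 13], "MT2": [14, 14]}
print(unify_classes(example))
```

Every particle in an MT would then be relabelled with that MT's modal class, on the premise that PF number does not change along a picked MT length.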
Fig. 4. Initial Seam Assignment and X/Y Shift Smoothing. a) MT-Rot angle assignment for a single MT from the CKK dataset, after the ‘Global Search’ step, with the MT Rot angle plotted as a function of the particle number within that MT. The rainbow coloured lines show the clusters calculated during the MT-Rot angle assignment step, with each cluster representing a different PF register being aligned between individual MT particles and the reference, as can be seen by the regular spacing between the clusters. An MT top view representation is annotated by the percentage of particles for this MT aligned with different PFs in the 3D reference. b) Rot angles from the MT example in a), after MT Rot angle assignment. c) The X/Y shifts from the example MT in a-b plotted as a function of particle number. The micrograph of the MT is shown, with the X/Y alignment for each particle in the MT represented by green circles. Particles which have shifted in register are shown by black arrows. d) The X and Y shifts for the example MT in a-c, after X/Y shift smoothing.
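One way to picture the Rot angle unification in panels a and b is to find the most populated angular cluster for an MT and assign its centre to every particle. The sketch below is an assumption-laden illustration, not the pipeline's Python script: the neighbour-counting clustering, the 8° tolerance and the name `unify_rot` are all hypothetical.

```python
from typing import List, Tuple


def angular_diff(a: float, b: float) -> float:
    """Smallest signed difference a - b in degrees, wrapped to [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0


def unify_rot(angles: List[float], tolerance: float = 8.0) -> Tuple[float, float]:
    """Return (unified Rot angle, fraction of particles in the winning cluster).

    Assumption: clusters are separated by roughly one helical twist (~27.7° for
    13 PFs), so a tolerance well below half that spacing keeps them apart.
    """
    # Seed the cluster at the angle with the most neighbours within the tolerance.
    seed = max(angles, key=lambda c: sum(abs(angular_diff(a, c)) <= tolerance for a in angles))
    members = [a for a in angles if abs(angular_diff(a, seed)) <= tolerance]
    # Average offsets relative to the seed so wrap-around at ±180° is handled.
    centre = seed + sum(angular_diff(a, seed) for a in members) / len(members)
    return (centre + 180.0) % 360.0 - 180.0, len(members) / len(angles)


rot, fraction = unify_rot([10.0, 11.0, 9.0, 37.7, 10.0, -17.7])
print(rot, fraction)
```

In this toy input, four of six particles agree on a register near 10°, while the outliers sit about one twist away on either side, mimicking the regularly spaced clusters in panel a.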
Fig. 5. Seam check via supervised 3D classification. a) Supervised 3D classification strategy. Example simulated 13 PF CKK density references are shown with rotations around and translations along the helical axis of −2, −1, 0, +1 and +2 times the helical twist and rise. The resulting class reconstructions from a single classification iteration to these references are shown. b) Class occupancy for all 26 supervised classes used for the 13 PF CKK-MT dataset. i) Shows classes representing rotations around and translations along the helical axis (angle) of −6 to +6, whereas ii) shows classes with the same rotations and translations plus an additional translation of the monomer repeat distance (41 Å). For clarity, class occupancies as a % are indicated above bars representing classes. c) 4 × binned reconstructions of CKK-MT data after application of MiRP and its seam finding procedure or instead using standard helical processing in RELION without intervention. When using MiRP, CKK density is clearly absent from the seam and from 41 Å translated locations along the helical axis, while aberrant density is found at the seam and 41 Å translated locations when using standard helical processing, indicating poor MT Rot angle and αβ-tubulin register determination.
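The set of 26 reference transformations in panel b can be enumerated with a short sketch. It only lists the (rotation, translation) pairs and does not generate references. The function name `seam_check_transforms` and the handling of even PF numbers are assumptions; the twist and rise in the example are the 13 PF starting values quoted earlier.

```python
from typing import List, Tuple


def seam_check_transforms(
    pf_number: int,
    twist_deg: float,
    rise_a: float,
    monomer_repeat_a: float = 41.0,
) -> List[Tuple[float, float]]:
    """List (rotation in degrees, z-shift in Å) for each supervised class.

    For 13 PFs, k runs from -6 to +6; the set is then repeated with an extra
    translation of one monomer repeat, giving 2 * pf_number classes.
    Assumption: for even PF numbers k runs from -N/2 to N/2 - 1.
    """
    start = -(pf_number // 2)
    ks = range(start, start + pf_number)
    return [
        (k * twist_deg, k * rise_a + extra)
        for extra in (0.0, monomer_repeat_a)
        for k in ks
    ]


transforms = seam_check_transforms(13, twist_deg=-27.67, rise_a=9.46)
print(len(transforms))  # 26 classes for a 13 PF microtubule
```

Each pair corresponds to one candidate seam position and αβ-tubulin register; the class a particle falls into then indicates how its Rot angle and shifts should be corrected.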
Fig. 6. Data optimisation and high-resolution reconstruction. a) Plot showing particle angles for four 13 PF MTs from the MKLP2-MT dataset as a function of particle number (neighbouring particles are separated by ~82 Å along the helical axis). Three MTs have aligned as expected (‘good’) while one shows jumps in angle along the helical axis and can be excluded (‘bad’). b) Result of the 3D classification without alignment to a preliminary unbinned reconstruction reference of 13 PF MKLP2-MTs, displayed as 2D slices in the RELION display GUI. 85% of particles went into a ‘good’ class with expected structural features and a defined seam (green box) and were taken for further processing. c) Gold-standard corrected FSC curves for the 13 PF CKK-MT dataset from RELION post-processing, using the central masked 15% of the reconstructions along the helical axis (~90 Å, a little over the dimer repeat distance). Final symmetrised reconstructions are compared after standard helical processing (without Bayesian polishing or CTF refinement) or after use of MiRP without or with Bayesian polishing or Bayesian polishing and CTF refinement. d) Unique features of α- and β-tubulin are poorly resolved after standard helical processing in RELION. i) The lumenal face of the tubulin dimer of the asymmetric unit opposite the seam for symmetrised 13 PF CKK-MT reconstructions, showing poorly defined density for the H1-S2 and S9-S10 loops, which are distinct in α- and β-tubulin. ii) Density for non-conserved α- and β-tubulin sidechains such as β-tubulin’s R158 (S149 in α-tubulin) and R48 (S39 in α-tubulin) is poorly defined. e) Unique features of α- and β-tubulin are well resolved after application of MiRP. i) The lumenal face of the tubulin dimer of the asymmetric unit opposite the seam for symmetrised 13 PF CKK-MT reconstructions exhibits well defined density for the H1-S2 and S9-S10 loops, which are distinct in α- and β-tubulin. ii) Density for non-conserved α- and β-tubulin sidechains such as β-tubulin’s R158 (S149 in α-tubulin) and R48 (S39 in α-tubulin) is well defined.
Fig. 7. Final reconstruction results for test datasets. a) C1 reconstruction of the 13 PF CKK-MT dataset (unfiltered), showing a well-defined seam indicative of accurate MT Rot angle and αβ-tubulin register assignment. b) C1 reconstruction of the 13 PF MKLP2-MT dataset (unfiltered), showing a well-defined seam indicative of accurate MT Rot angle and αβ-tubulin register assignment. c) C1 reconstruction of the 13 PF NDC-MT dataset (unfiltered), showing a well-defined seam indicative of accurate MT Rot angle and αβ-tubulin register assignment. d) Density and fitted model for the ‘good’ asymmetric unit opposite the seam in the symmetrised reconstruction of the 13 PF CKK-MT dataset, showing density quality consistent with the reported resolution. e) Density and fitted model for the ‘good’ asymmetric unit opposite the seam in the symmetrised reconstruction of the 13 PF MKLP2-MT dataset, showing density quality consistent with the reported resolution. f) Density and fitted model for the ‘good’ asymmetric unit opposite the seam in the symmetrised reconstruction of the 13 PF NDC-MT dataset, showing density quality consistent with the reported resolution. g) As in panel d, but from a different viewpoint, showing local resolution determined by RELION’s local resolution software. The CKK decorating protein is within the dashed black line. h) As in panel e, but from a different viewpoint, showing local resolution determined by RELION’s local resolution software. The MKLP2 decorating protein is within the dashed black line. i) As in panel f, but from a different viewpoint, showing local resolution determined by RELION’s local resolution software. The NDC decorating protein is within the dashed black line.
Fig. 1. The MT Image processing RELION-based Pipeline (MiRP). Each step is marked in blue, in the same box as short summaries of the RELION operations (yellow), and custom MT operations (orange) involved in that step, described in more detail in the text.
Dataset size and resolutions. Gold-standard corrected FSC resolution at the 0.143 threshold for symmetrised reconstructions calculated with RELION post-processing, using the central masked 15% of the reconstructions along the helical axis (~90 Å, a little over the dimer repeat distance).
| Decorator | Number of selected micrographs | Starting particle number | Final 13 PF particle number | Final 13 PF resolution (gold-standard FSC 0.143) |
|---|---|---|---|---|
| CKK domain of CAMSAP1 | 1075 | 82,666 | 26,854 | 3.68 Å |
| Motor domain of MKLP2 | 293 | 25,568 | 14,411 | 4.14 Å |
| N-DC domain of DCX | 847 | 50,255 | 28,347 | 3.63 Å |
| EB3 | 383 | 34,754 | 34,047 | 3.3 Å |