
Multiscale modelling of the extracellular matrix.

Hua Wong1, Jean-Marc Crowet1, Manuel Dauchez1, Sylvie Ricard-Blum2, Stéphanie Baud1,3, Nicolas Belloy1,3.   

Abstract

The extracellular matrix is a complex three-dimensional network of molecules that provides cells with a complex microenvironment. The major constituents of the extracellular matrix, such as collagen, elastin and associated proteins, form supramolecular assemblies contributing to its physicochemical properties and organization. The structure of proteins and their supramolecular assemblies such as fibrils have been studied at the atomic level (e.g., by X-ray crystallography, Nuclear Magnetic Resonance and cryo-Electron Microscopy) or at the microscopic scale. However, many protein complexes are too large to be studied at the atomic level and too small to be studied by microscopy. Most extracellular matrix components fall into this intermediate scale, the so-called mesoscopic scale, which prevents their detailed characterization. Simulation and modelling are among the few powerful and promising approaches that can deepen our understanding of mesoscale systems. We have developed a set of modelling tools to study the self-organization of the extracellular matrix and large motions of macromolecules at the mesoscale level by taking advantage of the dynamics of articulated rigid bodies as a means to study a larger range of motions at the cost of atomic resolution.
© 2021 The Author(s).


Keywords:  Basement membrane; CG, coarse-grained; Cryo-EM, cryogenic electron microscopy; DOF, degrees of freedom; ECM, extracellular matrix; EGF, epidermal growth factor; Extracellular matrix; FEM, finite element method; MD, molecular dynamics; Mesoscopic scale; Modelling; NC, non-collagenous; NMR, nuclear magnetic resonance; Rigid bodies; SAXS, small-angle X-ray scattering; Simulation

Year:  2021        PMID: 35072037      PMCID: PMC8763633          DOI: 10.1016/j.mbplus.2021.100096

Source DB:  PubMed          Journal:  Matrix Biol Plus        ISSN: 2590-0285


Introduction

The extracellular matrix (ECM) is a three-dimensional network of proteins, proteoglycans and glycosaminoglycans, which are found in different isoforms in tissues. Collagens, laminins, fibronectin, thrombospondins, and proteoglycans belong to the core matrisome, whereas ECM regulators, ECM-affiliated proteins, and secreted factors are referred to as matrisome-associated proteins [1]. ECM components form insoluble supramolecular assemblies, which provide tissues with mechanical properties, namely tensile strength for collagen fibrils, and resistance to compression for proteoglycans. The organization of the ECM is tissue-specific, and varies depending on the developmental stage and the physio-pathological context. Basement membranes are thin sheets of ECM, which underlie epithelial and endothelial cells, compartmentalize tissues, and surround several cell types [2]. ECM molecular entities are of various sizes, ranging from soluble, globular proteins of 10 nm in diameter or less, to macromolecules such as collagens, which are several hundred nanometers in length, and to fibrils and fibers up to several µm long [3], [4]. Collagen IV molecules form tetramers and dimers via their N- and C-termini respectively, which then self-assemble into a network interacting with the other major components of basement membranes, namely perlecan, laminins and nidogens. It remains difficult to analyze full-length, multi-domain ECM macromolecules by X-ray crystallography or NMR, and most structural data available at atomic resolution are those of individual domains, which do not reflect the size and shape of the full-length proteins deposited in the ECM. On the other hand, supramolecular assemblies such as fibrils can be studied at the microscopic scale.
However, several large ECM proteins (e.g., collagens, laminins, and fibronectin) and protein complexes fall into an intermediate scale, the so-called biological mesoscopic scale as defined in [5], which makes them difficult to observe directly at atomic resolution. Indeed, direct experimental observations of ECM components in the range of 10⁻⁷ m to 10⁻⁸ m are often at the lower limit of microscopy and the upper limit of X-ray crystallography. While cryo-EM continues to increase in accuracy and resolution [6], proteins with a molecular weight lower than 50 kDa and biomolecules or complexes longer than 500 nm (i.e., the electron beam penetration limit) are still difficult to image [7]. All-atom molecular dynamics (MD) is an established method for studying the dynamics and modifications of ECM-derived peptides. It does so by modelling each atom of a molecule as a discrete point in space called a particle, with a given set of parameters that are used to compute the next state of the system at each timestep. It is a proven method to study the ectodomains of membrane proteins [8] and/or protein domains [9], [10], [11], [12], [13]. It performs calculations on all the atoms of the model and is especially useful for molecules up to 10 nm in length. In classical MD, as the size of the system increases, so does the number of particles whose features (coordinates, velocities, forces, energies) must be updated at each step of computation, typically every 1–2 fs [14]. For example, the C-terminal non-collagenous (NC) domain 1 of native trimeric collagen IV, a major component of basement membranes, is made up of ∼5000 atoms. An all-atom model of collagen IV includes roughly 28,000 atoms without considering any solvent molecules or ions. In addition to collagen IV, a model of a basement membrane should comprise several collagen types (e.g., collagens IV and XVIII) as well as laminins and the proteoglycan perlecan [15].
It should also include links to ECM cell surface receptors such as integrins. The simulation of an atomistic model of a large biomolecule such as collagen IV and of a basement membrane made of a combination of the above macromolecules would necessitate huge computational resources to perform calculations and store simulation data, even in the era of exascale computing (10¹⁸ calculations per second) and big data. The current record for an atomistic model (i.e., the model of the HIV-1 capsid) is the simulation of over 64 million atoms for over 1 µs, which required some of the largest existing supercomputers [16]. For these systems and for large ECM supramolecular assemblies (e.g., in the µm range and beyond, such as collagen fibers), coarse-grained (CG) models, which require fewer computational resources, and discrete or finite element methods (FEM) are used [17], [18]. In contrast to fine-grained models (providing atomistic resolution), coarse-grained MD (CGMD) aims at representing complex systems by a reduced number of subcomponents. CGMD clusters atoms into larger beads which simulate the properties of the individual atoms they are made of. Using the CGMD approach, the 5,000 atoms of the NC1 domain of collagen IV could be represented by ten times fewer particles (beads), thus reducing the degrees of freedom (DOF) of the system and allowing longer integration steps of around 10–20 fs. This approach has been used successfully to study collagen molecules [19], [20] and tropoelastin self-assembly into nascent fibrils [21]. At the nanoscale (≤10⁻⁹ m), water can be discretized as individual molecules. As we get closer to the microscale, there are so many water molecules that they can be approximated as a continuous medium (or continuum) without discrete separation. This concept of continuum is referred to as implicit solvation and is also called continuum solvation in MD.
While in molecular dynamics atom positions and velocities are updated at each time point according to Newton’s equations of motion, in rigid body dynamics protein domains are considered as rigid domains whose volumes never change or deform during the simulation [22]. In contrast with all-atom MD or CG simulations, FEM tools such as OpenFOAM [23], [24] or ANSYS require only the computing power of a desktop workstation, but they only work at a scale where matter can be considered a continuum and cannot predict the behavior of discrete atoms [25] or sub-domains. Furthermore, FEM is well suited for large systems ranging from the macroscale to the microscale (from 1 m to 10⁻⁶ m). We have developed a set of tools within the DURABIN project (Developing Utilities for Nanometric Interactions in Biochemistry with Augmented Reality) [26] to help users without expertise in modelling to build models of large macromolecules or multimolecular complexes at the mesoscopic scale, in order to bridge the gap between the microscopic and the macroscopic scales and to study their “collective” behavior in different biological conditions. Here we report the use of these recently developed tools, which can export atomistic models from rigid body ones (hinting at true multiscale capability), to build models of individual full-length ECM proteins (i.e., collagen IV, laminin-111, and nidogen-1) and of a proteoglycan (perlecan) found in basement membranes, and to generate a molecular network to build a three-dimensional model of a basement membrane-like ECM. Basement membranes play a crucial role in delimiting tissue boundaries [27], as a filter in kidneys, and in tumor metastasis [28].
Our model will thus be of major interest to investigate the effects on the organization of basement membrane of various biological contexts and diseases, which could be mimicked by adding constraints to the model (e.g., changes in the number of a particular molecule in a model to reflect up- or down-regulation of this molecule in diseases).

Material and methods

Unity 3D, a multi-platform game engine

Game engines are suites of software and tools that let users import assets (3D objects, 2D textures, game controller hardware interfaces…) in a unified environment to build simulations for entertainment purposes (video games) or more serious applications (industrial training, scientific visualization…) [29]. Unity3D is a multi-platform game engine that can be used regardless of the operating system and has already been used for scientific visualization [30]. It provides scripting tools that are used both for simulation and for programming interactions between biomolecules, in the same way video game characters interact with their environment or with other characters. These scripts are embedded as components of each object in Unity and program how these objects move or react in specific situations.

Rigid body dynamics

While in molecular dynamics atom positions and velocities are updated at each time point according to Newton’s equations of motion, in rigid body dynamics protein domains are considered as rigid domains whose volumes never change or deform [22]. While particles have three translational degrees of freedom, meaning that their motion can be described in terms of x, y, z translation, rigid bodies add three rotational degrees of freedom around the x, y and z axes. To transform our all-atom structure gathered from the Protein Data Bank [31], we simply apply the x, y, z translation and rotation of the rigid body to the whole PDB file, paving the way for getting from a rigid body representation back to an atomistic model as described in the “Results” section. Simulations are run using physics engines which, like other simulation tools (GROMACS, NAMD), are software packages that provide means to simulate the behavior of physical systems within a given set of parameters/rules (force field). In most physics engines, rigid body dynamics is implemented using two components. One holds the physics information used by the physics engine to update the simulation (mass, position, orientation, velocities, forces), whereas the second, called a collider, holds the actual spatial volume. Colliders are used to define the space occupied by a biomolecule as well as to handle collisions between objects or with the environment. Colliders are usually simple geometric shapes (spheres, cylinders, cubes, planes), also called primitives, as this speeds up collision physics. Complex mesh colliders exist but have significant computing costs. While rod-like helical or fibrous molecules can be modelled as capsules or cylinders, more complex forms require a compound collider made of a mix of primitive colliders.
It should be noted that several figures of this article show detailed surface representations of the biomolecules, but this is, for all intents and purposes, only a visualization proxy for the physics entities (primitive colliders) used in the simulation. Chains of rigid bodies are created by articulating each rigid body with positional constraints, preventing them from drifting away from one another. At each step of the simulation, rigid body positions are taken into account and constraints are applied to keep objects linked together [32]. Positional constraints are also used to simulate interactions between molecules. When two rigid bodies collide, the whole process is managed by a transient positional constraint, which can be made permanent or set to last a given amount of time that can be modulated to mimic, e.g., transient interactions or domain affinity. Random forces are applied to the rigid bodies during the simulation. For this, a Langevin equation with a viscosity term [33] is used. This mimics a fluid at thermal equilibrium, and the viscosity term allows the solvent to be simulated without explicitly modelling the solvent molecules individually (implicit solvation or continuum solvation), thus reducing the number of objects/particles for which interactions must be calculated. Other programs such as CellPACK [5] have made use of simplified representations like rigid bodies to create macromolecular ensembles with a high packing ratio, but the current version of CellPACK does not seem to produce large-scale motion simulations. The data presented in this study aim to demonstrate the potential of the DURABIN toolkit to help decipher ECM assembly and dynamics.
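The Langevin update mentioned above can be illustrated with a minimal sketch. It is not the DURABIN or Unity3D implementation: the Euler-Maruyama scheme, the Stokes drag for a sphere, and all function and parameter names are assumptions chosen for illustration, and only the translational motion of a single body is shown.

```python
import math
import random

K_B = 1.380649e-23  # Boltzmann constant in J/K


def stokes_drag(viscosity, radius):
    """Friction coefficient of a sphere in a viscous fluid (Stokes' law, an assumption here)."""
    return 6.0 * math.pi * viscosity * radius


def langevin_step(pos, vel, force, mass, gamma, temperature, dt, rng):
    """Advance one body by dt using an explicit Euler-Maruyama Langevin update.

    m dv = (F - gamma * v) dt + sqrt(2 * gamma * kB * T * dt) * N(0, 1)
    """
    sigma = math.sqrt(2.0 * gamma * K_B * temperature * dt)  # size of the thermal kick
    new_pos, new_vel = [], []
    for x, v, f in zip(pos, vel, force):
        kick = sigma * rng.gauss(0.0, 1.0)
        v_next = v + ((f - gamma * v) * dt + kick) / mass
        new_vel.append(v_next)
        new_pos.append(x + v_next * dt)
    return new_pos, new_vel


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical values: a 2 nm sphere of ~30 kDa in water at 310 K.
    gamma = stokes_drag(viscosity=1.0e-3, radius=2.0e-9)
    pos, vel = [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]
    for _ in range(1000):
        pos, vel = langevin_step(pos, vel, [0.0, 0.0, 0.0], 5.0e-23, gamma, 310.0, 1.0e-13, rng)
    print(pos)
```

The explicit scheme is only stable when `gamma * dt / mass` stays well below 2, which is why the example uses a very short timestep; a production engine would typically use a different integrator.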

Membrane simulation

A restraint field based on IMPALA (Integral Membrane Proteins and Lipids Association) [34] was implemented in DURABIN to simulate how transmembrane helices anchor in the cell membrane. This potential was used to simulate more realistic interactions between integrins and the cell membrane in our model. This approach can be used for any type of transmembrane component interacting with the ECM. The cell lipid membrane was described as a continuum by the following function:

$$C(z) = \frac{1}{2} - \frac{1}{1 + e^{\alpha\,(|z| - z_0)}}$$

where z is the depth of the helix collider relative to the center of the membrane and z_0 is the depth at which C = 0. With α set to 1.99 and z_0 set to 15.75 Å, C = 0.49 at ±18 Å and C = −0.49 at ±13.5 Å. These values were chosen to represent the thickness of a 1,2-dipalmitoylphosphatidylcholine (DPPC) bilayer in the fluid phase with a smooth transition to the hydrophobic core of the membrane [34], [35]. This thickness can be adapted to other lipids or membrane compositions. A collider of 36 Å in height, spanning the whole “membrane”, detects whether a helix collider enters the “membrane” collider, which then applies a force proportional to C that pulls the transmembrane domain into the membrane. The result is a vector that points toward the inside of the membrane when the helix is in contact with the membrane but still in the water phase, and that approaches 0 as the helix gets closer to the hydrophobic region of the bilayer. Thus, any rigid body set in the simulation as being transmembrane, regardless of shape, will be attracted inside the membrane continuum field, but will still be free to move around once inside the membrane.
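As a quick numerical check of the membrane profile, the sketch below evaluates C at a few depths. It assumes the logistic form C(z) = 1/2 − 1/(1 + exp(α(|z| − z0))), which reproduces the values quoted in the text (C = 0 at 15.75 Å, about ±0.49 at 18 Å and 13.5 Å); the function and constant names are illustrative and not taken from DURABIN.

```python
import math

ALPHA = 1.99   # steepness of the water/membrane transition (value quoted in the text)
Z0 = 15.75     # depth in angstroms at which C = 0 (value quoted in the text)


def membrane_c(z, alpha=ALPHA, z0=Z0):
    """Assumed logistic profile: close to -0.5 in the membrane core, +0.5 in water."""
    return 0.5 - 1.0 / (1.0 + math.exp(alpha * (abs(z) - z0)))


if __name__ == "__main__":
    # Print the profile from the membrane center out to the water phase.
    for depth in (0.0, 13.5, 15.75, 18.0, 25.0):
        print(f"z = {depth:6.2f} A   C = {membrane_c(depth):+.3f}")
```

Because the profile depends only on |z|, it is symmetric about the membrane center, so both leaflets are handled by the same function.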

Collecting or modelling 3D structures of interest

Building an accurate model of multi-domain protein complexes requires collecting as much of the available structural data as possible. As mentioned above, many components of the extracellular matrix are too large to have their 3D structure determined by X-ray crystallography or NMR as full-length proteins. High-molecular-weight proteins are indeed difficult to crystallize [36] or to analyze by NMR [37]. Most 3D atomic structures available for ECM proteins or proteoglycans are those of individual domains or pairs of domains and not those of full-length molecules. Low-resolution techniques such as AFM, SAXS and rotary-shadowing electron microscopy give insights into the domain organization of ECM proteins and their global size, which are useful to build models. Although recent developments in the field of cryo-EM have increased the resolution, the technique is still costly [38] and subject to limitations (size of the complex, among other things) [39]. Also of note, recent AI-based techniques to solve the 3D structure of proteins have made large contributions toward providing plausible structures of previously unresolved or unresolvable structures [40], [41], although one has to be aware of the limitations of such techniques, especially regarding disordered regions [42]. Ideally, data collected by both high- and low-resolution techniques should be combined to build our models. The main source of experimentally solved protein structures is the Protein Data Bank [43], but data collected by cryo-electron microscopy are also available via the public repository Electron Microscopy Data Bank [44], [45]. When no experimentally determined 3D structure is available, modelling approaches (e.g., threading, ab initio or homology modelling [46], [47], [48]) are used to predict the 3D structure of a protein or a domain from its primary sequence.

3D structure importation and model building in Unity 3D

Because the coordinates from the PDB are not centered, the PDB data must first be processed so that the center of mass coincides with the origin of the coordinate system. Our approach uses Python scripts running in Blender (https://blender.org/), a free and open-source 3D software that supports the modelling and rendering of 3D objects, but any tool allowing the transformation of the coordinates of a PDB file could be used. When an axis is obvious in the molecule (e.g., a rod-like triple-helical domain), the main axis is aligned along the Y-axis, which is by default the up direction in Unity3D. The modified PDB file is then opened in VMD, where the molecular surface representation is exported in Wavefront .obj file format. This file format is human readable and can be used by many applications, including both Blender and Unity3D. The molecular surface is used as a mesh representation for the underlying rigid bodies. VMD produces very dense meshes, which can affect interactive rendering speed. Blender is used to simplify the object, lowering the polygon count while keeping the overall shape intact. Once the mesh is imported into Unity3D, the required components for the physics engine, such as the rigid body components themselves and the associated constraints, are added.
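The centering step can be sketched in plain Python, outside Blender. This is not the authors' script: the function name, the small mass table and the fallback mass for unknown elements are assumptions, and the parser relies only on the fixed-column layout of PDB ATOM/HETATM records.

```python
# Approximate atomic masses; any element not listed falls back to carbon (an assumption).
MASSES = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999, "S": 32.06}


def center_pdb_lines(lines):
    """Return (new_lines, center_of_mass) with ATOM/HETATM coordinates shifted to the origin."""
    atoms = []
    for index, line in enumerate(lines):
        if line.startswith(("ATOM", "HETATM")):
            xyz = [float(line[30:38]), float(line[38:46]), float(line[46:54])]
            name = line[12:16].strip()
            # Prefer the element columns (77-78); otherwise use the first letter of the atom name.
            element = line[76:78].strip().upper() or next(
                (c for c in name if c.isalpha()), "C"
            )
            atoms.append((index, xyz, MASSES.get(element, MASSES["C"])))
    if not atoms:
        raise ValueError("no ATOM/HETATM records found")
    total_mass = sum(mass for _, _, mass in atoms)
    com = [sum(mass * xyz[k] for _, xyz, mass in atoms) / total_mass for k in range(3)]
    centered = list(lines)
    for index, xyz, _ in atoms:
        x, y, z = (xyz[k] - com[k] for k in range(3))
        old = lines[index]
        # Rewrite only the coordinate columns (31-54) and keep the rest of the record.
        centered[index] = f"{old[:30]}{x:8.3f}{y:8.3f}{z:8.3f}{old[54:]}"
    return centered, com
```

Aligning the main axis with the Y-axis would be a separate rotation step and is not covered by this sketch.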

Simulating in Unity 3D

We define a simulation box with a base area of 400 nm × 400 nm to fit the longest model of the selected proteins. The thickness of the box varies with the nature of the studied system. For example, basement membrane thickness varies with tissue, age, disease and physiopathological context, as well as with the method used to measure it, but most basement membranes are 50–100 nm thick [49]. The thickness was thus fixed at 100 nm, which in addition accommodates the length of laminin molecules. The total size of the simulation box was 0.016 µm³, including less than 600 rigid bodies, all of them representing domains of larger macromolecules. In contrast, an atomistic model of a basement membrane made solely of integrin (one isoform), collagen IV (one isoform), laminin and nidogen would account for approximately 865,000 atoms. While CGMD reduces the DOF of residues, the rigid body approach further reduces the DOF of protein domains and represents large macromolecular complexes as articulated chains of protein domains, where individual domains are the beads of a very coarse model. This approach has been used to study larger mesoscopic systems such as the yeast interphase nucleus [33]. Using this approach, collagen IV was modelled as a chain of 44 rigid bodies, one NC1 C-terminal domain and 43 triple-helical domains of 9 nm length, for a total length of ∼390 nm [50]. Details on the constraints used to articulate the chain are given further in the paper. The crystal structure of the collagen triple helix model [(Pro-Pro-Gly)10]3 (PDB 1K6F, Table 1) was used as an individual triple-helical domain. The equivalent all-atom representation would correspond to ∼28,000 atoms (43 × 1K6F (528 atoms each) + 1LI1 (5253 atoms)). Since the number of domains is far lower than the number of residues or atoms, the amount of computing resources required is that of a desktop PC.
It can even be run interactively, which allows the addition of functionalities such as user feedback and interactions in real-time as the simulation runs. By importing the model in the Unity3D game engine, it is possible to make a representation of the collagen network running in real-time with few computing resources [26].
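The figures quoted above (box volume, atom count of the collagen IV model) can be re-derived with a few lines of arithmetic. The comparison of degrees of freedom at the end is an added illustration that assumes 3 DOF per atom and 6 DOF per rigid body before any joint constraint is applied; the variable names are hypothetical.

```python
NM3_PER_UM3 = 1.0e9  # (1000 nm)^3 in one cubic micrometer

# Simulation box: 400 nm x 400 nm base, 100 nm thick.
box_nm = (400.0, 400.0, 100.0)
box_volume_um3 = box_nm[0] * box_nm[1] * box_nm[2] / NM3_PER_UM3

# Collagen IV chain: 43 triple-helical segments (1K6F) plus one NC1 domain (1LI1).
n_triple_helix_segments = 43
atoms_1k6f = 528
atoms_1li1 = 5253
collagen_iv_atoms = n_triple_helix_segments * atoms_1k6f + atoms_1li1

# Degrees of freedom before constraints (assumption: 3 per atom, 6 per rigid body).
rigid_bodies = n_triple_helix_segments + 1
rigid_body_dof = rigid_bodies * 6
all_atom_dof = collagen_iv_atoms * 3

print(f"box volume: {box_volume_um3} um^3")
print(f"collagen IV atoms: {collagen_iv_atoms}")
print(f"DOF, rigid bodies vs all-atom: {rigid_body_dof} vs {all_atom_dof}")
```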
Table 1

List of known PDB entries of proteins, glycosaminoglycans and protein complexes of the basement membrane and integrins.

Proteins | Domains | PDB entries
Collagen triple helix | Crystal Structure of the Collagen Triple Helix Model [(Pro-Pro-Gly)10]3 | 1K6F
Collagen IV | The hexameric non-collagenous domain 1 of human placenta collagen IV (α1α1α2 isoform) | 1LI1
Laminin EGF-like modules | 3 consecutive laminin-type EGF-like (LE) modules of laminin gamma1 chain harboring the nidogen binding site | 1KLO
Laminin coiled-coil domain | A coiled-coil motif (Salmonella enterica SadA 479–519 fused to GCN4 adaptors) | 2WPQ
Laminin (α5 chain) | Laminin α5 chain N-terminal fragment | 2Y38
Laminin β1 (short arm) | Laminin β1 LN-LE1-4 structure | 4AQS
Laminin γ1 (short arm) | Laminin γ1 LN-LE1-2 structure | 4AQT
Laminin α2 | Laminin α2 subunit L4b domain | 4YEQ
Laminin α1 chain | Mouse laminin alpha1 chain, domains LG4-5 | 2JD4
Nidogen-1 (EGF domain) | EGF | 1JL9
Nidogen | G1 threading model [54] | (no available crystal structure)
Nidogen-1 domain G2 | Domain G2 of mouse nidogen-1 | 1H4U
Perlecan (LDL receptor domain) | 2nd repeat of the LDL receptor ligand-binding domain (domain mediates interactions of the receptor with two lipoprotein apoproteins, apo E and apo B-100) | 1LDR
Perlecan (sea-urchin sperm protein, enterokinase and agrin domain) | SEA domain of human mucin 1 | 2ACM
Perlecan (LG like domain 3) | Laminin G like domain 3 from human perlecan | 3SH4
Integrin ectodomain | Structure of complete ectodomain of integrin αIIBβ3 | 3FCS
Integrin transmembrane domain | Integrin αIIBβ3 transmembrane complex | 2K9J

Complexes | Description | PDB entries
Laminin-111 (integrin-binding domain) | The heterotrimeric integrin-binding region of laminin-111 (50 residues of α1β1γ1 coiled coil and the first 3 laminin G-like (LG) domains of the α1 chain) | 5MC9
Laminin-Nidogen complex | Nidogen/Laminin Complex (a 6-bladed β-propeller domain in nidogen, laminin epidermal-growth-factor-like (LE) modules III3-5) | 1NPE
Perlecan/nidogen complex | Nidogen-1 G2/Perlecan IG3 Complex | 1GL4

All-atom reconstruction: back-mapping

In Unity3D, a C# script in the project’s assets (SaveSnapshot.cs) exports the transformation matrix of the rigid bodies from the molecule models in the ongoing simulation. The original atomic coordinates are transformed using this matrix, and the coordinates are saved as separate files formatted as GROMACS .gro files. Because these are separate files, further work is necessary to create one single molecule that can be used with an MD package like GROMACS. Nonetheless, by loading all the separate .gro files in the visualization software VMD, it is possible to visualize the all-atom model.
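The back-mapping step can be illustrated in Python (the project's own export is done by a C# script, which is not reproduced here). This sketch assumes the rigid body pose is available as a 4×4 homogeneous matrix in row-major order and that PDB coordinates are in ångströms; the function names are hypothetical, and the conversion between Unity3D's coordinate conventions and those of the PDB is deliberately left out.

```python
import math

ANGSTROM_TO_NM = 0.1  # PDB files use angstroms, .gro files use nanometers


def apply_transform(matrix, point):
    """Apply a 4x4 homogeneous transform (row-major nested lists) to a 3D point."""
    x, y, z = point
    return tuple(
        matrix[row][0] * x + matrix[row][1] * y + matrix[row][2] * z + matrix[row][3]
        for row in range(3)
    )


def rotation_y_translation(angle_deg, shift):
    """Build an example pose: rotation about the Y axis followed by a translation."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    return [
        [c, 0.0, s, shift[0]],
        [0.0, 1.0, 0.0, shift[1]],
        [-s, 0.0, c, shift[2]],
        [0.0, 0.0, 0.0, 1.0],
    ]


def gro_atom_line(resid, resname, atomname, atomid, xyz_nm):
    """Format one atom record using the fixed-width .gro layout (positions only)."""
    return "%5d%-5s%5s%5d%8.3f%8.3f%8.3f" % (resid, resname, atomname, atomid, *xyz_nm)


if __name__ == "__main__":
    pose = rotation_y_translation(90.0, (10.0, 20.0, 30.0))
    moved = apply_transform(pose, (1.0, 0.0, 0.0))          # still in angstroms
    moved_nm = tuple(v * ANGSTROM_TO_NM for v in moved)      # convert for the .gro file
    print(gro_atom_line(1, "GLY", "CA", 1, moved_nm))
```

A complete .gro file also needs a title line, an atom count and a final box line, which would be written around these atom records.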

Results

Building of rigid body models of individual proteins

The 3D structures used to build the proteins of basement membrane models are listed in Table 1. There are, for example, several structures of the NC1 domain of collagen IV obtained from different collagen IV species and isoforms (e.g., homo-oligomers or hetero-oligomers), cross-linked or not, from different tissues, and at various resolutions. However, from a “rigid body” point of view and at the mesoscopic scale, the NC1 domains are similar in terms of general shape (half-oval shaped geometry, same diameter, same height). The 1LI1 structure was selected because it comprises two alpha1 chains and one alpha2 chain of collagen IV, which corresponds to the major isoform of collagen IV in most tissues, and is cross-linked by Met-Lys cross-links. The crystal structure of the collagen triple helix model 1K6F, although not specific to collagen IV, was used because the triple helix is a common feature of all collagens [51]. A bacterial coiled-coil structure (2WPQ) was used to represent the coiled-coil region of laminin-111 [52], [53]. No 3D structure of the G1 domain of nidogen-1 was available, and a threading approach was used to build a model of this domain [54]. Basement membranes surround several cell types, and several basement membrane constituents (e.g., laminin-111 and perlecan) interact with integrins. Integrins were thus included in the simulation. 3FCS was first chosen because of its ability to interact with molecules such as thrombospondins [55] in globular basement membrane, but as the focus of the project shifted toward a more general-purpose simulation of the basement membrane, this integrin variant was kept as the default integrin model because of the mechanical similarities of the ectodomain between integrins. Unlike the other structures in Table 1, the extracellular part of the integrin has all its domains solved in a single PDB file instead of being spread across several PDB entries.
To be usable in our simulation, the extracellular part of the integrin had to be broken down into domains to generate separate PDB files. In the integrin α subunit, for example, one domain is made of residues 1 to 452, while the thigh domain (α subunit) goes from residue 453 to residue 604. The integrin β subunit was similarly sliced into domains [56]. Known biomolecular interactions were integrated and parameterized for nidogen-1, where two out of the three main domains, G2 and G3, interact with collagen IV and laminin-111 respectively [57]. Integrins interact with the laminin-111 globular domain [58]. Collagen IV dimerizes through its non-collagenous (NC1) C-terminal domain and the 7S N-terminal domain to form a network [59]. Laminin-111 interacts with itself through the N-termini of the α, β and γ chains [60]. Perlecan interacts with a wide range of ECM molecules and growth factors. In basement membranes, perlecan connects laminin and collagen IV, while interacting with integrin α2β1 [61]. The process to go from existing data reported in the literature to a fully rigged rigid body model that can be used in a rigid body simulation is summarized in Fig. 1 using nidogen as an example. The process is mainly manual, with helper scripts in Python for Blender and the Editor functions for Unity3D. Although we reckon the task could be automated, this was not the focus of this work. Nidogen-1 is a linear molecule made of three domains, G1, G2 and G3, connected by EGF-like domains as represented in Fig. 1A, based on biochemical [62] or biophysical [63] data. If we assemble the structural pieces according to the experimentally determined domain organization, the resulting maximum size of nidogen is ∼25 nm, roughly the maximum particle size determined by SAXS experiments [63]. The curation of existing data allowed us to determine the available structures and the part of the protein to be modelled (Table 1).
It should be noted that low-resolution data such as cryo-EM and SAXS envelopes can be used for simulations, although a true atomistic structure will ultimately be needed. The domains were then converted to rigid body models with matching colliders, as shown for the G1 and G3 domains (Fig. 1D). The slightly oblong EGF and G2 domains were matched with capsule colliders (cylinders capped with spheres). The rigid bodies were assembled according to the organization of domains by joining the C-terminus of a domain with the N-terminus of the next domain. As explained in the section on rigid body dynamics, joints are positional constraints updated at each timestep, with customizable rotational degrees of freedom. Unity3D allows for different presets of these constraints [64]. We used the configurable joint, as it is the most versatile of these presets. The rotational and positional constraints between rigid bodies were defined at this step to fine-tune flexibility by modulating rotational constraints. In most cases, in the absence of information on the rotational freedom between domains, we use the simplest joint, a positional constraint without rotational constraints. For laminin and collagen domains, we add twist constraints, reflecting how super-coiled structures can be more constrained around their helical axis, the reality being more complex [65].
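The domain-splitting step described earlier in this section (cutting the integrin ectodomain into per-domain sets of atoms before building rigid bodies) could be scripted along the following lines. This is a sketch, not the authors' helper script: the function name, the chain identifier and the label `alpha_residues_1_452` are hypothetical, while the residue ranges are the ones quoted in the text.

```python
def split_by_residue_ranges(pdb_lines, chain_id, ranges):
    """Group the ATOM records of one chain into named domains by residue number.

    ranges maps a domain label to an inclusive (first_residue, last_residue) pair.
    """
    domains = {name: [] for name in ranges}
    for line in pdb_lines:
        # Column 22 (index 21) holds the chain identifier in PDB ATOM records.
        if not line.startswith("ATOM") or len(line) < 26 or line[21] != chain_id:
            continue
        residue_number = int(line[22:26])
        for name, (first, last) in ranges.items():
            if first <= residue_number <= last:
                domains[name].append(line)
                break
    return domains


# Residue ranges quoted in the text for the integrin alpha subunit;
# the chain identifier "A" is an assumption.
ALPHA_SUBUNIT_RANGES = {
    "alpha_residues_1_452": (1, 452),
    "thigh": (453, 604),
}
```

Each list returned by `split_by_residue_ranges` can then be written to its own PDB file and processed like any single-domain structure.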
Fig. 1

A rigid body model based on nidogen-1 for rigid body simulation. A: Domain structure of nidogen-1 including the G1, G2, G3 and EGF-like domains. B: Solvent Accessible Surface representation of the individual domains. C: Construction of the rigid body model based on literature data. D: Final model used in the simulation; (right) schema of the rigid body colliders used in the simulation.


All-atom reconstruction

One of the goals of the DURABIN project is to develop convenient tools to run simulations from all atoms to rigid bodies and back to all atoms (called back-mapping in CGMD) in a true multiscale way. While rigid body dynamics can provide a convenient way to sample and study large-scale motions of molecular complexes, the details of biomolecular interactions lie in the all-atom realm of small-range motions and residue reorientation. In Fig. 2, we use custom scripts and user interaction to generate a couple of collagen IV molecules at 1/10th of their final size, and we then scale them back to their real size. This allows us to fit long polymers in a very constrained space, such as a sphere. This has no biological relevance for collagen IV but shows the ease of building all-atom models of a large protein in a very tightly confined space, which could be of practical use when modelling a crowded ECM or integrin trafficking [66]. In CGMD, the first step of back-mapping is to recover the coordinates of the original atoms used to build the CG model, followed by an optimization step where the model is relaxed. In the case of rigid body models, atomic coordinates are back-mapped to fit the newly oriented and positioned rigid body (Fig. 2B). The minimization/relaxation step is far less trivial to achieve. It is a crucial phase for using the model in MD simulations, especially to correct problems such as atom superimposition and unwanted kinks.
Fig. 2

Example of atomistic model construction based on rigid body simulation, illustrating a complex model that can be made with DURABIN. A. Real-time rigid body models of the collagen IV heterotrimer α1α1α2 constrained to a sphere. B. Atomistic model built using PDB data (Table 1, 1K6F, 1LI1) and rigid body spatial orientation data extracted from A. C. Top: 3D mesh representation of the molecular surface (blue) lacking atomic details, superimposed on the rigid bodies (green). Bottom: rigid bodies alone, showing how a slight overlap leads to a gapless chain. D. Close-up view of the all-atom globular NC1 domain of collagen IV. Carbon: gray, oxygen: red, nitrogen: blue, sulphur: yellow. Hydrogen atoms are not represented for the sake of clarity. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Modelling a basement membrane

Setting up the simulation

Models of basement membrane components, namely collagen IV, laminin-111, nidogen-1 and perlecan, were generated in the simulation box described in the Material and Methods section (400 nm × 400 nm × 100 nm). While the components of the basement membrane are known, the proportion of each component is still subject to debate [67]. In our model, we chose a 1:1 stoichiometry so that, in theory, no molecule is left orphaned during self-assembly (one collagen interacting with one laminin, and so on). The simulation was then run, and the rigid body models were allowed to diffuse in an implicit medium whose viscosity is modulated through Langevin dynamics. The models move under the influence of random thermal forces linked to a thermostat (higher temperatures mean larger forces) and self-assemble into a basement membrane-like ECM (Video, Supplementary material). Any interaction at this stage resulted from a chance encounter during the simulation (Fig. 4).
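To make the role of the thermostat concrete, the following Python sketch shows one generic Langevin update for a single body. It is a textbook Euler-Maruyama step written for illustration, not the integrator used by DURABIN or by the underlying physics engine; the function names and the parameters `gamma` (friction coefficient standing in for the viscosity of the implicit medium), `mass`, `temperature` and `dt` are assumptions.

```python
import math
import random

K_B = 1.380649e-23  # Boltzmann constant in J/K


def thermal_impulse_sigma(gamma, temperature, dt):
    """Standard deviation of the random impulse per axis over one timestep.

    Follows the fluctuation-dissipation relation: it grows with both the
    friction coefficient and the temperature.
    """
    return math.sqrt(2.0 * gamma * K_B * temperature * dt)


def langevin_step(pos, vel, force, mass, gamma, temperature, dt, rng):
    """Advance position and velocity (3-component lists) by one timestep."""
    sigma = thermal_impulse_sigma(gamma, temperature, dt)
    new_pos, new_vel = [], []
    for x, v, f in zip(pos, vel, force):
        # Deterministic force, viscous drag and random thermal kick.
        impulse = (f - gamma * v) * dt + sigma * rng.gauss(0.0, 1.0)
        v_next = v + impulse / mass
        new_vel.append(v_next)
        new_pos.append(x + v_next * dt)
    return new_pos, new_vel
```

Calling `thermal_impulse_sigma` at two temperatures shows the statement made above: doubling the temperature increases the size of the random kicks by a factor of the square root of two, while a larger `gamma` both damps the motion and strengthens the kicks.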
Fig. 4

Top, the main components of the basement membrane and integrins. Bottom A to C: Effect of increasing flexibility of the collagen IV model from the stiffest to most flexible molecule on the organization of the basement membrane during simulation.

Influence of collagen IV flexibility

In our previous study [26], we investigated the behaviour of laminins and nidogen. The collagen IV chains contain 21–26 interruptions of various lengths in their triple helix [68], which provide collagen IV molecules with flexibility. By tweaking the joint rotational constraints (Fig. 3A) between the 43 individual triple helical domains used to build the model of collagen IV, it was possible to stiffen or loosen the collagen molecule (see Fig. 3B and C).
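The effect of such a rotational constraint can be sketched in a few lines of Python. This is a generic cone limit written for illustration, with hypothetical function names; it is not the joint implementation used in DURABIN. The direction of a child segment is kept within `max_angle_deg` of the direction of its parent segment, and a second helper gives the smallest number of joints a chain needs to reverse its direction.

```python
import math


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def _normalize(v):
    n = math.sqrt(_dot(v, v))
    return tuple(c / n for c in v)


def clamp_to_cone(parent_axis, child_axis, max_angle_deg):
    """Return the child direction, pulled back onto the cone if it exceeds the limit."""
    a = _normalize(parent_axis)
    d = _normalize(child_axis)
    cos_angle = max(-1.0, min(1.0, _dot(a, d)))
    limit = math.radians(max_angle_deg)
    if math.acos(cos_angle) <= limit:
        return d
    # Component of the child direction perpendicular to the parent axis.
    perp = tuple(dc - cos_angle * ac for dc, ac in zip(d, a))
    norm = math.sqrt(_dot(perp, perp))
    if norm < 1e-12:
        # Antiparallel case: any direction perpendicular to the parent will do.
        helper = (1.0, 0.0, 0.0) if abs(a[0]) < 0.9 else (0.0, 1.0, 0.0)
        perp = tuple(hc - _dot(helper, a) * ac for hc, ac in zip(helper, a))
        norm = math.sqrt(_dot(perp, perp))
    perp = tuple(c / norm for c in perp)
    return tuple(math.cos(limit) * ac + math.sin(limit) * pc for ac, pc in zip(a, perp))


def min_joints_for_u_turn(max_angle_deg):
    """Smallest number of joints needed to turn the chain by 180 degrees."""
    return math.ceil(180.0 / max_angle_deg)
```

With the two limits quoted in the Fig. 3 legend, a chain limited to 5 degrees per joint needs at least 36 joints to make a U-turn, whereas a chain limited to 20 degrees needs only 9, which is one way to read the difference in turn radius between the stiff and the bendy polymer.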
Fig. 3

Tweaking the joint rotational constraint changes the flexibility of a polymer model. A. How the rotation constraint affects the joint. Rotation around the X and Y axes is more limited in Top A than in Bottom A; the grey cap helps visualize the resulting Z axis exploration limits. B. Close-up view isolating the rotation constraint. C. Difference in turn radius between a stiff (5 degrees rotation constraint) and a bendy (20 degrees rotation constraint) polymer.

A simple simulation of the basement membrane showed the influence of collagen IV flexibility on the final structures formed (Fig. 4). When collagen IV was rigid or moderately flexible, laminin and nidogen molecules tended to spread out, because the rigidity of collagen helped to maintain a minimal distance between them (Fig. 4A and B), and to form an irregular polygonal network during the simulation, as previously observed experimentally [69]. In contrast, when collagen IV was flexible, distances decreased and the overall structure looked like an aggregate. When the flexibility was set to its maximum value, allowing the formation of kinks and U-turns, clumps formed and the self-organization appeared to be lost. Increasing the flexibility of the collagen IV model leads to the formation of a loose network [69]. The ability to modulate the self-assembly of the basement membrane from organized to disorganized networks could be useful to mimic what happens in some diseases and to investigate the underlying molecular events associated with and/or triggering the disease [70]. Introducing cross-links that mimic those of physiologically cross-linked ECM would be useful to build an ECM model exhibiting various mechanical properties modulated by the extent of cross-linking [71].
The proportions of each molecule in the basement membrane model are listed in Table 2 and compared to the Matrigel basement membrane composition [72].
Table 2

Relative proportions calculated according to the molecular weight of the components of the Matrigel and the basement membrane model.

| Component | Matrigel [72] | Basement membrane model | Molecular weight of the basement membrane components |
| --- | --- | --- | --- |
| Laminin-111 | 60% | 62.33% | 900 kDa [73] |
| Collagen IV | 30% | 28.04% | 405 kDa [74] |
| Nidogen | 8% | 9.63% | 139 kDa [63] |
| Total | | | 1444 kDa |
The numbers seem to support equal proportions between laminins, collagen and nidogen. The model is based on a 1:1 stoichiometry. This ratio was chosen because it maximizes networking and interactions while minimizing the number of molecules that cannot find a partner to interact with. The resulting proportions are similar to those of the basement membrane extract Matrigel.
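The model column of Table 2 can be reproduced from the molecular weights alone. The short Python sketch below assumes one copy of each component (the 1:1 stoichiometry stated above) and divides each molecular weight by the 1444 kDa total; the function name and the optional `copies` argument are illustrative and not part of DURABIN.

```python
MOLECULAR_WEIGHT_KDA = {
    "Laminin-111": 900.0,
    "Collagen IV": 405.0,
    "Nidogen": 139.0,
}


def mass_percentages(weights_kda, copies=None):
    """Mass percentage of each component for a given number of copies."""
    if copies is None:
        # Default: one copy of each component (1:1 stoichiometry).
        copies = {name: 1 for name in weights_kda}
    masses = {name: weights_kda[name] * copies[name] for name in weights_kda}
    total = sum(masses.values())
    return {name: 100.0 * mass / total for name, mass in masses.items()}


if __name__ == "__main__":
    for name, pct in mass_percentages(MOLECULAR_WEIGHT_KDA).items():
        print(f"{name}: {pct:.2f}%")
```

The computed values agree with the basement membrane model column of Table 2 to within rounding in the last digit. Passing a different `copies` dictionary shows how quickly the mass fractions move away from the Matrigel composition when the stoichiometry is changed.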

Membrane proteins: integrins

Lipid bilayers and integrins are in close contact with basement membranes and define the boundaries of our modelled and simulated system. Until now, integrins were locked in a separate 2D space, with a rigid plane mimicking a membrane [26], and the constraints used had no physical justification. Here we used a restraint field, described in the Material and Methods section (membrane simulation), which can be used to simulate any transmembrane molecule. Forces calculated from Equation (1), for instance, maintain the transmembrane helix of the integrin models inside the simulated lipid bilayer (Fig. 5). With this newly implemented restraint field, the integrins behave as we anticipated. The restraint field is two-sided: although we only model the extracellular part, it will allow intracellular content to be simulated as well in the future.
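The general idea of a two-sided restraint can be illustrated with a deliberately simplified Python sketch. It does not reproduce Equation (1) or the IMPALA field: the flat-bottom harmonic form, the function name and the parameters `half_thickness` and `stiffness` are all assumptions chosen only to show a force that vanishes inside the bilayer and pulls a transmembrane rigid body back towards it from either side.

```python
def membrane_restraint_force_z(z, half_thickness, stiffness):
    """Restoring force along z for a body whose centre of mass is at height z.

    The membrane is centred on z = 0 and spans [-half_thickness, +half_thickness].
    Inside the slab the force is zero; outside, it grows linearly with the
    distance to the nearest boundary and points back towards the membrane.
    """
    if z > half_thickness:
        return -stiffness * (z - half_thickness)
    if z < -half_thickness:
        return -stiffness * (z + half_thickness)
    return 0.0
```

A body that drifts above the upper boundary is pushed down, and one that drifts below the lower boundary is pushed up, which mirrors the opposite arrow directions on the two sides of the bilayer described in the Fig. 5 legend.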
Fig. 5

Integrin models embedded in a model of lipid bilayer. The black dotted lines represent the boundaries of the lipid bilayer (-z, +z), and the cyan line corresponds to the center of the membrane bilayer. The rigid bodies representing the hydrophobic transmembrane helices are represented as red rectangles. The arrows represent the direction of C as a function of the position of the center of mass of the rigid bodies in the IMPALA field (a force-field specific to lipid membranes simplified as a continuum). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Discussion

DURABIN was developed as a first contribution to fill the gap between microscopic and macroscopic observations. Experimental methods allowing the direct observation of biomolecular objects at the mesoscale are still lacking, and simulation offers a tentative glimpse at systems that would otherwise elude us. We demonstrated that combining two incomplete datasets, namely the atomic details obtained from microscopic observations of protein domains and the global observations made at the macroscopic scale, is feasible for proteins of large molecular weight. We have also demonstrated that the tools we developed for DURABIN can simulate large motions in mesoscale systems such as the basement membrane. We hope they will help matrix biology researchers gain insight into the molecular structure of basement membrane components at the atomic level in a true multiscale fashion, all tools being simple to use and interactive thanks to virtual/augmented reality technologies.

Our rigid body to all-atom approach is promising but could certainly be improved. While transforming atomic coordinates using rigid body positions and rotations works, joining the heterogeneous pieces before performing minimization/optimization on the resulting atomic model remains challenging. This two-stage approach is already commonly used in CGMD, for example in the “Backward” back-mapping tool [75], which uses a geometrical reconstruction approach starting from the backbone and a library of geometrical rules. Our rigid body approach allows us to populate a simulation with multiple copies of the rigid body models of the basement membrane components, but we plan to improve it. One of these improvements could come from position-based physics, or particle-based rigid bodies, a recent development in which rigid bodies are defined not by collider primitives but by a set of strongly constrained particles, allowing for arbitrary shapes and unlocking parallel computing [76].
This opens the possibility of performing simulations with a larger population of models in the simulation box.

The N-terminal arms of the α, β and γ chains of different laminin molecules interact to form a laminin network [49], [77]. This was indeed the case during the simulation of our model, but the short arms of the β and γ chains belonging to the same laminin molecule were also able to interact. This could be prevented in the future by making the short arms of the β and γ chains rigid enough that they never come into contact within the same laminin molecule. This illustrates another strength of the DURABIN approach: the rigid body models are easy to modify and adapt to the large and ever-changing amount of research information available. The more information we get, the more accurate the simulation will be.

It would be interesting to use DURABIN to test whether and how modifications of the basement membrane composition affect molecule diffusion and basement membrane stability and physical properties, mimicking physio-pathological changes occurring in vivo. The model reported here is the first step towards building an extracellular matrix mesoscope, which will provide the opportunity to investigate the molecular functions of the ECM and the biological processes it is involved in. Basement membranes are usually associated with cells through interactions with adhesion receptors, sulfated glycolipids and others. We have not yet included in our model the interactions between basement membrane components and glycolipids or dystroglycan, which should provide new insights into the role of cell surface receptors in basement membrane self-assembly. Furthermore, the polarization of proteins in the basement membrane should also be considered. For example, the C-terminal globular domains of laminins interact with the cell surface, whereas the N-termini of their three chains form the laminin network in the basement membrane [78].
Collagen XVIII also has a polarized orientation in basement membranes, with the N-terminus facing the fibrillar matrix and the C-terminus oriented towards the plasma membrane [79].

Last, the dimerization of the collagen IV C-terminus was implemented, but we still have to fully model the tetramerization of the 7S domain. This is much more complex because it involves 12 chains and is cross-linked by lysyl oxidase like-2, whereas it is currently modelled as a simple constraint. Non-covalent intermolecular bonds are not programmed to dissociate during the simulation, meaning that they are stable once created. The challenge is to find a way to translate the values of equilibrium dissociation constants into a programmable parameter of the simulation. This could be expressed as a contact frequency defining the probability that a non-covalent bond/interaction forms or dissociates at each timestep.

Our approach, developed with the Unity platform, benefits from the native support of haptic and augmented/virtual reality (AR/VR) devices. The advantage of VR is that it makes it easier for non-expert users to navigate the computer environment. From simply walking around the visualization to quite literally grabbing the model and turning it around, it offers very intuitive and new means of looking at molecular models [80], [81]. Unity also provides real-time visual feedback to the user. DURABIN provides an easy-to-apprehend visual medium for users to observe and interact with complex mesoscale systems, but it lacks advanced molecular representation features such as ribbon or cartoon representations, unlike UnityMol [30] or other visualization software such as VMD or PyMOL. DURABIN provides tools to import, simulate, manipulate and export molecular models, but it does not aim to perform all-atom MD or to replace well-known MD packages such as Gromacs, NAMD, Amber or LAMMPS.
DURABIN speeds up the study of large molecular systems at the mesoscale level by facilitating the building and sampling of complex molecular systems such as large multi-domain ECM molecules.
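The idea raised in this Discussion of turning an equilibrium dissociation constant into a per-timestep probability could be sketched as follows in Python. This is one possible scheme, not a feature of DURABIN. It relies on the standard relation K_D = k_off / k_on, so the association rate constant `k_on` has to be supplied as an additional assumption, since K_D alone does not fix the kinetics; the function names are hypothetical.

```python
import math
import random


def off_rate(k_d_molar, k_on_per_molar_per_s):
    """Dissociation rate constant k_off (1/s) from K_D (M) and an assumed k_on (1/(M s))."""
    return k_d_molar * k_on_per_molar_per_s


def dissociation_probability(k_off_per_s, dt_s):
    """Probability that an existing bond breaks during a timestep of length dt_s."""
    return 1.0 - math.exp(-k_off_per_s * dt_s)


def bond_survives(k_off_per_s, dt_s, rng):
    """Draw once per timestep; True means the bond is kept for this step."""
    return rng.random() >= dissociation_probability(k_off_per_s, dt_s)
```

For example, a nanomolar interaction (K_D = 1e-9 M) with an assumed k_on of 1e6 per molar per second gives a k_off of about 1e-3 per second, so the bond would have roughly a 63% chance of breaking over 1000 s of simulated time. How simulated timesteps map onto physical time in a rigid body simulation would still have to be calibrated.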

CRediT authorship contribution statement

Hua Wong: Methodology, Software, Formal analysis, Investigation, Data curation, Visualization, Writing – original draft. Jean-Marc Crowet: Writing – review & editing. Manuel Dauchez: Conceptualization, Writing – review & editing, Supervision, Funding acquisition. Sylvie Ricard-Blum: Writing – review & editing. Stéphanie Baud: Writing – review & editing, Supervision. Nicolas Belloy: Conceptualization, Writing – review & editing, Resources, Project administration, Funding acquisition.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.