Tu C Le, Matthew Penna, David A Winkler, Irene Yarovsky.
Abstract
Preventing biological contamination (biofouling) is key to the successful development of novel surface and nanoparticle-based technologies in the manufacturing industry and biomedicine. Protein adsorption is a crucial mediator of the interactions at the bio-nano-materials interface but is not well understood. Although general, empirical rules have been developed to guide the design of protein-resistant surface coatings, they are still largely qualitative. Herein we demonstrate that this knowledge gap can be addressed by using machine learning approaches to extract quantitative relationships between the material surface chemistry and the protein adsorption characteristics. We illustrate how robust linear and non-linear models can be constructed to accurately predict the percentage of protein adsorbed onto these surfaces using ...
Year: 2019 PMID: 30670792 PMCID: PMC6342937 DOI: 10.1038/s41598-018-36597-5
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Chemical structure of the self-assembled monolayers (SAMs).
The chemical structure of –R of the self-assembled monolayers (SAMs)[15].
| Entry | R | Entry | R | Entry | R | Entry | R |
|---|---|---|---|---|---|---|---|
| 1 | H2N(CH2)10CH3 | 13 | | 25 | H2N(Gly)3N(CH3)2 | 37 | |
| 2 | H2NCH2(CF2)6CF3 | 14 | | 26 | | 38 | |
| 3 | | 15 | H2N(CH2CH2O)2CH2CH2NH2 | 27 | H(CH3)N(Sar)3N(CH3)2 | 39 | HN(CH2CH2CN)2 |
| 4 | | 16 | | 28 | H(CH3)N(Sar)4N(CH3)2 | 40 | HN(CH2CN)2 |
| 5 | H2NCH2CH2OCH3 | 17 | | 29 | H(CH3)N(Sar)5N(CH3)2 | 41 | H2NCH2CH2CN |
| 6 | H2NCH2CH2OH | 18 | HN(CH3)2 | 30 | | 42 | |
| 7 | HN(CH2CH2OCH3)2 | 19 | | 31 | | 43 | |
| 8 | H2N(CH2CH2O)3CH3 | 20 | | 32 | | 44 | H2NC(CH2CH2CH2OH)3 |
| 9 | H2N(CH2CH2O)3H | 21 | | 33 | | 45 | |
| 10 | H2N(CH2CH2O)6CH3 | 22 | | 34 | | 46 | H(CH3)NCH2CH(OCH3)2 |
| 11 | H2N(CH2CH2O)6H | 23 | | 35 | | 47 | |
| 12 | | 24 | | 36 | | 48 | |
Figure 2. Scaled MLR coefficients for Whitesides rule descriptors to prevent protein adsorption. (A) Model using Hy parameter for hydrophilicity. (B) Model using AlogP for hydrophobicity.
Figure 3. The dependence of the standard error of prediction (SEP) on the number of descriptors for models constructed using the MLREM approach to prune out irrelevant descriptors. The red data point indicates the best model, with optimal sparsity.
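The idea behind Figure 3 (tracking SEP as descriptors are pruned) can be illustrated with a minimal sketch. This is not the MLREM algorithm, which is not reproduced in this record; as a stand-in it uses ordinary least squares with greedy backward elimination, dropping the descriptor with the smallest absolute coefficient at each step. The synthetic data, all function names, and the RMSE-style definition of SEP are assumptions made purely for illustration.

```python
import math
import random


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(X, y, cols):
    """Least-squares fit on the chosen columns; returns [intercept, w1, ...]."""
    rows = [[1.0] + [x[c] for c in cols] for x in X]
    k = len(cols) + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve_linear(xtx, xty)


def predict(w, X, cols):
    return [w[0] + sum(wi * x[c] for wi, c in zip(w[1:], cols)) for x in X]


def rmse(y, y_hat):
    # Assumed definition of the standard error: root-mean-square residual.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y))


def prune_path(X_tr, y_tr, X_te, y_te):
    """Return [(n_descriptors, SEP, cols)] as descriptors are dropped one by one."""
    cols = list(range(len(X_tr[0])))
    path = []
    while cols:
        w = fit_ols(X_tr, y_tr, cols)
        path.append((len(cols), rmse(y_te, predict(w, X_te, cols)), cols[:]))
        # Drop the descriptor with the smallest absolute weight
        # (meaningful only because the descriptors share a common scale).
        weakest = min(range(len(cols)), key=lambda i: abs(w[i + 1]))
        cols.pop(weakest)
    return path


# Hypothetical data: 6 unit-variance descriptors, only the first two matter.
rng = random.Random(0)


def make(n):
    X = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(n)]
    y = [3.0 * x[0] - 2.0 * x[1] + rng.gauss(0, 0.5) for x in X]
    return X, y


X_tr, y_tr = make(60)
X_te, y_te = make(30)
path = prune_path(X_tr, y_tr, X_te, y_te)
best = min(path, key=lambda p: p[1])  # sparsity level with the lowest SEP
for n_desc, sep, _ in path:
    print(n_desc, round(sep, 3))
```

In this toy setting the SEP stays roughly flat while irrelevant descriptors are removed and rises sharply once a relevant one is dropped, which is the qualitative shape a sparsity curve of this kind is meant to expose.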
Statistics of the optimal linear and non-linear models of protein adsorption (fibrinogen and lysozyme) on different surfaces at 3 and 30 minutes. (N is the number of effective weights (adjustable parameters) in the model).
| Modelling technique | N | Training set r² | Training set SEE [%] | Test set r² | Test set SEP [%] |
|---|---|---|---|---|---|
| MLREM | 27 | 0.81 | 13 | 0.78 | 14 |
| BRANNGP* | 28 | 0.82 | 12 | 0.76 | 14 |
| BRANNGP# | 35 | 0.84 | 10 | 0.79 | 14 |
*BRANNGP model built using the entire pool of 67 descriptors.
#BRANNGP model built using 25 descriptors selected by MLREM.
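As a rough guide to how statistics of this kind are typically computed, the sketch below evaluates a goodness-of-fit value and a standard error from observed and predicted values. It assumes the fit-quality columns (values such as 0.81 and 0.78) are squared correlation-type coefficients, r² = 1 − SS_res/SS_tot, and that the standard error is a root-mean-square residual in the units of the response (% of a monolayer). The exact definitions used in the paper are not given in this record, and the numbers in the example are hypothetical.

```python
import math


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def standard_error(y_true, y_pred, n_params=0):
    """Root-mean-square residual.

    n_params > 0 applies a degrees-of-freedom correction (an assumption;
    the paper's convention is not stated in this record).
    """
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(ss_res / (len(y_true) - n_params))


# Hypothetical observed vs. predicted adsorption values (% monolayer).
observed = [10.0, 40.0, 60.0, 90.0]
predicted = [20.0, 30.0, 70.0, 80.0]

r2 = r_squared(observed, predicted)
se = standard_error(observed, predicted)
print(round(r2, 3), se)
```

Applied to a training set the second function gives an SEE-style value; applied to held-out data it gives an SEP-style value.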
Figure 4. Prediction of the best MLREM model of percentage protein monolayer coverage on SAMs (%ML). Training set (grey circles) and test set (black triangles).
Figure 5. Scaled MLR coefficients of the most relevant descriptors selected from the pool of 67 descriptors.
The most relevant descriptors selected by MLREM and their contributions to the model predicting %ML. The descriptors are listed in order from most negative to most positive contribution.
| Descriptor | Definition | Type | Contribution |
|---|---|---|---|
| AMR | Ghose-Crippen molar refractivity | Continuous | −60 |
| RGyr | radius of gyration (mass weighted) | Continuous | −52 |
| Ui | unsaturation index | Continuous | −19 |
| nR06 | number of 6-membered rings | Integer | −19 |
| RBF | rotatable bond fraction | Integer | −17 |
| Hy | hydrophilic factor | Continuous | −9 |
| C-026 | number of R–CX–R | Integer | 5 |
| ProteinType | protein type indicator | Integer | 5 |
| nCp | number of terminal primary C(sp3) | Integer | 6 |
| N-068 | number of Al3-N fragments | Integer | 7 |
| C-041 | number of X-C(=X)-X fragments | Integer | 8 |
| Time | time scale indicator | Integer | 8 |
| N-066 | number of Al-NH2 fragments | Integer | 11 |
| N-067 | number of Al2-NH fragments | Integer | 12 |
| ALOGP | Ghose-Crippen octanol-water partition coeff. (logP) | Continuous | 15 |
| nCrs | number of ring secondary C(sp3) | Integer | 25 |
| N-074 | number of R≡N / R=N- fragments | Integer | 27 |
| C-002 | number of CH2R2 fragments | Integer | 28 |
| nROR | number of ethers (aliphatic) | Integer | 32 |
| C-006 | number of CH2RX fragments | Integer | 34 |
| nOHs | number of secondary alcohols | Integer | 36 |
| RBN | number of rotatable bonds | Integer | 51 |
| O-058 | number of O= | Integer | 67 |
| ARR | aromatic ratio | Integer | 71 |
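For readers who want to work with these contributions directly, the following sketch transcribes the table's contribution column into a dictionary and ranks the descriptors. The variable names are illustrative. The interpretation in the comments (that, in a linear model of %ML, a negative contribution lowers the predicted coverage as the descriptor increases) follows from the model form only and is not a statement taken from the paper.

```python
# Contribution values transcribed from the table above (ASCII minus signs).
contributions = {
    "AMR": -60, "RGyr": -52, "Ui": -19, "nR06": -19, "RBF": -17, "Hy": -9,
    "C-026": 5, "ProteinType": 5, "nCp": 6, "N-068": 7, "C-041": 8, "Time": 8,
    "N-066": 11, "N-067": 12, "ALOGP": 15, "nCrs": 25, "N-074": 27,
    "C-002": 28, "nROR": 32, "C-006": 34, "nOHs": 36, "RBN": 51,
    "O-058": 67, "ARR": 71,
}

# Descriptors whose increase lowers the predicted %ML, strongest first.
lowering = sorted(
    (name for name, c in contributions.items() if c < 0),
    key=contributions.get,
)

# Descriptors whose increase raises the predicted %ML, strongest first.
raising = sorted(
    (name for name, c in contributions.items() if c > 0),
    key=contributions.get,
    reverse=True,
)

# Overall ranking by magnitude, regardless of sign.
ranked = sorted(contributions, key=lambda n: abs(contributions[n]), reverse=True)

print("lowering:", lowering)
print("raising:", raising[:5])
print("largest magnitude:", ranked[:3])
```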
Figure 6. Scaled MLR coefficients of descriptors in the updated model.
Figure 7. The workflow and the corresponding evolution of the model performance.