
Physicochemical properties determining drug detection in skin.

Wout Bittremieux, Rohit S Advani, Alan K Jarmusch, Shaden Aguirre, Aileen Lu, Pieter C Dorrestein, Shirley M Tsunoda.

Abstract

Chemicals, including some systemically administered xenobiotics and their biotransformations, can be detected noninvasively using skin swabs and untargeted metabolomics analysis. We sought to understand the principal drivers that determine whether a drug taken orally or systemically is likely to be observed on the epidermis by using a random forest classifier to predict which drugs would be detected on the skin. A variety of molecular descriptors describing calculated properties of drugs, such as measures of volume, electronegativity, bond energy, and electrotopology, were used to train the classifier. The mean area under the receiver operating characteristic curve was 0.71 for predicting drug detection on the epidermis, and the SHapley Additive exPlanations (SHAP) model interpretation technique was used to determine the most relevant molecular descriptors. Based on the analysis of 2561 US Food and Drug Administration (FDA)-approved drugs, we predict that therapeutic drug classes, such as nervous system drugs, are more likely to be detected on the skin. Detecting drugs and other chemicals noninvasively on the skin using untargeted metabolomics could be a useful clinical advancement in therapeutic drug monitoring, adherence, and health status.
© 2021 The Authors. Clinical and Translational Science published by Wiley Periodicals LLC on behalf of the American Society for Clinical Pharmacology and Therapeutics.

Year:  2021        PMID: 34793633      PMCID: PMC8932847          DOI: 10.1111/cts.13198

Source DB:  PubMed          Journal:  Clin Transl Sci        ISSN: 1752-8054            Impact factor:   4.689


WHAT IS THE CURRENT KNOWLEDGE ON THE TOPIC?
A multitude of chemicals that an individual encounters in daily life can be detected on the skin surface using untargeted metabolomic analysis, including topical and systemically administered xenobiotics.

WHAT QUESTION DID THIS STUDY ADDRESS?
Can machine learning be used to predict whether systemically administered drugs are observed on the epidermis and provide insights into the complex underlying biochemical processes?

WHAT DOES THIS STUDY ADD TO OUR KNOWLEDGE?
Our machine‐learning model found relevant molecular descriptors related to volume, electronegativity, bond energy, and electrotopology to be strong predictors of drug observance on the epidermis. Our model predicted that certain categories of drugs, such as those affecting the nervous system, are more likely to be observed.

HOW MIGHT THIS CHANGE CLINICAL PHARMACOLOGY OR TRANSLATIONAL SCIENCE?
Our machine‐learning model demonstrated several physicochemical properties of drugs that predict detection on noninvasively obtained skin swabs. Using skin swabs may be a paradigm shift in how we monitor drugs, measure drug adherence, and monitor health and disease noninvasively.

INTRODUCTION

The skin provides a physical and chemical barrier to environmental insults and supports immunological function and thermoregulation. Additionally, the bacteria, viruses, and fungi that comprise the skin microbiome provide an essential function in protection against microbial pathogens, educating the immune system, and breaking down products. Traditionally, topical formulations of drugs are desired in certain medical conditions to either deliver drugs from the skin to the systemic circulation (e.g., transdermal scopolamine) or to deliver drugs locally to the skin and minimize the systemic toxicity of these drugs (e.g., topical corticosteroids). Interestingly, a recent study demonstrated systemic concentrations above the US Food and Drug Administration (FDA) safety threshold of the sunscreen compounds avobenzone, oxybenzone, and octocrylene up to 21 days postadministration, despite the widespread assumption that these commonly used topical products are considered “safe.” Skin permeation by xenobiotics has been investigated for many years. The majority of drugs permeate across the bulk of the epidermis, along a concentration gradient, in three pathways—intracellular, intercellular, and follicular—with the intercellular pathway believed to provide the principal route for drug permeation. Several mechanistic studies have shown that extracellular lacunar domains comprise a pore pathway for penetration of polar and nonpolar molecules across the stratum corneum. Following penetration across the stratum corneum, drugs diffuse across the viable epidermis and dermis and are carried away into the bloodstream by the capillaries of the dermis. Ideal drug candidates for permeation through the skin include those with low molecular weight, solubility in water and oils to achieve an appropriate concentration gradient, an elevated but balanced partition coefficient, and a low melting point for solubility purposes. 
Currently, there exist several models to predict skin permeability and data resources of experimental skin permeation values and their corresponding protocols. In contrast, less is known about “inverse penetration”: drugs moving from the systemic circulation to the epidermis. Patzelt et al. suggested five possible inverse penetration pathways: the (i) intracellular, (ii) intercellular, and (iii) follicular pathways, similar to those of topically administered substances, as well as inverse penetration (iv) via sweat or (v) via the desquamation process. Based on a literature study of 11 systemically administered substances and their recovery in the skin, they concluded that lipophilic substances predominantly reach the skin surface via the sebum, whereas hydrophilic substances utilize the sweat for delivery to the skin surface. Inverse cellular penetration and desquamation, which occurs on longer time scales, are less relevant, and no indication of inverse intracellular penetration was found in the literature. Concentrations on the epidermis of a few systemically administered compounds have been reported in the literature. For example, the antifungal agent fluconazole was detected in significantly higher concentrations in the stratum corneum than in plasma and for a longer duration after cessation of therapy. Similarly, other systemically administered antifungal agents were detected in high concentrations on the skin and exhibited slow clearance from both skin and nails. Comprehensive knowledge of the underlying mechanisms is relevant in the dermatological field, as many pharmaceuticals are administered systemically to treat skin disorders. It would also enable important applications of noninvasive xenobiotic skin detection: determining drug adherence, therapeutic drug monitoring, gauging the extent of metabolism, and assessing organ and health status.
Skin swab samples have previously been used to determine individual skin chemistry profiles, including topically applied chemicals, such as avobenzone and octocrylene, as well as others found in soap, lotions, cosmetics, and anti‐mosquito sprays and lotions. Furthermore, we have recently demonstrated that systemically administered drugs, such as citalopram, diphenhydramine, and the N‐acetyl metabolite of sulfamethoxazole, can be detected in skin swab samples of the hands, forearm, forehead, and axilla. Using untargeted metabolomics and analyzing these data with the Global Natural Products Social Molecular Networking (GNPS) infrastructure, we detected these compounds on the epidermis of patients who were prescribed these drugs, thus concluding that systemically administered drugs can be detected on the skin surface. Additionally, our recent study in healthy humans demonstrated a delayed time course between plasma and skin concentrations of diphenhydramine and its metabolites ranging from 1.5 to 10 h (https://biorxiv.org/cgi/content/short/2021.11.22.469638v1). The full mechanism and pathways of chemicals and drugs moving from the systemic circulation to the epidermis are unknown. Moreover, not all xenobiotics can be detected on the skin. A notable example is the immunosuppressive drug tacrolimus, which was not detected in our skin swab samples. We sought to understand the physicochemical and pharmacokinetic properties that allow some systemically administered drugs, but not others, to be detected on the epidermis. Using existing skin swab data, we trained a random forest classifier that is able to accurately predict whether a compound will be observed on the epidermis.

METHODS

Data origin

No human subjects were recruited for this study and all data were assessed retrospectively. All data were anonymous and obtained from open mass spectrometry data contained in GNPS and ReDU. GNPS is a public data repository and analysis infrastructure for untargeted metabolomics data. Analysis tools available on GNPS include spectral library searching, molecular networking to compute spectral similarities between tandem mass spectrometry (MS/MS) spectra and detect related compounds, and MASST to query MS/MS spectra against all public metabolomics data and associated sample information to investigate their context. ReDU is an associated system to capture metadata of public data in GNPS using validated controlled vocabularies (redu.ucsd.edu). To collect drugs that are observable on the epidermis, MS data and associated metadata were selected using ReDU (March 24, 2019) by filtering for files that were annotated as pertaining to human skin samples using the Uberon ontology of anatomy terms (Table S1). This resulted in a list of 5629 files from a heterogeneous set of 20 previously performed studies with data deposited to GNPS (Table S2). Additionally, prescription records available in conjunction with data from a previous kidney transplant study were used to define drugs which were prescribed to individuals but were not observed in skin samples in that study (GNPS/MassIVE dataset identifier MSV000081548). Skin samples were obtained from 15 individuals at two different clinic visits—without regard to timing with their medications—on 10 locations on the body (bilateral collection of the forehead, nasolabial area, axillary, backhand, and palm). The subjects of that study were prescribed many (>5) medications simultaneously. Of the 58 different medications in that study, 50 drugs were previously not detected in skin samples and offer “negative” examples for which we have experimental data. 
Negative examples may reflect either a lack of transport to the epidermis or a lack of detection due to sample preparation (e.g., some drugs might not be detected due to the chosen extraction conditions). The eight drugs or drug metabolites that were detected in skin swabs in that study are part of the presumed “positive” compounds that are observed on the epidermis. Further, these particular examples are supported with experimental data and matching prescription records (i.e., the drugs were detected in the subjects to whom they were prescribed).

Data processing

All peak files filtered through ReDU were analyzed using MS/MS library searching on GNPS against all available public spectral libraries (version 2.0; GNPS task ID: https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=53e265f8f6994f0196bf9bccd8d1b513). MS/MS library searching resulted in 175 drugs that were identified in the human skin files (level 2 annotation according to the Metabolomics Standards Initiative), filtered using a list of curated drugs and drug metabolites as they are recorded in the GNPS MS/MS reference libraries. Duplicate annotations were removed and drugs available in topical formulations were excluded, resulting in a final list of 95 compounds. Based on the empirical measurement of these drugs or drug metabolites in publicly available MS data, we presume that these 95 compounds are “positive” examples of drugs that appear on the epidermis. Combined, 145 unique compounds were retained for the machine learning: 95 positive examples and 50 negative examples. The full list of compounds and information on whether they were observed on the epidermis or not is available in Table S3.

Machine learning epidermis prediction

Feature generation and preprocessing

A random forest classifier was used to predict whether drugs are expected to be observed on the epidermis. First, Mordred, a cheminformatics software tool to efficiently compute a large variety of molecular descriptors, was used to generate molecular descriptors for all 145 compounds, such as calculated measures of volume, electronegativity, bond energy, and electrotopology (see Table S4 for relevant descriptor examples). Molecular descriptors that were missing for one or more drugs were omitted, resulting in a feature table consisting of 929 unique descriptors per compound. Next, a classification pipeline was built to predict the probability of observing a drug on the epidermis. The classification pipeline consisted of preprocessing steps to remove irrelevant features and a random forest classifier. Preprocessing steps included removing all features whose variance was below 0.05 and, for each pair of features whose Pearson correlation exceeded 0.95, removing one of the two.
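The two preprocessing filters can be illustrated with a minimal, dependency-free sketch. This is not the authors' pipeline code: the function names, the toy descriptor table, the use of population variance, the use of the absolute correlation, and the rule of keeping the first member of a correlated pair are all assumptions made for illustration.

```python
from math import sqrt


def variance(values):
    """Population variance of a list of numbers (assumed variant)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either column is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def filter_features(columns, min_variance=0.05, max_corr=0.95):
    """columns: dict mapping descriptor name -> list of values (one per compound).

    Returns the names kept after (1) dropping descriptors whose variance is
    below min_variance and (2) dropping the later member of every pair whose
    absolute Pearson correlation exceeds max_corr (assumed tie-breaking rule).
    """
    kept = [name for name, vals in columns.items() if variance(vals) >= min_variance]
    selected = []
    for name in kept:
        if all(abs(pearson(columns[name], columns[s])) <= max_corr for s in selected):
            selected.append(name)
    return selected


# Toy descriptor table with hypothetical names and values (not study data).
toy = {
    "volume": [1.0, 2.0, 3.0, 4.0],
    "volume_scaled": [2.0, 4.0, 6.0, 8.1],   # nearly collinear with "volume"
    "constant_like": [1.0, 1.0, 1.0, 1.01],  # variance far below 0.05
    "charge": [0.5, -1.0, 2.0, 0.0],
}
print(filter_features(toy))  # ['volume', 'charge']
```

In this toy table the near-constant descriptor is removed by the variance filter and the rescaled copy of "volume" is removed by the correlation filter, leaving two descriptors.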

Random forest classifier training and evaluation

A random forest classifier using 1000 trees was trained to predict the epidermis probability. Evaluation and hyperparameter tuning of the classification pipeline were done using nested cross‐validation. Two levels of stratified shuffle splitting consisting of 100 iterations of random splitting in 80% training data and 20% test data were performed. In the inner cross‐validation loop, 10 iterations of randomized searching were used for hyperparameter optimization of the random forest. The following different hyperparameters were evaluated: tree depth (i.e., the longest path from the root node to the leaf nodes) between 5 and 9 (inclusive), minimum number of samples per leaf node between 1 and 9 (inclusive), and minimum number of samples to split an internal node between 2 and 9 (inclusive). The random forest classifier with optimal hyperparameters was subsequently evaluated in the outer cross‐validation loop. The number of features retained in the final classification pipeline after removing uninformative features was 287, and trees with depth eight, minimum two samples per leaf node, and minimum two samples to split a node were most frequently found to be optimal. For each split, the balanced accuracy, true positive rate, false positive rate, and precision were computed for both the training data and test data. Model performance was assessed based on the receiver operating characteristic (ROC) curve and precision–recall curve.
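The splitting and search scheme described above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the code used in the study: the helper names are hypothetical, the per-class rounding rule is an assumption, and the dictionary keys merely follow scikit-learn's naming for the three tuned hyperparameters.

```python
import random


def stratified_shuffle_split(labels, n_splits=100, test_fraction=0.2, seed=0):
    """Yield (train_idx, test_idx) pairs that keep the class ratio in both parts."""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    for _ in range(n_splits):
        train, test = [], []
        for indices in by_class.values():
            shuffled = indices[:]
            rng.shuffle(shuffled)
            n_test = round(len(shuffled) * test_fraction)  # assumed rounding rule
            test.extend(shuffled[:n_test])
            train.extend(shuffled[n_test:])
        yield sorted(train), sorted(test)


def sample_hyperparameters(rng):
    """Draw one candidate from the ranges listed in the text (inclusive bounds)."""
    return {
        "max_depth": rng.randint(5, 9),
        "min_samples_leaf": rng.randint(1, 9),
        "min_samples_split": rng.randint(2, 9),
    }


# Label vector with the class sizes from the text: 95 positives, 50 negatives.
labels = [1] * 95 + [0] * 50
train_idx, test_idx = next(stratified_shuffle_split(labels, n_splits=1))
print(len(train_idx), len(test_idx))  # 116 29

rng = random.Random(0)
candidates = [sample_hyperparameters(rng) for _ in range(10)]  # 10 random-search draws
print(candidates[0])
```

In a nested scheme, each outer training set would itself be split again in the same way, the sampled candidates would be scored on the inner test folds, and only the best candidate would be evaluated on the outer test fold.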

SHapley Additive exPlanations model interpretability

Important features for epidermis prediction were determined using SHapley Additive exPlanations (SHAP), a model interpretability method founded in game theory. Briefly, SHAP explains machine‐learning predictions by using interpretable local models to approximate a complex black box model. Kernel SHAP was used to explore the trained classification pipeline. To determine the important features, 50 training samples determined by K‐means clustering, with the cluster centroids weighted by the number of samples assigned to them, were used as the background dataset. To investigate the features of importance of individual compounds, if they were part of the training dataset, the random forest classifier with optimal hyperparameters was retrained using a leave‐one‐out strategy prior to SHAP analysis.
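To make the quantity that SHAP estimates more concrete, the sketch below computes exact Shapley values by enumerating all feature coalitions for a tiny, hypothetical scoring function. It is not Kernel SHAP, which approximates these values by sampling coalitions and fitting a weighted linear model, and it uses a single background point instead of the 50 weighted K‐means centroids described above. The model, feature names, and input values are invented for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, background):
    """Exact Shapley values of predict at x, relative to one background point.

    Features outside a coalition are set to their background value.
    """
    n = len(x)
    values = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without_i = [x[j] if j in subset else background[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                values[i] += weight * (predict(with_i) - predict(without_i))
    return values


def toy_model(features):
    """Hypothetical scoring function standing in for a trained classifier."""
    volume, charge, radius = features
    return 0.5 - 0.2 * volume + 0.1 * charge * radius


x = [1.0, 2.0, 3.0]
background = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, background)
print(phi)
# Efficiency property: the contributions sum to f(x) - f(background).
print(sum(phi), toy_model(x) - toy_model(background))
```

Here the interaction term between the second and third features is shared equally between them, and the first feature receives a negative contribution, mirroring how positive and negative SHAP values push a prediction toward or away from detection.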

FDA‐approved drugs and biotransformations

Drug names, SMILES representations, and Anatomic Therapeutic Chemical (ATC) codes for 2561 FDA‐approved drugs were retrieved from DrugBank (version 5.1.7) on December 23, 2020. Mordred was used to generate the same features for these drugs as used during model training, and the probability of observing these drugs on the epidermis was determined using the trained classification pipeline. Additionally, potential biotransformation products of the drugs were generated using the BioTransformer tool. The human super transformer mode, which combines an Enzyme Commission‐based transformer, a CYP450 (phase I) transformer, a phase II transformer, and a human gut microbial transformer, was used to predict potential biotransformation products after a single transformation step. This resulted in 23,693 putative biotransformation metabolites derived from the FDA‐approved drugs, for which the probability of observing them on the epidermis was similarly predicted using the trained classification pipeline.

Code availability

All analyses were performed in Python 3.8. RDKit (version 2020.09.3) and Mordred (version 1.2.0) were used to generate molecular descriptors. A GPU‐accelerated version of the random forest algorithm, available as part of the cuML library (version 0.18.0) was used in combination with Scikit‐Learn (version 0.24.1) for data preprocessing and model evaluation. SHAP (version 0.39.0) was used to compute the features of importance. BioTransformer (version 2.0.1) was used to generate biotransformation products. Additionally, NumPy (version 1.20.1), SciPy (version 1.6.0), and Pandas (version 1.1.5) were used for scientific computing, and matplotlib (version 3.3.4) and Seaborn (version 0.11.1) were used for visualization purposes. All code is available at https://github.com/bittremieux/drugs_epidermis as open source under the permissive BSD license.

RESULTS

Occurrence of drugs on the epidermis

Based on the rich metadata associated with the MS/MS data, we were able to select 5629 MS/MS peak files that contain samples collected from human body sites from 20 publicly available datasets (Table S2). These data originate from a variety of studies, including, for example, a 3D molecular cartography of the human skin study in which paired MS and sequencing data were collected to investigate the spatial relationships of human skin with hygiene, the microbiota, and the environment (MSV000078556); a study in which skin samples were obtained from healthy human volunteers who were given single doses of caffeine, omeprazole, midazolam, and dextromethorphan on 2 separate days (8 days apart) and a 7‐day course of cefprozil (MSV000082493); and the kidney transplant study described in more detail above (MSV000081548). For our secondary analysis, we extracted 145 curated drugs from these data to build a machine‐learning model to predict whether drugs occur on the epidermis. Additionally, based on the Uberon anatomy ontology, these drugs were mapped to the body site on which they were detected (Figure 1). The different rates of drug occurrence throughout the body suggest that detection of chemicals and xenobiotics differs between skin sites. As an example, our previous study showed that the N‐acetyl metabolite of sulfamethoxazole was detected in armpit skin samples but not in other skin sites sampled, such as forehead, palms, and forearm. More polar compounds may be more likely to be detected in more aqueous areas of the skin where sweat is more concentrated, such as the armpit.
FIGURE 1

Body sites of the drugs found through spectral library searching. Body sites for the identified drugs were retrieved from the Uberon annotations specified in ReDU, and drug counts per body site were normalized by the total number of ReDU entries for each body site


Machine learning to predict whether drugs occur on the epidermis

Using a random forest classifier, we were able to predict whether drugs will be observed on the epidermis with an area under the ROC curve (AUC) obtained during cross‐validation of 0.71 ± 0.10 (Figure 2) and an area under the precision–recall curve of 0.82 ± 0.07 (Figure S1). As an example, two drugs that were present in our data and that were previously reported to be present on the skin, the antimycotics fluconazole and ketoconazole, are strongly predicted to be observed on the skin. This performance indicates that machine learning can be used to successfully approximate the complex underlying biochemical processes leading to drugs being observed on the epidermis. As we were constrained by the limited availability of ground truth data in this study, we hypothesize that as more training data become available it will be possible to produce even more accurate machine‐learning models (Figure S2).
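As a reminder of what a mean AUC with a spread over splits summarizes, the following sketch computes the AUC of each split as the probability that a randomly chosen positive compound receives a higher score than a randomly chosen negative one, and then aggregates across splits. The labels and scores are hypothetical, and the use of the sample standard deviation is an assumption; the source does not state which variant was reported.

```python
from statistics import mean, stdev


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels and predicted scores for two test splits (not study data).
splits = [
    ([1, 1, 1, 0, 0], [0.9, 0.8, 0.4, 0.5, 0.2]),
    ([1, 1, 0, 0, 0], [0.7, 0.2, 0.65, 0.3, 0.1]),
]
aucs = [roc_auc(y, s) for y, s in splits]
print(aucs)
print(mean(aucs), stdev(aucs))  # mean and sample standard deviation over splits
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of detected and undetected compounds.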
FIGURE 2

ROC curve indicating the performance of the random forest classifier to predict whether drugs can be observed on the epidermis. The curve is the mean ROC curve over 100 random stratified training (80% of the data) and test (20% of the data) splits. The standard deviation over the splits is indicated by the shaded area. The mean AUC is 0.707, with a standard deviation of 0.095. AUC, area under the curve; ROC, receiver operating characteristic

We tried to gain insight into the molecules’ physical properties that result in drugs being present on the epidermis. The SHAP model interpretation technique was used to determine the most relevant features, consisting of molecular descriptors generated by Mordred, for the classifier performance (Figure 3, Table S4). The top‐ranked features are computed measures of volume (ATSC7v), electronegativity (PEOE_VSA1, PEOE_VSA9), bond energy (ATSC6d), and electrotopology (EState_VSA1). By investigating the SHAP values for individual features, we can derive that in general smaller compounds (Van der Waals volume) with a smaller bonding potential (electronegativity) are more likely to be observed on the epidermis. We can hypothesize that through heterogeneous biochemical processes such molecules diffuse faster and thus will be secreted to the epidermis.
FIGURE 3

SHAP features of importance for the top 20 most important Mordred features from the random forest classifier for the 145 training compounds. A positive SHAP feature importance contributes to drugs predicted to appear on the epidermis, whereas a negative SHAP feature importance contributes to drugs predicted to not appear on the epidermis. The top‐ranked features capture information about the volume, electronegativity, bond energy, and electrotopology of the molecules. See Table S4 for a full description of the Mordred features. SHAP, SHapley Additive exPlanations

Additionally, SHAP can be used to interpret predictions for individual drugs. The antihistamine drug diphenhydramine was experimentally observed on the epidermis in a previous healthy human clinical study (https://biorxiv.org/cgi/content/short/2021.11.22.469638v1). Using a leave‐one‐out training strategy to not bias the classifier, it was also strongly predicted to be present on the epidermis (Figure 4a). The most relevant features contributing to this prediction are its lack of accessible Van der Waals surface area with a low electrotopological state (EState_VSA1), a high atomic mass autocorrelation (ATSC4m), a small Van der Waals surface area with low partial charge (PEOE_VSA1), and a low autocorrelation weighted by sigma electrons (ATSC6d, ATSC7d). In contrast, although the related compound diphenhydramine N‐hexose is structurally similar, it is predicted to not appear on the epidermis (Figure 4b), in part because of its increased accessible Van der Waals surface area with a low electrotopological state (EState_VSA1), as well as its higher autocorrelation weighted by Van der Waals volume (ATSC7v), its high autocorrelation weighted by ionization potential (ATSC7i), and its high topological radius (Radius). This is consistent with our experimental results (https://biorxiv.org/cgi/content/short/2021.11.22.469638v1).
In a previous study, citalopram was detected in the skin samples of the only subject to whom it was prescribed. This empirical observation is confirmed by the machine‐learning model (Figure 4c), as citalopram is very strongly predicted to be observed on the epidermis due to its lack of accessible Van der Waals surface area with a low electrotopological state (EState_VSA1), its low topological charge (GGI10), and its low autocorrelation weighted by valence electrons (ATSC7dv). Conversely, tacrolimus is very strongly predicted to not appear on the epidermis (Figure 4d), primarily due to its high number of double bonds (nBondsD), its high number of Kier–Hall dssC atom types (motif “C(=[*])([*])[*]”), and its high autocorrelation weighted by ionization potential (ATSC5i, ATSC7i, and ATSC8i). This prediction matches its absence in the skin samples of 14 subjects who were prescribed tacrolimus. This analysis demonstrates how machine‐learning techniques can be used to obtain insights into the complex internal biochemical mechanisms that lead systemically administered drugs to be observed on the epidermis.
FIGURE 4

Force plots of the SHAP values to interpret predictions of individual drugs. The most important features, their values, and the direction in which they contribute to the predictions (higher/red: observed, lower/blue: not observed) are displayed. The horizontal axis shows the model probability, with the prediction score indicated by “f(x).” Scores above the expected value based on the training data (“base value”) constitute positive predictions, and scores below the expected value constitute negative predictions. The size of the bars for individual features indicates their magnitude contributing to a positive or negative prediction. (a) Diphenhydramine is predicted to be observed on the epidermis. (b) Diphenhydramine N‐hexose is predicted to not be observed on the epidermis. (c) Citalopram is predicted to be observed on the epidermis. (d) Tacrolimus is predicted to not be observed on the epidermis. SHAP, SHapley Additive exPlanations


Exploring presence of drugs and related biotransformations on the epidermis

To expand our knowledge of the variety of drugs that are likely to be observed on the epidermis beyond the training data consisting of 145 drugs, we retrieved 2561 FDA‐approved drugs from DrugBank. Furthermore, we utilized BioTransformer to predict potential biotransformation products of the FDA‐approved drugs, resulting in 23,693 putative biotransformation metabolites. These biotransformations include phase I metabolism products (e.g., Cytochrome P450), enzyme commission‐based metabolism products, phase II metabolism products (e.g., Uridine 5′‐diphospho‐glucuronosyltransferase), and gut microbial transformation products, and they cover a number of different reaction types, including hydrolysis, oxidation and reduction, and conjugation. The probability of observing the FDA‐approved drugs and their potential biotransformation products was predicted using the trained random forest model. To investigate whether specific types of drugs were more likely to occur on the epidermis, we grouped the drugs and the corresponding biotransformation products using the ATC Classification System (Figure 5). This indicates, for example, that hormonal preparations, such as corticosteroids, are least likely to be observed on the epidermis, whereas nervous system drugs, such as analgesics, antiepileptics, antidepressants, and antipsychotics are more likely to be detected on skin.
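The grouping step can be sketched as follows: the first letter of an ATC code identifies the anatomical main group, so predicted probabilities can be pooled per group and summarized by their median. The prediction scores below are invented purely for illustration and are not outputs of the model in this study; the function name and the small lookup table are likewise assumptions.

```python
from collections import defaultdict
from statistics import median

# First-level ATC groups needed for this toy example (abbreviated names).
ATC_LEVEL1 = {
    "H": "Systemic hormonal preparations",
    "J": "Antiinfectives for systemic use",
    "N": "Nervous system",
}


def median_score_by_atc(predictions):
    """predictions: iterable of (atc_code, predicted_probability) pairs."""
    groups = defaultdict(list)
    for atc_code, probability in predictions:
        groups[atc_code[0]].append(probability)  # first letter = anatomical main group
    return {ATC_LEVEL1.get(key, key): median(vals) for key, vals in groups.items()}


# Real ATC codes paired with made-up scores (hypothetical, not study results).
predictions = [
    ("N06AB04", 0.81),
    ("N06AA09", 0.74),
    ("H02AB07", 0.22),
    ("H02AB06", 0.30),
    ("J02AC01", 0.66),
]
summary = median_score_by_atc(predictions)
for group, score in sorted(summary.items()):
    print(group, score)
```

A drug with several ATC codes would contribute to several groups under this scheme; whether the study handled multi-code drugs that way is not stated in the text.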
FIGURE 5

Prediction scores for 2561 FDA approved drugs and their 23,693 biotransformations, subdivided by their drug class in the ATC classification system. ATC, Anatomic Therapeutic Chemical; FDA, US Food and Drug Administration


DISCUSSION

So far, little is known about which chemicals and drugs move from the systemic circulation to the epidermis. Here, we have demonstrated that machine learning can be used to gain insights into these complex processes for the first time. Using publicly available MS data, we have trained a random forest model to predict whether drugs will occur on the epidermis. Notably, the classifier correctly predicted the presence on the skin of the antimycotics fluconazole and ketoconazole, matching previous experimental results. To obtain insights into the complex processes that underlie reverse penetration of drugs to the epidermis, the SHAP model interpretability method was used to investigate which molecular descriptors are most relevant for prediction using the random forest. In general, we observe that smaller compounds with a smaller bonding potential are more likely to be observed on the epidermis. Although further studies are needed to fully understand the underlying biochemical processes, we hypothesize that through heterogeneous mechanisms such molecules diffuse faster and thus will be secreted to the epidermis. Additionally, we used SHAP to investigate predictions for drugs with a known experimentally derived ground truth. This demonstrates how detailed and individualized insights for specific drugs can be obtained to explore whether they will appear on the epidermis or not. Applying our random forest model to over 2500 FDA‐approved drugs and their biotransformations gives insight into additional drugs and their metabolites that may be detected on the skin surface. For those drugs with a low probability of skin detection, we hypothesize that either these drugs are fully processed within the body, rather than secreted to the epidermis, or their physicochemical properties prevent access to the skin surface. For example, a high degree of lipophilicity might prevent access to the skin surface, as the hydrophilic viable epidermis is a barrier to lipophilic substances. 
Conversely, highly hydrophilic compounds are also unlikely to be detected on the skin, as the hydrophobic stratum corneum is a barrier for hydrophilic substances. The variety of important physicochemical properties underlying the prediction of drug detection on the skin indicates that there is no single process for moving compounds from the systemic circulation to the epidermis, but rather that there are unique interplays between the specific drugs and their relevant transport pathways. Notably, median epidermis prediction values for the FDA‐approved drugs and their biotransformations in different ATC drug classes range from ~35% to ~60% of drugs in each category. There is substantial variation in predicted probability; we speculate that the specific physicochemical properties of the drugs, rather than their ATC class, drive this phenomenon. Nevertheless, broad generalizations can be made; for example, steroid hormones were predicted to not be detected on the epidermis, which is consistent with our experimental data for budesonide, fludrocortisone, prednisone, and prednisolone; whereas amitriptyline, citalopram, cyclobenzaprine, escitalopram, gabapentin, ketamine, nortriptyline, and venlafaxine were detected in our data, consistent with our model prediction for nervous system drugs (Table S4). An important caveat of the current work is that there was only a limited amount of heterogeneous data available for secondary analysis. Because the data were derived from various publicly available untargeted metabolomics studies for secondary analysis, no consistent experimental protocol was used. Additionally, prescribed drugs might go undetected on the skin not only because of a lack of transport from the systemic circulation to the epidermis, but also because of incompatibilities with sample preparation.
Although we achieved encouraging predictive performance, the learning curve indicates that the machine‐learning performance can be further improved by obtaining and incorporating more relevant experimental data. Ideally, this would include controlled measurements of commonly consumed drugs and other xenobiotics, covering both positive and negative examples of detection on the epidermis.
Our machine‐learning model is the first attempt to predict xenobiotic skin detection using physicochemical properties. There will likely be future iterations of this model as we advance our understanding of the complex processes governing molecular transport from the systemic circulation to the surface of the skin. The use of noninvasive skin swabs in clinical medicine could be a paradigm shift in how health and disease are monitored. Contemporary methods, such as blood draws and tissue biopsies, are invasive and inconvenient for patients. In the future, we envision the use of noninvasive skin sampling to determine drug adherence, perform therapeutic drug monitoring, gauge the extent of metabolism, and assess organ and health status.
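The per-drug SHAP attributions used in this discussion are built on Shapley values. As a minimal sketch of that underlying idea, the code below computes exact Shapley values by enumerating descriptor subsets. It is not the procedure used in the study: `toy_score` is a hypothetical stand-in for the trained random forest, the three scaled descriptors and the baseline are made up, and the SHAP library relies on much more efficient tree-specific algorithms.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one instance.

    Features outside a coalition are set to their baseline value, which is
    one common (assumed) way to define the value of a coalition.
    """
    n = len(instance)
    features = range(n)

    def value(coalition):
        x = [instance[i] if i in coalition else baseline[i] for i in features]
        return predict(x)

    attributions = []
    for i in features:
        others = [j for j in features if j != i]
        phi = 0.0
        for size in range(n):
            # Weight of coalitions of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                phi += weight * (value(members | {i}) - value(members))
        attributions.append(phi)
    return attributions


def toy_score(x):
    """Hypothetical score; inputs are [volume, bonding_potential, polarity]."""
    volume, bonding, polarity = x
    return 0.6 - 0.3 * volume - 0.2 * bonding + 0.1 * volume * polarity


drug = [0.2, 0.1, 0.5]     # hypothetical small, weakly bonding compound
average = [0.5, 0.5, 0.5]  # hypothetical baseline (an "average" drug)

phi = shapley_values(toy_score, drug, average)
for name, contribution in zip(["volume", "bonding_potential", "polarity"], phi):
    print(f"{name}: {contribution:+.3f}")
```

The attributions sum to the difference between the score for the drug and the score for the baseline, which is what allows a single prediction to be broken down into per-descriptor contributions.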

CONFLICT OF INTEREST

P.C.D. is on the scientific advisory boards of Sirenas, Cybele Microbiome, and Galileo, and is founder and scientific advisor of Ometa Labs LLC and Enveda. All other authors declare no competing interests for this work.

AUTHOR CONTRIBUTIONS

W.B., R.S.A., A.K.J., P.C.D., and S.M.T. wrote the manuscript. R.S.A., A.K.J., and S.M.T. designed the research. W.B., R.S.A., S.A., A.L., A.K.J., and S.M.T. performed the research. W.B., R.S.A., A.K.J., and S.M.T. analyzed the data. W.B. and R.S.A. contributed new reagents/analytical tools.

Supporting information: Figures S1 and S2; Tables S1 to S4.
