Sara Khabazbashi, Josefin Engelhardt, Claudia Möckel, Jana Weiss, Anneli Kruve.
Abstract
Hydroxylated PCBs (OH-PCBs) are an important class of metabolites of the widely distributed environmental contaminants polychlorinated biphenyls (PCBs). However, the absence of authentic standards often limits their detection, identification, and quantification. Recently, new strategies have emerged to quantify compounds detected with non-targeted LC/ESI/HRMS based on predicted ionization efficiency values. Here, we evaluate the impact of chemical space coverage and sample matrix on the accuracy of ionization efficiency-based quantification. We show that extending the chemical space of interest is crucial in improving the performance of quantification. Therefore, we extend the ionization efficiency-based quantification approach to hydroxylated PCBs in serum samples with a retraining approach that involves 14 OH-PCBs and validate it with an additional four OH-PCBs. The predicted and measured ionization efficiency values of the OH-PCBs agreed within a mean error of 2.1× and enabled quantification with a mean error of 4.4× or better. We observed that the error arose mostly from the ionization efficiency predictions, whereas matrix effects, which varied from 37 to 165%, were of less importance. The results show that there is potential for predictive machine learning models for quantification even in very complex matrices such as serum. Further, retraining the already developed models provides a timely and cost-effective solution for extending the chemical space of the application area.
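The errors quoted in the abstract are fold errors ("2.1×", "4.4×"). The following is a minimal sketch of how such a metric can be computed, assuming the fold error is the larger of predicted/measured and measured/predicted and that the mean is a plain arithmetic mean; the abstract does not spell out the exact definition, and the function names and example values are hypothetical.

```python
def fold_error(predicted: float, measured: float) -> float:
    """Fold error (>= 1) between two positive values, e.g. concentrations."""
    if predicted <= 0 or measured <= 0:
        raise ValueError("both values must be positive")
    ratio = predicted / measured
    # Over- and under-prediction are treated symmetrically.
    return max(ratio, 1.0 / ratio)


def mean_fold_error(predicted, measured) -> float:
    """Arithmetic mean of the per-compound fold errors (assumed aggregation)."""
    errors = [fold_error(p, m) for p, m in zip(predicted, measured)]
    if not errors:
        raise ValueError("at least one compound is required")
    return sum(errors) / len(errors)


def fold_error_from_log(log_predicted: float, log_measured: float) -> float:
    """Same metric when the values are given on a log10 scale, such as logIE."""
    return 10 ** abs(log_predicted - log_measured)


# Hypothetical concentrations, for illustration only (not data from the study).
predicted_conc = [2.0, 5.0, 1.0]
measured_conc = [1.0, 10.0, 1.0]
print(mean_fold_error(predicted_conc, measured_conc))
```

On a log scale a difference of one logIE unit corresponds to a tenfold error, which is why the log-based helper takes the absolute difference before exponentiating.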
Keywords: Ionization efficiency; LC/HRMS; Machine learning; Matrix effect; Non-targeted screening; Quantification
Year: 2022 PMID: 35507099 PMCID: PMC9482908 DOI: 10.1007/s00216-022-04096-2
Source DB: PubMed Journal: Anal Bioanal Chem ISSN: 1618-2642 Impact factor: 4.478
Fig. 1 (a) The logIE prediction accuracy for OH-PCBs with the model previously trained by Liigand et al. [20] and with the updated, retrained model. For the updated model, the data added to the model are shown in blue and the OH-PCBs in the test set are shown in violet. (b) A bee swarm plot of the cosine similarity of the OH-PCBs to the dataset used for the original model training by Liigand et al. [20]
Fig. 2 The root mean square prediction error for OH-PCBs (purple dots) depending on the number of OH-PCBs added to the model training set. The grey ribbon shows the 95th percentile of the cosine similarity between the OH-PCBs in the test set and all compounds in the training set. The upper and lower bounds of the ribbon show the highest and lowest similarities depending on the sampling of the OH-PCBs to the training set. The upper and lower panels differ in the size of the starting training set, N = 40 and N = 80, taken as the basis before adding any OH-PCBs
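The similarity measure in the Fig. 2 caption can be illustrated with a short sketch. It assumes each compound is represented by a numeric descriptor vector and that the percentile is taken over all test-versus-training pairs; the captions do not state which descriptors were used or how the similarities were aggregated, so the function names and the aggregation here are assumptions.

```python
import math


def cosine_similarity(u, v) -> float:
    """Cosine of the angle between two descriptor vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def percentile(values, q: float) -> float:
    """Percentile with linear interpolation between ranks, q in [0, 100]."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("at least one value is required")
    rank = (len(ordered) - 1) * q / 100.0
    lower = math.floor(rank)
    upper = math.ceil(rank)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def similarity_percentile(test_vectors, training_vectors, q: float = 95.0) -> float:
    """q-th percentile of all test-vs-training cosine similarities (assumed aggregation)."""
    similarities = [
        cosine_similarity(t, r) for t in test_vectors for r in training_vectors
    ]
    return percentile(similarities, q)
```

A value close to 1 would indicate that the training set already contains compounds whose descriptors resemble those of the test compounds.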
Fig. 3 The matrix effect in LC/ESI/HRMS for the studied OH-PCBs that depended significantly on (a) the serum volume taken for extraction (panels for 1.4 mL, 2 mL, and 4 mL) and (b) the injection volume (5 µL, 10 µL). 2-OH-CB77, 4′-OH-CB159, and 4,4′-diOH-CB80, shown in panel a, showed statistically significant variability with both serum volume and injection volume
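Matrix effects are reported here as percentages (37 to 165% in the abstract). A minimal sketch of one common convention follows, assuming the matrix effect is the response in matrix divided by the response in neat solvent, times 100; the text does not give the exact formula used in the study, and the function names are hypothetical.

```python
def matrix_effect_percent(signal_in_matrix: float, signal_in_solvent: float) -> float:
    """Matrix effect in percent: 100 means the matrix has no effect on the signal."""
    if signal_in_solvent <= 0:
        raise ValueError("solvent signal must be positive")
    return 100.0 * signal_in_matrix / signal_in_solvent


def describe_matrix_effect(percent: float) -> str:
    """Label the direction of the effect under the convention above."""
    if percent < 100.0:
        return "ionization suppression"
    if percent > 100.0:
        return "ionization enhancement"
    return "no matrix effect"


# The extremes quoted in the abstract, interpreted under this convention.
for value in (37.0, 165.0):
    print(value, describe_matrix_effect(value))
```

Under this convention the lower bound of 37% would correspond to suppression and the upper bound of 165% to enhancement.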