dc.description.abstract | AI Assistance for Chemical Analysis of PPCPs in Water and Wastewater: Highlights and Potentials Babak Kavianpour1, Farzad Piadeh1, Mohammad Gheibi2, Atiyeh Ardakanian1, Kourosh Behzadian1 1 School of Computing and Engineering, University of West London, St Mary’s Rd, London W5 5RF, UK 2 Association of Talent Under Liberty in Technology (TULTECH), Tallinn, Estonia E-mail contact: b.kavianpour@gmail.com 21570983@student.uwl.ac.uk 1. Introduction According to the benchmark for detection confidence using high-resolution mass spectrometry (HRMS) [1], a critical step in suspect or non-targeted analysis is to tentatively establish the presence of compounds by comparing HRMS data with spectral libraries and then to confirm them with reference standards. Challenges arise when many compounds are deemed to be detected or when reference materials are not readily available, especially for transformation or degradation products in complex environmental or wastewater matrices. To address this and to enhance confidence, artificial intelligence (AI) methods have been used to predict the retention time (RT) and collision cross section (CCS). However, models validated against real water and wastewater samples have not been sufficiently surveyed. This presentation focuses on selected features of AI-assisted chemical analysis of pharmaceuticals and personal care products (PPCPs) applicable to real water and wastewater matrices and highlights the potential for further research in this domain. 2. 
Materials and Methods This ongoing review highlights studies that integrate AI for the prediction of RT, i.e. quantitative structure-retention relationship (QSRR) modelling, and, more recently, CCS. Preference is given to models that, after conventional training and testing on a dataset of compounds selected from contaminants of emerging concern (CECs) and PPCPs, were externally tested with environmental water and wastewater samples. References were selected by title, abstract, and keywords, with priority given to the terms pharmaceuticals, PPCP, artificial intelligence, machine learning, water and wastewater, and emerging contaminants, supplemented by HRMS, tandem mass spectrometry, RT, and CCS. Fig. 1 illustrates the screening workflow of the references. Figure 1: Reference screening 3. Results and Discussion Across the 16 references, the models used between 5 [2] and 24 descriptors as input variables and RT or CCS as the response. The datasets used for model development ranged widely, from fewer than 100 compounds [3] to the whole METLIN databank with over 80,000 compounds [4]; however, the number of PPCPs in externally tested real samples that were tentatively identified with the help of the predictive models was only up to around 60 compounds [3]. Fig. 2 shows the AI methods applied to predict the RT and CCS of PPCPs. Figure 2: Applied AI methods The comparative performance of the models during training and testing shows that GA-optimised models, ensemble models, SVM, and ANFIS were more accurate than typical ANNs, MLRs and MLPs. Nonetheless, in some cases the initially less accurate models performed better in external testing. It should be noted that three- and four-layered MLP-ANN models have proven robust across several studies. Since the descriptors and the datasets vary among the studies, the comparison cannot be generalised to all cases. Different methods were applied to select descriptors, including PCA (statistical), KNN, and GA. 
Lipophilicity (expressed as logD and logP), the number of oxygen atoms, molecular weight (for CCS), and the fractional polar and negative van der Waals surface areas were reported as the most decisive factors. Regarding external validation for suspect screening or for facilitating the detection of unknowns, the models were successful in practice, but very few studies clearly explained the matrix effect, how well the models eliminated false positives and shortlisted suspect compounds (the main pragmatic goal), or how the models, independently or in conjunction with mass-to-charge ratio (m/z) and fragmentation spectra, enhanced the confidence of detection. Also, the reviewed studies did not independently report the models' false negative detections. 4. Conclusions Artificial intelligence and machine learning, used alongside HRMS analysers, can significantly improve confidence in detecting compounds, especially PPCPs, in complex environmental water and wastewater samples by predicting the RT and CCS of compounds. The studies show the importance of external validation, preferably with environmental samples, since the AI method that performs best during training and internal blind testing on the model-development dataset can differ from the one most suitable for new compounds. We envisage that AI-assisted models for predicting CCS, which is considered a matrix- or case-independent factor, either as a single output or combined with RT, will be applied more frequently in future studies, especially for the detection of unknown PPCPs and their metabolites, as well as their transformation and biodegradation products, in the effluent of wastewater treatment plants and in surface waters. 5. References [1] E. L. Schymanski et al., “Identifying small molecules via high resolution mass spectrometry: Communicating confidence,” Environmental Science and Technology, vol. 48, no. 4, pp. 2097–2098, Feb. 2014, doi: 10.1021/es5002105. [2] R. Aalizadeh, M. C. Nika, and N. S. 
Thomaidis, “Development and application of retention time prediction models in the suspect and non-target screening of emerging contaminants,” J Hazard Mater, vol. 363, pp. 277–285, Feb. 2019, doi: 10.1016/j.jhazmat.2018.09.047. [3] A. K. Richardson et al., “Rapid direct analysis of river water and machine learning assisted suspect screening of emerging contaminants in passive sampler extracts,” Analytical Methods, vol. 13, no. 5, pp. 595–606, Feb. 2021, doi: 10.1039/d0ay02013c. [4] Q. Yang, H. Ji, H. Lu, and Z. Zhang, “Prediction of Liquid Chromatographic Retention Time with Graph Neural Networks to Assist in Small Molecule Identification,” Anal Chem, vol. 93, no. 4, pp. 2200–2206, Feb. 2021, doi: 10.1021/acs.analchem.0c04071. | en |