Predicting the Absorption Rate of Chemicals Through Mammalian Skin Using Machine Learning Algorithms
Machine learning (ML) methods have been applied to the analysis of a range of biological systems. This thesis evaluates the application of these methods to the problem domain of skin permeability. ML methods offer great potential both in predictive ability and in providing mechanistic insight into, in this case, the phenomenon of skin permeation. Historically, refining the mathematical models used to predict percutaneous drug absorption has been regarded as a key concern in this field. Quantitative Structure-Activity Relationship (QSAR) models are used extensively for this purpose. However, advanced ML methods outperform traditional linear QSAR models. The thesis investigates these methods for percutaneous absorption, with Gaussian process (GP) regression as the principal approach. This research seeks to enhance prediction performance by using local non-linear models obtained by applying clustering algorithms. In addition, to improve model quality, a kernel is constructed from both numerical chemical variables and categorical experimental descriptors. A Monte Carlo algorithm is also employed to generate reliable models from variable data, since such variability is inevitable in biological experiments. The datasets used in this study are small, which may give rise to over-fitting or under-fitting. In this research, I attempt to find optimal values of skin permeability using GP optimisation algorithms within small datasets. Although these methods are applied here to the field of percutaneous absorption, they may be applied more broadly to any biological system.