Revisiting the General Solubility Equation : In Silico Prediction of Aqueous Solubility Incorporating the Effect of Topographical Polar Surface Area

Ali, Jogoth, Camilleri, Patrick, Brown, Marc, Hutt, Andrew J. and Kirton, Stewart B. (2012) Revisiting the General Solubility Equation : In Silico Prediction of Aqueous Solubility Incorporating the Effect of Topographical Polar Surface Area. Journal of Chemical Information and Modeling, 52 (2). pp. 420-428. ISSN 1549-9596

The General Solubility Equation (GSE) is a QSPR model based on the melting point and log P of a chemical substance. It is used to predict the aqueous solubility of nonionizable chemical compounds. However, its reliance on experimentally derived descriptors, particularly melting point, limits its applicability to virtual compounds. The studies presented show that the GSE is able to predict, to within 1 log unit, the experimental aqueous solubility (log S) for 81% of the compounds in a data set of 1265 diverse chemical structures (-8.48 < log S < 1.58). However, the predictive ability of the GSE is reduced to 75% when applied to a subset of the data (1160 compounds, -6.00 < log S < 0.00), which discounts those compounds occupying the sparsely populated regions of data space. This highlights how sparsely populated extremities of data sets can significantly skew results for linear regression-based models. Replacing the melting point descriptor of the GSE with a descriptor which accounts for topographical polar surface area (TPSA) produces a model of comparable quality to the GSE (the solubility of 81% of compounds in the full data set predicted accurately). As such, we propose an alternative simple model for predicting aqueous solubility which replaces the melting point descriptor of the GSE with TPSA and hence can be applied to virtual compounds. In addition, incorporating TPSA into the GSE alongside log P and melting point gives a three-descriptor model that improves accurate prediction of aqueous solubility over the GSE by 5.1% for the full data set and 6.6% for the reduced data set.
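
The abstract does not state the GSE explicitly. As a minimal sketch, the form commonly cited in the general literature is log S = 0.5 - 0.01 (MP - 25) - log P, with the melting point MP in °C, S in mol/L, and the melting point term set to zero for compounds that are liquid at 25 °C. The Python below assumes that form and also shows how a "fraction predicted to within 1 log unit" figure could be computed. The function names and the sample values are hypothetical and are not taken from the paper. The TPSA-based models are not sketched because the abstract gives no coefficients for them.

```python
def gse_log_s(log_p, melting_point_c=25.0):
    """Estimate log S (S in mol/L) with the commonly cited GSE form.

    Assumption: log S = 0.5 - 0.01 * (MP - 25) - log P, MP in degrees C.
    """
    # Compounds melting at or below 25 C are treated as liquids,
    # so the melting point term contributes nothing for them.
    mp_term = 0.01 * max(melting_point_c - 25.0, 0.0)
    return 0.5 - mp_term - log_p


def fraction_within(predicted, observed, tolerance=1.0):
    """Fraction of predictions within `tolerance` log units of experiment."""
    pairs = list(zip(predicted, observed))
    if not pairs:
        raise ValueError("no predictions to score")
    hits = sum(1 for p, o in pairs if abs(p - o) <= tolerance)
    return hits / len(pairs)


if __name__ == "__main__":
    # Illustrative (log P, melting point) inputs and observed log S values.
    # These are made-up numbers, not data from the paper.
    compounds = [(2.0, 125.0), (1.0, 10.0), (3.5, 75.0)]
    observed = [-2.0, -2.0, -4.4]
    predicted = [gse_log_s(lp, mp) for lp, mp in compounds]
    print(predicted)
    print(fraction_within(predicted, observed))
```

Scoring by the fraction of compounds within a fixed tolerance, rather than by a correlation coefficient, mirrors the way the abstract reports model quality (for example, 81% of compounds within 1 log unit).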

Item Type	Article
Identification Number	10.1021/ci200387c
Date Deposited	15 May 2025 12:28
Last Modified	29 Dec 2025 06:48

Full text not available from this repository.