
Oja, M.; Sild, S.; Piir, G.; Maran, U. Intrinsic aqueous solubility: mechanistically transparent data-driven modeling of drug substances. Pharmaceutics 2022, 14, 2248.

QsarDB Repository


QDB archive DOI: 10.15152/QDB.257

QsarDB content

Property logS0: Intrinsic aqueous solubility from single source [log(mol/L)]

M1: Model with Dragon descriptors from training set 1

Regression model (regression)


Name                Type                 n    R²     σ
Training set        training             81   0.670  0.823
Validation set      external validation  42   0.785  0.842
Tight test set      external validation  100  0.508  0.936
Loose test set      external validation  32   0.749  1.115
Test sets together  external validation  132  0.659  0.980

Property logS0a: Intrinsic aqueous solubility from multiple sources [log(mol/L)]

M2: Model with RDKit descriptors from training set 2

Regression model (regression)


Name                Type                 n    R²     σ
Training set        training             346  0.624  1.004
Validation set      external validation  90   0.697  0.964
Tight test set      external validation  100  0.521  0.933
Loose test set      external validation  32   0.648  1.300
Test sets together  external validation  132  0.602  1.031

M3: Model with PaDEL and XLOGS descriptors from training set 2

Regression model (regression)


Name                Type                 n    R²     σ
Training set        training             346  0.671  0.939
Validation set      external validation  90   0.785  0.813
Tight test set      external validation  100  0.520  0.962
Loose test set      external validation  32   0.794  1.010
Test sets together  external validation  132  0.652  0.971

M_cons: Consensus model (average of predictions from M1, M2 and M3)

Regression model ensemble (regression)


Name                Type                 n    R²     σ
Training set        training             345  0.693  0.914
Tight test set      external validation  100  0.571  0.861
Loose test set      external validation  32   0.787  1.021
Test sets together  external validation  132  0.694  0.898
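
The consensus rule stated for M_cons (the average of the predictions from M1, M2 and M3) can be illustrated with a minimal sketch. The function name and the example prediction values below are hypothetical and are not taken from the archive. The sketch assumes a plain unweighted arithmetic mean over the three model outputs for one compound.

```python
from statistics import mean

# Model identifiers as listed in the archive; the averaging helper itself
# is a hypothetical illustration, not code from the QsarDB tooling.
MODEL_IDS = ("M1", "M2", "M3")


def consensus_prediction(predictions):
    """Return the unweighted mean of the M1, M2 and M3 predictions.

    `predictions` maps a model identifier to its predicted intrinsic
    aqueous solubility in log(mol/L) for a single compound.
    """
    missing = [m for m in MODEL_IDS if m not in predictions]
    if missing:
        raise KeyError(f"missing predictions for: {', '.join(missing)}")
    return mean(predictions[m] for m in MODEL_IDS)


# Made-up values for one compound, in log(mol/L).
example = {"M1": -3.2, "M2": -2.9, "M3": -3.5}
consensus = consensus_prediction(example)
print(f"consensus logS0 = {consensus:.2f}")
```

The sketch raises an error when any of the three predictions is absent instead of averaging over the remaining models, because the archive describes the consensus only as an average over all three.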

Citing

When using this QDB archive, please cite it together with the original article:

  • Oja, M.; Sild, S.; Piir, G.; Maran, U. Data for: Intrinsic aqueous solubility: mechanistically transparent data-driven modeling of drug substances. QsarDB repository, QDB.257. 2022. http://dx.doi.org/10.15152/QDB.257

  • Oja, M.; Sild, S.; Piir, G.; Maran, U. Intrinsic aqueous solubility: mechanistically transparent data-driven modeling of drug substances. Pharmaceutics 2022, 14, 2248. http://dx.doi.org/10.3390/pharmaceutics14102248

Metadata


Title: Oja, M.; Sild, S.; Piir, G.; Maran, U. Intrinsic aqueous solubility: mechanistically transparent data-driven modeling of drug substances. Pharmaceutics 2022, 14, 2248.
Abstract: Intrinsic aqueous solubility is a foundation property for understanding chemical, technological, pharmaceutical, and environmental behavior of drug substances. Despite years of solubility research, molecular structure-based prediction of the intrinsic aqueous solubility of drug substances is still under active investigation. This paper describes the authors’ systematic data-driven modelling in which two fit-for-purpose training data sets for intrinsic aqueous solubility were collected and curated, and three quantitative structure-property relationships were derived to make predictions for the most recent solubility challenge. All three models are performing well individually, while being mechanistically transparent and easy to understand. Molecular descriptors involved in the models are related to the following key steps in the solubility process: dissociation of the molecule from the crystal, formation of a cavity in the solvent, and insertion of the molecule into the solvent. A consensus modeling approach with these models remarkably improved prediction capability and reduced the number of strong outliers by more than two times. The performance and outliers of the second solubility challenge predictions were analyzed retrospectively. All developed models have been published in the QsarDB repository according to FAIR principles and can be used without restrictions for exploring, downloading, and predictions.
URI: http://hdl.handle.net/10967/257
http://dx.doi.org/10.15152/QDB.257
Date: 2022-10-12


Files in this item

Name               Description                            Format           Size
2022P2248.qdb.zip  Models for intrinsic water solubility  application/zip  194.3Kb

Files associated with this item are distributed under a Creative Commons license.
