Oja, M.; Sild, S.; Piir, G.; Maran, U. Intrinsic aqueous solubility: mechanistically transparent data-driven modeling of drug substances. Pharmaceutics 2022, 14, 2248.

QsarDB Repository

QDB archive DOI: 10.15152/QDB.257

QsarDB content

Property logS0: Intrinsic aqueous solubility from single source [log(mol/L)]

M1: Model with Dragon descriptors from training set 1

Regression model (regression)

Name                Type                 n    R2     σ
Training set        training             81   0.670  0.823
Validation set      external validation  42   0.785  0.842
Tight test set      external validation  100  0.508  0.936
Loose test set      external validation  32   0.749  1.115
Test sets together  external validation  132  0.659  0.980

Property logS0a: Intrinsic aqueous solubility from multiple sources [log(mol/L)]

M2: Model with RDKit descriptors from training set 2

Regression model (regression)

Name                Type                 n    R2     σ
Training set        training             346  0.624  1.004
Validation set      external validation  90   0.697  0.964
Tight test set      external validation  100  0.521  0.933
Loose test set      external validation  32   0.648  1.300
Test sets together  external validation  132  0.602  1.031

M3: Model with PaDEL and XLOGS descriptors from training set 2

Regression model (regression)

Name                Type                 n    R2     σ
Training set        training             346  0.671  0.939
Validation set      external validation  90   0.785  0.813
Tight test set      external validation  100  0.520  0.962
Loose test set      external validation  32   0.794  1.010
Test sets together  external validation  132  0.652  0.971

M_cons: Consensus model (average of predictions from M1, M2 and M3)

Regression model ensemble (regression)

Name                Type                 n    R2     σ
Training set i      training             345  0.693  0.914
Tight test set      external validation  100  0.571  0.861
Loose test set      external validation  32   0.787  1.021
Test sets together  external validation  132  0.694  0.898
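
The record describes M_cons as the average of the predictions from M1, M2 and M3. A minimal Python sketch of that idea follows, assuming an unweighted arithmetic mean per compound. The function name and the prediction values are hypothetical and are not taken from the archive.

```python
from statistics import fmean


def consensus_prediction(m1, m2, m3):
    """Average per-compound predictions from three models (unweighted mean)."""
    if not (len(m1) == len(m2) == len(m3)):
        raise ValueError("prediction lists must have the same length")
    # zip() groups the three predictions that belong to the same compound.
    return [fmean(values) for values in zip(m1, m2, m3)]


# Hypothetical predictions in log(mol/L) for three compounds (illustration only).
m1 = [-3.0, -4.5, -2.0]
m2 = [-3.3, -4.2, -2.6]
m3 = [-2.7, -4.8, -2.0]
consensus = consensus_prediction(m1, m2, m3)
```

The sketch assumes that all three models return a prediction for every compound; a compound missing from any one model would have to be handled before averaging.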

Citing

When using this QDB archive, please cite it together with the original article:

  • Oja, M.; Sild, S.; Piir, G.; Maran, U. Data for: Intrinsic aqueous solubility: mechanistically transparent data-driven modeling of drug substances. QsarDB repository, QDB.257. 2022. http://dx.doi.org/10.15152/QDB.257

  • Oja, M.; Sild, S.; Piir, G.; Maran, U. Intrinsic aqueous solubility: mechanistically transparent data-driven modeling of drug substances. Pharmaceutics 2022, 14, 2248. http://dx.doi.org/10.3390/pharmaceutics14102248

Metadata

dc.date.accessioned: 2022-10-12T13:54:19Z
dc.date.available: 2022-10-12T13:54:19Z
dc.date.issued: 2022-10-12
dc.identifier.uri: http://hdl.handle.net/10967/257
dc.identifier.uri: http://dx.doi.org/10.15152/QDB.257
dc.description.abstract: Intrinsic aqueous solubility is a foundation property for understanding chemical, technological, pharmaceutical, and environmental behavior of drug substances. Despite years of solubility research, molecular structure-based prediction of the intrinsic aqueous solubility of drug substances is still under active investigation. This paper describes the authors’ systematic data-driven modelling in which two fit-for-purpose training data sets for intrinsic aqueous solubility were collected and curated, and three quantitative structure-property relationships were derived to make predictions for the most recent solubility challenge. All three models are performing well individually, while being mechanistically transparent and easy to understand. Molecular descriptors involved in the models are related to the following key steps in the solubility process: dissociation of the molecule from the crystal, formation of a cavity in the solvent, and insertion of the molecule into the solvent. A consensus modeling approach with these models remarkably improved prediction capability and reduced the number of strong outliers by more than two times. The performance and outliers of the second solubility challenge predictions were analyzed retrospectively. All developed models have been published in the QsarDB repository according to FAIR principles and can be used without restrictions for exploring, downloading, and predictions. [en_US]
dc.publisher: Mare Oja
dc.publisher: Sulev Sild
dc.publisher: Geven Piir
dc.publisher: Uko Maran
dc.rights: Attribution 4.0 International [*]
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/ [*]
dc.title: Oja, M.; Sild, S.; Piir, G.; Maran, U. Intrinsic aqueous solubility: mechanistically transparent data-driven modeling of drug substances. Pharmaceutics 2022, 14, 2248.
qdb.property.endpoint: 1. Physical Chemical Properties 1.3. Water solubility [en_US]
qdb.descriptor.application: DRAGON 6.0.40 [en_US]
qdb.descriptor.application: RDKit 2016.03.05 [en_US]
qdb.descriptor.application: XLOGS 1.0 [en_US]
qdb.descriptor.application: PaDEL-Descriptor 2.21 [en_US]
qdb.prediction.application: CODESSA Pro 1.0 [en_US]
qdb.prediction.application: scikit-learn 0.18 [en_US]
qdb.prediction.application: R 3.5.3 [en_US]
bibtex.entry: article [en_US]
bibtex.entry.author: Oja, Mare
bibtex.entry.author: Sild, Sulev
bibtex.entry.author: Piir, Geven
bibtex.entry.author: Maran, Uko
bibtex.entry.doi: 10.3390/pharmaceutics14102248
bibtex.entry.journal: Pharmaceutics [en_US]
bibtex.entry.number: 10
bibtex.entry.pages: 2248
bibtex.entry.title: Intrinsic aqueous solubility: mechanistically transparent data-driven modeling of drug substances [en_US]
bibtex.entry.volume: 14
bibtex.entry.year: 2022
qdb.model.type: Regression model (regression) [en_US]
qdb.model.type: Regression model ensemble (regression) [en_US]
qdb.descriptor.calculation: M1
qdb.descriptor.calculation: M2
qdb.descriptor.calculation: M3
qdb.descriptor.calculation: M_cons
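
The bibtex.entry fields in the record above carry everything needed for a BibTeX entry for the original article. The following Python sketch shows one way to assemble them. The citation key "Oja2022" and the helper name to_bibtex are hypothetical choices for illustration, and the repository may export the entry in a different layout.

```python
# Field values copied from the bibtex.entry.* metadata in this record.
fields = {
    "author": "Oja, Mare and Sild, Sulev and Piir, Geven and Maran, Uko",
    "title": ("Intrinsic aqueous solubility: mechanistically transparent "
              "data-driven modeling of drug substances"),
    "journal": "Pharmaceutics",
    "volume": "14",
    "number": "10",
    "pages": "2248",
    "year": "2022",
    "doi": "10.3390/pharmaceutics14102248",
}


def to_bibtex(entry_type, key, fields):
    """Format a dict of fields as a single BibTeX entry string."""
    body = ",\n".join(f"  {name} = {{{value}}}" for name, value in fields.items())
    return f"@{entry_type}{{{key},\n{body}\n}}"


# "Oja2022" is a hypothetical citation key.
bibtex = to_bibtex("article", "Oja2022", fields)
```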


Files in this item

Name               Description                            Format           Size
2022P2248.qdb.zip  Models for intrinsic water solubility  application/zip  194.3Kb

Files associated with this item are distributed under a Creative Commons license.
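
Once the archive has been downloaded, its contents can be inspected with Python's standard zipfile module before any QsarDB tooling is involved. The sketch below makes no assumption about the internal layout of the archive. The helper name list_archive is hypothetical, and the local path in the commented usage line assumes the file was saved under its listed name in the working directory.

```python
import zipfile


def list_archive(path):
    """Return (member name, uncompressed size in bytes) for each archive entry."""
    with zipfile.ZipFile(path) as archive:
        return [(info.filename, info.file_size) for info in archive.infolist()]


# Example usage, assuming the download was saved in the current directory:
# for name, size in list_archive("2022P2248.qdb.zip"):
#     print(f"{size:>10}  {name}")
```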