Aptula, A.O.; Jeliazkova, N.G.; Schultz, T.W.; Cronin, M.T.D. The Better Predictive Model: High q2 for the Training Set or Low Root Mean Square Error of Prediction for the Test Set? QSAR Comb. Sci. 2005, 24, 3, 385–396.

QDB archive DOI: 10.15152/QDB.83

QsarDB content

Property MOA: Mode of action

Property pIGC50: 40-h Tetrahymena toxicity as log(1/IGC50) [log(L/mmol)]

Citing

When using this QDB archive, please cite it together with the original article:

  • Ruusmann, V. Data for: The Better Predictive Model: High q2 for the Training Set or Low Root Mean Square Error of Prediction for the Test Set? QsarDB repository, QDB.83. 2012. http://dx.doi.org/10.15152/QDB.83

  • Aptula, A. O.; Jeliazkova, N. G.; Schultz, T. W.; Cronin, M. T. D. The Better Predictive Model: High q2 for the Training Set or Low Root Mean Square Error of Prediction for the Test Set? QSAR Comb. Sci. 2005, 24, 385–396. http://dx.doi.org/10.1002/qsar.200430909

Metadata

dc.date.accessioned 2012-05-23T16:12:38Z
dc.date.available 2012-05-23T16:12:38Z
dc.date.issued 2012-05-23
dc.identifier.uri http://hdl.handle.net/10967/83
dc.identifier.uri http://dx.doi.org/10.15152/QDB.83
dc.description.abstract The process of validation of computational models (e.g., QSARs) may become the most important step in their development. Different requirements for the reliability and predictability of QSAR models have been described in the literature. Despite these formal recommendations there are few practical rules as to when to cease adding variables to a QSAR (i.e., what is an appropriate level of complexity of the model). In this work the influence of model complexity on statistical fit and error has been investigated using toxicity data for 200 phenols to the ciliated protozoan Tetrahymena pyriformis when applying a test set of a further 50 compounds. The results from this investigation showed that some important factors play a role in the definition of a good and reliable QSAR. These include the fact that q2 is not a good criterion for model predictivity; that outliers should not necessarily be deleted as this may reduce the chemical space of the model; the number of descriptors in a multivariate model should be chosen carefully to avoid model under- and over-estimation; and that an appropriate number of dimensions is required for PLS modelling.
dc.publisher Villu Ruusmann
dc.rights Attribution 4.0 International
dc.rights.uri http://creativecommons.org/licenses/by/4.0/
dc.title Aptula, A.O.; Jeliazkova, N.G.; Schultz, T.W.; Cronin, M.T.D. The Better Predictive Model: High q2 for the Training Set or Low Root Mean Square Error of Prediction for the Test Set?. QSAR Comb. Sci. 2005, 24, 3, 385–396.
qdb.property.endpoint 6. Other (Acute toxicity to ciliate protozoa)
qdb.property.species Tetrahymena pyriformis
qdb.descriptor.application ACD/Labs
qdb.descriptor.application MOPAC
bibtex.entry article
bibtex.entry.author Aptula, A. O.
bibtex.entry.author Jeliazkova, N. G.
bibtex.entry.author Schultz, T. W.
bibtex.entry.author Cronin, M. T. D.
bibtex.entry.doi 10.1002/qsar.200430909
bibtex.entry.journal QSAR Comb. Sci.
bibtex.entry.number 3
bibtex.entry.pages 385–396
bibtex.entry.title The Better Predictive Model: High q2 for the Training Set or Low Root Mean Square Error of Prediction for the Test Set?
bibtex.entry.volume 24
bibtex.entry.year 2005
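
The abstract above contrasts two statistics: the cross-validated q2 of the training set and the root mean square error of prediction (RMSEP) of an external test set. The following is a minimal sketch of how the two are commonly computed. It assumes a one-descriptor least-squares model and the leave-one-out form q2 = 1 - PRESS/SS_total, which may differ from the exact variant used in the article. The function names and the toy numbers are hypothetical and are not taken from the QDB archive.

```python
from math import sqrt


def fit_line(x, y):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def q2_loo(x, y):
    """Leave-one-out q2 = 1 - PRESS / SS_total (one common definition)."""
    n = len(x)
    mean_y = sum(y) / n
    press = 0.0
    for i in range(n):
        # Refit the model without compound i, then predict compound i.
        a, b = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (a + b * x[i])) ** 2
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - press / ss_total


def rmsep(y_obs, y_pred):
    """Root mean square error of prediction for an external test set."""
    n = len(y_obs)
    return sqrt(sum((o - p) ** 2 for o, p in zip(y_obs, y_pred)) / n)


# Toy data (illustrative only, NOT from the QDB archive).
x_train = [1.0, 2.0, 3.0, 4.0, 5.0]
y_train = [0.1, 0.9, 2.1, 2.9, 4.2]
x_test = [1.5, 3.5]
y_test = [0.6, 2.4]

a, b = fit_line(x_train, y_train)
y_test_pred = [a + b * xi for xi in x_test]
print("q2 (training, LOO):", q2_loo(x_train, y_train))
print("RMSEP (test):", rmsep(y_test, y_test_pred))
```

The two numbers answer different questions: q2 is computed only from the training compounds, while RMSEP measures error on compounds the model never saw. This is why a high q2 alone need not imply good external predictivity.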


Files in this item

Name               Description  Format           Size
109925323.qdb.zip  n/a          application/zip  11.22Kb
Files associated with this item are distributed under a Creative Commons license (Attribution 4.0 International).
