How the energy evaluation method used in the geometry optimization step affect the quality of the subsequent QSAR/QSPR models

Åsmund Rinnan; Niels Johan Christensen; Søren Balling Engelsen

doi:10.1007/s10822-009-9308-x

8 Citations (Scopus)

1317 Downloads (Pure)

Abstract

The quantitative influence of the choice of energy evaluation method used in the geometry optimization step prior to the calculation of molecular descriptors in QSAR and QSPR models was investigated. A total of 11 energy evaluation methods on three molecular datasets (toxicological compounds, aromatic compounds and PPARγ agonists) were studied. The methods employed were: MMFF94s, MM3* with ε_r (relative dielectric constant) = 1, MM3* with ε_r = 80, AM1, PM3, HF/STO-3G, HF/6-31G, HF/6-31G(d,p), B3LYP/STO-3G, B3LYP/6-31G, and B3LYP/6-31G(d,p). The 3D descriptors used in the QSAR/QSPR models were calculated with commercially available molecular descriptor programs primarily directed toward pharmaceutical research. In order to evaluate the uncertainties involved in the QSAR/QSPR predictions, bootstrapping was used to validate all models using 1,000 drawings for each dataset. The scale-free error term, q², was used to compare the relative quality of the models resulting from different optimization methods on the same set of molecules. Depending on the optimization method, the average 0.632 bootstrap-estimated q² varies from 0.55 to 0.57 for the toxicological compounds, from 0.58 to 0.62 for the aromatic compounds, and from 0.69 to 0.75 for the PPARγ agonists. B3LYP/6-31G(d,p) provided the best overall results, although the increase in q² was small in all cases. The results clearly indicate that the choice of the energy evaluation method has very limited impact. This study suggests that QSAR or QSPR studies might benefit from the choice of a rapid optimization method with little or no loss in model accuracy.
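The validation scheme named in the abstract (a 0.632 bootstrap estimate of q² over 1,000 drawings) can be illustrated with a minimal Python sketch. Everything specific here is an assumption for illustration: the abstract does not name the regression method, so a one-descriptor least-squares line stands in for it; the function names `fit_line`, `mse` and `q2_632` are hypothetical; and the out-of-bag error is averaged per drawing, a simplified variant of the 0.632 estimator rather than the authors' exact procedure.

```python
import random
from statistics import fmean


def fit_line(x, y):
    """Least-squares fit of y = a + b*x (stand-in model); returns (a, b)."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0.0:  # degenerate resample: every drawn x is identical
        return my, 0.0
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - b * mx, b


def mse(a, b, x, y):
    """Mean squared prediction error of the line (a, b) on (x, y)."""
    return fmean((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))


def q2_632(x, y, n_draws=1000, seed=0):
    """Simplified 0.632 bootstrap estimate of q2 = 1 - error / variance(y)."""
    rng = random.Random(seed)
    n = len(y)
    my = fmean(y)
    var_y = fmean((yi - my) ** 2 for yi in y)

    # Apparent (resubstitution) error: fit and evaluate on all molecules.
    a, b = fit_line(x, y)
    err_app = mse(a, b, x, y)

    # Out-of-bag error: fit on each drawing, test on the molecules left out.
    oob_errors = []
    for _ in range(n_draws):
        idx = [rng.randrange(n) for _ in range(n)]  # draw n with replacement
        drawn = set(idx)
        oob = [i for i in range(n) if i not in drawn]
        if not oob:  # no held-out molecules in this drawing; skip it
            continue
        a, b = fit_line([x[i] for i in idx], [y[i] for i in idx])
        oob_errors.append(mse(a, b, [x[i] for i in oob], [y[i] for i in oob]))
    err_oob = fmean(oob_errors)

    # 0.632 rule: weight the optimistic and pessimistic error estimates.
    err_632 = 0.368 * err_app + 0.632 * err_oob
    return 1.0 - err_632 / var_y
```

Calling `q2_632(descriptor_values, measured_property)` once per geometry optimization method, on the same molecules, would give one q² per method. Because q² is divided by the variance of the response, values are comparable across optimization methods within a dataset, which is the comparison the abstract reports.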

Original language	English
Journal	Journal of Computer-Aided Molecular Design
Volume	24
Issue number	1
Pages (from-to)	17-22
Number of pages	6
ISSN	0920-654X
DOI	https://doi.org/10.1007/s10822-009-9308-x
Status	Published - Jan 2010

Access to document

10.1007/s10822-009-9308-x

Submitted manuscript, 314 KB

Citation formats

@article{ffb35d8c26ed45c3830a82e04cae22b8,

title = "How the energy evaluation method used in the geometry optimization step affect the quality of the subsequent QSAR/QSPR models",

abstract = "The quantitative influence of the choice of energy evaluation method used in the geometry optimization step prior to the calculation of molecular descriptors in QSAR and QSPR models was investigated. A total of 11 energy evaluation methods on three molecular datasets (toxicological compounds, aromatic compounds and PPARγ agonists) were studied. The methods employed were: MMFF94s, MM3* with $\varepsilon_r$ (relative dielectric constant) = 1, MM3* with $\varepsilon_r$ = 80, AM1, PM3, HF/STO-3G, HF/6-31G, HF/6-31G(d,p), B3LYP/STO-3G, B3LYP/6-31G, and B3LYP/6-31G(d,p). The 3D descriptors used in the QSAR/QSPR models were calculated with commercially available molecular descriptor programs primarily directed toward pharmaceutical research. In order to evaluate the uncertainties involved in the QSAR/QSPR predictions, bootstrapping was used to validate all models using 1,000 drawings for each dataset. The scale-free error term, $q^2$, was used to compare the relative quality of the models resulting from different optimization methods on the same set of molecules. Depending on the optimization method, the average 0.632 bootstrap-estimated $q^2$ varies from 0.55 to 0.57 for the toxicological compounds, from 0.58 to 0.62 for the aromatic compounds, and from 0.69 to 0.75 for the PPARγ agonists. B3LYP/6-31G(d,p) provided the best overall results, although the increase in $q^2$ was small in all cases. The results clearly indicate that the choice of the energy evaluation method has very limited impact. This study suggests that QSAR or QSPR studies might benefit from the choice of a rapid optimization method with little or no loss in model accuracy.",

author = "Rinnan, {\AA}smund and Christensen, {Niels Johan} and Engelsen, {S{\o}ren Balling}",

year = "2010",

month = jan,

doi = "10.1007/s10822-009-9308-x",

language = "English",

volume = "24",

pages = "17--22",

journal = "Journal of Computer-Aided Molecular Design",

issn = "0920-654X",

publisher = "Springer",

number = "1",

}

TY - JOUR

T1 - How the energy evaluation method used in the geometry optimization step affect the quality of the subsequent QSAR/QSPR models

AU - Rinnan, Åsmund

AU - Christensen, Niels Johan

AU - Engelsen, Søren Balling

PY - 2010/1

Y1 - 2010/1

AB - The quantitative influence of the choice of energy evaluation method used in the geometry optimization step prior to the calculation of molecular descriptors in QSAR and QSPR models was investigated. A total of 11 energy evaluation methods on three molecular datasets (toxicological compounds, aromatic compounds and PPARγ agonists) were studied. The methods employed were: MMFF94s, MM3* with ε_r (relative dielectric constant) = 1, MM3* with ε_r = 80, AM1, PM3, HF/STO-3G, HF/6-31G, HF/6-31G(d,p), B3LYP/STO-3G, B3LYP/6-31G, and B3LYP/6-31G(d,p). The 3D descriptors used in the QSAR/QSPR models were calculated with commercially available molecular descriptor programs primarily directed toward pharmaceutical research. In order to evaluate the uncertainties involved in the QSAR/QSPR predictions, bootstrapping was used to validate all models using 1,000 drawings for each dataset. The scale-free error term, q², was used to compare the relative quality of the models resulting from different optimization methods on the same set of molecules. Depending on the optimization method, the average 0.632 bootstrap-estimated q² varies from 0.55 to 0.57 for the toxicological compounds, from 0.58 to 0.62 for the aromatic compounds, and from 0.69 to 0.75 for the PPARγ agonists. B3LYP/6-31G(d,p) provided the best overall results, although the increase in q² was small in all cases. The results clearly indicate that the choice of the energy evaluation method has very limited impact. This study suggests that QSAR or QSPR studies might benefit from the choice of a rapid optimization method with little or no loss in model accuracy.

U2 - 10.1007/s10822-009-9308-x

DO - 10.1007/s10822-009-9308-x

M3 - Journal article

C2 - 19943083

SN - 0920-654X

VL - 24

SP - 17

EP - 22

JO - Journal of Computer-Aided Molecular Design

JF - Journal of Computer-Aided Molecular Design

IS - 1

ER -
