TY - JOUR
T1 - How the energy evaluation method used in the geometry optimization step affect the quality of the subsequent QSAR/QSPR models
AU - Rinnan, Åsmund
AU - Christensen, Niels Johan
AU - Engelsen, Søren Balling
PY - 2010/1
Y1 - 2010/1
N2 - The quantitative influence of the choice of energy evaluation method used in the geometry optimization step prior to the calculation of molecular descriptors in QSAR and QSPR models was investigated. A total of 11 energy evaluation methods on three molecular datasets (toxicological compounds, aromatic compounds and PPARγ agonists) were studied. The methods employed were: MMFF94s, MM3* with ε_r (relative dielectric constant) = 1, MM3* with ε_r = 80, AM1, PM3, HF/STO-3G, HF/6-31G, HF/6-31G(d,p), B3LYP/STO-3G, B3LYP/6-31G, and B3LYP/6-31G(d,p). The 3D descriptors used in the QSAR/QSPR models were calculated with commercially available molecular descriptor programs primarily directed toward pharmaceutical research. In order to evaluate the uncertainties involved in the QSAR/QSPR predictions, bootstrapping was used to validate all models using 1,000 drawings for each dataset. The scale-free error term, q², was used to compare the relative quality of the models resulting from different optimization methods on the same set of molecules. Depending on the dataset, the average 0.632 bootstrap-estimated q² varies from 0.55 to 0.57 for the toxicological compounds, from 0.58 to 0.62 for the aromatic compounds, and from 0.69 to 0.75 for the PPARγ agonists. B3LYP/6-31G(d,p) provided the best overall results, albeit the increase in q² was small in all cases. The results clearly indicate that the choice of the energy evaluation method has very limited impact. This study suggests that QSAR or QSPR studies might benefit from the choice of a rapid optimization method with little or no loss in model accuracy.
AB - The quantitative influence of the choice of energy evaluation method used in the geometry optimization step prior to the calculation of molecular descriptors in QSAR and QSPR models was investigated. A total of 11 energy evaluation methods on three molecular datasets (toxicological compounds, aromatic compounds and PPARγ agonists) were studied. The methods employed were: MMFF94s, MM3* with ε_r (relative dielectric constant) = 1, MM3* with ε_r = 80, AM1, PM3, HF/STO-3G, HF/6-31G, HF/6-31G(d,p), B3LYP/STO-3G, B3LYP/6-31G, and B3LYP/6-31G(d,p). The 3D descriptors used in the QSAR/QSPR models were calculated with commercially available molecular descriptor programs primarily directed toward pharmaceutical research. In order to evaluate the uncertainties involved in the QSAR/QSPR predictions, bootstrapping was used to validate all models using 1,000 drawings for each dataset. The scale-free error term, q², was used to compare the relative quality of the models resulting from different optimization methods on the same set of molecules. Depending on the dataset, the average 0.632 bootstrap-estimated q² varies from 0.55 to 0.57 for the toxicological compounds, from 0.58 to 0.62 for the aromatic compounds, and from 0.69 to 0.75 for the PPARγ agonists. B3LYP/6-31G(d,p) provided the best overall results, albeit the increase in q² was small in all cases. The results clearly indicate that the choice of the energy evaluation method has very limited impact. This study suggests that QSAR or QSPR studies might benefit from the choice of a rapid optimization method with little or no loss in model accuracy.
U2 - 10.1007/s10822-009-9308-x
DO - 10.1007/s10822-009-9308-x
M3 - Journal article
C2 - 19943083
SN - 0920-654X
VL - 24
SP - 17
EP - 22
JO - Journal of Computer-Aided Molecular Design
JF - Journal of Computer-Aided Molecular Design
IS - 1
ER -