Abstract (EN):
Three-dimensional models, or pharmacophores, describing Euclidean constraints on the location of functional groups on small molecules (such as hydrophobic groups and hydrogen acceptors and donors), are often used in drug design to describe the medicinal activity of potential drugs (or 'ligands'). This medicinal activity is produced by interaction of the functional groups on the ligand with a binding site on a target protein. In identifying structure-activity relations of this kind there are three principal issues: (1) It is often difficult to "align" the ligands in order to identify common structural properties that may be responsible for activity; (2) Ligands in solution can adopt different shapes (or
'conformations') arising from torsional rotations about bonds. The 3-D
molecular substructure is typically sought on one or more low-energy conformers; and (3) Pharmacophore models must, ideally, predict medicinal
activity on some quantitative scale. It has been shown that the logical
representation adopted by Inductive Logic Programming (ILP) naturally
resolves many of the difficulties associated with the alignment and multi-conformation issues. However, the predictions of models constructed by
ILP have hitherto only been nominal, predicting medicinal activity to
be present or absent. In this paper, we investigate the construction of
two kinds of quantitative pharmacophoric models with ILP: (a) Models
that predict the probability that a ligand is "active"; and (b) Models
that predict the actual medicinal activity of a ligand. Quantitative predictions
are obtained by utilising the following statistical procedures as background knowledge: logistic regression and naive Bayes, for probability prediction; linear and kernel regression, for activity prediction. The multi-conformation issue and, more generally, the relational representation used by ILP result in some special difficulties in the use of any statistical procedure. We present the principal issues and some solutions. Specifically, using data on the inhibition of the protease Thermolysin, we demonstrate that it is possible for an ILP program to construct good quantitative structure-activity models. We also comment on the relationship of this work to other recent developments in statistical relational learning.
Language:
English
Type (Faculty Evaluation):
Scientific
Contact:
Rui Camacho
License Type: