Abstract
Structure-based drug design is made possible by our understanding of molecular recognition.
The utility of this approach was apparent in the development of the clinically effective HIV-1
PR inhibitors, where crystal structures of complexes of HIV-1 protease and inhibitors gave
pivotal information. Computational methods drawing upon structural data are of increasing
relevance
to the drug design process. Nonetheless, these methods are quite rudimentary and
significant improvements are needed. The aim of this thesis was to investigate techniques
which may lead to improved modelling of molecular recognition and a better ability to make
predictions about the binding affinity of ligands. The two main themes were the modelling
of acid–base titration behaviour of ligand and receptor, and the application of the simulation
technique of configurational bias Monte Carlo (CBMC). The studies were performed with
HIV-1 PR and its inhibitors as a model system.
Biological processes are influenced by the pH of the medium in which they take place.
Ligand–receptor binding equilibria are often thermodynamically linked to protonation changes
in ligand and/or receptor, as seen in the binding of a number of HIV-1 PR inhibitors.
In Chapter 2, a series of sixteen continuum electrostatics pKa calculations of HIV-1 PR–
inhibitor complexes was carried out to characterize the nature and size of these linkages.
The most important effects concern changes in the pKa of the enzyme active site aspartate
dyad. Large pKa shifts were predicted in all cases, and at least one of the two dyad pKas
became more basic on binding. At physiologically relevant pH, different ligands induced
different protonation states, with different tautomeric forms favoured. The fully deprotonated
form of the dyad was not significantly populated for any of the complexes. For about a third
of the complexes, both singly and doubly protonated forms were predicted to be populated.
The predicted predominant protonation states of MVT-101 and VX-478 were consistent with
previous theoretical studies. The size of the predicted pKa shifts for MVT-101 and XK263
differed from a previous study using similar methods. The paucity and ambiguity of available
experimental data make it difficult to evaluate the results fully; however, the tendency to
exaggerate shifts, as observed in other studies, appears to be present.
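
The dependence of dyad protonation state on pH can be illustrated with a generic two-site titration model. The sketch below is not the continuum electrostatics calculation of Chapter 2: the function name `dyad_populations`, the pKa values and the coupling energy are hypothetical and chosen only for illustration. It enumerates the four protonation microstates of two coupled sites and returns their Boltzmann populations at a given pH.

```python
import math

LN10 = math.log(10.0)

def dyad_populations(pH, pk_a, pk_b, coupling_kT=0.0):
    """Populations of the four protonation microstates of a two-site dyad.

    pk_a, pk_b  : assumed pKa of each site when the other is deprotonated
    coupling_kT : assumed extra free energy (in kT) of the doubly
                  protonated state relative to independent sites
    Keys of the result are (x_a, x_b), with 1 = protonated, 0 = deprotonated.
    """
    weights = {}
    for x_a in (0, 1):
        for x_b in (0, 1):
            # Free energy in kT relative to the fully deprotonated state.
            g = LN10 * (x_a * (pH - pk_a) + x_b * (pH - pk_b))
            g += x_a * x_b * coupling_kT
            weights[(x_a, x_b)] = math.exp(-g)
    z = sum(weights.values())  # partition function over the four states
    return {state: w / z for state, w in weights.items()}

# Hypothetical values, for illustration only.
populations = dyad_populations(pH=5.0, pk_a=6.5, pk_b=3.5, coupling_kT=2.0)
for state, p in sorted(populations.items()):
    print(state, round(p, 3))
```

In such a model the two singly protonated states, (1, 0) and (0, 1), play the role of the alternative tautomeric forms, and shifting either pKa changes which microstate dominates at a given pH.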
“Scoring” is the prediction of binding affinity from the structure of the ligand–receptor
complex, according to an empirical scheme. Scoring studies usually neglect or grossly simplify
the contribution of protonation equilibria to affinity, so in Chapter 3 proton linkage data was
included in a regression analysis of the HIV-1 PR complexes from Chapter 2. Parameters
previously shown to correlate with binding, namely electrostatic free energy changes and
buried surface areas, were the basis for the analysis, and terms describing proton linkage,
in the form of a correction for assay pH and an indicator variable for predicted dyad pKa
shift on binding, were also considered. The complex with MVT-101 was an outlier in the
analysis and was excluded. Further analysis demonstrated that the correction for assay pH
made a significant contribution to the regression equation. Amendment of the parameters
for XK263 according to the available experimental data led to an improved regression in
which the term for calculated pKa shifts also made a significant contribution. The regression
equations obtained had the same form and similar coefficients to scoring functions of the
“master equation” type, and fit the experimental data with comparable accuracy.
More physically realistic simulations of ligand–receptor binding using the techniques of
molecular dynamics (MD) or Monte Carlo (MC) are potentially more accurate than scoring
function approaches. These methods are slow, so the alternative of CBMC, which has been
shown to give faster convergence for polymer simulations, was implemented for CHARMM 22,
an all-atom protein force field (Chapter 4). The correctness of the implementation was
demonstrated by comparison with exact and stochastic dynamics (SD) results for individual
terms in the force field. The algorithm is more complex than those typically used with alkane
force fields, and this has possible consequences for the efficiency. CBMC was used to generate
a Ramachandran plot for the alanine dipeptide, and the results were found to be in agreement
with those generated by an SD simulation. Analysis of statistical errors suggests that CBMC
should be competitive with umbrella sampling for simulating conformational equilibria,
particularly when the cost of non-bonded energy evaluations dominates the simulation.
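
The core of a configurational bias move can be sketched with the generic Rosenbluth scheme. This is a minimal illustration under stated assumptions, not the CHARMM 22 implementation described in Chapter 4: the function names are hypothetical, and the energies of the trial positions are assumed to have been computed elsewhere. One of several trial positions is chosen with probability proportional to its Boltzmann factor, and the sum of the factors (the Rosenbluth weight) is later used to accept or reject the regrown chain.

```python
import math
import random

def rosenbluth_select(trial_energies, beta, rng):
    """Pick one trial with probability proportional to exp(-beta * E).

    Returns (index of the chosen trial, Rosenbluth weight for this step).
    """
    boltz = [math.exp(-beta * e) for e in trial_energies]
    weight = sum(boltz)
    r = rng.random() * weight
    cumulative = 0.0
    for i, b in enumerate(boltz):
        cumulative += b
        if r < cumulative:
            return i, weight
    return len(boltz) - 1, weight  # guard against floating-point round-off

def cbmc_accept(w_new, w_old, rng):
    """Accept a regrown configuration with probability min(1, W_new / W_old)."""
    return rng.random() < min(1.0, w_new / w_old)

# Hypothetical trial energies (in units where beta = 1), for illustration only.
rng = random.Random(0)
index, weight = rosenbluth_select([0.5, 2.0, 0.1, 5.0], beta=1.0, rng=rng)
```

In a full move the per-step weights are multiplied along the regrown segment, and the weight of the old configuration is obtained by retracing it with the existing position counted as one of the trials.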
CBMC can be applied to ligand–receptor binding, as demonstrated in grand canonical
simulations of alkane adsorption in zeolites. The more limited problem of finding the
predominant bound conformation of a flexible ligand given a rigid protein receptor (i.e. “docking”)
was treated in Chapter 5, using the example of a tripeptide inhibitor which binds to
HIV-1 PR. Attempts to perform the docking using the Metropolis MC/simulated annealing
and Lamarckian genetic algorithm methods implemented in the program AutoDock failed
to reproduce the native configuration (with runs on the order of two days of execution time).
Docking using CBMC, combined with parallel tempering to further improve sampling, was
successful in finding the native binding mode, although this success was dependent on ad hoc
adjustments to the force field, and a priori knowledge of the ligand protonation state and
binding site. The efficiency of the method was considerably lower than hoped, with problems
due to the force field- and model-dependent coupling between terms in the potential energy
function, and the “greedy” nature of the CBMC algorithm.
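
Parallel tempering improves sampling by periodically attempting to exchange configurations between replicas run at different temperatures. The sketch below shows the standard replica-exchange acceptance rule; the function name and the numerical values are hypothetical, and it is not taken from the thesis implementation.

```python
import math

def swap_probability(beta_i, beta_j, energy_i, energy_j):
    """Acceptance probability for exchanging the configurations of two replicas.

    Standard rule: min(1, exp((beta_i - beta_j) * (E_i - E_j))),
    where beta = 1 / (k_B * T) for each replica.
    """
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)

# Hypothetical values: replica i is colder (larger beta) and currently
# holds the higher-energy configuration, so the swap is always accepted.
p = swap_probability(beta_i=1.0, beta_j=0.5, energy_i=-10.0, energy_j=-20.0)
```

Swaps that move a lower-energy configuration to the colder replica are always accepted; the reverse exchange is accepted with a probability that decays exponentially with the energy gap.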
Various conclusions can be drawn from these studies. Chapters 2 and 3 provide evidence
of the importance of protonation equilibria in ligand–protein molecular recognition, and
underline the sizable contribution of electrostatic interactions to binding energies. In the face of
this finding, neglect of electrostatic terms, as often seen in past studies, appears to be
counterproductive. The scoring study also shows how experimental data can be used more effectively if
factors such as assay conditions are carefully taken into account. Implementation of CBMC for
a widely-used protein force field and application of the algorithm to docking (Chapters 4 and
5) represent a proof of concept for a broadly useful simulation technique. Further work will
be required to find the right niche for CBMC and fully explore the potential of this and
related techniques. A final point is the demonstrated utility of the HIV-1 PR test system which
formed the focus of the studies. Abundant structural data has enabled many new approaches
to be tested, and further insights are expected from the analysis of unusual cases, such as the
anomalous results for MVT-101. As well as the question of scoring, studies of mutation and
resistance are likely to attract considerable interest in the future.