Abstract
Proteins are the wheels and millstones of the complex machinery that underlies human life. In carrying out their functions, proteins work in close association with other proteins, forming protein complexes. A huge network of protein-protein interactions enables the cell to respond quickly to changes in the environment and to communicate with other cells. When the balance of this network is disrupted, diseases such as cancer may result, and for this reason, pharmaceutical drugs are very often targeted at proteins. To fully understand how proteins work together, knowledge at the atomic level is required. X-ray crystallography and NMR are the classical methods for this, and they have solved the three-dimensional structures of many individual proteins as well as protein complexes. However, the number of complexes in the cell is at least one order of magnitude larger than the number of proteins. Moreover, associations between proteins and other macromolecules are often transient and reversible, especially in the biologically interesting signal transduction complexes. For many complexes, the 3D structures of the individual proteins in their free, unbound form are known, but the structure of the protein complex itself remains elusive. This thesis deals with two fields of study that aim to shed light on protein complexes by computational means: data-driven docking and interface prediction. Docking, in general, is the prediction of the structure of a complex starting from the free, unbound structures of its components. Data-driven docking, in particular, uses experimental information during the docking process. Chemical shift perturbation (CSP), hydrogen-deuterium (H/D) exchange and site-directed mutagenesis can identify the interface region; residual dipolar couplings and relaxation anisotropy can provide information about the relative orientation of the proteins. All of this information can be used in HADDOCK, the data-driven docking method developed in our group, and an improved version of HADDOCK is presented in chapter 6 of this thesis. HADDOCK is not limited to complexes between two proteins, but can handle up to six molecules, which may be proteins, nucleic acids, sugars or small ligands. HADDOCK finished second in the recent cycle of the CAPRI international docking competition.
In addition, a web server interface for HADDOCK is presented in chapter 7, making data-driven docking accessible to a larger community. It is shown that even in the absence of experimental data, data-driven docking can be successful when the interface region between the proteins is predicted by computational means. Chapter 4 describes WHISCY, a general-purpose interface prediction program and web server. In chapter 5, pairwise propensities and their use in interface prediction are evaluated. Finally, in chapter 8, CPORT is introduced: a consensus method that combines six interface predictors and that is specifically designed for data-driven docking. In this chapter, prediction-driven docking is successfully applied to a large and diverse benchmark of protein complexes, including signal transduction complexes. While correct solutions could not be obtained for all complexes, the success rate is comparable to that of state-of-the-art ab initio docking methods, and it is argued that further improvement is still possible.