Density Functional Theory Applied to Copper Proteins

Chemical research is often associated with laboratories full of boiling pots and test tubes, toxic chemical waste, and similar stereotypes. This image is not accurate. Although a large portion of current chemical research is still carried out experimentally, a "new" trend is the application of clean chemistry: performing theoretical calculations.

This way of doing research is not exactly new, as the basic principles of so-called quantum chemistry were formulated at the beginning of the 20th century. For a long time, however, the application of quantum chemistry was limited to relatively small molecules. As computers have become faster and more powerful, it has in recent years become possible to study large (bio)systems as well. In particular, the investigation of metalloproteins by quantum chemical calculations has increased considerably.

The study of metalloproteins is of major importance, as they regulate several elementary processes in the human body as well as in plants and animals. The presence of metal atoms in proteins enables these processes to be carried out under normal "human" conditions. Copper proteins often function as electron transfer proteins within the complex chemistry of, among others, the human body: an electron is taken up at one place, after which the protein moves to another place and delivers the electron. Using copper for this is "natural", as it is strongly bound to the protein in both the monovalent (Cu(I)) and the divalent (Cu(II)) form, which goes together with a high stability of the protein. Furthermore, the protein often forces the copper atom into a conformation that is midway between the conformations preferred by the monovalent and divalent forms. This enhances electron transfer and enables the protein to perform its function even more efficiently.

The application of quantum chemistry to metalloproteins has grown in step with the use of Density Functional Theory (DFT) in chemical research. Until a few years ago examples of both were relatively scarce, but developments in DFT (such as the construction of improved generalized gradient approximation (GGA) potentials and more efficient computer programs that exploit, for instance, parallel computing techniques) have made it almost standard to include DFT calculations in (bio)chemical studies.

Still, biosystems like (metallo)proteins remain too large to be described completely by quantum chemistry. Moreover, doing so would apply a detailed method to large parts of the system where a less accurate description (using specially designed force fields) suffices to describe the relevant processes well. These standard force fields are known to capture dynamical properties well, unless a metal atom is present in the protein, because the interactions of the metal atom with the protein are difficult to capture in a generally applicable force field. The development of a force field specific to a certain (type of) metalloprotein was one of the aims of the research described in this thesis.

Copper proteins are classified into several types, depending on their structural and spectroscopic properties. Type 2 copper proteins show electron paramagnetic resonance (EPR) spectra similar to those of "normal" copper complexes, type 3 proteins show no paramagnetic activity at all, and type 1 copper proteins are characterized by a bright blue color. The latter are therefore usually referred to as blue copper proteins. The protein studied in this thesis is azurin, a type 1 copper protein that probably serves as an electron transfer protein and that consists of 128 (Pseudomonas aeruginosa) or 129 (Alcaligenes denitrificans) amino acid residues. As in all other type 1 copper proteins, the copper atom in the active site is bound to three strong ligands (two histidine residues and one cysteine residue) that lie approximately in one plane with the copper atom. In addition, two axial amino acid residues (glycine and methionine) are coordinated to the copper. The glycine residue is special in this respect, as azurin is the only type 1 copper protein in which this residue is coordinated to copper.

Several methods were developed to obtain the necessary force field parameters for the copper proteins from DFT data. The first method (MDC charge analysis) gives atomic charges that are representative of the studied system; the molecular and atomic properties (multipoles) most relevant for the evaluation of the Coulomb potential are represented exactly. The second method (IntraFF) extracts force constants from the computed Hessian matrix, the matrix of second derivatives of the energy with respect to the atomic coordinates. These force constants can be used in force field calculations or simulations for interactions for which no standard force constant is available.
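
As a rough illustration of the two ideas above, the following Python sketch uses simplified stand-ins rather than the MDC or IntraFF algorithms themselves, whose details are not given in this summary. The function `fit_charges_to_multipoles` (a hypothetical name) finds point charges that reproduce a given total charge and dipole moment, and `bond_force_constant` (also hypothetical) projects the off-diagonal Hessian block of two atoms onto their bond direction. The water-like geometry, the target dipole and the model force constant are assumed values chosen only for the demonstration.

```python
import numpy as np

def fit_charges_to_multipoles(coords, total_charge, dipole):
    """Point charges reproducing a total charge and a dipole moment.

    Simplified stand-in for a multipole-derived charge analysis: solves
    sum_i q_i = Q and sum_i q_i * r_i = mu in the least-squares sense.
    """
    coords = np.asarray(coords, dtype=float)
    # Constraint matrix: one row for the charge, three rows for the dipole.
    a = np.vstack([np.ones(len(coords)), coords.T])
    b = np.concatenate([[total_charge], np.asarray(dipole, dtype=float)])
    charges, *_ = np.linalg.lstsq(a, b, rcond=None)
    return charges

def bond_force_constant(hessian, coords, i, j):
    """Harmonic stretch constant from the i-j block of a Cartesian Hessian.

    For a pure harmonic bond at equilibrium this block equals -k * u u^T,
    with u the unit vector along the bond, so k = -u^T H_ij u.
    """
    coords = np.asarray(coords, dtype=float)
    u = coords[j] - coords[i]
    u = u / np.linalg.norm(u)
    block = hessian[3 * i:3 * i + 3, 3 * j:3 * j + 3]
    return float(-u @ block @ u)

# Assumed water-like geometry (Angstrom) and target dipole (e * Angstrom).
water = np.array([[0.0, 0.0, 0.0], [0.757, 0.586, 0.0], [-0.757, 0.586, 0.0]])
target_dipole = np.array([0.0, 0.39, 0.0])
charges = fit_charges_to_multipoles(water, 0.0, target_dipole)

# Model Hessian of a diatomic with an assumed force constant k_model.
k_model = 1.5
pair = np.array([[0.0, 0.0, 0.0], [1.0, 2.0, 2.0]])
unit = (pair[1] - pair[0]) / np.linalg.norm(pair[1] - pair[0])
proj = np.outer(unit, unit)
model_hessian = k_model * np.block([[proj, -proj], [-proj, proj]])
k_recovered = bond_force_constant(model_hessian, pair, 0, 1)
```

In a real application the Hessian would come from the DFT calculation, and the interatomic block would generally contain contributions beyond a single bond stretch, so the projected value should be read as an estimate only.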

First, however, a few properties of DFT itself were assessed: the accuracy of predicted molecular polarizabilities and molecular geometries, and how the newly developed charge analysis compares with other methods. The results show that DFT can predict the geometry or polarizability of a molecule, or the charge distribution within it, very accurately, with an accuracy and efficiency that are difficult to attain with other quantum chemical methods.

DFT was also used to check the proposed mechanisms of two reactions. The first reaction comes from organic chemistry and involves a zinc atom. The different effects of an aminoalcohol and an aminothiol on the asymmetric addition of dialkylzinc molecules to an aldehyde were investigated; both the enantiomeric selectivity and the effect of the catalyst on the reaction rate can be explained by the computed results. The second reaction is closer to the aim of the PhD research, as it involves an enzyme with a copper atom in the active site. The different stages that lead from the initial complexation of the substrate to the formation of the product were studied, and a few important intermediates were located.

We then arrive at the studies on copper proteins, where azurin was investigated in its natural (wildtype) form as well as in several forms in which amino acid residues have been replaced by others. The influence of these replacements (mutations) on relevant properties of the protein has been tested extensively in experiments, which gives a solid framework against which the calculations can be checked. The charge distribution within the active site of azurin (both wildtype and mutated) is shown to have a coherent structure: the way in which the electrons are spread over the site varies little between the different azurin molecules. In all cases the total charge on copper is considerably lower than its formal value, with a large amount of charge delocalized over the three in-plane ligands as well as the axial methionine group. This suggests that methionine also acts as a ligand, while the other axial group, glycine, is better described as a coordinating group.

Apart from the charge distribution in the active site, force constants were also determined for the bonds between copper and the five amino acid residues in the active site. These values were found to depend strongly on the geometry used for the site, which is a direct consequence of the anharmonicity of the bonds. As the geometry was taken directly from the experimentally determined crystal structure, with a corresponding uncertainty of 0.1-0.2 Angstrom, the harmonic force field parameters should be judged cautiously. It might be better to use the anharmonic parameters, which are also given by the IntraFF method. On the other hand, a molecular dynamics simulation of wildtype azurin using the harmonic IntraFF parameters and the MDC charges results in a stable active site and protein, with copper-ligand distances that compare well with experimental values and vibrational frequencies in the right range of 200-500 cm^-1.
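
The geometry dependence of a harmonic force constant can be made concrete with a Morse potential, used here only as a generic model of an anharmonic bond and not as the functional form of IntraFF. The sketch below evaluates the local curvature (the force constant a harmonic fit would yield) at the equilibrium distance and at distances shifted by 0.1 Angstrom, the lower end of the quoted crystallographic uncertainty. The well depth, width and equilibrium distance are assumed values with no connection to azurin.

```python
import math

def morse_energy(r, depth, width, r_eq):
    """Morse potential V(r) = D * (1 - exp(-a * (r - r_eq)))**2."""
    return depth * (1.0 - math.exp(-width * (r - r_eq))) ** 2

def morse_curvature(r, depth, width, r_eq):
    """Second derivative of the Morse potential with respect to r."""
    e = math.exp(-width * (r - r_eq))
    return 2.0 * depth * width ** 2 * (2.0 * e * e - e)

# Assumed model parameters: depth in eV, width in 1/Angstrom, r_eq in Angstrom.
DEPTH, WIDTH, R_EQ = 1.0, 1.5, 2.0

# Local harmonic force constant at compressed, equilibrium and stretched bonds.
curvatures = {
    shift: morse_curvature(R_EQ + shift, DEPTH, WIDTH, R_EQ)
    for shift in (-0.1, 0.0, 0.1)
}
for shift, k_local in curvatures.items():
    print(f"shift {shift:+.1f} Angstrom: local force constant {k_local:.3f} eV/Angstrom^2")
```

With these assumed parameters the local force constant changes by tens of percent over a shift of 0.1 Angstrom in either direction, which is why harmonic parameters derived at an uncertain geometry call for caution.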

Metalloproteins are often characterized by UV/VIS and EPR/ESR spectroscopy. The spectra of the proteins can therefore serve as a reference for the computed characteristics. EPR spectroscopy involves the so-called g-tensor, which is determined by an unpaired electron. Computing this tensor with DFT is sometimes problematic for systems containing metal atoms, with the z-value of the tensor underestimated by up to 50%. The prediction of the hyperfine coupling tensor of copper is also sometimes mediocre, with z-values up to three times too large. In contrast, the hyperfine couplings of the other atoms are predicted reasonably well, especially in the case of copper proteins, where the differences between the two histidine residues in the active site are reproduced well.

The UV/VIS spectra (or excitation energies) of copper proteins can be computed with DFT only for the reduced state (with copper formally in its +1 redox state), not for the oxidized state (formal redox state +2). The UV/VIS spectra were therefore predicted by semi-empirical CNDO/INDO calculations, both for the active site alone and for the site surrounded by the protein and solvent. The surroundings were shown to play a decisive role in the computed excitation energies. In this study, the parameters for copper in the reduced state were obtained by fitting the computed semi-empirical excitation energies to the computed DFT energies.

To avoid the dependence on the crystal structure, with its uncertainty of 0.1-0.2 Angstrom, and to investigate the influence of the protein environment on the geometry of the active site, QM/MM calculations were performed. In these calculations the complete protein (including a shell of water molecules around it) was taken into account: the active site was treated with DFT (QM system) and the rest of the protein (and the solvent molecules) with a classical force field (MM system) designed for the description of biosystems like proteins. A new model (AddRemove) for the direct coupling of the QM and MM systems was developed, in which hydrogen atoms are added in the DFT calculation to satisfy the valences of the QM system. Afterwards, the interactions of the added hydrogens with the real QM atoms are corrected for, so that the artificial introduction of the hydrogens has in principle no effect on the geometry or energy.
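
The capping step can be pictured with a generic link-atom placement, sketched below in Python. This is a common textbook construction and an assumption on our part; the summary does not specify how AddRemove positions its hydrogens or how the correction terms are evaluated, so neither is reproduced here. The function name `place_link_hydrogen`, the default bond length of 1.09 Angstrom and the example coordinates are all hypothetical.

```python
import numpy as np

def place_link_hydrogen(qm_atom, mm_atom, bond_length=1.09):
    """Put a capping hydrogen on the line from a QM atom to its MM neighbour.

    The hydrogen sits at `bond_length` (Angstrom, assumed C-H-like value)
    from the QM atom, pointing toward the MM atom it stands in for.
    """
    qm = np.asarray(qm_atom, dtype=float)
    mm = np.asarray(mm_atom, dtype=float)
    direction = mm - qm
    distance = np.linalg.norm(direction)
    if distance == 0.0:
        raise ValueError("QM and MM atoms coincide; bond direction undefined")
    return qm + bond_length * direction / distance

# Hypothetical boundary bond: a QM carbon bonded to an MM carbon 1.53 Angstrom away.
qm_carbon = np.array([0.0, 0.0, 0.0])
mm_carbon = np.array([1.53, 0.0, 0.0])
link_h = place_link_hydrogen(qm_carbon, mm_carbon)
```

Because the hydrogen position is fully determined by the two real atoms, it introduces no additional degrees of freedom into the geometry optimization.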

The geometries of a number of azurin molecules (wildtype, mutated and metal-substituted) were optimized by the QM/MM method, where the total number of atoms varied from ca. 2200 to 14,000, depending on the number of water molecules included in the calculation. The optimized geometry of the active site is in general in good agreement with experimentally observed structures (either crystal or EXAFS). The only real discrepancy is found for the reduced state of the Met121Gln mutant: the computed structure is similar to the reduced wildtype azurin structure, while the crystal structure shows a large deformation. The difference between the computed and experimental structures is so large, and so contrary to the general agreement found for the other molecules, that an error in the calculations is not expected to be the cause. It rather seems that some (unexpected) other effect plays a decisive role here.

PhD thesis Marcel Swart