Recollect this earlier post on the topic of the Baeyer-Villiger reaction. In 1999 natural abundance kinetic isotope effects were reported[1] and I set out to calculate the values predicted for a particular model constructed using Quantum mechanics. This comparison of measurement and calculation is nowadays a standard verification of both experiment and theory. When the two disagree either the computational model is wrong or incomplete, or the remoter possibility that there is something not understood about the experiment.
In this case, as you can see above, the measured 13C KIE at the carbonyl carbon was in the range 1.045-1.051, whereas the theory (ωB97XD/Def2-TZVP/SCRF=DCM) predicted a significantly smaller value of 1.023 for the first kinetic step, the formation of the tetrahedral intermediate. This was the step suggested by Singleton as rate limiting in the forward direction (there was also a larger disagreement on the 2H KIE at C3/C5, measured as 0.97 and calculated as 0.997 but this might simply be a typographical error).
Now we have to find an explanation, and it was contingent upon myself to show that the theory was properly executed. At this stage, Dan Singleton offered some suggestions on this blog. This one related to how I had calculated the KIE, with Dan suggesting that the method I had used might be too inaccurate to draw any conclusions from. It was up to me to reproduce my results using the method suggested by Dan as the more accurate. And this is where this story really starts.
I had used the Free energy method, which involves calculating the free energies of reactant and product using the built-in thermochemical component of the Gaussian code. Dan’s concern was that such free energies are only reported to an accuracy of 0.000001 Hartree. Since the isotope effect is calculated from differences in these free energies, was this degree of precision high enough to ensure reliable calculated KIE? Better to solve the equations directly and these involve the so-called Bigeleisen-Mayer terms derived from the classical partition functions. I recollected we had used this method back in 1975[2], but I assumed the computer code used was long since lost by myself. Instead Quiver was suggested (first written around 1988 by Keith Laidig at Yale) and I added the comment that the modern version of Quiver is called THERMISTP[3]. All that needed to be done was to acquire a copy of an appropriate code and re-run the KIE calculation using it.
And now comes the true purpose of this post, which is reproducing a calculation that might have been done ~20 years ago using computer code. Issues such as:
- Does the code still exist in either Source form or Executable form?
- If it does, will it still run correctly on a modern computer?
- Is it documented sufficiently to allow someone not immersed in the code to run it. After ~20 years, it might not be possible to talk to the original coders for explanations.
So this is what happened when I investigated.
- Requests for the THERMISTP code of its authors, by email, have not yet brought a response. They may still, but one has to recognise that they may not. For example (unless I missed it), the article in which THERMISTP is cited[3] does not document how the code described in that article might be obtained (but the authors’ emails are provided).
- I then followed up Dan’s suggestion to use Quiver. A copy was tracked down at http://www.chem.ucla.edu/~mccarren/houkGroup.html and a tar archive was available. The program comes as Quiver itself and a pre-program called qcrunch is used to prepare the data for Quiver.
- It comes with pretty good (and brief!) documentation.
- Fortran source code for Quiver (quiver.f) is available which is good, since one can compile it for one’s favourite machine (in my case a Mac running OS X). One can also read the comments to discover information beyond that provided by the documentation.
- qcrunch however is only provided as an executable, and that means running it on Linux. One cannot read the comments!
- When I ran qcrunch, it complained that an auxiliary program was not available to it. It wanted to run something associated with Gaussian 94, sadly long since gone from our systems. With no source code available I would have to fool qcrunch into thinking that Gaussian09 was really Gaussian94. It involves symbolic links and such and needs a little Unix expertise.
- That done, I gave it its first input data, which is the normal mode analysis of the reactant involved in the reaction above, in the form of a formatted checkpoint file (a .fchk file). It seemed happy, and so I proceeded to the next step
- which was to run quiver. This needed compilation. Here we encounter interesting issues. Most Fortran code can have “dependencies” which relate to the compiler used by its developer. The compiler however is rarely stated in the documentation, and I decided to use GFortran, which I obtained from http://hpc.sourceforge.net and installed on my Mac. And using this compiler, the compilation failed with about 3 syntax errors. You have to know enough Fortran to correct them all and obtain your executable.
- Now I could run quiver to get the Bigeleisen-Mayer functions. It appeared to happily run, but produced rubbish (the anticipated frequencies had silly values of -10,000 cm-1 etc).
- Now you go back to the code, and discover that the dimension statement (yes, Fortran required you to specify the size of your arrays in the code) is limited to 40 atoms! The Baeyer-Villiger system has 48. Easily fixed. Still rubbish out!
- So now you go back to the start and create data for a tiny molecule, water in this case, and again qcrunch and quiver are run. Now sensible results, in the shape of normal mode frequencies of the anticipated values.
- Conclusion: That qcrunch is probably also dimensioned to 40 atoms, but I now have no way of correcting this because I do not (yet?) have the source code for it. It might indeed be lost. If it is, it will have to be re-written from scratch.
- Time elapsed thus far: ~2 weeks of intermittent work on the problem.
- It was at this point I had one of those happy moments of accidental discovery. I updated the OS on my Mac to 10.10.4 (this was released yesterday). One of the new features was TRIM support for solid state third party disks (SSDs). I installed it and decided to test an external SSD connected to my machine. It happened to have lots of old stuff on it. In an idle moment, I decided to search its file base for the program we had written back in 1975 and which I had convinced myself was long-lost. Hey presto, there it was (KINISOT.f). I even found example input and output files. These take the form of frequencies for reactant (normal isotopes), reactant (new isotope), transition state (normal isotope) and transition state (new isotope). Not as elegant as quiver, but entirely fit for my purpose here.
- The results:
- 13C KIE for the carbonyl carbon using the free energy method: 1.023 (see diagram above)
- 13C KIE using the Bigeleisen-Mayer partition function ratios: 1.0226
- 2H KIE for the axial α protons using the free energy method: 0.928
- 2H KIE for the axial α protons using the Bigeleisen-Mayer partition function ratios: 0.92831
- The agreement enables me to conclude that the free energy method does not in fact suffer from significant round-off errors and other inaccuracies induced by the subtraction of two very similar energies.
What I have I learnt from this experience?
- That reproducing a calculation using old computer codes (in this case ~18 years) can be a difficult and complex procedure, relying on being able to contact people and ask for copies of the code, coping with possibly non-existent documentation, and with no access to the original coder who may be the only person able to explain it.
- That 20 years of continuous improvement in the computer industry means that a problem that would require impossibly enormous amounts of computer time then might be easily feasible now. The limit of 40 atoms in 1997 for quiver and qcrunch must have seemed future proof then, but alas it did not prove so.
- That if the source code for a program dimensioned too small is lost, one may have no option but to re-write it from scratch. It would indeed be a good test of reproducibility if the answers are unchanged!
- Keep good archives of your old work. You never know when they might come in useful!
Lessons for the future?
- Document your code! Put yourself in the position of someone in 20 years time trying to make sense of it! And record the compiler used (along with flags), and also the OS.
- Archive it in Zenodo or Github, and include that documentation.[4] You might not be contactable by email in 20 years time!
- include test inputs and outputs (OK, that is mostly done nowadays).
- I intend to do the above for my program. But I will use the immediate excuse often used by others for not archiving their codes: it is not yet documented sufficiently! But surely this would only take an hour or so? Watch this space. And please forgive the coding, it was done in 1975, when I was very inexperienced (another oft-used excuse!).
What about the Baeyer-Villiger isotope effects? Well, the above merely establishes that the free energy method (which requires no extra codes) has sufficient accuracy for computing KIEs. An explanation for the difference between the reported 13C KIE at the carbonyl carbon and its calculation still needs identifying. I state at the outset that (heavy atom) tunnelling corrections are NOT yet applied, and I might try the small-curvature tunneling model suggested by Dan, since he and Borden[5] have shown it can indeed be important. It would be exciting if the hydrogen isotope effects can be reproduced without tunnelling but that the carbon KIEs do require it.
I have received much help in the above saga from Jordi Bures Amat here, Erik Plata in Donna Blackmond’s group and information about another useful system for calculating KIE, the ISOEFF program.[6]
References
- D.A. Singleton, and M.J. Szymanski, "Simultaneous Determination of Intermolecular and Intramolecular <sup>13</sup>C and <sup>2</sup>H Kinetic Isotope Effects at Natural Abundance", Journal of the American Chemical Society, vol. 121, pp. 9455-9456, 1999. https://doi.org/10.1021/ja992016z
- M.J.S. Dewar, S. Olivella, and H.S. Rzepa, "Ground states of molecules. 49. MINDO/3 study of the retro-Diels-Alder reaction of cyclohexene", Journal of the American Chemical Society, vol. 100, pp. 5650-5659, 1978. https://doi.org/10.1021/ja00486a013
- M. Saunders, M. Wolfsberg, F.A.L. Anet, and O. Kronja, "A Steric Deuterium Isotope Effect in 1,1,3,3-Tetramethylcyclohexane", Journal of the American Chemical Society, vol. 129, pp. 10276-10281, 2007. https://doi.org/10.1021/ja072375r
- H.S. Rzepa, "KINISOT. A basic program to calculate kinetic isotope effects using normal coordinate analysis of transition state and reactants.", 2015. https://doi.org/10.5281/zenodo.19272
- O.M. Gonzalez-James, X. Zhang, A. Datta, D.A. Hrovat, W.T. Borden, and D.A. Singleton, "Experimental Evidence for Heavy-Atom Tunneling in the Ring-Opening of Cyclopropylcarbinyl Radical from Intramolecular <sup>12</sup>C/<sup>13</sup>C Kinetic Isotope Effects", Journal of the American Chemical Society, vol. 132, pp. 12548-12549, 2010. https://doi.org/10.1021/ja1055593
- V. Anisimov, and P. Paneth, "ISOEFF98. A program for studies of isotope effects using Hessian modifications", Journal of Mathematical Chemistry, vol. 26, pp. 75-86, 1999. https://doi.org/10.1023/a:1019173509273