Posts Tagged ‘computational chemistry’

An Ambimodal Trispericyclic Transition State: the effect of solvation?

Thursday, May 2nd, 2019

Ken Houk’s group has recently published this study of cycloaddition reactions, using a combination of classical transition state location followed by molecular dynamics trajectory calculations,[1] and to which Steve Bachrach’s blog alerted me. The reaction struck me as being quite polar (with cyano groups) and so I took a look at the article to see what both the original[2] experimental conditions were and how the new simulations compared. The reaction itself is shown below.


Turns out that chloroform was used as solvent (also benzene), whilst the transition state calculations and the subsequent molecular dynamics trajectories were modelled for the gas phase. The key observation is that if TS1 is used as the starting point for trajectory calculations, only 87% lead to the product predicted by classical transition state theory (3), as revealed by a classical intrinsic reaction coordinate (IRC) calculation. The remaining 13% lead to 4 and 5, the ambimodal effect. So here, I want to explore what effect including a continuum solvent on the computation of TS1 and its IRC might have on the classical (non-dynamic) model.

Firstly, the model for TS1 as reported[3], ωB97XD/6-31G(d) (FAIR data at DOI: 10.14469/hpc/5590). The basis set is modest by today’s standards, but is largely imposed by the need to use it for very large numbers of trajectory calculations. I was able to copy/paste the coordinates from the reported supporting information and then to replicate the IRC at this level (gas phase), also shown in the SI.

There is a feature in this IRC I want to expand upon (red arrow above). It occurs at an IRC value of ~-0.9 on the axis below.

It can be seen more clearly if the RMS gradient norm is plotted, the value of which drops to almost zero at IRC -0.9. Had it reached exactly 0.0, we would have had an intermediate formed. As it is we have what is called a hidden intermediate. The origins of this intermediate can be more readily inferred from this dipole moment plot along the IRC. At TS1, the dipole moment is > 10D. A rule of thumb I have often used is that if a TS has a DM > 10, then one cannot ignore solvation any more and a gas phase model must be augmented.

Here are the same plots, but now with an added solvent field for water, an extreme polarity.

The IRC now stops at -2, being a high energy true (ionic) intermediate (rather than a hidden one). The IRC stops because gradient norm has now reached 0.0 at IRC -2, again an indication of a true intermediate.

The dipole moment has been increased from 10 in the gas phase to around 15 at TS1 and it continues to increase until the ionic intermediate is reached. These differences from the gas phase plots are induced entirely by applying a continuum solvent model.

Next an intermediate solvent, chloroform, being one of the solvents used for the actual reaction. This time the gradient norm almost reaches a value of 0.0, avoiding it only by a whisker! A barely hidden intermediate.

The dipole moment totters around 13.5D, before finally collapsing as the ionic intermediate itself collapses to a neutral molecule again. Benzene as solvent (not shown here) reaches an intermediate dipole moment of about 12D. It too can stabilize an ionic intermediate noticeably even though it is not ionic itself.

I want to also briefly explore what effect if any the use of a relatively small basis set (6-31G(d)) has on the shape of the IRC. Below is a repeat of the gas phase IRC using the Def2-TZVPP basis, which is about twice the size of the smaller one (and hence is around 16 times slower to compute). The gradient norm shows that the “hidden intermediate” region around IRC -1 is a little more prominent (flatter).

So we see that both a solvent model (as a continuum field) and a larger basis set can increase the degree of “hidden intermediate” character in the classical reaction coordinate for this cycloaddition reaction, to the extent that if water is used as model solvent an actual discrete albeit shallow ionic intermediate forms. As Houk puts it, solvation induces a conversion from an entropic intermediate to an enthalpic one.[4] Molecular dynamics trajectories however have a propensity for not settling into quite shallow intermediates (those with escape barriers of < 3 kcal/mol, as would be the case here).

It will indeed be interesting to see the extent, if any, that either of the augmented models shown above affect the calculated distribution of molecular dynamics trajectories compared to those obtained using a gas phase model.

References

  1. X. Xue, C.S. Jamieson, M. Garcia-Borràs, X. Dong, Z. Yang, and K.N. Houk, "Ambimodal Trispericyclic Transition State and Dynamic Control of Periselectivity", Journal of the American Chemical Society, vol. 141, pp. 1217-1221, 2019. https://doi.org/10.1021/jacs.8b12674
  2. C.Y. Liu, and S.T. Ding, "Cycloadditions of electron-deficient 8,8-disubstituted heptafulvenes to electron-rich 6,6-disubstituted fulvenes", The Journal of Organic Chemistry, vol. 57, pp. 4539-4544, 1992. https://doi.org/10.1021/jo00042a039
  3. https://doi.org/
  4. O.M. Gonzalez-James, E.E. Kwan, and D.A. Singleton, "Entropic Intermediates and Hidden Rate-Limiting Steps in Seemingly Concerted Cycloadditions. Observation, Prediction, and Origin of an Isotope Effect on Recrossing", Journal of the American Chemical Society, vol. 134, pp. 1914-1917, 2012. https://doi.org/10.1021/ja208779k

Smoke and mirrors. All is not what it seems with this Sn2 reaction!

Thursday, April 4th, 2019

Previously, I explored the Graham reaction to form a diazirine. The second phase of the reaction involved an Sn2′ displacement of N-Cl forming C-Cl. Here I ask how facile the simpler displacement of C-Cl by another chlorine might be and whether the mechanism is Sn2 or the alternative Sn1. The reason for posing this question is that as an Sn1 reaction, simply ionizing off the chlorine to form a diazacyclopropenium cation might be a very easy process. Why? Because the resulting cation is analogous to the cyclopropenium cation, famously proposed by Breslow as the first example of a 4n+2 aromatic ring for which the value of n is zero and not 1 as for benzene.[1] Another example of a famous “Sn1” reaction is the solvolysis of t-butyl chloride to form the very stable tertiary carbocation and chloride anion (except in fact that it is not an Sn1 reaction but an Sn2 one!)

Here is the located transition state for the above, using Na+.6H2O as the counter-ion to the chloride. The calculated free energy of this transition state is 3.2 kcal/mol lower than the previous Sn2′ version (FAIR data collection, 10.14469/hpc/5045), with an overall barrier to reaction of 26.5 kcal/mol. This compares to ~24.5 kcal/mol obtained by Breslow for solvolysis of the cyclopropenyl tosylate. Given the relatively simple solvation model I used in the calculation (only six waters to solvate all the ions, and a continuum solvent field for water), the agreement is not too bad.

The animation above is of a normal vibrational mode known as the transition mode (click on the image above to get a 3D rotatable animated model). The calculated vectors for this mode (its energy being an eigenvalue of the force constant matrix) are regularly used to “characterise” a transition state. I will digress with a quick bit of history here, starting in 1972 when another famous article appeared.[2] The key aspect of this study was the derivation of the first derivatives of the energy of a molecule with respect to the (3N) geometrical coordinates of the atoms, using a relatively simply quantum mechanical method (MINDO/2) to obtain that energy. Analytical first derivatives of the MINDO/2 Hamiltonian were then used to both locate the transition state for a simple reaction and then to evaluate the second derivatives (the force constant matrix) using a finite difference method. That force constant matrix, when diagonalized, reveals one negative root (eigenvalue) which is characteristic of a transition state. The vectors reveal how the atoms displace along the vibration, and should of course approximate to the path to either reactant or product.

Since that time, it has been a more or less mandatory requirement for any study reporting transition state models to characterise them using the vectors of the negative eigenvalue. The eigenvalue invariably expressed as a wavenumber. Because this comes from the square root of the mass-weighted negative force constant, it is often called the imaginary mode. Thus in this example, 115i cm-1, the i indicating it is an imaginary number. The vectors are derived from quadratic force constants, which is a parabolic potential surface for the molecule. Since most potential surfaces are not quadratic, it is recognized as an approximation, but nonetheless good enough to serve to characterise the transition state as the one connecting the assumed reactant and product. Thousands of published studies in the literature have used this approach.

So now to the animation above. If you look closely you will see that it is a nitrogen and not a carbon that is oscillating between two chlorines (here it is the lighter atoms that move most). The vectors confirm that, with a large one at N and only a small one at C. So it is Sn2 displacement at nitrogen that we have located? 

Not so fast. This is a reminder that we have to explore a larger region of the potential energy surface, beyond the quadratic region of the transition state from which the vectors above are derived. This is done using an IRC (intrinsic reaction coordinate). Here it is, and you see something remarkable.

The Cl…N…Cl motions seen above in the transition state mode change very strongly in regions away from the transition state. On one side of the transition state, it forms a Cl…C bond, on the other side a Cl…N.

It is also reasonable to ask why the paths either side of the transition state are not the same? That may be because with only six explicit water molecules, three of which solvate the sodium ion, there are not enough to solvate equally the chloride anions either side of the transition state. As a result one chlorine does not behave in quite the same way as the other. The addition of an extra water molecule or two may well change the resulting reaction coordinate significantly.

The overall message is that there are two ways to characterise a computed reaction path. One involves looking at the motions of all the atoms just in the narrow region of the transition state. Most reported literature studies do only this. When the full path is explored with an IRC, a different picture can emerge, as here. The Cl…N…Cl Sn2 mode is replaced by a Cl…C/N…Cl mode. This example however is probably rare, with most reactions the transition state vibration and the IRC do actually agree!

References

  1. R. Breslow, "SYNTHESIS OF THE s-TRIPHENYLCYCLOPROPENYL CATION", Journal of the American Chemical Society, vol. 79, pp. 5318-5318, 1957. https://doi.org/10.1021/ja01576a067
  2. J.W. McIver, and A. Komornicki, "Structure of transition states in organic reactions. General theory and an application to the cyclobutene-butadiene isomerization using a semiempirical molecular orbital method", Journal of the American Chemical Society, vol. 94, pp. 2625-2633, 1972. https://doi.org/10.1021/ja00763a011

Reaction coordinates vs Dynamic trajectories as illustrated by an example reaction mechanism.

Monday, March 20th, 2017

The example a few posts back of how methane might invert its configuration by transposing two hydrogen atoms illustrated the reaction mechanism by locating a transition state and following it down in energy using an intrinsic reaction coordinate (IRC). Here I explore an alternative method based instead on computing a molecular dynamics trajectory (MD).

I have used ethane instead of methane, since it is now possible to envisage two outcomes:

An animation of the IRC starting from the located transition state is shown below (DOI: 10.14469/hpc/2331). This is based purely on the computed potential energy surface of the molecule. The IRC is computed from the forces experienced on the atoms as they are displaced from an initial set of coordinates corresponding to the located transition state and then following the direction indicated by the eigenvectors of the negative force constant required of a transition state. Importantly, there is no time component; the path is based entirely on energies and forces.

Next, a molecular dynamics simulation (ωB97XD/6-31G(d,p), DOI: 10.14469/hpc/2330).  This uses the ADMP method, which requests a classical trajectory calculation using the “atom-centered density matrix propagation molecular dynamics model”. This integrates kinetic energy contributions from the molecular vibrations and so explicitly now includes a time component. In this example the evolution of the system from the transition state is charted over a period of 100 femtoseconds (1000 integrated steps). As it happens this is a relatively short period of evolution; sometimes periods of picoseconds may be required to get a realistic model, especially if one is also dealing with explicit solvent molecules (of which perhaps 500 might be required).

Such explicit inclusion of the kinetic energy from molecular vibrations in effect allows the molecule to “jump” over shallow barriers. In this case, the barrier for a [1,2] hydrogen shift from the methyl group to the adjacent carbene (watch atom 8). Simultaneously, the path taken by two hydrogens no longer corresponds to their transposition but to their elimination as a hydrogen molecule. So this quite different outcome from the IRC is very probably also a much more realistic one.

If the MD method is so much more realistic than the IRC, then why is it not always used? The simple answer is computational time! For this very small molecule and using quite a modest basis set (6-31G(d,p)), the relatively short 1000 time steps took about three times as long to compute as the IRC. The factor gets worse as the size of the molecule increases and the number of time steps for a realistic result increases. Perhaps, as technology gets better and new computer architectures emerge, MD simulations of ever increasingly complex reactions will become far more common. In ten years time, I expect most of the examples on this blog will use this method!

The 2016 Bradley-Mason prize for open chemistry.

Tuesday, October 4th, 2016

Peter Murray-Rust and I are delighted to announce that the 2016 award of the Bradley-Mason prize for open chemistry goes to Jan Szopinski (UG) and Clyde Fare (PG).

Jan’s open chemistry derives from a final year project looking at why atom charges derived from quantum chemical calculation of the electronic density represent chemical information well, but the electrostatic potential (ESP) generated from these charges is very poor and conversely charges derived from the computed electrostatic potential are incommensurate with chemical information (such as the electronegativity of atoms). He has developed a Python program called ‘repESP’ in which ‘compromise’ charges are generated which attempt to reconcile the physical world-view (fitting the ESP) with chemical insight provided by NPA (Natural Population Analysis). Jan was the main driver to making his code open source, “opening his supervisor’s eyes” to the various flavours of open source licences. To ensure that all subsequent improvements to the program remain available to anyone, the source code has been released under a ‘copyleft’ licence (GPL v3) and is maintained by Jan on GitHub, where Jan looks forward to helping new users and collaborating with contributors.

Clyde has made various contributions to opensource chemistry over the period of his PhD, with the focus mainly on utilities to improve quantum chemical research and the enhancement of a popular machine learning library with a method that has been successful in chemometrics, creation of an opensource channel for teaching chemists programming and data analysis and creation of a tool to help encourage open sourcing software development. Cclib is the most popular library for parsing quantum chemical data from output files and Clyde has contributed patches for the Atomic simulation environment which enables control of quantum chemical codes from a unified python interface. He was responsible for the construction of a computational chemistry electronic notebook published to github and which is now under active development by others as well. This aims to encapsulate computation chemical research projects, both for the sake of reproducibility and for the sake of organising and keeping track of quantum chemical research. Alongside this platform he created an enhanced Gaussian calculator for the Atomic Simulation Environment that enables automatic construction of ONIOM input files, also now under active development. He also made contributions to scikit learn, the most popular python machine learning framework, implementing a kernel for Kernel Ridge Regression that has become the most successful kernel for regression over molecular properties. He was part of the team that won the 2014 sustainable software conference prize for creation of the opensource healthchecker software as part of Sustain. He has argued for opensource as a platform for teaching resources and created the Imperial Chemistry github user account, which is now run by the department. Materials for the Imperial Chemistry Data Analysis and Programming workshops implemented as Python Notebooks are now available through this account and continue under active development.

Criteria for the award will include judging the submission on its immediate accessibility via public web sites, what is visible and re-usable in this way and of evidence of either community formation/engagement or re-use of materials by people other than the proposer.

The chemical Web at 22 and where it might go.

Wednesday, August 19th, 2015

This post is prompted by the appearance of a retrospective special issue of C&E news, with what appears to be its very own Website: internet.cenmag.org. It contains articles and interviews with many interesting people, along with several variations on the historical (albeit rather USA-centric) perspectives and a time-line covers many of the key innovations (again, from a USA-perspective). Some subjects are covered in greater depth, including computational chemistry. The periodic table too gets coverage, but surprisingly that is not of Mark Winter’s WebElements, which carries the impressive 1993-2015 continuous timeline (hence 22 in the title!).  

You can find mention of Tim Berners-Lee at the CEN site (but no interview with Sir Tim), so here I contribute to the historical record by showing the plaque placed in the corridor of offices at CERN where TiMBL and Robert Cailliau worked to set the Web up in 1991.

Click to expand

Click to expand

What is not really given much prominence in the C&E news article is DATA. Which arguably is the reason why TimBL set things going back in 1989! Zenodo is (just like the Web before it) a spin-off from the activities at CERN, but now handling just data. Or Code. Or indeed almost any useful outcome of the research process that might be useful to someone else or to posterity. And to put it into context, it comes in two parts:

  1. The data store itself (which CERN are especially good at, since the Large Hadron Collider generates a great deal of data). They add the A to FAIR (Findable, Accessible, interoperable and Reusable). And also a certain confidence that this store will be enduring, not here today and gone tomorrow.
  2. The metadata describing the data, which in fact turns out is stored somewhere else, at DataCite. This organisation serves to add the F to FAIR.

And to show how this works in practice I can do no better than give this link: search.labs.datacite.org/help/examples which shows how you can benefit from the metadata.

I have already demonstrated the use of Zenodo for archiving some old computer code of mine for calculating kinetic isotope effects, but of course it is so much more than that. If the first CERN spin-out, the Web, is already 22 years old (for chemists), then I am confident in asserting that facilities such as Zenodo will play an increasingly important role over the next 22 years (indeed a much shorter timescale than that). 


A personal souvenir can be seen here.

Digital repositories. An update.

Saturday, July 21st, 2012

I blogged about this two years ago and thought a brief update might be in order now. To support the discussions here, I often perform calculations, and most of these are then deposited into a DSpace digital repository, along with metadata. Anyone wishing to have the full details of any calculation can retrieve these from the repository. Now in 2012, such repositories are more important than ever. 

In the UK, the main funding organisations are increasingly requiring researchers to deposit their primary data in such open archives, and some disciplines are better than others at this (chemistry does not rank very highly in general however in terms of deposition of data). Our DSpace server is a local one running at Imperial College, but a few months back I became aware of Figshare, which aspires to operate on a much wider and more general scale.  So I have injected one of the calculations reported in another post (the IRC for the sodium tolyl thiolate reaction with dichlorobutenone) into Figshare, making use of the API which has recently been developed for this purpose and implemented by  Matt Harvey. As with DSspace, it issues a DOI, which can be then quoted wherever appropriate (and particularly in scientific articles). This particular deposition is 10.6084/m9.figshare.93096

This repository is still undergoing a lot of development, but already one can see many interesting features, such as export to Endnote or Mendeley, and a QR barcode for devices with cameras. I would encourage anyone who regularly generates e.g. computational chemistry data, or knows a group that does, to encourage them to make use of such facilities.

Postscript: If you have a look at this deposition in Figshare you may already notice some of the developments I note above.  Matt Harvey (who, with Mark Hahnel of Figshare, developed our publish script) has added to the entry:

* A data descriptor document URL

* Wikipedia and pubchem links (automatically resolved from Inchi/Key searches)

* Links to chemspider searches

* Links to all other objects in the  Spectra DSpace repository with a common Inchi/Key

Origins of the Regioselectivity of Cyclopropylcarbinyl Ring Opening Reactions.

Friday, July 20th, 2012

Twenty years are acknowledged to be a long time in Internet/Web terms. In the early days (in 1994), it was a taken that the passage of 1 Web day in the Internet time-warp was ~≡ 7 for the rest of the world (the same factor as applied to the lives of canines). This temporal warping can also be said to apply to computational chemistry. I previously revisited some computational work done in 1992, and here I rediscover another investigation from that year[1] and that era. The aim in this post is to compare not only how the presentation of the results has changed, but how the computational models have as well.

Experiment had shown that Sabinene undergoes a radical ring-opening of the cyclopropane when treated with CCl3 radicals. If BrCCl3 is used as solvent, the kinetic 5-exo product is immediately trapped. If instead the reaction is conducted in the less reactive trap CCl4, the thermodynamic 6-endo product is isolated. The objective was to investigate the origins of these effects. In 1992, computational modelling was limited by the speed and memory of the computers to the following:

  1. Semi-empirical methods such as PM3 (ab initio methods were used only sparingly).
  2. Larger groups (in this case, the CCl3 and isopropyl groups) were trimmed off
  3. Simulations were often for the gas phase only (although the self-consistent-reaction-field was starting to be used to simulate solution).
  4. The properties of transition states were analysed via their molecular orbitals alone, and these were often disconcertingly complex:

The conclusion in 1992 using these techniques found that the transition state for 5-exo ring formation was 3.1 kcal/mol higher than for 6-endo, contrary to the experimental result. With no support from mere activation energies, perhaps slightly desperate recourse was made to an orbital correlation diagram, and the discussion included, inter alia, an arcane feature involving an avoided orbital crossing unique to the 5-exo transition state. Perhaps, in retrospect, rather too arcane for the intended audience, since this is unfortunately not a well cited article.

How might one do things differently (better?) twenty years on?

  1. Here, I start with presenting a (3D) model of each transition state. This was not done in 1992 for reasons of space (the journal format limited the page length very strictly) and of course the journal was only available in printed form (no e-journals then!).
  2. The model itself can be greatly improved (ωB97XD/6-311G(d,p)/CPCM=CCl4).  We now have a DFT calculation, including proper dispersion terms (which PM3 lacks by the way) and good triple-ζ basis set (PM3 is single-ζ), with inclusion of solvation (even though this is a radical, the dipole moments are nevertheless in the range 3-4D, and hence a gas phase model may not be entirely appropriate) and with no trimming off of groups. Crucially (in retrospect), my decision to delete the CCl3 group in 1992 was not a sound one! We have no archive from those days however, so cannot double-check this point. The modern calculation is indeed archived here (although of course whether this archive will itself still be available in 20 years time remains to be established!). 
  3. With modern computers, these new models took about 2.25 hours each to compute (the entire project was done in one working day). The 5-exo transition state is shown below:
  4. The presentation of this model can also be improved from that available 20 years ago. As usual, just click on the image above to see it.
  5. The free energy of activation, ΔG298 = 10.6 kcal/mol, is an entirely reasonable value for radical ring opening of a cyclopropyl (this value is mysteriously not reported in the 1992 version, for which I also take complete blame).
  6. The isomeric 6-endo transition state (which is observed to be kinetically slower) indeed now has the higher calculated barrier (ΔG298 = 11.5 kcal/mol) and this value corresponds to a process about 5 times slower than 5-exo. Recollect, PM3 obtained the opposite result, but possibly that was because the CCl3 group was not present in the model.
  7. We can learn a little about the dynamics of the reaction path; note how the isopropyl group rotates near the end of the ring-opening, due to some form of σ-conjugation no doubt.
  8. Instead of delocalised molecular orbitals, we are going to present localized NBOs, and in particular look at the localised effect to the C-CCl3 bond. The orbitals for the 5-exo transition state are shown first. The red-blue is the C-CCl3 σ* NBO orbital, the purple-orange is the highest energy doubly occupied NBO orbital (these two are selected because they represent a pair with a small energy gap, which means a larger interaction energy). Where blue and purple, or orange and red overlap, we have a stabilizing influence.
  9. The equivalent pair of NBOs for the 6-endo transition state overlaps much less well (click on image to get a rotatable 3D model to see for yourself). 
  10. Nevertheless, the 6-endo transition state manages an overlap between the highest singly occupied NBO and the C-Cl σ*, but because it involves only one, and not a pair of electrons, the stabilizing effect is smaller.
  11. What we conclude is that at the transition state, the 5-exo isomer has the more stabilizing orbital overlaps, but that beyond the transition state, the greater thermodynamic stability of the 6-endo isomer takes over.

Well, here we have a refresh of some chemistry analysed 20 years ago, and done with the help of software and hardware tools that have advanced considerably during this period. One may only speculate what another refresh in 20 years time might bring! 

References

  1. R.A. Batey, P. Grice, J.D. Harling, W.B. Motherwell, and H.S. Rzepa, "Origins of the regioselectivity of cyclopropylcarbinyl ring opening reactions in bicyclo [n.1.0] systems", Journal of the Chemical Society, Chemical Communications, pp. 942, 1992. https://doi.org/10.1039/c39920000942

The Dieneone-phenol controversies.

Monday, April 30th, 2012

During the 1960s, a holy grail of synthetic chemists was to devise an efficient route to steroids. R. B. Woodward was one the chemists who undertook this challenge, starting from compounds known as dienones (e.g. 1) and their mysterious conversion to phenols (e.g. 2 or 3) under acidic conditions. This was also the golden era of mechanistic exploration, which coupled with an abundance of radioactive isotopes from the war effort had ignited the great dienone-phenol debates of that time (now largely forgotten). In a classic recording from the late 1970s, Woodward muses how chemistry had changed since he started in the early 1940s. In particular he notes how crystallography had revolutionised the reliability and speed of molecular structure determination. Here I speculate what he might have made of modern computational chemistry, and in particular whether it might cast new light on those mechanistic controversies of the past.

Charting the mechanistic pathway connecting 1 and 2 was first done by Capsi using 14C labels (* in the diagram above, on a steroid derivative),  when after claimed selective Birch reduction of the blue double bond in 2, alkene ozonolysis and decarboxylative loss of *, all the radioactivity ended up in the CO2. This showed that the mechanism involved path (a). For paths (b) or (c), the label would have ended up at * and hence not oxidatively lost as CO2. Futaki did the experiment in a different way, putting his 14C label in the position * where he found that only about half of the label was retained in this position (and then lost when he specifically degraded 2 by oxidatively removing that carbon). This now strongly implicated path (b), and also seemed to disprove not only path (a) but also mechanism (c), where a [1,5] shift should have retained the label at the original position (and caused all of it to be lost upon decarboxylation). It was these two apparently contradictory results that helped ignite the controversies.

All the routes (a)-(e) above involve pericyclic sigmatropic reactions, the understanding of which was about to be revolutionised by Woodward (with Hoffmann) in the mid 1960s. In fact, the mechanism here comprises a mixture of [1,2] cationic sigmatropic migrations and [1,5] neutral sigmatropic migrations. To balance one against the other, can computational chemistry come to the rescue? I first note that the mechanisms above are all shown as cations. Until recently, a computational chemist would simply set the charge on their model to +1 and proceed onwards and upwards. But now we can do a bit better. We can (arguably we always should) include the counterion, and so in my own exploration, I have included a perchlorate anion, and the whole study then becomes one of a neutral system (charge =0), a zwitterion. A B3LYP/6-311G(d,p) model with SCRF=water continuum solvent was employed. Let us see what emerges:

  1. Path (a) involves a [1,2] angular methyl (R=Me) migration, which turns out to have ΔG28.5 kcal/mol. The IRC for this migration is shown below. 

    The (Wheland) intermediate then loses a proton to give 2.

  2. Path (b) involves an alternative rate-limiting migration of the angular methyl, ΔG30.2 kcal/mol, followed by two lower energy [1,2] migrations of the ring ΔG27.8 and 25.3via a spiro-ring Wheland intermediate (relative energy +3.8 kcal/mol), and deprotonation to again give 2.

    Path b-1. Click for 3D

    Path b-2. Click for 3D

    Path b-3. Click for 3D

    Notice how the perchlorate counterion is relatively free to change its position relative to the substituents, and not all these positions have been explored here. This stochastic problem is an issue with counter ions (more accurately, this problem is almost always massaged away by simply ignoring this counterion. But if its ultimate positioning does matter, then one must argue that its inclusion is essential in order to build a good model). 

  3. The energy of path (a) is thus seen to be 1.7 kcal/mol lower than (b/c), which is sufficient to favour positioning of most of the 14C tracer on * rather than and which seems to favour the Capsi mechanism over the Futaki one, although clearly the balance between the two is a fine one. The  Capsi mechanism does seem to hinge on the observation that  Birch reduction of  1 reduces the blue bond entirely specifically, and the evidence for this does need to be reviewed (in an informatics sense, this evidence is buried in a string of logically connected semantic inferences, each of which may well be contained as a passing comment in a different article).
  4. Regarding the matter of whether path (b) or  path (c) is the better representation, this goes to the heart of whether the path is respectively stepwise or concerted. The barriers for escape out of the spiro-ring intermediate defining the steps in path  (b) are key. The IRC for a reaction path with a shallow intermediate  is shown below. If the depth of the well it finds itself in imparts sufficient lifetime for it to lose all  (dynamic) memory of where it came from, then the probability of the  * label remaining in its original position is only 50%, since the other (symmetrically equivalent but unlabeled) position may also migrate in the next step. This seems to be the case for path (b), where the intermediate is in quite a deep well (21.5 kcal/mol for escape), and this is consistent with Futaki’s experiment. If the intermediate however were to be in only in a shallow minimum (2-4 kcal/mol), the momentum it inherits from the previous transition state may carry it over to the second stage without scrambling the isotope. For systems such as these, we do encounter a serious limitation of simple transition state theory, and must start to adopt a molecular dynamics approach. This might also apply to the positioning of the counterion, although perhaps less so for the relatively heavy perchlorate. It may also be an interesting issue of electron dynamics. Path (c) formally involves six electrons, path (b) only two. In a previous post, I speculated whether the electronic pack size for proton transfer was 4,6 or 8 electrons. Perhaps one day it will be possible to either measure (attosecond spectroscopy) or compute the preferred dynamics.
The points made in the last section come to the fore in a result obtained by Hopff and Drieding (he of the models). They confirmed the formation of 2 from 1, and also reported that at 80°C in 70% perchloric acid, 2 was itself then converted in two hours to 3. The debate again turns to whether this is accomplished via path (d) involving 2-electron shifts or path (e) involving a 6-electron shift. No radio-labelling experiments have been reported on this system. 
 
Well, as suspected perhaps, the computational analysis of the dienone-phenol rearrangements has shown the system to be poised on a knife-edge (of chaos). Tiny changes might swing things one way or the other. Adding two further (steroid rings) to  1 might of itself change the balance between e.g. path (a) and  path (b). So too might a change of counterion, or indeed solvent. One needs to identify the evidence that selective reduction of 2 reduces just the blue bond. If computational chemistry has not (yet) provided a clear-cut resolution to the chemistry of this system, at least it can identify new experiments that might.

Postscript:  I posed the question above about  Capsi’s identification of the reduction product of 2. The two possible products would give different outcomes for whether the * label would be lost upon subsequent oxidation or not.

If the reaction is thermodynamically controlled, then the relative free energies of 3 and 4 would determine the outcome. A B3LYP/6-311G(d,p) calculation (in ethanol as solvent, which has a very similar dielectric to liquid ammonia) predicts 4 is about 0.3 kcal/mol lower than 3. This does not suggest that the reaction is going to be particularly regioselective, and of course Capsi’s interpretation depends on the product being entirely 4, with no 3 formed.

Confirming the Fischer convention as a structurally correct representation of absolute configuration.

Tuesday, March 13th, 2012

I wrote in an earlier post how Pauling’s  Nobel prize-winning suggestion in February 1951 of an (left-handed) α-helical structure for proteins was based on the wrong absolute configuration of the amino acids (hence his helix should really have been the right-handed enantiomer). This was most famously established a few months later by Bijvoet’s definitive crystallographic determination of the absolute configuration of rubidium tartrate, published on August 18th, 1951 (there is no received date, but a preliminary communication of this result was made in April 1950). Well, a colleague (thanks Chris!) just wandered into my office and he drew my attention to an article by John Kirkwood (DOI: 10.1063/1.1700491) published in April 1952, but received July 20, 1951, carrying the assertion “The Fischer convention is confirmed as a structurally correct representation of absolute configuration“, and based on the two compounds 2,3-epoxybutane and 1,2-dichloropropane. Neither Bijvoet nor Kirkwood seem aware of the other’s work, which was based on crystallography for the first, and quantum computation for the second. Over the years, the first result has become the more famous, perhaps because Bijvoet’s result was mentioned early on by Watson and Crick in their own very famous 1953 publication of the helical structure of DNA. They do not mention Kirkwood’s result. Had they not been familiar with Bijvoet’s result, their helix too might have turned out a left-handed one!

I record all this because I was today asked to provide an abstract for an NSCCS Themed Workshop shortly to be held at Imperial College on the uses of the Gaussian computational chemistry program in synthetic chemistry. One of the themes will be chiroptical spectroscopy. Gaussian of course deploys much of the theory developed by Kirkwood in the 1950s to make exactly the same sort of predictions that Kirkwood himself used to verify the Fischer convention in 1951. Whilst the majority of modern determinations of absolute configuration are still based on Bijvoet’smethod, catching rapidly up are those based on chiroptical calculations. Perhaps in 2012 they are trusted more than they were in the 1950s? At any rate, such calculations are nowadays very much part of a modern undergraduate laboratory experience (slightly less so still in research laboratories I fear).

Here is another coincidence. Both Pauling and Kirkwood worked in the same department (Institute of Technology, Pasadena, California). One can only speculate on whether Kirkwood might have wandered into Pauling’s office in late 1951 to alert him that the protein helix should be right rather than left-handed (oh to have been a fly on Pauling’s blackboard). So alerted, would Pauling have foreseen that eventually such determinations would be routinely made using the very quantum mechanics that he had popularised?

Computers 1967-2011: a personal perspective. Part 4. Moore’s Law and Molecules.

Friday, October 28th, 2011

Moore’s law describes a long-term trend in the evolution of computing hardware, and it is often interpreted in terms of processing speed. Here I chart this rise in terms of the size of computable molecules. By computable I mean specifically how long it takes to predict the geometry of a given molecule using a quantum mechanical procedure.

LSD, the 1975 benchmark for computable molecules.

The geometry (shape) of a molecule is defined by 3N-6 variables, where N is the number of atoms it contains. Optimising the value of variables in order to obtain the minimum value of a function was first conducted by chemical engineers, who needed to improve the function of chemical reactor plants. The mathematical techniques they developed were adopted to molecules in the 1970s, and in 1975 a milestone was reached with the molecule above. Here, N=49, and 3N-6=141. The function used was one describing its computed enthalpy of formation, using a quantum mechanical procedure known as MINDO/3. The computer used was what passed then for a supercomputer, a CDC 6600 (of which a large well endowed university could probably afford one of). It was almost impossible to get exclusive access to such a beast (its computing power was shared amongst the entire university, in this case of about 50,000 people), but during a slack period over a long weekend, the optimised geometry of LSD was obtained (it’s difficult to know how many hours the CDC 6600 took to perform this feat, but I suspect it might have been around 72). The result was announced by Paul Weiner to the group I was then part of (the Dewar research group), and Michael immediately announced that this deserved an unusual Monday night sojourn to the Texas Tavern, where double pitchers of beer would be available. You might be tempted to ask what the reason for the celebration was. Well, LSD was a “real molecule” (and not a hallucination). It meant one could predict for the first time the geometry of realistic molecules such as drugs and hence be taken seriously by people who dealt with molecules of this size for a living. And if you could predict the energy of its equilibrium geometry, you could then quickly move on to predicting the barriers to its reaction. A clear tipping point had been reached in computational simulation.

In 1975, MINDO/3 was thought to compute an energy function around 1000 to 10,000 faster than the supposedly more accurate ab initio codes then available (in fact you could not then routinely optimise geometries with the common codes of this type). With this in mind, one can subject the same molecule to a modern ωB97XD/6-311G(d,p) optimisation. This level of theory is probably closer to 104 to 105 times slower to compute than MINDO/3. On a modest “high performance” resource (which nowadays runs in parallel, in fact on 32 cores in this case), the calculation takes about an hour (starting from a 1973 X-ray structure, which turns out to be quite a poor place to start from, and almost certainly poorer than the 1975 point). In (very) round numbers, the modern calculation is about a million times faster. Which (coincidentally) is approximately the factor predicted by Moore’s law.

I will give one more example, this time for an example dating from around 2003, 28 years on from the original benchmark.

Transition state for lactide polymerisation.

This example has 114 atoms, and hence 3N-6 =336, or 2.42 times the 1975 size. It is a transition state, which is a far slower calculation then an equilibrium geometry. It is also typical of the polymerisation chemistry of the naughties. Each run on the computer (B3LYP/6-31G(d), with the alkyl groups treated at STO-3G) now took about 8-10 days (on a machine with 4 cores), and probably 2-4 runs in total would have been required per system (of which four needed to be studied to derive meaningful conclusions). Let us say 1000 hours per transition state. Together with false starts etc, the project took about 18 months to complete. Move on to 2010; added to the model was a significantly better (= slower) basis set and a solvation correction, and a single calculation now took 67 hours. In 2011, it would be reduced to ~10 hours (by now we are up to 64-core computers).

In 2011, calculations involving ~250 atoms are now regarded as almost routine, and molecules with up to this number of atoms cover most of the discrete (i.e. non repeating) molecular systems of interest nowadays. But the 1975 LSD calculation still stands as the day that realistic computational chemistry came of age.