Posts Tagged ‘molecular systems’

WATOC2014 Conference report. Emergent themes.

Thursday, October 9th, 2014

This second report highlights two “themes”, or common ideas that emerged spontaneously from quite different talks. Most conferences do have them.

The first is “embedding”, which in this context means treating different parts of a complex molecular system at different levels of theory. Thus Emily Carter in her plenary described how a periodic crystal treated by density functional theory (DFT) could have an embedded component in which the electronic structure is described instead by multi-reference correlated wavefunctions (CASPT2). She illustrated this by discussing what happens when a triplet-state oxygen molecule approaches the surface of an aluminium crystal and (mostly) dissociates into surface-bound oxygen atoms with Al-O bonds. The spin state of the oxygen changes smoothly to an overall singlet, with a rapid transfer of charge at the saddle point of the potential energy surface. The number of embedded Al atoms had to be at least a cluster of 14 to reproduce the observed reaction barriers (DFT on its own gives a zero barrier!). This sort of study is important in understanding the details of what happens in metal-surface catalysis.
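
To sketch the general idea behind such embedding (this is the generic subtractive scheme, written in LaTeX notation; Carter's own approach derives an embedding potential from the periodic calculation rather than taking a simple energy difference, so treat this only as a caricature):

E_{\text{total}} \approx E_{\text{DFT}}^{\text{periodic}} - E_{\text{DFT}}^{\text{cluster}} + E_{\text{CASPT2}}^{\text{cluster}}

That is, the cheap method describes the whole system, and its description of the small embedded cluster is swapped out for the expensive multi-reference one.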

Arieh Warshel then addressed the same theme with his own talk entitled Multiscale Modeling of Complex Biological Systems and Processes. Here the quantum region is embedded in a molecular-mechanics force-field description of some very large molecules. This was a broad-brush talk, but what I did get out of it was the concept of asymmetry in molecular systems. Whereas an organic chemist often thinks of asymmetry as relating to just a single chiral carbon centre in a molecule, nature operates on vaster scales. Thus the enzyme ATPase has a molecular axle or spindle, which rotates to assemble the phosphate groups one at a time. This spindle rotates asymmetrically, i.e. always in a specific direction, and Warshel attempts to describe the origins of this rotational asymmetry at a molecular level. Well, this is Nobel-prize-winning stuff! He followed this up with filaments that “walk” along surfaces in one (asymmetric) direction, first lifting one point of attachment and then re-attaching at a different point, such that the filament develops a clear sense of direction in its walk. This is all done with molecular dynamics, and (I think) has its origins in subtle electrostatics.
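
The textbook energy expression for this kind of QM/MM partitioning (the generic additive form; Warshel's own empirical valence bond machinery is more elaborate) is

E_{\text{QM/MM}} = E_{\text{QM}} + E_{\text{MM}} + E_{\text{QM--MM}}

where the coupling term carries the electrostatic and van der Waals interactions between the quantum region and its classical surroundings.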

Stefan Grimme in his plenary also described dynamic processes, this time those that happen in a mass spectrometer when a molecule is ionised by electron impact. Removal of an electron produces a complex set of ionised states, in which many different single bonds may be weakened by the ionisation. He developed simplified DFT (sDFT) methods that can be applied to molecular dynamics, and assembled a “black box” which predicts the expected fragmentations over a timescale of a picosecond or so. By sampling the trajectories, he estimated the intensities of the various positively charged species and overlaid these on the observed EI-MS. The agreement was often spectacular. A particularly interesting example was the fragmentation of taxol. Here, no molecular ion is found, only much lighter ions. The molecular dynamics shows that rather than consecutive single-bond fragmentations, you instead get multiple bonds all fragmenting more or less at the same time. Tougher was reproducing rearrangements, such as the McLafferty rearrangement; here, the semi-empirical method OM2 was more successful. His work means you can in effect “dial-a-mass-spectrum”, and he speculated whether a good fit with the observed spectrum could tell you subtle aspects of the gas-phase molecular species, such as its tautomeric state or perhaps even its conformation. He also described large-scale (800+ atom) simulations of electronic circular dichroism (ECD) spectra of organometallic systems. Octahedral complexes can be prepared in chiral form, and this theoretical ECD treatment allows determination of the absolute configuration of these often non-crystalline systems. Here you often need to compute 1000 or more electronic states, and if you have ever tried such ECD simulations, you will know that this is a lot of states!
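
The final bookkeeping step, turning the fragment ions sampled from many trajectories into relative peak intensities, is simple enough to sketch. The Python below is purely illustrative (it is not Grimme's code, and the m/z values are invented):

from collections import Counter

def stick_spectrum(fragment_mzs, bin_width=1.0):
    """Collapse m/z values sampled across many trajectories into a
    stick spectrum of relative intensities (base peak = 100)."""
    counts = Counter(round(mz / bin_width) * bin_width for mz in fragment_mzs)
    base = max(counts.values())
    return {mz: 100.0 * n / base for mz, n in sorted(counts.items())}

# Hypothetical cations collected from a batch of sampled trajectories
sampled = [91.05, 91.06, 92.06, 65.04, 91.05, 39.02, 65.04, 91.04]
print(stick_spectrum(sampled))   # {39.0: 25.0, 65.0: 50.0, 91.0: 100.0, 92.0: 25.0}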

We had been expecting Stefan to talk about dispersion effects in molecules, another emergent theme. Instead, many other speakers mentioned them. In my own talk I showed how including a D3 dispersion correction could dramatically change the predicted enantioselectivity of a chiral aldol condensation.[1]
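
For reference, the D3 correction takes the well-known pairwise form (in LaTeX notation)

E_{\text{disp}} = -\frac{1}{2} \sum_{A \neq B} \sum_{n=6,8} s_n \frac{C_n^{AB}}{r_{AB}^{n}} f_{d,n}(r_{AB})

where the damping functions f_{d,n} switch the correction off at short range. An energy term this cheap to evaluate can nonetheless be large enough to reorder the competing transition states that control an enantioselectivity.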

The above observations cannot of course be remotely representative; as is typical of a modern conference, there were five parallel sessions and 400+ posters, so this is a highly personal and selective snapshot.

References

Computers 1967-2011: a personal perspective. Part 4. Moore’s Law and Molecules.

Friday, October 28th, 2011

Moore’s law describes a long-term trend in the evolution of computing hardware, and it is often interpreted in terms of processing speed. Here I chart this rise in terms of the size of computable molecules. By computable I mean specifically how long it takes to predict the geometry of a given molecule using a quantum mechanical procedure.

LSD, the 1975 benchmark for computable molecules.

The geometry (shape) of a molecule is defined by 3N-6 variables, where N is the number of atoms it contains. Optimising the values of such variables in order to minimise a function was first done on a large scale by chemical engineers, who needed to improve the performance of chemical reactor plants. The mathematical techniques they developed were adapted to molecules in the 1970s, and in 1975 a milestone was reached with the molecule above. Here, N=49, and 3N-6=141. The function used was one describing the computed enthalpy of formation, using a quantum mechanical procedure known as MINDO/3. The computer used was what then passed for a supercomputer, a CDC 6600 (of which a large, well-endowed university could probably afford one). It was almost impossible to get exclusive access to such a beast (its computing power was shared amongst the entire university, in this case about 50,000 people), but during a slack period over a long weekend the optimised geometry of LSD was obtained (it is difficult to know how many hours the CDC 6600 took to perform this feat, but I suspect it might have been around 72). The result was announced by Paul Weiner to the group I was then part of (the Dewar research group), and Michael immediately announced that this deserved an unusual Monday-night sojourn to the Texas Tavern, where double pitchers of beer would be available. You might be tempted to ask what the reason for the celebration was. Well, LSD was a “real molecule” (and not a hallucination). It meant one could predict for the first time the geometry of realistic molecules such as drugs, and hence be taken seriously by people who dealt with molecules of this size for a living. And if you could predict the energy of its equilibrium geometry, you could then quickly move on to predicting the barriers to its reactions. A clear tipping point had been reached in computational simulation.
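
As a caricature of what such an optimisation is doing (the real 1975 codes used more sophisticated variable-metric methods, and the function being minimised was the MINDO/3 energy, not the toy used here), a minimal steepest-descent minimiser over the geometric variables might look like this in Python:

import numpy as np

def steepest_descent(grad, x0, step=0.1, gtol=1e-8, max_iter=10000):
    """Minimal steepest-descent minimiser: the basic idea behind a
    geometry optimisation over the 3N-6 variables, in miniature."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < gtol:   # stop when the gradient vanishes
            break
        x = x - step * g               # walk downhill
    return x

# Toy "energy surface": a single harmonic bond with equilibrium length 1.5
grad = lambda x: np.array([2.0 * (x[0] - 1.5)])
print(steepest_descent(grad, [2.3]))   # converges to ~[1.5]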

In 1975, MINDO/3 was thought to compute an energy function around 1000 to 10,000 times faster than the supposedly more accurate ab initio codes then available (in fact, you could not then routinely optimise geometries with the common codes of this type). With this in mind, one can subject the same molecule to a modern ωB97XD/6-311G(d,p) optimisation. This level of theory is probably 10^4 to 10^5 times slower to compute than MINDO/3. On a modest “high performance” resource (which nowadays runs in parallel, in this case on 32 cores), the calculation takes about an hour (starting from a 1973 X-ray structure, which turns out to be quite a poor starting point, and almost certainly poorer than the 1975 one). In (very) round numbers, the modern calculation is about a million times faster, which (coincidentally) is approximately the factor predicted by Moore’s law.
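
The arithmetic behind that round number, under the cost assumptions just stated, runs roughly as

\frac{t_{1975}}{t_{2011}} \times \frac{\text{cost of } \omega\text{B97XD}}{\text{cost of MINDO/3}} \approx \frac{72\ \text{h}}{1\ \text{h}} \times 10^{4\text{--}5} \approx 10^{6\text{--}7}

i.e. of the order of a million-fold, give or take an order of magnitude.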

I will give one more example, this time dating from around 2003, 28 years on from the original benchmark.

Transition state for lactide polymerisation.

This example has 114 atoms, and hence 3N-6 = 336, or about 2.4 times the 1975 size. It is a transition state, which is a far slower calculation than an equilibrium geometry. It is also typical of the polymerisation chemistry of the noughties. Each run on the computer (B3LYP/6-31G(d), with the alkyl groups treated at STO-3G) took about 8-10 days (on a machine with 4 cores), and probably 2-4 runs in total would have been required per system (of which four needed to be studied to derive meaningful conclusions). Let us say 1000 hours per transition state. Together with false starts etc., the project took about 18 months to complete. Move on to 2010: added to the model were a significantly better (= slower) basis set and a solvation correction, and a single calculation now took 67 hours. In 2011, it would be reduced to ~10 hours (by now we are up to 64-core computers).
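
That round figure of 1000 hours follows directly from the numbers above:

8\text{--}10\ \text{days} \times 24\ \text{h/day} \approx 200\text{--}240\ \text{h per run}, \quad \times\ 2\text{--}4\ \text{runs} \approx 400\text{--}1000\ \text{h}

so 1000 hours is the pessimistic but safe estimate per transition state.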

In 2011, calculations involving ~250 atoms are regarded as almost routine, and molecules with up to this number of atoms cover most of the discrete (i.e. non-repeating) molecular systems of interest nowadays. But the 1975 LSD calculation still stands as the day that realistic computational chemistry came of age.
