The chemical Web at 22 and where it might go.

August 19th, 2015

This post is prompted by the appearance of a retrospective special issue of C&E news, with what appears to be its very own Website: internet.cenmag.org. It contains articles and interviews with many interesting people, along with several variations on the historical (albeit rather USA-centric) perspectives and a time-line covers many of the key innovations (again, from a USA-perspective). Some subjects are covered in greater depth, including computational chemistry. The periodic table too gets coverage, but surprisingly that is not of Mark Winter’s WebElements, which carries the impressive 1993-2015 continuous timeline (hence 22 in the title!).  

You can find mention of Tim Berners-Lee at the CEN site (but no interview with Sir Tim), so here I contribute to the historical record by showing the plaque placed in the corridor of offices at CERN where TiMBL and Robert Cailliau worked to set the Web up in 1991.

Click to expand

Click to expand

What is not really given much prominence in the C&E news article is DATA. Which arguably is the reason why TimBL set things going back in 1989! Zenodo is (just like the Web before it) a spin-off from the activities at CERN, but now handling just data. Or Code. Or indeed almost any useful outcome of the research process that might be useful to someone else or to posterity. And to put it into context, it comes in two parts:

  1. The data store itself (which CERN are especially good at, since the Large Hadron Collider generates a great deal of data). They add the A to FAIR (Findable, Accessible, interoperable and Reusable). And also a certain confidence that this store will be enduring, not here today and gone tomorrow.
  2. The metadata describing the data, which in fact turns out is stored somewhere else, at DataCite. This organisation serves to add the F to FAIR.

And to show how this works in practice I can do no better than give this link: search.labs.datacite.org/help/examples which shows how you can benefit from the metadata.

I have already demonstrated the use of Zenodo for archiving some old computer code of mine for calculating kinetic isotope effects, but of course it is so much more than that. If the first CERN spin-out, the Web, is already 22 years old (for chemists), then I am confident in asserting that facilities such as Zenodo will play an increasingly important role over the next 22 years (indeed a much shorter timescale than that). 


A personal souvenir can be seen here.

Single Figure (nano)publications, reddit AMAs and other new approaches to research reporting

August 5th, 2015

I recently received two emails each with a subject line new approaches to research reporting. The traditional 350 year-old model of the (scientific) journal is undergoing upheavals at the moment with the introduction of APCs (article processing charges), a refereeing crisis and much more. Some argue that brand new thinking is now required. Here are two such innovations (and I leave you to judge whether that last word should have an appended ?).

To set the scene for the first, I will quote the abstract: “The single figure publication is a novel, efficient format by which to communicate scholarly advances. It will serve as a forerunner of the nano-publication, a modular unit of information critical for machine-driven data aggregation and knowledge integration[1] The kernel of this suggestion is (again I quote) “We offer the idea of the micro-publication unit, the single figure publication (SFP), to provide scholars with a real-world, manageable method to inform research.” I was struck by the overlap between this suggestion and the one you may find on many of the posts on this blog, where what I refer to as FAIR Data is assigned a digital object identifier (DOI) and included in the citation lists at the end of the post. The key phrase in the above abstract is machine-driven data aggregation and knowledge, although the article does not really go into any mechanisms for easily achieving this. It is my argument that the act of assigning a DOI carries with it the association that there is machine searchable metadata which can be retrieved and used for the aggregation and knowledge mining. The authors of this article, Do and Mobley, advocate adoption of nanopublications defined by inclusion of just a single figure (notably, not a table of results!) and some accompanying context which they claim would reduce the unit of publication to a more tractable size. This does raise the question of whether science needs more publications (in chemistry alone there are said to be more than a million published each year) or whether we should instead be concentrating our efforts on improving the data side of things by increasing its semantic content and formalising its structures, its preservation and curation. I certainly argue that far too little effort has been poured into these latter activities. You only have to look at the typical SI (supporting information) associated with many chemistry articles to realise that in many cases they are still hardly fit for purpose. There is one concept introduced by Do and Mobley that also deserves mention. Their nanopublications are structured to be read by machines, not people. They will therefore not be refereed by people (my inference). They do not really discuss how else the quality will be assessed, but of course if you treat their nanopublication as essentially FAIR data, then it does become possible to develop methods of machine refereeing.

The second email alerted me to an article[2] in the Winnower, a forum that offers a bridge between “traditional scholarly publishing tools to traditional and non-traditional scholarly outputs—because scholarly communication doesn’t just happen in scholarly journals“. Here, the concept of scholarly communication is extended to the New Reddit Journal of Science and introduces the concept pioneered by reddit of the AMA, or “ask me anything” environment. I occasionally publish some of the posts on this blog to the Winnower, receiving in return the increasingly ubiquitous DOI. I have also occasionally quoted these DOIs in articles submitted to conventional chemistry journals. What we see now is the propagation of a Winnower DOI on to e.g. https://www.reddit.com/r/science/ where anyone can post a question related to the original research reporting. I must state that I do have some reservations about this. Whilst it is likely that the majority of traditional scholarly reporting is likely to receive no AMAs (just as a very high proportion of research articles attract few if any citations in other articles over a period of decades), it is also likely that the quality of posted AMAs may turn out to be very low. At which point the original researcher has to make a judgement as to whether to devote any of their increasingly precious and fragmented time to answering them. And if few if any answers are posted in response to an AMA, the system seems unlikely to flourish.

But what we see here are two serious attempts to develop new approaches to research reporting, and not doubt others will emerge. To quote Yogi Berra, the future is not what it used to be.


Anyone can also post to this blog to ask similar questions. But note that associating an ORCID with such comments is highly recommended. I do not think that reddit currently supports ORCID, but  I would argue if the intent is serious, it certainly should.

References

  1. L. Do, and W. Mobley, "Single Figure Publications: Towards a novel alternative format for scholarly communication", F1000Research, vol. 4, pp. 268, 2015. https://doi.org/10.12688/f1000research.6742.1
  2. . RobustTempComparison, and . r/Science, "Science AMA Series: Climate models are more accurate than previous evaluations suggest. We are a bunch of scientists and graduate students who recently published a paper demonstrating this, Ask Us Anything!", The Winnower, . https://doi.org/10.15200/winn.143871.12809

Intermolecular atom-atom bonds in crystals? The O…O case.

July 25th, 2015

I recently followed this bloggers trail; link1link2 to arrive at this delightful short commentary on atom-atom bonds in crystals[1] by Jack Dunitz. Here he discusses that age-old question (to chemists), what is a bond? Even almost 100 years after Gilbert Lewis’ famous analysis,[2] we continue to ponder this question. Indeed, quite a debate on this topic broke out in a recent post here. My eye was caught by one example in Jack’s article: “The close stacking of planar anions, as occurs in salts of croconic acid …far from producing a lowering of the crystal energy, this stacking interaction in itself leads to an increase by several thousand kJ mol−1 arising from Coulombic repulsion between the doubly negatively charged anions” I thought I might explore this point a bit further in this post.

A search query of the Cambridge structure database was defined as below. Two non-bonded oxygen atoms are each attached to one carbon, each oxygen was defined as having one bonded atom (to carbon) and each assigned one negative charge. Addition of the usual constraints of R < 0.05, no errors, no disorder and specifying an intermolecular search produced 103 hits with the distance distribution shown below.

OO-query


O-O

Firstly, you should be aware that the van der Waals radius for oxygen is ~1.5Å, and so any contacts less than 3.0Å become interesting. What becomes particularly exciting is the distinct cluster at ~2.5Å. Could these be ~30 examples of close encounters of the type noted by Dunitz? Well, a control search has to be done, this time for O-H-O motifs, with each OH distance plotted as below:

OHO

The hot-spot occurs when both OH distances are equal at ~1.22Å, or an O…O separation close to 2.45Å. Time to quote Dunitz again “This large destabilization is, of course, more than compensated in the overall energy balance by the large stabilization arising from Coulombic interactions of the croconate anions with the surrounding cations.” In this case of course, the cation is a proton, residing at the half way point between the two oxygens. So two oxygens can indeed approach ~0.5Å closer than the sum of the vdw radii if a proton sits in-between them.

What do we learn? Well, firstly that one should always have a reality check of the results of any crystal structure search. The search did specify that the oxygens be non-bonded but also that they should both carry a negative charge and that both should only have one bonded atom. That should in theory at least have excluded any C-O-H-O-C structures, so why were about 30 such examples found? I can only speculate here, but recollect that 50 years ago when the CSD was founded, hydrogen atoms were rarely identified from the electron density. They were instead placed or “idealised” to where they might be expected. Nowadays any contentious hydrogens are almost always located rather than idealised, but clearly their status as bona-fide atoms is not quite so strong as the rest of the periodic table. So in at least some of these 30 examples with short O…O contacts, we might expect there to lurk a (possibly unrecognised) proton. But one never knows, there may be some real examples of O…O contacts with no such proton intervening. Now these really would be interesting.


Postscript. F is isoelectronic with O(-); below is the same search as defined above, but for non-bonded CF…FC approaches. F---F

The vdw radius of F is 1.45Å hence any non-bonded contact <2.9Å is worth taking a look at. But notice the small cluster of about 10 compounds for which the value is ~2.15Å. The F-H-F plot shows a hot spot at ~2.3 for the F…F separation, but there are zero hits for CF-H-FC. So these ten hits are indeed tantalising.

References

  1. J.D. Dunitz, "Intermolecular atom–atom bonds in crystals?", IUCrJ, vol. 2, pp. 157-158, 2015. https://doi.org/10.1107/s2052252515002006
  2. G.N. Lewis, "THE ATOM AND THE MOLECULE.", Journal of the American Chemical Society, vol. 38, pp. 762-785, 1916. https://doi.org/10.1021/ja02261a002

The structure of naphthalene: 1890-1925, and a modern twist.

July 18th, 2015

This is a little historical essay into the electronic structure of naphthalene, presented as key dates (and also collects comments made which were appended to other posts).

  1. 1890[1]: Henry Armstrong presents the following structure of naphthalene. Three words need translation into modern usage. Where he uses the word nuclei the closest translation now might be rings. Secondly, the term affinity is nowadays replaced by electron. This latter term was first coined by Stoney[2] one year after Armstrong wrote this article to mean a then hypothetical “atom of electricity”. Oddly Armstrong never updated his own usage even after Thomson actually discovered the electron in 1897. Radicle is a substituent on the ring, and the origin perhaps of the generic R used nowadays.

    arm1


    Notice that Armstrong talks about a cycle of ten carbons in which ten affinities/electrons act (he had previously accounted for the 22 affinities associated with what we would now call the 11 C-C σ-bonds) and is adamant that no separation of the central carbon atoms takes place as Bamburger had suggested. In modern parlance the central C-C bond has a σ-bond and he is describing a [10]annulene. The last sentence above presages the modern term delocalisation


    arm2


    Armstrong next considers anthracene (above) and replaces the line representation of the affinities by a circle, abbreviated C (which represents six cyclic affinities, or electrons) and by four conventional double bonds, recovering 14 of what nowadays designate as π-electrons. What he does NOT do is consider the equally valid structure where his C is shown in the right hand ring, and then apply Kekule’s hypothesis to in effect average them on a chemical time scale. It is noteworthy that overall,  Armstrong has discussed 6, 10 and 14 electrons, just a hint of the  4n+2 rule yet to come.

  2. 1922[3]:The next example comes from Robert Robinson, future Nobel prize winner, who collected all 32 electrons in naphthalene (excluding CH) into the representation show as XVL. This is an averaging (mean) that Armstrong did not do, of what we would nowadays call two resonance forms. Whereas Armstrong had clearly recognised two sets of electrons (22 and 10), this distinction is lost in this 1922 representation of the 32 ring electrons in naphthalene.

    rob1


  3. 1925[4]Just three years later Robinson (re?)discovers the magic of six π-electrons (the term π was not yet coined) and decides to reapply it to naphthalene. Rather than average two equivalent structures, each with just six cyclic π-electrons (Armstrong’s C) he uses two such rings with twelve π-electrons. This means that he implies only 20 σ electrons (32-12=20), because to balance his count he has to remove two from the central C-C bond. When he writes that the deletion of the central connecting bond(s) is more apparent than real, he is really describing for the first time what we nowadays call a homo-π-bond, one with no underlying σ-bond (also called a suspended π-bond). On the premise one can never have too much of a good thing, he also applies this to anthracene.

    rob2


  4. 2015: Posterity has now decided that Robinson’s 1922 effort has more or less survived and his 1925 effort has not. But one might ask whether this ill-fated suggestion could in fact inspire modern chemistry? Well, crystalline examples of such suspended π-bonds are now indeed known[5] and there are probably many more out there. I too have been inspired by the fun and games Robinson had with those two electrons;

    BB


    I have forcibly removed two electrons from the system by replacing the two central carbon atoms with boron. And now playing Armstrong and Robinson’s games leaves either only an 8π periphery with a central B-B σ-bond[6] or one can raid the two B-B electrons to top the π-periphery up to 10 electrons.[7]

    • The first isomer, as a 8π-electron system is according to modern knowledge antiaromatic. A ωB97XD/6-311G(d) calculation shows this is not a stable minimum, with negative energy force constants showing a twisting motion trending to a Möbius ring? It never reaches this, since further C-B bonds are ultimately formed to create an unrelated structure[8].
    • a 10π form is 35.7 kcal/mol lower than the first and reveals five π MOs, the highest energy of which is shown below with a suspended π-bond between the two central boron atoms and a LUMO corresponding to an empty B-B σ-bond.

      BB-open-LUMOBB-open-HOMO


I hope this illustrates how science often iterates to final solutions, but that even the incorrect oscillations can still teach us chemistry.

References

  1. "Proceedings of the Chemical Society, Vol. 6, No. 85", Proceedings of the Chemical Society (London), vol. 6, pp. 95, 1890. https://doi.org/10.1039/pl8900600095
  2. G.J. Stoney, "XLIX. <i>Of the “electron,” or atom of electricity</i>", The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science, vol. 38, pp. 418-420, 1894. https://doi.org/10.1080/14786449408620653
  3. W.O. Kermack, and R. Robinson, "LI.—An explanation of the property of induced polarity of atoms and an interpretation of the theory of partial valencies on an electronic basis", J. Chem. Soc., Trans., vol. 121, pp. 427-440, 1922. https://doi.org/10.1039/ct9222100427
  4. J.W. Armit, and R. Robinson, "CCXI.—Polynuclear heterocyclic aromatic types. Part II. Some anhydronium bases", J. Chem. Soc., Trans., vol. 127, pp. 1604-1618, 1925. https://doi.org/10.1039/ct9252701604
  5. A. Doddi, C. Gemel, M. Winter, R.A. Fischer, C. Goedecke, H.S. Rzepa, and G. Frenking, "Low‐Valent Ge<sub>2</sub> and Ge<sub>4</sub> Species Trapped by N‐Heterocyclic Gallylene", Angewandte Chemie International Edition, vol. 52, pp. 450-454, 2012. https://doi.org/10.1002/anie.201204440
  6. H.S. Rzepa, "C 8 H 8 B 2", 2015. https://doi.org/10.14469/ch/191378
  7. H.S. Rzepa, "C 8 H 8 B 2", 2015. https://doi.org/10.14469/ch/191380
  8. H.S. Rzepa, "C 8 H 8 B 2", 2015. https://doi.org/10.14469/ch/191379

Reproducibility in science: calculated kinetic isotope effects for cyclopropyl carbinyl radical.

July 11th, 2015

Previously on the kinetic isotope effects for the Baeyer-Villiger reaction, I was discussing whether a realistic computed model could be constructed for the mechanism. The measured KIE or kinetic isotope effects (along with the approximate rate of the reaction) were to be our reality check. I had used ΔΔG energy differences and then HRR (harmonic rate ratios) to compute[1] the KIE, and Dan Singleton asked if I had included heavy atom tunnelling corrections in the calculation, which I had not. His group has shown these are not negligible for low-barrier reactions such as ring opening of cyclopropyl carbinyl radical.[2] As a prelude to configuring his suggested programs for computing tunnelling (GAUSSRATE and POLYRATE), it was important I learnt how to reproduce his KIE values.[2] Hence the title of this post. Now, read on.

cp

I felt I could contribute to the cause by extending the published results in two respects:

  1. The reported[2] calculations are for the model B3LYP/6-31G(d) but the article does not report the tolerance to e.g. basis set variation (6-31G(d), a modest basis set by 2015 standards),
  2. or to the quantum model used (B3LYP, a veritable DFT method).

These two model chemistries can both be tested by “increasing” their accuracy. The Def2-QZVPP basis set is nearing the CBS, or complete basis set limit. The coupled-cluster CCSD(T) method is regarded as the gold standard for single reference calculations. The CASSCF method tests the response to a multi-reference wave function. Each is applied separately to ensure only one variable is being changed at a time.

Method Expt. KIE[2] Pred. KIE (my result) Pred. ΔG298 Pred. KIE[2] KIE + Tunnelling correction[2]
B3LYP/6-31G(d)[3],[4] 1.079295 1.0582 8.0 1.058 1.073
1.163173 1.1067 1.106 1.169
B3LYP/Def2-QZVPP[5],[6] 1.079295 1.0563 6.6 1.058 1.073
1.163173 1.1031 1.106 1.169
CASSCF(5,5)/6-31G(d)[7],[8] 1.079295 1.0572 8.2 1.058 1.073
1.163173 1.1050 1.106 1.169
CASSCF(5,5)/Def2-TZVPP[9],[10] 1.079295 1.0561 7.9 1.058 1.073
1.163173 1.1028 1.106 1.169
CCSD(T)/6-31G(d)[11],[12] 1.079295 1.0597 9.7 1.058 1.073
1.163173 1.1099 1.106 1.169

Actually separate ratios of 13C/12C(C-4)/13C/12C(C-3) since C-3 and C-4 are not equivalent in the reactant species because of the methylene group pyramidalisation. The KIE calculation input and outputs are archived.[13]

The first two rows of table are my attempt at an exact replication of the literature. The start point of such a project would be the supporting information or SI[2] which contains coordinates for the program GAUSSRATE and defines key structures in the form of a double-column, page thrown (broken might be a better word) PDF file. It was going to be a bit of a struggle to reconstitute this format into the structure required for a Gaussian calculation, so I simply constructed the models from scratch and optimised to the ring-opening transition state[4] and reactant.[3] I used a more recent version of the Gaussian program (G09/D.01 rather than G03/D.02) to do this, and tightened up some of the criteria to modern cutoff standards. A continuum solvent model could have been specified  (the solvent used in the experiments was 1,2-dichlorobenzene) but since no mention was made of solvent, I assumed a gas phase calculation had originally been done.  The starting geometry of the reactant deliberately had no symmetry, but during optimisation it converged to having a plane of symmetry using the B3LYP/6-31G(d) level of theory (the SI does not note this symmetry, it is implicit). I then used my code[1] to compute the isotope effects. The KIE program used in the original literature calculation was not directly mentioned in the supporting information but is presumed to be Quiver. Dan Singleton has recently sent me these codes, but they still need to be compiled and tested at my end. I ended up with splendid agreement for the KIE as you can see above (top two lines). Its reproducible! Hence the various assumptions I made in achieving this appear justified.

Returning to the geometry of the cyclopropyl carbinyl radical as having a plane of symmetry, two of the other methods, CCSD(T)/6-31G(d) and CASSCF(5,5)/6-31G(d), as well as CASSCF(5,5) at the better Def2-TZVPP basis all predicted that the methylene radical is twisted by about 20° with respect to the Cs plane of the ring.

cp-asymm

It is useful to check whether this twisting has any impact on the predicted KIE. The answer is clear (Table). ALL the methods predict similar KIE to ± 0.003, which is as about as accurate as can be measured experimentally at the 1σ level of confidence. This is a remarkable result; few other computed molecular properties turn out to be so insensitive to the quantum procedure used. The next stage will be to check if the tunnelling corrections required to bring the calculation into congruence with the measured values are similarly insensitive.


The “barrier height” is quoted as 7 kcal/mol[2]. This is probably NOT the activation free energy.

References

  1. H.S. Rzepa, "KINISOT. A basic program to calculate kinetic isotope effects using normal coordinate analysis of transition state and reactants.", 2015. https://doi.org/10.5281/zenodo.19272
  2. O.M. Gonzalez-James, X. Zhang, A. Datta, D.A. Hrovat, W.T. Borden, and D.A. Singleton, "Experimental Evidence for Heavy-Atom Tunneling in the Ring-Opening of Cyclopropylcarbinyl Radical from Intramolecular <sup>12</sup>C/<sup>13</sup>C Kinetic Isotope Effects", Journal of the American Chemical Society, vol. 132, pp. 12548-12549, 2010. https://doi.org/10.1021/ja1055593
  3. H.S. Rzepa, "C4H7(2)", 2015. https://doi.org/10.14469/ch/191357
  4. H.S. Rzepa, "C4H7(2)", 2015. https://doi.org/10.14469/ch/191358
  5. H.S. Rzepa, "C 4 H 7", 2015. https://doi.org/10.14469/ch/191353
  6. H.S. Rzepa, "C 4 H 7", 2015. https://doi.org/10.14469/ch/191352
  7. H.S. Rzepa, "C 4 H 7", 2015. https://doi.org/10.14469/ch/191361
  8. H.S. Rzepa, "C 4 H 7", 2015. https://doi.org/10.14469/ch/191364
  9. H.S. Rzepa, "C 4 H 7", 2015. https://doi.org/10.14469/ch/191363
  10. H.S. Rzepa, "C 4 H 7", 2015. https://doi.org/10.14469/ch/191362
  11. H.S. Rzepa, "C 4 H 7", 2015. https://doi.org/10.14469/ch/191367
  12. H.S. Rzepa, "C 4 H 7", 2015. https://doi.org/10.14469/ch/191356
  13. H.S. Rzepa, "Reproducibility In Science: Calculated Kinetic Isotope Effects For Cyclopropyl Carbonyl Radical.", 2015. https://doi.org/10.5281/zenodo.19949

Electrides (aka solvated electrons).

July 8th, 2015

Peter Edwards has just given the 2015 Hofmann lecture here at Imperial on the topic of solvated electrons. An organic chemist knows this species as “e” and it occurs in ionic compounds known as electrides; chloride = the negative anion of a chlorine atom, hence electride = the negative anion of an electron. It struck me how very odd these molecules are and so I thought I might share here some properties I computed after the lecture for a specific electride known as GAVKIS.[1] If you really want to learn (almost) everything about these strange species, go read the wonderful review by Zurek, Edwards and Hoffmann,[2] including a lesson in the history of chemistry stretching back almost 200 years.

GAVKIS consists of a tricyclic aza-ether ligand or cryptand wrapping a potassium atom in the centre, the overall unit having no charge. The oxygen and nitrogen heteroatoms coordinate to the metal, in the process evicting its single electron. The question that struck me is “where does that electron go?”. You see in all normal molecules that electrons are associated with either one, two (or rarely) three nuclei, to form one-centred monosynaptic basins (lone pairs), two-centre or disynaptic basins (i.e. bonds) and more rarely three-centre bonds. The shared-electron two-centre manifestation was of course famously introduced by Gilbert N. Lewis in 1916 (note the centenary coming up!). Knowing where the electron (pairs) are has enabled the technique popular with organic chemists known as arrow pushing, or the VSEPR analysis of inorganic compounds. But an electride has no nucleus associated with it! So how can one describe its location?

[jsmol caption=’The crystal structure of GAVFIS’ fileurl=’https://www.rzepa.net/blog/wp-content/uploads/2015/07/GAVFIS-lattice.mol2′ id=’a2′ commands=’=spacefill 23%;wireframe 0.15;color cpk;’ debug=’false’]

The crystal structure of GAVFIS shows the potassium to be 8-coordinate. Remember, x-rays are diffracted not by a nucleus but by electrons in the molecule. The highest densities are of course associated with electrons in inner shells centered on nuclei and the much lower densities found in conventional bonds are not normally located by this technique (but see here). So it is no surprise to find that this x-ray analysis[1] did not succeed in answering the question posed above; where is the single electron liberated from the potassium atom? They did look for it, but surmised only that would be found in the “noise level electron density in the spaces between them (molecules)“. For GAVFIS, that empty space is actually dumb-bell shaped, and so perhaps an answer is that the electron occupies the dumb-bell shaped spaces between the ligand-potassium complex.

X-ray analysis was defeated by noise; it is an experimental technique after all. But the noise in a quantum mechanical calculation is much smaller; can this reveal where the evicted electron is? Here is the spin density (unpaired electron) distribution for one molecule of GAVFIS computed using the UωB97XD/6-31++(G) DFT method.[3] It is a stratocumulus-like cloud that enshrouds the molecule (click on the diagram below and you can rotate the function to view it from your own point of interest) but interestingly avoiding the regions along the N….N axis. There are also tiny amounts of (negative) spin density on the ligand atoms. So even when the “empty space” is infinitely large, the shape of the electride anion is nevertheless quite specific, but a holistic function of the shape of the entire molecule rather than its component atoms.

[jsmol caption=’The spin density in GAVFIS’ fileurl=’https://www.rzepa.net/blog/wp-content/uploads/2015/07/GAVFIS_sden.cub.xyz’ id=’a3′ commands=’=isosurface wp-content/uploads/2015/07/GAVFIS_sden.cub.jvxl translucent;zoom 70;’ debug=’false’]

Another way of describing where electrons are is using functions known as molecular orbitals. Below is the SOMO (singly occupied MO) and its shape in this case coincides with that of the spin density.

[jsmol caption=’The SOMO for GAVFIS’ fileurl=’https://www.rzepa.net/blog/wp-content/uploads/2015/07/GAVFIS_gp_mo113.cub.xyz’ id=’a3′ commands=’=isosurface wp-content/uploads/2015/07/GAVFIS_gp_mo113.cub.jvxl translucent;zoom 70;’ debug=’false’]

The molecular electrostatic potential is rather wackier (red = attractive to protons).

[jsmol caption=’The molecular electrostatic potential in GAVFIS’ fileurl=’https://www.rzepa.net/blog/wp-content/uploads/2015/07/GAVFIS_gp_esp.cub.xyz’ id=’a4′ commands=’=isosurface wp-content/uploads/2015/07/GAVFIS_gp_esp.cub.jvxl translucent;zoom 40;’ debug=’false’]

Odder still is the ELF (electron localisation function) and the identification of the centroids of its basins. These centroids normally coincide with the two-centre basins (bonds) and one-centre basins (lone pairs, inner shell electrons) in normal molecules, both being close to nuclear centres (atoms). For GAVFIS, two unexpected one-centre basins are found close to the two nitrogen atoms in the molecule, each with a population of 0.48 electrons, along with regular one-centre “lone pair” basins pointing inwards to the potassium (2.38 electrons each). The odd-looking pair of locations identified for the electride anion may have little physical reality, except for reminding us that the electride can indeed be in more than one location simultaneously!

[jsmol caption=’The ELF centroids in GAVFIS’ fileurl=’https://www.rzepa.net/blog/wp-content/uploads/2015/07/GAVFIS_bas.mol’ id=’a5′ debug=’false’]

I often also use the NCI (non-covalent-interaction) property of the electron density in these blogs. It tells us about regions of non-covalent electron density which represent attractive weak interactions between or within molecules. Here, it again shows us the weak non-covalent density (as the reduced density gradients) wrapping the molecule (green=weakly stabilizing).

[jsmol caption=’The NCI function in GAVFIS’ fileurl=’https://www.rzepa.net/blog/wp-content/uploads/2015/07/GAVFIS-gp-lr.xyz’ id=’a6′ commands=’=isosurface wp-content/uploads/2015/07/GAVFIS-gp-lr.jvxl translucent;zoom 70;’ debug=’false’]

The obvious next question is that if each molecule is surrounded by weak spin density arising from an unpaired electron, would two such species form a dimer in which the spins are paired in an manner analogous to the conventional single bond? The overlap is not going to be fantastic if the spin distribution has the shape shown above, but what the hell. Here is the HOMO of such a species.[4] It appears the shape of the electride is very pliable indeed; they have been squeezed out of the contact region between the two molecules (which form a close contact pair) into wrapping the dimer rather than the monomer! The spin-coupled singlet by the way is about 4.6 kcal/mol more stable in free energy ΔG298 than two isolated monomer doublets, and 5.5 kcal/mol lower than the triplet species[5] which retains two unpaired electrons. A sort of weak molecule-pair bond rather than an atom-pair bond.
[jsmol caption=’The MO in GAVFIS dimer’ fileurl=’https://www.rzepa.net/blog/wp-content/uploads/2015/07/GAVFIS_mo225.cub.xyz’ id=’a8′ commands=’=isosurface wp-content/uploads/2015/07/GAVFIS_mo225.cub.jvxl translucent;zoom 70;’ debug=’false’]
This has hardly started to scratch the surface of the strange properties of electrides. If your appetite has been whetted, go read the article I noted at the beginning.[2]


For normal molecules, a Mulliken or other population analysis reduces the charge and spin density down to an atom-centered distribution. If this is done for GAVFIS, the spin density collapses down to the molecular centroid, in this case the potassium (spin density 1.15). This of course is horribly misleading, and serves to remind us that such atom-centered distributions can sometimes be far from realistic.


References

  1. D.L. Ward, R.H. Huang, and J.L. Dye, "Structures of alkalides and electrides. I. Structure of potassium cryptand[2.2.2] electride", Acta Crystallographica Section C Crystal Structure Communications, vol. 44, pp. 1374-1376, 1988. https://doi.org/10.1107/s0108270188002847
  2. E. Zurek, P. Edwards, and R. Hoffmann, "A Molecular Perspective on Lithium–Ammonia Solutions", Angewandte Chemie International Edition, vol. 48, pp. 8198-8232, 2009. https://doi.org/10.1002/anie.200900373
  3. H.S. Rzepa, "C 18 H 36 K 1 N 2 O 6", 2015. https://doi.org/10.14469/ch/191347
  4. H.S. Rzepa, "C 36 H 72 K 2 N 4 O 12", 2015. https://doi.org/10.14469/ch/191348
  5. H.S. Rzepa, "C 36 H 72 K 2 N 4 O 12", 2015. https://doi.org/10.14469/ch/191350

Reproducibility in science: calculated kinetic isotope effects for the Baeyer-Villiger reaction.

July 1st, 2015

Recollect this earlier post on the topic of the Baeyer-Villiger reaction. In 1999 natural abundance kinetic isotope effects were reported[1] and I set out to calculate the values predicted for a particular model constructed using Quantum mechanics. This comparison of measurement and calculation is nowadays a standard verification of both experiment and theory. When the two disagree either the computational model is wrong or incomplete, or the remoter possibility that there is something not understood about the experiment.

bv4

In this case, as you can see above, the measured 13C KIE at the carbonyl carbon was in the range 1.045-1.051, whereas the theory (ωB97XD/Def2-TZVP/SCRF=DCM) predicted a significantly smaller value of 1.023 for the first kinetic step, the formation of the tetrahedral intermediate. This was the step suggested by Singleton as rate limiting in the forward direction (there was also a larger disagreement on the 2H KIE at C3/C5, measured as 0.97 and calculated as 0.997 but this might simply be a typographical error).

Now we have to find an explanation, and it was contingent upon myself to show that the theory was properly executed. At this stage, Dan Singleton offered some suggestions on this blog. This one related to how I had calculated the KIE, with Dan suggesting that the method I had used might be too inaccurate to draw any conclusions from. It was up to me to reproduce my results using the method suggested by Dan as the more accurate. And this is where this story really starts.

I had used the Free energy method, which involves calculating the free energies of reactant and product using the built-in thermochemical component of the Gaussian code. Dan’s concern was that such free energies are only reported to an accuracy of 0.000001 Hartree. Since the isotope effect is calculated from differences in these free energies, was this degree of precision high enough to ensure reliable calculated KIE? Better to solve the equations directly and these involve the so-called Bigeleisen-Mayer terms derived from the classical partition functions. I recollected we had used this method back in 1975[2], but I assumed the computer code used was long since lost by myself. Instead Quiver was suggested (first written around 1988 by Keith Laidig at Yale) and I added the comment that the modern version of Quiver is called THERMISTP[3]. All that needed to be done was to acquire a copy of an appropriate code and re-run the KIE calculation using it.

And now comes the true purpose of this post, which is reproducing a calculation that might have been done ~20 years ago using computer code. Issues such as:

  1. Does the code still exist in either Source form or Executable form?
  2. If it does, will it still run correctly on a modern computer?
  3. Is it documented sufficiently to allow someone not immersed in the code to run it. After ~20 years, it might not be possible to talk to the original coders for explanations.

So this is what happened when I investigated.

  • Requests for the THERMISTP code of its authors, by email, have not yet brought a response. They may still, but one has to recognise that they may not. For example (unless I missed it), the article in which THERMISTP is cited[3] does not document how the code described in that article might be obtained (but the authors’ emails are provided).
  • I then followed up Dan’s suggestion to use Quiver. A copy was tracked down at http://www.chem.ucla.edu/~mccarren/houkGroup.html and a tar archive was available. The program comes as Quiver itself and a pre-program called qcrunch is used to prepare the data for Quiver.
  • It comes with pretty good (and brief!) documentation.
  • Fortran source code for Quiver (quiver.f) is available which is good, since one can compile it for one’s favourite machine (in my case a Mac running OS X). One can also read the comments to discover information beyond that provided by the documentation.
  • qcrunch however is only provided as an executable, and that means running it on Linux. One cannot read the comments!
  • When I ran qcrunch, it complained that an auxiliary program was not available to it. It wanted to run something associated with Gaussian 94, sadly long since gone from our systems. With no source code available I would have to fool qcrunch into thinking that Gaussian09 was really Gaussian94. It involves symbolic links and such and needs a little Unix expertise.
  • That done, I gave it its first input data, which is the normal mode analysis of the reactant involved in the reaction above, in the form of a formatted checkpoint file (a .fchk file). It seemed happy, and so I proceeded to the next step
  • which was to run quiver. This needed compilation. Here we encounter interesting issues. Most Fortran code can have “dependencies” which relate to the compiler used by its developer. The compiler however is rarely stated in the documentation, and I decided to use GFortran, which I obtained from http://hpc.sourceforge.net and installed on my Mac. And using this compiler, the compilation failed with about 3 syntax errors. You have to know enough Fortran to correct them all and obtain your executable.
  • Now I could run quiver to get the Bigeleisen-Mayer functions. It appeared to happily run, but produced rubbish (the anticipated frequencies had silly values of -10,000 cm-1 etc).
  • Now you go back to the code, and discover that the dimension statement (yes, Fortran required you to specify the size of your arrays in the code) is limited to 40 atoms! The Baeyer-Villiger system has 48. Easily fixed. Still rubbish out!
  • So now you go back to the start and create data for a tiny molecule, water in this case, and again qcrunch and quiver are run. Now sensible results, in the shape of normal mode frequencies of the anticipated values.
  • Conclusion: That qcrunch is probably also dimensioned to 40 atoms, but I now have no way of correcting this because I do not (yet?) have the source code for it. It might indeed be lost. If it is, it will have to be re-written from scratch.
  • Time elapsed thus far: ~2 weeks of intermittent work on the problem.
  • It was at this point I had one of those happy moments of accidental discovery. I updated the OS on my Mac to 10.10.4 (this was released yesterday). One of the new features was TRIM support for solid state third party disks (SSDs). I installed it and decided to test an external SSD connected to my machine. It happened to have lots of old stuff on it. In an idle moment, I decided to search its file base for the program we had written back in 1975 and which I had convinced myself was long-lost. Hey presto, there it was (KINISOT.f). I even found example input and output files. These take the form of frequencies for reactant (normal isotopes), reactant (new isotope), transition state (normal isotope) and transition state (new isotope). Not as elegant as quiver, but entirely fit for my purpose here.
  • The results:
    1. 13C KIE for the carbonyl carbon using the free energy method: 1.023 (see diagram above)
    2. 13C KIE using the Bigeleisen-Mayer partition function ratios: 1.0226
    3. 2H KIE for the axial α protons using the free energy method: 0.928
    4. 2H KIE for the axial α protons using the Bigeleisen-Mayer partition function ratios: 0.92831
  • The agreement enables me to conclude that the free energy method does not in fact suffer from significant round-off errors and other inaccuracies induced by the subtraction of two very similar energies.

What I have I learnt from this experience?

  1. That reproducing a calculation using old computer codes (in this case ~18 years) can be a difficult and complex procedure, relying on being able to contact people and ask for copies of the code, coping with possibly non-existent documentation, and with no access to the original coder who may be the only person able to explain it.
  2. That 20 years of continuous improvement in the computer industry means that a problem that would require impossibly enormous amounts of computer time then might be easily feasible now. The limit of 40 atoms in 1997 for quiver and qcrunch must have seemed future proof then, but alas it did not prove so.
  3. That if the source code for a program dimensioned too small is lost, one may have no option but to re-write it from scratch. It would indeed be a good test of reproducibility if the answers are unchanged!
  4. Keep good archives of your old work. You never know when they might come in useful!

Lessons for the future?

  1. Document your code! Put yourself in the position of someone in 20 years time trying to make sense of it! And record the compiler used (along with flags), and also the OS.
  2. Archive it in Zenodo or Github, and include that documentation.[4] You might not be contactable by email in 20 years time!
  3. include test inputs and outputs (OK, that is mostly done nowadays).
  4. I intend to do the above for my program. But I will use the immediate excuse often used by others for not archiving their codes: it is not yet documented sufficiently! But surely this would only take an hour or so? Watch this space. And please forgive the coding, it was done in 1975, when I was very inexperienced (another oft-used excuse!).

What about the Baeyer-Villiger isotope effects? Well, the above merely establishes that the free energy method (which requires no extra codes) has sufficient accuracy for computing KIEs. An explanation for the difference between the reported 13C KIE at the carbonyl carbon and its calculation still needs identifying. I state at the outset that (heavy atom) tunnelling corrections are NOT yet applied, and I might try the small-curvature tunneling model suggested by Dan, since he and Borden[5] have shown it can indeed be important. It would be exciting if the hydrogen isotope effects can be reproduced without tunnelling but that the carbon KIEs do require it.


I have received much help in the above saga from Jordi Bures Amat here, Erik Plata in Donna Blackmond’s group and information about another useful system for calculating KIE, the ISOEFF program.[6]


References

  1. D.A. Singleton, and M.J. Szymanski, "Simultaneous Determination of Intermolecular and Intramolecular <sup>13</sup>C and <sup>2</sup>H Kinetic Isotope Effects at Natural Abundance", Journal of the American Chemical Society, vol. 121, pp. 9455-9456, 1999. https://doi.org/10.1021/ja992016z
  2. M.J.S. Dewar, S. Olivella, and H.S. Rzepa, "Ground states of molecules. 49. MINDO/3 study of the retro-Diels-Alder reaction of cyclohexene", Journal of the American Chemical Society, vol. 100, pp. 5650-5659, 1978. https://doi.org/10.1021/ja00486a013
  3. M. Saunders, M. Wolfsberg, F.A.L. Anet, and O. Kronja, "A Steric Deuterium Isotope Effect in 1,1,3,3-Tetramethylcyclohexane", Journal of the American Chemical Society, vol. 129, pp. 10276-10281, 2007. https://doi.org/10.1021/ja072375r
  4. H.S. Rzepa, "KINISOT. A basic program to calculate kinetic isotope effects using normal coordinate analysis of transition state and reactants.", 2015. https://doi.org/10.5281/zenodo.19272
  5. O.M. Gonzalez-James, X. Zhang, A. Datta, D.A. Hrovat, W.T. Borden, and D.A. Singleton, "Experimental Evidence for Heavy-Atom Tunneling in the Ring-Opening of Cyclopropylcarbinyl Radical from Intramolecular <sup>12</sup>C/<sup>13</sup>C Kinetic Isotope Effects", Journal of the American Chemical Society, vol. 132, pp. 12548-12549, 2010. https://doi.org/10.1021/ja1055593
  6. V. Anisimov, and P. Paneth, "ISOEFF98. A program for studies of isotope effects using Hessian modifications", Journal of Mathematical Chemistry, vol. 26, pp. 75-86, 1999. https://doi.org/10.1023/a:1019173509273

The 2015 Bradley-Mason prize for open chemistry.

June 26th, 2015

Open principles in the sciences in general and chemistry in particular are increasingly nowadays preached from funding councils down, but it can be more of a challenge to find innovative practitioners. Part of the problem perhaps is that many of the current reward systems for scientists do not always help promote openness. Jean-Claude Bradley was a young scientist who was passionately committed to practising open chemistry, even though when he started he could not have anticipated any honours for doing so. A year ago a one day meeting at Cambridge was held to celebrate his achievements, followed up with a special issue of the Journal of Cheminformatics. Peter Murray-Rust and I both contributed and following the meeting we decided to help promote Open Chemistry via an annual award to be called the Bradley-Mason prize. This would celebrate both “JC” himself and Nick Mason, who also made outstanding contributions to the cause whilst studying at Imperial College. The prize was initially to be given to an undergraduate student at Imperial, but was also extended to postgraduate students who have promoted and showcased open chemistry in their PhD researches.

Peter and I are delighted to announce the inaugural winners of this prize.

The postgraduate winner is Tom Phillips for his open blog describing his experiences as a PhD student and for leading by example. He has published his instrumental codes on Github (and now Zenodo[1]) and data and codes for reproducing the graphs in his work on the “lab on a chip” in Figshare[2] and through his blog has encouraged other research students to do the same. Tom has worked assiduously to ensure that all the articles describing his PhD work are or will be open access.[3]

The undergraduate winner is Tom Arrow for his “spare time” involvement with WikiMedia (the foundation that underpins the open Wikipedia), including participating in a Wikimedia EU hackathon in Lyon France, and feeding his experiences and skills back into his undergraduate environment as well as enhancing the teaching Wiki used by his fellow students. Tom took the lead in introducing us to Wikidata[4] for storing chemical data in an open Wikibase data repository and in promoting its use for enriching Wikipedia chemistry pages and showcasing open data in undergraduate teaching environments.

References

  1. T. Phillips, and S. Macbeth, "pumpy: Zenodo release", 2015. https://doi.org/10.5281/zenodo.19033
  2. T. Phillips, J.H. Bannock, and J.D. Mello, "Data for microscale extraction and phase separation using a porous capillary", 2015. https://doi.org/10.6084/m9.figshare.1447208
  3. T.W. Phillips, J.H. Bannock, and J.C. deMello, "Microscale extraction and phase separation using a porous capillary", Lab on a Chip, vol. 15, pp. 2960-2967, 2015. https://doi.org/10.1039/c5lc00430f
  4. D. Vrandečić, and M. Krötzsch, "Wikidata", Communications of the ACM, vol. 57, pp. 78-85, 2014. https://doi.org/10.1145/2629489

The formation of tetrahedral intermediates.

June 12th, 2015

In the preceding post, I discussed the reaction between mCPBA (meta-chloroperbenzoic acid) and cyclohexanone, resulting in Baeyer-Villiger oxidation via a tetrahedral intermediate (TI). Dan Singleton, in whose group the original KIE (kinetic isotope measurements) were made, has kindly pointed out on this blog that his was a mixed-phase reaction, and that mechanistic comparison with homogenous solutions may not be justified. An intriguing aspect of the (solution) mechanism would be whether the TI forms quickly and/or reversibly and what the position of any equilibrium between it and the starting ketone is. This reminded me of work we did some years ago,[1] and here I discuss that.

It involved the addition of phenyl hydroxylamine, PhNHOH to acetyl cyanide at 215K. Because the CN group is poor at leaving, the tetrahedral intermediates do not collapse and instead accumulate in seconds to the point of becoming detectable by NMR (both N-C and O-C isomers). The position of the equilibrium clearly favours the TI rather than the starting materials. In another context, both the rate of reaction and the equilibrium can be driven towards the TI by the application of pressure.[2] Hydroxylamines are known to be super nucleophiles, enhanced by the so-called α-effect from buttressing of adjacent lone pairs on the N and O. This reminds that a peracid also should exhibit a related α-effect; it should be a better nucleophile than a normal carboxylic acid. So I decided to take the TI formed from cyclohexanone and mcPBA and look at the NBO orbitals, which should tell us about the anomeric effects present in this TI, and in particular if they might be larger than normal (which could be equated with greater stability for the TI). Here are the relevant NBO energies.[3]

TI-NBO

  1. The conventional anomeric effect in O-C-O manifests as a E(2) perturbation energy of ~16-18 kcal/mol between one oxygen lone pair and the antibonding C-O orbital. There are two combinations, and these are normally similar in energy.
  2. For the system above, the O1-C2-O6 interaction is 25.6 kcal/mol, much larger than normal, but partially counterbalanced by:
  3. O6-C1-O2 =13.0 kcal/mol which is a little lower than normal. This is overall an unusually strong anomeric effect for the O-C-O motif!
  4. The energetic asymmetry is matched by the two computed bond lengths, 1.381Å for the larger interaction and 1.455Å for the smaller. The pseudo-α-effect has desymmetrized the anomeric effect, but nevertheless strengthened it overall.
NBO 103

NBO 103 for O1(Lp)

NBO 97

NBO 97 for O6(Lp)

NBO 123

NBO 123 for C2-O6 antibonding σ*orbital

One concludes that the asymmetric anomeric effect makes the TI resemble the reactants. The transition state leading to the TI must be even earlier. In this context, I note that the (mixed phase) 13C effect reported for the carbonyl by Singleton and Szymanski[4] was quite a large one for carbon (1.045-1.051), a magnitude which argues against a very early transition state under these conditions. But the calculated value for a homogenous solution state model of ~1.023 is certainly more in accord with an early transition state.

Finally, a search of the CSD reveals 12 molecules containing either a O-O-C-O-O or a O-C-O-O sub unit This one[5] shows a bis HO-C-O-O-C-OH structure at room temperature; these species need not be unstable! There are none however with Ac-O-O-C-O. And of course the potent antimalarial artemisinin contains a O-O-C-O-C-O-Ac unit, for which stereoelectronic effects may also be important.

References

  1. A.M. Lobo, M.M. Marques, S. Prabhakar, and H.S. Rzepa, "Tetrahedral intermediates formed by nitrogen and oxygen attack of aromatic hydroxylamines on acetyl cyanide", The Journal of Organic Chemistry, vol. 52, pp. 2925-2927, 1987. https://doi.org/10.1021/jo00389a050
  2. N.S. Isaacs, H.S. Rzepa, R.N. Sheppard, A.M. Lobo, S. Prabhakar, and A.E. Merbach, "Volumes of reaction for the formation of some analogues of tetrahedral intermediates", Journal of the Chemical Society, Perkin Transactions 2, pp. 1477, 1987. https://doi.org/10.1039/p29870001477
  3. H.S. Rzepa, "C 20 H 20 Cl 2 O 6", 2015. https://doi.org/10.14469/ch/191327
  4. D.A. Singleton, and M.J. Szymanski, "Simultaneous Determination of Intermolecular and Intramolecular <sup>13</sup>C and <sup>2</sup>H Kinetic Isotope Effects at Natural Abundance", Journal of the American Chemical Society, vol. 121, pp. 9455-9456, 1999. https://doi.org/10.1021/ja992016z
  5. A. Kobayashi, Y. Ikeda, K. Kubota, and Y. Ohashi, "Syntheses and crystalline structures of several aldehyde peroxides as new flavor compounds", Journal of Agricultural and Food Chemistry, vol. 41, pp. 1297-1299, 1993. https://doi.org/10.1021/jf00032a025