Posts Tagged ‘Cambridge’

Imaging vibrational normal modes of a single molecule.

Thursday, April 18th, 2019

The topic of this post originates from a recent article which is attracting much attention.[1] The technique uses confined light to both increase the spatial resolution by around three orders of magnitude and also to amplify the signal from individual molecules to the point it can be recorded. To me, Figure 3 in this article summarises it nicely (caption: visualization of vibrational normal modes). Here I intend to show selected modes as animated and rotatable 3D models with the help of their calculation using density functional theory (a mode of presentation that the confinement of Figure 3 to the pages of a conventional journal article does not enable).

I should start by quoting some pertinent aspects obtained from the article itself. The caption to Figure 3 includes assignments, which I presume were done with the help of Gaussian calculations. Thus in the Methods section, we find: “… The geometry of a free CoTPP molecule is optimized under tight convergence criteria using Gaussian 09 (ref. 33). The orientationally averaged Raman spectrum and vibrational normal modes are calculated with the geometry of a free molecule … All the calculations mentioned above are performed at the B3LYP/6-31G* level with the effective core potential at the cobalt centre.” Armed with this information, I looked at the data included with the article (“the data supporting the findings of this study are available within the paper. Experimental source data for Figs. 1–4 are provided with the paper”) but did not spot any data specifically relating to those Gaussian 09 calculations; in particular any data that would allow me to animate some vibrational normal modes for display here. No matter, it is easy to re-calculate, although I had to obtain the basic 3D coordinates from the Cambridge crystal database (e.g. entry IKUDOH, DOI: 10.5517/cc6hj4b) since they were unavailable from the article itself. At this point some decisions about molecular symmetry needed to be made (the symmetry is not mentioned in the article), since it is useful to attach the irreducible representations (IR) of each mode as a label (lacking in Figure 3). The crystal structure I picked has idealised S4 symmetry, but it could be higher at D2d or lower at C2.

The next issue to be solved is how many electrons to associate with the molecule. Cobalt tetraphenylporphyrin (CoTPP) has 347 electrons and the free molecule would be expected to be a doublet spin state (with the quartet as an excited state). Were the vibrational modes calculated for this state? Perhaps not, since I then found this statement: “The physisorbed CoTPP is positively charged on gold, as demonstrated through TERS measurements using CO-terminated tips (ref. 24) and through the Smoluchowski effect (ref. 29)…. In contrast to gold, the Kondo resonance of cobalt disappears on Cu(100), suggesting that it acquires nearly a full electron from the metal (see Extended Data Fig. 2).” So it seems worth calculating both the cation and the anion singlets as well as the neutral doublet. But at this stage we do not know for certain what spin state the Gaussian 09 assignments in Figure 3 were done for, since there is no data associated with the article to tell us, only that they were done for the free molecule (nominally a doublet).

There is one more remark made in the article we need to take into account: “After lowering the sample bias to approach the molecule and scanning at close range, the molecule flattens. Its phenyl rings, which in the free molecule assume a dihedral angle of 72°, rotate to become coplanar (see Extended Data Fig. 1b). Evidently, the binding energy of the phenyl groups to copper overcomes the steric hindrance in the planar geometry.” So it might be useful to calculate this “flattened” form to see how much steric repulsion energy needs to be overcome by that binding of the phenyl groups to the surface of the metal.

Finally, I decided not to try to replicate exactly the reported calculations (B3LYP/6-31G(d)) since this type of DFT model does not include any dispersion attraction terms; moreover by today’s standards the basis set is also rather small. So here you have an ωB97XD/6-311G(d,p) calculation, with tight convergence criteria (integral accuracy 10⁻¹⁴ and SCF 10⁻⁹; again we do not know what values were used for the article). To ensure that my data is as FAIR as possible, here is its DOI: 10.14469/hpc/5461

Charge  Multiplicity  ΔG, twisted Ph (Hartree)  ΔG, co-planar Ph (Hartree)  ΔΔG (kcal/mol)
 0      Doublet       -3294.58693               -3294.48867                 61.7
 0      Quartet       -3294.58777               -3294.51985                 42.6
+1      Singlet       -3294.35473               -3294.24973                 65.9
+1      Triplet       -3294.40821               -3294.33092                 48.5
-1      Singlet       -3294.67713               -3294.56652                 69.4
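
Each ΔΔG entry is just the difference between the two free energies in its row, converted from Hartree to kcal/mol. A minimal sketch of that arithmetic follows; the conversion factor (627.5095 kcal/mol per Hartree) and all the names in the code are assumptions chosen for illustration, not something stated in the article or in the archived outputs.

```python
# Assumed conversion factor: 1 Hartree = 627.5095 kcal/mol.
HARTREE_TO_KCAL = 627.5095

# Free energies (Hartree) copied from the table:
# (charge, multiplicity) -> (twisted phenyls, co-planar phenyls)
FREE_ENERGIES = {
    (0, "doublet"): (-3294.58693, -3294.48867),
    (0, "quartet"): (-3294.58777, -3294.51985),
    (1, "singlet"): (-3294.35473, -3294.24973),
    (1, "triplet"): (-3294.40821, -3294.33092),
    (-1, "singlet"): (-3294.67713, -3294.56652),
}


def flattening_penalty(twisted, coplanar):
    """Free energy cost (kcal/mol) of forcing the phenyl rings co-planar."""
    return (coplanar - twisted) * HARTREE_TO_KCAL


# One penalty per charge/spin state, rounded as in the table.
penalties = {
    state: round(flattening_penalty(*energies), 1)
    for state, energies in FREE_ENERGIES.items()
}

for (charge, multiplicity), ddg in penalties.items():
    print(f"charge {charge:+d} {multiplicity:8s} ddG = {ddg:5.1f} kcal/mol")

# Singlet-triplet separation of the twisted cation (kcal/mol).
cation_gap = (
    FREE_ENERGIES[(1, "singlet")][0] - FREE_ENERGIES[(1, "triplet")][0]
) * HARTREE_TO_KCAL
print(f"twisted cation, singlet above triplet by {cation_gap:.1f} kcal/mol")
```

The same subtraction applied down a column rather than along a row gives the separation between spin states of the same charge.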

Starting with a singlet cation as a model, the intent is to compare the “free molecule” energy with that of a flattened version where the dihedral angles of the phenyl rings relative to the porphyrin ring are constrained to ~0° rather than ~72°. This emerges as a 4th order saddle point (a stationary point with four negative roots for the force constant matrix). Such a property means that each co-planar phenyl group is independently a transition state for rotation. The calculated geometry overall is far from planar, having S4 symmetry. The image below in (a) shows how non-planar the molecule still is; (b) an attempt to orient it into the same position as is displayed in Figure 3 of the article.[1]

Singlet cation. Click on the image to get a rotatable model.

The free energy ΔG is 65.9 kcal/mol higher than the twisted form, which means that according to the model proposed, the binding energy of the phenyl groups to copper must recover at least this much energy. If we consider a cationic porphyrin interacting with an anionic metal surface as an ion-pair, then this is perhaps feasible. It is difficult however to see how more than two of the phenyl rings can simultaneously interact with a flat metal surface.

Next, the triplet state of the cation, again a 4th-order saddle point with a rotational barrier of ΔG = 48.5 kcal/mol; the triplet is 33.6 kcal/mol lower than the singlet using this functional (singlet-triplet separations can be quite sensitive to the DFT functional used).

Triplet cation. Click on the image to get a rotatable model.

Next, the neutral doublet, another 4th-order saddle point and below it the quartet state, which this time is just a 2nd-order saddle point (an interesting observation in itself).

Neutral Doublet

Neutral Quartet

Finally, the “flattened” singlet anion, which also emerges as a 4th-order saddle point (the triplet state has SCF convergence issues which I am still grappling with).

Singlet anion

To inspect the vibrational modes of any of these species, click on the appropriate image to open a JSmol display. Then right-click in the molecule window, navigate to the 3rd menu down from the top (Model – 48/226), where the frames/vibrations are ordered in sets of 25. Open the appropriate set and select the vibration you want from the list of wavenumbers shown. The preselected normal mode is the one identified in Figure 3 as 388 cm⁻¹, the symmetric N-Co stretch (I note the Figure 3 caption refers to them as vibrational frequencies; they are of course vibrational wavenumbers!). You can also inspect the four modes shown as negative numbers (correctly as imaginary numbers) to see how the phenyl groups rotate. If you want to analyze the vibrational modes using other tools (the free Avogadro program is a good one), then download the appropriate log or checkpoint file from the FAIR data archives at 10.14469/hpc/5461.
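
Two small points in the paragraph above lend themselves to a short illustration: the distinction between a wavenumber and a frequency, and the way the order of a saddle point is read off from the number of modes printed as negative. The sketch below is only that, a sketch; the function names are invented here and the mode list is a hypothetical excerpt, not data taken from the archived calculations.

```python
# Speed of light in cm/s (exact by definition of the metre).
C_CM_PER_S = 2.99792458e10


def wavenumber_to_thz(wavenumber_cm):
    """Convert a vibrational wavenumber (cm^-1) to a frequency in THz."""
    return wavenumber_cm * C_CM_PER_S / 1e12


def saddle_order(wavenumbers_cm):
    """Count imaginary modes, conventionally printed as negative wavenumbers."""
    return sum(1 for w in wavenumbers_cm if w < 0)


# Hypothetical excerpt of a mode list, for illustration only.
example_modes = [-45.2, -44.8, -40.1, -39.7, 12.3, 388.0]

print(f"388 cm^-1 corresponds to {wavenumber_to_thz(388.0):.2f} THz")
print(f"saddle-point order of the example list: {saddle_order(example_modes)}")
```

A true minimum would give an order of zero; each co-planar phenyl group that is independently a rotational transition state contributes one imaginary mode.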

I conclude by noting that the aspect of this article which I presume reports the Gaussian normal vibrational mode calculations (Figure 3, caption Bottom, assigned vibrational normal modes), has been a challenging one to analyse. Neither the charge state nor the spin state of these calculations is clearly indicated in the article (unless I missed it somewhere). The barriers to flattening out the molecule by twisting all four phenyl groups are unreported in the article, but emerge as substantial from the calculations here. The various species I calculated (summarised in the table and figures above) are all predicted to be non-planar. In the absence of provided coordinates with the article, the visual appearances (bottom row, Figure 3) are the only information available. These certainly appear flat and rather different from my projections shown above or below.

All of which amounts to a plea for more data and especially FAIR data to be submitted, providing information such as the charge and spin states used for the calculations, along with a full listing of all the normal mode vectors and wavenumbers. The article is only a letter at this stage; perhaps this information will appear in due course!


As noted above I have not attempted a direct replication, not least because there is no reported data to which any replication could be compared. The IRs of each vibrational mode are displayed along with the wavenumber when the 3D JSmol display is shown with a right-mouse-click.

References

  1. J. Lee, K.T. Crampton, N. Tallarida, and V.A. Apkarian, "Visualizing vibrational normal modes of a single molecule with atomically confined light", Nature, vol. 568, pp. 78-82, 2019. https://doi.org/10.1038/s41586-019-1059-9

How to stop (some) acetals hydrolysing.

Thursday, November 12th, 2015

Derek Lowe has a recent post entitled “Another Funny-Looking Structure Comes Through”. He cites a recent medchem article[1] in which the following acetal sub-structure appears in a promising drug candidate (blue component below). His point is that orally taken drugs have to survive acid (green below) encountered in the stomach, and acetals are famously sensitive to hydrolysis (red below). But if X=NH2, compound “G-5555” is apparently stable to acids.[1] So I pose the question here; why?

acetal

This reminded me of some work we did a few years ago on herbicides containing such an acetal substructure, where one diastereoisomer was very unstable to hydrolysis (and hence did not have the lifetime required of a herbicide) whereas the other diastereoisomer was far less labile and hence more suitable.[2],[3] Crystal structures (below) revealed that the two C-O bond lengths of the labile form were very unequal (Δ = 0.043Å), whereas the stable form had two equal C-O lengths (1.408Å, Δ = 0.0Å).

KAWYOW, Click for 3D

KAWYEM, Click for 3D

A search of the Cambridge structure database (CSD) surprisingly reveals no hits for molecules containing the (blue) substructure in which X=NH2, but there is one example[4],[5] of an orthoformate in which the group equivalent to X is protonated as Me2NH+. For this example, all three C-O lengths are shorter than even the hydrolytically stable herbicide above (1.405, 1.402, 1.396Å). The distribution for 6-ring acetals in general shows hot-spots at ~1.415Å and 1.43Å (but sadly it is not possible to e.g. use this database to correlate these lengths with the aqueous stability of the entries).

OCO

Is this tentative further evidence that a group X = NH2 positioned as above in an acetal can inhibit its hydrolysis?

HUZKEZ, click for 3D

Time for calculations. A model (X=R=H) for the hydrolysis was constructed as above in which proton transfer from an acid (ethanoic) is achieved via a cyclic 8-ring transition state and which includes a continuum solvent field as ωB97XD/6-311G(d,p)/SCRF=water and one explicit water in the proton relay. The IRC looks thus:

acetalH

This shows that the first event is protonation of an oxygen, closely followed by cleavage of the associated C-O bond, and ending with deprotonation of the erstwhile water molecule.

acetalha

The value of ΔG298 is 38.2 kcal/mol (38.4 in relative total energy). Although rather high for a facile thermal reaction (perhaps the 8-ring TS is a bit too strained; possibly adding a second active water molecule to form a 10-ring might lead to a lower barrier?), we are more interested in the effect upon this barrier of group X (Table below).

X ΔE ΔG DataDOI,TS DataDOI,IRC
H 38.4 38.2 [6] [7]
NH2,eq 39.8 38.8 [8] [9]
NH3.Cl,eq 45.1 43.1 [10] [11]
NH3.Cl,ax 42.6 41.5 [12] [13]
CF3,eq 41.9 40.1 [14] [15]
SF5,eq 43.6 42.4 [16] [17]

Introduction of X=NH3+.Cl into an (equatorial) position which is antiperiplanar to the C-O bonds of the acetal produces a modified IRC profile. The barrier measured at a point IRC = -10 is ~41 kcal/mol, which is noticeably higher than for X=H. In fact the final barrier is even higher, since the reactant goes on to form a hydrogen bond between the water molecule and the Cl, an extra stabilisation not present with X=H (and so not really appropriate to include in the comparison).

acetal-NH3Cl

acetalnh3cl-eqa

Placing the X=NH3+.Cl into an (axial) position which is not antiperiplanar to the C-O bonds shows a lower barrier compared to the equatorial isomer. This difference can also be illustrated by the NBO localised orbital energies of the two reactants. With X=NH3+.Cl axial, the lone pair on the oxygen being protonated by the acid has an energy of -0.464 au, whereas the equatorial equivalent is a “less reactive” -0.471 au (a difference in energy of 4.4 kcal/mol, which is VERY approximately related to the effects being discussed).

I conclude that the inhibition of acetal solvolysis is induced by the presence of an electron withdrawing group X, via antiperiplanar effects on the basicity of the acetal oxygen. In moderately low pH, X=NH2 is likely to be fully protonated; in this state, X=NH3+.Cl is an even better electron withdrawing group. The effect is also much stronger if X = equatorial. So one can predict here that if the alternate stereoisomer with X = axial were to be synthesised, it would hydrolyse more quickly. Other groups (X=F, CN etc) would probably show similar behaviour.
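
To put the barrier differences in the table into more tangible terms, they can be translated into approximate relative rates. The sketch below assumes simple transition state theory with identical pre-exponential factors at 298.15 K, so that the rate ratio is exp(ΔΔG/RT); that assumption, the function name and the dictionary are illustrative choices and not part of the original analysis. As noted above, the NH3+.Cl entries also contain an extra hydrogen-bond stabilisation, so the resulting factors are at best order-of-magnitude guides.

```python
import math

# Gas constant in kcal/(mol K).
R_KCAL = 1.987204e-3


def rate_ratio(dg_ref, dg_x, temperature=298.15):
    """k_ref / k_X for two activation free energies in kcal/mol.

    Assumes simple transition state theory with identical prefactors.
    """
    return math.exp((dg_x - dg_ref) / (R_KCAL * temperature))


# Activation free energies (kcal/mol) copied from the table above.
BARRIERS = {
    "H": 38.2,
    "NH2,eq": 38.8,
    "NH3.Cl,eq": 43.1,
    "NH3.Cl,ax": 41.5,
    "CF3,eq": 40.1,
    "SF5,eq": 42.4,
}

for x, dg in BARRIERS.items():
    factor = rate_ratio(BARRIERS["H"], dg)
    print(f"X = {x:10s} slower than X = H by a factor of ~{factor:.0f}")

# Equatorial versus axial ammonium: how much faster the axial isomer reacts.
ax_vs_eq = rate_ratio(BARRIERS["NH3.Cl,ax"], BARRIERS["NH3.Cl,eq"])
print(f"axial NH3.Cl hydrolyses ~{ax_vs_eq:.0f} times faster than equatorial")
```

On this crude measure even a difference of 1 to 2 kcal/mol between stereoisomers corresponds to roughly an order of magnitude in rate, which is the scale of effect relevant to drug or herbicide lifetimes.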


I have added two further entries, X=CF3 and X=SF5 in the table above, showing the latter to be more effective at inhibiting hydrolysis.

References

  1. C.O. Ndubaku, J.J. Crawford, J. Drobnick, I. Aliagas, D. Campbell, P. Dong, L.M. Dornan, S. Duron, J. Epler, L. Gazzard, C.E. Heise, K.P. Hoeflich, D. Jakubiak, H. La, W. Lee, B. Lin, J.P. Lyssikatos, J. Maksimoska, R. Marmorstein, L.J. Murray, T. O’Brien, A. Oh, S. Ramaswamy, W. Wang, X. Zhao, Y. Zhong, E. Blackwood, and J. Rudolph, "Design of Selective PAK1 Inhibitor G-5555: Improving Properties by Employing an Unorthodox Low-pKa Polar Moiety", ACS Medicinal Chemistry Letters, vol. 6, pp. 1241-1246, 2015. https://doi.org/10.1021/acsmedchemlett.5b00398
  2. P. Camilleri, D. Munro, K. Weaver, D.J. Williams, H.S. Rzepa, and A.M.Z. Slawin, "Isoxazolinyldioxepins. Part 1. Structure–reactivity studies of the hydrolysis of oxazolinyldioxepin derivatives", J. Chem. Soc., Perkin Trans. 2, pp. 1265-1269, 1989. https://doi.org/10.1039/p29890001265
  3. P. Camilleri, D. Munro, K. Weaver, D.J. Williams, H.S. Rzepa, and A.M.Z. Slawin, "Isoxazolinyldioxepins. Part 1. Structure–reactivity studies of the hydrolysis of oxazolinyldioxepin derivatives", J. Chem. Soc., Perkin Trans. 2, pp. 1929-1933, 1989. https://doi.org/10.1039/p29890001929
  4. C. Beckmann, P.G. Jones, and A.J. Kirby, "CCDC 209989: Experimental Crystal Structure Determination", 2003. https://doi.org/10.5517/cc71hvl
  5. C. Beckmann, P.G. Jones, and A.J. Kirby, "N,N,N′,N′-Tetramethylstreptamine 2,4,6-orthoformate hydrochloride", Acta Crystallographica Section E Structure Reports Online, vol. 59, pp. o566-o568, 2003. https://doi.org/10.1107/s1600536803006287
  6. H.S. Rzepa, "C 6 H 14 O 5", 2015. https://doi.org/10.14469/ch/191581
  7. H.S. Rzepa, "Gaussian Job Archive for C6H14O5", 2015. https://doi.org/10.6084/m9.figshare.1599751
  8. H.S. Rzepa, "C 6 H 15 N 1 O 5", 2015. https://doi.org/10.14469/ch/191582
  9. H.S. Rzepa, "C6H15NO5", 2015. https://doi.org/10.14469/ch/191586
  10. H.S. Rzepa, "C 6 H 16 Cl 1 N 1 O 5", 2015. https://doi.org/10.14469/ch/191584
  11. H.S. Rzepa, "C6H16ClNO5", 2015. https://doi.org/10.14469/ch/191588
  12. H.S. Rzepa, "C 6 H 16 Cl 1 N 1 O 5", 2015. https://doi.org/10.14469/ch/191590
  13. H.S. Rzepa, "Gaussian Job Archive for C6H16ClNO5", 2015. https://doi.org/10.6084/m9.figshare.1601891
  14. H.S. Rzepa, "C 7 H 13 F 3 O 5", 2015. https://doi.org/10.14469/ch/191592
  15. H.S. Rzepa, "Gaussian Job Archive for C7H13F3O5", 2015. https://doi.org/10.6084/m9.figshare.1603088
  16. H.S. Rzepa, "C 6 H 13 F 5 O 5 S 1", 2015. https://doi.org/10.14469/ch/191595
  17. H.S. Rzepa, "Gaussian Job Archive for C6H13F5O5S", 2015. https://doi.org/10.6084/m9.figshare.1603420

Yes, no, yes. Computational mechanistic exploration of (nickel-catalysed) cyclopropanation using tetramethylammonium triflate.

Thursday, October 1st, 2015

A fascinating re-examination has appeared[1] of a reaction first published[2] in 1960 by Wittig and then[3] repudiated by him in 1964 since it could not be replicated by a later student. According to the new work, the secret to a successful replication seems to be the presence of traces of a nickel catalyst (originally coming from e.g. a nickel spatula?). In this recent article[1] a mechanism for the catalytic cycle is proposed. Here I thought I might explore this mechanism using calculations to see if any further insights might emerge.

cyclopropanation

In the mechanism above (I have retained the original numbering shown in the article itself), Ln is set to 2PH3 as an initial approximation and the solvent thf is approximated only by a continuum solvation field, with no explicit thf molecules involved at this stage. At this level and using ωB97XD/Def2-SVPD/SCRF=thf free energies, one can explore the cycle quite quickly (~2-3 days). It is also interesting that this reaction unusually involved nine different elements (I wonder what the record is? Not much greater I suspect).

Species ΔΔG298 DataDOI
4+CH2NMe3+LiOTf + ethene +23.9 [4],[5]
5 0.0 [6]
TS (5→ 9) 12.7 [7],[8]
9 + LiOTf + NMe3 0.2 [9]
TS (9 + ethene → 6) 7.2 [10],[11]
6 4.8 [12]
TS (6 → 7) 11.2 [13],[14]
7 -36.3 [15]
TS (7 → 4+8) -18.8 [16],[17]
4+8 + LiOTf + NMe3 -29.7 [4]
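
The barrier and reaction free energy of each individual step follow from the table by subtraction. The sketch below does this bookkeeping; it assumes that every entry sits on a common stoichiometric footing (as the single reference point, complex 5, suggests), and the shortened species labels and function name are my own.

```python
# Relative free energies (kcal/mol) from the table above, relative to complex 5.
# Labels are shortened: co-products such as LiOTf and NMe3 are left implicit.
PROFILE = [
    ("5", 0.0),
    ("TS(5->9)", 12.7),
    ("9", 0.2),
    ("TS(9->6)", 7.2),
    ("6", 4.8),
    ("TS(6->7)", 11.2),
    ("7", -36.3),
    ("TS(7->4+8)", -18.8),
    ("4+8", -29.7),
]


def step_summary(profile):
    """Return (reactant, product, barrier, reaction energy) for each step.

    The profile must alternate minimum, transition state, minimum, and so on.
    """
    steps = []
    for i in range(0, len(profile) - 2, 2):
        (reactant, g_r), (_, g_ts), (product, g_p) = profile[i:i + 3]
        steps.append((reactant, product, round(g_ts - g_r, 1), round(g_p - g_r, 1)))
    return steps


steps = step_summary(PROFILE)
for reactant, product, barrier, reaction in steps:
    print(f"{reactant:>3s} -> {product:<4s} barrier {barrier:5.1f}  dG {reaction:6.1f}")
```

Laid out this way, the largest single barrier in the forward direction is the final extrusion from 7, which is also the only endo-energic step.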

The structure of the complex 5 is more or less as shown in the article. The mean single bonded Ni-C length in the Cambridge structure database (CSD) is ~1.9Å, and (formally at least) Ni=C lengths are shorter at ~1.80-1.85Å. There is one reasonable analogy to the sub-structure shown below[18],[19] with a C-Ni length of 1.90Å, Ni-Li = 2.51Å and Li-C = 2.40Å, which is reasonably similar to what is shown below.

Click for 3D

The elimination of NMe3 reveals a reasonable thermal barrier, resulting in the formation of the nickel-carbene product and the complex between NMe3 and LiOTf. 

5a5-9

The Ni-carbene then reacts with alkene (modelled here by ethene) to form a Ni-alkene π-complex, with a very low barrier to the exo-energic reaction.

9-6a9-6

This complex then rearranges, again with a small barrier, to the metallocyclobutane, with considerable release of energy.

6-7a6-7

Finally, the metallocyclobutane extrudes the nickel to form cyclopropane bound to the Ni(PH3)2 as a pseudo-π/agostic complex, with this step of the reaction being somewhat endo-energic (+6.6 kcal/mol). As modelled, it produces a low-coordination Ni product 4, which also causes the initial reactants to be relatively high in energy (+23.9 relative to 5). This suggests that the entire cycle should optimally be repeated by including say two explicit thf solvent molecules, which could coordinate to 4, thus lowering its energy relative to the rest of the cycle. 

7-4a7-4

Below is shown the NCI (non-covalent-interactions) surface for the Ni-cyclopropane complex, revealing the relatively high density between the Ni and the edge of the cyclopropane (high enough indeed to be considered on the verge of being covalent density). No examples of this motif are found in the CSD.

Click for 3D


Overall, the reaction as shown shows entirely reasonable energetics and activation free energy barriers (with the caveat that inclusion of explicit solvent molecules might improve things, see above). We might conclude from this that the catalytic cycle as proposed is entirely reasonable. What we cannot comment on of course is the relative energetics of any of the competing side reaction shown in the original scheme,[1] but it would be really easy to include them in a more complete analysis if needed. I wanted to show here that a simple reality check on a proposed reaction mechanism can be quick to perform, and perhaps nowadays should be regarded as a sine qua non of mechanistic speculation.

References

  1. S.A. Künzi, J.M. Sarria Toro, T. den Hartog, and P. Chen, "Nickel‐Catalyzed Cyclopropanation with NMe4OTf and nBuLi", Angewandte Chemie International Edition, vol. 54, pp. 10670-10674, 2015. https://doi.org/10.1002/anie.201505482
  2. V. Franzen, and G. Wittig, "Trimethylammonium‐methylid als Methylen‐Donator", Angewandte Chemie, vol. 72, pp. 417-417, 1960. https://doi.org/10.1002/ange.19600721210
  3. G. Wittig, and D. Krauss, "Cyclopropanierungen bei Einwirkung von N‐Yliden auf Olefine", Justus Liebigs Annalen der Chemie, vol. 679, pp. 34-41, 1964. https://doi.org/10.1002/jlac.19646790106
  4. H.S. Rzepa, "C 4 H 9 F 3 Li 1 N 1 O 3 S 1", 2015. https://doi.org/10.14469/ch/191545
  5. H.S. Rzepa, "C 5 H 11 F 3 Li 1 N 1 O 3 S 1", 2015. https://doi.org/10.14469/ch/191553
  6. H.S. Rzepa, "C 5 H 17 F 3 Li 1 N 1 Ni 1 O 3 P 2 S 1", 2015. https://doi.org/10.14469/ch/191554
  7. H.S. Rzepa, "C 5 H 17 F 3 Li 1 N 1 Ni 1 O 3 P 2 S 1", 2015. https://doi.org/10.14469/ch/191536
  8. H.S. Rzepa, "C5H17F3LiNNiO3P2S", 2015. https://doi.org/10.14469/ch/191550
  9. H.S. Rzepa, "C 5 H 17 F 3 Li 1 N 1 Ni 1 O 3 P 2 S 1", 2015. https://doi.org/10.14469/ch/191555
  10. H.S. Rzepa, "C 3 H 12 Ni 1 P 2", 2015. https://doi.org/10.14469/ch/191547
  11. H.S. Rzepa, "C3H12NiP2", 2015. https://doi.org/10.14469/ch/191546
  12. H.S. Rzepa, "C 3 H 12 Ni 1 P 2", 2015. https://doi.org/10.14469/ch/191541
  13. H.S. Rzepa, "C 3 H 12 Ni 1 P 2", 2015. https://doi.org/10.14469/ch/191540
  14. H.S. Rzepa, "C3H12NiP2", 2015. https://doi.org/10.14469/ch/191548
  15. H.S. Rzepa, "C 3 H 12 Ni 1 P 2", 2015. https://doi.org/10.14469/ch/191542
  16. H.S. Rzepa, "C 3 H 12 Ni 1 P 2", 2015. https://doi.org/10.14469/ch/191537
  17. H.S. Rzepa, "C3H12NiP2", 2015. https://doi.org/10.14469/ch/191538
  18. P. Buchalski, I. Grabowska, E. Kaminska, and K. Suwinska, "CCDC 650794: Experimental Crystal Structure Determination", 2008. https://doi.org/10.5517/ccpv6c2
  19. P. Buchalski, I. Grabowska, E. Kamińska, and K. Suwińska, "Synthesis and Structures of 9-Nickelafluorenyllithium Complexes", Organometallics, vol. 27, pp. 2346-2349, 2008. https://doi.org/10.1021/om701275u

Deviations from tetrahedral four-coordinate carbon: a statistical exploration.

Sunday, September 6th, 2015

An article entitled “Four Decades of the Chemistry of Planar Hypercoordinate Compounds”[1] was recently reviewed by Steve Bachrach on his blog, where you can also see comments. Given the recent crystallographic themes here, I thought I might try a search of the CSD (Cambridge structure database) to see whether anything interesting might emerge for tetracoordinate carbon.

The search definition is shown below using a simple carbon with four ligands, the ligands themselves also being tetracoordinate carbon. The search is restricted to data collected at temperatures below 140K, as well as R-factor <5%, no errors and no disorder. Cyclic species are allowed and a statistically reasonable 2773 hits emerged from the search.

Scheme

Recollect that the idealised angle subtended at the centre is 109.47°. I show below three separate heat plots of the search results. Why three? The way the search software (Conquest) works is that one could define four C-C distances and six angles, and then plot any combination of one distance and one angle. I show just three combinations here, but could have included many more.

There appear to be four distinct clusters of values for this angle that emerge from the three plots shown below (the “bin size” is 100, and the frequency colour code indicates how many hits there are in each bin).

  1. The hotspot is unsurprisingly ~109° with a corresponding C-C distance of ~1.54Å.
  2. There may be two clusters at angles of ~60° (cyclopropane), with C-C values ranging from ~1.47 to ~1.55Å.
  3. A collection at ~90° (mostly cyclobutane?), with C-C values up to 1.6Å.
  4. A collection at ~140° (again small rings), now with much shorter C-C values of ~1.46Å. This is reminiscent of the approximation that the hybridisation in e.g. cyclopropane is a combination of sp5 (for the C-C bonds) and sp2 (for the C-H bonds).

Scheme

Scheme

Scheme

Ideally, what one might want to plot would be sums of four angles; for a pure tetrahedral carbon the sum would always be 438° (4*109.47°) but for a pure planar carbon it could be as low as 360° (4*90°). One could then see how closely the distribution approaches the latter and hence reveal whether there are any true planar tetracoordinate carbon species known. Although the Conquest software cannot analyse in such terms, a Python-based API has recently been released that should allow this to be done, although I should state that this requires a commercial license and it is not open access code. If we manage to get it working, I will report!
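
Independently of how the coordinates are retrieved, the angle-sum metric itself is simple to compute. The sketch below uses only the Python standard library and deliberately makes no calls to the CSD Python API; taking the four smallest of the six ligand-carbon-ligand angles as "the sum of four angles" is my own assumption about how the metric would be defined, and the two test geometries are idealised, not crystal structures.

```python
import math
from itertools import combinations


def angle(center, a, b):
    """Angle a-center-b in degrees."""
    va = [x - c for x, c in zip(a, center)]
    vb = [x - c for x, c in zip(b, center)]
    dot = sum(p * q for p, q in zip(va, vb))
    norm = math.sqrt(sum(p * p for p in va)) * math.sqrt(sum(q * q for q in vb))
    # Clamp to guard against rounding just outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))


def four_angle_sum(center, ligands):
    """Sum of the four smallest of the six ligand-centre-ligand angles."""
    angles = sorted(angle(center, a, b) for a, b in combinations(ligands, 2))
    return sum(angles[:4])


ORIGIN = (0.0, 0.0, 0.0)
# Idealised geometries (arbitrary units), for illustration only.
TETRAHEDRAL = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
PLANAR = [(1, 0, 0), (0, 1, 0), (-1, 0, 0), (0, -1, 0)]

print(f"tetrahedral: {four_angle_sum(ORIGIN, TETRAHEDRAL):.1f} degrees")
print(f"planar:      {four_angle_sum(ORIGIN, PLANAR):.1f} degrees")
```

Applied to every hit from a search, a histogram of this sum would show directly how far the population extends from the tetrahedral limit towards 360°.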


As a teaser I also include a plot of six-coordinate carbon, in which the ligands can be any non-metal. Note the clusters at angles of 60, ~112 and ~120-130°. It is worth pointing out that the definition of the connection between a carbon and a ligand as a “bond” becomes increasingly arbitrary as the coordination becomes “hyper”. Because crystallography does not measure electron densities in “bonds”, we know nothing of its topology in this region. It is therefore quite possible that the appearance of the heat plot below might be related just as much to whatever convention is being used in creating the entry in the CSD as it would be to a quantum analysis of the bonding.

Scheme

References

  1. L. Yang, E. Ganz, Z. Chen, Z. Wang, and P.V.R. Schleyer, "Four Decades of the Chemistry of Planar Hypercoordinate Compounds", Angewandte Chemie International Edition, vol. 54, pp. 9468-9501, 2015. https://doi.org/10.1002/anie.201410407

A visualisation of the effects of conjugation; dienes and biaryls.

Tuesday, August 25th, 2015

Here is another exploration of simple chemical concepts using crystal structures. Consider a simple diene: how does the central C-C bond length respond to the torsion angle between the two C=C bonds?

arm1

The search of the CSD (Cambridge structure database) is constrained to R < 5%, no errors and no disorder, and the central C-C bond is specified to be acyclic.

arm1

  1. Note first that the hotspot occurs for a torsion angle of 180°, a trans diene.
  2. There is just a hint that the C-C distance for a cis-diene might be a little shorter than the trans diene, but this might not be significant.
  3. There is a gentle curve illustrating that the C-C distance is indeed a maximum at 90°.
  4. The C-C bond extends from ~1.445Å when the two double bonds are coplanar (fully conjugated) to ~1.48Å when orthogonal. Not much of a change, but statistically highly significant.
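
For readers who want to reproduce the torsion axis of such a plot from their own coordinates, the C=C-C=C torsion angle can be computed from the four atom positions as sketched below. The search software of course does this internally; the function name and the coordinates here are hypothetical, chosen only to show the planar trans, planar cis and orthogonal cases. Because sign conventions for torsions differ between programs, only the magnitude is compared.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def torsion(p1, p2, p3, p4):
    """Torsion angle p1-p2-p3-p4 in degrees, in the range -180 to 180."""
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    length = math.sqrt(_dot(b2, b2))
    b2_unit = tuple(x / length for x in b2)
    m1 = _cross(n1, b2_unit)
    return math.degrees(math.atan2(_dot(m1, n2), _dot(n1, n2)))


# Hypothetical diene skeleton (Å): C1=C2-C3=C4 with a 1.45 Å central bond.
C1, C2, C3 = (-0.7, 1.1, 0.0), (0.0, 0.0, 0.0), (1.45, 0.0, 0.0)
C4_TRANS, C4_CIS, C4_ORTHO = (2.15, -1.1, 0.0), (2.15, 1.1, 0.0), (2.15, 0.0, 1.1)

for label, c4 in (("trans", C4_TRANS), ("cis", C4_CIS), ("orthogonal", C4_ORTHO)):
    print(f"{label:10s} |torsion| = {abs(torsion(C1, C2, C3, c4)):5.1f} degrees")
```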

Here is another search, this time of the C=C-C=C motif embedded into a biaryl, of which there are far more examples. This time, the (red) hotspot is actually at 90°, with local (green) hotspots at 0 and 180° but also at 45 and 135°. Again, you can easily spot the maximum in C-C bond length at 90° but notice how much smaller the bond lengthening is (~0.01Å). This lengthening is inhibited by retention of the aromaticity of the two aryl rings; again the statistical effect is highly significant. Perhaps also significant is that the C-C bond at torsions of 0 or 180° appears to be no shorter than the values at 45 and 135°.

arm1

arm1

Both these searches took about 5 minutes each, and serve to illustrate just how many basic chemical concepts can be teased out of a statistical analysis of crystal structures.


The analogous diagram for O=C-C=C is shown below;

arm1

That for  O=C-C=O is different however;

arm1

The 2015 Bradley-Mason prize for open chemistry.

Friday, June 26th, 2015

Open principles in the sciences in general and chemistry in particular are increasingly nowadays preached from funding councils down, but it can be more of a challenge to find innovative practitioners. Part of the problem perhaps is that many of the current reward systems for scientists do not always help promote openness. Jean-Claude Bradley was a young scientist who was passionately committed to practising open chemistry, even though when he started he could not have anticipated any honours for doing so. A year ago a one day meeting at Cambridge was held to celebrate his achievements, followed up with a special issue of the Journal of Cheminformatics. Peter Murray-Rust and I both contributed and following the meeting we decided to help promote Open Chemistry via an annual award to be called the Bradley-Mason prize. This would celebrate both “JC” himself and Nick Mason, who also made outstanding contributions to the cause whilst studying at Imperial College. The prize was initially to be given to an undergraduate student at Imperial, but was also extended to postgraduate students who have promoted and showcased open chemistry in their PhD researches.

Peter and I are delighted to announce the inaugural winners of this prize.

The postgraduate winner is Tom Phillips for his open blog describing his experiences as a PhD student and for leading by example. He has published his instrumental codes on Github (and now Zenodo[1]) and data and codes for reproducing the graphs in his work on the “lab on a chip” in Figshare[2] and through his blog has encouraged other research students to do the same. Tom has worked assiduously to ensure that all the articles describing his PhD work are or will be open access.[3]

The undergraduate winner is Tom Arrow for his “spare time” involvement with WikiMedia (the foundation that underpins the open Wikipedia), including participating in a Wikimedia EU hackathon in Lyon France, and feeding his experiences and skills back into his undergraduate environment as well as enhancing the teaching Wiki used by his fellow students. Tom took the lead in introducing us to Wikidata[4] for storing chemical data in an open Wikibase data repository and in promoting its use for enriching Wikipedia chemistry pages and showcasing open data in undergraduate teaching environments.

References

  1. T. Phillips, and S. Macbeth, "pumpy: Zenodo release", 2015. https://doi.org/10.5281/zenodo.19033
  2. T. Phillips, J.H. Bannock, and J.D. Mello, "Data for microscale extraction and phase separation using a porous capillary", 2015. https://doi.org/10.6084/m9.figshare.1447208
  3. T.W. Phillips, J.H. Bannock, and J.C. deMello, "Microscale extraction and phase separation using a porous capillary", Lab on a Chip, vol. 15, pp. 2960-2967, 2015. https://doi.org/10.1039/c5lc00430f
  4. D. Vrandečić, and M. Krötzsch, "Wikidata", Communications of the ACM, vol. 57, pp. 78-85, 2014. https://doi.org/10.1145/2629489

Artemisinin: are stereo-electronics at the core of its (re)activity?

Sunday, April 13th, 2014

Around 100 tons of the potent antimalarial artemisinin is produced annually; a remarkable quantity given its very unusual and fragile looking molecular structure (below). When I looked at this, I was immediately struck by a thought: surely this is a classic molecule for analyzing stereoelectronic effects (anomeric and gauche). Here this aspect is explored.

artemisinin

I start by listing the bonds around which interesting things might happen:

  1. C3-C4 has the gauche motif of a 1,2-diol
  2. Carbons 7 and 4 are anomeric centres, with the focus on bonds 1-7/7-6 and 6-4/4-5
  3. Bond 1-2 has the potential for a so-called α-effect, where the lone pairs on adjacent hetero-atoms are buttressed.

The crystal structure is shown below, annotated with pertinent bond lengths (trivial atom numbering). The dihedral 2-3-4-6 and 2-3-4-6 are respectively -51 and 72° (hence a double gauche at the 3-4 bond).

First, an exploration of what might be happening around C4. The following is a search of the Cambridge crystal structure database, plotted for the two C-O bond lengths common to C4.

Here, DIST1 is C4-O6 and DIST2 is C4-O5. Notice the very pronounced asymmetry; at the red hotspot above, the most frequent occurrence is ~1.39 and 1.46Å respectively; artemisinin is more or less at that hotspot. This can be quantified by the NBO E(2) energies for the interaction of an oxygen lone pair antiperiplanar to the C-O σ* bond:

  1. Lp(O6)-σ*(C4-O5) = 21.2 kcal/mol which helps to account for the short C4-O6 and the long C4-O5 bonds.
  2. whereas the reverse donation of Lp(O5)-σ*(C4-O6) is merely 4.8 kcal/mol (normally the two donations are more or less equal, and hence so are the two C-O bond lengths).
  3. At the second anomeric centre of C7, Lp(O1)-σ*(C7-O6) = 19.9 kcal/mol
  4. whereas the reverse donation of Lp(O6)-σ*(C7-O1) is 5.7 kcal/mol, again highly asymmetric, as are the C-O bond lengths (1.413/1.441Å).
  5. Next, the gauche effect at C3-C4. The C4-H to C3-O2 donation is 6.4 kcal/mol, again contributing to the longer C-O length of 1.447Å.
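The asymmetry in the anomeric donations listed above can be put into numbers with a minimal sketch. The energies are the ones quoted in the list; the dictionary layout and the labels "dominant" and "reverse" are illustrative assumptions, not NBO terminology.

```python
# NBO E(2) donation energies in kcal/mol, copied from the list above.
# The keys "dominant" and "reverse" are illustrative labels only.
e2_kcal = {
    "C4": {"dominant": 21.2, "reverse": 4.8},  # Lp(O6)->sigma*(C4-O5) vs Lp(O5)->sigma*(C4-O6)
    "C7": {"dominant": 19.9, "reverse": 5.7},  # Lp(O1)->sigma*(C7-O6) vs Lp(O6)->sigma*(C7-O1)
}


def asymmetry(pair):
    """Return (ratio, difference) of dominant to reverse donation, to 1 decimal place."""
    ratio = pair["dominant"] / pair["reverse"]
    difference = pair["dominant"] - pair["reverse"]
    return round(ratio, 1), round(difference, 1)


results = {centre: asymmetry(pair) for centre, pair in e2_kcal.items()}
for centre, (ratio, difference) in results.items():
    print(f"{centre}: ratio {ratio}, difference {difference} kcal/mol")
```

On these figures the dominant donation is roughly four times the reverse one at both anomeric centres, which is the asymmetry being discussed.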

Where such stereoelectronic interactions are asymmetric, one might expect enhanced reactivity. A good example is a pair of stereoisomers of a 7-ring herbicide[1], where one anomer with equal anomeric C-O lengths is a stable soil-persistent species, whereas the other with asymmetric lengths has a very short soil residency due to rapid hydrolysis. It is tempting to speculate that some aspect of the activity of artemisinin may be due to such stereoelectronic asymmetries.

Finally, because it is virtually free to do so in a computational sense, I show the computed VCD spectrum[2] (covering the possibility that it is measured at some point). The calculated[3] optical rotation [α]589 is +93° (obs ~+76°). Whilst the absolute configuration is not in any doubt, it is always nice to have further confirmation.

artemisinin

References

  1. P. Camilleri, D. Munro, K. Weaver, D.J. Williams, H.S. Rzepa, and A.M.Z. Slawin, "Isoxazolinyldioxepins. Part 1. Structure–reactivity studies of the hydrolysis of oxazolinyldioxepin derivatives", J. Chem. Soc., Perkin Trans. 2, pp. 1929-1933, 1989. https://doi.org/10.1039/p29890001929
  2. H.S. Rzepa, "Gaussian Job Archive for C15H22O5", 2014. https://doi.org/10.6084/m9.figshare.997360
  3. H.S. Rzepa, "Gaussian Job Archive for C15H22O5", 2014. https://doi.org/10.6084/m9.figshare.997463

The butterfly effect in chemistry: Bimodal M~S bonds?

Sunday, July 14th, 2013

I noted previously that some 8-ring cyclic compounds could exist in either a planar-aromatic or a non-planar-non-aromatic mode, the mode being determined by apparently quite small changes in a ring substituent. Hunting for other examples of such chemistry on the edge, I did a search of the Cambridge crystal database for metal sulfides. 

The search was specified as follows:

  1. Any element from one of the three transition metal series, TS1, TS2 or TS3
  2. to contain a M-S bond (any type of bond)
  3. with the restriction that the sulfur has only one atom attached (the metal)
  4. R < 0.05
  5. No errors and no disorder

The results for the three transition series were quite different. The first row indicated a distribution with a single maximum (~2.3Å), albeit with a very long tail reaching out to 3.2Å. The second row had a very clear bimodal distribution, with two peaks, at ~2.2Å and ~2.55Å, the latter having the greater number of examples. The third row showed an inverse distribution, with the prominent peak at ~2.2Å and the smaller peak at ~2.55Å.

TS1 TS2 TS3

A search for the possible reasons for the bimodal distributions is now needed. This could be done in two ways; (a) to refine the search, or (b) to individually scrutinize each example to identify if any chemical reasons can be found. Since there are 100s of examples, I will concentrate on the first tactic. The most obvious is that the compounds cluster into neutral and ionic, as below, which allows the search to be constrained to only entries where a negative charge has been assigned to the sulfur.

MS

TS2m TS3m

This reveals that such entries are a fraction of the total, and for these the M-S bond length clusters around 2.2Å, which leaves us with a mystery. This would mean that the neutral systems, presumably with a formal M=S bond, would have to be responsible for the entries at ~2.55Å. Could it really be that a higher bond order results in a longer bond?

What the search does not allow us to do (or I am unaware of how to do it) is to identify whether there are any pair of examples where the ligands surrounding M are identical but where one of the pair has a short and the other a long M-S bond. Or indeed such a pair where the differences between the ligands are small (however that is defined). Similarly, one cannot constrain the searches by spin state to see if that might be responsible. It would also be nice to have the system automatically evaluate whether the valence shell on the metal is full (18) or not (16, 17) to see if that is related to the bond length. Indeed, an example of needing to count the valence shell was recently brought to my attention by a student, and it will be the subject of a future post. 

Computers 1967-2013: a personal perspective. Part 5. Network bandwidth.

Wednesday, June 5th, 2013

In a time of change, we often do not notice that Δ = ∫δ. Here I am thinking of network bandwidth, and my personal experience of it over a 46 year period.

I first encountered bandwidth in 1967 (although it was not called that then). I was writing Algol code to compute the value of π, using paper tape to send the code to the computer. Unfortunately, the paper tape punch was about 10 km from that computer. The round trip (by van) took about a week, the outcome being often merely to discover that the first line of the code contained a compilation error. I think I got to computing π after about six weeks. That is a bandwidth of about 18 characters (108 bits) in 3628800 seconds, or 0.00003 bits per second.
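The arithmetic behind that figure can be checked with a short sketch. It assumes, as the numbers above imply, 6 bits per character and a full six weeks between starting and getting a result; the variable names are illustrative.

```python
# Figures quoted in the text: 18 characters, 6 bits each, six weeks to a result.
CHARACTERS = 18
BITS_PER_CHARACTER = 6  # implied by "18 characters (108 bits)"
WEEKS = 6

bits = CHARACTERS * BITS_PER_CHARACTER
seconds = WEEKS * 7 * 24 * 60 * 60
bandwidth_bps = bits / seconds

print(f"{bits} bits in {seconds} s = {bandwidth_bps:.5f} bits per second")
```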

I did my undergraduate work in 1969, when the distance between the card punch and the computer had reduced to about 50m, and instant turnaround involved circulating in a loop between the punch and the line printer, hoping that neither suffered a paper-wreck. The bandwidth had certainly gone up. On a good day, you could make 20 or so circuits, which did leave one feeling faintly dizzy. 

The next improvement came in 1972, when I was solving non-linear equations for kinetic rate constants, using a teletypewriter running at 110 bits per second (baud), or ~18 characters per second on the 6-bit computers of that era. This was about 50m from the lab where the kinetic measurements were made (using, if you are interested, a scintillation counter. Yes, I was mildly radioactive for most of my PhD, but I do not believe I glowed in the dark). This bandwidth was in fact fine for uploading kinetic data, and receiving the computed rate constant and its standard error. You might note however that this teletypewriter was the only one in the building I occupied, and yet demand for it was small (I was pretty much its only user).

The next increment occurred in Texas 1974-1977, where I was now doing quantum chemical calculations. Back in time to the card punch and the lineprinter (Texas is big, and so now the distance between them was a 10 minute walk). But in my last year there, a state-of-the-art 300 baud teletypewriter was installed! This was now fast enough to play a computer game (something to do with Dragons and Dungeons I think), and so now there was competition to use it. Particularly from one of my friends, who shall be called George, and who on one occasion spent about 48 virtually contiguous hours trying to get to the last level. The rest of us returned to the card punch to submit the calculations. It was also during this period that the first emails started to be exchanged, but only really as a curiosity: “it would never catch on” was the opinion of most.

Back in the UK by 1977, I was overwhelmed by the speed of the 9.6 kbaud graphics terminal I now had access to, 32 times faster. And the rate continued to multiply, by a further 1000 to attain 10 Mbaud in 1987. But another change occurred during this period. The previous eras had involved transmitting the data no more than ~200m, from one point in the campus to another. But by 1986, if one tried hard enough, one could reach ARPANET. And that was 5000 km away! My first use of such distances was to reach California and download Apple’s system 5.0 for the Macs in the department (I have described elsewhere the role the Mac’s printer port played in this). From then on, we always did have the latest operating system installed on most of the machines (although not always did this subterfuge address the intended issue, which was to stop the computer crashing as often).

These speeds however did not reach beyond the university. Back home, around 1983, I was back to using a 300 baud modem, with an acoustic coupler to the land line. Our young daughter, aged 3 at the time, joined in the data transmission with gusto. Her joyful shrieks were invariably picked up by the acoustic coupler, and translated into a jumble of characters, which were then interleaved into the numbers coming back from quantum calculations. It was sometimes difficult to tell them apart! These domestic modems gradually got faster, probably attaining 9.6 kbaud by about 1993 (during the course of which the acoustic component was replaced by electronics, and oddly, our daughter stopped shrieking in quite the same way). 

Back in the university in 1993, the first 100 megabits per second (100Mbps ≅100 Mbaud) ethernet lines and switches were being installed, but the national and international backbones were still a lot slower. It was in this year that I was approached to be part of a SuperJanet project. We were going to do a molecular videoconference from London to Cambridge and Leeds; a three-way connection, and this needed ~ 20Mbps to transmit the signal from the video camera as well as the 3D images of molecules in real-time (compression techniques were not so advanced in those days). Because BT was sponsoring the project, they naturally wanted some publicity, and so we even got to appear on the national television news that night. But we came within about 1 minute of a disaster. Our 20Mbps connection went through the SuperJanet national backbone, the capacity of which was, you guessed, ~ 20 Mbps. The network operators (located at the Rutherford-Appleton laboratories), who we had not had the foresight to pre-warn, came within 1 minute of isolating Imperial College from the national network because of our bandwidth hogging. I met them a month or so later, and they told me this. I feel I was lucky to escape with my life and body intact from that meeting (or to put it another way, they were not happy bunnies). 

By about 2000, I had achieved 1 Gbps to my desktop computer (and there it has stayed for the past 13 years). What about home? Well, to cut the story short, I recently benchmarked the domestic WiFi connection between a laptop and “the world” at about 65 Mbps (download) and 18 Mbps (upload), roughly 200,000 times greater than the 300 baud of 30 years earlier and some 12 orders of magnitude greater than in 1967. I gather however that some lucky inhabitants of Austin Texas (the scene of my 1974-1977 experiments), courtesy of Google, can get 1 Gbps!
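For anyone who wants to see the growth since 1967 laid out, here is a rough sketch using only rates quoted in this post. It assumes that 1 baud can be treated as 1 bit per second, which is a simplification, and the dictionary labels are my own shorthand for the eras described.

```python
import math

# 1967 baseline from the paper-tape era: 108 bits in 3628800 seconds.
BASELINE_1967_BPS = 108 / 3_628_800

# Rates quoted in the text, taking 1 baud as 1 bit per second (an assumption).
rates_bps = {
    "1972 teletypewriter": 110,
    "1977 Texas teletypewriter": 300,
    "1977 graphics terminal": 9_600,
    "1987 campus network": 10e6,
    "1993 campus ethernet": 100e6,
    "2000 desktop": 1e9,
    "2013 home WiFi download": 65e6,
}


def orders_of_magnitude(rate_bps):
    """Base-10 logarithm of the rate relative to the 1967 baseline."""
    return math.log10(rate_bps / BASELINE_1967_BPS)


growth = {label: orders_of_magnitude(rate) for label, rate in rates_bps.items()}
for label, value in growth.items():
    print(f"{label}: {value:.1f} orders of magnitude above 1967")
```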

I will end by quoting Samuel Butler, writing in 1863: “I venture to suggest that … the general development of the human race to be well and effectually completed when all men, in all places, without any loss of time, at a low rate of charge, are cognizant through their senses, of all that they desire to be cognizant of in all other places. … This is the grand annihilation of time and place which we are all striving for, and which in one small part we have been permitted to see actually realised” (Quoted in George Dyson, “Darwin amongst the Machines, The Evolution of Global Intelligence”, Addison-Wesley, N.Y., 1997. ISBN 0-201-400649-7).


I just benchmarked my office computer (using only solid-state memory and that 1Gbps connection) and got 58Mbps (download)/75Mbps (upload).

The standard program was NCSA Telnet, if I remember. You made a connection from the computer (using its printer port) to the ARPANET node at University College London (not a widely advertised service), and thence to an Apple FTP site where one could initiate an anonymous file transfer back to one’s computer. System 5 was about half a Mbyte then, and this took about 1-2 hours to retrieve (unless the connection went down, in which case one started again).