Archive for the ‘crystal_structure_mining’ Category

Artemisinin: are stereo-electronics at the core of its (re)activity?

Sunday, April 13th, 2014

Around 100 tons of the potent antimalarial artemisinin are produced annually, a remarkable quantity given its very unusual and fragile-looking molecular structure (below). When I looked at this, I was immediately struck by a thought: surely this is a classic molecule for analyzing stereoelectronic effects (anomeric and gauche). Here this aspect is explored.

artemisinin

I start by listing the bonds around which interesting things might happen:

  1. C3-C4 has the gauche motif of a 1,2-diol
  2. Carbons 7 and 4 are anomeric centres, with the focus on bonds 1-7/7-6 and 6-4/4-5
  3. Bond 1-2 has the potential for a so-called α-effect, where the lone pairs on adjacent hetero-atoms are buttressed.

The crystal structure is shown below, annotated with pertinent bond lengths (trivial atom numbering). The two O-C-C-O dihedrals about the 3-4 bond (2-3-4-5 and 2-3-4-6) take the values -51 and 72° (hence a double gauche at the 3-4 bond).
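For readers who want to reproduce such torsion angles from Cartesian coordinates, the following is a minimal sketch of a signed dihedral calculation. The function name and the test coordinates are illustrative assumptions; they are not taken from the artemisinin crystal structure.

```python
import math

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for the atom sequence p0-p1-p2-p3."""
    def sub(a, b):
        return tuple(x - y for x, y in zip(a, b))

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    b0 = sub(p1, p0)   # first bond vector
    b1 = sub(p2, p1)   # central bond vector
    b2 = sub(p3, p2)   # last bond vector
    n1 = cross(b0, b1)  # normal to the plane p0-p1-p2
    n2 = cross(b1, b2)  # normal to the plane p1-p2-p3
    x = dot(n1, n2)
    y = math.sqrt(dot(b1, b1)) * dot(b0, n2)
    return math.degrees(math.atan2(y, x))

# Made-up coordinates: the last atom is rotated by 90 degrees about the central bond
angle = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
```

A magnitude near 60° would indicate a gauche arrangement, one near 180° an antiperiplanar one.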

Click for 3D

First, an exploration of what might be happening around C4. The following is a search of the Cambridge crystal structure database, plotted for the two C-O bond lengths common to C4.

artemisinin1

Here, DIST1 is C4-O6 and DIST2 is C4-O5. Notice the very pronounced asymmetry; at the red hotspot above, the most frequent occurrences are ~1.39 and 1.46Å respectively, and artemisinin is more or less at that hotspot. This can be quantified by the NBO E(2) energies for the interaction of an oxygen lone pair antiperiplanar to the C-O σ* orbital:

  1. Lp(O6)-σ*(C4-O5) = 21.2 kcal/mol which helps to account for the short C4-O6 and the long C4-O5 bonds.
  2. whereas the reverse donation of Lp(O5)-σ*(C4-O6) is merely 4.8 kcal/mol (normally the two donations are more or less equal, and hence so are the two C-O bond lengths).
  3. At the second anomeric centre of C7, Lp(O1)-σ*(C7-O6) = 19.9 kcal/mol
  4. whereas the reverse donation of Lp(O6)-σ*(C7-O1) is 5.7 kcal/mol, again highly asymmetric, as are the C-O bond lengths (1.413/1.441Å).
  5. Next, the gauche effect at C3-C4. The C4-H to C3-O2 donation is 6.4 kcal/mol, again contributing to the longer C-O length of 1.447Å.
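The asymmetry at the two anomeric centres can be put on a common footing by taking the ratio of the strong to the weak donation. The sketch below only uses the E(2) values quoted in the list above; the dictionary layout and the function name are illustrative choices.

```python
# E(2) energies in kcal/mol, as quoted in the list above
donations = {
    "C4": {"Lp(O6)->sigma*(C4-O5)": 21.2, "Lp(O5)->sigma*(C4-O6)": 4.8},
    "C7": {"Lp(O1)->sigma*(C7-O6)": 19.9, "Lp(O6)->sigma*(C7-O1)": 5.7},
}

def asymmetry_ratio(pair):
    """Ratio of the larger to the smaller E(2) value at one anomeric centre."""
    values = sorted(pair.values())
    return values[-1] / values[0]

ratios = {centre: asymmetry_ratio(pair) for centre, pair in donations.items()}

for centre, ratio in ratios.items():
    print(f"{centre}: strong/weak donation ratio = {ratio:.1f}")
```

A ratio close to 1 would correspond to the "normal" situation of two more or less equal donations; both centres here are well above that.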

Where such stereoelectronic interactions are asymmetric, one might expect enhanced reactivity. A good example is a pair of stereoisomers of a 7-ring herbicide[1], where one anomer with equal anomeric C-O lengths is a stable soil-persistent species, whereas the other with asymmetric lengths has a very short soil residency due to rapid hydrolysis. It might be tempting to speculate that some aspect of the activity of artemisinin may be due to such stereoelectronic asymmetries.

Finally, because it is virtually free to do so in a computational sense, I show the computed VCD spectrum[2] (covering the possibility that it is measured at some point). The calculated[3] optical rotation [α]589 is +93° (obs ~+76°). Whilst the absolute configuration is not in any doubt, it is always nice to have further confirmation.

artemisinin

References

  1. P. Camilleri, D. Munro, K. Weaver, D.J. Williams, H.S. Rzepa, and A.M.Z. Slawin, "Isoxazolinyldioxepins. Part 1. Structure–reactivity studies of the hydrolysis of oxazolinyldioxepin derivatives", J. Chem. Soc., Perkin Trans. 2, pp. 1929-1933, 1989. https://doi.org/10.1039/p29890001929
  2. H.S. Rzepa, "Gaussian Job Archive for C15H22O5", 2014. https://doi.org/10.6084/m9.figshare.997360
  3. H.S. Rzepa, "Gaussian Job Archive for C15H22O5", 2014. https://doi.org/10.6084/m9.figshare.997463

The conformational preference of s-cis amides.

Sunday, February 10th, 2013

Amides with an H-N group are a component of the peptide linkage (O=C-NH). Here I ask what the conformation (it could also be called a configuration) about the C-N bond is. A search of the following type can be defined:

cis-amide

The dihedral shown is for H-N-C=O (but this is equivalent to the C-C-N-C dihedral, which is also often called the dihedral angle associated with the peptide group). I have also added a distance, from a C-H to the carbonyl oxygen. Other search constraints include T ≤ 175K, R < 0.05, no disorder, no errors, that neither N-C bond is part of a ring and that the two carbons marked T4 both have four connected bonds. The search results in 619 hits (January 2013 version of the CCDC database), and these are displayed below.

cis-amide-search-heat

The horizontal axis reveals the highest concentration (red) at ~2.4Å due to a syn-co-planar alignment of the C-H bond with the plane of the C=O bond in the s-cis conformer (the significantly smaller hot-spot at ~3.9Å may be due to an anti-co-planar alignment of this C-H bond).

s-cis-amide

The vertical axis shows a clear preference for a dihedral of 179° (in fact no hits with a dihedral of less than 140° were found) and this can only arise from the s-cis conformation in which the H-N bond is oriented antiperiplanar to the axis of the C=O bond. This preference can be rationalised by filled/empty NBO-orbital interactions, which include:

  1. Antiperiplanar interaction between the N-H as donor and the C=O as a σ-acceptor (E(2) = 4.1 kcal/mol)
  2. Antiperiplanar interaction between the N-H as acceptor and C-H as donor (E(2) = 4.7 kcal/mol)
H-N/C=O. Click for 3D

Click for 3D.

This latter overlap conspires to bring the C-H hydrogen close to the oxygen (~2.35Å, DIST1 in the diagram above). So one might be entitled to ask: is this a hydrogen bond? There are (at least) two ways of testing this.

  1. The NBO E(2) interaction energy between the oxygen in-plane lone pair and the H-C as acceptor is 0.8 kcal/mol. For hydrogen bonds, such E(2) energies more or less resemble the actual H-bond strengths, i.e. a strong H-bond has an E(2) energy of ~ 8 kcal/mol; and a medium O…H-C hydrogen bond weighs in at around 3 kcal/mol.  So this one is very weak. This is due to poor overlap resulting from the small ring size (5).
  2. The NCI (non-covalent-interaction) surface does reveal a feature in the CH…O region, but the colour coding (which indicates how attractive/repulsive this is) is both pale blue (attractive) and yellow (repulsive). Again this is only consistent with a very weak overall H-bond.
NCI surface. Click for 3D.

I end with a reminder that the s-cis H-N-C=O conformation is a very common feature in peptides (the CCDC database comprises mostly small molecules, not larger peptides and proteins), arising from really quite subtle orbital interactions.

The conformation of acetaldehyde: a simple molecule, a complex explanation?

Friday, February 8th, 2013

Consider acetaldehyde (ethanal for progressive nomenclaturists). What conformation does it adopt, and why? This question was posed to me by a student at the end of a recent lecture of mine. Surely, an easy answer to give? Read on …

acetaldehyde

There really are only two possibilities, the syn and anti. Well, I have discovered it is useful to start with a search of the Cambridge database. With R=H or C, X unspecified, acyclic and T ≤ 175K, two searches were performed. The first identified the torsion around O=C-C-H. This clearly shows a maximum at 120° (with twice the probability), and a smaller one at 0°. This matches syn; the anti conformation above would be expected to have peaks at 60° and 180°, and the latter in particular is singularly missing.

acetaldehyde-180

An alternative search is to define the distance between the oxygen and the H. For the syn conformer, distances of ~2.5 and 3.1Å are expected; for the anti conformer, 2.7 and 3.3Å. Again, syn matches better. Remember, searches based on the position of a hydrogen are less reliable than most, so these distributions provide only a statistical indication.
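To see where such expected distances come from, the 1,4 O…H separation can be computed as a function of the O=C-C-H torsion from bond lengths and angles alone. The sketch below assumes idealised textbook values (C=O 1.21Å, C-C 1.50Å, C-H 1.09Å, O=C-C 124°, C-C-H 110°); these are my assumptions for illustration, not values taken from the search or from any calculation in this post.

```python
import math

def one_four_distance(a, b, c, theta1_deg, theta2_deg, torsion_deg):
    """Distance between atoms 1 and 4 of a chain 1-2-3-4.

    a, b, c: bond lengths 1-2, 2-3, 3-4 (Å)
    theta1_deg, theta2_deg: bond angles 1-2-3 and 2-3-4 (degrees)
    torsion_deg: torsion 1-2-3-4 (degrees)
    """
    t1 = math.radians(theta1_deg)
    t2 = math.radians(theta2_deg)
    phi = math.radians(torsion_deg)
    # Atom 2 at the origin, atom 3 along +x, atom 1 in the xy plane
    atom1 = (a * math.cos(t1), a * math.sin(t1), 0.0)
    atom4 = (b - c * math.cos(t2),
             c * math.sin(t2) * math.cos(phi),
             c * math.sin(t2) * math.sin(phi))
    return math.dist(atom1, atom4)

def o_h_distance(torsion_deg):
    # Assumed idealised geometry for O=C-C-H (not from the CSD search)
    return one_four_distance(1.21, 1.50, 1.09, 124.0, 110.0, torsion_deg)

for torsion in (0, 60, 120, 180):
    print(f"torsion {torsion:3d} deg: O...H = {o_h_distance(torsion):.2f} A")
```

With these assumed parameters, torsions of 0 and 120° (syn) and of 60 and 180° (anti) give distances close to the two pairs of values quoted above.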

acetaldehyde-dist

Now for a ωB97XD/6-311G(d,p) calculation of the rotational barrier. The minima occur at torsions of 0, 120 and 240°, matching syn, although the barrier is very low.
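A three-fold rotor of this kind is often approximated by a simple cosine potential, V(φ) = (V3/2)(1 - cos 3φ), which has its minima at exactly these torsions. The sketch below is only that textbook model, not the computed profile; the barrier height V3 is an arbitrary placeholder.

```python
import math

def threefold_potential(torsion_deg, v3=1.0):
    """Model potential (same units as v3) for a three-fold rotor.

    v3 is the barrier height; the default of 1.0 is an arbitrary placeholder.
    """
    phi = math.radians(torsion_deg)
    return 0.5 * v3 * (1.0 - math.cos(3.0 * phi))

# Scan the torsion in 10 degree steps and collect the minima of the model
scan = {t: threefold_potential(t) for t in range(0, 360, 10)}
lowest = min(scan.values())
minima = sorted(t for t, v in scan.items() if abs(v - lowest) < 1e-9)
print(minima)
```

In this model the maxima fall at 60, 180 and 300°, the torsions that the anti form would require.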

acet-rot

Now to try to find explanations. The standard approach looks for them in three effects:

  1. Donation from two C-H bonds (R=H above) into the π*C=O NBO orbital (in the manner that was used to explain the cis-orientation of the two methyl groups in cis-butene). 
  2. Donation from the single co-planar C-H bond into the σ*C=O NBO orbital (blue bonds above)
  3. Pauli bond-bond repulsions between two filled NBOs. 

Effect 1 has an NBO perturbation energy E(2) of 7.0 kcal/mol for the syn conformer and 6.45 for the anti. The explanation is that the π*C=O NBO “leans outward”, overlapping better with the C-H bonds in the syn than in the anti. One up to the syn! Effect 2 has values of 1.3 for the syn and 4.1 for the anti. The latter now has the edge. But wait, there are other (smaller) interactions. The syn has an antiperiplanar orientation of the two C-H bonds shown above (X=H, red), E(2) = 3.3 vs 0.6 for the corresponding syn-planar orientation in the anti conformation. It’s now a tie; neck-and-neck.
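The "tie" can be checked by simply adding up the three E(2) terms quoted in this paragraph for each conformer. Summing E(2) values in this way is a crude bookkeeping device that I assume here for illustration; it is not a rigorous energy decomposition.

```python
# E(2) values in kcal/mol as quoted in the paragraph above
e2_terms = {
    "C-H -> pi*(C=O)":    {"syn": 7.0, "anti": 6.45},
    "C-H -> sigma*(C=O)": {"syn": 1.3, "anti": 4.1},
    "C-H -> sigma*(C-H)": {"syn": 3.3, "anti": 0.6},
}

totals = {
    conformer: sum(term[conformer] for term in e2_terms.values())
    for conformer in ("syn", "anti")
}
difference = totals["syn"] - totals["anti"]

print(f"syn {totals['syn']:.2f}, anti {totals['anti']:.2f}, "
      f"difference {difference:.2f} kcal/mol")
```

The two totals differ by well under 1 kcal/mol, which is the sense in which the donor-acceptor terms alone do not pick a clear winner.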

Effect three suggests that the disjoint NLMO steric exchange energy is 54.34 for the anti and 53.88 (i.e. lower) for the syn. It is vaguely disappointing that no absolutely clear-cut explanation emerges. But then the difference (in total free energy) is only 1.4 kcal/mol. But even this small difference in energy can manifest in fairly clear-cut conformational preferences obtained from crystal structures. Ultimately of course, all effects in chemistry are reducible to the sum of lots of small effects (in other words unpredictable until one does the sum). 
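To get a feel for how a difference of 1.4 kcal/mol translates into populations, one can apply a Boltzmann factor. The sketch below assumes a simple two-state picture at 298.15 K, which is an idealisation on my part (a crystal structure survey is not a thermal equilibrium of a single molecule); the function name is illustrative.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol K)

def boltzmann_ratio(delta_g, temperature=298.15):
    """Population ratio (higher state / lower state) for a free energy gap in kcal/mol."""
    return math.exp(-delta_g / (R_KCAL * temperature))

ratio = boltzmann_ratio(1.4)
favoured_fraction = 1.0 / (1.0 + ratio)

print(f"population ratio = {ratio:.3f}, favoured fraction = {favoured_fraction:.2f}")
```

Under these assumptions roughly nine molecules in ten sit in the lower-energy form, so even a small energy difference gives a distinctly lopsided distribution.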

I cannot end without mentioning the largest of all the NBO interactions, namely the in-plane lone pair on the oxygen as donor and the aldehyde proton C-H as acceptor (X=H). This has values of 29.3 for syn and 28.8 kcal/mol for anti. This manifests (inter alia) in a greatly reduced C-H vibrational wavenumber (ν 2982 for syn, 2900 cm-1 for anti) compared to the methyl C-H values (~3043-3164).

So this tiny little molecule ended up a little less obvious than might have seemed at the outset. One can find interesting things in even the tiniest of things! 


HC…C-H alignment. Click for 3D.

O=C*…C-H alignment. Click for 3D.

σ-π-Conjugation: seeking evidence by a survey of crystal structures.

Sunday, February 3rd, 2013

The electronic interaction between a single bond and an adjacent double bond is often called σ-π-conjugation (an older term for this is hyperconjugation), and the effect is often used to explain, for example, why more highly substituted carbocations are more stable than less substituted ones. This conjugation is more subtle in neutral molecules, but following my use of crystal structures to explore the so-called gauche effect (which originates from σ-σ-conjugation), I thought I would have a go here at seeing what the crystallographic evidence actually is for the σ-π-type.

sigma-pi-conjugation

The basic two molecules are shown above; in effect propene 1 and butene 2. The latter was in fact the topic of another post, in which I attempted to show that the close H…H contact in cis-butene (2.1Å) was in effect an unwelcome consequence of the σ-π-conjugation of any of the four “outward leaning” C-H bonds of the methyl groups acting as donors (red-blue below) overlapping with the similarly “outward leaning” π* orbital of the alkene (purple-orange below; blue and purple overlap positively).

NBO orbitals for C-H/alkene interaction. Click for 3D.

So how general might this be? To find out, I performed the following search on the Cambridge crystal database:

cis-butene-search

  1. The search defines an alkene, bearing two cis-substituents each with at least one C-H bond. The substituents are both sp3 carbon, and the attachment bond to the alkene is defined as acyclic
  2. The H…H distance uses normalised terminal hydrogen positions (to try to correct for the normally over-short C-H bond lengths found by X-ray).
  3. Other constraints were R factor < 0.05, no disorder, no errors and (perhaps most importantly) T < 150K to try to reduce thermal libration.
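The quality constraints in the last item amount to a simple filter over the hits. The sketch below applies them to a hand-made list of records; the `Hit` class, its field names and the example entries are all hypothetical and do not correspond to the actual search software or to real database entries.

```python
from dataclasses import dataclass

@dataclass
class Hit:
    refcode: str          # hypothetical identifier, not a real CSD refcode
    r_factor: float       # crystallographic R factor as a fraction
    temperature_k: float  # data collection temperature in K
    disordered: bool
    has_errors: bool
    hh_distance: float    # normalised H...H distance in Å

def passes_quality_filter(hit, max_r=0.05, max_temperature=150.0):
    """Apply the constraints listed above: R < 0.05, T < 150K, no disorder, no errors."""
    return (hit.r_factor < max_r
            and hit.temperature_k < max_temperature
            and not hit.disordered
            and not hit.has_errors)

# Made-up records for illustration only
hits = [
    Hit("EXAMPLE1", 0.031, 100.0, False, False, 2.08),
    Hit("EXAMPLE2", 0.072, 100.0, False, False, 2.31),  # R factor too high
    Hit("EXAMPLE3", 0.040, 293.0, False, False, 2.15),  # room temperature
    Hit("EXAMPLE4", 0.028, 120.0, True, False, 2.40),   # disordered
]

selected = [hit.refcode for hit in hits if passes_quality_filter(hit)]
print(selected)
```

Tightening or loosening `max_r` and `max_temperature` is the obvious way to test how sensitive a distribution is to data quality.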

I should qualify all of this with a reminder that hydrogen positions in crystal structures are notoriously prone to errors. Nevertheless, with 624 hits using the above search, one might hope for statistical significance of a real effect.

Search result for close H…H contacts in cis-butenes.

For this sample, the most frequent H…H distance emerged as 2.1Å. This can only result from having the C-H bonds lie coplanar with the C=C alkene, as is shown above. The value is also remarkably close to the H…H distance for cis-butene itself (both computationally and as determined using electron diffraction). This does, I feel, provide a strong indication that σ-π-conjugation is manifesting in these systems.

Re-defining the search for propenes 1 as above gives 1656 hits, with a maximum in the distribution at 2.35Å corresponding to a syn-orientation of the C=C and the C-H bonds. The smaller maximum at about 2.75Å arises from a gauche-orientation between the C=C and C-H (in effect you have to halve this number, since there are twice as many possibilities for this to occur as for the syn). The “inward leaning” gauche C-H bond overlaps less well with the “outward leaning” π* orbital of the alkene.
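The halving mentioned in parentheses is a degeneracy correction: raw counts are divided by the number of equivalent ways each orientation can occur. A minimal sketch follows; the counts are invented placeholders, not the numbers from the search.

```python
def per_orientation(counts, multiplicity):
    """Divide raw histogram counts by the number of equivalent orientations."""
    return {name: counts[name] / multiplicity[name] for name in counts}

# Placeholder counts for illustration only (not the actual search results)
raw_counts = {"syn": 900, "gauche": 600}
# One C-H can be syn to the C=C, whereas two positions are gauche
multiplicity = {"syn": 1, "gauche": 2}

corrected = per_orientation(raw_counts, multiplicity)
print(corrected)
```

Only after this correction do the relative peak heights reflect the intrinsic preference for one orientation over the other.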

Search result for close H…H contacts in propenes.

These aspects are perhaps better seen in the orbital overlaps shown below.

Click for 3D.

I will follow up this theme with esters and amides next.

The gauche effect: seeking evidence by a survey of crystal structures.

Friday, January 4th, 2013

I previously blogged about anomeric effects involving π electrons as donors, and my post on the conformation of 1,2-difluoroethane turned out to be one of the most popular. Here I thought I would present the results of searching the Cambridge crystal database for examples of the gauche effect. The basic search is defined below.

CCDC-search

Here, we define a four-atom torsion (TOR1), the two central carbon atoms having two groups R which can be only H or C. These two carbons are also defined as acyclic. The restrictions of the search as defined above also include R-factor < 0.05, not disordered and no errors. These combine to reduce the number of hits significantly (although not dissimilar distributions are obtained for less restricted searches). Each search takes only a few seconds, and one can rattle through many permutations very quickly.
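When reading the torsion distributions that follow, it helps to have a fixed rule for what counts as gauche or anti. The sketch below uses the standard Klyne-Prelog ranges (synperiplanar 0±30°, synclinal or gauche 30-90°, anticlinal 90-150°, antiperiplanar 150-180°); the function names are my own and the binning is not necessarily the one used by the search software.

```python
def fold_torsion(torsion_deg):
    """Fold any torsion angle onto the range 0-180 degrees (sign discarded)."""
    return abs(((torsion_deg + 180.0) % 360.0) - 180.0)

def classify_torsion(torsion_deg):
    """Label a torsion using Klyne-Prelog ranges."""
    magnitude = fold_torsion(torsion_deg)
    if magnitude < 30.0:
        return "syn"
    if magnitude < 90.0:
        return "gauche"
    if magnitude < 150.0:
        return "anticlinal"
    return "anti"

for value in (60.0, -65.0, 178.0, 300.0):
    print(value, classify_torsion(value))
```

Folding the sign away is appropriate here because a survey of this kind does not distinguish +gauche from -gauche.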

So here come the results. First, QA=4M=F. All but one of the examples has a torsion in the region of 60°, the classic gauche effect!

F-C-C-F

Next, QA=O, 4M=F. Rather more hits, and the effect is almost as clear-cut. I should point out that the apparent “exceptions” to the gauche conformation may arise from structural restrictions, and each really would have to be inspected individually for the reasons (which I do not attempt here). 

OCCF

With QA=4M=O, one has many more instances. The effect is pretty convincing (although hydrogen bonding may also control the conformation).

O-C-C-O

Now for QA=4M=Cl. The distribution is slanted more to the anti conformation, but there are still quite a few gauche.

Cl-CC-Cl

With QA=4M=S, the conformations are now almost all anti; the gauche effect is no more! 

S-C-C-S

And for QA=4M=Br, it has also almost vanished (there is only one instance for I, and that too is antiperiplanar).

Br-C-C-Br

I now return to an earlier post in which I speculated that a cyano group might participate in the anomeric effect. Well here it is in the gauche effect; QA=CN, 4M = any of N,O,F,Cl,S. Quite a few gauche orientations for this pseudo-halogen!

Neg-C-C-CN

Another group that can act as a powerful acceptor of electrons from a donor is QA=N(Me)3+. With 4M = N, O, F, Cl, the population of gauche conformers here is large. QA=CF3 is a similar group.

Neg-C-C-NMe3

Neg-C-C-CF3

One can envisage other combinations. Thus QA= C=C, 4M = any of  N, O, F, Cl. An alkene seems one of the more powerful gauche effect participants!

alkene-C-C-Neg

And alkynes, perhaps slightly less so.

Alkyne-C-C-Neg

What about metals (QA = any metal, 4M = any of N, O, F, Cl, S)? Well, not particularly biased either way, but clearly one in which the identity of the metal may matter.

Metal-C-C-electronegative

I should end by inverting the model. If QA is electropositive (any group to the left of carbon, or below it in the periodic table) and 4M is electronegative, then they align almost exclusively anti-periplanar and not gauche. But notice how relatively few examples there are. Synthetic chemists, please make more such molecules!

Electropositive-C-C-Electronegative

If you thought the gauche effect was restricted to just a few molecules, think again!