Posts Tagged ‘Interesting chemistry’

Checking a conclusion we made in 1987: Tetrahedral intermediates formed by nitrogen and oxygen attack of aromatic hydroxylamines on acetyl cyanide

Saturday, June 11th, 2022

Minds (and memories) can work in wonderful ways. In 1987[1] we were looking at the properties of “stable” tetrahedral intermediates formed in carbonyl group reactions. The reaction involved adding phenylhydroxylamine to acetyl cyanide. NMR signals for two new species were detected, and we surmised one was due to N-attack on the carbonyl and one was due to O-attack, in each case to form a stable tetrahedral intermediate. To try to identify which was which, 15N labelled hydroxylamine was used and the 15N-13C coupling constants were then measured; these could be either a one-bond coupling, 1J (for N-attack), or a two-bond coupling, 2J (for O-attack).

Well, 35 years later, literally in a dream on the morning of 7th June, 2022, these results came back to me and the dream involved wondering whether we had gotten the assignments of the N- and O-species the correct way around. You see, we had assigned the larger of the 15N-13C couplings to the two-bond (O-attack, species 3 below) rather than the one-bond (N-attack, species 4 below) coupling. In 1987, the art of accurately computing such couplings was still in its infancy, but now in 2022 it is quick and easy to do. So here I report the results, which 35 years on allow a check of those assignments.

The necessary calculations are assembled at FAIR DOI: 10.14469/hpc/10593 and were conducted at the ωB97XD/aug-cc-pvdz/scrf=acetonitrile level. Firstly, it is important that the conformational space of these molecules is explored, since they contain a plethora of interesting anomeric effects. I will not discuss this process, simply quoting what I believe to be the lowest energy conformation for both isomers.

| # | Property | Species 3 | Species 4 |
|---|---|---|---|
| 1 | ΔG298 (Hartree) | -608.600542 | -608.598472 |
| 2 | ΔG215 (Hartree) | -608.586956 | -608.585163 |
| 3 | NBO E(2) | 14.3, 19.4, 10.9, 8.1 | 10.0, 11.2, 9.9 |
| 4 | δC obs | 94.3 ppm | 85.0 ppm |
| 5 | δC calc | 97.2 (Δδ 2.9) | 88.1 (Δδ 3.1) |
| 6 | JN-C obs | 2J ±2.5 Hz | 1J ±1.3 Hz |
| 7 | JN-C calc | 2J +1.7 | 1J +0.8 |

  1. The relative free energies ΔΔG298 favour 3 over 4 by 1.3 kcal/mol at 298K (9:1). The article notes that 3 is significantly favoured over 4 at higher temperatures (i.e. ~298K) but that the concentration of 4 increases at lower temperatures. 
  2. At 215K, ΔΔG215 reduces to 1.1 kcal/mol, but this equates to 13:1 at this temperature. ΔΔG215 would need to be about 0.8 kcal/mol for 4 to increase (6.5:1), but these are small errors in energy and a more accurate calculation would have to be done to get this aspect correct.
  3. The NBO E(2) terms, indicating overlap between a lone pair and an acceptor orbital (the anomeric effect), show a dazzling variety of interactions for such a small molecule. Species 3 shows four significant interactions, species 4 one fewer.
  4. The chemical shifts measured for 3 and 4 –
  5. – are matched by the calculation, the error being similar for both species.
  6. The 15N-13C coupling constants –
  7. – are again matched, with the 1J coupling being about half the value of the 2J coupling for both obs and calculated values.
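
The ratios quoted in notes 1 and 2 follow from a simple Boltzmann relation, ratio = exp(ΔΔG/RT). The short sketch below reproduces them from the table entries; the conversion factor and gas constant are standard values, and the function and variable names are purely illustrative (they are not part of the original calculations).

```python
import math

HARTREE_TO_KCAL = 627.5095   # kcal/mol per Hartree
R_KCAL = 0.0019872           # gas constant in kcal/(mol K)


def population_ratio(ddg_kcal: float, temperature: float) -> float:
    """Equilibrium ratio [favoured]/[disfavoured] for a free energy gap in kcal/mol."""
    return math.exp(ddg_kcal / (R_KCAL * temperature))


# Free energies (Hartree) taken from rows 1 and 2 of the table above.
g298 = {"3": -608.600542, "4": -608.598472}
g215 = {"3": -608.586956, "4": -608.585163}

# Positive values mean species 3 is lower in free energy than species 4.
ddg298 = (g298["4"] - g298["3"]) * HARTREE_TO_KCAL
ddg215 = (g215["4"] - g215["3"]) * HARTREE_TO_KCAL

ratio298 = population_ratio(ddg298, 298.0)
ratio215 = population_ratio(ddg215, 215.0)
# Hypothetical gap of 0.8 kcal/mol at 215K, as discussed in note 2.
ratio215_target = population_ratio(0.8, 215.0)

print(f"298K: ddG = {ddg298:.2f} kcal/mol, ratio 3:4 = {ratio298:.1f}:1")
print(f"215K: ddG = {ddg215:.2f} kcal/mol, ratio 3:4 = {ratio215:.1f}:1")
print(f"215K with ddG = 0.8 kcal/mol: ratio 3:4 = {ratio215_target:.1f}:1")
```

The point of the exercise is how sensitive the ratio is: a shift of only a few tenths of a kcal/mol at 215K is enough to change the predicted trend with temperature.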

The nature of modern scientific research, and the funding available for it, means that old work is rarely re-investigated using more recent techniques. In this case, the reinvestigation does not require the molecules to be re-synthesized, merely that a retrospective computational layer be applied. As a result of my dream of four days ago, this process has produced an interesting new layer which thankfully confirms the original conclusions.

References

  1. A.M. Lobo, M.M. Marques, S. Prabhakar, and H.S. Rzepa, "Tetrahedral intermediates formed by nitrogen and oxygen attack of aromatic hydroxylamines on acetyl cyanide", The Journal of Organic Chemistry, vol. 52, pp. 2925-2927, 1987. https://doi.org/10.1021/jo00389a050

3-Methyl-5-phenylpyrazole: a crystallographic enigma?

Thursday, May 19th, 2022

Previously, I explored the unusual structure of a molecule with a hydrogen bonded interaction between a phenol and a pyridine. The crystal structure name was RAKQOJ and it had been reported as having almost symmetrical N…H…O hydrogen bonds. This feature had been determined using neutron diffraction crystallography, which is thought very reliable at determining proton positions. Another compound with these characteristics is 3-methyl-5-phenylpyrazole or MEPHPY01.[1] Here the neutron study showed it to apparently have the structure represented below, where the solid N-H lines indicate a proton equidistant between two nitrogens.

Inspection of the ORTEP plot shows a very odd feature: the thermal ellipsoids (red arrow) for two of the N-H-N motifs are more or less spherical, indicating little thermal motion (the temperature of the determination is not noted, and is assumed to be room temperature), but the other two (magenta arrows) are highly elongated in the direction of motion between the two nitrogen atoms. This feature was largely unexplained at the time of publication (1975) and indeed to this day. Here I offer a possible insight into this enigma.

The conventional structure is shown below showing four N-H bonds and four H…N hydrogen bonds.

So now for the results of some calculations. Computed at various B3LYP(±GD3BJ)/Def2-SVPP/Def2-TZVPP levels (Table, FAIR data DOI: 10.14469/hpc/10406), the located minimum in the total energy, saddle=0, corresponds to the conventional proton-localized structure shown above, where all four hydrogens are firmly attached to the four nitrogen atoms by a regular bond; the distances are 1.032Å for the N-H bond and 1.855Å for the hydrogen bond it forms. A zwitterionic isomer comprises the ion-pair shown below, examples of each component of which are known in the CSD (Cambridge Structural Database).

There are three ways of distributing this motif, of which only 1 is stable to proton transfer. Structure 2 has a higher degree of charge separation whilst 3 superficially appears to reduce the degree of charge separation compared to 1. In fact, the three-dimensional structure of 1 allows the negative ion to stack above the positive ion, thus actually achieving minimal separation of charges.

The stacking also depends on the type of calculation. If dispersion correction is included, the aromatic faces stack directly above each other (as above). If omitted, the stacking actually corresponds more closely to that observed in the reported crystal structure, since the attraction between faces occurs not only within a structure but between adjacent structures in the solid state (something not modelled when the dispersion correction is applied only to a single unit).

The lesson learnt from the previous post is that the position of protons as determined by quantum-chemical geometry location using minimisation of the total computed energy might be misleading. Better perhaps to use the computed free energy? When this is done, as we saw in the previous post, the transition state for proton transfer as located in the total energy surface can actually have a free energy that is lower than that of the total energy minimum. So, for MEPHPY01, a stationary point in which all four hydrogens correspond to the apparently symmetrical experimental neutron diffraction structure emerges as saddle=3, corresponding to three force constants being negative. The bond lengths for this geometry occur in pairs, two with NH 1.25/N…H 1.30 and two with 1.28/1.28, revealing interesting asymmetry.

The normal vibrational modes for these three negative force constants are shown below. The first (ν 1315i cm-1) shows all four hydrogens exchanging between nitrogens, a quadruple proton transfer. The second (ν 896i cm-1) shows a double proton transfer, with one pair of hydrogens exchanging between two nitrogens, and the last (ν 801i cm-1) is similar in form, but shows the other pair exchanging between a second, different pair of nitrogens. These last two vibrational modes correspond to the very thermal ellipsoids seen in the crystal structure diagram at the top, where one pair of hydrogens shows little motion and the other pair involves much greater motion between a pair of nitrogen atoms. This would correspond to formation of a species exhibiting two conventional N-H…N hydrogen bonds and two symmetrical N…H…N units.

 

Two further stationary points corresponding to saddle=2 and saddle=1 can also be located (Table).

| Stationary point | B3LYP+GD3BJ/gas | B3LYP+GD3BJ/DCM | B3LYP/gas |
|---|---|---|---|
| SVPP, saddle=0, neutral | -1984.448365 (0.0) | -1984.460070 (0.0) | -1984.241838 (0.0) |
| SVPP, saddle=0, ion-pair | -1984.426827 (13.5) | -1984.439305 (13.0) | -1984.218379 (14.7) |
| TZVPP, saddle=0, neutral | -1986.712743 (0.0) | -1986.725277 (0.0) | -1986.509333 (0.0) |
| TZVPP, saddle=0, ion-pair | -1986.688467 (15.2) | -1986.703562 (13.6) | -1986.481384 (17.5) |
| SVPP, saddle=1 | -1984.428898 (12.1) | -1984.442004 (11.3) | -1984.220238 (13.6) |
| SVPP, saddle=2 | -1984.430085 (11.5) | -1984.442986 (10.7) | -1984.220968 (13.1) |
| SVPP, saddle=3 | -1984.431457 (10.6); ν 1315i, 896i, 801i | -1984.441075 (11.9); ν 970i, 603i, 294i | -1984.223176 (11.7); ν 1335i, 782i, 760i |
| TZVPP, saddle=1 | -1986.691518 (13.2) | | |
| TZVPP, saddle=2 | -1986.689979 (14.3) | | |
| TZVPP, saddle=3 | -1986.690265 (14.1) | | |

Energies are in Hartree, with values relative to the neutral saddle=0 structure in kcal/mol in parentheses; ν are the imaginary wavenumbers in cm-1.
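
As a rough illustration of how the ordering of the saddle points can be read from the table, the sketch below converts the SVPP entries for the two dispersion-corrected columns into relative energies and picks out the lowest saddle point in each. The dictionary layout and function names are my own illustrative choices; only the energies come from the table.

```python
HARTREE_TO_KCAL = 627.5095  # kcal/mol per Hartree

# SVPP energies (Hartree) copied from the table, keyed by level of theory.
levels = {
    "B3LYP+GD3BJ/gas": {
        "neutral": -1984.448365,
        "saddle=1": -1984.428898,
        "saddle=2": -1984.430085,
        "saddle=3": -1984.431457,
    },
    "B3LYP+GD3BJ/DCM": {
        "neutral": -1984.460070,
        "saddle=1": -1984.442004,
        "saddle=2": -1984.442986,
        "saddle=3": -1984.441075,
    },
}


def relative_energies(entries: dict) -> dict:
    """Energies of the saddle points relative to the neutral minimum, in kcal/mol."""
    reference = entries["neutral"]
    return {
        name: (energy - reference) * HARTREE_TO_KCAL
        for name, energy in entries.items()
        if name != "neutral"
    }


def lowest_saddle(entries: dict) -> str:
    """Label of the saddle point with the lowest relative energy."""
    rel = relative_energies(entries)
    return min(rel, key=rel.get)


for level, entries in levels.items():
    rel = relative_energies(entries)
    summary = ", ".join(f"{name}: {value:.1f}" for name, value in rel.items())
    print(f"{level}: {summary} -> lowest is {lowest_saddle(entries)}")
```

Small differences in the last decimal place compared with the table can arise from rounding and from the exact conversion factor used.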

Now here is the wacky thing. At the gas phase SVPP basis set ± dispersion levels, these lower-order saddle points are actually HIGHER in free energy than the third order saddle point! Conventional wisdom is that the higher the order of the saddle point, the higher should its energy be! I am not aware of anyone reporting an inverse observation before. The effect however is solvation and also basis-set dependent, since adding dichloromethane as a continuum solvent changes the free energy minimum from the third to the second-order saddle point. It might well also be dependent on the density functional method.

What are we to conclude? The free energy barriers for all the proton transfer saddle points computed above are not that small, being ≥ 10 kcal/mol. But at room temperature, these exchanges will in fact be fast kinetic processes and the measured neutron diffraction structure may well emerge as averaged in some way. The free energies of the higher order saddle points suggest the dynamics of this system may in fact be very complex and very different from any “normal” hydrogen bonded system. This is clearly not the final word yet, but it does hint that the proton transfer dynamics of 3-methyl-5-phenylpyrazole may be a system very well worth looking at again! And indeed exploring how robust the effects noted above are to different density functionals.
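
To put a number on “fast”, here is a minimal sketch using the Eyring equation, k = κ(kBT/h)exp(-ΔG‡/RT). It assumes a transmission coefficient κ of 1, ignores tunnelling (which can matter for proton transfers) and treats the tabulated relative free energies as if they were ordinary activation barriers, which is itself questionable for higher-order saddle points. The function name and the choice of 10.6 kcal/mol are illustrative only.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
H = 6.62607015e-34      # Planck constant, J s
R_KCAL = 0.0019872      # gas constant, kcal/(mol K)


def eyring_rate(barrier_kcal: float, temperature: float = 298.15,
                kappa: float = 1.0) -> float:
    """First-order rate constant (s^-1) from a free energy barrier in kcal/mol."""
    prefactor = kappa * K_B * temperature / H
    return prefactor * math.exp(-barrier_kcal / (R_KCAL * temperature))


# Illustrative barrier: the SVPP saddle=3 gas-phase value from the table.
rate = eyring_rate(10.6)
print(f"k(298K) for a 10.6 kcal/mol barrier: {rate:.2e} s^-1")
```

On these assumptions a barrier of about 10 kcal/mol corresponds to many thousands of exchanges per second at room temperature, far faster than the timescale of a diffraction experiment.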


This post has DOI: 10.14469/hpc/10512

References

  1. F.H. Moore, A.H. White, and A.C. Willis, "3-Methyl-5-phenylpyrazole: a neutron diffraction study", Journal of the Chemical Society, Perkin Transactions 2, pp. 1068, 1975. https://doi.org/10.1039/p29750001068

C2N2: a 10-electron four-atom molecule displaying both Hückel 4n+2 and Baird 4n selection rules for ring aromaticity.

Thursday, April 7th, 2022

The previous examples of four atom systems displaying two layers of aromaticity illustrated how 4 (B4), 8 (C4) and 12 (N4) valence electrons were partitioned into 4n+2 manifolds (respectively 2+2, 6+2 and 6+6). The triplet state molecule B2C2, with 6 electrons, partitioned these into 2π and 4σ electrons, with the latter following Baird’s aromaticity rule.[1],[2] Now for the final missing entry; as a triplet C2N2 has 10 electrons, which now partition into 4 + 6. But would that be 4π + 6σ or 4σ + 6π? Well, in a way neither! Read on.

Bonding MOs for C2N2.
π3, 1 electron; π2, 1 electron
σ3, 2 electrons; σ2, 2 electrons
π1, 2 electrons; σ1, 2 electrons

The calculations (ωB97XD/Def2-TZVPP and CCSD(T)/Def2-TZVPP) are collected at FAIR DOI: 10.14469/hpc/10346. These show a partitioning into 5σ + 5π, a species that is not a minimum but undergoes a non-planar distortion.

However, the first excited state (the triplet) IS planar and is only 12.5 kcal/mol above the planar 5+5 precursor. It is now partitioned into 6σ and 4π, with the latter conforming to Baird’s rule for open shell triplets.[1],[2] So this is unlike C2B2, which showed 2π + 4σ partitioning with the σ series following Baird’s rule. Now we have two examples in which one of the σ- or π-manifolds follows Baird’s rule and the other follows Hückel’s rule. The systems themselves are somewhat contrived, but they show the simple fun and games that can be had with these aromaticity rules.


This post has DOI: 10.14469/hpc/10350

References

  1. N.C. Baird, "Quantum organic photochemistry. II. Resonance and aromaticity in the lowest 3.pi..pi.* state of cyclic hydrocarbons", Journal of the American Chemical Society, vol. 94, pp. 4941-4948, 1972. https://doi.org/10.1021/ja00769a025
  2. M. Rosenberg, C. Dahlstrand, K. Kilså, and H. Ottosson, "Excited State Aromaticity and Antiaromaticity: Opportunities for Photophysical and Photochemical Rationalizations", Chemical Reviews, vol. 114, pp. 5379-5425, 2014. https://doi.org/10.1021/cr300471v

Sir Geoffrey Wilkinson: An anniversary celebration. 23 March, 2022, Burlington House, London.

Thursday, March 24th, 2022

The meeting covered the scientific life of Professor Sir Geoffrey Wilkinson from the perspective of collaborators, friends and family and celebrated three anniversaries, the centenary of his birth (2021), the half-century anniversary of the Nobel prize (2023) and 70 years almost to the day (1 April) since the publication of the seminal article on Ferrocene (2022).[1]


The meeting was organised as “inverse hybrid” (to use the new terminology), with a maximum capacity in-person audience attending along with fourteen speakers, three of whom were remote and one who could not attend on the day but whose presentation was given on their behalf. I will not give abstracts for the talks here, but note two common themes that I thought emerged during the day.

  1. All the speakers found themes in either their memories of Wilkinson and their time in his laboratories or their current research work that show how he continues to influence, along with the famous text book that he co-wrote, the modern world of chemistry. He truly left a remarkable legacy.
  2. This is a personal observation, but in his day, Wilkinson was famously sceptical of the ability of molecular modelling to cast profound insights into the molecules his group were studying. Yesterday I think with only one or two exceptions, the talks were accompanied by “DFT modelling” helping to provide such insights, either into the reaction mechanisms via energy profiles or into the properties of the molecules themselves, including their spectroscopy.

A small exhibition of artefacts included his famous portrait, all the editions of the text book and other items from his desk.

Finally, I thought I might explore the famous controversy surrounding the model of ferrocene which is shown in the photos below. It is shown with the two cyclopentadienyl rings in a so-called “eclipsed” conformation. To cast light on this, I show a search of the Cambridge crystal database of all molecules with this sub-structure. There are 24,868 of them.

The histogram plot of the dihedral angle is shown below. The staggered geometry has a dihedral of 36° and you can see a small maximum at this point in the distribution below. But this is dwarfed by 0°, the value for the eclipsed orientation. The barrier to rotation is known to be very small, and this is reflected in the almost continuous distribution amongst those 24,868 molecules.

References

  1. G. Wilkinson, M. Rosenblum, M.C. Whiting, and R.B. Woodward, "THE STRUCTURE OF IRON BIS-CYCLOPENTADIENYL", Journal of the American Chemical Society, vol. 74, pp. 2125-2126, 1952. https://doi.org/10.1021/ja01128a527

A four-atom molecule exhibiting simultaneous compliance with Hückel 4n+2 and Baird 4n selection rules for ring aromaticity.

Tuesday, March 22nd, 2022

Normally, aromaticity is qualitatively assessed using an electron counting rule for cyclic conjugated rings. The best known is the Hückel 4n+2 rule (n=0,1, etc) for inferring diatropic aromatic ring currents in singlet-state π-conjugated cyclic molecules and a counter 4n rule which infers an antiaromatic paratropic ring current for the system. Some complex rings can sustain both types of ring currents in concentric rings or regions within the molecule, i.e. both diatropic and paratropic regions. Open shell (triplet state) molecules have their own rule; this time the molecule has a diatropic ring current if it follows a 4n rule, often called Baird’s rule. But has a molecule which simultaneously follows both Hückel’s AND Baird’s rule ever been suggested? Well, here is one, as indeed I promised in the previous post.

The species shown above has two carbons and two borons in a ring. These have a total of 14 valence electrons, eight of which occupy the C-B bonds, leaving six contributing to circulating ring currents. These partition into two π-electrons which then form a Hückel 4n+2 aromatic (n=0) and four σ-electrons which then form a Baird 4n aromatic (n=1) as a triplet. The triplet for this molecule is indeed its lowest state, lying 38.9 and 45.4 kcal/mol lower in free energy than the two lowest energy singlet states. These arise by placing two electrons in either of the two orbitals σ2 or σ3, each of which is singly occupied in the triplet state (FAIR Data collection: 10.14469/hpc/10267).


Bonding MOs for C2B2.
σ3 σ2
σ1
π1

So here we see a different sort of doubly aromatic molecule, to add to C4, B4 and N4. With two electrons less than C4, it is now doubly aromatic as a triplet state, this time conforming to two different electron counting rules. It would be good to know if any other examples showing this pattern are known.
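
The electron counting used in these posts can be captured in a few lines. The sketch below is only a bookkeeping aid, with a hypothetical function name and an `open_shell` flag of my own invention: a closed-shell manifold is treated by Hückel’s rule (4n+2 aromatic, 4n antiaromatic) and a manifold carrying the two unpaired electrons of a triplet by Baird’s rule (4n aromatic, 4n+2 antiaromatic). It says nothing about whether a species is actually a minimum.

```python
def ring_current(n_electrons: int, open_shell: bool = False) -> str:
    """Classify one cyclic electron manifold by simple counting rules.

    open_shell=False applies Hückel's rule (4n+2 aromatic, 4n antiaromatic).
    open_shell=True applies Baird's rule for a triplet manifold
    (4n aromatic, 4n+2 antiaromatic).
    """
    if n_electrons <= 0 or n_electrons % 2 != 0:
        # Odd or empty counts (e.g. a 5 + 5 partition) are outside these rules.
        return "undefined"
    is_4n_plus_2 = n_electrons % 4 == 2
    aromatic = is_4n_plus_2 != open_shell  # Baird's rule inverts Hückel's rule
    return "aromatic" if aromatic else "antiaromatic"


# Triplet C2B2: two π-electrons (closed shell) and four σ-electrons (open shell).
print("C2B2 π:", ring_current(2))
print("C2B2 σ:", ring_current(4, open_shell=True))
# Singlet C4: six σ and two π electrons, both closed-shell manifolds.
print("C4 σ:", ring_current(6), "| C4 π:", ring_current(2))
```

Keeping the rule per manifold, rather than per molecule, is what allows a single species to satisfy both rules at once.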

Hückel’s rule originally applied to p-π electrons in a cycle, such as benzene. Nowadays it is also used for σ in-plane electrons in a cycle.


This post has DOI: 10.14469/hpc/10271.

More aromatic species with four atoms. B4 and N4.

Saturday, March 19th, 2022

I discussed in the previous post the small molecule C4 and how of the sixteen valence electrons, eight were left over after forming C-C σ-bonds which partitioned into six σ and two π. So now to consider B4. This has four electrons less, and now the partitioning is two σ and two π (CCSD(T)/Def2-TZVPPD calculation, FAIR DOI: 10.14469/hpc/10157). Again both these sets fit the Hückel 4n+2 rule (n=0).

Since B4 has only two rather than six delocalized σ-electrons, the contributions to the central B-B bond are weaker and so the B-B bond is much longer.

Bonding MOs for B4.
σ1, -0.335 au
π1, -0.372 au

Next, N4.

π-Bonding MOs for N4.
π3 π2
π1
σ-Bonding MOs for N4.
σ3 σ2
σ1

The pattern for N4 is different in several aspects. Firstly the π-system has six bonding electrons distributed over only four atoms. This makes the electron repulsions too high and the species is no longer stable, having one large imaginary force constant corresponding to an out-of-plane distortion. Secondly the lowest energy σ orbital is highly localised onto two nitrogens rather than being delocalised around the ring periphery. So all those electrons crammed into a small space have taken their toll.

Thus far we have identified three species, B4, C4 and N4, with interesting sets of respectively 4, 8 and 12 electrons, all partitioned into 4n+2 collections. But what happens if one cannot do that; let's say 6 and 10 electrons? Hang around to find out!

Molecule of the year 2021: Infinitene.

Thursday, December 16th, 2021

The annual “molecule of the year” results for 2021 are now available … and the winner is Infinitene.[1] This is a benzocirculene in the form of a figure eight loop (the infinity symbol), a shape which is also called a lemniscate [2] after the mathematical (2D) function due to Bernoulli. The most common class of molecule which exhibits this (well known) motif are hexaphyrins (hexaporphyrins; porphyrin is a tetraphyrin)[3],[4],[5], many of which exhibit lemniscular topology as determined from a crystal structure. Straightforward annulenes have also been noted to display this[6] (as first suggested here for a [14]annulene[7]) and other molecules show higher-order Möbius forms such as trefoil knots.[8],[9] This new example uses twelve benzo groups instead of six porphyrin units to construct the lemniscate. So the motif is not new, but this is the first time it has been constructed purely from benzene rings.

The molecule has D2 chiral symmetry and is shown below (click on the image for the 3D model obtained from the crystal structure).

The authors suggest that the aromaticity in a D2-symmetric [12]-circulene is confined to six “Clar” rings each of six electrons, and is not delocalised around the entire molecule. For a molecule with this topology (defined by a linking number, Lk = 2π[10]) the entire system would be defined as aromatic (delocalised) for 4n+2 electrons and antiaromatic for 4n electrons around a continuous annulene loop. In this example outer annulene circuits of either 34 or 38 carbons can be constructed which retain D2-symmetry and which both follow the 4n+2 rule, whilst a small inner circuit of 14 carbons can also be constructed. There are probably other D2-symmetric circuits that could be constructed.

When I saw the molecule, I asked myself what the calculated chiroptical properties for the molecule might be; the optical rotations of the two (separated) enantiomers of [12]-circulene were reported as +1130° (P,P) and -1112° (M,M). The calculated value (ωB97XD/Def2-TZVPP) is in excellent agreement. I have also included versions of this system with [11] and [10] benzo rings, which will be discussed in a future post.

| Benzene units | Optical rotation (589nm), ° | DOI |
|---|---|---|
| 12 (P,P) | +1143 | 10.14469/hpc/10000 |
| 11 (P,P) | +1025 | 10.14469/hpc/10037 |
| 10 (P,P) | -163 | 10.14469/hpc/10001 |

For good measure, the calculated VCD spectrum

Now to the geometry, as obtained from the crystal structure. The [12]circulene shows in total 12 short lengths of 1.348±0.014Å, indicating significant localisation in the system. The D2-symmetric C34 path through the system shows a mean length for each bond of 1.405Å, with a maximum value of 1.443Å and a minimum of 1.334Å. For this path, the topology of the system indicates Lk = 2π = 0.393Tw + 1.607Wr.[11] This means that most of the coiling of the molecule that results in that figure eight actually arises from a topological property known as writhe (Wr) rather than adjacent twisting (Tw) of the p-orbitals. This retains much p(π)-p(π) overlap and hence stabilisation. The values for the inner C14 route are Lk = 2π = 1.256Tw + 0.744Wr, which is more highly twisted than the larger outer pathway, and so aromaticity via this route is less favoured due to less favourable p(π)-p(π) overlaps.
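
The decomposition of the linking number into twist and writhe can be checked with a few lines of arithmetic. The sketch below assumes that the coefficients quoted above are the twist and writhe contributions in units of π, so that they should sum to 2 for each path; the dictionary and function names are illustrative only.

```python
# (twist, writhe) contributions in units of π, as quoted for the two paths.
paths = {
    "outer C34": (0.393, 1.607),
    "inner C14": (1.256, 0.744),
}


def linking_number(twist: float, writhe: float) -> float:
    """Linking number as the sum of twist and writhe (same units as the inputs)."""
    return twist + writhe


def writhe_fraction(twist: float, writhe: float) -> float:
    """Fraction of the linking number carried by writhe rather than twist."""
    return writhe / linking_number(twist, writhe)


for name, (tw, wr) in paths.items():
    print(f"{name}: Lk = {linking_number(tw, wr):.3f}π, "
          f"writhe fraction = {writhe_fraction(tw, wr):.2f}")
```

On this reading, roughly four-fifths of the coiling along the outer path is writhe, compared with well under half for the inner path, which is the contrast drawn in the paragraph above.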

I also note that Lk = 2π is an alternative chiral descriptor to the helical notation of (P,P). The (M,M) form would have Lk = -2π. The linking number is more general for more complex helical forms such as trefoils, cinquefoils, hexafoils etc.

So it turns out that this molecule has a fascinating challenge for trying to describe its extended delocalised aromaticity (rather than localised six-membered Clar rings), since more than one “annulene route” for which the “Hückel/Möbius rules” might apply exists.[10] Given that the maximum bond length for one of those routes (the [34]annulene) is 1.443Å, there may well be a contribution from this mode of aromaticity other than that from the Clar rings.

I hope to take a look at the [11] and [10]circulenes in a future post.


The explanation for this sign inversion is delightful but too complex to give here.[12]


This post has DOI: 10.14469/hpc/10036


References

  1. K. Itami, M. Krzeszewski, and H. Ito, "Infinitene: A Helically Twisted Figure-Eight [12]Circulene Topoisomer", 2021. https://doi.org/10.26434/chemrxiv-2021-pcwcc
  2. C.S.M. Allan, and H.S. Rzepa, "Chiral Aromaticities. AIM and ELF Critical Point and NICS Magnetic Analyses of Möbius-Type Aromaticity and Homoaromaticity in Lemniscular Annulenes and Hexaphyrins", The Journal of Organic Chemistry, vol. 73, pp. 6615-6622, 2008. https://doi.org/10.1021/jo801022b
  3. H. Rath, J. Sankar, V. PrabhuRaja, T.K. Chandrashekar, B.S. Joshi, and R. Roy, "Figure-eight aromatic core-modified octaphyrins with six meso links: syntheses and structural characterization", Chemical Communications, pp. 3343, 2005. https://doi.org/10.1039/b502327k
  4. H. Rath, J. Sankar, V. PrabhuRaja, T.K. Chandrashekar, and B.S. Joshi, "Aromatic Core-Modified Twisted Heptaphyrins[1.1.1.1.1.1.0]: Syntheses and Structural Characterization", Organic Letters, vol. 7, pp. 5445-5448, 2005. https://doi.org/10.1021/ol0521937
  5. S. Shimizu, N. Aratani, and A. Osuka, "meso‐Trifluoromethyl‐Substituted Expanded Porphyrins", Chemistry – A European Journal, vol. 12, pp. 4909-4918, 2006. https://doi.org/10.1002/chem.200600158
  6. T. Perera, F.R. Fronczek, and S.F. Watkins, "2,9,16,23-Tetrakis(1-methylethyl)-5,6,11,12,13,14,19,20,25,26,27,28-dodecadehydrotetrabenzo[a,e,k,o]cycloeicosene", Acta Crystallographica Section E Structure Reports Online, vol. 67, pp. o3493-o3493, 2011. https://doi.org/10.1107/s1600536811048604
  7. H.S. Rzepa, "A Double-Twist Möbius-Aromatic Conformation of [14]Annulene", Organic Letters, vol. 7, pp. 4637-4639, 2005. https://doi.org/10.1021/ol0518333
  8. G.R. Schaller, F. Topić, K. Rissanen, Y. Okamoto, J. Shen, and R. Herges, "Design and synthesis of the first triply twisted Möbius annulene", Nature Chemistry, vol. 6, pp. 608-613, 2014. https://doi.org/10.1038/nchem.1955
  9. S.M. Bachrach, and H.S. Rzepa, "Cycloparaphenylene Möbius trefoils", Chemical Communications, vol. 56, pp. 13567-13570, 2020. https://doi.org/10.1039/d0cc04190d
  10. P.L. Ayers, R.J. Boyd, P. Bultinck, M. Caffarel, R. Carbó-Dorca, M. Causá, J. Cioslowski, J. Contreras-Garcia, D.L. Cooper, P. Coppens, C. Gatti, S. Grabowsky, P. Lazzeretti, P. Macchi, Á. Martín Pendás, P.L. Popelier, K. Ruedenberg, H. Rzepa, A. Savin, A. Sax, W.E. Schwarz, S. Shahbazian, B. Silvi, M. Solà, and V. Tsirelson, "Six questions on topology in theoretical chemistry", Computational and Theoretical Chemistry, vol. 1053, pp. 2-16, 2015. https://doi.org/10.1016/j.comptc.2014.09.028
  11. S.M. Rappaport, and H.S. Rzepa, "Intrinsically Chiral Aromaticity. Rules Incorporating Linking Number, Twist, and Writhe for Higher-Twist Möbius Annulenes", Journal of the American Chemical Society, vol. 130, pp. 7613-7619, 2008. https://doi.org/10.1021/ja710438j
  12. M.S. Andrade, V.S. Silva, A.M. Lourenço, A.M. Lobo, and H.S. Rzepa, "Chiroptical Properties of Streptorubin B: The Synergy Between Theory and Experiment", Chirality, vol. 27, pp. 745-751, 2015. https://doi.org/10.1002/chir.22486

Protein-Biotin complexes. Crystal structure mining.

Sunday, December 12th, 2021

In the previous post, I showed some of the diverse “non-classical” interactions between Biotin and a protein where it binds very strongly. Here I take a look at two of these interactions to discover how common they are in small molecule structures.

The first search is of a CH hydrogen bond to the face of the aromatic ring in a tryptophan residue.

The search is shown below, in which the distance of the hydrogen to the ring centroid is defined, as is the angle subtended at that centroid, constrained to lie within 20° of a vertical approach.

The resulting heat plot shows 2772 entries (no disorder, no errors, R < 0.05), with a rather diffuse red spot at around 2.7-2.8Å (but which can be as short as 2.3Å) and an angle of approach of ~90±5°. This matches the concept of a region of interaction rather than a more focused “hydrogen bond”. It is seen as a relatively common motif!


The next search is for “hydrogen bonding” between the sulfur of a C-S-C unit (as found in Biotin) and an OH group.
This is less common, with 151 entries in the Cambridge small molecule database, the red spot having a relatively short S…H distance of 1.65Å and a slightly non-linear angle.

The NH analogue of this search, shown below (422 hits), reveals two clusters. The one with a large angle at H is more concentrated and reveals a distance of ~2.9Å, whilst the second cluster has a smaller angle and a long tail out to ~2.5Å.

So we conclude there is ample evidence in small molecule crystal structures for the types of interaction mooted for Biotin with proteins.

Biotin’s biggest lesson is the importance of nonclassical H-bonds in protein−ligand complexes.

Saturday, November 27th, 2021

The title comes from the abstract of an article[1] analysing why Biotin (vitamin B7) is such a strong and effective binder to proteins, with a free energy of (non-covalent) binding approaching 21 kcal/mol. The author argues that an accumulation of both CH-π and CH-O together with more classical hydrogen bonds and augmented by a sulfur centered hydrogen bond, oxyanion holes and water solvation, accounts for this large binding energy.

Here, I thought I would present a visualisation of the surroundings of biotin using the method of NCI (non-covalent-interaction) analysis, which looks at the behaviour of the electron density in the “weak” (i.e. non-covalent) regions of the biotin. This provides a more objective measure of the important interactions, independent of what we might consider important by virtue of having labels attached (such as e.g. “hydrogen bond”).

  1. I started by getting the coordinates of streptavidin (DOI: 10.2210/pdb3RY2/pdb), a protein with which biotin has been co-crystallised.[2]
  2. Loaded into the CCDC Mercury program, I selected the molecule biotin itself and then added to the selection its close contacts with various groups in the streptavidin protein. These additions were truncated and capped with a methyl group to allow a wavefunction for the assembly to be calculated.
  3. Hydrogens were then added to this structure to complete atom valencies, using “idealised” positions and ensuring that when rotamers were possible, they were set up to form hydrogen bonds.
  4. A calculation (DOI: 10.14469/hpc/9982 at the ωB97XD/Def2-TZVPP/SCRF=water level) was performed.
  5. The heavy atom coordinates (i.e. not hydrogens) are unaltered from the X-ray structure. Since atom positions as measured by X-ray diffraction and as computed using a DFT procedure are slightly different, the original coordinates were also subjected to three cycles of DFT-based geometry optimisation (DOI: 10.14469/hpc/9983) to better reflect the electron density in the molecule.
  6. The resulting wavefunctions in the form of an .fchk file (for both unoptimised and partially optimised geometries) were then used to compute a grid of total electron density points.
  7. The density, in the form of a cube of points, was fed to Jmol using the commands
    load biotin_den.cub; isosurface parameters [0.5 1 0.0005 0.05 0.95 1.00] NCI ""; color isosurface "bgyor" range -0.04 0.04;
    and the resulting NCI surface was written out using the command write biotin.jvxl for inclusion here.
  8. This is the NCI plot obtained from the raw coordinates from the PDB file.
  9. This is the NCI plot obtained from the coordinates from the PDB file after three geometry optimisation cycles. Can you spot any differences?

  10. These models are now available for you to explore by clicking on the images above.
    • Blue regions represent “strong” or classical hydrogen bonds. There are four of these in the NCI diagrams above and they are all compact, another characteristic of strong hydrogen bonds.
    • The hydrogen bond to sulfur is somewhat weaker, and appears in the display as a compact, albeit now cyan-coloured surface.
    • The remaining regions are both diffuse and green and represent weaker “interactions”. They are less compact than the classical hydrogen bonds. They do not represent a bond so much as an attractive region in the molecule and hence the term non-classical. Most are CH groups close to the π-surface of an aromatic ring, but some are also CH…O interactions.
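
For readers curious about what sits behind such a surface: NCI analysis, as usually described in the literature, is based on the reduced density gradient s = |∇ρ| / (2(3π²)^(1/3) ρ^(4/3)), with non-covalent regions appearing where both ρ and s are small. The post itself does not spell this out, so the sketch below is my own minimal illustration of that standard definition, evaluated for a hydrogen-like 1s density as a toy example; it is not the procedure Jmol uses internally, and all names are illustrative.

```python
import math


def reduced_density_gradient(rho: float, grad_norm: float) -> float:
    """Reduced density gradient s from a density and the norm of its gradient (atomic units)."""
    prefactor = 2.0 * (3.0 * math.pi ** 2) ** (1.0 / 3.0)
    return grad_norm / (prefactor * rho ** (4.0 / 3.0))


def hydrogen_1s_density(r: float) -> tuple:
    """Toy model: density and gradient norm of a hydrogen 1s orbital at radius r (bohr)."""
    rho = math.exp(-2.0 * r) / math.pi
    grad_norm = 2.0 * rho  # |d(rho)/dr| for an exponential decay
    return rho, grad_norm


for r in (0.0, 1.0, 3.0):
    rho, grad = hydrogen_1s_density(r)
    s = reduced_density_gradient(rho, grad)
    print(f"r = {r:.1f} bohr: rho = {rho:.4f}, s = {s:.3f}")
```

For an isolated atom s simply grows in the low-density tail; it is the overlap of such tails between neighbouring fragments that produces the low-s, low-ρ regions displayed as NCI surfaces.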

Do go ahead and load the 3D surface. You should particularly explore the CH-π regions and note that they are not necessarily associated with a particular CH bond, but with several of these combining to form an interaction with an aromatic π region.

What might emerge is the realisation that binding interactions are not always between specific atoms as in classical hydrogen “bonds”, but also constitute “stabilising regions” between the ligand and the protein. You will probably spot several of these regions that are not actually listed in the article itself.[1] I suggest that we do not refer to CH…π bonds such as in the quoted title of this post but instead as CH…π regions.

It would be great if the entire complex could be subjected to an NCI analysis. Wavefunctions for >2000 atoms can be obtained nowadays, but it would require a bit of work to ensure the density can be computed accurately enough and at high enough cubic resolution to be useful in the context of NCI analysis.


This blog has DOI: 10.14469/hpc/9984


References

  1. D.B. McConnell, "Biotin’s Lessons in Drug Design", Journal of Medicinal Chemistry, vol. 64, pp. 16319-16327, 2021. https://doi.org/10.1021/acs.jmedchem.1c00975
  2. I. Le Trong, Z. Wang, D.E. Hyre, T.P. Lybrand, P.S. Stayton, and R.E. Stenkamp, "Streptavidin and its biotin complex at atomic resolution", Acta Crystallographica Section D Biological Crystallography, vol. 67, pp. 813-821, 2011. https://doi.org/10.1107/s0907444911027806

First came Molnupiravir – now there is Paxlovid as a SARS-CoV-2 protease inhibitor. An NCI analysis of the ligand.

Saturday, November 13th, 2021

Earlier this year, Molnupiravir hit the headlines as a promising antiviral drug. This is now followed by Paxlovid, which is the first small molecule to be aimed by design at the SARS-CoV-2 protein and which is reported as greatly reducing the risk of hospitalization or death when given within three days of symptoms appearing in high risk patients.

The Wikipedia page (first created in 2021) will display a pretty good JSmol 3D model of this; the coordinates being generated automatically on the fly from a SMILES string, which specifies only what atoms are connected in the structure by bonds. Given that the structure of this molecule as embedded in the SARS-CoV-2 main protease[1] has been determined (and can be viewed here), I thought I might display those coordinates as an alternative to the Wikipedia/JSmol generated structure.

Click to get 3D model

I extracted the ligand from the PDB file and then added hydrogens manually to obtain the above result. There are three noteworthy points about these representations:

  1. A mystery concerns the nominal C≡N group on the top right, which displays an angle at the carbon of 117°. A cyano group is of course linear (180°). This is not a defect of the crystal structure determination, but an indication of a rather stronger interaction occurring (as indeed noted[1]). The distance between the carbon of the cyano group and an adjacent sulfur is 1.814Å, which indicates a covalent bond has formed to the cyano group. The nitrogen of the erstwhile cyano group is 3.013Å away from an adjacent NH group, which suggests it is stabilised by a hydrogen bond.
  2. Crystal structure searching of units with S…C…N in which the N has only one bond reveals zero hits, but searches of S…C…NH reveal nine hits, with S…C distances in the range 1.74 – 1.80Å and C…N distances in the region 1.25 – 1.27Å. The reported CN distance is 1.251Å, confirming that when bound to the protein, the cyano group is replaced by an S-C=NH group and hence is clearly an important component of the mode of action of Paxlovid.
  3. The conformation of Paxlovid is in one respect not fully represented by the Wikipedia diagram, shown below, which implies that the t-butyl group (on the left) is well separated from the pyrrolidinone ring system at the right of the molecule.

    In fact the two groups are adjacent, probably held in that conformation by a combination of weak dispersion forces and a contribution from the surrounding protein in the crystal structure. This is shown more graphically by the NCI (non-covalent-interaction) diagram below (DOI: 10.14469/hpc/9964), where the green areas in the region between the two groups (ringed in red) represent stabilising interactions between them. You might also spot other green/cyan regions indicating additional weak hydrogen bonds between C-H groups and oxygen!

PAXLOVID NCI analysis

There are only a small number of crystal structures of small molecules containing the S-C=NH motif. I will try to find out how common this is in protein-ligand structures.
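The geometric checks made in points 1 and 2 above (the angle at the erstwhile cyano carbon and the S…C and C…N distances) are easy to repeat for any set of Cartesian coordinates. Here is a minimal sketch using only the Python standard library. The helper names are my own, and the coordinates in the example are invented to represent an idealised linear arrangement; they are not taken from the crystal structure.

```python
import math


def distance(a, b):
    """Distance between two points given as (x, y, z) tuples, in the input units."""
    return math.dist(a, b)


def angle_deg(a, b, c):
    """Angle a-b-c in degrees, with atom b at the vertex."""
    v1 = [p - q for p, q in zip(a, b)]
    v2 = [p - q for p, q in zip(c, b)]
    dot = sum(x * y for x, y in zip(v1, v2))
    cos_angle = dot / (math.hypot(*v1) * math.hypot(*v2))
    # Clamp to [-1, 1] to guard against rounding error before acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))


if __name__ == "__main__":
    # Invented coordinates (in Å) for an idealised, perfectly linear C-C≡N unit.
    neighbour = (-1.47, 0.0, 0.0)
    carbon = (0.0, 0.0, 0.0)
    nitrogen = (1.16, 0.0, 0.0)
    print(angle_deg(neighbour, carbon, nitrogen))  # linear arrangement
    print(distance(carbon, nitrogen))
```

Feeding in the ligand coordinates from the deposited structure instead should allow the 117° angle and the 1.814Å S…C distance quoted above to be checked directly.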


There are many tools for performing this operation. I used the following procedure: I downloaded the PDB file (https://files.rcsb.org/download/7vh8.cif) and opened it in CSD Mercury. I then selected the ligand (by identifying the CF3 group and clicking on one atom), inverted the selection so that everything but the ligand was selected and, using edit/structure, deleted the selected atoms, leaving only the ligand.
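As one of those many alternative tools, the same extraction can be scripted. The sketch below is not the procedure I used. It assumes the coordinates are available in the legacy fixed-column PDB text format (rather than the mmCIF file linked above), in which the residue name occupies columns 18-20, and it keeps only the HETATM records for a given residue name. The function name and the residue code "LIG" in the example are placeholders; you would need to look up the actual ligand identifier in the entry.

```python
def extract_ligand(pdb_lines, resname):
    """Return the HETATM records whose residue name matches resname.

    pdb_lines : iterable of lines in legacy fixed-column PDB format
    resname   : residue name of the ligand (columns 18-20 of each record)
    """
    kept = []
    for line in pdb_lines:
        # Record type is in columns 1-6; residue name in columns 18-20.
        if line.startswith("HETATM") and line[17:20].strip() == resname:
            kept.append(line)
    return kept


if __name__ == "__main__":
    # Three invented, truncated records: one protein atom, one ligand atom, one water.
    sample = [
        "ATOM      1  N   SER A   1",
        "HETATM 2001  C1  LIG A 401",
        "HETATM 2002  O   HOH A 501",
    ]
    for record in extract_ligand(sample, "LIG"):
        print(record)
```

Hydrogens would still have to be added afterwards, as described in the main text.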

Postscript

The cyanopyrrolidine group, such as that in Paxlovid, is well known as a specific probe.[2],[3],[4] CovalentInDB is a comprehensive database facilitating the discovery of such covalent inhibitors[5] and is available here. There is also a program called DataWarrior that is potentially able to find such probes.

References

  1. Y. Zhao, C. Fang, Q. Zhang, R. Zhang, X. Zhao, Y. Duan, H. Wang, Y. Zhu, L. Feng, J. Zhao, M. Shao, X. Yang, L. Zhang, C. Peng, K. Yang, D. Ma, Z. Rao, and H. Yang, "Crystal structure of SARS-CoV-2 main protease in complex with protease inhibitor PF-07321332", Protein & Cell, vol. 13, pp. 689-693, 2021. https://doi.org/10.1007/s13238-021-00883-2
  2. N. Panyain, A. Godinat, A.R. Thawani, S. Lachiondo-Ortega, K. Mason, S. Elkhalifa, L.M. Smith, J.A. Harrigan, and E.W. Tate, "Activity-based protein profiling reveals deubiquitinase and aldehyde dehydrogenase targets of a cyanopyrrolidine probe", RSC Medicinal Chemistry, vol. 12, pp. 1935-1943, 2021. https://doi.org/10.1039/d1md00218j
  3. N. Panyain, A. Godinat, T. Lanyon-Hogg, S. Lachiondo-Ortega, E.J. Will, C. Soudy, M. Mondal, K. Mason, S. Elkhalifa, L.M. Smith, J.A. Harrigan, and E.W. Tate, "Discovery of a Potent and Selective Covalent Inhibitor and Activity-Based Probe for the Deubiquitylating Enzyme UCHL1, with Antifibrotic Activity", Journal of the American Chemical Society, vol. 142, pp. 12020-12026, 2020. https://doi.org/10.1021/jacs.0c04527
  4. C. Bashore, P. Jaishankar, N.J. Skelton, J. Fuhrmann, B.R. Hearn, P.S. Liu, A.R. Renslo, and E.C. Dueber, "Cyanopyrrolidine Inhibitors of Ubiquitin Specific Protease 7 Mediate Desulfhydration of the Active-Site Cysteine", ACS Chemical Biology, vol. 15, pp. 1392-1400, 2020. https://doi.org/10.1021/acschembio.0c00031
  5. H. Du, J. Gao, G. Weng, J. Ding, X. Chai, J. Pang, Y. Kang, D. Li, D. Cao, and T. Hou, "CovalentInDB: a comprehensive database facilitating the discovery of covalent inhibitors", Nucleic Acids Research, vol. 49, pp. D1122-D1129, 2020. https://doi.org/10.1093/nar/gkaa876