Detecting anomeric effects in tetrahedral boron bearing four oxygen substituents.

April 30th, 2024

In an earlier post, I discussed[1] a phenomenon known as the “anomeric effect” exhibited by tetrahedral carbon compounds with four C-O bonds. Each oxygen itself bears two bonds and has two lone pairs, and either of these can align with one of three other C-O bonds to generate an anomeric effect. Here I change the central carbon to a boron to explore what happens, as indeed I promised earlier.

Candidate molecules can be identified by a constrained search of the Cambridge Structural Database (CSD), as shown below.

The four B-O distances for each compound matching the query are then analysed further: the greatest and least values are identified and the difference between them is calculated.

The results are shown in the diagram below. Three outliers are identified for close inspection.

Each of the three candidates is also subjected to a Gaussian calculation (MN15L/Def2-TZVPP)[2] (see DOI: 10.14469/hpc/14092).

  1. QIXREW[3]. This molecule is neutral overall, with ΔrB-O = 0.193Å (MN15L/Def2-TZVPP ΔrB-O = 0.175Å). The Wiberg bond indices of the longest and shortest B-O bonds are 0.486 and 0.698, Δ = 0.212. These differences are significantly larger than for the best example in the C-O series, for which the largest calculated ΔrC-O = 0.074Å and the difference in Wiberg index is 0.137.
  2. XOVZOY[4] is a tri-anion with an intercalated Ir3+ counterion. ΔrB-O = 0.347Å. A calculation on the isolated tri-anion (with a continuum water field to help emulate the crystal environment) results in a maximum B-O bond length difference of only 0.004Å, which is dramatically different from the crystal structure. This may be an example where the counter-ion is especially important for modelling structure, or it may simply be an anomalous refinement of the crystal structure.
  3. KBDCTB, ΔrB-O measured = 0.451Å, Calculated 0.0314Å.
    This is another structure where all may not be what it seems. This again is an anionic structure, and geometry optimisation of a single molecule results in a dramatic change in the internal hydrogen bonding of the species. In the crystal structure, the carboxylic acid groups all form intermolecular hydrogen bonds. When the structure is optimised as an isolated molecule, these are no longer possible and a big conformational change occurs to allow all four carboxylic acid groups to form intramolecular H-bonds instead. In this conformation, all four B-O bonds are essentially the same length. So this might well be an example of a large change in anomeric effects due to changes in geometry induced by hydrogen bonding.

    Intermolecular H-bonds Intramolecular H-bonds

One lesson always learnt when comparing bond lengths observed in crystal structures with those calculated using quantum mechanics is that they sometimes do not match well. These mismatches can occur for various reasons: changes in hydrogen bonding, the presence of unmodelled counterions, or simply errors in the reported crystal structure. But we might suggest from this brief foray into B-O bonds that the anomeric effects found there may indeed be larger than those of their C-O counterparts.

References

  1. H. Rzepa, "Detecting anomeric effects in tetrahedral carbon bearing four oxygen substituents.", 2024. https://doi.org/10.59350/dfkt5-k2b20
  2. H. Rzepa, "Detecting anomeric effects in tetrahedral boron bearing four oxygen substituents.", 2024. https://doi.org/10.14469/hpc/14092
  3. S.I. Kalläne, T. Braun, B. Braun, and S. Mebs, "Versatile reactivity of a rhodium(i) boryl complex towards ketones and imines", Dalton Transactions, vol. 43, pp. 6786, 2014. https://doi.org/10.1039/c4dt00080c
  4. H. Danjo, K. Hirata, S. Yoshigai, I. Azumaya, and K. Yamaguchi, "Back to Back Twin Bowls of <i>D</i><sub>3</sub>-Symmetric Tris(spiroborate)s for Supramolecular Chain Structures", Journal of the American Chemical Society, vol. 131, pp. 1638-1639, 2009. https://doi.org/10.1021/ja8071435

Internet Archeology: reviving a 2001 article published in the Internet Journal of Chemistry.

March 21st, 2024

In the mid to late 1990s, as the Web developed, it was becoming more obvious that one area it would revolutionise was that of scholarly journal publishing. Since the days of the very first scientific journals in the 1650s, the medium had been firmly rooted in paper. Even printed colour only became common (and affordable) from the 1980s. The Web provided an opportunity to move away from these restrictions. Early adopters of this medium in chemistry were the CLIC pilot project[1] in 1995 and the Internet Journal of Chemistry (IJC), the latter offering “enhanced chemical publication which permits the publication of materials which cannot be published on paper and end-use customization which permits the readers to read articles prepared for their specific needs”.[2] Publication of IJC started in January 1998, offering “authors the opportunity to enhance their articles by fully incorporating multimedia, large data sets, Java applets, color images and interactive tools.” The journal remained online for seven years, after which it was closed and the articles became inaccessible. By then many major chemistry journals had started evolving along some of the same lines, and it could be argued that this journal had served its purpose of alerting both publishers and authors to these new opportunities. Here I describe how an IJC article published in 2001 was brought back to life in more or less the enhanced manner intended.[3]

Entitled “The Mechanism and Design of Asymmetric Co-Arctate Br+ (Mobius) Atom Transfers Between Alkenes. A Computational Study”, the article still has an abstract visible via services such as Scifinder, but a more complete and open metadata description of the kind that an assigned DOI (digital object identifier) provides is not available, since back in 2001 the adoption of DOIs by journals was still in its infancy. Fortunately, the original source was still available from the authors as a combination of HTML, image files and data, the latter two being hyperlinked into the body of the article. These files are in fact all that is needed to recreate the original IJC article (if not its style), using the mechanism of a data repository[4],[5] rather than that normally designed for a journal. The procedure adopted was as follows:

  1. All the data files were uploaded to the repository as a dataset[6], DOI: 10.14469/hpc/13929.
  2. The metadata record generated and registered for these depositions (https://data.datacite.org/application/vnd.datacite.datacite+xml/10.14469/hpc/13929) has Access (the A of FAIR) identifiers in the form of e.g.
    1. <relatedIdentifier relatedIdentifierType="URL" relationType="HasPart">https://data.hpc.imperial.ac.uk/resolve/?doi=13929&file=1</relatedIdentifier>
    2. Descriptive metadata providing further properties if needed, such as file names and media types and file sizes can be obtained via
      • <relatedIdentifier relatedIdentifierType="URL" relationType="HasMetadata">https://data.hpc.imperial.ac.uk/resolve/?ore=13929</relatedIdentifier>
    3. These access identifiers replaced the hyperlinks in the original article HTML
      1. Originally: <a href="supplemental/3-ts-rh3.pdb">-1.5</a>
      2. Becomes: <a href="https://data.hpc.imperial.ac.uk/resolve/?doi=13929&file=54">-1.5</a>
      3. It is worth noting that there are basically two methods of accessing a file. The first relies on its relative path in a hierarchical file system. Hard-coding such a location into a URL means it may not be persistent – the hyperlink is vulnerable to “link rot” when the file system is reorganised and the path to the file changes. The second method relies on a database query, which should be rather more persistent, since the database should always incorporate any reorganisation of the internal systems. A third option (not used here) is to assign a persistent identifier to every file, and to ensure that a properly persistent direct access mechanism is described in metadata for that file.
      4. The root document for the article, given the reserved filename index.html, was edited to reflect the changes in the hyperlinks.
  3. The article document index.html was now itself uploaded to the repository. In a conventional data repository, such a file invokes no specific actions, but in the repository used for this purpose it does have the reserved meaning of invoking in effect a preview or “LiveView” using the syntax
    • <iframe name="liveview" src="https://data.hpc.imperial.ac.uk/resolve/?doi=13929&file=90"
      width="100%" height="600" id="liveview"></iframe>
  4. The article now functions in much the same way as it would have done on IJC, with one interesting difference. The regular style adopted in journals is to place the ESI (electronic supporting information) files into a separate enclave, linked via the article landing page by parochial mechanisms. In this instance the article and its data files are visible on the same page (it is a data repository after all), thus elevating the data to the same status as the article. Such elevation is often referred to as making “Data a first class citizen of the publication processes”.
  5. The opportunity now arose to incorporate an interactive tool based on the use of the JSmol molecule viewer.
    • By adding an additional header to the HTML document containing a Javascript invocation of JSmol, selected data could be brought to life by creating a molecular model in a separate window.
    • This is invoked by a variation on the hyperlink shown above in step 2.3.2:
      <a href="javascript:show_jmol();javascript:handle_jmol('10.14469/hpc/13933',%20';frame 1;font label 16;zoom 5;moveto 4 90 4 80 65 120;spin 3;set echo bottom left;font echo 20 serif bolditalic;color echo green;echo TS for 3 (C2 symmetry);')">Load 3D Model</a>
    • Additional tools are now provided, such as activating a (molecular) vibration, calculating a chirality (if applicable) or others invoked from a pull-down menu.
    • In this example, the data is again accessed directly from a data repository, albeit by a different mechanism from that shown in step 2.3.2, here based only on the DOI of the data and its media type (in this case chemical/x-mdl-molfile).
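The access identifiers described in step 2 can in principle be extracted from a metadata record programmatically. The following is a minimal sketch, assuming the record is available as XML text; the two relatedIdentifier elements are those quoted above, whereas the wrapping elements of the sample and the helper name related_urls are assumptions made purely for illustration.

```python
import xml.etree.ElementTree as ET

# Cut-down stand-in for a metadata record. The two relatedIdentifier
# elements are quoted from the post; the <resource> and
# <relatedIdentifiers> wrappers are assumed for this sketch.
SAMPLE = """<resource>
  <relatedIdentifiers>
    <relatedIdentifier relatedIdentifierType="URL" relationType="HasPart">https://data.hpc.imperial.ac.uk/resolve/?doi=13929&amp;file=1</relatedIdentifier>
    <relatedIdentifier relatedIdentifierType="URL" relationType="HasMetadata">https://data.hpc.imperial.ac.uk/resolve/?ore=13929</relatedIdentifier>
  </relatedIdentifiers>
</resource>"""


def related_urls(xml_text, relation_type):
    """Return the URLs of all relatedIdentifier elements of one relation type."""
    root = ET.fromstring(xml_text)
    urls = []
    for element in root.iter():
        # Strip any "{namespace}" prefix so the sketch also copes with
        # namespaced records.
        local_name = element.tag.rsplit("}", 1)[-1]
        if (local_name == "relatedIdentifier"
                and element.get("relationType") == relation_type
                and element.text):
            urls.append(element.text.strip())
    return urls


print(related_urls(SAMPLE, "HasPart"))
print(related_urls(SAMPLE, "HasMetadata"))
```

Note that inside XML the ampersand in the URL has to be written as `&amp;`; the parser returns it as a plain `&`.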

It was not the intention here to illustrate how a journal infrastructure might work, merely to rescue an article published 23 years ago (a long time in the Internet era) from a journal that is no longer disseminating articles. In the process the article has acquired its own DOI (albeit as data and not as a journal article), something not available from the original journal, and some level of interactivity of the type originally envisaged. The (manual) process took around 2-3 hours, and would certainly need automating if it were to be used more than once. I take encouragement, however, that after so many years it was still possible to achieve this curation with relatively little effort.
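As an indication of what automating the hyperlink replacement might look like, here is a minimal sketch. It assumes a lookup table from the original relative paths to the file numbers assigned by the repository; only the single pair shown earlier (supplemental/3-ts-rh3.pdb becoming file 54) comes from the actual article, and the names RESOLVER, FILE_INDEX and rewrite_links are hypothetical. A regular expression is used for brevity; a proper HTML parser would be more robust.

```python
import re

# Resolver URL pattern as used for dataset 13929 in the post.
RESOLVER = "https://data.hpc.imperial.ac.uk/resolve/?doi=13929&file={index}"

# Hypothetical lookup table: original relative path -> repository file number.
# Only this one entry is documented in the post.
FILE_INDEX = {"supplemental/3-ts-rh3.pdb": 54}

HREF = re.compile(r'href="([^"]+)"')


def rewrite_links(html, file_index):
    """Replace known relative hrefs with repository resolver URLs."""
    def swap(match):
        path = match.group(1)
        if path in file_index:
            return 'href="' + RESOLVER.format(index=file_index[path]) + '"'
        return match.group(0)  # leave unknown links untouched
    return HREF.sub(swap, html)


original = '<a href="supplemental/3-ts-rh3.pdb">-1.5</a>'
print(rewrite_links(original, FILE_INDEX))
```

Links that do not appear in the lookup table are deliberately left unchanged, so that external hyperlinks in the article survive the rewrite.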

References

  1. D. James, B.J. Whitaker, C. Hildyard, H.S. Rzepa, O. Casher, J.M. Goodman, D. Riddick, and P. Murray‐Rust, "The case for content integrity in electronic chemistry journals: The CLIC project", New Review of Information Networking, vol. 1, pp. 61-69, 1995. https://doi.org/10.1080/13614579509516846
  2. S.M. Bachrach, "The 21st century chemistry journal", Química Nova, vol. 22, pp. 273-276, 1999. https://doi.org/10.1590/s0100-40421999000200020
  3. H. Rzepa, "Internet Archeology: an example of a revitalised molecular resource with a new activity now built in.", 2020. https://doi.org/10.59350/9c769-34y25
  4. Re3data.Org., "Imperial College Research Computing Service Data Repository", 2016. https://doi.org/10.17616/r3k64n
  5. FAIRsharing Team., and , ., "FAIRsharing record for: Imperial College Research Computing Service Data Repository", 2018. https://doi.org/10.25504/fairsharing.letkjt
  6. H. Rzepa, "The Mechanism and Design of Asymmetric Co-Arctate Br+ (Mobius) Atom Transfers Between Alkenes. A Computational Study", 2024. https://doi.org/10.14469/hpc/13929

Detecting anomeric effects in tetrahedral carbon bearing four oxygen substituents.

March 18th, 2024

I have written a few times about the so-called “anomeric effect”, which relates to stereoelectronic interactions in molecules such as sugars bearing a tetrahedral carbon atom with at least two oxygen substituents. The effect can be detected when the two C-O bond lengths in such molecules are inspected, most obviously when one of these bonds has a very different length from the other. The effect originates when one of the lone pairs of electrons on one oxygen atom overlaps with the antibonding σ* orbital of the C-O bond to another oxygen, thus shortening the C-O bond of the donating oxygen and lengthening the accepting C-O bond. Here I take a look at tetra-substituted versions of this, C(OR)4, where in theory there are up to eight lone pairs, each able to interact with any of three C-O bonds, giving a total of 24 possible anomeric effects in one molecule.
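The count of 24 can be checked by simple enumeration. This is only an illustrative sketch; the oxygen labels are arbitrary and not taken from any particular structure.

```python
from itertools import permutations

# Arbitrary labels for the four oxygens of C(OR)4.
oxygens = ["O1", "O2", "O3", "O4"]
lone_pairs_per_oxygen = 2

# Each lone pair on a donor oxygen can interact with the C-O bond of any
# *other* oxygen, so every ordered (donor, acceptor) pair of distinct
# oxygens is counted once per donor lone pair.
interactions = [
    (donor, lone_pair, "C-" + acceptor)
    for donor, acceptor in permutations(oxygens, 2)
    for lone_pair in range(1, lone_pairs_per_oxygen + 1)
]

# 4 oxygens x 2 lone pairs x 3 acceptor bonds
print(len(interactions))
```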


We start the process with a search of the Cambridge crystal structure database, using the following search query:

This yields 25 hits. We now want to find out what the longest and shortest C-O bonds are, and how large the difference between them is. To do this, we have to resort to applying some functions, using the calculator tool built into the Mercury analysis software. The following functions were used:

  1. Greatest('search3'.'DIST1','search3'.'DIST2','search3'.'DIST3','search3'.'DIST4')
  2. Least('search3'.'DIST1','search3'.'DIST2','search3'.'DIST3','search3'.'DIST4')
  3. Greatest('search3'.'DIST1', 'search3'.'DIST2', 'search3'.'DIST3', 'search3'.'DIST4')-Least('search3'.'DIST1', 'search3'.'DIST2', 'search3'.'DIST3', 'search3'.'DIST4')
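For readers without access to Mercury, the same three quantities are easily obtained elsewhere. Below is a minimal Python sketch of the equivalent calculation; the function name and the four distances are hypothetical and do not correspond to any entry in the search.

```python
def bond_length_spread(distances):
    """Return (greatest, least, difference) for four bond lengths in Å."""
    if len(distances) != 4:
        raise ValueError("expected exactly four C-O distances")
    greatest = max(distances)
    least = min(distances)
    return greatest, least, greatest - least


# Hypothetical DIST1..DIST4 values (Å), invented for illustration only.
example = [1.40, 1.38, 1.45, 1.36]
greatest, least, difference = bond_length_spread(example)
print(f"greatest={greatest:.3f} least={least:.3f} difference={difference:.3f}")
```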

The results can be displayed as below, in which the difference between the two bond lengths is colour coded (red = greatest, blue = least).

  1. Here you can see that when the difference between the longest and shortest C-O bond lengths is small, the colour is blue.
  2. Green dots show a difference of about 0.04-0.05Å.
  3. The red dot has the greatest difference of 0.087Å and corresponds to the entry SILDOH[1] (data DOI: 10.5517/ccq8lq8[2]).

The next step is to apply a “reality check” using computation, here a MN15L/Def2-TZVPP calculation on the top eight entries as sorted by the largest C-O bond length differences (ΔrC-O > 0.05Å)[3], data DOI: 10.14469/hpc/13925.

CCDC refcode   Crystal structure (Å)          Computed structure (Å)
               Longest   Shortest   Δ         Longest   Shortest   Δ
SILDOH         1.451     1.364      0.087     1.441     1.367      0.074
PILTOU         1.432     1.361      0.071     1.418     1.378      0.040
GISSAD         1.435     1.367      0.068     1.422     1.375      0.047
BODGEG         1.507     1.442      0.065     1.424     1.370      0.054
GINLOF         1.425     1.364      0.061     1.418     1.377      0.041
POCPOO         1.419     1.361      0.058     1.421     1.371      0.050
KEVFUM         1.417     1.361      0.056     1.395     1.391      0.004
AHEYAO         1.423     1.370      0.053     1.422     1.372      0.050

  1. The largest effect occurs for SILDOH, and this is replicated by calculation.
  2. The largest discrepancy between measurement and calculation is for KEVFUM,  where calculation predicts almost no C-O bond differences. This will be discussed elsewhere.
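The comparison between measurement and calculation can be made explicit with a few lines of Python. This sketch simply takes the two Δ columns from the table above and ranks the entries by how far the crystal value departs from the computed one; the variable names are my own.

```python
# (refcode, crystal Δ, computed Δ) in Å, copied from the table above.
rows = [
    ("SILDOH", 0.087, 0.074),
    ("PILTOU", 0.071, 0.040),
    ("GISSAD", 0.068, 0.047),
    ("BODGEG", 0.065, 0.054),
    ("GINLOF", 0.061, 0.041),
    ("POCPOO", 0.058, 0.050),
    ("KEVFUM", 0.056, 0.004),
    ("AHEYAO", 0.053, 0.050),
]

# Discrepancy = crystal Δ minus computed Δ, rounded to the table's precision.
discrepancies = {ref: round(crystal - computed, 3) for ref, crystal, computed in rows}

# Rank from the largest to the smallest absolute discrepancy.
ranked = sorted(discrepancies, key=lambda ref: abs(discrepancies[ref]), reverse=True)
for ref in ranked:
    print(f"{ref}: {discrepancies[ref]:.3f}")

worst = ranked[0]
```

KEVFUM comes out at the top of this ranking, with a discrepancy of 0.052Å.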

Focusing on SILDOH, we look at the NBO E(2) energies for the donor-acceptor interactions of an oxygen lone pair donating into a C-O antibonding σ* orbital.

Click on the image below for a 3D model of the two interacting orbitals (positive overlap = blue + purple, red + orange)

The interaction energies of LpO1 and LpO2 with the long bond C5-O4 are 18.0 and 16.3 kcal/mol respectively, whereas in the reverse directions LpO4 to C5-O1 is only 6.0 kcal/mol and LpO4 to C5-O2 is 10.7 kcal/mol. For a “normal” C-O bond such as C5-O3, however, LpO2 to C5-O3 = 3.1 and LpO1 to C5-O3 = 5.3 kcal/mol. In effect, two oxygens “gang up” on weakening the long C5-O4 bond, but leave the shorter C5-O3 bond alone. So the individual anomeric effects are no larger than normal, but the cooperative effect of two acting together is what produces the final geometric asymmetry.
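The “ganging up” can be seen by summing the quoted E(2) values for each accepting bond. The sketch below uses only the six interactions mentioned in the text, so the totals are partial sums and not a complete NBO analysis.

```python
from collections import defaultdict

# (donor lone pair, acceptor bond, E(2) in kcal/mol), as quoted in the text.
# Interactions not mentioned there are omitted.
e2_values = [
    ("LpO1", "C5-O4", 18.0),
    ("LpO2", "C5-O4", 16.3),
    ("LpO4", "C5-O1", 6.0),
    ("LpO4", "C5-O2", 10.7),
    ("LpO2", "C5-O3", 3.1),
    ("LpO1", "C5-O3", 5.3),
]

# Sum the donation received by each C-O antibonding orbital.
totals = defaultdict(float)
for donor, acceptor, energy in e2_values:
    totals[acceptor] += energy

for bond, total in sorted(totals.items(), key=lambda item: -item[1]):
    print(f"{bond}: {total:.1f} kcal/mol")
```

On these numbers the long C5-O4 bond receives 34.3 kcal/mol in total, compared with 8.4 kcal/mol for C5-O3.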

The Wiberg bond index mirrors this effect. The bond indices are 0.9882 for O1-C5 and 0.8512 for C5-O4 (Δ = -0.137), a big difference in bond order which accounts for the large (record?) difference in bond length.

In the next post, I will analyse the equivalent molecules B(OR)4.

References

  1. R. Betz, and P. Klüfers, "Norbornane-2,7-diyl 1′,2′-phenylene orthocarbonate", Acta Crystallographica Section E Structure Reports Online, vol. 63, pp. o3933-o3933, 2007. https://doi.org/10.1107/s1600536807042298
  2. Betz, R.., and Klufers, P.., "CCDC 663670: Experimental Crystal Structure Determination", 2007. https://doi.org/10.5517/ccq8lq8
  3. H. Rzepa, "Detecting anomeric effects in tetrahedral carbon bearing four oxygen substituents.", 2024. https://doi.org/10.14469/hpc/13925

Data Citation – a snapshot of the chemical landscape.

February 26th, 2024

The recent release of the DataCite Data Citation corpus, which has the stated aim of providing “a trusted central aggregate of all data citations to further our understanding of data usage and advance meaningful data metrics”, made me want to investigate what the current state of citing data in the area of chemistry might be. Chemistry is known to be a “data rich” science (as most of the physical sciences are) and here on this very blog I try to cite whenever possible the source(s) of the data that I often use when discussing a topic. Such citations are not necessarily the same as citing a journal source via e.g. its DOI, although of course one is very likely to find data associated with most articles nowadays, albeit almost entirely via any associated supporting information document. However, the latter is often presented in a relatively unstructured (PDF) form, which does not adhere to what are called the “FAIR” guidelines of being findable, accessible, interoperable and reusable. Directly citing data is a way of improving its FAIR characteristics. So what insights does the Data Citation corpus reveal?

  1. This overview shows that by far the most common mechanism for citing data is via its Accession Number, used predominantly by Life Sciences (an example of this latter is linked here[1]), with the DOI (digital object identifier) being less common.
  2. Drilling down to citation counts in chemical sciences by publisher, an odd picture emerges with just a handful of citations.
  3. The more general physical sciences do not fare much better:
  4. Let's try a different approach, filtering by repository. Here are the statistics for the Cambridge Crystallographic Data Centre, whose data was being cited in large amounts a few years back, but for which citations appear to have dropped off in the last few years. Given that the entries there continue to go up almost exponentially, we begin to suspect that the data citations there are not being properly recognised as such by the citation corpus.
  5. Let's try another repository, Zenodo, where citations are again dropping and the totals are about 500 a year for the most recent years.
  6. OK, one more go, the RSC chemistry publisher.

I am not sure what to make of this; areas where you would expect very high levels of data citation in chemical sciences do not appear to exist – I think for some reason, the DataCite citation corpus is not yet capturing them.[2] But when things do start operating as perhaps expected, I think we will have a very valuable resource, which should firmly put data (whether FAIR or not) on the map.

References

  1. D. Batista, A. Gonzalez-Beltran, S. Sansone, and P. Rocca-Serra, "Machine actionable metadata models", Scientific Data, vol. 9, 2022. https://doi.org/10.1038/s41597-022-01707-6
  2. R. Page, "Problems with the DataCite Data Citation Corpus", 2024. https://doi.org/10.59350/t80g1-xys37

Mechanistic templates computed for the Grubbs alkene-metathesis reaction.

February 19th, 2024

Following on from my template exploration of the Wilkinson hydrogenation catalyst, I now repeat this for the Grubbs variant of the alkene metathesis reaction. As with the Wilkinson, here I focus on the stereochemistry of the mechanism as first suggested by Chauvin[1], an aspect lacking in e.g. the Wikipedia entry. As before, the diagram below is hyperlinked to the appropriate data repository identifier so that you can go straight from the scheme to the data (top-level data DOI: 10.14469/hpc/13796).

The essence of the reaction is the formation of a metallacyclobutane intermediate which, being approximately symmetrical with a plane of symmetry, can revert to the catalyst and an alkene in one of two ways, reforming the original alkene (two red dot carbons) or forming a metathesised alkene (red-blue dot carbons).

Although the mechanism is often described as a [2+2] cycloaddition in which d-orbital participation from the metal lowers the activation energy significantly, calculations at the MN15L/Def2-TZVPPD/SCRF level indicate there can be up to four discrete steps involved in the process. There are three routes involving these steps that the calculations (B3LYP+GD3+BJ/Def2-TZVPPD/SCRF=DCM) reveal (DataDOI 10.14469/hpc/13796). The starting point for all three routes is the most stable reactant catalyst (left above) which has the H-C-H carbene group in the same plane as the P-Ru-P atoms.

  1. The red route involves the following steps:
    • Activation of the catalyst by rotation of the carbene from its lowest energy orientation by 90°.
    • followed by addition of an alkene to form a π-complex –
    • then formation of a C-C bond between the alkene and the carbene (animation below). The remarkable feature of this third step is that the carbene group must again rotate through 90° (indicated with a red rotational arrow in scheme above) prior to finally forming the C-C bond.
  2. The magenta route involves only one step, in which addition of the alkene is directly followed in a second stage by C-C bond formation, via what is called the “hidden intermediate” of the alkene complex (visible at ~IRC -3 for the energy profile below)
  3. The green and final route again involves up to four steps:
    • A pseudorotation to place the two chlorine atoms di-axial,
    • next, addition of an alkene to form a π-complex
    • immediately followed by C-C bond formation between alkene and carbene, again with twisting of the carbene group in the final step. The combined IRC for the last two of these steps (below) shows that the alkene π-complex in fact sits in a shallow but real minimum, compared to it being only a “hidden intermediate” in the magenta route.

    • The reaction can either reverse at this stage to eliminate a different alkene, or progress through one final pseudorotation step to rejoin the product of the red and magenta routes.

      Click on image above to get 3D model of the transition state.

Of the three routes, the green one has the lowest “high energy” point, corresponding to a barrier of ΔG 14.7 kcal/mol, which is consistent with the facile room temperature reaction that it is. The two almost-equal high points are the initial pseudorotation and the alkene complexation, although the final C-C bond formation is also very similar in energy.
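To put the barrier into context, a rough rate constant can be estimated from the Eyring equation, k = (kB·T/h)·exp(-ΔG‡/RT). The sketch below is an order-of-magnitude illustration only: it assumes a temperature of 298.15 K, a transmission coefficient of 1 and a simple first-order process, none of which is stated in the calculations themselves.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K
H = 6.62607015e-34   # Planck constant, J s
R = 1.987204e-3      # gas constant, kcal/(mol K)


def eyring_rate(delta_g_kcal, temperature=298.15):
    """First-order rate constant (s^-1) from a free energy barrier in kcal/mol.

    Assumes a transmission coefficient of 1.
    """
    prefactor = K_B * temperature / H
    return prefactor * math.exp(-delta_g_kcal / (R * temperature))


k = eyring_rate(14.7)
half_life = math.log(2) / k
print(f"k = {k:.3g} s^-1, half-life = {half_life:.3g} s")
```

Under these assumptions the rate constant comes out at the order of 100 per second, i.e. a half-life of milliseconds, which is indeed facile at room temperature.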

So we have learnt that this mechanism is actually a bit more complex than is normally shown and that two of the steps (red and green) involve a very unusual methylene rotation accompanying the C-C bond formation. No doubt, the stereoelectronic orbital interactions responsible for this are fascinating, but an analysis of these will have to wait for another post.

References

  1. P. Jean‐Louis Hérisson, and Y. Chauvin, "Catalyse de transformation des oléfines par les complexes du tungstène. II. Télomérisation des oléfines cycliques en présence d'oléfines acycliques", Die Makromolekulare Chemie, vol. 141, pp. 161-176, 1971. https://doi.org/10.1002/macp.1971.021410112

3D Molecular model visualisation: 3 Million atoms +

January 27th, 2024

In the late 1980s, as I recollected here[1], the equipment needed for real-time molecular visualisation, as it became known, was still expensive, requiring custom systems such as Evans and Sutherland PS390 workstations. One major breakthrough in making such techniques generally available on less specialised equipment was achieved by Roger Sayle[2], then working at Imperial College around 1990 and using a Silicon Graphics workstation. He greatly optimised the rendering algorithms by creating a program called RasMol (after his initials), which meant such visualisations could also be achieved very rapidly even on a personal computer. Moving from vector display technology (the PS390) to raster/bitmap graphics had allowed spacefilling representations of molecules containing 100s if not 1000s of atoms, and in turn enabled the new World-Wide Web to exploit the technique.[3]

Whilst RasMol is very much still around, it also provided the inspiration for successor programs such as Jmol (based on Java) and JSmol (based on the JavaScript language built into all modern web browsers). There are now many articles in the literature describing these programs. In 2008 the very first post on this blog described how to run Jmol in a WordPress instance[4].

Now a new milestone in molecular visualisation has been reached – the ability to display 3 million atoms! Bob Hanson has just released Jmol/JSmol 16.1.51 which supports the BinaryCIF file format. An example of the power of both program and this new format is illustrated with the protein 8glv[5] which contains 3 million atoms (the bcif file itself is only 47.4 Mb).

The Jmol/JSmol script to load it is:

    t = now();
    set autobond false;
    load =8glv.bcif filter "*.C";
    spacefill on;
    color chain;
    print now(t);

and the actual rendering takes just 10-20 seconds. You can see from the screenshots below that when it is zoomed in, it really does show individual atoms! Who knows what the practical atom limit is, but it is almost certainly more than three million! And it may even be possible on a mobile phone!


OK, you are asking why I have not loaded 8glv into this page? Well, I need to update JSmol on this site first, and have encountered an issue that needs fixing.

References

  1. H. Rzepa, "Computers 1967-2011: a personal perspective. Part 2. 1985-1989.", 2011. https://doi.org/10.59350/g4j62-4xk50
  2. R. Sayle, "RASMOL: biomolecular graphics for all", Trends in Biochemical Sciences, vol. 20, pp. 374-376, 1995. https://doi.org/10.1016/s0968-0004(00)89080-5
  3. H.S. Rzepa, B.J. Whitaker, and M.J. Winter, "Chemical applications of the World-Wide-Web system", Journal of the Chemical Society, Chemical Communications, pp. 1907, 1994. https://doi.org/10.1039/c39940001907
  4. H. Rzepa, "Jmol and WordPress: Loading 3D molecular models, molecular isosurfaces and molecular vibrations into a blog", 2008. https://doi.org/10.59350/pq7ds-gqr71
  5. T. Walton, and A. Brown, "96-nm repeat unit of doublet microtubules from Chlamydomonas reinhardtii flagella", 2023. https://doi.org/10.2210/pdb8glv/pdb

The Macintosh computer at 40.

January 25th, 2024

On 24th January 1984, the Macintosh computer was released, as all the media are informing us. Apparently, some are still working. I thought I would give my own personal recollections of that period.

In fact, the Mac reached UK stores via a dealership only in 1985. What brought it to the attention of our university chemistry department was that the Chemdraw program was also released in 1985, and visitors to e.g. ACS meetings that year (probably the spring meeting) brought news of it back. A third piece of the puzzle, the Laserwriter, also appeared that year. What difference would all this make? Well, take a look at the diagram in this 1983 article[1]. I drew that with stencils and transfer lettering, and the diagrams in this article took me ages! The article was submitted to what was called a “camera ready” journal, as part of the process of accelerating its publication, so it had to be as perfect as I could make it. I had to start from the beginning several times, since sometimes even Typex could not fix the errors or rescue the diagram from being a bit too big to fit onto the template provided by the journal.

After drafting these diagrams, I vowed never again! Fortunately, the Mac, Chemdraw and the Laserwriter appeared some 18 months later! I remember going around the (mostly organic) chemists in the department, asking if they would like to join in a bulk purchase, and we ended up with 10 Macs. By 1985, the model had moved on to the Mac 512K, which was the one actually purchased, and photos of the front and rear of one are shown below (I still have it, hoping a collector might make me an offer one day).

The first year of use revealed an infamous quirk. The port on the rear of the Mac 512K did not support attachment of any hard drives (although in 1985 these were ferociously expensive for a 10 MB drive!) and so most of one's time was spent not using e.g. Chemdraw but pushing floppy disks in and out of the machine. A year later, the Mac Plus 1Mb version was introduced (third photo) and this had a SCSI port. I attached such a 10 Mbyte drive to this port and the bliss at not having to rotate floppy disks was immense.

Back to the 512K model. After they were delivered, I gathered all 9 other users and introduced them all to the mouse. In the first 15 minutes, there were rumblings that they would never get used to such a strange object, but at roughly the 45 minute mark, they were all converts. The program demonstrated was of course Chemdraw. Microsoft Word was not yet available but another simple word processor was (WriteNow) and everyone practised constructing diagrams such as the above. What joy! And no Typex, or starting the diagram from scratch – merely a simple 10 second edit.

By 1987 as I recollect, there were many 1MB models now installed and we set about networking them all together and connecting them to the Laserwriter. We even managed to use the Mac to connect to STN international to search Chemical Abstracts[2] and the modern era was well under way.

So this is my tribute to the Mac on its 40th birthday. I still use them to this day.

References

  1. A.M. Lobo, S. Prabhakar, H.S. Rzepa, A.C. Skapski, M. Tavers, and D.A. Widdowson, "C-substitution reactions of c,n-diaryl nitrones", Tetrahedron, vol. 39, pp. 3833-3841, 1983. https://doi.org/10.1016/s0040-4020(01)88625-7
  2. H. Rzepa, "A trip down memory lane: An online departmental connection map from 1989.", 2023. https://doi.org/10.59350/85xp6-2sy65

A mechanistic exploration of the Wilkinson hydrogenation catalyst. Part 1: Model templates

January 21st, 2024

Geoffrey Wilkinson first reported his famous work on the hydrogenation catalyst that now bears his name in 1965[1], and I met him at Imperial College around 1969 and again when I returned there in 1977. He was still working on these catalysts then and I was privileged to collaborate with him on unravelling the NMR spectra of some of these compounds.[2],[3],[4] During that period, computational modelling of the mechanisms of molecules containing transition elements was still in its infancy and I never extended my collaboration into this area at that time. Now, even if belatedly, I decided to explore this aspect and started to do this about two weeks ago. Here I thought that I would use this opportunity to show how I am going about it.

The diagram below is an extension of the one found at Wikipedia and here is acting in effect as a “Finding Aid” for the data that would be gradually generated for the mechanism. At the outset, I decided to build my own version to also act as a laboratory notebook charting my progress, building the finding aid as I went. This explains, by the way, the rather amorphous expansion of the diagram!

Before discussing the mechanism itself, I point out some features of the diagram itself. Each computed species is associated with a free energy (in Hartree) acting as a FAIR-type identifier for the calculation[5] as a means of improving the findability of the data and the replicability of the result. Also included is the energy relative to the lowest point in the mechanism (itself set to 0.0) and next to that you can see a five digit code. If prefixed by the string https://doi.org/10.14469/hpc/ this acts as a digital object identifier (DOI) for each calculation, pointing to a landing page providing information about the archived dataset. The top-level DOI 13538 acts as a collection or container for the project, being also the DOI that would normally be cited in any description of the results, such as here. The diagram above uses the graphical vector format SVG, which allows hyperlinks to be inserted. So if you click on one of these strings embedded in the diagram (see e.g.[6]), it should take you straight to the data for that result.
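Should the five-digit codes need resolving outside the diagram, the rule just described (prefix the code with https://doi.org/10.14469/hpc/) is trivial to apply in a script. A minimal sketch follows; the function name doi_url is my own invention and the check on the code format is an assumption based on the codes quoted in this post.

```python
DOI_PREFIX = "https://doi.org/10.14469/hpc/"


def doi_url(code):
    """Turn a five-digit code from the diagram into a resolvable DOI URL."""
    code = str(code)
    # Assumption: every code in the diagram is exactly five digits.
    if not (code.isdigit() and len(code) == 5):
        raise ValueError(f"expected a five-digit code, got {code!r}")
    return DOI_PREFIX + code


# 13538 is the top-level collection DOI quoted above.
print(doi_url(13538))
```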

The first point to make about the mechanism itself concerns the stereochemistries of the various 3-6 coordinate species, which are not really discussed in the Wikipedia mechanism. On the right-hand side of the diagram, two alternative pathways with different stereochemistry are included. On the left-hand side (in grey) a different sequence of events is set out, which rejoins the main pathway at the dotted line. The next point is the level of computational theory adopted: the MN15L DFT procedure, which is suitable for transition metal elements, with the Def2-TZVPP basis set and a continuum solvent correction. For rapid exploration, I made an initial big approximation, which was to set the substituent on the phosphorus to R=H rather than R=Ph. This allows templates for the entire cycle to be constructed relatively rapidly and then revisited as desired in a follow-up exploration using these templates.

The mechanistic features are described below. The DOI suffix is quoted for you to locate on the diagram.

  1. To the right of the cycle, we follow the accepted route, which is initial loss of one phosphine ligand, followed by addition of H2 to the rhodium (13559).
  2. The added hydrogens can pseudorotate into a different stereochemical orientation (13563) and either of these stereoisomers can now complex with the alkene (13569 or 13576).
  3. The two resulting 6-coordinate complexes could in theory interconvert by a different pseudorotation (Turnstile[7]), but this appears high in energy (13580).
  4. One of the carbons of the alkene complex now inserts into the Rh-H bond (13543, 13578) to form a Rh-alkyl complex in which an agostic-style Rh-H interaction is apparent (13545, 13588).
  5. The agostic interaction is removed (13598, 13589) to form stereochemical isomers of the Rh-alkyl complex (13593).
  6. Another pseudorotation sets up the stereochemistry for the final step (13554, 13592).
  7. The remaining Rh-C bond can now insert into the remaining Rh-H bond, at which point the two separate isomeric paths coalesce to form a single transition state (13596 ≡ 13549), releasing the activated Rh complex with which the cycle started and hence completing the cycle.
  8. To the right of the diagram are two cul-de-sac intermediates (in grey) which result from re-addition of phosphine to the hydride complex.
  9. To the left of the diagram is an alternative sequence which involves adding alkene to the Rh first, followed only then by H2 addition (13584, 13583). The energies of this path do appear significantly higher than those of the alternative. Once the alkene/H2 complex is formed, it rejoins the cycle on the right of the diagram (horizontal double-headed dashed arrow).

You can follow the (relative) energies of this mechanism from the diagram; they are all reasonable for a thermal reaction. However, I will refrain from making any overall decision about the rate-determining step (thought to be step 7 above), because the models for both the phosphine ligand and the (unsubstituted) alkene do not yet have any steric components, which are known to be important. What we have here therefore are templates for the next stage of studying the mechanism, when Ph3P and e.g. propene will replace the current models.

Here I have tried to show a somewhat different approach to “laboratory notebook management”, whereby each step in the investigation is accompanied by a persistent identifier (a DOI) pointing to a location where the coordinates for the template can be readily obtained. The DOIs are added as each step completes, in this case into a Chemdraw diagram. Unfortunately, Chemdraw does not have a hyperlink tool (I did ask for one a few years back), so the links can only be added to the exported SVG file at the final stage. I inserted 40 such hyperlinks using a text editor; the process was not too onerous and, because the SVG file is text based, it is also easily edited for errors and small corrections. Curiously, SVG editing tools such as the venerable Inkscape do not currently support the addition of hyperlinks and, given the well-established mechanisms for hyperlinking text, it seems odd that the equivalent has not developed for images.
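For anyone who would rather script this step than use a text editor, here is a minimal Python sketch of one way it might be done. It is not the procedure I actually followed, and it rests on an assumption that may not hold for every export: that each five-digit code survives as the sole content of its own SVG text element. The function name add_doi_links is hypothetical.

```python
import re
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
XLINK_NS = "http://www.w3.org/1999/xlink"
DOI_PREFIX = "https://doi.org/10.14469/hpc/"
FIVE_DIGITS = re.compile(r"^\d{5}$")

# Keep the usual prefixes when the tree is written back out.
ET.register_namespace("", SVG_NS)
ET.register_namespace("xlink", XLINK_NS)


def add_doi_links(svg_source: str) -> str:
    """Wrap every <text> element holding a five-digit code in an <a> link."""
    root = ET.fromstring(svg_source)
    text_tag = f"{{{SVG_NS}}}text"
    link_tag = f"{{{SVG_NS}}}a"

    # ElementTree has no parent pointers, so record the targets first
    # and only modify the tree once the scan is complete.
    targets = []
    for parent in root.iter():
        if parent.tag == link_tag:
            continue  # already linked, leave untouched
        for index, child in enumerate(list(parent)):
            label = "".join(child.itertext()).strip()
            if child.tag == text_tag and FIVE_DIGITS.match(label):
                targets.append((parent, index, child, label))

    for parent, index, child, label in targets:
        url = DOI_PREFIX + label
        link = ET.Element(link_tag)
        link.set("href", url)                  # SVG 2 style
        link.set(f"{{{XLINK_NS}}}href", url)   # for older SVG 1.1 viewers
        link.tail, child.tail = child.tail, None
        parent.remove(child)
        parent.insert(index, link)             # one-for-one swap keeps indices valid
        link.append(child)

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = (
        '<svg xmlns="http://www.w3.org/2000/svg">'
        '<text x="10" y="20">13559</text><text x="10" y="40">Rh</text></svg>'
    )
    print(add_doi_links(sample))
```

Running the function a second time on its own output leaves the links unchanged, so it is safe to re-run after adding further codes to the diagram.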

References

  1. J.F. Young, J.A. Osborn, F.H. Jardine, and G. Wilkinson, "Hydride intermediates in homogeneous hydrogenation reactions of olefins and acetylenes using rhodium catalysts", Chemical Communications (London), pp. 131, 1965. https://doi.org/10.1039/c19650000131
  2. K.W. Chiu, H.S. Rzepa, R.N. Sheppard, G. Wilkinson, and W. Wong, "Two-dimensional δ/J-resolved 31P n.m.r. spectroscopy of [bis(diphenylphosphino)methane](trimethylphosphine)chlororhodium(I)", J. Chem. Soc., Chem. Commun., pp. 482-484, 1982. https://doi.org/10.1039/c39820000482
  3. K.W. Chiu, C.G. Howard, H.S. Rzepa, R.N. Sheppard, G. Wilkinson, A.M. Galas, and M.B. Hursthouse, "Trimethyl and diethylphenylphosphine complexes of rhenium(I, III, IV, V) and their reactions. X-ray crystal structures of a bis(η5-cyclopentadienyl)-ethane-bridged dirhenium(I) complex obtained from phenylacetylene, tetrakis-(diethylphenylphosphine) (dinitrogen) hydridorhenium (I), tetrakis(trimethyl-phosphine) (η2-dimethylphosphinomethyl) rhenium(I) and tetrakis(trimethylphosphine) (iodo)methyl rhenium(III) iodide-tetramethylphosphonium iodide", Polyhedron, vol. 1, pp. 441-451, 1982. https://doi.org/10.1016/s0277-5387(00)86558-4
  4. K.W. Chiu, H.S. Rzepa, R.N. Sheppard, G. Wilkinson, and W. Wong, "Bis(diphenylphosphino)methane trimethylphosphine alkyl and η5-cyclopentadienyl compounds of rhodium(I); 31P{1H} two dimensional δ/J resolved and Overhauser effect nuclear magnetic resonance spectroscopy", Polyhedron, vol. 1, pp. 809-817, 1982. https://doi.org/10.1016/0277-5387(82)80008-9
  5. H. Rzepa, "Harnessing FAIR data: A suggested useful persistent identifier (PID) for quantum chemical calculations.", 2018. https://doi.org/10.59350/nk414-18p76
  6. S. Arkhipenko, M.T. Sabatini, A.S. Batsanov, V. Karaluka, T.D. Sheppard, H.S. Rzepa, and A. Whiting, "Mechanistic insights into boron-catalysed direct amidation reactions", Chemical Science, vol. 9, pp. 1058-1072, 2018. https://doi.org/10.1039/c7sc03595k
  7. H.S. Rzepa, and M.E. Cass, "A Computational Study of the Nondissociative Mechanisms that Interchange Apical and Equatorial Atoms in Square Pyramidal Molecules", Inorganic Chemistry, vol. 45, pp. 3958-3963, 2006. https://doi.org/10.1021/ic0519988

Scholarly journals vs Scholarly Blogs.

January 12th, 2024

First, a very brief history of scholarly publishing, starting in 1665[1] when scientific journals started to be published by learned societies. This model continued until the 1950s, when commercial publishers such as Pergamon Press started with their USP (unique selling point) of rapid time to publication of ~3 months,[2] compared to typical times for many learned society publishers of 2 years or longer. Fast forward another 50 years or so, and the commercial publishers were now dominating the scene, but the business model was still based on institutional subscriptions, whereby the institution rather than authors paid the costs of publication. As the number of journals expanded, even well-off institutions had to make difficult decisions on which subscriptions to keep and which to cancel. By the late 1990s the delivery model was changing from print to online, but the overall issue was that many scientists around the world no longer had access to many journals.

Enter the APC, or article processing charge, whereby the authors themselves had to reimburse the journals for publishing their papers, although they could often still recover these costs from their institution. The cost of an APC depended on the reputation of the journal; those with the highest “impact factors” often charged the highest APCs, some of which could reach £5000+ for a single “paper” (still called that even in an electronic era). Also, some journals remained “hybrid”, where the costs were split between institutional subscriptions and APC-funded articles. At least the latter could be accessed by anyone (including the “public”) without restriction (Open Access), often also referred to as GOLD. There are even Diamond (also known as Platinum) articles, which are GOLD open access but without author fees. Diamond is typically used by publishers who are keen to emphasise that they do not charge authors to publish open access.

With many APCs ranging from £1000 up to £5000 or more, some started asking why it should cost so much to have this type of publishing infrastructure. Also in the early 2000s, “social media” started up, which at first tended to concentrate on instant publication and hence impact. Such media were not considered capable of rivalling the longevity achieved by journal publishers, which after all had been around for 360 years or so, nor was this even thought desirable. Things have begun to change, however. Enter as an example Rogue Scholar, and its associated blog Front Matter. The aim here is to exploit the underpinning technical infrastructure of a blog host by automatically adding features more commonly associated with learned society or commercial journal publishing.

I wrote[3] about some of the features available last September and now, only four months later, the functionality continues to expand. This includes:

  1. The ability to acquire a JATS XML version (Journal Article Tag Suite), the standard format for scholarly articles.
  2. I had previously noted that blog posts are assigned a DOI based on the Crossref registration agency, and hence also acquire a metadata record which becomes useful for searching. All 800+ of the posts on this site have such a DOI, for example.
  3. One interesting recent use of blogs is to act as a science newsletter associated with a funded grant, as an adjunct to simply publishing the research results in a journal.
  4. Indexing is also making big strides with the introduction of an API (application programming interface), another service offered by scholarly publishers. As part of this, fields of science are being added to the metadata to enable filtering, e.g. by Chemistry.
  5. Archiving, in theory for all of posterity, is also starting to be addressed. This requires transformation from HTML, typically used in blogs, to a medium more appropriate for long-term archiving.
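As a small illustration of why the metadata record attached to a DOI (item 2 above) is useful, here is a minimal Python sketch that asks the doi.org resolver for machine-readable metadata using HTTP content negotiation. It deliberately does not use the Rogue Scholar API itself, whose details I do not set out here; the function names are my own, and the DOI in the example is that of my earlier post.[3]

```python
import json
import urllib.request

DOI_RESOLVER = "https://doi.org/"
# Media type for citation metadata in CSL JSON form.
CSL_JSON = "application/vnd.citationstyles.csl+json"


def metadata_request(doi: str, media_type: str = CSL_JSON) -> urllib.request.Request:
    """Build (but do not send) a content-negotiation request for a DOI."""
    return urllib.request.Request(DOI_RESOLVER + doi, headers={"Accept": media_type})


def fetch_metadata(doi: str, timeout: float = 10.0) -> dict:
    """Send the request and decode the JSON metadata record (needs network access)."""
    with urllib.request.urlopen(metadata_request(doi), timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    request = metadata_request("10.59350/8m2d8-47b52")
    print(request.full_url, request.get_header("Accept"))
    # Uncomment to query the resolver; the fields returned depend on
    # what the registration agency holds for the DOI.
    # print(fetch_metadata("10.59350/8m2d8-47b52").get("title"))
```

The same request works for any DOI whose registration agency supports content negotiation, which is part of what makes a blog post with a DOI searchable in the same way as a journal article.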

The costs of the infrastructures described above are certainly very much less than e.g. the APC charges noted earlier, in part because they are so highly automated. I expect things will move very rapidly on this front.


It is hoped to automatically include these in the post itself in the future. Meanwhile, it can easily be retrieved by a suitable search.

References

  1. H. Oldenburg, "Epistle dedicatory", Philosophical Transactions of the Royal Society of London, vol. 1, pp. i-ii, 1665. https://doi.org/10.1098/rstl.1665.0001
  2. D. Ginsburg, and W.J. Rosenfelder, "Alicyclic studies—X", Tetrahedron, vol. 1, pp. 3-8, 1957. https://doi.org/10.1016/0040-4020(57)85003-0
  3. H. Rzepa, "Improving the Science blog – The Rogue Scholar service.", 2023. https://doi.org/10.59350/8m2d8-47b52

Macrocyclic peptide antibiotics – now Zosurabalpin – then antibacterial agents based on cyclic D,L-α-peptide architectures.

January 8th, 2024

Zosurabalpin[1],[2] is receiving a great deal of attention as a new class of antibiotic which can target infections for which current treatment options are inadequate. It is a cyclic peptide and seeing this triggered a memory of an earlier such species reported way back in 1995[3],[4]. This octa-peptide (YIJDIE, DOI: 10.5517/cc58gxs) was presumed to function in a novel manner, having linear water channels wide enough to form a molecular nanoscale pipe for a stream of water molecules to flow along. When inserted into the bacterial cell membrane via its lipophilic sidechains, it drained the bacterium of its cell water within seconds, thus killing it. A 3D model shows the effect very clearly.

Zosurabalpin does not function in this manner. Its structure was devised by optimising the various substituents until optimal activity was obtained (see this patent WO202319441).

The ligand (VB6) is seen below. A program such as Chimera can tease out many more details.

Zosurabalpin embedded in the protein pdb8frn can be viewed below and the coordinates can be obtained via DOI: 10.2210/pdb8frn/pdb

The cyclic octapeptide of the original 1995 report[3] appears never to have been developed into a clinically useful antibiotic, but I wonder where this approach led.

References

  1. C. Zampaloni, P. Mattei, K. Bleicher, L. Winther, C. Thäte, C. Bucher, J. Adam, A. Alanine, K.E. Amrein, V. Baidin, C. Bieniossek, C. Bissantz, F. Boess, C. Cantrill, T. Clairfeuille, F. Dey, P. Di Giorgio, P. du Castel, D. Dylus, P. Dzygiel, A. Felici, F. García-Alcalde, A. Haldimann, M. Leipner, S. Leyn, S. Louvel, P. Misson, A. Osterman, K. Pahil, S. Rigo, A. Schäublin, S. Scharf, P. Schmitz, T. Stoll, A. Trauner, S. Zoffmann, D. Kahne, J.A.T. Young, M.A. Lobritz, and K.A. Bradley, "A novel antibiotic class targeting the lipopolysaccharide transporter", Nature, vol. 625, pp. 566-571, 2024. https://doi.org/10.1038/s41586-023-06873-0
  2. S. Hawser, N. Kothari, T. Valmont, S. Louvel, and C. Zampaloni, "2131. Activity of the Novel Antibiotic Zosurabalpin (RG6006) against Clinical Acinetobacter Isolates from China", Open Forum Infectious Diseases, vol. 10, 2023. https://doi.org/10.1093/ofid/ofad500.1754
  3. M.R. Ghadiri, K. Kobayashi, J.R. Granja, R.K. Chadha, and D.E. McRee, "The Structural and Thermodynamic Basis for the Formation of Self‐Assembled Peptide Nanotubes", Angewandte Chemie International Edition in English, vol. 34, pp. 93-95, 1995. https://doi.org/10.1002/anie.199500931
  4. S. Fernandez-Lopez, H. Kim, E.C. Choi, M. Delgado, J.R. Granja, A. Khasanov, K. Kraehenbuehl, G. Long, D.A. Weinberger, K.M. Wilcoxen, and M.R. Ghadiri, "Antibacterial agents based on the cyclic d,l-α-peptide architecture", Nature, vol. 412, pp. 452-455, 2001. https://doi.org/10.1038/35086601