Posts Tagged ‘HTML element’

Curating a nine-year-old journal FAIR data table.

Monday, May 29th, 2017

As the Internet and its Web-components age, so early pages start to decay as technology moves on. A few posts ago, I talked about the maintenance of a relatively simple page first hosted some 21 years ago. In my notes on the curation, I wrote the phrase “Less successful was the attempt to include buttons which could be used to annotate the structures with highlights. These buttons no longer work and will have to be entirely replaced in the future at some stage.” Well, that time has now come, for a rather more crucial page associated with a journal article published more recently in 2009.[1]

The story started a few days ago when I was contacted by the learned society publisher of that article, who noted that they were “just checking our updated HTML view and wanted to test some of our old exceptions”. I should perhaps explain what this refers to. The standard journal production procedures involve receiving a Word document from authors and turning that into XML markup for the internal production processes. For some years now, I have found such passive (i.e. printable only) Word content unsatisfactory for expressing what is now called FAIR (Findable, Accessible, Interoperable and Re-usable) data. Instead, I would create another XML expression (using HTML), which I described as Interactive Tables, and then ask the publisher to host it and add that as a further link to the final published article. I have found that learned society publishers have not been unwilling to create an “exception” to their standard production workflows (the purely commercial publishers rather less so!). That exceptional link is http://www.rsc.org/suppdata/cp/b8/b810301a/Table/Table1.html but it has now “fallen foul of the java deprecation”.

Back in 2008 when the table was first created, I used the Java-based Jmol program to add the interactive component. That page, when loaded, now responds with the message:

This, I must emphasise, is nothing to do with the publisher; it is the Jmol certificate that has been revoked. That of itself requires explanation. Java is a powerful language which needs to be “sandboxed” to ensure system safety. But commands can be created which can access local file stores and write files out there (including potentially dangerous ones). So it became the practice to sign the Java code with the developer’s certificate to ensure provenance for the code. These certificates are time-limited and around 2015 the time came to renew this one. Normally, when such a certificate is renewed, the old one is allowed to continue operation. On this occasion the agency renewing the certificate did not do this but revoked the old one instead (Certificate has been revoked, reason: CESSATION_OF_OPERATION, revocation date: Thu Oct 15 23:11:18 BST 2015). So all instances of Jmol with the old certificate now give the above error message.

The solution in this case is easy; the old Jmol code (as JmolAppletSigned.jar) is simply replaced with the new version for which the certificate is again valid. But simply doing that alone would merely have postponed the problem; Java is now indeed deprecated for many publishers, which is a warning that it will be prohibited at some stage in the future.‡ So time to bite the bullet and remove the dependency on Java-Jmol, replacing it with JSmol which uses only JavaScript.

Changing published content is in general not allowed; one instead must publish a corrigendum. But in this instance, it is not the content that needs changing but the style of its presentation (following the principle of the Web of a clear-cut separation of style and content). So I set out to update the style of presentation, but I was keen to document the procedures used. I did this by commenting out non-functional parts of the style components of my original HTML document (as <!-- comment -->) and adding new ones. I describe the changes I made below.

  1. The old HTML contained the following initialisation code: jmolInitialize(".","JmolAppletSigned.jar");jmolSetLogLevel('0'); which was commented out.
  2. New scripts to initialise JSmol instead were added, such as:
    <script src="JSmol.min.js" type="text/javascript"> </script>
  3. I added further scripts to set up controls to add interactivity.
  4. The now deprecated buttons had been invoked using a Jmol instance:  jmolButton('load "7-c2-h-020.jvxl";isosurface "" opaque; zoom 120;',"rho(r) H")
  5. This was replaced by the JSmol equivalent, but this time to produce a hyperlink rather than a button (to allow the Greek ρ to appear, which it could not on a button): <a href="javascript:show_jmol_window();Jmol.script(jmolApplet0,'load 7-c2-020.jvxl;isosurface &quot;&quot; translucent;spin 3;')">ρ(r)</a>,
  6. Some more changes were made to another component of the table, the links to the data repository. Originally, these quoted a form of persistent identifier known as a Handle: 10042/to-800. Since the data was deposited in 2008, the data repository has licensed further functionality to add DataCite DOIs to each entry. For this entry, the DOI is 10.14469/ch/775. Why? Well, the original Handle registration had very little (chemically) useful registered metadata, whereas DataCite allows far richer content. So an extra column was added to the table to indicate these alternate identifiers for the data.
  7. We are now at the stage of preparing to replace the Java applet at the publisher’s site with the JavaScript version, along with the amended HTML file. The above link, as I write this post, still invokes the old Java, but hopefully it will shortly change to function again as a fully interactive table.
  8. I should say that the whole process, including finding a solution and implementing it, took 3-4 hours’ work, of which the major part was the analysis rather than its implementation.
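Other pages needing the same curation could be located mechanically. What follows is only a minimal Python sketch, assuming that the three legacy strings quoted in the list above (jmolInitialize, JmolAppletSigned.jar and jmolButton) are sufficient markers; the function name find_legacy_jmol and the pattern labels are hypothetical, and real pages may invoke Java-Jmol in other ways.

```python
import re

# Markers taken from the snippets quoted in the list above.
# The set is illustrative (an assumption), not an exhaustive Jmol API list.
LEGACY_PATTERNS = {
    "java_init": re.compile(r"jmolInitialize\s*\("),
    "signed_jar": re.compile(r"JmolAppletSigned\.jar"),
    "java_button": re.compile(r"jmolButton\s*\("),
}


def find_legacy_jmol(html: str) -> list:
    """Return (line_number, pattern_name) for each legacy Java-Jmol marker."""
    hits = []
    for number, line in enumerate(html.splitlines(), start=1):
        for name, pattern in LEGACY_PATTERNS.items():
            if pattern.search(line):
                hits.append((number, name))
    return hits


# A small stand-in for the old page, built from the quoted fragments.
sample = '''<script>jmolInitialize(".","JmolAppletSigned.jar");jmolSetLogLevel('0');</script>
<script src="JSmol.min.js" type="text/javascript"> </script>
<script>jmolButton('load "7-c2-h-020.jvxl";isosurface "" opaque; zoom 120;',"rho(r) H")</script>'''

for number, name in find_legacy_jmol(sample):
    print(f"line {number}: {name}")
```

The sketch only reports where the old calls occur; deciding what replaces each one remains the manual analysis described in step 8.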

It might be interesting to speculate how long the curated table will last before it too needs further curation. There are some specifics in the files which might be a cause for worry, namely the so-called JVXL isosurfaces which are displayed. These are currently only supported by Jmol/JSmol. They were originally deployed because iso-surfaces tend to be quite large datafiles and JVXL used a remarkably efficient compression algorithm (“marching cubes”) which reduces their size ten-fold or more. Should JSmol itself become non-operational at some time in the (hopefully) far future (which we take to be ~10 years!) then a replacement for the display of JVXL will need to be found. But the chances are that the table itself will decay “gracefully”, with the HTML components likely to outlive most of the other features. The data repository quoted above has itself now been available for ~12 years and it too is expected to survive in some form for perhaps another 10. Beyond that period, no-one really knows what will still remain. 

You may well ask why the traditional journal model of printing articles on paper, which has survived some 350 years now, is being replaced by one which struggles to survive 10 years without expensive curation. Obviously, a 3D interactive display is not possible on paper. But one also hears that publishers are increasingly dropping printed versions entirely. One presumes that the XML content will be assiduously preserved, but re-working (transforming, as in XSLT) any particular flavour of XML into another publisher’s systems is also likely to be expensive. Perhaps in the future the preservation of 100% of all currently published journals will indeed become too expensive and we might see some of the less important ones vanishing for ever?


‡ Nowadays it is necessary to configure your system or Web browser to allow even signed valid Java applets to operate. Thus in the Safari browser (which still allows Java to operate; other popular browsers such as Chrome and Firefox have recently removed this ability), one has to go to preferences/security/plugin-settings/Java, enter the URL of the site hosting the applet and set it to either “ask” (when a prompt will always appear asking if you want to accept the applet) or “on” (when it will always do so). How much longer this option will remain in this browser is uncertain.

In the area of chemistry, an early pioneer was the Internet Journal of Chemistry, where the presentation of the content took full advantage of Web-technologies and was on-line only. It no longer operates and the articles it hosted are gone.

References

  1. H.S. Rzepa, "Wormholes in chemical space connecting torus knot and torus link π-electron density topologies", Phys. Chem. Chem. Phys., vol. 11, pp. 1340-1345, 2009. https://doi.org/10.1039/b810301a

How to stop (some) acetals hydrolysing.

Thursday, November 12th, 2015

Derek Lowe has a recent post entitled “Another Funny-Looking Structure Comes Through”. He cites a recent medchem article[1] in which the following acetal sub-structure appears in a promising drug candidate (blue component below). His point is that orally taken drugs have to survive acid (green below) encountered in the stomach, and acetals are famously sensitive to hydrolysis (red below). But if X=NH2, compound “G-5555” is apparently stable to acids.[1] So I pose the question here: why?

acetal

This reminded me of some work we did a few years ago on herbicides containing such an acetal substructure, where one diastereoisomer was very unstable to hydrolysis (and hence did not have the lifetime required of a herbicide) whereas the other diastereoisomer was far less labile and hence more suitable.[2],[3] Crystal structures (below) revealed that the two C-O bond lengths of the labile form were very unequal (Δ=0.043Å), whereas the stable form had two equal C-O lengths (1.408Å, Δ=0.0Å).

KAWYOW

KAWYEM

A search of the Cambridge Structural Database (CSD) surprisingly reveals no hits for molecules containing the (blue) substructure in which X=NH2, but there is one example[4],[5] of an orthoformate in which the group equivalent to X is protonated as Me2NH+. For this example, all three C-O lengths (1.405, 1.402, 1.396Å) are shorter than even those of the hydrolytically stable herbicide above. The distribution for 6-ring acetals in general shows hot-spots at ~1.415Å and 1.43Å (but sadly it is not possible to use this database to correlate these lengths with the aqueous stability of the entries).

OCO

Is this tentative further evidence that a group X = NH2 positioned as above in an acetal can inhibit its hydrolysis?

HUZKEZ

Time for calculations. A model (X=R=H) for the hydrolysis was constructed as above in which proton transfer from an acid (ethanoic) is achieved via a cyclic 8-ring transition state and which includes a continuum solvent field as ωB97XD/6-311G(d,p)/SCRF=water and one explicit water in the proton relay. The IRC looks thus:

acetalH

This shows that the first event is protonation of an oxygen, closely followed by cleavage of the associated C-O bond, and ending with deprotonation of the erstwhile water molecule.

acetalha

The value of ΔG298 is 38.2 kcal/mol (38.4 in relative total energy). Although rather high for a facile thermal reaction (perhaps the 8-ring TS is a bit too strained; possibly adding a second active water molecule to form a 10-ring might lead to a lower barrier?), we are more interested in the effect upon this barrier of group X (Table below).
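To put the magnitude of this barrier in context, the Eyring equation k = (kBT/h)·exp(−ΔG‡/RT) can be applied. The following is a rough Python sketch only: it assumes a transmission coefficient of 1 and treats the model reaction as if it were first order, neither of which is claimed in the text, and the function name eyring_rate is hypothetical.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant, J/K
H = 6.62607015e-34    # Planck constant, J s
R = 1.987204e-3       # gas constant, kcal/(mol K)


def eyring_rate(delta_g_kcal, temperature=298.15):
    """Rate constant (s^-1) from a free energy barrier in kcal/mol.

    Assumes a transmission coefficient of 1 (an assumption of this sketch).
    """
    prefactor = K_B * temperature / H
    return prefactor * math.exp(-delta_g_kcal / (R * temperature))


k = eyring_rate(38.2)  # barrier for X=H quoted above
half_life_years = math.log(2) / k / (3600 * 24 * 365.25)
print(f"k = {k:.1e} s^-1, half-life = {half_life_years:.1e} years")
```

Under these assumptions the half-life comes out in the range of tens of millions of years, which is why the absolute value is best regarded as an artefact of the simple model and only the trend with X is interpreted.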

X            ΔE (kcal/mol)   ΔG (kcal/mol)   DataDOI,TS   DataDOI,IRC
H            38.4            38.2            [6]          [7]
NH2,eq       39.8            38.8            [8]          [9]
NH3.Cl,eq    45.1            43.1            [10]         [11]
NH3.Cl,ax    42.6            41.5            [12]         [13]
CF3,eq       41.9            40.1            [14]         [15]
SF5,eq       43.6            42.4            [16]         [17]
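The differences between the entries are easier to appreciate as relative rates. The sketch below takes the ΔG column of the table above and converts each difference from X=H into an approximate retardation factor exp(ΔΔG/RT) at 298.15 K. It assumes the same pre-exponential factor for every X, and the function name retardation is hypothetical.

```python
import math

RT = 1.987204e-3 * 298.15  # kcal/mol at 298.15 K

# Free energy barriers (kcal/mol) copied from the table above.
delta_g = {
    "H": 38.2,
    "NH2,eq": 38.8,
    "NH3.Cl,eq": 43.1,
    "NH3.Cl,ax": 41.5,
    "CF3,eq": 40.1,
    "SF5,eq": 42.4,
}


def retardation(barrier, reference):
    """Approximate k(reference)/k(X), assuming equal pre-exponential factors."""
    return math.exp((barrier - reference) / RT)


for x, barrier in delta_g.items():
    ddg = barrier - delta_g["H"]
    factor = retardation(barrier, delta_g["H"])
    print(f"{x:10s} ΔΔG = {ddg:4.1f} kcal/mol, k(H)/k(X) ≈ {factor:.0f}")
```

On this basis the 4.9 kcal/mol increase for equatorial NH3.Cl corresponds to a slowing of the hydrolysis by a factor of several thousand.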

Introduction of X=NH3+.Cl into an (equatorial) position which is antiperiplanar to the C-O bonds of the acetal produces a modified IRC profile. The barrier measured at a point IRC = -10 is ~41 kcal/mol, which is noticeably higher than for X=H. In fact the final barrier is even higher, since the reactant goes on to form a hydrogen bond between the water molecule and the Cl, an extra stabilisation not present with X=H (and so not really appropriate to include in the comparison).

acetal-NH3Cl

acetalnh3cl-eqa

Placing the X=NH3+.Cl into an (axial) position which is not antiperiplanar to the C-O bonds shows a lower barrier compared to the equatorial isomer. This difference can also be illustrated by the NBO localised orbital energies of the two reactants. With X=NH3+.Cl axial, the lone pair on the oxygen being protonated by the acid has an energy of -0.464 au, whereas the equatorial equivalent is a “less reactive” -0.471 au (a difference in energy of 4.4 kcal/mol, which is VERY approximately related to the effects being discussed).
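The conversion of the orbital energy difference from atomic units can be checked in a couple of lines. This is just the arithmetic behind the figure quoted above, using the standard factor of about 627.51 kcal/mol per Hartree; the variable names are illustrative.

```python
HARTREE_TO_KCAL_MOL = 627.5095  # standard conversion factor

axial_lone_pair = -0.464        # au, X=NH3+.Cl axial
equatorial_lone_pair = -0.471   # au, X=NH3+.Cl equatorial

difference = (axial_lone_pair - equatorial_lone_pair) * HARTREE_TO_KCAL_MOL
print(f"{difference:.1f} kcal/mol")  # prints "4.4 kcal/mol"
```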

I conclude that the inhibition of acetal solvolysis is induced by the presence of an electron withdrawing group X, via antiperiplanar effects on the basicity of the acetal oxygen. At moderately low pH, X=NH2 is likely to be fully protonated; in this state, X=NH3+.Cl is an even better electron withdrawing group. The effect is also much stronger if X = equatorial. So one can predict here that if the alternate stereoisomer with X = axial were to be synthesised, it would hydrolyse more quickly. Other groups (X=F, CN etc) would probably show similar behaviour.


I have added two further entries, X=CF3 and X=SF5 in the table above, showing the latter to be more effective at inhibiting hydrolysis.

References

  1. C.O. Ndubaku, J.J. Crawford, J. Drobnick, I. Aliagas, D. Campbell, P. Dong, L.M. Dornan, S. Duron, J. Epler, L. Gazzard, C.E. Heise, K.P. Hoeflich, D. Jakubiak, H. La, W. Lee, B. Lin, J.P. Lyssikatos, J. Maksimoska, R. Marmorstein, L.J. Murray, T. O’Brien, A. Oh, S. Ramaswamy, W. Wang, X. Zhao, Y. Zhong, E. Blackwood, and J. Rudolph, "Design of Selective PAK1 Inhibitor G-5555: Improving Properties by Employing an Unorthodox Low-pKa Polar Moiety", ACS Medicinal Chemistry Letters, vol. 6, pp. 1241-1246, 2015. https://doi.org/10.1021/acsmedchemlett.5b00398
  2. P. Camilleri, D. Munro, K. Weaver, D.J. Williams, H.S. Rzepa, and A.M.Z. Slawin, "Isoxazolinyldioxepins. Part 1. Structure–reactivity studies of the hydrolysis of oxazolinyldioxepin derivatives", J. Chem. Soc., Perkin Trans. 2, pp. 1265-1269, 1989. https://doi.org/10.1039/p29890001265
  3. P. Camilleri, D. Munro, K. Weaver, D.J. Williams, H.S. Rzepa, and A.M.Z. Slawin, "Isoxazolinyldioxepins. Part 1. Structure–reactivity studies of the hydrolysis of oxazolinyldioxepin derivatives", J. Chem. Soc., Perkin Trans. 2, pp. 1929-1933, 1989. https://doi.org/10.1039/p29890001929
  4. C. Beckmann, P.G. Jones, and A.J. Kirby, "CCDC 209989: Experimental Crystal Structure Determination", 2003. https://doi.org/10.5517/cc71hvl
  5. C. Beckmann, P.G. Jones, and A.J. Kirby, "N,N,N′,N′-Tetramethylstreptamine 2,4,6-orthoformate hydrochloride", Acta Crystallographica Section E Structure Reports Online, vol. 59, pp. o566-o568, 2003. https://doi.org/10.1107/s1600536803006287
  6. H.S. Rzepa, "C 6 H 14 O 5", 2015. https://doi.org/10.14469/ch/191581
  7. H.S. Rzepa, "Gaussian Job Archive for C6H14O5", 2015. https://doi.org/10.6084/m9.figshare.1599751
  8. H.S. Rzepa, "C 6 H 15 N 1 O 5", 2015. https://doi.org/10.14469/ch/191582
  9. H.S. Rzepa, "C6H15NO5", 2015. https://doi.org/10.14469/ch/191586
  10. H.S. Rzepa, "C 6 H 16 Cl 1 N 1 O 5", 2015. https://doi.org/10.14469/ch/191584
  11. H.S. Rzepa, "C6H16ClNO5", 2015. https://doi.org/10.14469/ch/191588
  12. H.S. Rzepa, "C 6 H 16 Cl 1 N 1 O 5", 2015. https://doi.org/10.14469/ch/191590
  13. H.S. Rzepa, "Gaussian Job Archive for C6H16ClNO5", 2015. https://doi.org/10.6084/m9.figshare.1601891
  14. H.S. Rzepa, "C 7 H 13 F 3 O 5", 2015. https://doi.org/10.14469/ch/191592
  15. H.S. Rzepa, "Gaussian Job Archive for C7H13F3O5", 2015. https://doi.org/10.6084/m9.figshare.1603088
  16. H.S. Rzepa, "C 6 H 13 F 5 O 5 S 1", 2015. https://doi.org/10.14469/ch/191595
  17. H.S. Rzepa, "Gaussian Job Archive for C6H13F5O5S", 2015. https://doi.org/10.6084/m9.figshare.1603420

Deviations from tetrahedral four-coordinate carbon: a statistical exploration.

Sunday, September 6th, 2015

An article entitled “Four Decades of the Chemistry of Planar Hypercoordinate Compounds”[1] was recently reviewed by Steve Bachrach on his blog, where you can also see comments. Given the recent crystallographic themes here, I thought I might try a search of the CSD (Cambridge Structural Database) to see whether anything interesting might emerge for tetracoordinate carbon.

The search definition is shown below using a simple carbon with four ligands, the ligands themselves also being tetracoordinate carbon. The search is restricted to data collected at temperatures below 140K, as well as R-factor <5%, no errors and no disorder. Cyclic species are allowed and a statistically reasonable 2773 hits emerged from the search.

Scheme

Recall that the idealised angle subtended at the centre is 109.47°. I show below three separate heat plots of the search results. Why three? The way the search software (ConQuest) works is that one could define four C-C distances and six angles, and then plot any combination of one distance and one angle. I show just three combinations here, but could have included many more.

There appear to be four distinct clusters of values for this angle that emerge from the three plots shown below (the “bin size” is 100, and the frequency colour code indicates how many hits there are in each bin).

  1. The hotspot is unsurprisingly ~109° with a corresponding C-C distance of ~1.54Å.
  2. There may be two clusters at angles of ~60° (cyclopropane), with C-C values ranging from ~1.47 to ~1.55Å.
  3. A collection at ~90° (mostly cyclobutane?), with C-C values up to 1.6Å.
  4. A collection at ~140° (again small rings), now with much shorter C-C values of ~1.46Å. This is reminiscent of the approximation that the hybridisation in e.g. cyclopropane is a combination of sp5 and sp3.

Scheme

Scheme

Scheme

Ideally, what one might want to plot would be sums of four angles; for a pure tetrahedral carbon the sum would always be 438° (4*109.47°) but for a pure planar carbon it could be as low as 360° (4*90°). One could then see how closely the distribution approaches the latter and hence reveal whether there are any true planar tetracoordinate carbon species known. Although the ConQuest software cannot analyse in such terms, a Python-based API has recently been released that should allow this to be done, although I should state that this requires a commercial license and it is not open access code. If we manage to get it working, I will report!
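The two limiting values quoted above can be reproduced from idealised coordinates. The following Python sketch does not use the CSD API; it assumes that the "sum of four angles" means the four smallest of the six ligand-C-ligand angles (an interpretation made here, since a planar centre also has two 180° trans angles), and the function names are hypothetical.

```python
import math
from itertools import combinations


def angle_deg(u, v):
    """Angle in degrees between two bond vectors from the central carbon."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))  # guard rounding
    return math.degrees(math.acos(cosine))


def smallest_four_angle_sum(ligand_vectors):
    """Sum of the four smallest of the six angles at a four-coordinate centre."""
    angles = sorted(angle_deg(u, v) for u, v in combinations(ligand_vectors, 2))
    return sum(angles[:4])


# Idealised geometries, carbon at the origin.
tetrahedral = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
square_planar = [(1, 0, 0), (0, 1, 0), (-1, 0, 0), (0, -1, 0)]

print(f"tetrahedral: {smallest_four_angle_sum(tetrahedral):.2f}")
print(f"planar:      {smallest_four_angle_sum(square_planar):.2f}")
```

Applied to the four ligand vectors of each crystallographic hit, the same sum would place every structure on a single scale running from the planar limit of 360° to the tetrahedral value of about 438°.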


As a teaser I also include a plot of six-coordinate carbon, in which the ligands can be any non-metal. Note the clusters at angles of 60, ~112 and ~120-130°. It is worth pointing out that the definition of the connection between a carbon and a ligand as a “bond” becomes increasingly arbitrary as the coordination becomes “hyper”. Because crystallography does not measure electron densities in “bonds”, we know nothing of its topology in this region. It is therefore quite possible that the appearance of the heat plot below might be related just as much to whatever convention is being used in creating the entry in the CSD as it would be to a quantum analysis of the bonding.

Scheme

References

  1. L. Yang, E. Ganz, Z. Chen, Z. Wang, and P.V.R. Schleyer, "Four Decades of the Chemistry of Planar Hypercoordinate Compounds", Angewandte Chemie International Edition, vol. 54, pp. 9468-9501, 2015. https://doi.org/10.1002/anie.201410407

Monastral: the colour of blue

Tuesday, March 8th, 2011

The story of Monastral is not about a character in the Magic Flute, but is a classic of chemical serendipity, collaboration between industry and university, theoretical influence, and of much else. Fortunately, much of that story is actually recorded on film (itself a unique archive dating from 1933 and being one of the very first colour films in existence!). Patrick Linstead, a young chemist then (he eventually rose to become rector of Imperial College), tells the story himself here. It is well worth watching, if only for its innocent social commentary on the English class system (and an attitude to laboratory safety that should not be copied nowadays). Here I will comment only on its colour and its aromaticity.

Copper phthalocyanine

In 1933, Hückel was still thinking about his molecular orbital electronic theory of benzene, but for ~15 years, there remained little need for the rule we now know as 4n+2, because n was invariably equal to 1 for most known aromatic molecules! It was only the discovery of so-called non-benzenoid aromatics in the 1940s (e.g. Dewar’s tropolone structure) that propelled chemists to identify aromatic molecules with other values of n. And Monastral blue is a prime example of n=4 (although it would be of interest to find out when it became so associated with the Hückel rule). If you count the red bonds above, there are eight, along with one lone pair of electrons located on the highlighted (blue) nitrogen atom. This makes 18 π-electrons in the ring, or 4×4+2 (there are paths other than the one shown, but they give the same count). Part of the reason for the remarkable thermal stability of this molecule must be its aromaticity.
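The electron count described above can be written out explicitly. This is a small Python sketch of the arithmetic only, with the bond and lone pair counts taken from the paragraph above; the helper name huckel_n is hypothetical.

```python
def huckel_n(pi_electrons):
    """Return n if pi_electrons equals 4n+2 for an integer n >= 0, else None."""
    n, remainder = divmod(pi_electrons - 2, 4)
    return n if remainder == 0 and n >= 0 else None


double_bonds = 8   # the red bonds in the diagram
lone_pairs = 1     # on the highlighted (blue) nitrogen

pi_electrons = 2 * double_bonds + 2 * lone_pairs
print(pi_electrons, huckel_n(pi_electrons))  # prints "18 4"
```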

So what about the colour? The visible spectrum is shown below, with λmax ~ 610 and 710nm.

Visible absorption spectrum of copper phthalocyanine.

Well, a TD-DFT ωB97XD/6-31G(d) calculation reveals the following. This reproduces the band at 610nm very nicely, but leaves the identity of the band at 710nm mysterious. How does that originate? One might speculate that this could arise from the presence of another species. Thus copper phthalocyanine itself is neutral, but it could easily be oxidised to a cation, and this could then form a 1:1 π-complex with a second molecule of the neutral radical (DOI:10.1021/ja00238a021).
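Since calculated excitations are normally reported as energies, it can help to express the two observed band positions in the same units. The sketch below uses E = hc/λ with hc ≈ 1239.84 eV·nm; the function name is hypothetical and the wavelengths are the approximate λmax values quoted above.

```python
HC_EV_NM = 1239.84198  # Planck constant times speed of light, in eV nm


def wavelength_to_ev(wavelength_nm):
    """Photon energy in eV for a wavelength given in nm."""
    return HC_EV_NM / wavelength_nm


band_610 = wavelength_to_ev(610)
band_710 = wavelength_to_ev(710)
print(f"610 nm = {band_610:.2f} eV, 710 nm = {band_710:.2f} eV, "
      f"gap = {band_610 - band_710:.2f} eV")
```

The two bands are thus separated by roughly 0.3 eV, which gives a sense of how large a shift any second species or state would have to account for.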

The electronic excitation at ~610nm arises from the following MOs:

Orbital 147, the highest occupied MO (HOMO).

Orbital 148, the lowest unoccupied MO.

The unpaired electron in copper phthalocyanine occupies the following rather interesting orbital, which appears not to be involved in its blue colour.

Orbital 146. The singly occupied MO.

So, just as with mauveine, a mystery remains. The colour of Monastral blue is not monochromatic, in that it appears to be caused by two bands in the 600-700 region. Calculation however reveals it to have only one band at 610nm. What is the other one?