Archive for the ‘Chemical IT’ Category

Octopus publishing: dis-assembling the research article into eight components.

Friday, August 13th, 2021

In 2011, I suggested that the standard monolith that is the conventional scientific article could be broken down into two separate but interlinked components: the story or narrative of the article, and the data on which the story is based. Later, in 2018, the bibliography in the form of open citations was added as a distinct third component.[1] Here I discuss an approach that takes this even further, breaking the article down into as many as eight components and described as “Octopus publishing” for obvious reasons. These are:



  1. D. Shotton, "Funders should mandate open citations", Nature, vol. 553, p. 129, 2018.

Room-temperature superconductivity in a carbonaceous sulfur hydride!

Saturday, October 17th, 2020

The title of this post indicates the exciting prospect that a method of producing a room-temperature superconductor has finally been achieved.[1] This is only possible at enormous pressures, however: >267 gigapascals (GPa), or about 2.6 million atmospheres.



  1. E. Snider, N. Dasenbrock-Gammon, R. McBride, M. Debessai, H. Vindana, K. Vencatasamy, K.V. Lawler, A. Salamat, and R.P. Dias, "Room-temperature superconductivity in a carbonaceous sulfur hydride", Nature, vol. 586, pp. 373-377, 2020.

Exploiting the power of persistent identifiers (PIDs) for locating all kinds of research object.

Saturday, August 29th, 2020

The folks at DataCite have announced a new research object discovery service which aims to give users a “comprehensive overview of connections between entities in the research landscape”. The portal acts as the entry point for three basic types of persistent identifiers (PIDs):


A cascading tutorial in finding rich NMR data using the DataCite datasearch engine.

Saturday, April 11th, 2020

In the previous post, I introduced three of a new generation of search engines specialising in the discovery of data. Data has some special features which make searching for it slightly different from the conceptual (or natural language) searches we are used to performing for general information, and so a search engine specifically for data is inevitably going to reflect this. At the simplest level, the data search can retain much of the generic simplicity of a regular search, but to exploit the unique features of data, one really does have to move on to an advanced mode. Here, by introducing a set of search definitions that gradually increase in specificity and power, I hope to convey some of the flavour of one way in which this could be done.


New generations of globally aggregating search engines – for (chemical) data.

Tuesday, April 7th, 2020

Chemists have long been familiar with search engines that aspire to index a large proportion of the chemical literature. Think, for example, of the old-generation (and commercial) SciFinder (Scholar) and Reaxys, or those that arrived in the 2000s in the online era, such as the non-commercial PubChem or ChemSpider (there are more). But you may not be as familiar with the latest generation of global search engines, and here I will focus on three relatively new ones that specialise in tracking down data rather than just publications.


The Persistent Identifier ecosystem expands – to instruments!

Saturday, March 21st, 2020

A PID, or persistent identifier, has been in common use in scientific publishing for around 20 years now. It was introduced as a DOI (Digital Object Identifier), and the digital object in this case was the journal article. From 2000 onwards, DOIs started appearing for most journal articles, journals having obtained them from a registration agency, CrossRef. This is a not-for-profit organisation set up by a publishers' association for the purpose. Most readers of journal articles started to use this DOI as an easier way of navigating through the invariably different and sometimes confusing metaphors set up by any given journal to navigate through its issues. Readers slowly learnt to prepend the URL to the DOI to “resolve” it directly to what is known as the “landing page” of the article. More recently, the prefix recommendation has changed to the slightly shorter form. Few readers are aware, however, that the DOI can serve a much more interesting purpose than just taking you to the article landing page. This post will explore a few of these extras.
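
The "prepend a resolver URL to the DOI" step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not something taken from the post itself: it assumes the shorter resolver prefix is https://doi.org/ and that the older forms include http://dx.doi.org/, the function name doi_to_url is hypothetical, and the DOI in the usage line is a made-up placeholder rather than a real article.

```python
from urllib.parse import quote

# Assumed resolver prefix (the shorter, more recent form).
RESOLVER = "https://doi.org/"

# Prefixes a reader may already have attached to a DOI (assumed list).
KNOWN_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def doi_to_url(doi: str, resolver: str = RESOLVER) -> str:
    """Return a resolvable URL for a DOI by prepending the resolver prefix."""
    bare = doi.strip()
    lowered = bare.lower()
    # Strip any resolver or "doi:" prefix that is already present.
    for prefix in KNOWN_PREFIXES:
        if lowered.startswith(prefix):
            bare = bare[len(prefix):]
            break
    # Every DOI starts with the directory indicator "10.".
    if not bare.startswith("10."):
        raise ValueError(f"not a DOI: {doi!r}")
    # Percent-encode characters that are unsafe in a URL path, keeping "/".
    return resolver + quote(bare, safe="/")


# Hypothetical DOI, for illustration only.
print(doi_to_url("doi:10.1234/example"))
```

Opening the returned URL in a browser is what takes the reader to the landing page; the function itself makes no network request.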


A Non-nitrogen Containing Morpholine Isostere; an application of FAIR data principles.

Sunday, August 4th, 2019

In the Pipeline reports on an intriguing new ring system acting as an isostere for morpholine. I was interested in how the conformation of this ring system might be rationalised electronically and so I delved into the article.[1] Here I recount what I found.



  1. H. Hobbs, G. Bravi, I. Campbell, M. Convery, H. Davies, G. Inglis, S. Pal, S. Peace, J. Redmond, and D. Summers, "Discovery of 3-Oxabicyclo[4.1.0]heptane, a Non-nitrogen Containing Morpholine Isostere, and Its Application in Novel Inhibitors of the PI3K-AKT-mTOR Pathway", Journal of Medicinal Chemistry, vol. 62, pp. 6972-6984, 2019.

Metadata. Why?

Tuesday, July 2nd, 2019

I have had some interesting discussions recently regarding metadata. What emerges is that it can be quite a broadly defined concept and it is clear that a variety of answers might be obtained when asking the simple question “what is it useful for?” Here I set out some of my answers to that question.