Archive for the ‘crystal_structure_mining’ Category
Thursday, June 1st, 2017
Conformational polymorphism occurs when a compound crystallises in two polymorphs differing only in the relative orientations of flexible groups (e.g. Ritonavir). At the Beilstein conference, Ian Bruno mentioned another type: tautomeric polymorphism, where a compound can crystallise in two forms differing in the position of acidic protons. Here I explore three such examples.
The term occurs in the title of this article,[1] for a compound known as Omeprazole.

When the bottom structure (the 6-methoxy) is used to search the CSD, two separate series are found. The first of these is UDAVIF (DOI: 10.5517/ccp82qq, 6-Methoxy-2-((4-methoxy-3,5-dimethyl-2-pyridinyl)methylsulfinyl)-1H-benzimidazole). There is no information regarding the absolute configuration of the chiral S-centre. Although the downloaded coordinates show it as R it is probably a racemic mixture. A note added to the structure declares disorder: “Omeprazole exists as solid solutions of the two tautomers. The structure is mixed 5-methoxy/6-methoxy with occupancies 0.078:0.922“, which indicates 7.8% is present as in the upper structure above.

The second hit is VAYXOI (DOI: 10.5517/ccp82pp, rac-6-Methoxy-2-(((4-methoxy-3,5-dimethyl-2-pyridinyl)methyl)sulfinyl)-1H-benzimidazole) which now contains no disorder; the contaminating 5-methoxy tautomer is no longer present. Perhaps not quite a true tautomeric polymorph, since the 5-methoxy tautomer is never observed in pure form.
This does occur with a second example. DEBFAR[2] represents the keto form on the right, which crystallises from methanol, whilst YUYDOL, the enol form on the left, crystallises from n-hexane.

Calculations shed some light on this behaviour. DEBFAR has a computed (DOI: 10.14469/hpc/2591) dipole moment of 11D, whereas YUYDOL (DOI: 10.14469/hpc/2590) is 2.5D. In chloroform solutions (~half way between the two solvent polarities), the keto form is ~6.1 kcal/mol lower in ΔG than the enol. The crystal packing for the two forms is very different and the differences in this packing must clearly amount to >6.1 kcal/mol to over-ride the lesser stability of DEBFAR in solution.
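To give a sense of scale for a free energy difference of ~6.1 kcal/mol, here is a minimal sketch converting it into an equilibrium population ratio. It assumes a simple two-state Boltzmann equilibrium at 298.15 K; the temperature and the function name `population_ratio` are my assumptions for illustration and are not taken from the deposited calculations.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)

def population_ratio(delta_g_kcal, temperature=298.15):
    """Ratio [more stable]/[less stable] for a free energy gap in kcal/mol.

    Assumes a two-state equilibrium, K = exp(dG / RT).
    """
    return math.exp(delta_g_kcal / (R_KCAL * temperature))

# Keto form computed to be ~6.1 kcal/mol lower in free energy than the enol
ratio = population_ratio(6.1)
print(f"keto:enol ratio of roughly {ratio:.0f}:1")
```

On this estimate the enol would be present in solution at only a few parts in 100,000, which is why the packing differences have to be substantial for both forms to crystallise.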

The final example [3] is illustrated using scheme 2 from that article, one entitled tautomeric species of 4-hydroxynicotinic acid:

The original diagram has two unfortunate bond errors which are NOT reproduced above (and which perhaps are a good topic for discussion in tutorials with students), along with an unusual interpretation of the term tautomerism. The blue arrows above are mine and I suggest the isomerism between the connected species is resonance isomerism, and not tautomerism. So three possible different true tautomers then. Five crystal structures are reported which I list below.
- 10.5517/cctswjz (KUXPUP, 4-oxo-1,4-dihydropyridine-3-carboxylic acid, no H2O), 10.5517/ccdc.csd.cc1kfyxv (KUXPUP01 no H2O) and 10.5517/ccdc.csd.cc1kfyzx (KUXPUP02 no H2O)
- 10.5517/ccx59s4 (AVEMUK, 4-Oxo-1,4-dihydropyridine-3-carboxylic acid hemihydrate) and 10.5517/ccdc.csd.cc1kfz21 (AVEMUK01)
- 10.5517/ccdc.csd.cc1kfz54 (AKIHIN, 4-hydroxypyridin-1-ium-3-carboxylate monohydrate)
- 10.5517/ccdc.csd.cc1kfz10 (AKIHAF, 4-hydroxypyridin-1-ium-3-carboxylate)
KUXPUP and AVEMUK differ only in the presence of one solvent water molecule and both represent tautomer 2 above. AKIHIN and AKIHAF similarly represent tautomer 3 above; both are represented as 3a in the CSD and not as 3b. There are no examples of tautomer 1 in the crystal structure database; it may only exist in the gas phase. So the equilibrium 2 ⇌ 3 is another genuine example of tautomeric polymorphism, with the keto form favoured by more polar solvents, as was noted for the previous example.
With this last article,[3] comprehensive calculations at a good level were reported, including modelling the periodic cell using the Crystal program and including corrections such as BSSE (basis set superposition error) and dispersion terms. I was hopeful that this might lead me to something as simple as the computed dipole moments of the (isolated) species (as I reported above for the previous system), but these were not mentioned in the text of the article. Unfortunately, the supporting information also had no details of any such calculations, which left me frustrated again at how difficult it can be, in (it has to be said) the vast majority of articles that report calculations, to obtain the details of those calculations.
Tautomeric polymorphism remains a very rare phenomenon. SciFinder for example only has 19 references citing it (2 of which are to conference talks). Perhaps the most intriguing[4] claims that 2-thiobarbituric acid has the richest collection of tautomeric polymorphs with five. Since no calculations are reported there, I might try these out and report back here.
Postscript: Here is some analysis of 2-thiobarbituric acid.
- THBARB (DOI 10.5517/cctbxcd, 10.5517/cctbxfg and 10.5517/cctbxgh) are three polymorphs of the keto tautomer, the isolated molecule having a small calculated dipole moment (DOI: 10.14469/hpc/2632).

- PABNAJ (DOI: 10.5517/cctbxbc) is a polymorph in the enol form, with a much larger calculated dipole moment (DOI: 10.14469/hpc/2633)

- PABNIR (DOI: 10.5517/cctbxdf) is a mixed polymorph with one enol paired with one keto form.

The relative free energies of the isolated molecules are 0.0 (keto) and 9.0 (enol) kcal/mol. The keto-enol pair is 0.4 kcal/mol more stable than the isolated components. This again shows the effect that crystal packing can have on the relative energies and also shows that a simple inspection of the dipole moment may cast light on the polymorphism.
References
- P.M. Bhatt, and G.R. Desiraju, "Tautomeric polymorphism in omeprazole", Chemical Communications, pp. 2057, 2007. https://doi.org/10.1039/b700506g
- Y. Akama, M. Shiro, T. Ueda, and M. Kajitani, "Keto and Enol Tautomers of 4-Benzoyl-3-methyl-1-phenyl-5(2H)-pyrazolone", Acta Crystallographica Section C Crystal Structure Communications, vol. 51, pp. 1310-1314, 1995. https://doi.org/10.1107/s0108270194007389
- S. Long, M. Zhang, P. Zhou, F. Yu, S. Parkin, and T. Li, "Tautomeric Polymorphism of 4-Hydroxynicotinic Acid", Crystal Growth & Design, vol. 16, pp. 2573-2580, 2016. https://doi.org/10.1021/acs.cgd.5b01639
- M. Chierotti, L. Ferrero, N. Garino, R. Gobetto, L. Pellegrino, D. Braga, F. Grepioni, and L. Maini, "The Richest Collection of Tautomeric Polymorphs: The Case of 2‐Thiobarbituric Acid", Chemistry – A European Journal, vol. 16, pp. 4347-4358, 2010. https://doi.org/10.1002/chem.200902485
Monday, May 29th, 2017
Derek Lowe highlights a recent article[1] postulating CH⋅⋅⋅π interactions in proteins. Here I report a quick check using the small molecule crystal structure database (CSD).
The search query (DOI: 10.14469/hpc/2594) is shown below.
- The distance refers to that between the (normalised) position of a hydrogen on a 4-coordinated carbon atom and the centroid of a carbonyl group substituted with R=C or H.
- The angle is that subtended at the centroid. An approach orthogonal to the axis of the carbonyl group will have a value of 1.0 for the sine.
- The torsion relates to the angle between the H…centroid and C-R vectors. The absolute value is constrained to 70-110° to filter only approaches towards the π-system of the carbonyl.
- The search is further restricted to no disorder, no errors and R < 0.05.
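The distance and angle parameters in the list above can be sketched as follows. This is only an illustration using made-up coordinates for an idealised carbonyl group and hydrogen; the helper names are hypothetical and this is not how the CSD search software computes them internally.

```python
import math

def centroid(*points):
    """Mean position of a set of 3D points."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))

def sine_of_angle(a, vertex, b):
    """Sine of the angle a-vertex-b (always in the range 0..1)."""
    u = [p - q for p, q in zip(a, vertex)]
    v = [p - q for p, q in zip(b, vertex)]
    cos_t = sum(x * y for x, y in zip(u, v)) / (math.hypot(*u) * math.hypot(*v))
    cos_t = max(-1.0, min(1.0, cos_t))  # guard against rounding
    return math.sqrt(1.0 - cos_t * cos_t)

# Hypothetical coordinates in Angstrom: C=O along x, H directly above the mid-point
carbon = (0.0, 0.0, 0.0)
oxygen = (1.22, 0.0, 0.0)
hydrogen = (0.61, 0.0, 2.5)

mid = centroid(carbon, oxygen)
h_to_centroid = math.dist(hydrogen, mid)          # the "distance" parameter
sine = sine_of_angle(hydrogen, mid, oxygen)       # 1.0 for an orthogonal approach
print(h_to_centroid, sine)
```

For this idealised geometry the sine is 1.0; tilting the hydrogen towards either end of the C=O bond reduces it.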

The two most interesting hits, both revealing short distances and ~orthogonal approaches to the π-system are:
Remember however that such “outliers” must always be carefully inspected. There are more numerous interactions in the region 2.4-2.6Å with a sine(angle) of >0.9 and a close orthogonal approach to the π-system (green dots) which probably qualify for the title above. There seem to be many interesting but still putative small-molecule candidates for this proposed interaction postulated for proteins.
Postscript: Here are the results of the search above with R = any of H,C,N,O,F,Cl up to values of the distance <2.4Å, which show a range of interesting (green) points.
References
- F.A. Perras, D. Marion, J. Boisbouvier, D.L. Bryce, and M.J. Plevin, "Observation of CH⋅⋅⋅π Interactions between Methyl and Carbonyl Groups in Proteins", Angewandte Chemie International Edition, vol. 56, pp. 7564-7567, 2017. https://doi.org/10.1002/anie.201702626
Saturday, May 6th, 2017
Mention carbon dioxide (CO2) to most chemists and its properties as a metal ligand are not the first aspect that springs to mind. Here I thought I might take a look at how it might act as such.
There are up to five binding modes with one metal that one might envisage:
- Bonded interaction with the metal via just one oxygen atom,
- Bonded interaction via just the central carbon atom,
- Bonded interaction via the π-face of one C=O double bond,
- A weaker non-bonded interaction via carbon, or
- via oxygen.
Search queries of the Cambridge structure database (CSD) for these five modes are illustrated below (dataDOI: 10.14469/hpc/2524), with the constraints being applied to how many bonds (of unspecified type) each atom carries, along with no disorder and no errors. Thus query 1 is constrained by 1-coordination on one oxygen, and two on the carbon and other oxygen.

- This query yields four hits: 10.5517/ccvcdq9, 10.5517/cc12nq6n, 10.5517/cc12nq5m, 10.5517/cc12nq4l. The angle subtended at the central carbon of the CO2 ranges from 172-176°, a very modest bending of the linear CO2. There are no examples where the metal is bonded to both oxygens.

- The next category involves the metal binding just to the central carbon. Two examples are known, differentiated from O-coordination by a more acute angle at the central carbon of 121-132°.

- The π-coordinated type requires a slightly more complex search query, shown below. The π-complex is defined as adding one coordination to each of one oxygen and the carbon.

This reveals 16 examples:

The sine of the angle subtended at the centroid of one C-O bond shows that for most of the examples, the metal is close to perpendicular to this bond. The angle subtended at the central carbon ranges from 128-138°, rather larger than the examples where the metal is bound just to the carbon. I have picked these two for illustration. The first (dataDOI: 10.5517/cc86r17) contains both CO2 and CO coordinated to the metal.
This one (dataDOI: 10.1021/ic101652e) contains a short metal-centroid distance of 1.78Å (as also does 10.5517/ccz34kr).

There are two examples where BOTH π-CO bonds are coordinated to a metal; 10.5517/ccqlv7c and 10.5517/ccqlv8d (Ni-centroid distance 1.9Å) but these are intriguing because the two π-complexes are co-planar and not orthogonal.

- The final two cases are defined in the CSD database by having not so much bonds between metal and either C or O, as close intermolecular contacts typical of e.g. hydrogen bonds. This one (dataDOI: 10.5517/cc12nq9r) is to Fe, with a metal-C distance of 2.87Å which is significantly shorter than the anticipated sum of the van der Waals radii of the two atoms.
The next (dataDOI: 10.5517/cc12npn2) has a close approach of Co to O of 2.23Å. The angles subtended at the carbon range from 174-180°. There are no convincing examples of close non-bonded approaches of the metal to both oxygen atoms simultaneously.
It is striking that the searches (as defined above) reveal relatively few examples. This might simply be a result of how the compounds are indexed in the CSD, reflected in the coordination constraints applied in the searches. Nevertheless, we see three quite different types of ligand-metal coordination in which bonds can be said to form and a more diffuse spectrum of weaker interactions to carbon dioxide. As a metal ligand, it is certainly interesting! Several deserve their wavefunctions looked at and I might report back on this aspect.
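Since the searches above are distinguished mainly by the number of bonds carried by each atom of the CO2 unit, the classification can be sketched as a small lookup. The coordination numbers for mode 1 and the π-mode follow the descriptions given above; those for the C-bound mode (3-coordinate carbon, both oxygens 1-coordinate) are my inference, and the function name is hypothetical.

```python
def classify_co2_mode(c_coord, o_coords):
    """Classify a CO2 unit by the number of bonds on C and on each O.

    c_coord  : number of bonds carried by the carbon
    o_coords : pair with the number of bonds carried by each oxygen
    """
    o_sorted = tuple(sorted(o_coords))
    if c_coord == 2 and o_sorted == (1, 2):
        return "O-bound"          # metal attached to one oxygen only
    if c_coord == 3 and o_sorted == (1, 1):
        return "C-bound"          # metal attached to carbon only (inferred counts)
    if c_coord == 3 and o_sorted == (1, 2):
        return "pi-bound"         # metal adds one bond to C and to one O
    if c_coord == 2 and o_sorted == (1, 1):
        return "free or non-bonded contact"  # needs a separate distance check
    return "unclassified"

print(classify_co2_mode(2, (1, 2)))
print(classify_co2_mode(3, (2, 1)))
```

The last two binding modes cannot be told apart from free CO2 by coordination number alone, which is why they rely on close-contact distances instead.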
Friday, April 28th, 2017
Research data (and its management) is rapidly emerging as a focal point for the development of research dissemination practices. An important aspect of ensuring that such data remains fit for purpose is identifying what curation activities need to be associated with it. Here I revisit one particular case study associated with the molecular structure of a product identified from a photolysis reaction[1] and the curation of the crystallographic data associated with this study.
This particular dataset (CSD, dataDOI: 10.5517/cctnx5j) is associated with an article entitled “Single-Crystal X-ray Structure of 1,3-Dimethylcyclobutadiene by Confinement in a Crystalline Matrix“.[1] Data for crystal structures supporting a research article is required (at least in part) to be deposited into the Cambridge structure database (in this case with the internal reference MUWMEX), where a significant level of curation is performed. Although the definition of the term curation has evolved over the last few years, here I take it to include the following:
- Identification of appropriate metadata describing the data. For molecules, this would include any identifiers such as the name of the molecule and the connectivities of the atoms constituting that molecule.
- The submission of this metadata to a suitable aggregator, such as e.g. DataCite and its inclusion in any other databases associated with the data. These two tests are part of the FAIR data guidelines[2], covering the F (findable) and A (accessible).
- Performing any validation tests for the data that can be identified. With crystal structure data in CIF format, this is defined by the utility checkCIF and helps to ensure the I (inter-operable) of FAIR. The R refers in part to the licenses under which the data can be re-used.
On (it has to be said rare) occasions, these procedures can lead to a disparity between the authors’ conclusions arrived at on the basis of their acquired data and the metadata identified by the independent curators. This difference is most obviously illustrated in this case study by the chemical names inferred by the curation process for the structure represented by the data in the CSD:
- chemical name: “tetrakis(Guanidinium) 25,26,27,28-tetrahydroxycalix(4)arene-5,11,17,23-tetrasulfonate 1,5-dimethyl-2-oxabicyclo[2.2.0]hex-5-en-3-one clathrate trihydrate“
- chemical name synonym: “tetrakis(Guanidinium) tetra-p-sulfocalix(4)arene 1,3-dimethylcyclobutadiene carbon dioxide clathrate trihydrate“.
Only the synonym agrees with the title given by the original authors in their publication.[1] One might indeed strongly argue that these two names are not in fact synonyms, since they refer to quite different chemical structures with different atom connectivities. A search of the database for the sub-structure corresponding to 1,3-dimethylcyclobutadiene does not reveal any hits and so the information implied by this synonym is not recorded in the index created for the CSD database.
I asked the scientific editors of the CSD for some guidance on the curation procedures applied to crystal structure datasets and they have kindly allowed me to quote some of this.
- “In cases such as this, we as editors are sometimes faced with conflicting information and have to try our best to strike a balance between the data presented in the CIF, a published interpretation and our knowledge based on the information already in the CSD”.
- “In areas where there is a particular conflict between these, we often would include a comment (usually in the Remarks or Disorder field as appropriate)”. For this particular dataset, one finds the following under the Disorder field:
- “Under UV radiation the clathrated pyrone molecule converts to a disordered mixture of square-planar 1, 3-dimethylcyclobutadiene and rectangular-bent 1, 3-dimethylcyclobutadiene in van der Waals contact with a carbon dioxide molecule. The ratio of the square-planar to rectangular-bent 1, 3-dimethylcyclobutadiene clathrate is modelled with occupancies 0.6292:0.3708”.
- It is not entirely obvious however whether this last comment originates from the original authors or from the data curators. It does not resolve the difference between the assigned chemical name and the indicated chemical name synonym.
- “In the case of MUWMEX, I think that the editor produced a diagram (below) which seems chemically reasonable based on the crystallographic data with which we were provided and tried to cover the situation regarding disorder, van der Waals contacts etc in the ‘Disorder’ field. At this point, it is left to the CSD user to decide for themselves.”

We have arrived at a point where the CSD user must indeed decide what the species described by this dataset actually is. Ideally, the best recourse would be to acquire the original data in full and repeat the crystallographic analysis. This is an aspect of the curation of crystallographic data that is not conducted as part of the current processes, which would require as a minimum a superset known as the hkl information to be present in the data. Again, to quote the CSD scientific editors:
- “With regard to your question: Is there any mechanism in the Conquest search to identify structures where the hkl information is present? I understand that it is not currently possible to do this in ConQuest. It is, however, possible … to access structure factor data (where available) using Access Structures.”
For MUWMEX, the hkl information is not present in the CSD dataset; in 2010, when the structure was published, it would have had to be obtained directly from the authors. By 2016 however, its presence in deposited datasets was becoming far more common. It is worth pointing out that even the hkl information is not the complete data recorded for the experiment. That is represented by the original image files recording the X-ray diffraction. These are hardly ever available as FAIR data even nowadays.
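For a CIF file already in hand, a rough check for the presence of reflection data can be sketched as below. It is a heuristic only: it assumes the data would appear either as the `_shelx_hkl_file` text block that recent SHELXL versions embed, or as a reflection loop using the core CIF item `_refln_index_h`. The function name is hypothetical and this is not a ConQuest or CCDC facility.

```python
# Data names that commonly signal embedded reflection (hkl) data in a CIF
HKL_TAGS = ("_shelx_hkl_file", "_refln_index_h")

def has_reflection_data(cif_text):
    """Return True if the CIF text appears to carry hkl information."""
    lowered = cif_text.lower()
    return any(tag in lowered for tag in HKL_TAGS)

example_with_hkl = "data_example\n_shelx_hkl_file\n;\n   1   0   0   10.00    1.00\n;\n"
example_without = "data_example\n_cell_length_a 10.0\n"

print(has_reflection_data(example_with_hkl))
print(has_reflection_data(example_without))
```

A proper check would parse the CIF rather than search the text, but even this simple test distinguishes coordinates-only depositions from those that allow the refinement to be repeated.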
I hope I have here illustrated at least some of the challenging aspects of curating scientific data and the issues that can arise when derived metadata (in this case the name and the atom connectivities of a molecule) reveal conflicts with the original interpretations. This is for an area of chemistry where both data deposition and its curation are very mature, having operated for ~52 years now. It is still a process that requires the intervention of skilled curators of the data, but perhaps even more importantly it reveals the need to identify even more strictly what the provenance of the interpretations is. Should the CSD curation rest merely at the stage of teasing out and flagging inconsistencies and allowing the user to then take over to resolve the conflicts? Should it be more active, in re-analyzing data for each entry where conflicts have been detected? Perhaps the latter is not practical now, but it might be in the near future. What is certain is that with increasing availability of FAIR data these sorts of issues will increasingly come to the fore. And not just for the very well understood case of crystallographic data but for many other types of data.
References
- Y. Legrand, A. van der Lee, and M. Barboiu, "Single-Crystal X-ray Structure of 1,3-Dimethylcyclobutadiene by Confinement in a Crystalline Matrix", Science, vol. 329, pp. 299-302, 2010. https://doi.org/10.1126/science.1188002
- M.D. Wilkinson, M. Dumontier, I.J. Aalbersberg, G. Appleton, M. Axton, A. Baak, N. Blomberg, J. Boiten, L.B. da Silva Santos, P.E. Bourne, J. Bouwman, A.J. Brookes, T. Clark, M. Crosas, I. Dillo, O. Dumon, S. Edmunds, C.T. Evelo, R. Finkers, A. Gonzalez-Beltran, A.J. Gray, P. Groth, C. Goble, J.S. Grethe, J. Heringa, P.A. ’t Hoen, R. Hooft, T. Kuhn, R. Kok, J. Kok, S.J. Lusher, M.E. Martone, A. Mons, A.L. Packer, B. Persson, P. Rocca-Serra, M. Roos, R. van Schaik, S. Sansone, E. Schultes, T. Sengstag, T. Slater, G. Strawn, M.A. Swertz, M. Thompson, J. van der Lei, E. van Mulligen, J. Velterop, A. Waagmeester, P. Wittenburg, K. Wolstencroft, J. Zhao, and B. Mons, "The FAIR Guiding Principles for scientific data management and stewardship", Scientific Data, vol. 3, 2016. https://doi.org/10.1038/sdata.2016.18
Monday, April 17th, 2017
Following on from my re-investigation of close hydrogen bonding contacts to the π-face of alkenes, here now is an updated scan for H-bonds to alkynes. The search query (dataDOI: 10.14469/hpc/2478) is similar to the previous one:
- QA is any of N,O,F,Cl.
- X is any atom, including metals and non-metals.
- The carbon atoms are both specified as 2-coordinate, and the C-C bond type as any.
- The distance is from the hydrogen (normalised) to the C-C centroid, restricted to < 2.5Å to capture just the shortest examples.
- The mean of the sines of the two angles subtended at the centroid is calculated to indicate whether the approach is orthogonal.
- The mean of the absolute value of the sines of the two angles subtended at each carbon is calculated to indicate how non-linear the X-C-C angle is.
- Other constraints are no disorder, no errors and R < 0.05.
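The two sine-based measures in the list above reduce to simple arithmetic, sketched here with hypothetical angle values (the function names are mine, not those of the search software). The first measure approaches 1.0 for an orthogonal approach of the hydrogen; the second is 0.0 for a perfectly linear X-C-C unit and grows as it bends.

```python
import math

def mean_sine(angles_deg):
    """Mean of the sines of a set of angles given in degrees."""
    return sum(math.sin(math.radians(a)) for a in angles_deg) / len(angles_deg)

def mean_abs_sine(angles_deg):
    """Mean of the absolute sines of a set of angles given in degrees."""
    return sum(abs(math.sin(math.radians(a))) for a in angles_deg) / len(angles_deg)

# Hypothetical H...centroid...C angles for a near-perpendicular approach
orthogonality = mean_sine([88.0, 92.0])

# Hypothetical X-C-C angles of 173 degrees at both alkyne carbons
bend = mean_abs_sine([173.0, 173.0])

print(round(orthogonality, 3), round(bend, 3))
```

A bend of ~173° at each carbon therefore corresponds to a value of about 0.12 on this scale.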

First the intermolecular hits (38). Prominent short examples include:
In most of the stronger examples (blue), the approach of the hydrogen is perpendicular to the C-C bond centroid (X-axis of plot above). Many however exhibit significant bending (Y-axis of plot above) from linearity at the two carbons (~173°), mostly away from the H but in some examples towards the H!
Selected entries from the intra-molecular search (34 hits) are shown below. Perhaps due to the intra-molecular nature, the angle of approach of the H is more variable than the intermolecular examples and the bending of the erstwhile X-C-C angle is again prominent.
ωB97XD/Def2-TZVPP calculations were carried out for one intermolecular example, ICUTAC (two molecules, dataDOI: 10.14469/hpc/2482), and one intramolecular case, KIXFOO (dataDOI: 10.14469/hpc/2481). For the former, crystal packing compressions perhaps provide some shortening of the hydrogen bond and the molecule also includes an example of a short C-H to π interaction (obs[3] 2.63Å).

What is noticeable from reading the abstracts of the articles cited above is that these hydrogen bonds are rarely commented upon by the authors and it does seem that most of these close contacts are serendipitous (they were not designed). All are somewhat longer than the shortest distances encountered for alkenes and it would be interesting to establish if this is an intrinsic property of the triple bond or whether less effort has hitherto been expended on designing closer approaches.
‡ Not all entries have an assigned dataDOI at CCDC.
†CrossRef DOIs here are collected as a citation at the bottom of the post using the WordPress KCite plugin. Unfortunately for a few months now, this plugin has stopped recognising DataCite DOIs, which is why here they are treated differently from CrossRef DOIs. This is purely a current attribute of the KCite plugin and does not imply any fundamental difference in the two types of DOI, other than one tends to be used as persistent identifiers of journal articles and the other of datasets.
References
- M. Akita, M. Chung, A. Sakurai, S. Sugimoto, M. Terada, M. Tanaka, and Y. Moro-oka, "Synthesis and Structure Determination of the Linear Conjugated Polyynyl and Polyynediyl Iron Complexes Fp*−(C≡C)n−X (X = H (n = 1, 2); X = Fp* (n = 1, 2, 4); Fp* = (η5-C5Me5)Fe(CO)2)", Organometallics, vol. 16, pp. 4882-4888, 1997. https://doi.org/10.1021/om970538m
- J. Forniés, S. Fuertes, A. Martín, V. Sicilia, E. Lalinde, and M.T. Moreno, "Homo‐ and Heteropolynuclear Platinum Complexes Stabilized by Dimethylpyrazolato and Alkynyl Bridging Ligands: Synthesis, Structures, and Luminescence", Chemistry – A European Journal, vol. 12, pp. 8253-8266, 2006. https://doi.org/10.1002/chem.200600139
- R. Banerjee, R. Mondal, J.A.K. Howard, and G.R. Desiraju, "Synthon Robustness and Solid-State Architecture in Substituted gem-Alkynols", Crystal Growth & Design, vol. 6, pp. 999-1009, 2006. https://doi.org/10.1021/cg050598s
- B. Xu, K. Bussmann, R. Fröhlich, C.G. Daniliuc, J.G. Brandenburg, S. Grimme, G. Kehr, and G. Erker, "An Enamine/HB(C6F5)2 Adduct as a Dormant State in Frustrated Lewis Pair Chemistry", Organometallics, vol. 32, pp. 6745-6752, 2013. https://doi.org/10.1021/om4004225
- M.J. Pouy, S.A. Delp, J. Uddin, V.M. Ramdeen, N.A. Cochrane, G.C. Fortman, T.B. Gunnoe, T.R. Cundari, M. Sabat, and W.H. Myers, "Intramolecular Hydroalkoxylation and Hydroamination of Alkynes Catalyzed by Cu(I) Complexes Supported by N-Heterocyclic Carbene Ligands", ACS Catalysis, vol. 2, pp. 2182-2193, 2012. https://doi.org/10.1021/cs300544w
- R.D. Dewhurst, A.F. Hill, and M.K. Smith, "Heterobimetallic C3 Complexes through Silylpropargylidyne Desilylation", Angewandte Chemie International Edition, vol. 43, pp. 476-478, 2004. https://doi.org/10.1002/anie.200352693
- T. Holtrichter-Rößmann, C. Rösener, J. Hellmann, W. Uhl, E. Würthwein, R. Fröhlich, and B. Wibbeling, "Generation of Weakly Bound Al–N Lewis Pairs by Hydroalumination of Ynamines and the Activation of Small Molecules: Phenylethyne and Dicyclohexylcarbodiimide", Organometallics, vol. 31, pp. 3272-3283, 2012. https://doi.org/10.1021/om3001179
Saturday, April 15th, 2017
Back in the early 1990s, we first discovered the delights of searching crystal structures for unusual bonding features.[1] One of the first cases was a search for hydrogen bonds formed to the π-faces of alkenes and alkynes. In those days the CSD database of crystal structures was a lot smaller (<80,000 structures; it’s now ten times larger) and the search software less powerful. So here is an update.
The search query (dataDOI:10.14469/hpc/2473) is shown below:
- A mid-point (centroid) of a C-C bond (of any type) is defined, but the carbons are each restricted to being 3-coordinate, with the substituents R being either C or H.
- The distance to a hydrogen (attached to group QA, where QA is any one of N,O,F,Cl, i.e. acidic H) is defined.
- The properties of the alkene are defined by the sines of the two angles subtended at the centroid. This defines how perpendicular the QA-H hydrogen bond is to the C-C bond.
- Four torsions R-C-centroid-H are defined by their sines. The mean of the absolute values of these will define how orthogonal the approach of the hydrogen to the π-π plane is.
- Further constraints in the search are no disorder, no errors, R < 0.05, the H atom position is normalised and this position is defined as being <2.5Å from the C-C bond centroid, which is ~0.3Å < the sum of the van der Waals values for C and H.
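The torsion-based measure in the list above can be sketched as follows, using an idealised planar alkene with made-up coordinates. The dihedral routine is the standard atan2 formulation; the names and geometry are hypothetical and only illustrate how the mean of the absolute sines of the four R-C-centroid-H torsions behaves.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def torsion(p0, p1, p2, p3):
    """Dihedral angle p0-p1-p2-p3 in radians."""
    b1, b2, b3 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1, n2 = cross(b1, b2), cross(b2, b3)
    b2_len = math.sqrt(dot(b2, b2))
    m1 = cross(n1, tuple(x / b2_len for x in b2))
    return math.atan2(dot(m1, n2), dot(n1, n2))

def mean_abs_torsion_sine(substituents, centre, h):
    """substituents: list of (R, C) pairs; centre: C-C centroid; h: hydrogen."""
    sines = [abs(math.sin(torsion(r, c, centre, h))) for r, c in substituents]
    return sum(sines) / len(sines)

# Idealised alkene in the xy-plane (Angstrom), centroid at the origin
c1, c2 = (-0.67, 0.0, 0.0), (0.67, 0.0, 0.0)
pairs = [((-1.2, 0.9, 0.0), c1), ((-1.2, -0.9, 0.0), c1),
         ((1.2, 0.9, 0.0), c2), ((1.2, -0.9, 0.0), c2)]
centre = (0.0, 0.0, 0.0)

above_plane = mean_abs_torsion_sine(pairs, centre, (0.0, 0.0, 2.2))
in_plane = mean_abs_torsion_sine(pairs, centre, (0.0, 2.2, 0.0))
print(above_plane, in_plane)
```

A hydrogen sitting directly over the π-face gives a value of 1.0, whereas one approaching in the plane of the substituents gives a value close to 0.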

The first search is limited to intermolecular contacts between the C-C bond and the H and reveals that for most of the 18 hits, the H approach is close to perpendicular to the centroid but the inclination to the π-π plane is more scattered. The most interesting (shortest H…centroid contact of ~2.22Å, orthogonal approach) can be inspected as KANYAA (dataDOI: 10.5517/CC8JRQ7).
When the search is repeated for intramolecular contacts, rather shorter distances are obtained for 88 hits and with more variation in the angles of approach. The most interesting candidate (blue dots) is IGELAJ[2] (dataDOI: 10.5517/CC14PBW1) which has the very short intramolecular H approach of 1.90Å to the C-C centroid corresponding to ~2.04Å to the carbons, a contraction of ~0.8Å from the van der Waals sum.

The authors remarked[2] “that it possesses a better defined intramolecular hydrogen bond compared to the usual molecules for which it is noted“. They also note JOCQEX, which is present in the above plot, but for which there is a non-orthogonal approach of the hydrogen bond to the π-π plane. The authors do not mention TIBCUD[3] (dataDOI: 10.5517/CCPL0FP), which has a similar close approach of 1.92Å to the C-C centroid, but at an angle inclined to the C-C axis.
IGELAJ, as an intramolecular H-bond, was amenable to calculation of its geometry and properties (inter-molecular interactions would ideally require the periodic lattice to be computed), with the observation[2] that “another test was to compare the energy calculation of IGELAJ to a non-hydrogen-bound version where the OH bond is rotated 180°” and “the results predict IGELAJ to be 7.30 kcal more stable than the non-hydrogen-bound version”. This value, if correct, is indeed typical of a very strong hydrogen bond!
Pedant (curious?) as I am, I wanted to be clear what kind of calculated energy was being reported. Was it the difference in total energies, or the energies corrected for ZPE (zero-point-energy) as ΔH, or the free energies for which entropy is included as ΔG? The article[2] itself is unclear on this aspect and no energies are reported in the supporting information. This is an illustration that “supporting information” in most current incarnations may often not provide crucial information; only a full deposition of FAIR data, as part of research data management (RDM), can provide it. This process is illustrated for my own calculations of this system (ωB97XD/Def2-TZVPP, dataDOIs: 10.14469/hpc/2474, 10.14469/hpc/2475), which reveal ΔG298 = 4.8 kcal/mol and ν = 3761 cm-1. In comparison, when the OH bond is rotated 180° the wavenumber goes up to 3956 cm-1, a calculated difference of 195 cm-1, which is indeed a large red-shift. But the “non-hydrogen-bound version where the OH bond is rotated 180°” is not a valid reference point for a non-hydrogen bonded isomer, since it manifests instead as a transition state for OH rotation with νi 166 cm-1, there being no minimum other than the π-facially hydrogen bonded one (dataDOI: 10.14469/hpc/2476). So, for the lack of a suitable reference system, we cannot conclude what the strength of this particular hydrogen bond is, nor draw any conclusions about it being unusually strong.
So IGELAJ holds the current record for the shortest π-facial hydrogen bond to an alkene, but not necessarily the strongest! I wonder if this record might be broken with the aid of further computational design and prediction?
References
- H.S. Rzepa, M.H. Smith, and M.L. Webb, "A crystallographic AM1 and PM3 SCF-MO investigation of strong OH ⋯π-alkene and alkyne hydrogen bonding interactions", J. Chem. Soc., Perkin Trans. 2, pp. 703-707, 1994. https://doi.org/10.1039/p29940000703
- M.D. Struble, M.G. Holl, G. Coombs, M.A. Siegler, and T. Lectka, "Synthesis of a Tight Intramolecular OH···Olefin Interaction, Probed by IR, 1H NMR, and Quantum Chemistry", The Journal of Organic Chemistry, vol. 80, pp. 4803-4807, 2015. https://doi.org/10.1021/acs.joc.5b00470
- B. Ndjakou Lenta, K.P. Devkota, B. Neumann, E. Tsamo, and N. Sewald, "4-(1,1-Dimethylprop-2-enyl)-1,3,5-trihydroxy-2-(3-methylbut-2-enyl)-9H-xanthen-9-one", Acta Crystallographica Section E Structure Reports Online, vol. 63, pp. o1629-o1631, 2007. https://doi.org/10.1107/s1600536807009907
Thursday, April 13th, 2017
Layer stacking in structures such as graphite is well-studied. The separation between the π-π planes is ~3.35Å, which is close to twice the estimated van der Waals (vdW) radius of carbon (1.7Å). But how much closer could such layers get, given that many other types of relatively weak interaction such as hydrogen bonding can contract the vdW distance sum by up to ~0.8Å or even more? This question was prompted by the separation calculated for the ion-pair cyclopropenium cyclopentadienide (~2.6-2.8Å).
The search query for the Cambridge structure database is shown below.

The query (dataDOI: 10.14469/hpc/2471) defines centroids for two benzenoid rings, both comprising only 3-coordinated carbons. The sine of an angle subtended at each centroid to the other and to one ring carbon attempts to track how parallel the two rings are (strictly speaking, 12 such angles should be included). If the sines of both angles are 1.00, then the two centroids overlap orthogonally. A search constrained to no disorder, no errors and R < 0.05 reveals 1107 hits at a centroid-centroid distance of < 3.5Å. The colour code (red) indicates the distances in the range 3.4-3.5Å, which matches that of graphite, while distances down to 3.2Å (yellow-green) are not uncommon.

Here is another way of representing these results, in which the centroid-centroid distances (measured from the positions of 12 carbon atoms and hence statistically more reliable than any individual atom pair distance) are multiplied by either sin(ANGa) or sin(ANGb). The number of occurrences with distances < 3.2Å is less than 32 (out of 1107).
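The geometric quantities used in this search (ring centroids, the sine of the angle subtended at one centroid, and the centroid-centroid distance scaled by that sine) can be sketched as follows. This is a minimal illustration, not the ConQuest implementation: the helper names are hypothetical and the two rings are idealised regular hexagons (C-C 1.39Å) with an assumed 3.35Å plane separation and an assumed 1.5Å lateral slip.

```python
import math


def centroid(points):
    """Arithmetic mean of a list of (x, y, z) positions."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def sub(a, b):
    return tuple(a[i] - b[i] for i in range(3))


def norm(v):
    return math.sqrt(sum(c * c for c in v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def angle_sine(vertex, a, b):
    """Sine of the angle a-vertex-b, from |u x v| / (|u||v|)."""
    u, v = sub(a, vertex), sub(b, vertex)
    return norm(cross(u, v)) / (norm(u) * norm(v))


def hexagon(radius, z, x_shift=0.0):
    """Idealised planar six-membered ring parallel to the xy plane."""
    return [(radius * math.cos(math.radians(60 * k)) + x_shift,
             radius * math.sin(math.radians(60 * k)),
             z) for k in range(6)]


ring_a = hexagon(1.39, 0.0)
ring_b = hexagon(1.39, 3.35, x_shift=1.5)   # parallel ring, slipped along x

cen_a, cen_b = centroid(ring_a), centroid(ring_b)
distance = norm(sub(cen_b, cen_a))            # centroid-centroid distance
sin_a = angle_sine(cen_a, cen_b, ring_a[0])   # ring carbon lying along the slip
scaled = distance * sin_a                     # distance multiplied by sin(ANGa)

print(round(distance, 3), round(sin_a, 3), round(scaled, 3))
```

In this idealised case the chosen ring carbon lies along the slip direction, so the scaled distance recovers the plane separation even though the centroid-centroid distance is longer. A carbon perpendicular to the slip would give a sine of 1 regardless of the slip, which may be why the post notes that strictly all 12 such angles should be included.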
Taking a look at some of these outliers, PAZJEG has two entries, one with a short distance (dataDOI: 10.5517/ccsffzl) and one with a normal distance[1], which does tend to cast doubt on the former.

ZOMSEB ([2], dataDOI: 10.5517/CCZS2MF) appears to have the planes of the molecules stacked ~2.5Å apart.
OXUDES02 (DOI: 10.1016/j.poly.2016.09.046, dataDOI: 10.5517/CCDC.CSD.CC1MBBFQ) has a separation of ~2.6Å.
Verifying these and other outliers would require expert inspection of the crystallographic data and its refinement. This might require access to the hkl structure factors, data which are now being “strongly encouraged”‡ for deposition with the CSD, but which are not present for most structures deposited before ~2016. In extreme cases, the original diffraction images collected by the cameras would allow for a fully independent re-analysis, data which however is rarely if ever deposited.
So the separation of π-π stacked six-membered benzenoid rings is only infrequently less than ~3.2Å in measured crystal structures. There are hints it might reach as short as ~2.6Å, but such examples with values significantly less than 3.2Å do require expert validation before they can be called real.
‡See structuredepositioninformation/ “We strongly encourage data to be deposited either with imbedded structure factor data or with an associated FCF or HKL structure factor file.”
References
- J. Rogan, D. Poleti, and L. Karanović, "Synthesis, Structure, and Thermal Properties of Two New Inorganic‐organic Framework Compounds: Hexaaqua(μ2‐1,2,4,5‐benzenetetracarboxylato)‐bis(N,N′‐1,10‐phenathroline)dicobalt(II) Dihydrate and Hexaaqua(μ2‐1,2,4,5‐benzenetetracarboxylato)‐bis(N,N′‐2,2′‐dipyridylamine)dinickel(II) Tetrahydrate", Zeitschrift für anorganische und allgemeine Chemie, vol. 632, pp. 133-139, 2005. https://doi.org/10.1002/zaac.200500292
- P. Das, C.K. Jain, S.K. Dey, R. Saha, A.D. Chowdhury, S. Roychoudhury, S. Kumar, H.K. Majumder, and S. Das, "Synthesis, crystal structure, DNA interaction and in vitro anticancer activity of a Cu(II) complex of purpurin: dual poison for human DNA topoisomerase I and II", RSC Adv., vol. 4, pp. 59344-59357, 2014. https://doi.org/10.1039/c4ra07127a
Tags:Carbon, chemical bonding, Chemistry, Cyclopentadienyl anion, Graphite, Hydrogen bond, Intermolecular forces, Nature, Organic chemistry, search query, Stacking, Supramolecular chemistry, VDW
Posted in crystal_structure_mining | 1 Comment »
Tuesday, April 11th, 2017
Following my conformational exploration of enols, here is one about a much more common molecule, a carboxylic acid.
The components of the search are shown as four queries below, which will be combined in various Boolean senses (DOI: 10.14469/hpc/2462).
- Query one defines the carboxylic acid, with 3-coordinate carbon specified at the carbonyl along with 1-coordinate for the carbonyl oxygen. The HO-C=O torsion (0° for the syn conformation shown on the left above and 180° for the anti conformation shown on the right) and the length of the O-C bond are defined as variables.
- Query two defines a contact as ≤ the sum of van der Waals radii between QA (=N,O,F,Cl) and the hydrogen of the carboxylic acid (pink).
- Query three defines a contact as ≤ the sum of van der Waals radii between QA-H (QA=N,O,F,Cl) and the oxygen of the acid (pink).
- Query four defines a temperature of <100K for the data collection temperature.
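The contact criterion in Queries two and three (a distance no greater than the sum of the van der Waals radii) can be sketched as below. The radii are Bondi's values and are an assumption on my part; the search software may use a different table, particularly for hydrogen. The function name and the test distances are hypothetical.

```python
# Bondi van der Waals radii in Angstrom (assumed; the CSD software may differ).
BONDI_RADII = {"H": 1.20, "C": 1.70, "N": 1.55, "O": 1.52, "F": 1.47, "Cl": 1.75}


def is_contact(distance, elem_a, elem_b, radii=BONDI_RADII):
    """True if the distance is <= the sum of the two van der Waals radii."""
    return distance <= radii[elem_a] + radii[elem_b]


# Hypothetical H...O distances, for illustration only.
print(is_contact(1.80, "H", "O"))   # a typical hydrogen-bond-like distance
print(is_contact(3.00, "H", "O"))   # longer than the H + O radius sum
```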

The first search uses just Query 1, with additional constraints of no errors, no disorder and R < 0.05.

This can then be focused by combining Query 1 + Query 4, which shows a clear preference for the syn conformation.

Next, Query 1 with NOT query 2, which restricts the search to carboxylic acids that do not have contacts to the hydrogen of the OH group. This excludes carboxylic acid dimers, as shown above. The predominant hot-spot now corresponds to the anti conformation.

Again this is narrowed using Query 4, which removes almost all the syn examples.

Now Query 3 is used (as shown above) to restrict the search to examples where the oxygen of the HO group is itself not in contact with an acidic hydrogen. This allows carboxylic acid dimers, and the syn preference is revealed again.

Restricting the data collection temperature to <100K reinforces this effect.

Finally, Query 1 and NOT query 2 (no dimers) and NOT query 3, where a smaller preference for anti is seen.
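The Boolean combinations used in these searches amount to set intersections and differences over the hit lists of the individual queries. Here is a minimal sketch of that logic, with entirely hypothetical refcodes and a hypothetical `combine` helper; it does not reproduce how the search software evaluates queries internally.

```python
# Hypothetical hit lists (made-up refcodes) for the four queries.
query1 = {"AAAAAA", "BBBBBB", "CCCCCC", "DDDDDD", "EEEEEE"}  # carboxylic acids
query2 = {"AAAAAA", "BBBBBB", "CCCCCC"}   # contact to the acid hydrogen
query3 = {"AAAAAA", "CCCCCC", "DDDDDD"}   # contact to the acid HO oxygen
query4 = {"AAAAAA", "DDDDDD", "EEEEEE"}   # data collected at <100K


def combine(required, excluded=()):
    """Hits present in every 'required' set and in none of the 'excluded' sets."""
    hits = set.intersection(*required)
    for query in excluded:
        hits -= query
    return hits


print(sorted(combine([query1, query4])))                     # Query 1 AND Query 4
print(sorted(combine([query1], excluded=[query2])))          # Query 1 NOT Query 2
print(sorted(combine([query1, query4], excluded=[query2])))  # ... narrowed by Query 4
print(sorted(combine([query1], excluded=[query2, query3])))  # NOT Query 2, NOT Query 3
```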

So it seems that an interesting difference emerges between enols and carboxylic acids in that when no hydrogen bonding to the HO group is allowed, an anti preference emerges. The electronic origins of this effect will be probed in a future post.
Tags:Acid, Alcohols, carboxylic acid, Chemistry, Enol, Functional groups, Organic chemistry, search uses
Posted in crystal_structure_mining | No Comments »
Sunday, April 9th, 2017
Both the cyclopropenium cation and the cyclopentadienide anion are well-known 4n+2-type aromatic ions, but could the two together form an ion-pair?

A search of the Cambridge structure database reveals 52 instances of the cyclopropenium cation with a variety of counter-anions, 77 cyclopentadienide anions with a variety of counter-cations and one (SOWMOG, private communication to CSD) in which both sub-structures are present. The pyridinium-cyclopropenium fragment is actually a di-cation stabilized with dimethylamino substituents, with these charges balanced by two cyclopentadienide anions stabilized with ester substituents. The stacking distance between the ion-pairs is ~3.5-3.6Å, a bit larger than normal π-π stacking distances of 3.2-3.3Å.

So could a “pure” cyclopropenium cyclopentadienide ion-pair exist, and if so what would its π-π stacking distance be? A ωB97XD/Def2-TZVPPD/SCRF=water calculation (DOI: 10.14469/hpc/2442) provides one answer to this question; 2.57Å!‡ It is a true minimum in the potential energy surface (all +ve force constants) with a calculated dipole moment of only 7.57D. This species is “only” 27.1 kcal/mol higher in ΔG than the neutral hydrocarbon (DOI: 10.14469/hpc/2443), a difference which is as low as it is because of the gain in aromatic stabilization of two rings upon ion-pair formation.

A few posts back, I was considering candidates for the most polar neutral compound synthesized and I suggested a candidate with a dipole moment of ~22D, based as it happens on cyclopropenium and cyclopentadienide rings directly connected by a bond. So when this bond is removed and the two rings are allowed to stack one above the other, we now have an interesting inversion of the original challenge: what is the least-polar ionic organic compound (ionic in the sense of being an unconnected ion-pair)?
Here are some more properties of this intriguing “neutral” ion-pair.
- It has a number of low-frequency modes which correspond to the two rings moving with respect to each other (ν 216 cm-1)

- The molecular electrostatic potential illustrates the sense of polarization, with negative region (orange) residing on the 5-membered ring:

- The most stable π-type molecular orbital (below) is reminiscent of the π-complex formed in the benzidine rearrangement, and suggests that modelling this ion-pair may in fact require a multi-reference (CASSCF) wavefunction, with the single-determinantal one used here being only a first approximation.

- A QTAIM analysis of the electron density topology shows only weak “bond” connectors between the two rings, with ρ(r) being typical of weak interactions such as hydrogen bonds.

- An ELF (electron localisation function) analysis also holds no surprises, with all the electron density basins (purple) confined to the two rings, just as expected of an ion-pair.

- I will leave one further question to a future discussion; what happens to the aromaticity and ring currents of the two individual rings as they combine to form this ion-pair? Might this property be connected to the very close separation between the two rings?
So we have a remarkably “neutral” ionic hydrocarbon to match the “ionic” neutral organic molecules previously discussed. This ion-pair may yet prove to have interesting properties, even if it is unlikely to be synthesized without the addition of stabilising substituents.
‡ For example, the stacking distance in graphite is 3.35Å.
Tags:Anions, Aromatization, Cation–pi interaction, Chemistry, Cyclopentadienyl anion, Ion, Ion association, potential energy surface, Simple aromatic rings
Posted in crystal_structure_mining, Interesting chemistry | 6 Comments »
Thursday, April 6th, 2017
Enols are simple compounds with an OH group as a substituent on a C=C double bond and with a very distinct conformational preference for the OH group. Here I take a look at this preference as revealed by crystal structures, with the theoretical explanation.

First, a search of the Cambridge structure database (CSD) was carried out using the search query shown below (DOI: 10.14469/hpc/2429).


The first search (no errors, no disorder, R < 0.05) is unconstrained in the sense that the HO group is free to hydrogen bond itself. The syn conformer has the torsion of 0° and it has a distinct preponderance over the anti isomer (180°). There is the first hint that the most probable C=C distance for the syn isomer may be longer than that for the anti, but this is not yet entirely convincing.
To try to make it so, a constrained search is now performed, in which only structures where the HO group has no contact (hydrogen bonding) interaction are included. This is achieved using a “Boolean” search;

The number of hits approximately halves, but the proportion of syn examples increases considerably. There is an interesting double “hot-spot” distribution, which amplifies the lengthening of the C=C bond compared to the anti orientation.

The next constraint added is that the data collection must be <100K (to reduce thermal noise) which reduces the hits considerably but now shows the lengthening of the C=C bond for the syn isomer very clearly.
A final plot is of the C=C length vs the C-O length (no temperature, but HO interaction constraint). If there were no correlation, the distribution would be ~circular. In fact it clearly shows that as the C=C bond lengthens, the C-O bond contracts.
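The "circular versus elongated" reading of such a scatter plot corresponds to a correlation coefficient near zero versus one near ±1. A minimal sketch of a Pearson correlation is given below; the bond lengths are synthetic values invented purely for illustration and are not taken from the CSD search.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Synthetic (made-up) bond lengths in Angstrom, NOT real search results.
cc_lengths = [1.320, 1.325, 1.330, 1.335, 1.340]
co_lengths = [1.370, 1.365, 1.362, 1.356, 1.350]

r = pearson_r(cc_lengths, co_lengths)
print(round(r, 3))   # negative: C-O contracts as C=C lengthens
```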

Now for some calculations (ωB97XD/Def2-TZVPP, DOI: 10.14469/hpc/2429) which reveal the following:
- The free energy of the syn isomer is 1.2 kcal/mol lower than that of the anti. The effect is small, and hence easily masked by other interactions such as hydrogen bonding to the OH group. This is why removing such interactions from the search above increased the syn population compared to anti.
- The syn C=C bond length (1.325Å) is longer than the anti (1.322Å).
- The syn isomer has one unique σO-Lp/σ*C-C NBO orbital interaction (below) with a value of E(2) 7.7 kcal/mol, which is absent in the anti form. As it happens, a πO/π*C=C interaction is present in both forms but is also stronger in the syn isomer (E(2)= 46.8 vs 44.2 kcal/mol).
Unoccupied NBO, σ*C-C
Occupied NBO, σO-Lp
- The overlap of the filled σO-Lp with the empty σ*C-C orbital is shown below (blue overlaps with purple, red overlaps with orange).

To view the overlap in rotatable 3D, click on any of the colour diagrams above.‡
It is nice to see how experiment (crystal structures) and theory (the calculation of geometries and orbital interactions) can quickly and simply be reconciled. Both these searches and the calculations can be done in just one day of “laboratory time” and I think it would make for an interesting undergraduate chemistry lab experiment.
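To put the computed 1.2 kcal/mol free-energy preference in perspective, it can be converted into an equilibrium population ratio via exp(ΔG/RT). This is a sketch assuming a simple two-state Boltzmann distribution at 298.15 K for the isolated molecule; it is not a prediction of how often each conformer appears in crystal structures, and the function name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3   # gas constant in kcal mol^-1 K^-1


def boltzmann_ratio(delta_g_kcal, temperature=298.15):
    """Population ratio (more stable : less stable) for a free-energy gap."""
    return math.exp(delta_g_kcal / (R_KCAL * temperature))


ratio = boltzmann_ratio(1.2)           # syn : anti
syn_fraction = ratio / (1.0 + ratio)   # fraction of syn in a two-state model

print(round(ratio, 1), round(syn_fraction, 2))
```

A gap of this size corresponds to a ratio of well under 10:1, which is consistent with the remark that the preference is easily masked by hydrogen bonding to the OH group.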
‡ This visualisation uses Java. Increasingly this browser plugin is becoming more onerous to activate (because of increased security) and some browsers do not support it at all. The macOS Safari browser is one that still does, but you do have to allow it via the security permissions.
Tags:Chemical bond, chemical bonding, Chemistry, Conformational isomerism, constrained search, Enol, free energy, Gauche effect, Hydrogen bond, Isomerism, Java, Physical organic chemistry, search query, Stereochemistry, Supramolecular chemistry
Posted in crystal_structure_mining, reaction mechanism | 2 Comments »