Archive for the ‘General’ Category

Is there a difference between a scientific blog and scientific journal?

Friday, January 14th, 2011

In my blogroll, I link to Tim Gowers’ blog. He is a very eminent mathematician, and so it is interesting to see what motivates him to write a blog about mathematics. This latest post goes a long way towards explaining why. He starts by speculating about the features of some piece of research that might render it conventionally unpublishable, highlighting two reasons: (1) it is not original and (2) it does not lead anywhere conclusive. He then goes on to show how either outcome might nevertheless be useful to someone, even if unpublishable conventionally. The rest of his post then concentrates on the cap-set problem in pure mathematics. It boils down to the observation that the community as a whole might often spot something that an individual might have a blind spot for. Or, that others may in turn be inspired by lines of research which had apparently led nowhere for the original poster. Tim of course is favoured by often having 80+ comments appended to each of his posts!

I could not help but reflect that the culture of chemistry is rather different. The primary chemical literature is probably full of research that is neither original nor conclusive! Certainly, I suspect that few researchers would abandon the scientific journal in favour of a blog to communicate that unoriginal/inconclusive result. Am I in fact digging a hole for myself by implying this blog is full of such stuff? I do hope not! Take for example this post, in which I tried to establish what I (perhaps mistakenly) thought was an unremarked-upon connection between molecular biology and chemistry, namely that the left and right handed forms of DNA bear a diastereomeric rather than an enantiomeric relationship to each other. Or that the reasons given 57 years ago for the prevalence of the right handed form may not withstand scrutiny using more modern tools (this would not be the case in mathematics, where a proven theorem remains proven, even with modern knowledge). Or perhaps, that the overall shape of the double helix (and its further folding into a superhelix) might depend on the finely tuned properties of just one single bond in the molecule.

One might also remark that one does not always have to decide between a journal and a blog; it is possible to do both (and stay within the rules). A blog may allow some measure of open review before the ideas firm up in a manner more suitable for a journal. This journal article for example owes its genesis to the threads that developed on this blog. There is, I think, room for both in the cut-throat competitive world that is scientific discourse.

Combichem: an introductory example of the complexity of chemistry

Sunday, December 19th, 2010

Chemistry gets complex very rapidly. Consider the formula CH3NO as the topic for a tutorial in introductory chemistry. I challenge my group (of about 8 students) to draw as many different molecules as they can using exactly those atoms. I imply that perhaps each of them might find a different structure; this normally brings disbelieving expressions to their faces.

Click on image to see molecules constructed from these atoms. The list is not comprehensive!

Amongst the useful concepts that can be introduced are:

  1. How to determine how many double bond equivalents (or degrees of unsaturation) are implied by the formula.
    1. Students spot one dbe in the above formula, but can take a little longer to notice that it can reside in a ring.
    2. Few (and I count tutors in this) will add sub-valent atoms (here, the possibility of a carbene or a nitrene) to the list.
  2. What is meant by “different”? This can be reduced to the equations: ln(k/T) = 23.76 − ΔG/RT; t_1/2 = (ln 2)/k, where t_1/2 is the half life (in seconds) of any species constrained by a free energy barrier of ΔG. A nice illustration of this equation is to be found on Jan Jensen’s blog (and a worthwhile calculation would be to find the barrier required to achieve a half life based on the age of the universe). This can be boiled down to three ranges.
    1. Half lives of ~10⁻¹⁵ s, or vibrations (and this includes transition states themselves). Arguably, resonance isomers, which involve the (nominal) motions of electrons and not nuclei, fall into this class as well.
    2. Half lives of < 10¹ s, which would include most conformational isomers (excepting atropisomers) and highly unstable isomers, and which cannot be bottled and labelled as such.
    3. Compounds with half lives > 10² s, up to of course the age of the universe. This would include configurational isomers (and if the students are up to it, you can ask them to identify any compounds constructed above which can exhibit optical isomerism).
  3. One might be inclined to (approximately) use arrows to indicate the timescales above. Thus electronic resonance is represented by a double-headed arrow, conformational and E/Z isomers by an equilibrium arrow, and a reaction (which may in fact have a very low barrier) connecting two isomers by a single-headed arrow.
  4. It's normally now time to count the electrons. This includes the “invisible ones”, the lone pairs, and is also the occasion to introduce the valence shell octet.
  5. Putting the appropriate charges onto any atoms which require them is always fun (the dative bond is avoided). The blue structure revealed in the click above is an extreme interpretation of this! Gernot Frenking has pioneered the class of compound he calls carbones. For his latest article on the theme, see DOI: 10.1002/anie.201002773. The green compound would belong to this class, if it did not fall apart (probably with no barrier) to something which is not actually one molecule, but two (separable) molecules (purple). This brings us to the question of what a molecule actually is. Could it be two molecules unconnected by any bonds, but nevertheless also inseparable (such as catenanes, rotaxanes, and many other entwined systems)? Two molecules can also interact weakly, through what are not normally referred to as bonds. In this case, the two molecules would be bound by a hydrogen bond.
  6. Quite a number of the isomers can also be called tautomers. These involve the movement of one type of atom in particular, the hydrogen (or proton). In terms of lifetime, they would fall into class 2 above (although if one takes extreme care to remove all traces of acids or bases, particularly from the surface of any glass container, one can extend the lifetimes quite considerably).
  7. The peptide bond is included in the isomers, and its ionic resonance formulation, which can lead the discussion to the molecular basis of life and how finely-tuned this bond in fact is.
  8. One might speculate about what the most stable of all the isomers might be, and how many are indeed bottleable. One might introduce quantum mechanics as nowadays a very reliable way of estimating this (and whilst you are at it, introduce free energies, entropies etc). For example, which of the two red geometrical isomers is the more stable, and why? What is the best resonance representation, i.e. where does one put the charges? (On this specific point, a CCSD/6-311G(d,p) ELF calculation does come up with a very definitive answer: on the nitrogen rather than the oxygen.)
  9. This might be followed up by introducing arrow pushing as a means of interconverting two isomers, and with one of the pair of isomers, one can introduce pericyclic selection rules, transition state aromaticity and other advanced stereochemical concepts.
  10. Now we are well into stereoelectronics. One can introduce anomeric effects via the NBO technique. Thus in the red compounds, there is an interesting interaction between the lone pair on carbon and the anti N-H bond (but, spectacularly, not the syn N-H bond). There is another particularly strong one between the oxygen lone pair and the C-N bond.
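The double bond equivalent count in item 1 can be checked with a few lines of Python. This is a minimal sketch, assuming the usual textbook formula (2C + 2 + N − H − X)/2 with normal valences; the function name and its parameters are illustrative only. Because normal valences are assumed, the count cannot by itself flag sub-valent isomers such as a carbene or a nitrene, where the "equivalent" is taken up by the sub-valent atom rather than by a ring or a double bond.

```python
def double_bond_equivalents(c=0, h=0, n=0, halogens=0):
    """Rings plus pi bonds implied by a formula, assuming normal valences.

    Divalent atoms such as oxygen or sulfur do not change the count,
    so they are simply not passed in.
    """
    twice_dbe = 2 * c + 2 + n - h - halogens
    if twice_dbe < 0 or twice_dbe % 2:
        raise ValueError("formula is not consistent with a closed-shell neutral molecule")
    return twice_dbe // 2


# CH3NO: one carbon, three hydrogens, one nitrogen (the oxygen does not enter)
print("CH3NO:", double_bond_equivalents(c=1, h=3, n=1))
# Benzene, C6H6, as a familiar reference point
print("C6H6:", double_bond_equivalents(c=6, h=6))
```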

I dare say I have only picked at the surface, but covering the above should be enough for one tutorial I should imagine 🙂
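The half-life relation in item 2 also lends itself to a short calculation, including the one suggested there for the age of the universe. The sketch below assumes energies in kcal/mol, a temperature of 298.15 K and an age of the universe of about 13.8 billion years; none of these are stated in the post, and the function names are illustrative.

```python
import math

R_KCAL = 1.987204e-3   # gas constant in kcal/(mol K); units are an assumption
LN_KB_OVER_H = 23.76   # the constant quoted in the equation, ln(kB/h)


def rate_constant(barrier, temperature=298.15):
    """k in s^-1 from ln(k/T) = 23.76 - dG/RT."""
    return temperature * math.exp(LN_KB_OVER_H - barrier / (R_KCAL * temperature))


def half_life(barrier, temperature=298.15):
    """t_1/2 = ln 2 / k, in seconds."""
    return math.log(2) / rate_constant(barrier, temperature)


def barrier_for_half_life(t_half, temperature=298.15):
    """Invert the two relations: the barrier that gives a chosen half life."""
    k = math.log(2) / t_half
    return R_KCAL * temperature * (LN_KB_OVER_H - math.log(k / temperature))


# Assumed age of the universe: 13.8 billion years, converted to seconds
AGE_OF_UNIVERSE_S = 13.8e9 * 365.25 * 24 * 3600

for label, t in [("10 s", 1e1), ("100 s", 1e2), ("age of universe", AGE_OF_UNIVERSE_S)]:
    print(f"{label:>16}: barrier = {barrier_for_half_life(t):.1f} kcal/mol")
```

Under these assumptions the boundary between classes 2 and 3 sits at a barrier of roughly 20 kcal/mol, and a half life equal to the age of the universe needs only a little over 40 kcal/mol.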


PS For the (calculated) relative energies of some of these isomers, see DOI: 10.1021/jo010671v

Secrets of a university tutor: (curly) arrow pushing

Thursday, October 28th, 2010

Curly arrows are something most students of chemistry meet fairly early on. They rapidly become hard-wired into the chemist's brain. They are also uncontroversial! Or are they? Consider the following very simple scheme.

Curly arrow pushing

It represents protonation of an alkene by an acid. Two products are of course possible, leading to either a tertiary carbocation as shown in (a), or a primary one (not shown). Either involves two arrows, but how does one illustrate this (important) difference in the outcome using the arrows? Most textbooks show (a). The lhs arrow starts at the middle of the bond, and ends at the atom of hydrogen. This unfortunately leads to an ambiguity. It does not define which carbon is involved in forming the new C-H bond.

In recognition of this problem an article has recently appeared (DOI: 10.1021/ed086p1389) which attempts to improve model (a) by using what its authors call bouncing arrows, as in (b). The arrow starts at the mid point of the C=C bond, but then bounces to one end, before heading off again to end at the H atom. The idea is that the direction of bounce informs which of the two possible bonds will be formed. Leaving aside the (non-trivial) issue of how to persuade e.g. ChemDraw to produce a bouncing arrow, I note that an alternative system has been in use where I teach for many years; (c).

  1. This starts by addressing the problem of which bond to form by immediately drawing a dotted line where you want the bond to go.
  2. The arrow starts as before, at the mid point of a bond, but this time it ends at the mid-point of the dotted line. If nothing else, ChemDraw has no problem with this notation!
  3. Are there any other advantages? Consider (d). The green dots indicate the results of a QTAIM analysis, revealing bond-critical points (BCP) in either the reactants or the products. The first arrow both starts and ends at such a BCP. The second arrow starts at a BCP, and ends at a lone pair (these are not revealed using QTAIM. If instead, ELF synaptic basin centroids were to be used, then all arrows would start or end at such a basin). This therefore gives (c)/(d) some quantum mechanical reality.
  4. Another advantage is that one can formulate check-sum rules. By this I mean extra rules that can be used to check you have gotten things correct. Take a look at the red dots, one on the oxygen, another on the bromine. The metaphor is that these can be regarded as hinges, about which the bond swivels, the course of the swivel following that of the trajectory of the arrow.
    1. For heterolytic (electron pair) arrow pushing in which none of the centres involved changes its valency, the red dots must be located on alternating atoms.
    2. For heterolytic (electron pair) arrow pushing in which a valency change does occur (e.g. formation of a carbene), two red dots must be on adjacent atoms.
    3. In general, no more than one arrow either starts, or ends, at a bond. This used to be thought of as a fairly hard rule, but in fact it's not difficult to come up with reactions which break it. For example, this one, where as many as three arrows either start or end at a given bond. And, as a challenge, can you break the rule by formulating arrow pushing for the (concerted) reaction between an alkyne and a per-acid (avoiding the anti-aromatic oxirene, the ring opening of which may conflate with the peroxidation)?
    4. One can interrupt the concerted flow of arrows to form intermediates along the way. One famous example of such interruption is aromatic electrophilic substitution, which can however be persuaded to move all of its arrows more or less synchronously.
  5. The metaphor now is one of doors opening and closing, rather than bouncing arrows.

There must be thousands of tutors around the world, teaching tens of thousands of students the arcane art of arrow pushing. If anyone has yet another schema for doing so, I would be delighted to hear from them.

And now for something completely different: The art of molecular sculpture.

Sunday, October 17th, 2010

Chemistry as the inspiration for art! The inspiration was the previous post. As for whether it's art, you decide for yourself. Click on each thumbnail for a molecular sculpture (the medium being electrons!).

MO 54. Click for 3D

MO 55. Click for 3D

MO 57. Click for 3D

MO 46. Click for 3D

MO 47. Click for 3D

MO 48. Click for 3D

MO 38. Click for 3D

MO 39. Click for 3D

MO 40. Click for 3D

Bio-renewable green polymers: Stereoinduction in poly(lactic acid)

Saturday, July 24th, 2010

Lactide is a small molecule made from lactic acid, which is itself available in large quantities by harvesting plants rather than drilling for oil. Lactide can be turned into polymers with remarkable properties, which in turn degrade easily back to lactic acid. A perfect bio-renewable material!

Lactide

The starting point for ring opening polymerisation is racemic lactide, or rac-LA. This is an equal mixture of the R,R and S,S enantiomers, and it is now treated with a catalyst based on a metal M. If M=Mg, there is a rather remarkable stereochemical outcome for the resulting polymer. The catalyst selects alternating enantiomers for the assembly, resulting in a chain (R,R),(S,S),(R,R),(S,S), etc, the name for which is a heterotactic polymer. It could instead have created a blend of equal proportions of (R,R),(R,R),(R,R) and (S,S),(S,S),(S,S) which is an isotactic polymer. Needless to say, these two polymers have quite different properties, and it very much matters which is formed. Without such a catalyst, a random atactic polymer is created rather than a stereoregular arrangement.

Poly (lactic acid)

The question is how does the catalyst manage to assemble the polymer with such stereoinduction? The origins of this depend on a detailed understanding of the mechanism of the reaction, and in 2005 we suggested one which offered an explanation for the stereospecificity (see E. L. Marshall, V. C. Gibson, and H. S. Rzepa, DOI: 10.1021/ja043819b and an interactive storyboard).

Mechanism for stereoregular polymerisation

The key features of this rationale were:

  1. Two possible transition states may control the reaction, TS1 and TS2. Which one does so depends on which is the higher in energy.
  2. The smallest model for this process involves loading two molecules of lactide onto the catalyst. The first has already been ring opened, and will control the stereochemistry of the second, which is the one suffering the ring opening bond formations/breakings shown above (the first is lurking in the group R).
  3. This leads to four different possibilities, (R,R)-(R,R)*, (S,S)-(S,S)*, (R,R)-(S,S)*, and (S,S)-(R,R)* (where the * denotes the reacting lactide, as in the diagram above). These are all diastereomers, and hence will be different in energy. If one of the first two is the lowest, then isotactic polymer will result; if the latter two then a heterotactic polymer.

Back in 2004, we had constructed a model based on B3LYP and of necessity a mixed basis set, being 6-311G(3d) on the Mg, 6-31G on the lactide and only STO-3G on the catalyst. This was done because the complete system was actually rather large. Even so, a transition state calculation would regularly take at least 10 days to find using the fastest computers available to us at that time. Using this procedure, we found that the rate limiting kinetic step was in fact TS2 for all four possibilities noted above. Of these, the (R,R)-(S,S) transition state turned out to represent the lowest energy pathway, thus confirming the observed heterotacticity for this particular catalyst.

Well, times have moved on:

  1. Six years later, computers are around 20 times faster! We can now afford to improve the basis set to 6-31G(d,p) on all the atoms, including the catalyst (the Mg stays at 6-311G(3d) however; improving it to 6-311G(3d,2f) makes little difference).
  2. We can now include the solvent (thf) as a continuum field.
  3. In the last five years the B3LYP functional has been shown to underestimate the energies of globular molecules. A modern functional such as ωB97XD, which includes dispersion energy corrections, should be expected to do much better.

It is the purpose of this blog to report an update to the modelling. Quoting relative free energies (including the solvation correction), the results come out as:

  1. (R,R)-(S,S) 0.0 kcal/mol for the TS1 geometry (see DOI: 10042/to-4950)
  2. (S,S)-(S,S) 1.8 for the TS2 geometry
  3. (S,S)-(R,R) 5.5 for the TS1 geometry
  4. (R,R)-(R,R) 9.1 for the TS1 geometry.
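To get a feel for what these energy gaps mean, the relative free energies can be converted into relative rates via exp(−ΔΔG/RT). This is only a sketch: the temperature of 298.15 K is an assumption (the post does not state one), the function name is illustrative, and how the four numbers combine into an overall tacticity depends on which pathways actually compete at each step, which the sketch does not address.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)

# Relative free energies (kcal/mol) as listed above
RELATIVE_DG = {
    "(R,R)-(S,S)": 0.0,
    "(S,S)-(S,S)": 1.8,
    "(S,S)-(R,R)": 5.5,
    "(R,R)-(R,R)": 9.1,
}


def relative_rates(barriers, temperature=298.15):
    """Rate of each pathway relative to the lowest-energy one."""
    rt = R_KCAL * temperature
    lowest = min(barriers.values())
    return {name: math.exp(-(dg - lowest) / rt) for name, dg in barriers.items()}


rates = relative_rates(RELATIVE_DG)
for name, rate in rates.items():
    print(f"{name}: relative rate {rate:.2e}")
```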

Well, there are surprises! Using the gas phase B3LYP model the key transition state was TS2; now it's TS1 (for in fact three of the four possible transition states). The bottom line (almost) is that the same stereoisomer as before comes out the winner! The take home lesson is that in six years of progress, modelling can now encompass solvent and dispersion corrections. Many mechanisms with > ~100 atoms investigated in the past without inclusion of these effects could probably do with a re-investigation, especially if the transition states are “globular” in nature. And by now you are probably wondering what the transition state looks like. Well, here it is (and see it in all its glory by clicking on the diagram below).

(R,R)-(S,S) Transition state for stereoregular lactide polymerisation. Click for animation

And if you are also wondering how one might proceed to analyse the origins of the stereoinduction, the NCI interaction surfaces (as described in this post) are shown below. Note how the extensive degree of green interaction surface is associated with the globular nature referred to above.

Non-covalent interaction (NCI) surfaces for the (R,R)-(S,S) transition state. Click for 3D

Semantically rich molecules

Sunday, May 2nd, 2010

Peter Murray-Rust in his blog asks for examples of the Scientific Semantic Web, a topic we have both been banging on about for ten years or more (DOI: 10.1021/ci000406v). What we are seeking of course is an example of how scientific connections have been made using inference logic from semantically rich statements to be found on the Web (ideally connections that might not have previously been spotted by humans, and lie overlooked and unloved in the scientific literature). It's a tough cookie, and I look forward to the examples that Peter identifies. Meanwhile, I thought I might share here a semantically rich molecule. OK, I identified this as such not by using the Web, but as someone who is in the process of delivering an undergraduate lecture course on the topic of conformational analysis. This course takes the form of presenting a set of rules or principles which relate to the conformations of molecules, and which themselves derive from quantum mechanics, and then illustrating them with selected annotated examples. To do this, a great many semantic connections have to be made, and in the current state of play, only a human can really hope to make most of these. We really look to the semantic web as it currently is to perhaps spot a few connections that might have been overlooked in this process. So, below is a molecule, and I have made a few semantic connections for it (but have not actually fully formalised them in this blog; that is a different topic I might return to at some time). I feel in my bones that more connections could be made, and offer the molecule here as the fuse!

Two chair conformations of the molecule DULSAE. Click here for 3D. Note the (attractive) short H...H contacts.

To list all the likely semantics that a chemist would perceive in the graphic above would take far too long (by the time one would have finished, a text book would have been written). So here is a very very short summary in the context of conformational analysis.

  1. The molecule has a six membered ring as its backbone
  2. which can adopt two possible chair conformations
  3. which can interconvert by exchanging the axial and equatorial group pair for each of the four carbon atoms in the ring.
  4. An organic chemist will immediately notice a very unusual group, Fe(CO)2Cp, which itself is a semantic goldmine,
  5. but which for the purposes here we will regard merely as a C-Fe bond!

The (semantic) question to be posed is “which of the two conformations shown above is the more stable”? That too of course has an abundance of implicit semantics, but most human chemists will probably know that this refers to asking which of the two geometries represents the lower thermodynamic free energy (and we leave aside the issue of what medium the molecule is in, i.e. solid, solution or gas). A far trickier question is “why”?

So to (some interim) answers. Well, a ωB97XD/6-311G(d) calculation (wow, think of what is implied in that concise notation) predicts conformation (a) to be more stable by 2.3 kcal/mol (2.1 in ΔG, see DOI: 10042/to-4911). Now to the why. What connections would someone well versed in conformational analysis spot?

  1. The molecule has two methyl groups on adjacent atoms. They may prefer to be di-axial rather than di-equatorial to avoid excessive steric repulsions (whatever we mean by that!). That might prefer (b).
  2. The molecule has one carbon with both a cyano and an ether linkage. Well, that is susceptible to an anomeric effect (although, as I pointed out in an earlier post here, this connection has in fact often NOT been made in the literature). Only in conformation (a) is one of the oxygen lone pairs aligned anti-periplanar to the axis of the C-CN bond. The reasons why this is important are outlined in my Lecture course.
  3. Having spotted the last, the human might ask whether there is any possibility of an anomeric effect between an oxygen lone pair and the axis of the C-Fe bond? Well, I rather think that not a single human has ever asked that question! (I cannot know that of course, and perhaps someone has speculated upon this in the literature; this is where a full semantic web would help. That question could be posed of it! The reason I suspect the connection might not have been made is that the anomeric effect is the domain of the organic chemist, and C-Fe bonds are those of the organometallic chemist. They do tend to see the chemical world rather differently, these two groups of chemists). If there were such an effect, it would favour (a).
  4. Then we have an X-C-C-Y motif. Depending on the nature of X and Y, the molecule might actually prefer a gauche conformation, i.e. the dihedral angle XCCY would be around 60°. There are several such motifs one can detect; X=Y=O (twice). It might be that other permutations such as X=CN, Y=Fe(CO)2Cp, favour anti-periplanar. There are other permutations whose orientational preference may not even be recorded (in text books). Suddenly it's gotten complicated!
  5. There are a number of short (~2.4 Å) H…H contacts.
  6. We are starting to understand that to unravel the conformation of this molecule, one may have to identify quite a number of different “rules”, and then to quantify each, and add up the numbers to get the final result. That energy of 2.3 kcal/mol may be composed of the result of applying quite a number of different rules. Hence the title of this post, a semantically rich molecule!
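To put the computed preference for (a) into perspective, a free energy difference can be turned into equilibrium populations with K = exp(−ΔG/RT). The sketch below uses the ΔG of 2.1 kcal/mol quoted above; the temperature of 298.15 K and the two-state picture (only conformations (a) and (b) populated) are assumptions, and the function name is illustrative.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def two_state_populations(dg_b_minus_a, temperature=298.15):
    """Fractions of (a) and (b) at equilibrium, given G(b) - G(a) in kcal/mol."""
    k_eq = math.exp(-dg_b_minus_a / (R_KCAL * temperature))  # [b]/[a]
    frac_a = 1.0 / (1.0 + k_eq)
    return frac_a, 1.0 - frac_a


frac_a, frac_b = two_state_populations(2.1)
print(f"(a): {100 * frac_a:.1f}%   (b): {100 * frac_b:.1f}%")
```

Even a difference of around 2 kcal/mol, which may itself be the sum of several competing "rules", corresponds under these assumptions to a population that is overwhelmingly one conformer.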

Well, I will leave it here for this post, without giving answers to the six points listed above, or really answering my main question posed above. That would make the post too complex (but I will follow this up!). I do want to end by planting the idea that answering this question involves making a great many chemical connections about the properties of this molecule, and then identifying quantitative ways (algorithms) in which an answer can be formulated. The molecule above is presented as a challenge for the Semantic Web to address!

Carbobenzene: benzene with a difference

Friday, April 16th, 2010

Some molecules, when you first see them, just intrigue. So it was with carbobenzene, the synthesis of a derivative of which was recently achieved by Remi Chauvin and co-workers (DOI: 10.1002/chem.200601193). Two additional carbon atoms have been inserted into each of the six C-C bonds in benzene.

Carbobenzene

The structure shows two resonance forms, which remind one of Kekulé and of course benzene itself. Counting reveals 18π-electrons in the conventional π sense, but with a further set of 12 π-electrons located in the plane of the ring, and orthogonal to the first set. Since both could be cyclically conjugated, we can say that the first set belongs to a 4n+2 count, and should set up diatropic ring currents resulting in aromaticity, whilst the second set would belong to the 4n category, and might set up paratropic ring currents in the plane of the system. The lowest occupied molecular orbitals of each set look as follows.

The lowest MO for the 18π-electron set. Click for 3D

Lowest occupied molecular orbital for the 12π-electron set. Click for 3D
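The electron counting used above (18 as a 4n+2 count, 12 as a 4n count) is easy to mechanise. The sketch below is only the counting rule, written with an illustrative function name; it says nothing about whether a given system is actually cyclically conjugated or how aromatic it turns out to be.

```python
def classify_electron_count(n_electrons):
    """Return the Hückel-type class ('4n+2' or '4n') and the value of n."""
    if n_electrons <= 0 or n_electrons % 2:
        raise ValueError("expected a positive, even number of electrons")
    if n_electrons % 4 == 2:
        return "4n+2", (n_electrons - 2) // 4
    return "4n", n_electrons // 4


for count in (6, 12, 18):  # benzene, then the two sets in carbobenzene
    kind, n = classify_electron_count(count)
    print(f"{count} electrons: {kind} with n = {n}")
```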

Experimentally, the molecule is found to be aromatic. One way of quantifying this is via the so-called dissected NICS magnetic response index (DOI: 10.1021/ol016217v). At the ring centroid, NICS(0,1,2,3)zz (respectively 0,1,2,3Å above the plane of the ring) are found to be -49, -46, -38 and -28 ppm (DOI: 10042/to-4878). The un-dissected NICS (which includes all σ-current contributions) were -18, -16.6, -13 and -9 ppm. This both confirms diatropicity (for which NICS is strongly negative) and also suggests that the 12-electron π-framework is opposing the 18-electron π-framework.

Another, less common way to study the aromaticity is to look at the delocalization of the electrons using the ELF technique.

ELF function evaluated using only the 18 π electrons. Click for 3D

ELF function evaluated using only the 12 σ-electrons. Click for 3D

The 18-electron set bifurcates (breaks up into smaller basins) at the threshold of 0.87 shown above (the ELF function has a maximum of 1.0 and a minimum of 0.0), a high value which is typical of aromatic systems (benzene bifurcates at ~0.9). In contrast, the 12-electron set breaks up well before a value of 0.1 (shown), a low value which tends to indicate anti-aromaticity.

There are many other ways of exploring the properties of such aromatic molecules, but the two above tend to suggest that carbobenzene has two personalities, one aromatic, the other antiaromatic, and with the former dominant. This gives it an interesting twist on benzene itself, and makes one wonder whether this dual Janus-like personality could be exploited in some interesting fashion.

The structure of the hydrogen ion in water.

Sunday, February 21st, 2010

Stoyanov, Stoyanova and Reed recently published on the structure of the hydrogen ion in water. Their model was H(H2O)n+, where n=6 (DOI: 10.1021/ja9101826). This suggestion was picked up by Steve Bachrach on his blog, where he added a further three structures to the proposed list, and noted of course that with this type of system there must be a fair chance that the true structure consists of a well-distributed Boltzmann population of a number of almost iso-energetic forms.

The proposed structure of the hydrated proton in water

The evidence for the structure above comes from IR spectra. These operate on a fast enough scale to freeze-out individual forms, and therefore represent the instantaneous species rather than time averaged environments. A lively debate started on Steve’s blog, starting with Steve’s observation that the original article had reported only experimental results and no theoretical modelling of the proposed structure. It emerged that one way of modelling such species was within a cavity surrounded by a continuum field modelling the bulk solvent (water in this case), and in particular one must properly optimize the structure and calculate the force constants within this field. When this is done, one significant difference between a simple gas-phase model of the structure above and its continuum-field structure emerges. In the former, the central O…H…O motif is symmetric (indeed the entire molecule is C2-symmetric). When the solvent field is applied, this unit desymmetrizes, ending up with one short (1.118Å) and one long (1.295Å) bond. I have transferred discussion of this from Steve’s blog to this one so that the resulting vibrations of this species can be shown here in animated form (it's not possible to post animations in the comment field of a blog).

Firstly, the model. It is a PBE1PBE/aug-cc-pVTZ (the DFT method being the same as Steve used in his modelling, the basis set being rather better) and the continuum field applied was SCRF(CPCM,solvent=water). The complete calculation can be inspected at DOI: 10042/to-4261. It is also important to remember that the force constants are harmonic. The resulting vibrations with the highest calculated intensities are tabled below.

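| Obs (cm⁻¹) | Calc 1H freq (cm⁻¹) | Intensity | Calc 2H freq (cm⁻¹) |
|---|---|---|---|
| - | 338 | 481 | ? |
| 654 | 476 | 429 | 438 |
| 1202 | 1242 | 3837 | 942 |
| 1746 | 1749 | 599 | 1284 |
| 2816 | 3065 | 2829 | 2268 |
| - | 3127 | 913 | 2253 |
| 3134 | 3341 | 2018 | 2462 |
| 3134 | 3347 | 668 | 2424 |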

One might note that the vibrations in the range 3100-3300 always tend to be over-estimated using theory, in part because of incomplete basis sets, and in part because the harmonic frequencies are always 200 or more wavenumbers higher than the observed anharmonic values. The match for the mid range vibrations (1746, 1202) seems remarkably good. Only the low range value (654) is significantly out, and this may be another anharmonic effect. Added for good measure are the closest matches to each vibration when the system is fully substituted with deuterium (because of mode mixing, the modes do not always map exactly; thus the mode at 338 appears to have no exact deuteriated analogue).
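The comparison described above can be made explicit with a short script. This is a sketch in which the numbers are transcribed from the table, reading its columns as observed frequency, calculated 1H frequency, intensity and calculated 2H frequency (that reading, and the function names, are assumptions). For orientation, a vibration involving pure hydrogen motion in the harmonic approximation would show a 2H/1H frequency ratio close to 1/√2 ≈ 0.71; larger ratios point to heavier atoms taking part in the mode.

```python
# (observed, calculated 1H, intensity, calculated 2H); None marks a missing entry
MODES = [
    (None, 338, 481, None),
    (654, 476, 429, 438),
    (1202, 1242, 3837, 942),
    (1746, 1749, 599, 1284),
    (2816, 3065, 2829, 2268),
    (None, 3127, 913, 2253),
    (3134, 3341, 2018, 2462),
    (3134, 3347, 668, 2424),
]


def calc_minus_obs(modes):
    """(calculated 1H frequency, calculated - observed) for modes with an observed value."""
    return [(calc_h, calc_h - obs) for obs, calc_h, _, _ in modes if obs is not None]


def isotope_ratios(modes):
    """(calculated 1H frequency, 2H/1H ratio) for modes with a deuterated match."""
    return [(calc_h, calc_d / calc_h) for _, calc_h, _, calc_d in modes if calc_d is not None]


for calc_h, diff in calc_minus_obs(MODES):
    print(f"mode {calc_h:>4}: calc - obs = {diff:+d} cm-1")
for calc_h, ratio in isotope_ratios(MODES):
    print(f"mode {calc_h:>4}: 2H/1H ratio = {ratio:.3f}")
```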

The displacement vectors are shown below (click on each picture to obtain an animation).

Normal mode 476 cm⁻¹.

Normal mode 1242.

Normal mode 1749

Normal mode 3065

Normal mode 3341

Normal mode 3347

Calculated IR spectrum for H(H2O)6 +

Calculated IR spectrum for D(D2O)6 +

The overall conclusion seems to be that the structure shown above for the solvated proton matches the observed IR peaks rather well, and that further, more accurate modelling of this species might be a worthwhile endeavour.

To blog or to publish. That is the question.

Tuesday, February 9th, 2010

Scientists write blogs for a variety of reasons. But these probably do not include getting tenure (or grants). For that one has to publish. And I will argue here that a blog is not currently accepted as a scientific publication (for more discussion on this point, see this article by Maureen Pennock and Richard Davis). For chemists, publication means in a relatively small number of high-impact journals. Anything more than five articles a year in such journals, and your tenure is (probably) secure (if not your funding).

Can one do both? Post a blog item, and then publish a follow-up in a high-impact journal? Well, yes and no.

I had better explain. A blog post is more often than not catalysed by reading an article, viewing another blog, or discussing something with a colleague. One posts in the hope of getting some feedback, from which one’s ideas might mature, develop, or indeed collapse! Scientists have long done this of course, albeit with a colleague down the corridor, at conferences or seminars. The ideas thus cast forth may of course also get stolen, and so these traditional mechanisms for floating ideas are often very short on detail. Sometimes, returning to the idea of blogs, one post can lead to another, and the nature of the blog means the ideas can evolve and mutate very rapidly. Eventually, one might wish to take a good overview of all the various efforts. At this point, one is now considering publishing a journal article, since currently at least, the longevity of a journal is considered longer than that of a blog (see this post here for more ruminations on that theme). There are other good reasons for then choosing a journal rather than one’s blog. The QA (quality assurance) necessary to get an article accepted in a good journal is, let’s face it, rather greater than that of a blog (although to be fair, it is only motivation that limits the quality of the latter). Apart from adding all those control experiments/calculations that may be missing from the blog, one also must be far more fastidious in citing the literature correctly.

I do speak from (thus far a single) experience. The story starts here, this being the initial post on a story that broke on Steve Bachrach’s blog about a compound with a potentially pentavalent carbon; Steve’s own post was based on an original article on the theme. Several more blog posts followed as the logical theme gradually developed. I eventually decided that telling how this set of logical connections came about was almost as interesting as the specific molecules it covered. The story had also evolved from discussing the element astatine to speculating about the rare gas helium, a somewhat less than obvious connection path (and how to discover connections between disparate and apparently unconnected concepts is a different story). Where should the story about how astatine was connected to helium be told? I decided it should indeed be in a formally published journal article. But it was also important to tell the story more or less as it happened, and particularly to include the role that the blogs themselves had played.

In fact, as soon as I started this undertaking, I realised that more calculations, and at a rather higher theoretical level, needed to be done in order to persuade the referees of the article that the science was sound, and also that it advanced our knowledge significantly. In the event, although the calculations were repeated, enhanced, or evolved in some manner or other, and new ideas injected, none of the original assertions was proven wrong (and of course it’s now not just me that thinks this, but the 2-3 referees who also commented). Ultimately, I would estimate I ended up spending perhaps ten times as much time on the journal article as on the sum of the initial blog posts on the topic. It is an interesting question as to whether the motivation needed to put in this amount of care and attention could also have been generated with a blog as the sole output medium (see my opening remarks).

The article is now published (DOI: 10.1038/nchem.596). Of course, you can only read it if your institution (or you personally) has a subscription to the journal (although, like this blog, the article can be located using public search facilities such as Google Scholar). There is another aspect of both the blog and the article worth mentioning. Both contain data. The blogs contain the molecular coordinates of all the molecules discussed, as well as the DOIs for the digital repository where the calculations are archived. So does the article, in the form of an interactive table, although again access to this table may or may not require a journal subscription. In this regard I note that whereas an earlier article I wrote for this publisher (see DOI: 10.1038/nchem.373) is protected from non-subscribers, the interactive table which is part of that article is openly accessible. The journal deserves full credit for allowing this data to be on public access.

There is another aspect of the blog and the article, which was alluded to above: I introduced the theme of linking concepts together. This very blog post (and all the others) has been subjected to analysis using the calais archive tagger. This automatically determines appropriate tags to annotate each post with, and then declares them using standard methods (which include RDF). The published article is similarly tagged by the publisher. In theory at least, this collection of materials, the blogs and their tags, and the article and indeed commentaries about both, should be reconcilable using appropriate semantic searches. But at this point, I feel that this topic deserves separate attention and I will close here.

The conformational analysis of cyclo-octane

Sunday, January 31st, 2010

In the previous post, I suggested that inspecting the imaginary modes of planar cyclohexane might be a fruitful and systematic way in which at least parts of the conformational surface of this ring might be probed. Here, the same process is conducted for cyclo-octane. The ring starts with planar D8h symmetry, and at this geometry (B3LYP/6-311G(d,p), DOI: 10042/to-3742) five negative force constants (corresponding to imaginary modes) are calculated. The most negative is non-degenerate, and gives directly the crown conformation of D4d symmetry (DOI: 10042/to-3738). The remaining four modes comprise two degenerate pairs. Following either of the E2u eigenvectors downhill leads to another conformation, D2d (DOI: 10042/to-3741), with a geometry which is noteworthy for exhibiting a pair of unusually close non-bonded H…H contacts (1.908Å). This value is about 0.3Å shorter than the sum of the van der Waals radii (DOI: 10.1021/jp8111556). We can debate whether such a close approach or inter-penetration of two hydrogens is a bond or not (an AIM analysis appears at the bottom of this post).
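The comparison with the van der Waals sum is simple arithmetic, sketched below. The hydrogen radius of 1.10 Å is an assumption made here for illustration (the post quotes only the ~0.3Å difference, with which this value is consistent), and the function name is hypothetical.

```python
# Compare a non-bonded H...H contact with the sum of van der Waals radii.
# ASSUMPTION: 1.10 Å for hydrogen; the post itself quotes only the ~0.3 Å difference.
H_VDW_RADIUS = 1.10  # Å, assumed value

def vdw_shortfall(distance, radius_a=H_VDW_RADIUS, radius_b=H_VDW_RADIUS):
    """Return how far (in Å) a contact lies inside the sum of the two radii."""
    return (radius_a + radius_b) - distance

# 1.908 Å is the H...H distance quoted for the D2d conformer.
shortfall = vdw_shortfall(1.908)
print(f"H...H contact lies {shortfall:.2f} Å inside the van der Waals sum")
```

A positive result means the two atoms sit closer than their radii would suggest; a different choice of radius shifts the result by twice the change in radius.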

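Relative energies (kcal/mol) of the stationary points reached from planar D8h (+82.8):

Follow B2u (467i): to D4d, +0.8

Follow E3g (404i): to Ci, +7.5, with imaginary mode 131i (Au)

Follow E2u (230i): to D2d, +3.6

[Figures: B2u, E3g and E2u modes]

Cs: 0.0; C2: +1.6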

Following the remaining E3g mode leads to a stationary point of Ci symmetry (DOI: 10042/to-3743). This is a valley-ridge potential, since this point turns out to be a transition state itself, and following the Au imaginary mode at this point results in another, this time stable conformation, of chiral C2 symmetry (DOI: 10042/to-3744). This has a calculated optical rotation [α]D of 72° (at 589nm in chloroform).

Are these three conformations all there are? Well, a thorough analysis of the conformational space has in fact identified six minima (DOI: 10.1002/(SICI)1096-987X(19980415)19:5<524::AID-JCC5>3.0.CO;2-O), of which the most stable has Cs symmetry (the so-called chair-boat conformation, and the one most frequently found in crystal structures of cyclo-octanes). Where is that one in the above analysis? It arrives by a distortion of the D4d form (DOI: 10042/to-3747) via a transition state of no symmetry (DOI: 10042/to-3752).

Whilst the full potential surface clearly has many more features, following the modes of the planar conformation of cyclo-octane is a simple and rapid way of establishing four of the six limiting stable conformations (the two remaining forms have D2 and S4 symmetry, see DOI: 10.1016/0166-1280(88)80008-3).
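The idea of following negative force constants downhill can be sketched on a toy surface. The two-dimensional double-well potential below is purely an assumption chosen for illustration (it is not cyclo-octane, and this is not the quantum-chemical procedure used for the results above); the function names, step sizes and tolerances are likewise hypothetical.

```python
import numpy as np

def energy(p):
    """Toy surface: a maximum along x at the origin, a well along y."""
    x, y = p
    return (x**2 - 1.0)**2 + y**2

def gradient(p):
    x, y = p
    return np.array([4.0 * x * (x**2 - 1.0), 2.0 * y])

def hessian(p):
    x, y = p
    return np.array([[12.0 * x**2 - 4.0, 0.0], [0.0, 2.0]])

def follow_negative_modes(start, nudge=0.05, step=0.05, tol=1e-8, max_iter=10000):
    """Displace along each negative-curvature eigenvector, then slide downhill."""
    eigenvalues, eigenvectors = np.linalg.eigh(hessian(start))
    minima = []
    for value, vector in zip(eigenvalues, eigenvectors.T):
        if value >= 0.0:
            continue  # only negative force constants (imaginary modes) are followed
        for sign in (+1.0, -1.0):
            p = np.asarray(start, dtype=float) + sign * nudge * vector
            for _ in range(max_iter):
                g = gradient(p)
                if np.linalg.norm(g) < tol:
                    break
                p = p - step * g  # plain steepest descent
            minima.append(p)
    return minima

# The origin is a stationary point with one negative force constant.
minima = follow_negative_modes(np.array([0.0, 0.0]))
for m in minima:
    print(np.round(m, 4), round(energy(m), 6))
```

In this toy case the single negative mode at the origin leads to the two equivalent wells on either side. On a real molecular surface the downhill walk may instead stop at another stationary point that still has an imaginary mode, as described above for the Ci geometry, and the procedure is then repeated from there.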

AIM analysis of D2d cyclo-octane.

Finally, as promised, the AIM analysis of the D2d conformer (above). The ρ(r) value at the interesting H…H critical point is 0.015, which is pretty high in comparison to most normal hydrogen bonds, and would be conventionally taken to indicate attraction. The Laplacian ∇²ρ(r) is +0.05. The “bond” ellipticity ε has a value of 0.29. Values for single bonds are close to zero, and those for C=C double bonds are ~0.4, so this is pretty high (see also DOI: 10.1002/anie.200805751).
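For readers unfamiliar with these descriptors, both follow from the three eigenvalues of the Hessian of ρ(r) at the critical point: the Laplacian is their sum, and the ellipticity is ε = λ1/λ2 − 1, where λ1 and λ2 are the two negative eigenvalues with |λ1| ≥ |λ2|. The sketch below applies these standard definitions; the function name and the Hessian values are hypothetical and are not taken from the calculation reported here.

```python
import numpy as np

def bcp_descriptors(density_hessian):
    """Return (Laplacian, ellipticity) from the 3x3 Hessian of rho at a critical point."""
    l1, l2, l3 = np.linalg.eigvalsh(density_hessian)  # ascending: l1 <= l2 <= l3
    if not (l1 < 0.0 and l2 < 0.0 and l3 > 0.0):
        raise ValueError("not a (3,-1) bond critical point")
    laplacian = l1 + l2 + l3        # trace of the Hessian
    ellipticity = l1 / l2 - 1.0     # zero for a cylindrically symmetric density
    return laplacian, ellipticity

# HYPOTHETICAL Hessian (atomic units), for illustration only; not data from this post.
example = np.diag([-0.5, -0.4, 1.5])
laplacian, ellipticity = bcp_descriptors(example)
print(f"Laplacian = {laplacian:+.2f}, ellipticity = {ellipticity:.2f}")
```

A positive Laplacian, as in this made-up example and in the value quoted above, means the curvature along the interaction path outweighs the two perpendicular curvatures.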

The two highest C-H stretching vibrations for this conformation are well separated from all the others (ν 3095, 3103 cm⁻¹ for the symmetric A1 and antisymmetric B2 combinations; see below for animations). These vibrations serve to both decrease and increase the H…H distances as part of the atomic (harmonic) displacements, and clearly doing so takes more energy than vibrating any of the other C-H bonds. It seems unlikely that the C-H bonds are themselves stronger, so does that mean that the H…H interaction is attractive or is it repulsive? In this context, it is worth noting that the symmetric vibration (both H…H distances decrease/increase at the same time) is lower in wavenumber than the mode which decreases one and increases the other.

[Animations: A1 (symmetric) and B2 (antisymmetric) C-H stretching modes]
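The ordering of the symmetric and antisymmetric combinations discussed above can be pictured with a textbook model of two identical, weakly coupled harmonic oscillators. This is only a toy picture under stated assumptions (unit masses, a diagonal force constant k and a coupling constant k12, both hypothetical numbers) and not the computed force field; it shows how the sign of the coupling decides which combination lies lower, without settling whether the H…H interaction is attractive or repulsive.

```python
import numpy as np

def coupled_pair_modes(k, k12):
    """Frequencies (arbitrary units, unit masses) of two identical coupled oscillators.

    Potential: V = 0.5*k*(q1**2 + q2**2) + k12*q1*q2
    """
    force_constants = np.array([[k, k12], [k12, k]])
    eigenvalues, eigenvectors = np.linalg.eigh(force_constants)
    modes = {}
    for value, vector in zip(eigenvalues, eigenvectors.T):
        # In-phase displacements share a sign; out-of-phase ones differ in sign.
        label = "symmetric" if vector[0] * vector[1] > 0 else "antisymmetric"
        modes[label] = float(np.sqrt(value))
    return modes

# HYPOTHETICAL force constants: a small negative coupling term.
modes = coupled_pair_modes(k=1.0, k12=-0.01)
print(modes)
```

In this model the symmetric combination has force constant k + k12 and the antisymmetric one k − k12, so a negative coupling places the symmetric mode lower and a positive coupling reverses the order.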