crystal_structure_mining « Henry Rzepa's blog

Archive for the ‘crystal_structure_mining’ Category

Anomeric effects at boron, silicon and phosphorus.

Friday, July 1st, 2016

The anomeric effect occurs at 4-coordinate (sp³) carbon centres carrying two oxygen substituents and involves an alignment of a lone electron pair on one oxygen with the adjacent C-O σ*-bond of the other oxygen. Here I explore whether other centres can exhibit the phenomenon. I start with 4-coordinate boron, using the crystal structure search definition below (along with R < 0.1, no disorder, no errors).[1]

The result shows two prominent clusters, one with both torsion angles being 180°, and another with both being ~60°. This latter is the one that implies that there must be two lone pairs, one on each oxygen, that are anti-periplanar to the adjacent B-O bond. There are two more diffuse clusters where only one antiperiplanar alignment is seen. So yes, 4-coordinate boron can exhibit an anomeric effect!
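The cluster assignment described above can be sketched in a few lines of Python. This is only an illustration, not the search software itself: the function name, the ±30° tolerance and the example torsion pairs are all assumptions of mine, and the two torsions are taken to be the pair measured by the search definition.

```python
def count_anomeric_alignments(tor1, tor2, tol=30.0):
    """Count how many of the two torsions (in degrees) are gauche (~60°).

    Following the text, a ~60° torsion is read as one antiperiplanar
    lone pair alignment. The tolerance is an assumed value.
    """
    def is_gauche(t):
        return abs(abs(t) - 60.0) <= tol
    return sum(is_gauche(t) for t in (tor1, tor2))

# Hypothetical torsion pairs, one for each type of cluster described above
examples = {
    "both ~180": (178.0, -175.0),
    "both ~60": (58.0, -63.0),
    "mixed": (62.0, 177.0),
}
counts = {name: count_anomeric_alignments(*pair) for name, pair in examples.items()}
print(counts)
```

A count of 2 corresponds to the double anomeric cluster, 1 to the more diffuse off-diagonal clusters and 0 to the cluster with both torsions at 180°.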

This compares to the carbon-anomeric plot, shown here for comparison, where the top right cluster of 180° torsions contains proportionately fewer hits than with boron.

The next centre is 4-coordinate silicon. Again three significant clusters are seen; one with two antiperiplanar lone pair alignments with Si-O bonds, and two more with just one such alignment. The previous hotspot for which both measured torsions were 180° is largely absent. So here, the anomeric effect is much stronger. Notice also that whereas the torsions in the region of 60° for the carbon centre lie along a ridge coincident with the diagonal (bottom left to top right), those for the silicon centre show a ridge running orthogonal to the diagonal. An interesting point to follow up perhaps?

Since the off-diagonal clusters are relatively prominent, implying just one anomeric interaction, it is of interest to see if this results in any asymmetry in the two Si-O bond lengths. If it's present, the effect is small.

Finally 4-coordinate group 15 elements. Most of the hits are in fact for P; there are none for N. This shows four clusters; the two on the diagonal show respectively two and no antiperiplanar interactions. The two off-diagonal clusters show just one such orientation. As with Si, the ridge in the 60° region runs orthogonal to the diagonal.

So this little exploration shows that the anomeric effect, best known for sugars and at a carbon centre, is in fact more general to the adjacent elements.

References

H. Rzepa, "Anomeric effects at boron, silicon and phosphorus.", 2016. https://doi.org/10.14469/hpc/696

Tags:Acetals, Alkane stereochemistry, Anomer, Anomeric effect, Bond length, Boron, Carbohydrate, Carbohydrate chemistry, Carbohydrates, crystal structure search definition, Ester, Physical organic chemistry, Stereochemistry
Posted in crystal_structure_mining | No Comments »

How does an OH or NH group approach an aromatic ring to hydrogen bond with its π-face?

Wednesday, June 22nd, 2016

I previously used data mining of crystal structures to explore the directing influence of substituents on aromatic and heteroaromatic rings. Here I explore, quite literally, a different angle to the hydrogen bonding interactions between a benzene ring and OH or NH groups.

aromatic-pi-query

I start by defining a benzene ring with a centroid. The distance is from that centroid to the H atom of an OH or NH group and the angle is C-centroid-H. To limit the search to approach of the OH or NH group more or less orthogonal to the ring, the absolute value of the torsion between the centroid-H vector and the ring C-C vector is constrained to lie between 70-100° (the other constraints being no disorder, no errors, T < 140K and R < 0.05).[1]
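The distance and angle defined above are simple to compute from atomic coordinates. The following is a minimal sketch using only the Python standard library; the idealised planar ring (C-C 1.39Å) and the position of the H atom are hypothetical values chosen for illustration, and the torsion constraint of the actual query is not reproduced here.

```python
import math

def centroid(points):
    """Arithmetic mean of a list of (x, y, z) coordinates."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def angle_deg(a, vertex, b):
    """Angle a-vertex-b in degrees."""
    u, w = sub(a, vertex), sub(b, vertex)
    cos_t = sum(x * y for x, y in zip(u, w)) / (norm(u) * norm(w))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))

# Idealised benzene ring: regular hexagon in the xy-plane (assumed geometry)
ring = [(1.39 * math.cos(math.radians(60 * k)),
         1.39 * math.sin(math.radians(60 * k)), 0.0) for k in range(6)]
h_atom = (0.0, 0.0, 2.6)  # hypothetical H placed directly above the ring centre

c = centroid(ring)
distance = norm(sub(h_atom, c))       # centroid...H distance in Å
angle = angle_deg(ring[0], c, h_atom)  # C-centroid-H angle in degrees
print(round(distance, 2), round(angle, 1))
```

An H atom sitting directly over the centroid gives an angle of 90°; drifting towards one of the ring carbons lowers the angle measured to that carbon.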

aromatic-pi-HN-140

The above shows the results for NH groups interacting with the aromatic ring. The maximum distance of 2.8Å is more or less the van der Waals contact distance between a hydrogen and a carbon and, as you can see, the contacts "funnel down" to the centroid at < 2.1Å. The shortest distance[2] is for ammonium tetraphenylborate, which you can view in e.g. spacefill mode here.[3]

The other interesting close contact derives from a protonated pyridine[4], which can in turn be viewed here.[5] The main message from the distribution shown above is that as the distances between the HN and the centroid get shorter, the "trajectory" of approach remains orthogonal to the ring (the angle defined above remains ~90°) and heads towards the centroid of the π-cloud. The hotspot itself (red, ~2.6Å) also lies along this trajectory.

Recollect that when I used such hydrogen bonding to see if crystal structures discriminate between the ortho or meta positions of a ring carrying an electron donating substituent, it was the distance from a HO to the carbon that was measured as the discriminator. So it's a faint surprise to find that with HN, and without the necessary perturbation of an electron donating substituent, the intrinsic preference seems to be for the ring centroid and not any specific carbon atom of the ring.

So how about the OH group? There are in fact rather fewer examples, and so the statistics are a bit less clear-cut. But there is a tantalising suggestion that this time, the trajectory is not ~90° but rather less, implying that the destination is no longer the centroid of the π-cloud but one of the carbon atoms of the ring itself. For those who like to "read between the lines" and spot things that are absent rather than present, you may have asked yourself why I did not use NH probes in my earlier post. Well, it appears that the NH group is less effective at e.g. o/p discrimination than is an OH group.

aromatic-pi-OH-140

I can only speculate as to the origins (real or not) of the difference in behaviour between OH and NH groups towards a phenyl π-face. Perhaps it is simply bias in the CSD database? Or might there be electronic origins? Time to end with that phrase "watch this space".

References

H. Rzepa, "How does an OH or NH group approach an aromatic ring to hydrogen bond with its π-face?", 2016. https://doi.org/10.14469/hpc/673
T. Steiner, and S.A. Mason, "Short N⁺—H...Ph hydrogen bonds in ammonium tetraphenylborate characterized by neutron diffraction", Acta Crystallographica Section B Structural Science, vol. 56, pp. 254-260, 2000. https://doi.org/10.1107/s0108768199012318
T. Steiner, and S.A. Mason, "CCDC 144361: Experimental Crystal Structure Determination", 2000. https://doi.org/10.5517/cc4v6tz
O. Danylyuk, B. Leśniewska, K. Suwinska, N. Matoussi, and A.W. Coleman, "Structural Diversity in the Crystalline Complexes of para-Sulfonato-calix[4]arene with Bipyridinium Derivatives", Crystal Growth & Design, vol. 10, pp. 4542-4549, 2010. https://doi.org/10.1021/cg100831c
O. Danylyuk, B. Lesniewska, K. Suwinska, N. Matoussi, and A.W. Coleman, "CCDC 819118: Experimental Crystal Structure Determination", 2011. https://doi.org/10.5517/ccwhc5w

Tags:10.1021, 10.1107, 10.5517, aromaticity, benzene, Centroid, chemical bonding, data mining, Functional groups, Hydrogen bond, Physical organic chemistry, Pyridine, Simple aromatic rings, Supramolecular chemistry
Posted in Chemical IT, crystal_structure_mining | 3 Comments »

Exploring the electrophilic directing influence of heteroaromatic rings using crystal structure data mining.

Tuesday, June 21st, 2016

This is a follow-up to the post on exploring the directing influence of (electron donating) substituents on benzene[1] with the focus on heteroaromatic rings such as indoles, pyrroles and group 16 analogues (furans, thiophenes etc).

The search query is shown above (and is available here[2]). As before, the distance is compared from an electrophile, modelled as the hydrogen atom of an OH group, to both the carbon next to the heteroatom (C2) and the C3 carbon. The torsion is defined so as to ensure that the OH group is approaching the π-face of the ring. The other constraints are R < 0.1, no disorder, no errors and normalised H positions.

Firstly, indoles (as above). There are only a few hits, but even so one can see that they all cluster in the top left triangle, where the distance to C2 is always longer than to C3. Indeed, this is the known position for electrophilic substitution of indoles.

The search can be extended by removing the benzo group so as to also include pyrroles. More hits are obtained, and again most of them collect in the top left triangle. The hot spot indicates that the difference in lengths is ~0.3Å in favour of the 3-position, a very similar discrimination to that previously found for benzene groups with an electron donating substituent.
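The "top left triangle" test amounts to asking, for each hit, whether the H...C2 distance exceeds the H...C3 distance. A rough sketch of that bookkeeping follows; the function name is mine and the four distance pairs are invented purely for illustration, they are not hits from the search.

```python
def summarise(contacts):
    """Return (fraction of hits closer to C3, mean of d(C2) - d(C3)).

    contacts: list of (d_C2, d_C3) distances in Å, one pair per hit.
    """
    diffs = [d_c2 - d_c3 for d_c2, d_c3 in contacts]
    favour_c3 = sum(d > 0 for d in diffs)
    return favour_c3 / len(diffs), sum(diffs) / len(diffs)

# Hypothetical (d_C2, d_C3) pairs in Å, for illustration only
contacts = [(2.9, 2.6), (3.0, 2.7), (2.8, 2.5), (2.6, 2.7)]
fraction_c3, mean_diff = summarise(contacts)
print(fraction_c3, round(mean_diff, 2))
```

A fraction well above 0.5 together with a positive mean difference corresponds to points collecting in the top left triangle of the plot.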

Next, the N atom is replaced by any atom from group 16 of the periodic table (i.e. O, S, etc). The scatter is now in both top left and bottom right triangles, which suggests much weaker discrimination between C2 and C3; if anything in favour of C2 (often the observed regiospecificity for such compounds).

Finally, pyridines. Only a slight bias towards the C2 position. With pyridines of course, the electrophile in fact first interacts with the nitrogen lone pair in the plane of the molecule, which perturbs the eventual outcome. So this crystallographic method is perhaps a better intrinsic probe than kinetic reactivity.

References

H.S. Rzepa, "Discovering More Chemical Concepts from 3D Chemical Information Searches of Crystal Structure Databases", Journal of Chemical Education, vol. 93, pp. 550-554, 2015. https://doi.org/10.1021/acs.jchemed.5b00346
H. Rzepa, "Search for HO interactions to indoles, pyrroles, furans, and thiophenes", 2016. https://doi.org/10.14469/hpc/665

Tags:Asymmetric hydrogenation, benzene, benzo, Electrophile, Furan, Indole, Pyridine, Pyrrole, search query, Simple aromatic rings, Substitution reaction, Thiophene
Posted in crystal_structure_mining | No Comments »

Why is the carbonyl IR stretch in an ester higher than in a ketone: crystal structure data mining.

Saturday, June 18th, 2016

In this post, I pondered upon the C=O infra-red spectroscopic properties of esters, and showed three possible electronic influences:

s-cis-ester1

The red (and blue) arrows imply the C-O bond might shorten and the C=O bond would lengthen; the green the reverse. So time for a search of the crystal structure database as a reality check. The query is as follows:

s-cis-ester1

The response shows the bimodal distribution with as expected the s-cis conformation dominating. There is indeed a hint that for the s-cis, the C-O distance is rather shorter than for the s-trans conformation.

s-cis-ester1

Repeating the search, but specifying that the temperature of data acquisition is < 90K, one gets a much clearer indication of the difference in bond lengths.

s-cis-ester1

This alternative representation shows the C-O and the C=O distances, with red indicating s-trans and blue indicating s-cis conformations (T < 140K). The red dots occupy a bottom right cluster for which the C-O distance is longer and the C=O shorter than the corresponding blue cluster.

s-cis-ester1

Again reducing the temperature of data collection to < 90K shows a rather weak inverse correlation between the two distances for e.g. the blue dots.

s-cis-ester1

A shame however that this database does not hold IR values for the carbonyl stretches. I am sure correlations must exist, but how to get at them (other than by manual collection of data)?

Tags:Ester, Functional groups, Infra-Red
Posted in Chemical IT, crystal_structure_mining | 1 Comment »

A wider look at π-complex metal-alkene (and alkyne) compounds.

Monday, June 13th, 2016

Previously, I looked at the historic origins of the so-called π-complex theory of metal-alkene complexes. Here I follow this up with some data mining of the crystal structure database for such structures.

Alkene-metal "π-complexes" have what might be called a representational problem; they do not happily fit into the standard Lewis model of using lines connecting atoms to represent electron pairs. Structure 1 was the original representation used by Dewar, intending the meaning of partial back donation from a filled metal orbital to the empty π* of the alkene. At the other extreme these compounds can be called metallacyclopropanes (2) in which only single bonds feature (these can be thought of as representing full back bonding from metal to alkene and full forward bonding from alkene to metal). Representations 3 and 4 are a more fuzzy blend of these, implying some sort of partial bond order for the metal-carbon bonds. Taken together, they imply that the formal bond order of the C-C bond might vary between single and double. Structures 1 and 2 in particular imply that there might be two distinct ways of arranging the bonding and that π-complexes and metallacyclopropanes might therefore be distinct valence-bond isomers, each potentially capable of separate existence.

Why do these representations matter? Well, I am going to mine the crystal structure database for these species to try to see if there is any evidence for a bimodal distribution in the C-C lengths, perhaps indicating evidence of the isomerism suggested above. Such a structural database is indexed against atom-pair connectivity in the first instance and then bond type; one can specify the following types of bond connecting any two atoms: single, double, triple, quadruple, polymeric, delocalised, pi and any. It is not entirely obvious which if any of these types apply to structure 1 (it is not possible to draw a bond ending at the mid-point of another bond using the Conquest structure editor); the dashed lines in structures 3 and 4 could be classed as delocalised, pi, or most generally any. The search query can be constructed thus, where the two carbons carry R which can be either H or C and all four C-R bonds are specified as acyclic (to try to avoid complications by excluding compounds such as cyclic metallacenes). Because representation 1 cannot be constructed in the editor, I am going to specify that each carbon carries four bonds of any type in the first instance. The torsion specified is defined as R-C-C-M and the full queries can be found deposited here.[1]

If the metallacyclopropane representation 2 is defined with explicit single bonds, one gets only 22 hits (no errors, no disorder, R < 0.1). The distribution of C-C bond lengths is shown below. Already one sees a representational problem emerging. A true metallacyclopropane might be expected to show a C-C single bond length, say > ~1.5Å. But only one or two of these examples actually have this value, the most probable value being ~1.4Å.

Using representation 3, one gets 1861 hits, but as before one sees a maximum at ~1.4Å with a tail reaching to both single and double bond values for the C-C distance.^‡

If the C-C bond is also specified as "any", the hits increase to 3948, but the bond length distribution is still very similar, with no sign of any bimodal distribution.

Such a distribution is however found if the torsions between the R-C bond vector and the C-M bond vector are plotted (for all types of bond). A large number of the complexes have a torsion <90°, which suggests that in fact the substituent R is probably interacting with the metal (even though this would lead to formal cyclicity, specifying R-C as acyclic does not detect this interaction). Could this be masking a bimodal distribution in the C-C lengths?

If the previous search is repeated, but this time specifying that all four torsions must lie in the range 90-180° (the range expected for a "classical" alkene-metal complex and selecting only the top right hand side cluster in the plot above), a reduced total of 1051 hits is obtained, but the monomodal distribution remains.

For this last set, here is a plot of the two C-metal bond lengths, with colour indicating the C-C bond length; it shows that the two C-metal bond lengths are clearly linearly correlated.

One final variation; the atom on either C can only be H or a 4-coordinate (sp³) carbon; 645 hits. Again, a monomodal distribution centered at 1.4Å.

So this foray through metal alkene complexes suggests that there is a continuum between the formal metallacyclopropane with a C-C single bond and the only slightly perturbed alkene-metal complex with a C=C double bond. Whilst this would not prevent any one of these compounds existing as two distinctly different valence-bond isomers, it makes it very unlikely. I had noted in an earlier post that for molecules of the type RX≡XR (X=Si, Ge, Sn, Pb) that there was indeed a clear bimodal distribution of the X-X lengths evident in the crystal structures (for a relatively small sample number). The structures 1-4 shown at the start of this post are all simply just variations in a continuum and not distinct isomers.

POSTSCRIPT: I noted above the bimodal distribution in compounds involving formal triple bonds. So I repeated the search above for π-complex metal-alkyne complexes. Specifying an acyclic C-R bond, and any for the CC bond type, one gets the following.

There is now a tantalizing suggestion of two clusters, one at 1.3 and another at 1.4Å. The torsional distribution shows that the latter distance appears to be associated with much smaller torsions, whereas the top right cluster is associated with shorter lengths.

If the torsions are restricted to the range 90-180°, then the histogram loses the smaller cluster, and perhaps gains a second cluster at 1.22Å? As I said, all quite tantalizing!

^‡The tail in all the histograms extends into the 1.1-1.3Å region, which seems unreasonable for a carbon where four bonds are specified. This region probably represents errors in the crystallographic analysis or reporting. But who knows, perhaps some very unusual compounds are lurking there!

References

H. Rzepa, "A wider look at the π-complex theory of metal-alkene compounds.", 2016. https://doi.org/10.14469/hpc/642

Tags:alkene, alkene-metal complex, alkyne, Bond length, Carbon–carbon bond, Chemical bond, chemical bonding, Cluster chemistry, Conquest structure editor, Coordination complex, data mining, double bond, editor, filled metal orbital, metal, metal-alkene complexes, metal-alkyne complexes, metal-carbon bonds, Pi backbonding, search query, Structural formula, Transition metal alkyne complex
Posted in crystal_structure_mining | No Comments »

A wider look at chlorine trifluoride: crystal structures and data mining.

Friday, June 10th, 2016

A while ago, I explored how the 3-coordinate halogen compound ClF₃ is conventionally analyzed using VSEPR (valence shell electron pair repulsion theory). Here I (belatedly) look at other such tri-coordinate halogen compounds using known structures gleaned from the crystal structure database (CSD).

The search query specifies 7A as the central atom, defined with just three bonded (non-metallic) atoms. Initially, if no constraint on any cyclicity in the three 7A-NM bonds is made (and with R < 0.1, no errors, no disorder), the following result emerges.

I have plotted the three angle variables using the X/Y axes above and used colour to indicate the third angle (red = ~180°, blue = ~90°). The clusters show that two of the angles are ~90° and only one is ~180°. There is also a set of blue points (~90°) which show a linear correlation and which can be shown to derive from cyclicity, as the plot below reveals when acyclicity is specified for all three NM-7A bonds.

In this distribution, the two clusters for ANG1 or ANG2 of ~180° are small and compact, but the cluster where both ANG1 and ANG2 are ~90° is much more diffuse. Not all of the points in this cluster show as red (ANG3 ~180°); there are a few cyan or blue examples here too, indicating that all three angles are in the range 90-140°. This result is not arising from cyclic constraints.
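Picking out the unusual hits from such a plot is essentially a counting exercise on the three angles. Here is a minimal sketch, assuming a 140° cutoff (taken from the range quoted above) to separate "large" from "small" angles; the function name, the labels and the example angle triples are my own illustrative choices rather than entries from the CSD.

```python
def classify(angles, cutoff=140.0):
    """Label a 3-coordinate centre from its three bond angles (degrees)."""
    n_large = sum(a > cutoff for a in angles)
    if n_large == 1:
        return "T-shaped"    # two small angles and one large, as in ClF3
    if n_large == 0:
        return "all-small"   # the unexpected class noted above
    return "other"

# Hypothetical (ANG1, ANG2, ANG3) triples, for illustration only
examples = [(88.0, 87.0, 175.0), (95.0, 100.0, 120.0), (170.0, 92.0, 98.0)]
labels = [classify(t) for t in examples]
print(labels)
```

Hits labelled "all-small" would be the ones worth inspecting individually.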

This wider look at 3-coordinate compounds in group 17 (the halogens) quickly reveals a class of such molecules where all three angles are relatively small. This suggests that a closer look at the bonding in these systems, especially in terms of VSEPR, might be rewarding!

I end with an equivalent search for group 18 (the noble gases). Although the number of examples is small, all show the two small/one large angle so characteristic of chlorine trifluoride itself.

The above is I think a good example of (big?) data mining, where one is searching for patterns, and if lucky spotting patterns that deviate from the norm to investigate the possibility of new chemical phenomena.[1] It is also interesting to speculate upon the origins of why two of the clusters shown above are small and compact and the third is much more diffuse.

References

H.S. Rzepa, "Discovering More Chemical Concepts from 3D Chemical Information Searches of Crystal Structure Databases", Journal of Chemical Education, vol. 93, pp. 550-554, 2015. https://doi.org/10.1021/acs.jchemed.5b00346

Tags:chemical phenomena, data mining, equivalent search, Halogen, search query specifies 7A
Posted in crystal_structure_mining | 1 Comment »

The geometries of 5-coordinate compounds of group 14 elements.

Monday, May 30th, 2016

This is a follow-up to one aspect of the previous two posts dealing with nucleophilic substitution reactions at silicon. Here I look at the geometries of 5-coordinate compounds containing as a central atom 4A = Si, Ge, Sn, Pb and of the specific formula C₃4AO₂ with a trigonal bipyramidal geometry. This search arose because of a casual comment I made in the earlier post regarding possible cooperative effects between the two axial ligands (the ones with an angle of ~180 degrees subtended at silicon). Perhaps the geometries might expand upon this comment?

The search query shown above results in 394 hits (May 2016) and is presented with the three variables in the query plotted as below, with the O-4A-O angle indicated by colour (red ~180°; blue ~90° and green ~120°).

The cluster at distances of 4A-O of ~1.9Å represents silicon compounds, and tends to suggest that the pair of distances 4A-O are quite similar in value. The angles correspond to a di-axial arrangement around the silicon. In this scenario, one might imagine a stereoelectronic effect similar to the anomeric effect when 4A = C operates and which has the potential to strengthen both di-axial oxygens.
The bulk of the points come at higher 4A-O distances of > 2.1Å and consist mostly of 4A = Sn. There are two clear-cut distributions, one for angles of ~180° and a separate one for angles of ~90°, and both are qualitatively different from the Si distribution. The 180° set corresponds to a di-axial arrangement for the oxygens, whereas the 90° set suggests an axial-equatorial geometry. Both distributions have prominent tails which reveal that as one 4A-O distance shortens, the other lengthens, equivalent to asymmetric anomeric effects at O-C-O.
Noticeably absent are any green points; these would correspond to bond angles of ~120° and hence would correspond to di-equatorial ligands.
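The colour coding used above maps each O-4A-O angle onto one of three idealised trigonal bipyramidal arrangements. A short sketch of that assignment, taking the nearest ideal angle, is given here; the function name and the test angles are hypothetical and the nearest-value rule is simply one reasonable assumption.

```python
# Ideal O-4A-O angles (degrees) for two ligands in a trigonal bipyramid
IDEAL = {"di-axial": 180.0, "di-equatorial": 120.0, "axial-equatorial": 90.0}

def assign(angle):
    """Return the arrangement whose ideal angle is closest to the observed one."""
    return min(IDEAL, key=lambda name: abs(IDEAL[name] - angle))

# Hypothetical observed angles, for illustration only
observed = [176.0, 92.0, 118.0]
assigned = [assign(a) for a in observed]
print(assigned)
```

With this rule, the absence of green points translates to no hits being assigned as di-equatorial.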

This quick exploration (with potential variations that I have not explored above) can be added to the collection of “ten minute explorations” I have described elsewhere.[1]

References

H.S. Rzepa, "Discovering More Chemical Concepts from 3D Chemical Information Searches of Crystal Structure Databases", Journal of Chemical Education, vol. 93, pp. 550-554, 2015. https://doi.org/10.1021/acs.jchemed.5b00346

Tags:Anomer, Anomeric effect, Carbohydrate chemistry, Carbohydrates, Ligand, Molecular geometry, Physical organic chemistry, Stereochemistry, Stereoelectronic effect, Trigonal bipyramidal molecular geometry
Posted in Chemical IT, crystal_structure_mining | 3 Comments »

What is the approach trajectory of enhanced (super?) nucleophiles towards a carbonyl group?

Wednesday, May 11th, 2016

I have previously commented on the Bürgi–Dunitz angle, this being the preferred approach trajectory of a nucleophile towards the electrophilic carbon of a carbonyl group. Some special types of nucleophile such as hydrazines (R₂N-NR₂) are supposed to have enhanced reactivity[1] due to what might be described as buttressing of adjacent lone pairs. Here I focus on how this might manifest by performing searches of the Cambridge Structural Database for intermolecular (non-bonded) interactions between X-Y nucleophiles (X,Y= N,O,S) and carbonyl compounds OC(NM)₂.

The search query[2] is shown above and involves plotting the distance from the nucleophilic atom (N above) to the carbon of the carbonyl group. The carbon is defined as having 3-coordination, one of which is O=C and two non-metal attachments. The torsion is constrained to values of |70-110|° to ensure that the approach of the nucleophile is approximately perpendicular to the plane of the carbonyl in order to overlap with the π*-orbital as electrophile. The pairwise sums of van der Waals radii are NC, 3.25; OC, 3.22 and SC, 3.5Å and the plots show all contacts shorter than these. The results of the searches are shown below.
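The two filters described above (contact shorter than the van der Waals sum, torsion within |70-110|°) can be expressed as a simple predicate. In this sketch the van der Waals sums are the values quoted in the text, whereas the function name and the list of contacts are invented for illustration.

```python
# Pairwise van der Waals sums to carbon (Å), as quoted in the text
VDW_SUM = {"N": 3.25, "O": 3.22, "S": 3.5}

def is_close_perpendicular_contact(element, distance, torsion):
    """True if the contact is shorter than the vdW sum and roughly perpendicular."""
    return distance < VDW_SUM[element] and 70.0 <= abs(torsion) <= 110.0

# Hypothetical contacts: (nucleophilic atom, distance to C in Å, torsion in degrees)
contacts = [("N", 3.10, 85.0), ("O", 3.30, 92.0), ("S", 3.40, -100.0), ("N", 3.00, 40.0)]
kept = [c for c in contacts if is_close_perpendicular_contact(*c)]
print(kept)
```

Only the first and third contacts survive: the second is longer than the O...C sum and the fourth approaches too far from the perpendicular.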

The general observation is that the red hotspots do tend to come at trajectory angles of <100° and many are <90° such as the X=Y=N or X=Y=S examples. Given that the original Bürgi–Dunitz hypothesis (actually based on a small number of molecules synthesized for the purpose) proposed rather larger angles (105±5°) corresponding to optimum alignment of the nucleophile with the carbonyl π*-orbital, we might speculate whether the use of enhanced nucleophiles is the reason for the apparent decrease in the angle. And if so, what the underlying reasons would be.

I also cannot help but observe that the term supernucleophile is quite rare in the literature; SciFinder gives only 45 hits, but most are about neither hydrazines nor peroxides. There are also some unusual nucleophile varieties such as Cob(I)alamin[3], of which there are probably insufficient examples to reflect in the crystal structure statistics shown above. Given the interest in superbases, the relative lack of examples of unusual supernucleophiles seems surprising.

References

G. Klopman, K. Tsuda, J. Louis, and R. Davis, "Supernucleophiles—I", Tetrahedron, vol. 26, pp. 4549-4554, 1970. https://doi.org/10.1016/s0040-4020(01)93101-1
H. Rzepa, "Crystal structure search using enhanced nucleophiles", 2016. https://doi.org/10.14469/hpc/487
K.P. Jensen, "Electronic Structure of Cob(I)alamin: The Story of an Unusual Nucleophile", The Journal of Physical Chemistry B, vol. 109, pp. 10505-10512, 2005. https://doi.org/10.1021/jp050802m

Tags:Bases, Bürgi–Dunitz angle, Carbonyl, Electrophile, Ester, Flippin–Lodge angle, Functional groups, hydrazine, non-metal attachments, Nucleophile, Physical organic chemistry, search query, Superbase
Posted in Chemical IT, crystal_structure_mining | 1 Comment »

Celebrating Paul Schleyer: searching for hidden treasures in the structures of metallocene complexes.

Saturday, April 2nd, 2016

A celebration of the life and work of the great chemist Paul von R. Schleyer was held this week in Erlangen, Germany. There were many fantastic talks given by some great chemists describing fascinating chemistry. Here I highlight the presentation given by Andy Streitwieser on the topic of organolithium chemistry, also a great interest of Schleyer's over the years. I single this talk out since I hope it illustrates why people still get together in person to talk about science.

The presentation focused on the structure of the simplest possible metallocene, lithium cyclopentadienyl and why the calculated structure showed that the hydrogen atoms attached to the cyclopentadienyl ring pointed slightly away from the metal rather than towards it (by ~1-2°).^† Various explanations had been put forward, some had waxed and then waned. It was still basically an open problem. Now, the title of the symposium was Theory and Experiment: A Meeting at the Interface; Streitwieser had given the theory and whilst listening, I realised I might be able to help relate this to known experiments, i.e. crystal structure data. I could do so by analysing the known crystal structures of metallocenes.[1] So here is the basic search query, and I will go through it thus:

A general ring is defined (sizes 4,5,6,7,8) and the ring and metal-C bonds are all specified as of type "any" (it is difficult to know how such bonds might be classified, i.e. delocalised, aromatic, etc, so best not to constrain things) and a metal is attached.
4M is basically any metal; again the search is unconstrained, but one could focus on certain columns of the periodic table if one wished.
A ring centroid is computed.
ANG1 is defined as the angle H-C-centroid, the angle of interest in Andy's talk. The limits were constrained to lie between 140° and 179°. I did this because when the angle becomes 180°, the torsion becomes mathematically undefined and I did not want to risk this happening.
TOR1 is defined as the torsion H-C-centroid-metal. Values of 180° would indicate that the hydrogen was pointing away from the metal; values of 0° would indicate it was pointing towards the metal. The absolute value of the torsion is taken to avoid confusion induced by its sign.
ANG2 is one test of whether the ring is planar. For an even membered ring, it is the angle subtended at the centroid to opposing carbon atoms. For odd membered rings it is the angle at the centroid involving one carbon and a centroid defined by an opposing pair of atoms (see below).
The quality of the crystal structure determination is controlled by specifying that the R value be < 5%, no errors, no disorder. Also, the terminal H-positions are normalised (to correct known errors in H distances deriving from x-ray diffraction). I would point out that in the early days, the actual positions of the hydrogen were often not actually determined, but "idealised". In this case this would mean that the H-C-centroid angle would probably be set to 180°. For perhaps the last 20 years or so however, the positions of hydrogen atoms have been routinely refined. Unfortunately, I know of no search query that can separate the two cases, and so we will have to live with the mixture and see what we get.
We define another constraint separately, which is that the temperature of the data collection sample is <140K. This ensures that the data will be freer of vibrational/thermal noise and so should be rather more accurate.
Finally, a note on the topic of "research data management" or RDM. I have deposited the files defining the search query in a repository and have assigned DOIs both to the overall search collection[2] and to each individual search definition, the DOIs for which are shown below.
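To make the ANG1 and TOR1 definitions above concrete, here is a small standard-library sketch that computes both for an idealised fragment. All coordinates are hypothetical (centroid at the origin, metal on the z axis, one C-H bond tilted by about 6°) and the function names are mine; it also illustrates why the torsion becomes undefined when H-C-centroid is exactly linear.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(v):
    return math.sqrt(dot(v, v))

def angle_deg(a, vertex, b):
    """Angle a-vertex-b in degrees (used here for ANG1 = H-C-centroid)."""
    u, w = sub(a, vertex), sub(b, vertex)
    cos_t = dot(u, w) / (norm(u) * norm(w))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))

def abs_torsion_deg(p0, p1, p2, p3, eps=1e-8):
    """Absolute torsion p0-p1-p2-p3 in degrees, or None if it is undefined."""
    b1, b2, b3 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1, n2 = cross(b1, b2), cross(b2, b3)
    if norm(n1) < eps or norm(n2) < eps:
        return None  # three consecutive atoms are collinear
    return abs(math.degrees(math.atan2(norm(b2) * dot(b1, n2), dot(n1, n2))))

# Hypothetical geometry (Å): ring centroid at origin, metal above the ring
cen, metal, carbon = (0.0, 0.0, 0.0), (0.0, 0.0, 2.0), (1.2, 0.0, 0.0)
h_away = (2.2, 0.0, -0.1)     # H tilted away from the metal
h_towards = (2.2, 0.0, 0.1)   # H tilted towards the metal
h_linear = (2.2, 0.0, 0.0)    # H-C-centroid exactly 180°

ang1 = angle_deg(h_away, carbon, cen)
tor_away = abs_torsion_deg(h_away, carbon, cen, metal)
tor_towards = abs_torsion_deg(h_towards, carbon, cen, metal)
tor_linear = abs_torsion_deg(h_linear, carbon, cen, metal)
print(round(ang1, 1), tor_away, tor_towards, tor_linear)
```

The tilted cases give |TOR1| of 180° (away) and 0° (towards), whilst the exactly linear case returns no torsion at all, which is the situation the 179° upper limit on ANG1 is meant to avoid.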

The 4-ring case.[3] Here the temperature constraint was relaxed, since there are few entries. The two red "hot-spots" occur at torsion angles of ~180° (hydrogen pointing away from metal) at bond angle values of between 173-176°.

The 5-ring case.[4] This includes the classic ferrocene example, the first metallocene for which the structure was correctly identified. There are many more examples, and this search is now constrained to <140K. The two hot spots occur at bond angles of very close to 180°, at which values the torsion itself becomes undetermined. That the hot spots actually occur at 0° and 180° and are not spread evenly across the right hand side axis is remarkable given this. There is a significant tail for the 180° torsion (H pointing away from metal) down to H-C-centroid angles of about 170°, but there is no evidence of this tail for torsions of 0°.

One more test must be applied to see if the 5-ring is planar or not. The deviation from planarity is only 2-3°, and there seems to be no correlation between lower values of the H-C-centroid bond angle and non-planarity.

The 6-ring case.[5] There are again numerous examples of data <140K for such rings. There is now a very distinct hotspot at angles of ~170° for the case/torsion where the hydrogen is pointing towards the metal.

This feature persists when the ring planarity is tested, and it occurs specifically for rings where the angle subtended at the centroid is ~180° and H-C-centroid angles of ~170°. So this is clear-cut effect which demands explanation #1.

The 7-ring case[6] again shows a strong hot spot at ~172° for a torsion corresponding to the hydrogens pointing towards the metal. This hot spot is matched by angles subtended at the ring centroid that are close to 180° (i.e. planar). This is a clear-cut effect which demands explanation #2.

The 8-ring case[7] also shows a hot spot for hydrogens pointing towards the metal, at the strikingly small angle of ~157°, and this feature is associated with a linear C-centroid-C angle. This is a clear-cut effect which demands explanation #3.

The 9- and 10-ring cases. There are no examples! Time to make some?

To summarise.

The above was done during a conference in response to a point made by one of the speakers. In fact, it proved possible to show the speaker the diagrams above <18 hours after he gave the talk.^‡
An immediate question that arose from this discussion was whether the hot-spots were artefacts of non-planar rings. So the ANG2 test was added to the plots the next day (today) as part of this dissemination.
Also discussed (yesterday) was how these conference insights might be shared. I suggested the forum here and Professor Streitwieser heartily agreed. Another alternative was to write it up as a regular journal article. But we both agreed that ..
what you see here is just a statistical analysis. The next stage would be to individually inspect all the molecules which make up these statistics. You see it might just be that every molecule contributing to a "hot-spot" cluster might have special circumstances which conspire to make it look as if there is an interesting chemical effect going on. It is unlikely that such coincidences could accrue in such a manner, but the possibility does have to be considered.
I think we both felt that a better way was to expose the basic effects here, as a sort of open science research project, and anyone interested could then (a) try to replicate these plots, which is why you will find the DOIs of datasets containing the definition files to assist in any such replication and (b) tunnel down to any specific hot spot to identify the precise chemical characteristics that might give rise to the geometrical effect.
This could then be followed up by computational analysis of the electronic properties which might give rise to the effect. This would in effect complete the cycle, since this was the starting point for Streitwieser's original talk. Remember, the theme of the celebration was the interplay between theory and experiment, a particular favourite of Schleyer's.
Regarding the chemical insights, a distinct trend over the ring sizes 4-8 can be seen. The 4-ring shows the hydrogens pointing away from the metal, the 5-ring could be said to be largely agnostic (remember the error in crystallographic angles is probably in the region 1-3°) whilst there is an indication that for the 6-8 rings the ring hydrogens tend to point towards the metal. I have summarised three key points illustrating this as #1-3 above.
It is tempting to conclude that a fairly general chemical effect is operating here over #1-3, although of course it could be a number of effects specific to each ring which merely look like a general trend.

So the chemical interpretation of this project is unfinished, a general feature of much of science of course. But my aim here was to give a flavour of how a scientific meeting at its best can bring together like (or often unlike) minds which can tease out new connections and lead perchance to new discoveries.

^‡These hours were productively employed by sharing a Franconian banquet together, and a modicum of sleep, as well as the searches described above. And in case you see no citations at the bottom of this post, they too take about 48 hours to propagate through the CrossRef and DataCite systems. Be patient and they will appear.

^†In my original representation, I showed the Hs pointing towards the metal. In fact Prof Streitwieser has just contacted me reversing this orientation and correcting my recollection of his lecture.

References

H.S. Rzepa, "Discovering More Chemical Concepts from 3D Chemical Information Searches of Crystal Structure Databases", Journal of Chemical Education, vol. 93, pp. 550-554, 2015. https://doi.org/10.1021/acs.jchemed.5b00346
H. Rzepa, "Crystallographic searches of metallocene type complexes.", 2016. https://doi.org/10.14469/hpc/346
H. Rzepa, "4-Ring metallocene search query", 2016. https://doi.org/10.14469/hpc/347
H. Rzepa, "The 5-ring case.", 2016. https://doi.org/10.14469/hpc/348
H. Rzepa, "6-ring metallocene search queries", 2016. https://doi.org/10.14469/hpc/349
H. Rzepa, "7-ring metallocene search queries", 2016. https://doi.org/10.14469/hpc/350
H. Rzepa, "8-ring metallocene search queries", 2016. https://doi.org/10.14469/hpc/351

Tags:Centroid, chemical effect, chemical insights, chemical interpretation, City: Erlangen, Country: Germany, Degree of a continuous mapping, Ferrocene, Hydrogen bond, individual search definition, metal, overall search collection, Streitwieser, terminal H-positions, Torsion, X-ray
Posted in Chemical IT, crystal_structure_mining, Interesting chemistry | 6 Comments »

Discovery based research experiences: gauche effects in group 16 elements.

Wednesday, March 2nd, 2016

The upcoming ACS national meeting in San Diego has a CHED (chemical education division) session entitled Implementing Discovery-Based Research Experiences in Undergraduate Chemistry Courses. I had previously explored what I called extreme gauche effects in the molecule F-S-S-F. Here I take this a bit further to see what else can be discovered about molecules containing bonds between group 16 elements (QA= O, S, Se, Te).

OO-SQ

The search definition is shown above, with DIST1 being the QA-QA bond length, the QA-QA bond being acyclic, each QA bearing only two bonded atoms and NM being any non-metal. The first result shown is for QA=S.

S-S

The first discovery is that the most common torsion (red-hot spot) is about 90°, but there appears to be a statistically significant distortion towards longer S-S distances as the torsion deviates from this angle. For those who are so inclined it would perhaps be worth improving my term "appears to be" with a more formal numerical analysis of the distribution shown above and its significance. Any offers?
The other discovery worth exploring is the number of occurrences with an angle of 180°. With F-S-S-F itself (not a solid), I had previously noted that this angle actually represented a transition state in the torsion! So what might be inferred from these examples?

The next search includes a further constraint that the temperature at which the data were recorded be <140K. This reduces vibrational "noise" and so should increase the significance.

S-S-140

Here we discover the same "V"-shaped distribution as before, possibly more significant statistically than the previous search. Again, a proper statistical analysis of the significance of this result is desirable.
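One possible way of putting a number on the "V"-shaped trend is a permutation test on the correlation between the S-S distance and the deviation of the torsion from 90°. The sketch below is only an outline of that idea. The data are synthetic, generated with a built-in trend and arbitrary parameters purely so the code runs; they are not the search results, and `synthetic_hits` is a hypothetical stand-in for whatever export of torsions and distances one actually has.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def permutation_p_value(xs, ys, n_perm=1000, seed=0):
    """One-sided p-value: how often shuffled data correlate at least as strongly."""
    rng = random.Random(seed)
    observed = pearson(xs, ys)
    shuffled = list(ys)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pearson(xs, shuffled) >= observed:
            count += 1
    return observed, (count + 1) / (n_perm + 1)


def synthetic_hits(n=200, seed=1):
    """SYNTHETIC placeholder data with an assumed V-shaped trend plus noise."""
    rng = random.Random(seed)
    torsions = [rng.uniform(60.0, 120.0) for _ in range(n)]
    distances = [2.03 + 0.0005 * abs(t - 90.0) + rng.gauss(0.0, 0.005)
                 for t in torsions]
    return torsions, distances


torsions, distances = synthetic_hits()
deviation = [abs(t - 90.0) for t in torsions]
r, p = permutation_p_value(deviation, distances)
print(f"r = {r:.2f}, one-sided permutation p = {p:.4f}")
```

Folding the torsion about 90° turns the "V" into a single monotonic trend, so a simple correlation can be used. Whether 90° is the right apex, and whether a linear measure is appropriate at all, are assumptions that would need checking against the real distribution.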

The next search is for QA = Se or Te.

X-X

The Se and Te distributions can clearly be distinguished, with a weak "V-shape" visible for Se, but absent for Te. Again, those hits at 180°!
There are a few instances "in-between" the two distributions, which appear to be Se-Te systems.

Finally, QA=QB = O.

O-O

The discovery here is the apparent absence of any "V-shaped" distribution.
The hot spot now occurs at 180°, but with a tail down to 60° or less. Clearly, the definition of "NM" as any non-metal probably needs to be explored further for specific instances to see what influence the nature of NM has. NM for example could be another O, which might be a severe perturbation.

So here I have tried to tease out seven directions for further discovery. I am attending/presenting at the session I noted at the top and will report back on any interesting observations.

Tags:City: San Diego, metal, non-metal, Singular spectrum analysis, Time series analysis
Posted in crystal_structure_mining, Interesting chemistry | No Comments »
