Posts Tagged ‘Python’
Tuesday, October 4th, 2016
Peter Murray-Rust and I are delighted to announce that the 2016 award of the Bradley-Mason prize for open chemistry goes to Jan Szopinski (UG) and Clyde Fare (PG).
Jan’s open chemistry derives from a final year project looking at a puzzle: atom charges derived from quantum chemical calculation of the electronic density represent chemical information well, but the electrostatic potential (ESP) generated from these charges is very poor; conversely, charges derived from the computed electrostatic potential are incommensurate with chemical information (such as the electronegativity of atoms). He has developed a Python program called ‘repESP’ in which ‘compromise’ charges are generated which attempt to reconcile the physical world-view (fitting the ESP) with the chemical insight provided by NPA (Natural Population Analysis). Jan was the main driver in making his code open source, “opening his supervisor’s eyes” to the various flavours of open source licences. To ensure that all subsequent improvements to the program remain available to anyone, the source code has been released under a ‘copyleft’ licence (GPL v3) and is maintained by Jan on GitHub, where he looks forward to helping new users and collaborating with contributors.
Clyde has made various contributions to open source chemistry over the period of his PhD: utilities to improve quantum chemical research, the enhancement of a popular machine learning library with a method that has been successful in chemometrics, the creation of an open source channel for teaching chemists programming and data analysis, and the creation of a tool to help encourage the open sourcing of software development. Cclib is the most popular library for parsing quantum chemical data from output files, and Clyde has contributed patches for the Atomic Simulation Environment, which enables control of quantum chemical codes from a unified Python interface. He was responsible for the construction of a computational chemistry electronic notebook published on GitHub, which is now under active development by others as well. This aims to encapsulate computational chemistry research projects, both for the sake of reproducibility and for the sake of organising and keeping track of quantum chemical research. Alongside this platform he created an enhanced Gaussian calculator for the Atomic Simulation Environment that enables automatic construction of ONIOM input files, also now under active development. He also made contributions to scikit-learn, the most popular Python machine learning framework, implementing a kernel for Kernel Ridge Regression that has become the most successful kernel for regression over molecular properties. He was part of the team that won the 2014 sustainable software conference prize for the creation of the open source healthchecker software as part of Sustain. He has argued for open source as a platform for teaching resources and created the Imperial Chemistry GitHub user account, which is now run by the department. Materials for the Imperial Chemistry Data Analysis and Programming workshops, implemented as Python Notebooks, are now available through this account and continue under active development.
Criteria for the award include the immediate accessibility of the submission via public web sites, what is visible and re-usable in this way, and evidence of either community formation/engagement or re-use of materials by people other than the proposer.
Tags:Analytical chemistry, chemical information, chemical insight, Cheminformatics, Chemistry, Chemometrics, Clyde Fare, Company: GitHub, computation chemical research projects, computational chemistry, computing, Cross-platform software, driver, GitHub, Jan Szopinski, machine learning, open sourcing software development, opensource healthchecker software, Peter Murray-Rust, public web sites, Python, quantum chemical calculation, quantum chemical codes, quantum chemical data, quantum chemical research, Quotation, Server & Database Software, simulation, Software, supervisor, sustainable software conference prize, Technology/Internet
Posted in Bradley-Mason Prize for Open Chemistry
Saturday, November 1st, 2014
Egon Willighagen recently gave a presentation at the RSC entitled “The Web – what is the issue”, in which he laments how little uptake of web technologies as a “channel for communication of scientific knowledge and data” there has been in chemistry after twenty years or more. It caused me to ponder what we were doing with the web twenty years ago. Our HTTP server started in August 1993, and to my knowledge very little content there has been deleted (it’s mostly now just hidden). So here are some ancient pages which, whilst certainly not examples of how it should be done nowadays, give an interesting historical perspective. In truth, there is not much stuff out there that is older!
- This page was written in May 1994 as a journal article, although it did have to be then converted into a Word document to actually be submitted.[1] Because it introduced hyperlinks to a chemical audience, we wanted to illustrate these in the article itself! Hence permission was obtained from the RSC for an HTML version to be “self-archived” on our own servers where the hyperlinks were supposed to work (an early example of Open Access publishing!). I say supposed because quite a few of them have now “decayed”. We were aware of course that this might happen, but back in 1994, no-one knew how quickly this would happen. What is interesting is that the HTML itself (written by hand then) has survived pretty well! I will leave you to decide how much the message itself has decayed.
- This HTML actually predates the above; it was written around November 1993 and represented the very first lecture notes I converted into this form (on the topic of NMR spectroscopy). A noteworthy aspect is the scarce use of colour images. At the start of 1994, the bandwidth available on our campus was pretty limited (the switches were only 10 Mbps) and a request went out to reduce the bit-depth of any colour images to 4 bits to help conserve that bandwidth! I rather doubt anyone took much notice, however, and the policy was forgotten just a few months later.
- In 1996, I had two visitors to the group: Guillaume Cottenceau, a French undergraduate student, and Darek Bogdal, a Polish researcher who wanted to learn some HTML. Together they produced this, which was an interactive tutorial to accompany the NMR lecture notes previously mentioned. These pages introduce the Java applet (yes, it was very new in 1996), which Guillaume had written and which Darek then made use of. And hey, what do you know, the applet still works (although you might have to coerce your browser into accepting an unsigned applet).
- Here is a programming course that I had been running with Bryan Levitt for a few years, recast into HTML web pages some time in 1994-5. This particular project I still hold dear, since it expanded upon the NMR lectures by getting the students to synthesize an FID (free induction decay) using the program they wrote, and then perform a Fourier Transform on it. I even encouraged students to present their results in HTML (I cannot now remember how many did). This link is to the computing facilities we offered students in 1994 for this project; ah, those were the times! In 1996, the programming course was replaced by one on chemical information technologies, and here students were most certainly expected to write HTML. Some of the best examples are still available. And to illustrate how things happen in cycles, that course has itself now gone, replaced by, yes, a programming course (but using Python, and not the original Fortran).
- In tracking down the materials for the programming course described above, I re-discovered something far older. It is linked here and is (some of) the Fortran source code I wrote as a PhD student in 1972. So I will indulge in a short digression. My Ph.D. involved measuring rate constants, and the accepted method for analysing the raw kinetic data was using graph paper. For first order rate behaviour, this required one to measure a value at time=∞, which is supposed to be measured after ten half-lives. I was too impatient to wait that long, and worked out that a non-linear least squares analysis did not require the time=∞ value; indeed this value could be predicted accurately from the earlier measurements. So in 1974, I wrote this code to do this; no graph paper for me! Also included, for good measure, is a least squares analysis of the Eyring equation. And you get proper standard deviations for your errors. In retrospect I should have commercialised this work, but in 1974, almost no-one paid money for software! What a change since then. I must try recompiling this code to see if it still works! And for good measure, here is a Hückel MO program I wrote in 1984 or earlier (I did compile this recently and found it works) and here is a little program for visualising atomic orbitals.
- In January 1994, I was asked to create a web page for the WATOC organisation. This certainly predated the web sites for e.g. the RSC and the ACS, and indeed famous sites such as the BBC and Tesco (a large supermarket chain), which only started up in mid-1994. The WATOC site itself moved a few years ago.
- This is one of those wonderfully naive things I started in 1994, and which did not last long (in my hands). Nowadays, the concept lives on as MOOCs. Note again the almost complete expiry of the hyperlinks.
- This is a project we also started in 1994, Virtual reality[2],[3]. The idea was that if HTML was text-markup, VRML was going to be 3D markup. VRML itself never quite caught on, but it is having a new life as a 3D printing language!
- And by 1995, I felt confident enough in my ability to (edit) HTML that we started a virtual conference in organic chemistry (we did four of them in the end). I remember the first one involved contributors sending me a Word version of their poster, and I did all the work in converting it into HTML. Such virtual conferences still run, but in truth most participants still prefer to travel long distances to drink a beer with their chums, rather than hack HTML.
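The kinetics digression above (fitting first order data without ever measuring the time=∞ value) can be sketched in a few lines of modern Python. This is only an illustration under stated assumptions, not the original Fortran: it assumes the model A(t) = A∞ + B·exp(−kt), uses synthetic noise-free data, and the function names `linear_fit` and `fit_first_order`, the bracket for k and the numerical values are all invented for the example. For any trial k the model is linear in A∞ and B, so those two come from ordinary least squares, and the non-linear part reduces to a one-dimensional search over k (done here by golden-section search, which assumes a single minimum inside the bracket).

```python
import math


def linear_fit(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b, ssr)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    ssr = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return a, b, ssr


def fit_first_order(times, values, k_low, k_high, tol=1e-10):
    """Fit values ~ a_inf + amp * exp(-k * t).

    Returns (k, a_inf, amp, ssr). The rate constant k is found by a
    golden-section search on [k_low, k_high]; a_inf and amp are the
    linear least squares solution for that k.
    """

    def ssr_for(k):
        x = [math.exp(-k * t) for t in times]
        return linear_fit(x, values)[2]

    golden = (math.sqrt(5) - 1) / 2
    lo, hi = k_low, k_high
    c = hi - golden * (hi - lo)
    d = lo + golden * (hi - lo)
    fc, fd = ssr_for(c), ssr_for(d)
    while hi - lo > tol:
        if fc < fd:
            # Minimum lies in [lo, d]; the old c becomes the new d.
            hi, d, fd = d, c, fc
            c = hi - golden * (hi - lo)
            fc = ssr_for(c)
        else:
            # Minimum lies in [c, hi]; the old d becomes the new c.
            lo, c, fc = c, d, fd
            d = lo + golden * (hi - lo)
            fd = ssr_for(d)
    k = (lo + hi) / 2
    x = [math.exp(-k * t) for t in times]
    a_inf, amp, ssr = linear_fit(x, values)
    return k, a_inf, amp, ssr


# Synthetic example (assumed values): readings over roughly two
# half-lives only, far short of the ten needed for a measured infinity.
true_k, true_a_inf, true_amp = 0.30, 0.10, 0.90
times = [0.5 * i for i in range(11)]
values = [true_a_inf + true_amp * math.exp(-true_k * t) for t in times]

k, a_inf, amp, ssr = fit_first_order(times, values, k_low=0.05, k_high=2.0)
print(f"k = {k:.4f}, predicted infinity value = {a_inf:.4f}")
```

With real, noisy measurements the predicted infinity value would of course carry an uncertainty, and a fuller treatment would also report standard deviations for the fitted parameters; this sketch does not attempt that.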
I am going to stop now, since this is far too much wallowing in the past. But at least all this stuff is not (yet) lost to posterity.
References
- H.S. Rzepa, B.J. Whitaker, and M.J. Winter, "Chemical applications of the World-Wide-Web system", Journal of the Chemical Society, Chemical Communications, p. 1907, 1994. https://doi.org/10.1039/c39940001907
- O. Casher and H.S. Rzepa, "Chemical collaboratories using World-Wide Web servers and EyeChem-based viewers", Journal of Molecular Graphics, vol. 13, pp. 268-270, 1995. https://doi.org/10.1016/0263-7855(95)00053-4
- O. Casher, C. Leach, C.S. Page, and H.S. Rzepa, "Advanced VRML based chemistry applications: a 3D molecular hyperglossary", Journal of Molecular Structure: THEOCHEM, vol. 368, pp. 49-55, 1996. https://doi.org/10.1016/s0166-1280(96)90535-7
Tags:3D printing language, ACS, BBC, Bryan Levitt, chemical audience, chemical information technologies, Darek Bogdal, Fortran, Guillaume Cottenceau, HTML, http, Java, large supermarket chain, personal Web presence, Python, researcher, spectroscopy, Tesco, Virtual reality, WATOC, web technologies
Posted in Chemical IT, Historical