{"id":19180,"date":"2017-12-07T13:28:33","date_gmt":"2017-12-07T13:28:33","guid":{"rendered":"http:\/\/www.ch.imperial.ac.uk\/rzepa\/blog\/?p=19180"},"modified":"2017-12-07T13:28:33","modified_gmt":"2017-12-07T13:28:33","slug":"fair-data-%e2%87%8c-raw-data","status":"publish","type":"post","link":"https:\/\/www.rzepa.net\/blog\/?p=19180","title":{"rendered":"FAIR data \u21cc Raw data."},"content":{"rendered":"<div class=\"kcite-section\" kcite-section-id=\"19180\">\n<p><strong><span style=\"color: #ff0000;\">FAIR data<\/span><\/strong> is increasingly accepted as a description of what research data should aspire to; <strong><span style=\"color: #ff0000;\">F<\/span><\/strong>indable, <strong><span style=\"color: #ff0000;\">A<\/span><\/strong>ccessible, <strong><span style=\"color: #ff0000;\">I<\/span><\/strong>nter-operable and <span style=\"color: #ff0000;\"><strong>R<\/strong><\/span>e-usable, with <span style=\"color: #ff0000;\">C<\/span>ontext added by rich metadata (and also that it should be<span style=\"color: #ff0000;\"> O<\/span>pen). But there are two sides to data, one of which is the raw data emerging from say an instrument or software simulations and the other in which some kind of model is applied to produce semi- or even fully processed\/interpreted data. 
Here I illustrate a new example of how both kinds of data can be made to co-exist.<\/p>\n<p>I will start with a recent publication<span id=\"cite_ITEM-19180-0\" name=\"citation\"><a href=\"#ITEM-19180-0\">[1]<\/a><\/span> with the title\u00a0<i>Crystallographic Snapshot of an Arrested Intermediate in the Biomimetic Activation of CO<sub>2<\/sub>.\u00a0<\/i>The nature of this intermediate caught the eye of another research group, who responded with their own critique<span id=\"cite_ITEM-19180-1\" name=\"citation\"><a href=\"#ITEM-19180-1\">[2]<\/a><\/span><sup>\u2021<\/sup> along with the comment &#8220;<i>However, since we have no access to the original crystallographic data &#8230;&#8221;\u00a0<\/i>They might have been referring to the semi-processed data (containing the so-called <em>hkl<\/em> structure factors) but they may also have been alluding to the <strong>raw image data<\/strong> captured directly from the diffractometer cameras. Such raw data has traditionally not been available via the CSD (Cambridge Structural Database), but would be required for a complete re-analysis of the crystal structure. Now the first example of how both FAIR (processed) data and raw data can co-exist has appeared.<\/p>\n<p>The latest version of the CSD shows an entry resulting from the following publication<span id=\"cite_ITEM-19180-2\" name=\"citation\"><a href=\"#ITEM-19180-2\">[3]<\/a><\/span> and the deposited data has its own DOI there (<a href=\"https:\/\/doi.org\/10.5517\/ccdc.csd.cc1n9ppb\">10.5517\/ccdc.csd.cc1n9ppb<\/a>). That entry in turn has a DOI pointer to the <strong>Raw data<\/strong> (<a href=\"https:\/\/doi.org\/10.14469\/hpc\/2300\">10.14469\/hpc\/2300<\/a>) held in a different location and the pointer is reciprocated (\u21cc) with the latter pointing back to the former. 
Both datasets point to the original article, thus completing a holy triangle.<sup>\u2020<\/sup><\/p>\n<p><img data-recalc-dims=\"1\" decoding=\"async\" data-attachment-id=\"19184\" data-permalink=\"https:\/\/www.rzepa.net\/blog\/?attachment_id=19184\" data-orig-file=\"https:\/\/i0.wp.com\/www.rzepa.net\/blog\/wp-content\/uploads\/2017\/12\/214.jpg?fit=1786%2C1736&amp;ssl=1\" data-orig-size=\"1786,1736\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"214\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/i0.wp.com\/www.rzepa.net\/blog\/wp-content\/uploads\/2017\/12\/214.jpg?fit=300%2C292&amp;ssl=1\" data-large-file=\"https:\/\/i0.wp.com\/www.rzepa.net\/blog\/wp-content\/uploads\/2017\/12\/214.jpg?fit=450%2C437&amp;ssl=1\" class=\"aligncenter size-large wp-image-19184\" src=\"https:\/\/i0.wp.com\/www.rzepa.net\/blog\/wp-content\/uploads\/2017\/12\/214.jpg?w=450&#038;ssl=1\" alt=\"\"  srcset=\"https:\/\/i0.wp.com\/www.rzepa.net\/blog\/wp-content\/uploads\/2017\/12\/214.jpg?resize=1024%2C995&amp;ssl=1 1024w, https:\/\/i0.wp.com\/www.rzepa.net\/blog\/wp-content\/uploads\/2017\/12\/214.jpg?resize=300%2C292&amp;ssl=1 300w, https:\/\/i0.wp.com\/www.rzepa.net\/blog\/wp-content\/uploads\/2017\/12\/214.jpg?resize=768%2C746&amp;ssl=1 768w, https:\/\/i0.wp.com\/www.rzepa.net\/blog\/wp-content\/uploads\/2017\/12\/214.jpg?resize=50%2C50&amp;ssl=1 50w, https:\/\/i0.wp.com\/www.rzepa.net\/blog\/wp-content\/uploads\/2017\/12\/214.jpg?w=1786&amp;ssl=1 1786w, 
https:\/\/i0.wp.com\/www.rzepa.net\/blog\/wp-content\/uploads\/2017\/12\/214.jpg?w=900&amp;ssl=1 900w, https:\/\/i0.wp.com\/www.rzepa.net\/blog\/wp-content\/uploads\/2017\/12\/214.jpg?w=1350&amp;ssl=1 1350w\" sizes=\"(max-width: 450px) 100vw, 450px\" \/><\/p>\n<p>There is more. The Raw dataset\u00a0(<a href=\"https:\/\/doi.org\/10.14469\/hpc\/2300\">10.14469\/hpc\/2300<\/a>) declares it is a member of a superset, called <em>Crystal structure data for Synthesis and Reactions of Benzannulated Spiroaminals; Tetrahydrospirobiquinolines (<\/em><a href=\"https:\/\/data.hpc.imperial.ac.uk\/resolve\/?doi=2297&amp;access=\">10.14469\/hpc\/2297<\/a><em>)\u00a0<\/em>where you can find information about six other related structures. That collection is in turn a member of a superset called <em>Synthesis and Reactions of Benzannulated Spiroaminals; Tetrahydrospirobiquinolines (<\/em><a href=\"https:\/\/doi.org\/10.14469\/hpc\/2099\">10.14469\/hpc\/2099<\/a><em>)\u00a0<\/em>where DOIs to other types of data associated with this project can be found, such as\u00a0<em>Computational data<\/em> (<a href=\"https:\/\/data.hpc.imperial.ac.uk\/resolve\/?doi=2098&amp;access=\">10.14469\/hpc\/2098<\/a>) and <em>NMR data<\/em> (<a href=\"https:\/\/data.hpc.imperial.ac.uk\/resolve\/?doi=2294&amp;access=\">10.14469\/hpc\/2294<\/a>). Although a human can, with some determination, follow these associations up, down and across, the system is also designed to be followed by automated algorithms that could traverse this web quickly and efficiently.<\/p>\n<p>So you can now see that a crystal structure held in the CSD could be the starting point for a journey of FAIR data discovery, in a manner that has not hitherto been possible. How quickly the CSD will become populated by links to <strong>Raw <\/strong>(and other)<strong> data<\/strong> remains to be seen. 
I have not yet discovered any mechanism for specifying a CSD query which stipulates that Raw data must be available, but no doubt this will come.<\/p>\n<p>To end, back to the\u00a0<i>Biomimetic Activation of CO<sub>2<\/sub><\/i> referred to at the start. With no access to the original data, recourse was made to computational modelling.<span id=\"cite_ITEM-19180-1\" name=\"citation\"><a href=\"#ITEM-19180-1\">[2]<\/a><\/span> Which is where I came in, since I wanted access to the original (computational) data. Sadly it did not appear to be available with the article,<span id=\"cite_ITEM-19180-1\" name=\"citation\"><a href=\"#ITEM-19180-1\">[2]<\/a><\/span> in much the same manner as the original complaint. Perhaps, when FAIR data becomes fully accepted as part of how science is done nowadays, such complaints will become ever rarer!<\/p>\n<hr \/>\n<p><sup>\u2021<\/sup>In fact the original authors did respond<span id=\"cite_ITEM-19180-3\" name=\"citation\"><a href=\"#ITEM-19180-3\">[4]<\/a><\/span> with an acknowledgement that their original conclusions were not correct.<\/p>\n<p><sup>\u2020<\/sup>Almost. The article <span id=\"cite_ITEM-19180-2\" name=\"citation\"><a href=\"#ITEM-19180-2\">[3]<\/a><\/span> cites DOI: <a href=\"https:\/\/doi.org\/10.14469\/hpc\/2099\">10.14469\/hpc\/2099<\/a> (Ref 28),\u00a0but it does not cite DOI:\u00a0<a href=\"https:\/\/doi.org\/10.5517\/ccdc.csd.cc1n9ppb\">10.5517\/ccdc.csd.cc1n9ppb<\/a> because the latter had not yet been minted when the final proofs were corrected, and there is no mechanism to add it at a later stage.<\/p>\n<h2>References<\/h2>\n    <ol class=\"kcite-bibliography csl-bib-body\"><li id=\"ITEM-19180-0\">S.L. Ackermann, D.J. Wolstenholme, C. Frazee, G. Deslongchamps, S.H.M. Riley, A. Decken, and G.S. McGrady, \"Crystallographic Snapshot of an Arrested Intermediate in the Biomimetic Activation of CO<sub>2<\/sub>\", <i>Angewandte Chemie International Edition<\/i>, vol. 54, pp. 164-168, 2014. 
<a href=\"https:\/\/doi.org\/10.1002\/anie.201407165\">https:\/\/doi.org\/10.1002\/anie.201407165<\/a>\n\n<\/li>\n<li id=\"ITEM-19180-1\">J. Hurmalainen, M.A. Land, K.N. Robertson, C.J. Roberts, I.S. Morgan, H.M. Tuononen, and J.A.C. Clyburne, \"Comment on \u201cCrystallographic Snapshot of an Arrested Intermediate in the Biomimetic Activation of CO<sub>2<\/sub>\u201d\", <i>Angewandte Chemie International Edition<\/i>, vol. 54, pp. 7484-7487, 2015. <a href=\"https:\/\/doi.org\/10.1002\/anie.201411654\">https:\/\/doi.org\/10.1002\/anie.201411654<\/a>\n\n<\/li>\n<li id=\"ITEM-19180-2\">J. Almond-Thynne, A.J.P. White, A. Polyzos, H.S. Rzepa, P.J. Parsons, and A.G.M. Barrett, \"Synthesis and Reactions of Benzannulated Spiroaminals: Tetrahydrospirobiquinolines\", <i>ACS Omega<\/i>, vol. 2, pp. 3241-3249, 2017. <a href=\"https:\/\/doi.org\/10.1021\/acsomega.7b00482\">https:\/\/doi.org\/10.1021\/acsomega.7b00482<\/a>\n\n<\/li>\n<li id=\"ITEM-19180-3\">S.L. Ackermann, D.J. Wolstenholme, C. Frazee, G. Deslongchamps, S.H.M. Riley, A. Decken, and G.S. McGrady, \"Corrigendum: Crystallographic Snapshot of an Arrested Intermediate in the Biomimetic Activation of CO<sub>2<\/sub>\", <i>Angewandte Chemie International Edition<\/i>, vol. 54, pp. 7470-7470, 2015. <a href=\"https:\/\/doi.org\/10.1002\/anie.201504197\">https:\/\/doi.org\/10.1002\/anie.201504197<\/a>\n\n<\/li>\n<\/ol>\n\n<\/div> <!-- kcite-section 19180 -->","protected":false},"excerpt":{"rendered":"<p>FAIR data is increasingly accepted as a description of what research data should aspire to; Findable, Accessible, Inter-operable and Re-usable, with Context added by rich metadata (and also that it should be Open). 
But there are two sides to data, one of which is the raw data emerging from say an instrument or software simulations [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[3,1754],"tags":[106,2309,1233,1428,2318,1473,2331,2333,1510],"class_list":["post-19180","post","type-post","status-publish","format-standard","hentry","category-chemical-it","category-crystal_structure_mining","tag-computing","tag-context","tag-data","tag-data-management","tag-information","tag-knowledge","tag-raw-data","tag-software-simulations","tag-technologyinternet"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p1gPyz-4Zm","jetpack_likes_enabled":true,"_links":{"self":[{"href":"https:\/\/www.rzepa.net\/blog\/index.php?rest_route=\/wp\/v2\/posts\/19180","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.rzepa.net\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.rzepa.net\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.rzepa.net\/blog\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.rzepa.net\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=19180"}],"version-history":[{"count":0,"href":"https:\/\/www.rzepa.net\/bl
og\/index.php?rest_route=\/wp\/v2\/posts\/19180\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.rzepa.net\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=19180"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.rzepa.net\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=19180"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.rzepa.net\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=19180"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}