Making Open Science Sustainable for Chemistry

Video in TIB AV-Portal: Making Open Science Sustainable for Chemistry

Formal Metadata

Title: Making Open Science Sustainable for Chemistry
License: CC Attribution - NoDerivatives 4.0 International. You are free to use, copy, distribute and transmit the work or content in unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Subject Area
Interview with Ian Bruno, Cambridge Crystallographic Data Centre, Cambridge, UK, recorded at the BEILSTEIN OPEN SCIENCE SYMPOSIUM (22 – 24 May 2017). Ian Bruno discusses making open science sustainable with Martin Hicks. Ian makes the point that simply making data sets available is not sufficient: they also need to be discoverable and reusable. He outlines how the Cambridge Crystallographic Data Centre (CCDC) started producing the Cambridge Structural Database over 50 years ago, and how over time it has used different models to ensure its sustainability. In the internet era, the challenge has been to provide a level of free access whilst generating income through the value-added services and software that work on top of the database. Sustainability is not just an issue of financial support, but is also driven by the user base. Here the CCDC set up a network of national affiliated centers, originally responsible for the distribution of the database, which at that time was on tapes. Nowadays this network helps in attracting funding so that local academics can receive the value-added services. The driver of the sustainability of a resource is its scientific value. Ian discusses the idea of the scholarly commons and different ways of communicating science, in particular looking at the underlying research objects that contribute to a story. To improve discoverability we need to address the publish-or-perish paradigm and develop workflows that are much more closely related to the research that people are doing. Here we need appropriate levels of documentation or metadata associated with those research objects so that people are able to find them. In terms of standards for chemical structures, Ian notes that people are currently using de facto standards and have built workflows around them, and thus will be reluctant to change. Discussions are currently underway to ensure that the specifications of these formats and their extensions are made openly accessible.
Keywords: open science, databases
Martin Hicks: I'm speaking to Ian Bruno of the Cambridge Crystallographic Data Centre. We're at the Open Science Symposium here, and we've been talking about data repositories, open access, and making science more open. One of the big questions is: how can these initiatives be made sustainable? What needs to happen to do that?

Ian Bruno: OK, so there's been a lot of talk at the symposium about open data, and a lot of comments as well about how just making data sets available isn't enough. You also need to think about how they can be found and discovered, and how people can use them with other data sets and make good use of them later. Now, that takes time and effort from someone in the community, and that somehow has to be paid for. There are quite a lot of ways in which people do currently pay for those sorts of activities. There's a database of protein crystal structures, which has been around for more than forty years, called the Protein Data Bank. That's very much funded through grants from European and American funding agencies to do their activities, and that enables them to make the results of those activities freely available. In the case of the Cambridge Crystallographic Data Centre, where I work, the database we have, the Cambridge Structural Database, was established more than fifty years ago, and the activity there was originally funded in the same way, through grant funding. Over time, some members of the community, based in academia and industry, were willing to pay for access to the database. This was before the days of the internet and electronic dissemination, so there's a tradition of paying for access to good-quality data. As a result of that we were able to become self-sustaining, and we now have a model where we are able to sustain our activities through providing value-added services and software products that work on top of the aggregation of the data, to help people apply that in their real-life work.

The challenge we've faced more recently, with the introduction of the internet and people being able to deposit data directly with us, is an expectation that that data should be freely available. The challenge has been how we can provide a level of free access that still enables us to generate income through those value-added services. So at the moment we are having to tread that line, so that we do still retain the income that we can get from academia and industry, and also think about how we open up the services so that there is a basic level of access that supports some uses of the data.

Martin Hicks: Sustainability has got many facets. It's not just a question of how you finance an initiative; it also has to be accepted by the community. One of the big challenges in moving forward, in making data more available in a structured and public way, is to get the community to actually use the service, irrespective of how it's funded. So did you have some sort of plan or scheme for how you went about involving the community in what you were trying to set up, and getting their [unclear] and interest?

Ian Bruno: Let's start by focusing perhaps on the academic side of things. This goes back to the early days as well. The original research funding that we were getting started to be diminished, and there were discussions with people outside of the UK, in other countries, about finding money to help support the activities. What that led to has been a network of what we call national affiliated centres that exist around the world. In the early days they took responsibility for the distribution of the database, which in those days was copied onto tape: they copied it onto tapes and sent them off within their regions. Today there's a different way that they operate, because there's no need for that kind of physical copying of tapes. What we've been able to do recently is exploit those relationships to attract levels of income from a particular region that enable academics within that region, whoever they are, to have free access to the value-added services that we provide.

Martin Hicks: How would you bring your experience, to extrapolate or transport it, to other initiatives which are being started, such as some of the ones we've been talking about here? I don't mean only the financial aspects, but more the user-based aspects. What are the essential incentives that the community needs?

Ian Bruno: I think the most important thing is that there is a community that wants to see a resource succeed and survive. There are lots of ways in which you can conceive of funding a resource such as this. We've talked a lot about the CCDC model, which has been able to succeed through providing value-added services. That might work in other areas. In other domains, communities have been more accepting of the idea of a data deposition fee, which carries the cost of processing the data at the point of deposition. My hope is that what will drive the sustainability of a resource is its scientific value, and that this will help build communities around it, which will help maximise its use, through scientific investment but also through financial investment.

Martin Hicks: One of the other aspects we've been discussing is discoverability, and also the amount of data being produced, which is growing dramatically every year. That is, I guess, partly due to the publish-or-perish paradigm. Have you got any thoughts or suggestions on how we can perhaps reverse this trend, and find ways in which scientists are not so incentivised to publish more, but to publish just the essentials, perhaps fewer papers?

Ian Bruno: I'm going to say this could be quite a long way in the future, and there's some way to go to get to where we want to be. There have been discussions going on in the wider community about something that is being referred to as the scholarly commons. I think there are ideas behind that which start to move us towards thinking about publication workflows that are much more closely related to the research that people do, so that if someone does a piece of research that ends up with a result, that result can be made ready to publish straight away, if not made available straight away. So I think the publish-or-perish paradigm perhaps starts to change if you start redefining the idea of publishing and think about different ways of communicating science, particularly the underlying research objects that contribute to a story. There almost certainly will still be value in having narratives that sit across a large number of individual experiments and results. But perhaps they become less important, and we can focus more on just communicating the results of science sooner than we currently do, and in a way that makes them readily discoverable and reusable by people who might be able to use them. For that to succeed you need the appropriate levels of documentation and metadata associated with those research objects, so that people stand a chance of being able to find what they're looking for and understand whether it is of use to them.

Martin Hicks: Another question I have: we've already started talking about standards again. There was a discussion, as part of a chemical data interest group, about chemical structures and their formats, and I keep putting my hand up in meetings saying I think we really should be looking at a new structure format which would work for the complete workflow, from researchers through publishers, through perhaps to databases, because we keep having to transfer between one format and another, which is a lossy process and also very inefficient. Do you think that there's enough interest in the community, amongst, say, researchers, to actually address this issue?

Ian Bruno: There have been attempts in the past to do things along the lines of what you've just described, and they have tended not to be successful. So history might suggest that, even if there were the will, it wouldn't actually succeed. I think it's worth looking at the reasons why those attempts haven't succeeded, and it's because people have adopted de facto standards and built workflows around them, and changing those is actually quite a lot of effort. I think we're probably in the position right now where we perhaps should be starting with those de facto standards, and looking to see how we can make them extensible so that they can more reliably capture additional pieces of information. There are discussions going on about how we can make sure that the specifications of those formats are made openly accessible so that the community can start to build on them.

Martin Hicks: [unclear] We have the formats that people are using; they need to be made more open, and they need to be standardised more or extended. [unclear] It would be good if it's possible to have a community-generated discussion on what we need to be doing and how we need to be doing it, with a view to the future, because I think we're in a different position now in time, with the internet and very cheap storage, than we were when these other initiatives started out. I was involved in one myself, and I know the problems. So I think the opportunities for doing something are higher. But with all the legacy formats as well, the disinclination of people to change what they're doing, because they've got their jobs to do and are not really interested in changing, is I think also one of the things that will be hindering change.

Ian Bruno: I think that's a fair point. What I'm going to say I should attribute to Stuart Chalk, who is also involved in most of these discussions. He put forward the idea that perhaps we should at least do a thought experiment about a universal format for chemistry: not necessarily move ahead to the implementation, but sit down and think about what it might look like and how we might go about designing it.