FAIR Reusable #4 - R for Reusable - 20-09-17

Video in TIB AV-Portal: FAIR Reusable #4 - R for Reusable - 20-09-17

Formal Metadata

FAIR Reusable #4 - R for Reusable - 20-09-17
Title of Series
CC Attribution 4.0 International:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Release Date

Content Metadata

Subject Area
#4 REUSABLE covers:
-- an overview of the REUSABLE principles, which bring together licensing, provenance and domain-relevant standards
-- resources to support institutional awareness and uptake of Interoperable principles

Speakers
1) Keith Russell, ANDS, provides an overview of the principles behind the FAIR concept of how to make data Re-usable
2) Margie Smith (unrecorded), Geoscience Australia, discussed why provenance information is critical to data reuse and how GA have approached attaching provenance information to data
3) Nerida Quatermass from Creative Commons Australia @QUT presents on licensing frameworks and choosing a licence to make the data more Re-usable
Welcome, everybody. This is the fourth in the series of webinars about the FAIR data principles: the webinar on R for Reusable. I'll just briefly
introduce myself. My name is Keith Russell, and I work for the Australian National Data Service (ANDS). I'm the host for today, and thank you to my colleague Susanna Servine, who is co-hosting this webinar and organizing things in the background. As a general introduction: the Australian National Data Service works with research organizations from around Australia to establish trusted partnerships and reliable services, and to add value to research in a way that enhances capability in the research sector. We're working together with two other NCRIS-funded projects, Research Data Services (RDS) and Nectar, to create an aligned set of joint investments to deliver transformation in the research sector. This webinar is part of a larger series of ANDS activities which aim to support the Australian research community in increasing our ability to manage research data as a national asset.

This is the fourth and final of the four webinars in the series about the FAIR data principles. We've had webinars on Findable, Accessible and Interoperable, and now we're up to the fourth one, Reusable. Please note, because this comes up every now and then: the R stands for reusable. It does not stand for replicable or reproducible. Reusable is actually broader than those other terms, and means that the data can be used for more purposes than purely to replicate or reproduce the original research. Today I'll give a very
brief introduction on what FORCE11 says about reusable under the FAIR data principles, and then I'd like to hand over to Nerida from Creative Commons, who will talk about licensing frameworks and using a license to make your data more reusable. After that, Margie Smith from Geoscience Australia will talk about provenance information: not only why it's important, but also how GA have actually approached attaching provenance information to research data. Now, first of all, I'll give a brief
introduction to what FORCE11 agreed on as part of the FAIR data principles under the heading of Reusable. First of all, I'd like to emphasize that to actually make your data reusable, you will also need to incorporate the elements under Findable, Accessible and Interoperable. If your data is not going to be findable or not going to be accessible, then it will ultimately not be reusable anyway. So it's best to see this as sitting on top of making your data findable, accessible and interoperable: these are extra elements that you need to think about to make it reusable.

In the way they've talked about it, there is first a high-level heading saying that the data and the metadata should have a plurality of accurate and relevant attributes. That's pretty general, and they then drill down into three specific attributes that are required.

The first of those attributes is that the data and the metadata are released with a clear and accessible data usage license. If you make your data available without any license at all, it makes it very hard for a user, or potential reuser, to actually use it, because it's completely unclear what the agreement is, whether there's any copyright over the data, whether there are any restrictions, things like that. That's why it's very important to have a license, so it's clear what you can do with the data. And if you do assign a license, please make sure you use a standard license, which is definitely preferred, and ideally one in a machine-readable format, because that way machines can actually interpret whether the data can be used by that machine, to pull in the data and incorporate it in analyses, or whether they need to skip it because it's not licensed for that purpose. Nerida will talk in much more detail about a possible framework to use to assign a license to your data.

The second point they make under these attributes is that the data and the metadata should be
associated with information about the provenance. This provides clarity on the steps that were taken in collecting, selecting and analyzing the data: all the steps that have been taken to turn it from raw data into derived data and into the final data set that is made available using the FAIR data principles. For a potential reuser this is extremely informative, because it gives you much more information about the context and the background in which the data was created, and whether the data will also be suitable for the purposes that the reuser wants to use it for. Attaching provenance information is easier said than done, and I'm really grateful that Margie is going to be able to talk a little bit more about what's happened in practice: how GA has tackled this and how GA is incorporating provenance information.

The third and final point they make about these relevant attributes is that the data and the metadata should meet domain-relevant community standards. Under Findable they talked more in general about having metadata that allows the data to be findable, and under Interoperable they talked a little bit about the data and using standards. The point they're making here is that it's very useful to make sure that the data is in a data format and a file format that is commonly used in the discipline, so that another researcher in that same discipline can easily pick it up and use it. And if you use a metadata format, think about using one that is common in the discipline too, so that it contains specific fields that are relevant to that discipline, and a researcher in that discipline can easily understand more of the detail: what columns are in that data set, what the context is in which the data was collected, etc. That makes it much more useful for a potential reuser from that community to pick up the data and reuse it. Now I
would like to first of all hand over to Nerida: Nerida Quatermass from Creative Commons Australia, based at QUT. I've asked her to present on licensing frameworks and choosing a license to make your data more reusable, to give you a bit of a sense of a possible framework you could use that is a standard framework and that can be made machine readable, so that other reusers have a much clearer picture of how the data can be used. So I'd like to hand over to Nerida.

Thanks
very much, Keith. I think I've just got time for a very quick overview of how the open licensing framework provided by Creative Commons achieves FAIR with regard to reuse rights. The slides will be made available to you, and you'll notice that each slide has a link to relevant information, and there's a slide at the end of the presentation which lists good resources as well.
Copyright laws grant a monopoly over a work in material form to the owner of it. Creative Commons licenses have filled a need for a public license, that is, one that anybody can rely on as permission to reuse a work. Before CC licenses, the only way to get reuse rights was through the exceptions allowed in copyright law or through licenses directly negotiated between a copyright owner and a licensee. So a public license like a Creative Commons license is central to opening up access to research outputs, including the sharing of data associated with these. I've put an open access spectrum representation on the slide because it's really important to distinguish between free access and reusability, which starts with permission to share but extends to the right to make derivative works. These permissions to reuse are communicated with a clear, machine-readable license.
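As a rough illustration of the point that a machine can read a license and decide whether to use or skip a data set, here is a minimal Python sketch. The license URLs are the published Creative Commons addresses; the lookup table, the two permission flags and the function name `may_reuse` are assumptions made for this example, not part of any Creative Commons tool, and the sketch leaves out attribution and share-alike obligations.

```python
# Sketch only: a hand-written table of what each Creative Commons 4.0 license
# (and the CC0 1.0 waiver) allows. The table layout and the function name
# `may_reuse` are assumptions made for this illustration.
CC_BASE = "https://creativecommons.org/"

LICENSE_TERMS = {
    CC_BASE + "publicdomain/zero/1.0/": {"share_adaptations": True, "commercial": True},
    CC_BASE + "licenses/by/4.0/": {"share_adaptations": True, "commercial": True},
    CC_BASE + "licenses/by-sa/4.0/": {"share_adaptations": True, "commercial": True},
    CC_BASE + "licenses/by-nc/4.0/": {"share_adaptations": True, "commercial": False},
    CC_BASE + "licenses/by-nc-sa/4.0/": {"share_adaptations": True, "commercial": False},
    CC_BASE + "licenses/by-nd/4.0/": {"share_adaptations": False, "commercial": True},
    CC_BASE + "licenses/by-nc-nd/4.0/": {"share_adaptations": False, "commercial": False},
}


def may_reuse(license_url, *, share_adaptations, commercial):
    """Return True if the intended use fits the license found at license_url."""
    terms = LICENSE_TERMS.get(license_url)
    if terms is None:
        # No license, or one this table does not know: skip the data set.
        return False
    if share_adaptations and not terms["share_adaptations"]:
        return False
    if commercial and not terms["commercial"]:
        return False
    return True
```

A harvester could call `may_reuse(url, share_adaptations=True, commercial=False)` for each data set and leave out anything that returns `False`, including data sets with no license at all.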
You probably all know a little bit about Creative Commons licenses, but as a quick overview: there are four license elements that can be combined, and that results in six licenses. They're featured on this slide, again on a spectrum from allowing more to allowing less reuse of a work. The most open or permissive license is known as the Attribution or CC BY license, and the most restrictive is the Attribution-NonCommercial-NoDerivatives license. You'll see on my slide the Free Cultural Works seal. It's put there to show you that there are two licenses that qualify for it; the relevance of that seal is that it was developed for Wikimedia and Wikipedia content, and it signals an important delineation between less and more restrictive licenses applied to works in the digital commons. So that just fills out the story. In addition to the licenses, Creative Commons offers two public domain tools. Now, CC0 is the public domain tool
for creators to use, but there's also a Public Domain Mark, which is represented by a copyright symbol with a strikethrough, and that's something that is used to identify works that are already in the public domain. It's being used commonly by cultural heritage institutions in their digital collections, for example. But I'm just going to focus on CC0, because it can be particularly important for maximizing the reuse of data and databases, where it otherwise might be unclear whether highly factual data and databases are restricted by copyright or other rights. CC0 is intended to cover all copyright and database rights, so that however data and databases might be restricted, under copyright or otherwise, those rights are all surrendered. So CC0 is first and foremost a waiver: it means you waive all of the rights that you have, with zero rights left in a work, effectively dedicating it to the public domain. It has legal code beneath it because you need a legal mechanism to relinquish your rights. When you release content under a CC0 waiver, you're explicitly stating that you do not expect attribution. Now, there's a little uncertainty around CC0 because Australian moral rights are fairly new, but the licenses have been designed as carefully as possible to respect the author's wishes, so the intent and the general understanding is that you do not need to provide attribution.

Probably the main point that I would like to make, and Keith has already referred to this: do license your data. International rules are too variable to rely on the public domain. CC0 ensures maximum compatibility with other licensed works and it prevents attribution stacking, for example attributing to many contributors in a project, or where you attribute not only the immediate source of a derivative work but also the works upstream of it. And there are other ways to acknowledge contribution. The next best thing is probably CC BY, the Attribution license, if you really want attribution to be a legal requirement. The licenses
communicate reuse rights through the three-layer design built into the license. The first layer is the legal code: that's the legal instrument which states the terms and conditions of the license. The second layer is the human-readable format: the plain-language summary that we usually see if we click on the link to a CC license. It's got the relevant icons that clearly indicate the conditions of your licensing and the reuse rights under the license; you might recall the words "You are free to" and "Under the following terms". In addition to supporting reuse by individuals, the FAIR principles put specific emphasis on enhancing the ability of machines to automatically find and use the data, and that brings us to the third really important layer of the license, which is the machine-readable translation of the license, which attaches itself to digital works or digital copies of works. The translation code, which is called Rights Expression Language, becomes embedded in the digital source, and that helps search engines and other applications identify a work. I might say this can also be achieved by uploading a work to a content-sharing platform that supports CC licensing and takes care of the machine readability for you. It's also important to actually mark a work with a license, and I'll talk about that shortly.

Regarding the robustness of the legal instrument, the Creative Commons licenses have been upheld in every jurisdiction in which litigation concerning them has occurred, but to date there have been no recorded cases of litigation concerning a CC license in Australia, which would tend to support the quality of their construction. I'll just make the point that CC licenses are irrevocable, and so they last for the term of copyright. The licenses are also non-exclusive, so it's open to the rights holder to apply another license to the material should the need arise. That's called dual licensing. So, for example, if you release material under a CC BY-NonCommercial license but a
commercial partner wishes to exploit the material, you are free to enter into a separate license with the commercial partner that permits their commercial use. Now, to
maximize discoverability by search engines and software systems when you are licensing a work, you should make sure to use our license chooser tool to get the machine-readable HTML code. The license chooser also works to mint the license for the purpose of marking the work itself. There are four important things that I'll just point out with regard to the license chooser: it gives you a framework to select your license and to provide attribution and citation, and I'll talk about each of those things a little bit.

With regard to license selection, the license chooser guides you through a series of questions about what reuse you'll allow: will you allow adaptations of your work to be shared? Will you allow commercial uses of your work? Depending upon the answers that you give to those questions, the appropriate license for you to select will be offered to you, and you can see an example there. You do need to remember that if your work is an adaptation of a work licensed under a CC ShareAlike license (there are two of those), then your derivative work must be made available under the same license, as per the ShareAlike condition.

With regard to attribution, attribution is a base condition of all of the CC licenses. There is flexibility around attribution requirements, though, which you'll read in the license: it says reasonable to the means, medium and context. This is really helpful. It enables you to do things like not having attribution within a work if it's not reasonable to do so; you can link to a separate resource that provides the required attribution. It's also flexible in that a licensor can waive some or all of the attribution requirements.

The next really important feature of the license chooser is with regard to citation: being able to locate the work, and perhaps also the source works that led to that work. I think that probably answers some of the concerns from data creators about being able to find the original
data. There are a few other requirements here. If the work you're licensing is a derivative of another work, then you need to communicate that your work is a derivative, you need to include the source URL of the original work, and you also need to describe the modification that you've made. When you're modifying materials under the new version 4 CC licenses, you actually have to make a note of any modifications that you make to the materials, regardless of whether the modification is significant enough to merit it being a derivative work, and you have to provide the URI back to the source. So again, I think that's a good reassurance that has been built into the version 4 licenses. It might be unfeasible to include attribution, as I've already mentioned, perhaps within a merged data set, in which case include a URI back to the unmodified version.

Lastly, the license tool gives you the option to provide a "more permissions" URL. So, for example, if you license something CC BY but you're okay with people not attributing you in certain cases, then this is your chance to specify that in that resource document that you've got. Remember that you can't change the terms of a CC license, but you can always grant additional permissions or warranties beyond what the license allows. The other thing is that CC licenses allow you to incorporate elements of third-party materials into your works, just by marking these and providing attribution to them. So I referred to the need to
mark a work to convey the license as well, and on this slide you can see a number of ways in which to do that. There are some useful source documents there, like CC's download page, which gives you all of the icons, buttons, etc. Regarding content platforms, even if there isn't a license field in a content platform, there's usually a description or some sort of free-form field where you can enter information about a work.
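The license selection questions and the marking step described in this part of the talk can be sketched in a few lines of Python. This is not the Creative Commons license chooser itself: the function names `choose_license`, `license_url` and `mark_html`, the answer values, and the sample title, author and URL in the usage note are all hypothetical. The only parts taken from real practice are the published license URL pattern and the `rel="license"` link type, which is what lets software recognize the license link.

```python
from html import escape

CC_VERSION = "4.0"  # the version 4 license suite discussed in the talk


def choose_license(allow_adaptations, allow_commercial):
    """Map the two chooser questions to a license code such as 'by-nc-sa'.

    allow_adaptations is 'yes', 'share-alike' or 'no' (hypothetical values);
    allow_commercial is True or False.
    """
    code = "by"  # attribution is the base condition of all six licenses
    if not allow_commercial:
        code += "-nc"
    if allow_adaptations == "share-alike":
        code += "-sa"
    elif allow_adaptations == "no":
        code += "-nd"
    elif allow_adaptations != "yes":
        raise ValueError("allow_adaptations must be 'yes', 'share-alike' or 'no'")
    return code


def license_url(code):
    """Build the public URL of the license deed for a license code."""
    return f"https://creativecommons.org/licenses/{code}/{CC_VERSION}/"


def mark_html(title, author, source_url, code):
    """Return an HTML marking with title, author, source and license."""
    name = f"CC {code.upper()} {CC_VERSION}"
    return (
        f'<a href="{escape(source_url)}">{escape(title)}</a> '
        f"by {escape(author)} is licensed under "
        f'<a rel="license" href="{license_url(code)}">{name}</a>.'
    )
```

For example, `mark_html("Sample survey data", "A. Researcher", "https://example.org/data", choose_license("yes", True))` produces a line naming the work, its author and its source, with a link to CC BY 4.0. The same line, as plain text, could go into a free-form description field on a platform that has no license field.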
That was a very brief overview of the Creative Commons licenses and the license chooser framework, and I guess my key message for today is that reuse is a core component of FAIR data, so do license your data to enable reuse. I think that the Creative Commons licenses provide a simple mechanism to ensure that the users of research have the rights they need to reuse, replicate and apply research outputs and data, and to disseminate and communicate research output in order to maximize the impact of work, while protecting, very importantly, the intellectual property and the academic integrity of a work, I think, with the built-in attribution and citation, which creates a clear path to the original data. And that's the useful
resources link, and you'll all be able to get your hands on that when the copy is made available. And that's it, Keith.

Well, thank you very much, Nerida. That was really interesting, and great to see not only the human-readable side of things but also the machine-readable side of things, and the way that that information can be made available to machines. Thanks, that's good. So now I'd like to hand over to Margie. Margie Smith from Geoscience Australia will be presenting on how Geoscience Australia has been working on collecting information about the provenance of research data and attaching that to the research data. Thank you.

Thank you,
Nerida, for the background on the CC license suite and the different options there, all the way from CC0 to the most restrictive licenses. I think it's a really useful way of seeing how the framework works and especially how you can make it machine readable and attach that. Margie, thanks for the presentation on attaching that provenance information, how you do that and what that means. I was especially interested in the drivers for GA in actually collecting that information and making that available.
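Because the provenance talk itself was not recorded, the following is only a generic Python sketch of what attaching provenance information to a data set can look like, following the earlier description of recording the steps that turn raw data into derived data and a final data set. It does not represent Geoscience Australia's approach. The class names, field names, file names, agents and dates are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ProvenanceStep:
    """One processing step: who did what, when, from which inputs to which outputs."""
    activity: str
    agent: str
    performed_on: date
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)


@dataclass
class ProvenanceRecord:
    """All recorded steps that led to a published data set."""
    dataset: str
    steps: list = field(default_factory=list)

    def upstream_of(self, name):
        """List every data set that `name` was derived from, nearest first."""
        found = []
        frontier = [name]
        while frontier:
            current = frontier.pop()
            for step in self.steps:
                if current in step.outputs:
                    for source in step.inputs:
                        if source not in found:
                            found.append(source)
                            frontier.append(source)
        return found


# Hypothetical example: raw data is collected, cleaned, then analyzed.
record = ProvenanceRecord(
    dataset="final.csv",
    steps=[
        ProvenanceStep("collect", "field team", date(2017, 3, 1), [], ["raw.csv"]),
        ProvenanceStep("clean", "data curator", date(2017, 4, 12), ["raw.csv"], ["clean.csv"]),
        ProvenanceStep("analyze", "researcher", date(2017, 6, 30), ["clean.csv"], ["final.csv"]),
    ],
)
```

With a record like this attached to the published data set, a potential reuser, or a program acting for them, can call `record.upstream_of("final.csv")` to see which earlier data sets the final one was derived from and inspect each step along the way.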
Finally, in case you're interested in more information about reusable: first of all there are the slides, and the slides will be made available from Nerida and from Margie, so they have links to relevant information. There's also information on the ANDS website: we have some information on licensing data for reuse, so it's maybe worth going to that link and having a look. So just a few
resources on reusable. If you're interested in the topic of provenance and attaching provenance information to research data, we have an interest group on this topic, and you can join that interest group to get involved in the discussions and hear what's going on there. Finally, if you are interested in different types of metadata, metadata that are specific to different disciplines and that allow for maximum reuse within that discipline, then I would recommend following this link, and there are links off to a whole
series of metadata standards in use across a series of disciplines. And finally, last year we did 23 (research data) Things, and one of these Things is also relevant to the discussion today: that was Thing number nine, around licensing data for reuse. So if you want to not only learn a little more but also have a bit of hands-on experience, I recommend going to Thing nine and trying out the assignments there.

Okay, finally, as we've now come to the end of the four webinars on the FAIR data principles, I just thought I'd give you a bit of an update on where we are at and what we will be doing around FAIR in the coming year. In the coming year we are interested in continuing to work on what it means to make data FAIR, and that includes collecting and sharing examples of making data FAIR in specific disciplines, because there are different elements and different aspects to making data FAIR which are relevant in different disciplines. So we'll be working in that space and trying to share some examples and good practices. We will also continue to engage with data providers, research organizations, research facilities and institutions to work on aspects of policy, human and technical infrastructure, but also skills, that can be put in place to make it as easy as possible for researchers to make their research outputs FAIR. We would
like to acknowledge the National Collaborative Research Infrastructure Strategy program, which provides the funding for ANDS. Thank you very much.