Snippet - FAIR Findable #1 - Intro to FAIR and F for Findable II

Video in TIB AV-Portal: Snippet - FAIR Findable #1 - Intro to FAIR and F for Findable II

Formal Metadata

Snippet - FAIR Findable #1 - Intro to FAIR and F for Findable II
Title of Series
CC Attribution 4.0 International:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Release Date

To start off: what are the FAIR data principles? So the FAIR data principles
were drafted by FORCE11. Now, FORCE11 is a community, an international community of scholars, librarians, archivists, publishers and research funders, that came together organically. It started in 2011, hence the 11 in FORCE11, and has been around ever since. What this community has looked at is how to facilitate change toward improved knowledge creation and sharing. As they were working on this, in 2015 they came together and said: it would be good to have some principles around research data and the sharing of that research data, and how you can best do that. So late in 2015 they crafted these four FAIR data principles, and in early 2016 they wrote an article about them, which was published in Nature. From that moment onwards the ball started rolling, and these principles started to receive attention and international recognition as something that is actually quite useful. There are a number of things to keep in mind if you look at the FAIR data principles, and they are probably the reason why the principles are attracting so much attention. One thing to note is that they don't just look at making research data human readable; they also look at making research data machine readable. I think that offers a lot of opportunities: thinking towards a future in which research data is machine readable, can be harvested by machines, can be pulled together, and can be used for big data approaches and for novel approaches to exploring data, creating knowledge and finding patterns in that data is, I think, an interesting step into the future. Another thing that is quite valuable about the FAIR data principles is that they are technology agnostic. If you read them, you'll find there is no recommendation to go with one specific technology; they are formulated in a way that different types of technology can be used to solve
the challenges. Another thing they've done quite well is to create a set of principles which are discipline independent, so the principles can be adopted across different disciplines in different ways, meeting the needs of the specific discipline. Also, if you look at the principles as proposed, they talk not only about the metadata and not only about the data, but about the two combined, and about where working on the metadata can enhance the visibility of the data, for example, or its reusability. As you have probably noticed by now, FAIR is an acronym: it stands for findable, accessible, interoperable and reusable. The R, reusable, is the one that sometimes results in a bit of confusion. People think that it stands for reproducible, but it's actually reusable, which is a broader concept, so just keep that in mind. We'll talk about each of those principles in more detail in the coming weeks. Before we move into the first one, findable, I have a few general pointers which are probably worth keeping in mind as we look at the FAIR data principles. So one of the
questions I sometimes get is: do you want all data to be FAIR? I don't think that is the case. I don't think that is necessary, and I don't think it fits in with research practice. If you look at researchers and the process in which they create research data, there are various steps in that process, and in some cases huge volumes of data are created out of experiments, coming off instruments, et cetera. These huge volumes of data can't be kept or stored in that original form; they often need to be manipulated, analyzed, processed, et cetera. So these huge volumes of working data are probably not suitable to be made findable, accessible, interoperable and reusable. It's rather the data as it moves through those steps: the final, resultant, analyzed data is probably more suitable for that purpose. Researchers also sometimes use scratch data to explore different experiments, explore different settings and see how things work. Not all of that data is worth keeping until the end. Now, there are also cases in which there is research with commercial interests, maybe even commercially funded. In that case there can be arguments why those commercial parties are not interested in having any of that research visible or public to the outside world; they want to keep it to themselves that this research has taken place. This also happens in the case of national security and defence research. So in those cases it probably does not make a lot of sense to make any of that research data findable, accessible, interoperable and reusable. One
question we sometimes get is: what about data about human subjects, where there are privacy and ethics considerations around the data? Should that data not also be kept hidden or private? Now, there is a distinction here between open data and FAIR data. In the case of open data, we're talking about making everything open. In the case of FAIR, we're actually talking about making it accessible through the appropriate routes, and that doesn't have to be open. So in the case of identifiable data that refers to human subjects, there might well be a very good argument why that data cannot be made openly available, but it can be made accessible through appropriate routes. In that case it would still be FAIR, because it would still be accessible; it would just not be open. We'll talk more about that next week when we get to the accessible point. Now, what the
FAIR data principles are not about, and this is something that very occasionally crops up: copyright law talks about fair use and fair dealing. That's not capitalized, that's in lower case; it's something completely different and it's not related to the FAIR data principles. One of the other things I ran into recently is that a number of market research companies have developed their own Fair Data mark, which is about how these market research companies treat the data that they collect as they're doing their market research. That is also lower case, and it is completely unrelated to the FAIR data principles in capitals. One other thing that's worth
keeping in mind is that FAIR is not an actual standard. Some people say: I want to make my data FAIR and I want to make sure it ticks all the boxes exactly. You'll notice, as we start talking about the FAIR principles and digging into them in more detail, that it's actually not that black and white. It is a set of principles, a set of ideas about how you can approach it, and how you actually approach it in practice will probably depend on the discipline. So there's not one standard that will work across all disciplines. Another thing to keep in
mind about the FAIR data principles is that if you want to make more data more FAIR, it's not just about the research data itself; it will actually require some work around it. It will require a layer of underlying infrastructure, and that can be human infrastructure or technical infrastructure, which is in place so that a researcher does not have to do it all on their own, and there will actually be things in place that make it easier for the researcher to make their data FAIR. Things that you can think about there are policies around making the data FAIR, and procedures and guidelines that might be in place. It would be great if there are tools, platforms or software in place that make it easier for the researcher to make their data FAIR at the end of that workflow. And finally, it's going to be important to have the skills and the skill set available to the researchers, research data managers, librarians, research analysts and research staff, all the different staff members that are involved in that process, to make it easier to make the data FAIR down the track. So I think one of the questions I
get is: why is it that specifically these FAIR data principles are coming up now, and why are they being adopted so widely? Well, I think for one, they've got an attractive acronym. The other thing is that it covers quite nicely work that is already being done. If you look at the principles in more detail, you'll find that some of the things covered there are actually things that organizations around the country have been caring about for a while, and are caring about more and more. So it's less about a completely novel approach and more about bringing things together under a nice acronym in a well-packaged form. I think there are other reasons why it's proven to be useful. First of all, it's receiving a lot of international recognition, rather than being just a national initiative. If you look at the principles, there is actually quite a lot of detail hidden below them, and quite useful detail. The fact that it is discipline independent makes it easy. It is not as hard a sell as making all data open. The only challenge here, and this comes back to that point about the FAIR data principles not being a standard, is that it is hard to measure. It's hard to hold data up to a list and say this data is FAIR and this data is not FAIR at all; it is more of a scale from being less FAIR to more FAIR. So if
you're looking at where FAIR has been picked up, and in what ways, there are plenty of examples out there. I've just picked a few here, some of them international, some of them national, some of them disciplinary. For example, in the European Union the high-level expert group working on the European Open Science Cloud picked up the FAIR data principles and embedded them in their work and their thinking around what the European Open Science Cloud should look like. If you look at the Horizon 2020 funding program by the European Commission, it has also drafted guidelines for data management, and those guidelines also use the FAIR data principles. If you look in the US, NIH has just set up a Data Commons pilot in which they want to explore what a cloud for sharing research data would look like, and there they're also looking at the FAIR data principles. In the Netherlands an initiative has been set up called GO FAIR, which is now reaching out to get more international momentum and more international partners. That's also a very interesting development, in that they've looked at the FAIR principles and also at how you need different elements to support them, including cultural change, training and building an infrastructure to make sure that data can be made FAIR easily. In the UK there's currently a project going on around FAIR in practice, taking the FAIR principles and exploring what they mean in different disciplines. The American Geophysical Union has just come up with a project, I think it was only yesterday that the press release went out, and what they are looking at is what it will mean to make data open and FAIR in the Earth and space sciences, exploring that further. And closer to home, here in Australia, one thing you might have already heard come by is the FAIR access to research outputs policy statement, which was drafted and is now available for support by institutions around the country. The focus there is very much around
research outputs in the ARC/NHMRC definition, as in publications, conference proceedings and all sorts of publication materials, and how those materials can also be made FAIR. So that was a long-winded introduction
more in general about the FAIR data principles. The one principle I wanted to talk about today is the first of those, and that's findable. If you look at the actual principles and the way it's described, findable is broken down into four elements. First, for the research data to be findable, the principles say that the data and the metadata should be assigned a globally unique and eternally persistent identifier. In practice that just means it needs a DOI, a handle or a PURL: some identifier which is globally unique and eternally persistent, with an organization behind it that cares about making sure that the identifier will resolve to that data set even when the data set moves. This is where that example of being technology independent comes up: they don't recommend one over the other. Any of those solutions works, as long as the identifier is in place and it actually resolves. The second element is that data should be described with rich metadata. That's great; however, they don't specify what rich metadata means. So this is one of those places where it's not black and white whether your data is FAIR or not. What we'd say is: make sure that there is enough metadata assigned alongside the data so that it can be found, and that it answers the right questions from somebody who is searching for your data. The third element talks about the metadata and the data being registered and indexed in a searchable resource. There are several ways to tackle this. One way to think about it is having a search interface locally, a database locally, some way of making sure that your data collection can be found. But what we'd also definitely recommend is making sure that the descriptions of the data collections are passed on to aggregators: national aggregators, for example Research Data Australia, but there are also other
aggregators out there. There are more disciplinary aggregators like TERN, or the descriptions might go out to an international disciplinary aggregator like OLAC, the Open Language Archives Community. Data can also be published in national or international disciplinary repositories, like PANGAEA for earth and environmental sciences or, in the case of astronomy for example, the International Virtual Observatory Alliance and the systems they have in place. So there are various possible routes to publish your data, but make sure that it goes into a place where it can be searched, can be found, and will also be indexed by search engines like Google Scholar. Finally, the last point, and this really comes back to the first one: if you're going to have a globally unique and eternally persistent identifier for the data collection, like a DOI, handle or PURL, make sure that it's actually captured in the metadata. Okay, so that was a quick overview of findable and the way that they have described findable.
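
As a rough illustration of the four findable elements described in the talk, the sketch below checks a dataset description against each of them. It is only a sketch under stated assumptions: the record layout, the field names (`identifier`, `metadata`, `indexed_in`), the function name `check_findable` and the threshold of three descriptive fields are all hypothetical and do not come from the FAIR principles, which deliberately leave "rich metadata" undefined. For simplicity it only recognizes DOI-style identifiers, although a handle or PURL would do equally well, and it checks the identifier's syntax, not whether it actually resolves.

```python
import re

# Assumption: a simple syntactic test for DOI-style identifiers only.
# Handles and PURLs are equally valid choices but are not checked here.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

# Assumption: a hypothetical list of descriptive fields used to
# approximate "rich metadata"; the principles do not define one.
DESCRIPTIVE_FIELDS = ["title", "description", "creator", "keywords", "date"]


def check_findable(dataset, min_fields=3):
    """Return one True/False result per findable element."""
    identifier = dataset.get("identifier") or ""
    metadata = dataset.get("metadata") or {}
    filled = [name for name in DESCRIPTIVE_FIELDS if metadata.get(name)]
    return {
        # 1. A globally unique, persistent identifier is assigned.
        "persistent_identifier": bool(DOI_PATTERN.match(identifier)),
        # 2. The data is described with (enough) metadata.
        "rich_metadata": len(filled) >= min_fields,
        # 3. The description is registered in a searchable resource.
        "indexed": bool(dataset.get("indexed_in")),
        # 4. The identifier itself is captured in the metadata.
        "identifier_in_metadata": bool(identifier)
        and metadata.get("identifier") == identifier,
    }


# Hypothetical example record; the DOI is made up for illustration.
example = {
    "identifier": "10.1234/example-dataset",
    "metadata": {
        "title": "Example survey data",
        "description": "Hypothetical record used for illustration",
        "creator": "A. Researcher",
        "identifier": "10.1234/example-dataset",
    },
    "indexed_in": ["Research Data Australia"],
}

result = check_findable(example)
for element, passed in result.items():
    print(element, passed)
```

Counting how many elements pass, rather than demanding all four, fits the point made earlier in the talk that FAIR is more of a scale from less FAIR to more FAIR than a pass or fail standard.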