Research Data Alliance 2nd Plenary - reports - National and International Trends and Developments - 1st October 2013

Video in TIB AV-Portal: Research Data Alliance 2nd Plenary - reports - National and International Trends and Developments - 1st October 2013

Formal Metadata

Research Data Alliance 2nd Plenary - reports - National and International Trends and Developments - 1st October 2013
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Release Date

Content Metadata

Subject Area
October 1st 2013 - Ross Wilkinson reports on the second plenary meeting of the Research Data Alliance that was recently held in Washington.
Thank you. What I want to do is really talk about the fact that this thing called the Research Data Alliance is in existence and is becoming quite important. Just six months ago the Research Data Alliance was launched in Gothenburg in Sweden, and we started something about sharing data without barriers around the world: no mean feat, and yet incredibly important and completely in alignment with what we're doing here in Australia. Australia was one of the three founding members of RDA, along with the European Union and the US through the National Science Foundation, so it had some pretty heavy hitters behind it, and people turned up to the first meeting with a degree of enthusiasm, looking forward to what might be. From Australia, Stefanie Kethers, Andrew Treloar and myself have been involved in helping form RDA, but really the thing about RDA is what it's going to do, and I'll get to that in a moment. What it's intended to do is to enable data to be shared more easily and rapidly, and the reason for that, of course, is that the data environment we're in is changing really quite quickly. Australia has made substantial investments in research data through investments like IMOS and TERN, but also through more generic investments such as RDSI and ANDS, and the investments that many of the organizations sitting around this table have made. CSIRO has made substantial investments, but many of the universities have as well, and right around the traps we're seeing that the need to treat data differently has occurred. So what I want to do today is give you an update on the second plenary. This was an interesting thing.
I have used Fran Berman's slides as the basis for this presentation. Fran is one of the co-chairs of the Research Data Alliance Council, and I spend many evenings talking about RDA with Fran. The notion of three days of peace, love and data is reminiscent of the first meeting, with the notion of the Woodstock of research data occurring, and it was very interesting. We had 368 participants from 22 countries and all sectors: a huge amount of interest. We had quite a lot of significant country interest in participating in RDA at a funder level, with interest from Germany, South Africa, France, England, China, South Korea, New Zealand and Taiwan, and the list does in fact go on. One of the interesting things about RDA is that it clearly has momentum, and so DataCite convened a data citation summit over the following two days, and those of us with the energy went on to attend that. So
the nature of the meeting was essentially a working meeting, but it started out with some keynotes. Tom Kalil from the Office of Science and Technology Policy at the White House spoke. The significance of that is that it says that right at the very top of government, throughout the world, we have people interested in these issues. The first plenary was opened by the Deputy Commissioner of the European Commission, whose name briefly escaped me. We then had Tom Kalil speak, really backing up the comments made by Neelie Kroes, which is the name that was escaping me briefly, and perhaps you're typing it in. So we had the US speaking after it made major announcements about publicly funded data being publicly available, whether it's from the research sphere or from the government sphere. We then had, as per usual, an inspiring talk by John Wilbanks, who is known for his work around Creative Commons, really driving the agenda that says that it's in everybody's interest, including providers', to make data available, and then Carol [surname unclear] providing an insight into some of the issues around making data available from a library's perspective. We had Claire McLaughlin, listed as from the Australian Embassy, but most recently responsible for getting [unclear] through government. We had a bunch of people talking about why they saw it was of interest to work in cooperation with RDA, from CODATA, [unclear], W3C, DataCite, et cetera. So what this slide says is that there's a significant degree of interest at the funder level in RDA. Now this one says that there's a lot
of people involved. Fifty countries have participation in RDA: academia, the research sector, the private sector and the public sector, and then there are a few people whose sector is not known. But the key is really that RDA is building momentum. RDA is intended to enable work to be done and delivered rapidly. We use the metaphor of building bridges between data locations, enabling people and data to connect through the bridges. Sometimes those bridges are going to be essentially technical in nature, and sometimes they're going to be more social in nature: how do you in fact encourage those sorts of things? But one of the things that is the hallmark of RDA is that the work needs to happen fast. RDA is not set up to be a 50-year initiative. We have the notion of a ten-year initiative, and what we're trying to achieve is to get consensus, and data to be shared amongst those groups, quickly. What we want to do then is to put together, in particular, working groups who will deliver results within 18 months. So we're trying to follow, roughly speaking, the IETF model of groups formed as a result of people coming together and saying "we have a problem that needs resolving", working on that problem, coming up with a resolution, and then delivering.
There were lots of birds-of-a-feather sessions established, and there were groups starting to look at how they are going against the delivery of the promises they made six months ago, because that's one third of the way through the life of a group under the model we're talking about. A whole lot of stuff came out of that. One item is to ensure that the IP developed in interest and working groups is protected so that it is available for all to use, and another is how we in fact enable the RDA activities to be delivered. But the pictures on the right are the heart of what RDA is: it's not about people sitting in a big auditorium listening to people talk, it's about people doing. These were the 360 people turning up at the meeting, doing. So RDA is intended to be a doing group. So what is it doing? So these
are the things that emerged during the meeting, prior to the meeting and at the last meeting: areas where a decision was made by people who practically came along and said, "Yes, I want to put some effort into thinking about or working on this, to come up with what might happen." Before I go into the detail of these, I want to distinguish between the various bits. A working group is intended to be a short, sharp piece of effort that tackles a problem and comes up with a solution that is adopted. Adoption is the key: what we wanted to see was work that was done, delivered and then used. From an Australian perspective, we needed to know whether the work being done in this space was delivering, and we'll come back to that in a little while. The interest groups were where there were perhaps a number of pieces of work that could be done, but there was a longer-term interest in these areas and work needed to be done in this space. So there was, if you like, an area for conversation, with the aim of taking those pieces of conversation and spinning off working groups that would deliver. Now, for some of the activities that might make sense, let me give you a good example with the community capability model, which is really describing whether areas are ready to engage in that space, where in some sense the output is less a specific example of data being shared and more an example of ideas being shared on the mechanisms that would enable those things to occur. But I think you can see from this that there are plenty of groups gathering together to discuss related issues. There were a bunch of people who'd been through the hoop that is establishing a working group, which requires a well-described statement of work and, importantly, a notion of what will be delivered and then, even more importantly, who will take that up. There is a much wider group of interest groups around the spaces, and I'll talk about that a little bit. And then, as indicated, there was a large amount of effort put into the DataCite citation activities in particular, and then a data citation meeting right at the end of that, the DataCite one.
Absolutely, all it requires is going to the RDA website, the Research Data Alliance site, diving in there and expressing interest, and there are chairs of all of those different groups who will enable you to participate. The intention of RDA is that the plenary meetings are a small part of the work of RDA; the bulk of the work of RDA occurs through the discussion groups. It is, I think, important that Australia starts to get more involved in this. Australia's got heaps to offer, but more importantly Australia's got heaps to benefit. Because Australia has been so actively involved in doing, firstly this is an opportunity to implement things that are internationally agreed, and secondly it gives us a chance to test the proposition that the approach being used at a particular institution will stand up in the international environment. Moreover, these themes give you an indication of where the interest is around the world. The other thing I would say is that it is not only an opportunity to get involved in the current things: if there's something important around research data that isn't well represented currently, you should create a group. One of the things about that is that ANDS will work with you along those lines to support your involvement and engagement in that. I've indicated that Andrew, Stefanie and I have been involved in the organization of RDA, but we're equally interested in supporting the work in the conduct of RDA. In that regard, all of the meeting proceedings are available online through the RDA forum, and you can click on the relevant
bits in it. Let's skip to the next slide. The only thing that's worth noting here is the rapid growth in this space: all of the items in the list in bold are new since the last plenary, which says that there's been a bit going on. So I thought I'd give you a brief update on where things are going with working group progress, since they're the ones that are perhaps furthest along the course.
The Data Type Registries group was really about agreeing that there would be registries enabling descriptions: what sort of data do you have, how do you describe it, how do you know if it's interoperable? You need to have some sort of type registry put in place so that machine-to-machine capability is enabled. This has been led by Larry Lannom at CNRI. They've got an approximate vocabulary agreed upon and early prototypes. For me this is almost top of the pops: something that's very practical, needed if you're actually going to do machine-to-machine data interoperability, and likely to be delivered.
The second one, metadata standards, started out as "well, we want to talk about metadata", and that was not really a working group, so what spun out of that was a metadata interest group. This particular area morphed into: if we're going to enable metadata to be used in different places, we need to know what metadata standards are floating around. So the need for a registry of metadata standards is clear. There are several around. The work has transpired to be based on the DCC metadata standards registry, because that's the most advanced in the world that has been identified by that group. I've heard a number of times in Australia people saying, "OK, we're going to use metadata; what's the metadata we apply? Here's the space we're in", and we know that there are gazillions of metadata standards. What this is looking for is the metadata standards that are commonly in use in the relevant space.
The next one is practical policy. This is about how you have, sitting in front of your data, a policy regime that is machine-enforced. This might be an access protocol, this might be the protocol that says "this is how you can engage with the data", this might be a policy enactment that looks at the licensing regime and then applies
appropriate combination policies to enable it to work. There are something like 13 different policies from around the traps being looked at. Reagan Moore is leading this, and he's a driven person, so something will come out of this that will deliver. They are looking for more policies, but the aim is to enable people who want to run a data store, with an appropriate policy in front of it that can be machine-enacted, to put in place policies that are actually going to stand the test of time.
The next one is persistent identifier types, which is looking at what types are being used for PIDs. It is not unrelated to the work of the data type registry, but this is more specifically looking at how you get the relevant types for PIDs, gathering that information and putting it together. That work is progressing. One of the things you'll notice about this particular set of working groups is that they have degrees of overlap and arguably not the world's best titles. It requires a degree of cleverness to work out what they're really doing, so it's probably necessary to have a look at their paragraph descriptions, which again are available on the RDA website.
The next one is perhaps the one where the computer scientists will be most comfortable, which is data foundations and terminology. It is looking at an abstract data organization model for interoperability. It approaches this from a bunch of different areas where there are already data organization models, and then looks at what can be generalized out of that. They've come up with a relatively complex model already, and I suspect the model will get even more complex, but sitting underneath there is going to be something that's relatively machine-implementable, that doesn't turn into a grand unified scheme of everything, but is actually able to be used by a number of the other areas.
Now, the first working group that is in fact based on a specific domain is based on linguistics, not that you'll be able to tell it from the title, but it is actually seeking data categories and codes for natural languages. One of the interesting things is that it's based on an ISO standard, but the ISO standard only takes you so far when you're looking at how you describe languages and language variations, how you work with languages that are a blend of different languages, or languages that are changing, et cetera. So this is a very practical, straightforward description, where there'll be a shared base that's agreed. If you skip back a slide, back to this, you'll see that agricultural data, marine data and [unclear] genomics data are emerging. So there are a variety of areas that are emerging, but if you look at
this slide, it says that the bulk of the work started early on was in the kind of underlying areas. My view is that RDA really needs the specifics to come forth relatively soon.
I then wanted to look at interest groups of ANDS and Australian relevance, because there are a lot, and yet we have some engagement but not an enormous amount. We've got work in some of these areas. I didn't mention that there is work in the legal interoperability area, where the Australian contribution is already significant, and it's not just ANDS contributions, as there are exemplars there. So let me run through them.
Brokering: this is about saying, if you've got data in one location and data in another location and you need to get access to that data, do you have to go off to the first location, do the negotiation, get the data, go to the second location, do the negotiation, get the data, bring that data together and then do something interesting? Or can you ask a question of a broker, which will then use machine engagements with those different bits to allow that to occur? Of course, in Australia we've been talking a lot about enabling data reuse, and that's not simply using data from a single location. Again, this is saying: here is the need, where is that data located, and what mechanisms are available to get at that data? So I think there's a real need for that, and we've not really been engaged in it.
Big data is of course big in terms of space, with lots of people talking about big data. Sometimes big data is closely associated with high-performance computing, because you need to have high-performance computing next to that data: the data isn't so much transferable as you need to put services over the data where it lies, and the data needs to live in something that's used to dealing with petabytes. So there may well be an opportunity to look at a group arising out of this that may set up a test bed in this context for working on big data issues, and it may well be the case that we could have an Australian host on one of the RDSI facilities, for example. There is no business model currently; part of the group will need to look at that. Skipping over to the right, you'll
see there's a data economics interest group that is emerging, which is really just asking the question: who pays? Who's going to pay for all of this infrastructure, who's going to pay for making data available, and how is that occurring? Australia has a model that has essentially been around upfront investment through government initiatives, with the various institutions putting in place mechanisms that enable all to access. That model isn't being used internationally quite so much, and funding is also sometimes arising out of discipline areas. How you in fact enable those things is the topic of that discussion. The brokering discussion is really around what you need to have in place. Of course, there are already answers to some of these questions within disciplines, and for me we will succeed in RDA if we learn from discipline approaches, but not if we learn from a single discipline's approach, because part of what we need to do in this space is integrate across those areas.
Trusted repositories: at the moment there is no requirement to put your data in any particular form of repository. You can put it anywhere. Sometimes it might sit on a publisher's site, sometimes it might sit on an institutional site, sometimes it might sit on an RDSI site, sometimes it might sit on a data store next to an HPC facility, and what we're trying to avoid is that the data sits on a memory stick. It could well be the case that funders say, if you generate data in your research project, you should make that available, and a possible threshold is that the repository you put it in is trusted. So what does it mean to have a trusted repository? What are the characteristics of a trusted repository? The Dutch have been thinking about this problem quite a lot and quite hard, and coming up with certification regimes around that space. Equally, the World Data System is looking at what it means for a data system to be certified as part of that space. So there are a number of approaches
floating around. I think this is going to become a hot issue, because I do think funders are going to say, "I want to know that the data is held in a place where there's essentially a trusted operator." If so, how will you do that? Is a sort of gold stamp, an elephant-stamp standard, needed for a repository, or are there descriptions of various methods of making the repository a trusted one? I think that will be relevant to us.
Capability: how do you know if you're ready? What are the things you should be thinking about? How far along the journey are you? This uses Liz Lyon's capability model, which she developed in the context of the DCC, and it's not unrelated to the ANDS data maturity model. What this one highlights is the fact that it's not simply technical data interchange mechanisms that are needed here; what's also needed are the social ingredients of this space. How do you know how you're going? How do you know where you're up to?
Now, ANDS has been putting a lot of emphasis on data publishing. There are lots of issues associated with that, because a lot of people are starting to say data publishing is important and mean different things by it. So what are the workflows that sit behind publishing your data? How do you know that your data is published, and what are the influences of that? What are the bibliometrics associated with that? What is the nature of data publication services? And who's going to pay the data publication costs, going over to that data economics one?
Data citation I won't spend much time on today, since data citation has been the subject of other discussions in this forum. Suffice to say that there's a lot of engagement in this space, and we think it is very important.
As I mentioned, legal interoperability is important. How do you know that you can use somebody else's data? How do you know whether you can combine the data? How do you know whether you can make the data available? And how do you do that in an international context most importantly, because we know that
research data does not respect boundaries, but the law does. How do we make sure we get that right? By having people from many jurisdictions involved in that discussion. We have Baden from Australia, Paul Uhlir from the US, and a third, whose name has briefly escaped me, from Spain: three people who have been significantly involved in that discussion.
Marine data is a great example of a group from an Australian perspective, in that marine data interoperability is relatively mature in the space. The IMOS engagement with their European and US counterparts is deep. They are planning to set up the Southern Ocean Observing System mechanism to exchange data, hosted in Australia. So there are a lot of reasons why this area is one where Australia would be very seriously engaged, not just IMOS but also the agencies who have an interest in marine data.
There was a recent paper published by Fran Berman and Vint Cerf, who is listed as one of the founders of the internet, asking the question: who pays for research data? Because we know that data costs. We have a model in Australia; we have a completely different model in the US, where there is really no such thing as infrastructure investment in research. It all kind of comes through the research investment, so when you want to put in place a new synchrotron, you seek an NSF grant to do so. That doesn't really fit the Australian model, and there are really significant implications of that. So data scientists are incredibly important in the US, because you have to justify yourself in terms of a science agency, whereas we pay at least as much attention to data management; data librarianship, data technologists, et cetera are all terms that we use in this context. So the models that sit behind this are going to need to work internationally, even though different economic models are going to be applied locally.
Metadata is clearly big, and it's more than directories; there is a wide discussion in that space. And finally, I wanted to draw your attention to preservation, because we think data preservation is very important. However, that group has really not yet got going to the extent that one might imagine. I think exactly how that group will work is still a little unclear, but I do think there would be value in having Australian engagement.
Moving on to how you get engaged. So you
can get engaged as an individual, and you can get engaged as an organization. There's an organizational advisory board which will govern the organizational assembly. There are a lot of organizations that are thinking of joining. We think the Australian Antarctic Data Centre is a really good target in that list; Intersect would be good too, et cetera. So there are lots of ways in which those discussions can occur. What this really means is that it provides a small amount of funding, but the opportunity is that it gives institutions that see this as important a way of getting early access to the outputs of RDA, and equally of influencing the thinking of RDA as to what things need to be taken up. It also provides real future-proofing: if you're going to adopt a particular approach, how does that sit internationally? Another relationship is organizational affiliation, where there are a bunch of organizations which don't really want to be a member of RDA, and it doesn't really make sense for RDA to be a member of them, but nevertheless a relationship exists in that space. So we will be creating a legal entity which will enable organizations to join and have something to join. By the next plenary that will be in place.
So the next plenary, here is where it is: it's in Dublin, and it's hosted by Australia. Australia and Ireland are cooperating over [unclear]. Plenary four will be in the Netherlands, probably Amsterdam, and plenary five or six is likely to be back [unclear]. So holding a meeting twice a year, September and March, is roughly the dates [unclear]. So back
to how it's all coming together. There's the RDA Colloquium, which is the people who give us money, sitting at the top with no actual connection to the rest other than money flows. Then there's the RDA Council. I'm involved in that, along with six other people from around the globe: we have someone from each of England, Germany and France, two from the US, one from Botswana and one from Australia. The Technical Advisory Board is looking at ensuring we're on the right track technically. It's not saying what we will do, because groups will say what we do; it's simply there to enable us to achieve maximum impact in that space. Andrew Treloar is involved in this, and we are recruiting for an expanded Technical Advisory Board. The Secretary General is a person who runs the Secretariat, works with Council, works with the Colloquium, is a doer, is a community activist, and doesn't yet exist, so we're recruiting for that position. We hope to have that position in place by Christmas. And then we've got an organizational advisory board and assembly, which is doing the work that I just talked about. But the heart of the work is done by the working groups and the interest groups: people from the membership participating in all of these different discussion groups, saying "I think we should do this, and this is why" or "I'm not quite so sure about that, because of this". So the picture's not right, because there's a little bit of structure and a big bit of working groups and RDA membership there; that's the key to the space.
So those are the people involved, this is where it will be next, and that's the end. I'd like to say thanks to everybody for participating. It's been an important issue, and I'm glad you took the time.