Research Data Alliance 2nd Plenary - reports - National and International Trends and Developments - 1st October 2013
Video in TIB AV-Portal
Formal Metadata
Title | Research Data Alliance 2nd Plenary - reports - National and International Trends and Developments - 1st October 2013
Author |
License | CC Attribution 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers |
Publisher |
Release Date | 2013
Language | English
Content Metadata
Subject Area |
Abstract | October 1st 2013 - Ross Wilkinson reports on the second plenary meeting of the Research Data Alliance that was recently held in Washington.
00:00
Thank you. What I want to do is really talk about the fact that this thing called the Research Data Alliance is in existence and is becoming quite important. Just six months ago the
00:14
Research Data Alliance was launched in Gothenburg in Sweden, and we started something about sharing data without
00:25
barriers around the world: no mean feat, and yet incredibly important and completely in alignment with what we're doing here in Australia. So Australia was one of the three founding members of RDA, along with the European Union and the US through the National Science Foundation, so it had some pretty heavy hitters behind it, and people turned up to the first meeting with a degree of enthusiasm, looking forward to what might be. From Australia, Stefanie Kethers, Andrew Treloar and myself have been involved in helping form RDA, but really the thing about RDA is what it's going to do, and I'll get to that in a moment. What it's intended to do is to enable data to be shared more easily and rapidly, and the reason for that, of course, is that the data environment we're in is changing really quite quickly. Australia's made substantial investments in research data through investments like IMOS and TERN, but also through more generic investments such as RDSI and ANDS, and the investments that many of the organizations that are sitting around this table have made. CSIRO's made substantial investments, but many of the universities have as well, and right around the traps we're seeing that the need to treat data differently has occurred. So what I want to do really today is to give you an update on the second plenary. This was an interesting thing. I
02:14
have used Fran Berman's slides as the basis for this presentation. Fran is one of the co-councillors of the Research Data Alliance, and I spend many evenings talking about RDA with Fran. But the notion of three days of peace, love and data is reminiscent of the first meeting and the notion of the Woodstock of research data occurring, and it was very interesting. So we had 368 participants from 22 countries and all sectors: a huge amount of interest in this. We had quite a lot of significant country interest in participating in RDA at a funder level, with interest from Germany, South Africa, France, England, China, South Korea, New Zealand and Taiwan, and the list does in fact go on. One of the interesting things about RDA is that it clearly has momentum, and so DataCite convened a data citation summit over the following two days, and those of us with energy went on to attend that. So
03:37
the nature of the meeting was essentially a working meeting, but it started out with some keynotes. Tom Kalil from the Office of Science and Technology Policy at the White House spoke. The significance of that is that it's saying that right at the very top of government throughout the world we have people interested in these issues. So the first plenary was opened by the Deputy Commissioner of the European Commission, whose name briefly escapes me. We then had Tom Kalil speak, really backing up the comments made by Neelie Kroes, which is the name that was escaping me briefly and perhaps you're typing in. So we had the US speaking after it made major announcements about publicly funded data being publicly available, whether it's from the research sphere or from the government sphere. We then had, as per usual, an inspiring talk by John Wilbanks, who is known for his work around the Creative Commons, really driving the agenda that says that it's in everybody's interest, including providers, to make that data available, and then Carole Palmer providing an insight into some of the issues around making that data available from a library's perspective. We had Clare McLaughlin, listed as from the Australian Embassy but most recently responsible for getting that interest through government. We had a bunch of people talking about why they saw it was of interest to work in participation and cooperation with RDA, from CODATA, ESIP, W3C, DataCite, et cetera. So what this slide says is that there's a significant degree of interest at the funder level in RDA. Now this one says that there's a lot
05:53
of people involved. 50 countries have participation in RDA: academics, research sector, private sector, public, and then there's a few people who are not known in that space. But the key is really that
06:13
RDA is building momentum. So RDA is intended to enable work to be done and delivered rapidly. I guess we use the metaphor of building bridges between data locations, enabling people and data to connect through the bridges. Now sometimes those bridges are going to be essentially technical in nature, and sometimes they're going to be more social in nature: you know, how do you in fact encourage those sorts of things? But one of the things that is the hallmark of RDA is that the work needs to happen fast. RDA is not set up to be a 50-year initiative. We have the notion of a ten-year initiative, and what we're trying to achieve is to get consensus, and data to be shared amongst those groups, quickly. What we want to try and do then is to put together, in particular, working groups who will deliver results within 18 months. So what we're trying to do is follow, roughly speaking, the IETF model of groups formed as a result of people coming together and saying we have a problem that needs resolving, working on that problem, coming up with a resolution, and then delivering. So there were lots of birds-of-a-feather sessions established. There are groups starting to look at how they are going against the delivery of promises that they made six months ago, because, true, that's one third of the way through the life of the group using the model that we're talking about. There was a whole lot of stuff that came out of that. One is to ensure that the IP that is developed in interest groups and working groups is protected so that it is available for all to use, and also how do we in fact enable the RDA activities to be delivered. But the pictures on the right are the heart of what RDA is: it's not about people sitting in a big
08:27
auditorium listening to people talk; it's about people doing. And so these were the
08:37
360 people turning up at the meeting, doing. So RDA is intended to be a doing group. So what is it doing? So these
08:50
are the things that emerged during the meeting, prior to the meeting and at the last meeting: areas where there was a decision made by people who practically came along and said, yes, I want to put some effort into thinking about or working on this, and coming up with what might happen. What I want to do now, before I go into the detail of these, is distinguish between the various bits.
The working group is intended to be a short, sharp piece of effort that tackles a problem and comes up with a solution that is adopted. Adoption is the key. So what we wanted to see was work that was done and delivered and then used. So from an Australian perspective we needed to know whether the work that was being done in this space was delivering, and we'll come back to that in a little while.
The interest groups were where perhaps there were a number of pieces of work that could be done, but there was an interest in the longer term in these areas and there was work needed to be done in this space. So there was, if you like, an area for conversation, with the aim to take those pieces of conversation and spin off working groups that will deliver. Now, some of the activities might make sense as they are. Let me give you a good example with the community capability model, which is really describing whether areas are ready to engage in that space, where in some sense the output is less a specific example of data being shared and more an example of ideas being shared, and the mechanisms that would enable those things to occur.
But I think you can see from this that there are plenty of groups gathering together to discuss related issues. There were a bunch of people who'd been through the hoop that is establishing a working group, which requires a well-described statement of work and, importantly, a notion of what will be delivered, and then, even more importantly, who will take that up. There is a much wider group of interest groups around the spaces, and I'll talk about that a little bit. And then, as indicated, there was a large amount of effort put into the DataCite citation activities in particular, and then a data citation meeting right at the end of that, the DataCite one. Absolutely, all it
11:28
requires is going to the RDA website, the Research Data Alliance site, and then diving in there and expressing interest, and there are chairs of all of those different groups who will enable you to participate. The intention of RDA is that the plenary meetings are a small part of the work of RDA; the bulk of the work of RDA is occurring through the discussion groups. It is, I think, important that Australia starts to get more involved in this. Australia's got heaps to offer, but more importantly Australia's got heaps to benefit, because Australia has been so actively involved in doing. One of the things this gives is an opportunity to implement things that are internationally agreed, and secondly it gives us a chance to test the proposition that the approach being used at a particular institution will stand up in the international environment. Moreover, these themes give you an indication of where the interest is around the world. One other thing I really would say is that it is not only an opportunity to get involved in the current things, but if there's something that's important around research data that isn't well represented currently, you should create a group. One of the things about that is that ANDS will work with you along those lines to support your involvement and engagement in that. I've indicated that Andrew, Stefanie and I have been involved in the organization of RDA, but we're equally interested in supporting the work and the conduct of RDA. In that regard, it is the case that all of the meeting proceedings are available online through the RDA forum, and you can click on the relevant
13:26
bits in it. Let's skip to the next slide. The only thing that's worth noting here is the rapid growth in this space: you see all of the list in bold is new since the last plenary, which says that there's a bit been going on. So I thought I'd give you
13:47
a brief update on where things are going with the working group progress, since they're the ones that are perhaps furthest along.
The data type registries group was really agreeing that there would be registries enabling descriptions. So what sort of data do you have, how do you describe it, how do you know if it's interoperable? You need to have some sort of type registry that is put in place so that machine-to-machine capability is applied. This has been led by Larry Lannom at CNRI. They've got an approximate vocabulary agreed upon and early prototypes. So for me this is almost Top of the Pops: something that's very practical, needed if you're actually going to do machine-to-machine data interoperability, and likely to be delivered.
The second one, metadata standards, started out being, well, we want to talk about metadata, and that was not really a working group, and so what spun out of that was a metadata interest group. And this particular area morphed into: if we're going to enable metadata to be used in the different places, we need to know what are the metadata standards that are floating around. So the need for a registry of metadata standards is clear. There are several around. The work has transpired to be based on the DCC metadata standards registry, because that's the most advanced in the world that has been identified by that group. So I've heard in Australia a number of times people saying, okay, well, we're going to use metadata; what's the metadata we apply? Here's the space we're in, and we know that there are gazillions of metadata standards. What this is looking for is some metadata standards that are commonly in use in the relevant space.
The next one is practical policy. So this is: how do you have, sitting in front of your data, a policy regime that is machine enforced? So this might be an access protocol; this might be the protocol that says, okay, this is how you can engage with the data; this might be a policy enactment that looks at the licensing regime and then applies an appropriate combination of policies to enable it to work. So there is something like 13 different policies from around the traps being looked at. Reagan Moore is leading this and he's a driven person, so something will come out of this that will deliver. They are looking for more policies, but the aim of this is to enable people who are wanting to run a data store, with appropriate policy in front that can be machine enacted, to put in place policies that are actually going to stand the test of time.
The next one is persistent identifier types, which is looking at, well, what are the types that are being used for PIDs? It is not unrelated to the work of the data type registry, but this is more specifically looking at how you get the relevant types for PIDs, gathering that information and putting that together. That is work that's progressing.
One of the things you'll notice about this particular set of working groups is that they have degrees of overlap and arguably not the world's best titles. It requires a degree of cleverness to work out what they're really doing, and so it's probably necessary to have a look at their paragraph descriptions, which are again available on the RDA website.
The next one is perhaps the one where the computer scientists will be most comfortable, which is data foundations and terminology, which is looking at an abstract data organization model for interoperability. So this is looking at it again from a bunch of different areas where there are already data organization models, and then looking at, well, what can be generalized out of that. And so they've come up with a relatively complex model already. I suspect the model will get even more complex, but sitting underneath there is going to be something that's relatively machine implementable, that doesn't turn into a grand unified scheme of everything, but something that's actually able to be used by a number of the other areas.
Now, the first working group that is in fact based on a specific domain is based on linguistics, not that you'll be able to tell it from the title, but it is actually seeking data categories and codes for natural languages. One of the interesting things is that it is based on an ISO standard, but the ISO standard only takes you so far when you're looking at how you describe languages and language variations, and how you work with languages that are a mixing of different languages, or languages that are changing, et cetera. So this is a very practical, straightforward description where there'll be a shared namespace that's agreed. So if you skip
19:26
back a slide, back to this, you'll see that agricultural data, marine data and toxicogenomics data are emerging, so there are a variety of areas that are emerging. But if you look at
19:43
this slide, it says that the bulk of the work started early on just working in the kind of underlying areas. Now my view is that RDA really needs the specifics to come forth relatively soon. I then wanted to look at interest groups of ANDS and Australian relevance, because there's a lot, and yet we have some engagement but not an enormous amount. We've got work in some of these areas. I didn't mention that there is work in the legal interoperability area, where the Australian contribution is already significant. It's not just ANDS contributions, as I said; there are exemplars there. So if I run through them:
Brokering. So this is about saying, if you've got data in one location and data in another location and you need to get access to that data, do you have to go off to the first location, do the negotiation, get the data, go to the second location, do the negotiation, get the data, bring that data together and then do something interesting? Or can you ask a question of a broker, which will then use machine engagements with those different bits to allow that to occur? Now of course in Australia we've been talking a lot about enabling data reuse, and that's not just simply using data from a single location. Again, this is saying, well, here is the need: where is that data located, and what are the mechanisms available to get at that data? So I think there's a real need for that, and we've not really been engaged in that.
Big data is of course big in terms of space, and lots of people are talking about big data. Sometimes big data is associated with high-performance computing, because you need to have high-performance computing next to that data, and the data isn't so much transferable as you need to put services over the data where it lies, and the data needs to live in something that's used to dealing with petabytes. And so there may well be an opportunity to look at a group that's arising out of this that may set up a test bed in this context for doing big data issues, and it may well be the case that we could have an Australian host on one of the RDSI facilities, for example.
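The brokering idea described above (ask one question of a broker rather than negotiating with each data location in turn) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Repository` and `Broker` classes, the allow-list standing in for a per-location negotiation, and the site and dataset names are hypothetical and do not come from the talk or from any RDA output.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Repository:
    """A hypothetical data location with its own access rule."""

    name: str
    records: dict[str, list[float]]
    allowed_users: set[str] = field(default_factory=set)

    def fetch(self, user: str, dataset: str) -> list[float]:
        # Each location does its own "negotiation"; here it is a simple allow-list.
        if user not in self.allowed_users:
            raise PermissionError(f"{user} may not read from {self.name}")
        return self.records.get(dataset, [])


class Broker:
    """Hypothetical broker: one question in, data from many locations out."""

    def __init__(self, repositories: list[Repository]) -> None:
        self.repositories = repositories

    def query(self, user: str, dataset: str) -> dict[str, list[float]]:
        results: dict[str, list[float]] = {}
        for repo in self.repositories:
            try:
                values = repo.fetch(user, dataset)
            except PermissionError:
                continue  # skip locations this user cannot access
            if values:
                results[repo.name] = values
        return results


# Two made-up locations holding the same kind of data.
site_a = Repository("site_a", {"sea_temp": [14.2, 14.5]}, {"alice"})
site_b = Repository("site_b", {"sea_temp": [13.9]}, {"alice", "bob"})
broker = Broker([site_a, site_b])

combined = broker.query("alice", "sea_temp")  # data from both locations
bob_view = broker.query("bob", "sea_temp")    # only the location bob may read
```

The point of the sketch is only that the caller asks once and the broker handles each location's access rule; a real broker would also have to deal with differing formats, licences and transport, which are not modelled here.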
22:16
There is no business model currently; part of the group will need to look at that. Skipping over to the right, you'll
22:24
see there's a data economics interest group that is emerging, which is really just asking a question: who pays? So who's going to pay for all of this infrastructure, who's going to pay for making data available, how is that occurring? Australia has a model that has essentially been around upfront investment through the government's initiatives, with the various institutions putting in place mechanisms that enable all to access. That model isn't being used internationally quite so much, and is also sometimes arising out of discipline areas. How do you in fact enable those things is the topic of that discussion. The brokering discussion is really around, well, what do you need to have in place? Of course there are already answers to some of these questions within disciplines, and for me we will succeed in RDA if we learn from discipline approaches, but not if we learn from a single discipline approach, because part of what we need to do in this space is integrate across those areas.
So, trusted repositories. At the moment there is no requirement to put your data in any particular form of repository; you can put it anywhere. So sometimes it might sit on a publisher's site, sometimes it might sit on an institutional site, sometimes it might sit on an RDSI site, sometimes it might sit on a data store next to an HPC facility, and what we're trying to do is avoid the data sitting on a memory stick. It could well be the case that funders say, if you generate data in your research project you should make that available, and a possible threshold is that the repository you put it in is trusted. So what does it mean to have a trusted repository? What are the characteristics of a trusted repository? Now the Dutch have been thinking about this problem quite a lot and quite hard, and coming up with certification regimes around that space. Equally, the World Data System is looking at, you know, what does it mean for a data system to be certified as part of that space? So there are a number of approaches floating around. I think this is going to become a hot issue, because I do think funders are going to say, I want to know that the data is held in a place where there's essentially a trusted operator. If so, how will you do that? Is a sort of gold stamp and relevant standards needed for a repository, or are there descriptions of various methods of making the repository a trusted one? I think that will be relevant to us.
Capability: how do you know if you're ready? What are the things you should be thinking about? How far along the journey are you? This uses Liz Lyon's capability model that she developed in the context of the DCC, and it's not unrelated to the ANDS data maturity model. What this one highlights is the fact that it's not simply about technical data interchange mechanisms that are needed here; what's also needed are the social ingredients of this space. So how do you know how you're going? How do you know where you're up to?
Now ANDS has been putting a lot of emphasis on data publishing. There are lots of issues associated with that, because there's a lot of people starting to say data publishing is important and meaning different things by that. So what are the workflows that sit behind publishing your data? How do you know that your data is published, and the influence of that: you know, what are the bibliometrics associated with that? What is the nature of data publication services, and who's going to pay the data publication costs, going over to that data economics one?
Data citation I won't spend much time on today, since data citation has been the subject of other discussions in this forum. Suffice to say that there's a lot of engagement in this space and we think it is very important.
As I mentioned, legal interoperability is important. How do you know that you can use somebody else's data? How do you know whether you can combine the data? How do you know whether you can make the data available? And how do you do that in an international context most importantly, because we know that research data does not respect boundaries but the law does? How do we make sure we get that right? And so we are having people from many jurisdictions involved in that discussion. So we have Baden from Australia, Paul Uhlir from the US, and a third person, whose name has briefly escaped me, from Spain, who have been three people significantly involved in that discussion.
Marine data is a great example of a group from an Australian perspective, in that marine data interoperability is relatively mature in the space. The IMOS engagement with their European and US counterparts is deep. They are planning to set up the Southern Ocean Observing System mechanism to exchange data, and it is being hosted in Australia. So there's a lot of reasons why this area is one where Australia would be very seriously engaged, not just IMOS but also the agencies who have an interest in marine data.
There was a recent paper published by Fran and Vint Cerf, who is listed as one of the founders of the internet, asking the question recently, who pays for research data? Because we know that data costs. We have a model in Australia; we have a completely different model in the US, where there is no such thing really as infrastructure investment in research. It all kind of comes through the research investment, so when you want to put in place a new synchrotron you seek an NSF grant to do so. Now that doesn't really fit the Australian model, and there are really significant implications of that. So data scientists are incredibly important in the US, because you have to justify yourself in terms of a science agency, whereas we pay at least as much attention to data management; data librarianship, data technologists, et cetera, are all terms that we use in this context. So the models that sit behind this are going to need to work internationally, even though different economic models are going to be applied locally.
Metadata is clearly big, and it's more than directories; there's a wide discussion in that space.
And finally I wanted to draw your attention to preservation, because we think that preservation is very important. However, that group has really not yet got going to the extent that one might imagine. I think exactly how that group will work is still a little unclear, but I do think that there would be value in having Australian engagement.
Moving on to how you get engaged. So you
29:58
can get engaged as an individual; you can get engaged as an organization. There's an organizational advisory board which will govern the organizational assembly. There's a lot of organizations that are thinking of joining. We think the Australian Antarctic Data Centre is a really good target in that list; Intersect would be good too, et cetera. So there are lots of ways in which those discussions can occur. What this really means is that it provides a small amount of funding, but the opportunity is that it enables institutions that see this as important a way of getting kind of early access to the outputs of RDA, and equally of influencing the thinking of RDA as to what are the things that we need to have taken up. It also provides a way of future-proofing: if you're going to adopt a particular approach, how does that sit internationally? Another relationship is organizational affiliation, where there are a bunch of organizations which don't really want to be a member of RDA, and it doesn't really make sense for RDA to be a member of them, but there is nevertheless a relationship that exists in that space. So we will be creating a legal entity which would enable organizations to join and have something to join. By the next plenary that will be up and in place. So next
31:26
plenary: here is where it is. It's in Dublin and it's hosted by Australia; Australia and Ireland are cooperating on that. Plenary four will be in the Netherlands, probably Amsterdam, and plenary five or six is likely to be back. So we're holding them twice a year; September and March are roughly the dates to hold them in. So back
31:54
to how it's all coming together. There's the RDA Colloquium, which is the people who give us money, sitting on the top with no actual connection to the rest other than money flows. So the RDA Council: I'm involved in that along with six other people from around the globe. We have someone from England, Germany, France, two from the US, one from Botswana and one from Australia. The Technical Advisory Board is looking at ensuring we're on the right track technically. It's not saying what we will do, because groups will say what we do; it's simply there to enable us to achieve maximum impact in that space, and Andrew Treloar is involved in this. We're recruiting for an expanded Technical Advisory Board. The Secretary General is a person who runs the Secretariat, works with Council, works with the Colloquium, is a doer, is a community activist, and doesn't yet exist, so we're recruiting for that position soon. We hope to have that position put in place by Christmas. And then we've got an organizational advisory board and assembly, which is doing that work that I just talked about. But the heart of the work is done by the working groups and the interest groups: people participating from the membership in all of these different discussion groups, saying, I think we should do this and this is why, or, not quite so sure about that because of this. So the picture's not right, because there's a little bit of structure and the big bit is the working groups and RDA membership there; that's the key to the space. So those are the
33:43
people involved. This is where it will be next,
33:47
and that's the end. I'd like to say thanks to
33:50
everybody for participating. I think it's
33:53
been an important issue, and I'm
33:55
glad you took the time.
