Publishing Data in the Context of ICSU World Data System
Formal metadata
Number of parts | 24
License | CC Attribution 3.0 Germany: You may use and modify the work or its content for any legal purpose, and reproduce, distribute and make it publicly available in unchanged or modified form, provided you credit the author/rights holder in the manner they specify.
Identifiers | 10.5446/15290 (DOI)
Production year | 2014
Production place | Nancy, France
Transcript: English (automatically generated)
00:00
So, good morning, and thank you, Adam, for the introduction, and thank you, Herbert, for the invitation to give this presentation at the DataCite annual meeting. As I was introduced, I'm Mustapha Mokrane, the Executive Director of the International Program Office of the World Data System, which has been based in Tokyo since May 2012. Before, we
00:28
did not have a permanent secretariat for the former system, the World Data Centers, so this is really a big change for the system. I would like to first introduce the World Data System as an interdisciplinary body of ICSU, the
00:44
International Council for Science. You heard yesterday from Simon Hodson that CODATA is also an interdisciplinary body of ICSU, so there are two bodies dealing with data issues operating under ICSU, and I have highlighted
01:02
here on this slide the ICSU long-term vision when it comes to data-related issues. The long-term vision of ICSU is of a world where excellence in science is efficiently translated into policymaking and socio-economic development, and in such a world, universal and equitable
01:24
access to scientific data and information is a reality. So you can understand very easily that both CODATA and the World Data System have a direct mandate under the ICSU umbrella to operate in this area of universal and equitable access to scientific data and information, linked of course to
01:42
the research programs sponsored by ICSU, because ICSU is mostly concerned with sponsoring international research collaboration, and also providing a link with science policy. The World Data System is governed by a single body, so we keep bureaucracy to a minimum.
02:02
We have a Scientific Committee appointed by the ICSU Executive Board and, as you can see, it is a geographically very diverse group of leading data experts and also science experts, so there is a good balance between research and data competencies in this group. Some of its members, like
02:24
Françoise Genova, are sitting in this room, so I'd like to highlight their presence here, and it is chaired by Bernard Minster. Beyond the governing body, WDS is a membership organization, and we are trying to
02:44
define WDS members as scientific data services. What we mean by scientific data services are basically services that assist organizations in the capture, storage, curation, long-term preservation, discovery, access, retrieval,
03:05
aggregation, analysis, and visualization of scientific data, as well as other associated services like legal frameworks, and of course all of this in support of disciplinary and multidisciplinary scientific research, because this is clearly the focus for ICSU.
03:32
So, what about the concrete membership of the World Data System? It builds on the historical legacy of the former World Data Centers, the ICSU
03:43
World Data Centers, and also the Federation of Astronomical and Geophysical Data Analysis Services, but these are really rebranded services. We have four categories of membership: regular members, network members, partner members and associate members. Regular
04:00
and network members are the data holders, or the data stewards, as we are currently trying to define them. We have currently certified 57 regular members, basically disciplinary or multidisciplinary data repositories, and they are entrusted with this function of
04:22
being data stewards, and they also provide data analysis services. We also have network members, and the network members are also certified according to a catalogue of criteria to ensure their trustworthiness, etc. These are
04:41
mostly umbrella bodies, groups of regular members, and they provide a coordination level for WDS regular members. Some examples: the NASA EOSDIS project, which brings together all the NASA Distributed Active
05:01
Archive Centers, or the IODE, the International Oceanographic Data and Information Exchange program, which brings together National Oceanographic Data Centers, so this is more a disciplinary network. We also have the IVOA, the International Virtual Observatory Alliance, which brings together at least the
05:22
astronomical data centers in an interoperability framework, but Françoise Genova is more apt to talk about IVOA than I am. We also have partner members and associate members, and I wanted to highlight that DataCite is a partner member of WDS. I also checked that I can announce today, having checked
05:44
with Adam and Jan, that we are ready to sign a memorandum of understanding between the World Data System and DataCite to provide a framework for collaboration, which is really exciting news. I think it is something we are looking forward to in terms of collaboration. So, this is for
06:04
the membership, the geographical distribution of our membership. I'm always hesitant to use this map to illustrate the geographical distribution because it's a bit misleading. The blue numbers represent the regular members, and these are easy to place on a world map because they have a
06:25
national point, but the network members cover an overlapping number of countries, so it is very difficult in terms of visualization to give you an idea of the coverage here. In summary, we are a
06:43
global organization and we really have international and multidisciplinary coverage. We recently published our strategic plan for the period 2014-2018, and I just wanted to give you an idea of the strategic targets so that you can understand what we're trying to achieve,
07:02
what the World Data System is trying to achieve in this period. The first strategic target is to make trusted data services, the ones I defined earlier, an integral part of international collaborative scientific research, so we are trying to build stronger bridges between the research
07:20
programs, in particular the ones sponsored by ICSU, and the trusted data services we are promoting. The second strategic target is to nurture active disciplinary and multidisciplinary data communities. Here we have identified that communities in some domains have matured enough, have already adopted interoperability agreements and arrangements, and have agreed
07:45
on providing some services to their communities. In other domains there is still work to be done, and here we are also trying to encourage those communities to get together and mature enough to provide these services. The third strategic target is to improve the funding environment for scientific
08:02
data services and increase their sustainability, and for this we are trying to work with research funders. We have strong collaborations with, for example, the Belmont Forum, which is a group of funding agencies for global change research. The fourth strategic target is to improve trust in
08:23
and quality of open scientific data services. This is the one I will concentrate on in the rest of my talk, because I will try to give you some context about the data publication activities we are promoting in WDS. The
08:41
final strategic target is to position the ICSU World Data System as a premium global multidisciplinary network for quality-assessed, and I really insist on this point here, quality-assessed scientific research data. So, jumping into the publishing data aspect, because the idea of this
09:03
presentation is to put the data publishing activities in the context of the WDS concept, and building on the strategic target I just mentioned, improving trust in and quality of open scientific data services: we believe that by facilitating access to, and use and reuse of, data sets through the
09:25
data publication concept, we are improving trust in, and also the quality of, open scientific data, or at least promoting that. I will not really go into the details of data publication as a concept. I think we had very good
09:41
presentations yesterday introducing even where it started, and some of the former World Data Centers were key in maturing the thinking around the concept of data publication, as well as being founding members of DataCite. So WDS, building on that experience, established a working
10:04
group, which was later endorsed by the Research Data Alliance as an interest group, so we are operating this jointly with RDA. It is also a hot topic in the Research Data Alliance community; at the last plenary it was a very well
10:20
attended breakout session as well. Now, building bridges with the previous presentation on the long tail, I will try to illustrate how the data publication concept is, we hope, going to improve the quality and the availability of scientific research data. So if we
10:45
look here at the distribution of the scientific data sets out there in relation to their fitness for use, you can see here in yellow a portion of these data which are
11:03
managed and published data, mostly data coming from large-scale monitoring initiatives and projects, computed data, but also data deposited in disciplinary and multidisciplinary data centers such as members of the World Data System. But you can see how it fits, in proportion, into
11:25
this picture compared to the other parts. In green we have what we call the somewhat managed and open access data. These are data sets stored mostly in institutional repositories, so they are open, they are discoverable, but
11:43
they have varying levels of quality when it comes to the metadata and also in terms of long-term preservation arrangements, etc. And then you have the long tail, the gray part, which is what we consider unmanaged and
12:00
non-published data, from individual projects, scientists and labs, sitting on USB sticks, and you know that story. So how can data publication play a role here? We see it really as a means to move the data sets available here in
12:23
the long tail towards the yellow and the green parts, making them more discoverable, more usable and reusable. In the context of this data publication interest group, at least four areas of focus have been
12:40
identified to try and bring together this concept of data publication, addressing data publication workflows and also services. In fact, I have this slide where I list the various working groups that have been established under this publishing data
13:02
interest group. The workflows working group will try to deliver a generic workflow model, or at least address generic workflow models, for data publication. There is a working group on bibliometrics co-chaired by
13:22
Sarah, who is also sitting in this room, and Kerstin Lehnert, and it is looking at approaches and solutions to allow analysis of content and proper citations in the context of data publication. There is also a working
13:40
group slash interest group, which is a nomenclature issue between WDS and RDA, that is addressing cost recovery models and looking at business models to see how we can accommodate this data publication approach and ensure the sustainability of data repositories in terms of funding. And finally there is a
14:05
services working group, and I will give a little bit more context on that one. This one is really looking more closely at providing cross-referencing services, so I'll give you a few more
14:24
details on this one in the five minutes left. In summary, the whole initiative around data publication is really about creating better links between data repositories, data service providers and scholarly
14:41
publishing, for example linking the editorial workflows and also linking the services between the two. I also have to mention that this interest group has a very wide representation of the stakeholders
15:00
involved in data publication: research facilities are represented, as are data repositories, data service providers, universities, libraries and industry stakeholders. So I think what is starting to take shape is some sort of consortium around this idea of data publication.
15:23
It is a coalition of the willing, and hopefully also of early adopters and testers of the first pilots for implementation coming out of this working group. As I said, I wanted to concentrate a little bit on the data publication services working group. This working group really came to
15:46
the conclusion that there is no common framework for cross-referencing data sets and articles, and it is proposing to create a service to connect data sets and articles. As you can see on the left-hand side of the figure below, this
16:04
is the current situation: a great number of bilateral agreements between the stakeholders, the publishers, the data centers and the bibliometric service providers, and this is clearly not very efficient when we try to scale up
16:23
such a system. So what this working group is proposing and exploring is the possibility of building a cross-referencing service to serve as a one-stop shop, to use a common term, which all
16:43
stakeholders, publishers, data centers, bibliometric services, and possibly functionalities yet to be created to serve research, could use. So, in general, what are the current risks with this data publication concept and
17:04
where are we seeing it going? I think it brings some questions, and in fact I don't necessarily have answers for all of them, but I thought it was interesting to bring them to the table here for discussion. How do data publication services fit in the global data infrastructures? I will
17:23
try to give the beginning of an answer, maybe, in my last slides. Another question is how scholarly publishing itself will evolve over the next decades, because this is, I think, a moving target, so while developing the concept of data publication we have to take this into consideration. And the
17:46
impact of data publications: what is the exact impact of data publications? We heard this question many times yesterday, and I think there are some studies already in that field, but I think we have to monitor that aspect
18:00
as well. Whatever these working groups come up with will have implications for stakeholders in terms of organizational and technical requirements, so how these stakeholders, as I mentioned, will adopt and endorse these requirements is a big question mark, although we are hoping that
18:21
this consortium concept will enable that. And finally, what are the costs of that, of course, and is it sustainable? Unfortunately I don't have an answer for all of these, and I'm here to discuss them with you. But I also wanted to give you some context, or rather to put the data publication
18:45
work we are doing in the context of other WDS activities. One of these activities is the concept of a knowledge network, a WDS knowledge network being developed by a WDS working group. It is, as I have on the slide here, a web-based interlinked repository of relationships between the
19:05
actors and entities that make up the research landscape: the institutions, data services, projects, and also topics of research, funding bodies, etc. We are aiming to develop such a resource, but in order to be in a
19:26
position to develop this resource we recognize, and also support the vision, that we need an interlinked foundational global research infrastructure. This infrastructure needs to be sustainable, needs to be
19:41
scalable, and also, because there is no other choice, we think, needs to be distributed amongst leading organizations, data centers, data service providers, initiatives, etc., and everyone will take responsibility for parts of that distributed global research infrastructure. In
20:02
the context of this vision or concept, we really hope to draw on the example of the linked open data approach and to reuse many of the existing services, components, standards, etc. provided by existing organizations. So if we just look at one representation of this knowledge
20:23
network as it is currently being developed by this working group, you can see here in the red bubbles the elements of this knowledge network: the people, the projects, institutions, research data outputs, trusted digital repositories,
20:41
citations. All of these are interlinked, and the relationships between them are mined from the metadata available within WDS member data repositories. But, most interestingly, we do not want to limit it to mining
21:03
metadata, but also to include additional, non-traditional resources such as social media, citizen science projects, etc. In this spirit of reusing as much as possible and leveraging
21:22
the existing infrastructure, we have identified on this figure what exists, what we think is possibly coming to fruition, and what needs to be developed further. You can understand that for citations there is already an organization here, DataCite, that is providing such a service. We could build
21:44
on other organizations like ORCID, which assigns persistent identifiers, for providing this. But here you can also see the link with the data publication work, with the publishing services working group trying to build together this
22:00
cross-referencing system between articles and data sets. It fits completely into this concept of the knowledge network, and we will be able to build on these services. I think I'm going over time, so I'll have to speed up. The final point I would like to stress, because this is also an area
22:20
of possible synergies with DataCite, is this bubble here, the trusted digital repositories, and the need also to build a registry of trusted digital repositories. WDS provides, and it is not the only institution providing this, a framework for the certification of trusted digital
22:41
repositories and services. It is considered a lightweight certification framework, together with the Data Seal of Approval, and we are even trying to consolidate the two and provide a streamlined lightweight certification, but there are others going as far as the highly demanding ISO standard. So what
23:01
we envision here, to help build that knowledge network, is a global registry of trusted digital repositories, and we are really looking forward to the current developments of, for example, the re3data and Databib merger under the DataCite umbrella. We see a real synergy here, and WDS could, for example, endeavor to maintain a subset of
23:28
that registry to define trusted digital repositories, possibly together with other organizations, and extending those certification properties to
23:40
other schemas. Finally, I will skip this one, the benefits of the global registry; we can discuss this offline. I would just like to thank the contributors to this presentation: Wim Hugo from SAEON/NRF on the knowledge network activity, Michael Diepenbroek on data publication, and most importantly
24:02
all the co-chairs and contributors of the various working groups and interest groups. I have listed the co-chairs here. These are people contributing voluntarily to these groups, and I really would like to extend my thanks for their involvement in this work. And, as Simon mentioned, we have an
24:25
exciting conference coming up in New Delhi, SciDataCon. We hope that many of you have submitted papers or are planning to attend. We really want this conference to bridge the gap between data practitioners and
24:42
the research programs. So this is the final slide. Thank you.
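
Editorial note on the scaling argument made around 16:04: the talk contrasts many bilateral agreements with a single shared cross-referencing service. A minimal sketch of why the first approach scales poorly, assuming the worst case in which every pair of stakeholders needs its own agreement (the talk does not say this; it is an upper bound used for illustration, and the function names are hypothetical):

```python
def bilateral_links(n: int) -> int:
    """Agreements needed if every pair of n stakeholders connects directly
    (assumed worst case: one agreement per pair)."""
    return n * (n - 1) // 2


def hub_links(n: int) -> int:
    """Connections needed if each of n stakeholders connects once
    to a shared cross-referencing service."""
    return n


if __name__ == "__main__":
    # Compare the two approaches for a few community sizes.
    for n in (5, 20, 100):
        print(f"{n} stakeholders: {bilateral_links(n)} bilateral "
              f"agreements vs {hub_links(n)} hub connections")
```

Under this assumption the pairwise count grows quadratically while the hub count grows linearly, which is the inefficiency the speaker points to when the system scales up.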