Findability of Research Data and Software Through PIDs and FAIR Repositories

Video in TIB AV-Portal: Findability of Research Data and Software Through PIDs and FAIR Repositories

Formal Metadata

CC Attribution 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

The first part: findable. So we have already considered
what findable means in the sense of FAIR. With the first principle, F1, we have that (meta)data are assigned a globally unique and eternally persistent identifier. "Eternally" is very interesting: "eternally persistent" may be a bit of a doubling, because the one should include the other. And we have this whole PID environment here, which is the very first thing the authors thought about when it came to the findability aspect. That is nice, but a persistent identifier in itself only includes generic metadata to make a dataset citable. That is actually not enough, so they added F2: data are described with rich metadata. Rich metadata may include provenance metadata, so a description of how an experiment was designed, what software was used, which version of the software was used, a short abstract, maybe some keywords. If you think about a standard publication and what journals want from you when you submit a manuscript, you can apply many of those principles when you publish a research dataset, so in that sense it is not so different. You may be familiar with it: many data repositories also ask for certain keywords or something like that, but it is often optional to submit them. Often you just need to provide the metadata that is necessary to get a PID, like your name as an author, a title and so on, and then the dataset is automatically published and you get a DOI. But that does not cover rich metadata. So it is also a mandate for repository administrators to make sure that the scientists have the
opportunity at least to also provide this kind of rich metadata, whatever that might mean for your discipline, because that is not specified here. It is up to the communities to define for themselves what rich metadata means. For F1 it is clear that even here you have about 20 persistent identifiers to choose from, and you can even develop your own. So here the community is the one that has to decide what the protocols are and what guidelines have to be in place for their needs. So here the FAIR principles for findability are open to interpretation. The same goes here, but that is more on the technical side: F3, (meta)data are registered or indexed in a searchable resource. That is often covered when you get a persistent identifier, but it is not always the case. There are about 20 identifier systems out there and only some of them are linked to a searchable resource, because you can also set up, for example, your local Handle server. It is about 50 dollars or euros a year; as an institute you get handles which are then distributed within the institute, and that is basically it. You can shut down this server any time you like, and then the PID, here the local handle, is not searchable anymore. So it is not automatically connected to an international or national searchable resource like DataCite or Web of Science, or anything you like. That means the choice of the persistent identifier has to be made with some care and consideration. The next one, F4: metadata specify the data identifier. That is also a principle that leaves some room for interpretation, because the metadata sets behind the identifier systems can vary, and they can be adapted, changed and modified all the time. There is not always a protocol
behind it which is in itself again FAIR in the context of the community. So if one chemist, let's say in the US, decides: OK, for this experiment I have to have a certain kind of metadata set, some of it is in standard one and some of it in standard two, and I combine them and make my new standard three or something like that, then maybe a colleague in Europe or China or wherever does not have to agree with that, and it is not part of a community position. So for the same experiment there may be five, six, seven so-called standards which are not discussed within the community. So this can be a tricky issue to address. So what is your role as a researcher, as well as a teacher maybe, as an instructor, as a person who knows about research data management and passes this knowledge on to colleagues? You have to know that it is good practice to assign a globally unique and persistent identifier upon publication or after the upload, because you can also get a
PID which is not searchable yet, before the publication stage. Some repositories offer this feature, especially for the DOI. The repositories should also provide the metadata in a human- and machine-readable format. You can look up, for example, XML; it is a good example of being both machine readable and human readable. But as a researcher you of course have to be open enough to learn new formats, not be afraid of them, and not be afraid to ask. Starting off with the metadata: most of the time, for the PID you have the author names, the subject areas and so on, and this metadata input and metadata generation should be supported by the repositories. It can be an XML upload, which you can do really easily, or you can use an API, if you ask your data curator or your institute to provide an open API you can use to submit the metadata. Especially when you have lots of datasets, that would actually be the way to go, so not manually typing in all the metadata information, because that will be time consuming, and if you have more than one or two datasets it may take ages. So using the APIs that are provided by most of the digital data repositories may be one way to go. And then the repositories themselves of course support searching. I was briefly showing an example of a DOI before: if they have this PID system in place and they use a common one, not a local Handle server for example, then they are often connected to many different harvesting services, so that their data publications can actually be found. And the last point I covered already: metadata upload in general should be supported in both ways, meaning a manual metadata upload as well as an
automatic one. There are also some recommendations for you. One recommendation could be to check the datasets that you use, and if there is a PID and you decide to use the dataset, please cite the data with the PID and not with the URL or anything else. That would be the way to go. Interestingly, with many journals, when you submit a manuscript and a reference list and you did not use a reference manager such as Zotero, some scientists still tend to create citation lists manually, and then they often only put the URL and not the DOI, even if the paper or the dataset has a DOI already. That happens quite often. There was a study about two years back which discovered that between 60 and 70 percent of all publication records across about 5,000 journals use the URL instead of the DOI, and there are many broken links, even though the publications had a DOI. Journals use the DOI system and have been using it for, I would say, at least ten years; there is of course Crossref as a DOI agency. They are so familiar with the system that we think even the editors should be aware that it is not the URLs that should be included in the reference list but the DOI, of course. But that is still not the case, so there are many examples of broken URLs out there. So for the future it would be nice if you could include the DOI or any of the other persistent identifiers. And then, if you use a repository, you should make sure that the repository offers a PID system. You should give preference to repositories that mint those automatically, so that you have no way to publish a dataset without a DOI, because if you really want a publication of your data you should make
sure that it actually uses a PID system. Today this is not always the case; as we will see later, some repositories, especially local ones, do not include a PID system yet. One argument is cost, because repository administrators in some countries have to pay to get a PID system and basically implement it into their repository structure. They usually do, but sometimes it happens that they do not include this kind of service. So please support this and report the requirement that you need such a PID to your repository admin or the data curator of your institution, whoever is responsible for your repository. Another thing, of course, comes with providing rich metadata. We know that depending on the discipline this can be very complicated and time-consuming. If at the moment there is no standard recommended by the repository, please follow good scientific practice and provide as much as possible, keeping in mind what a fellow researcher, a PhD student or whoever, would do within the next five,
10, 20 years if he or she had to work with my data. With that in mind, rich metadata can become a lot, but of course there is a lot of work going on in different disciplines at the moment to have some more support in place as soon as possible. But again, some disciplines are not there yet. You can also consider, and maybe encourage colleagues or yourself, to take part in groups like the Research Data Alliance, where community initiatives put forward metadata standards, vocabularies and ontologies, and maybe you have some time to participate. At least look them up and see what others have been doing along the way, because it can really help to address this issue. And the last question is maybe even more complicated: what would a researcher from a different discipline do if he or she had to reuse the data with the metadata information I provided? That is another question that helps to take a different angle on things. Maybe ask a friend or colleague from a different discipline to look it over. Of course you do not have to do that with all the data, but at least for those datasets you are going to make public and that are going to be reused, we would recommend keeping those aspects in mind when you publish data. Some examples; it is nice when you have some good examples here. So we have here a classical research paper in GigaScience, the journal here, in a classical article format, you recognize it immediately, and there is also the GigaScience Database behind it. This is referenced by a note in the paper on the availability of supporting data. And here again we
have a local identifier in place, which is mentioned in the paper. This local identifier is used by the repository, and they also use a DOI for each published data record. This is in the reference list as a classical citation, here it is reference 14, again with the authors, the title of the dataset, the publisher, here GigaScience Database, the year 2017, and again the DOI in URL format, so it is machine actionable. So not only for a human but also for a machine this connection can be made clear in the DOI metadata. Ideally you have this nice data citation in place, and many journals at the moment, PLOS for example as a publisher, also go this way. You can already reserve your DOI at the beginning, before you even have the DOI for your publication in place, and they are automatically linked, so not only human readable but also machine readable, before the actual publication of both the paper and the data publication is done. So from a technical point of view it is not a problem at all. It is also possible with identifiers other than DOIs. It is just a formal issue for journals and editors to actually support this kind of data publication format. And now a bit more on the PID part: which ones are there, and how can they be used for research data? We already mentioned the DOI frequently, but there are many more. Can I do a short show of hands: how many of you have an ORCID account? OK, about 50 percent. And any other one? OK. So we have others like the Scopus Author ID and the ResearcherID. We have
organizational IDs and funder IDs, like GRID or FundRef; they index organizations. Ringgold is another identifier, for example. So this is an example for the funder side and the organization side. Then of course we also have resource IDs. They can include ePIC handles, for example, which is not your local Handle server but a server that is managed by the GWDG in Göttingen. They offer persistent identifiers for research data, for example, so it can be compared to the DOI system. Of course we have Handle, we have URN as a service, and we have many more, especially for some disciplines, like persistent identifiers for cultural heritage entities, and of course there are many more.
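The point made earlier, that a DOI written in URL form is machine actionable while a bare string in a reference list is not, can be illustrated with a minimal Python sketch. It assumes the public resolver prefixes https://doi.org/ and https://orcid.org/; the function names and the DOI 10.1234/example are hypothetical and are not taken from the talk. The patterns are deliberately simple and only cover two of the identifier families mentioned above.

```python
import re

# Assumed resolver prefix for writing a DOI in URL form.
DOI_RESOLVER = "https://doi.org/"

# Simplified patterns (assumptions for this sketch, not an official grammar):
# a DOI starts with "10.", a numeric registrant prefix, a slash and a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")
# An ORCID iD has four groups of four characters; the last one may be "X".
ORCID_PATTERN = re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")

# Prefixes people commonly paste in front of an identifier.
KNOWN_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
    "https://orcid.org/",
    "http://orcid.org/",
)


def strip_resolver(identifier: str) -> str:
    """Return the bare identifier without a resolver or scheme prefix."""
    identifier = identifier.strip()
    for prefix in KNOWN_PREFIXES:
        if identifier.lower().startswith(prefix):
            return identifier[len(prefix):]
    return identifier


def classify(identifier: str) -> str:
    """Guess the identifier family: 'doi', 'orcid' or 'unknown'."""
    bare = strip_resolver(identifier)
    if DOI_PATTERN.match(bare):
        return "doi"
    if ORCID_PATTERN.match(bare):
        return "orcid"
    return "unknown"


def doi_as_url(identifier: str) -> str:
    """Write a DOI in URL form so that a machine can follow it."""
    bare = strip_resolver(identifier)
    if not DOI_PATTERN.match(bare):
        raise ValueError(f"not a DOI: {identifier!r}")
    return DOI_RESOLVER + bare
```

With this sketch, a reference list entry given as "doi:10.1234/example" or as an old dx.doi.org link would be rewritten to the same URL form, whereas a plain landing-page URL is reported as "unknown" and would have to be replaced by hand with the persistent identifier of the dataset.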
Are there any questions? So the question was how safe the usage of ORCID is, since it is run by an organization, and what happens if they sell out to another organization, even a private one. ORCID at the moment is a non-commercial organization based in the US, and that
is set. I know the ORCID people quite well, and the mandate is actually, with no guarantees from my side, that it is not to be sold. And the mandate is also that researchers can keep full control over what is displayed on the ORCID iD and what is not. There are many ORCID profiles out there, and I would not recommend doing it that way, but there are many poor ORCID profiles out there where the iD just refers to a name, or even an acronym, and displays nothing more. So you can still use this ORCID iD and your researcher profile can be kept up to date, but just for yourself and no one else, or only for your institute and not beyond that. But as a researcher, you may argue, you want your papers to be read, you want to be visible, you have this competition in place, so you want to be a public person to a certain degree. That is also why we think commercial players like ResearchGate, for example, are so successful, because they actually sell their data to others. ORCID, on the other hand, has this non-commercial background. They cooperate closely with the DOI agencies like Crossref and DataCite, so it is a very close cooperation. They are all players who say that the metadata behind the PIDs is actually public domain. But with a personal profile like ORCID, the researcher can decide which publications are going to be displayed. As soon as you go public with a paper or a data publication, of course, public means public, so the metadata is open. I would argue that way, and I would argue that you do not have to put your personal details in. There is no field for displaying your nationality or the country you were born in or stuff like that. You can add your e-mail address, but you do not
have to in ORCID, and that is kept private. As for German law, let's say it like that: the data you submit and add to ORCID are of course there and will stay, but if they change their regulations, or are going to change them, you can still say: OK, I delete my profile, and I do not want my data to be reused in any way.
So with other identifiers, Microsoft's for example, it is like that: they are much more automated already,
and there researchers can be profiled more or less automatically. With ORCID, the advantage is actually that researchers can still maintain control over the data that is displayed behind their ORCID profile; they have more control on this side. OK, so thanks for the discussion, very interesting. But that's not all: there are even more PIDs than you would have thought. There have been some start-ups and some new groups, and some excitement going on, about PIDs for projects. Projects vary: there are projects that last one, three, five or ten years. They have some information about the project, of course, and they say: why not provide a persistent identifier for our project as well and index them, so that they can be found again? Maybe some of you had this experience when you tried to look up old resources or old projects: the sites are not maintained anymore and you get a 404, because the sites are lost, the server has been switched off, whatever. With a PID for a project you would still have the metadata information, in theory, still in place: what the project was about, who was responsible, how it was set up. So this information would not be lost along with the website, which is actually often happening today. This is one aspect. Then there is a PID discussion going on for instruments. There is a very active group around it, an RDA interest group on this aspect. Instruments meaning, especially in climate science for example: you go on a ship, on a research vessel, and you set up a complicated system of many different instruments, you connect them together and you start to collect your data, which is based on public funding, public money, of course. So why not get instrument IDs that are not specific to certain
companies but are specific to a certain research discipline? Then in your group you can describe the instrument with a certain set of metadata, and you can easily reuse that metadata and provenance information when you, for example, try to submit your data set for publication, so that it is all connected together. That's another development going on at the moment. What is in place already is PIDs for ship cruises, so again for research vessels. Many countries of course are doing expeditions to faraway places, and for a long time now they have of course been recording all this information: the routes they are taking, the sample stations they are visiting, how long they stay at a station and so on, with all the geographical coordinates associated with it. They are now trying to harmonize all this kind of information and how it is going to be used in research data and of course also in article publications. So when you read a paper, in the materials and methods section, you no longer find just: OK, on a cruise of Polarstern in 2010 we took samples in the Arctic Ocean. OK, the Arctic Ocean is big. Where exactly did you take those samples? What kind of ice shelves were there? How, where, and on how many days did you sample? Something like that. All of this is provenance information as well, as metadata, and it should also be included behind PIDs for ship cruises. So it's another development going on, and the community behind this is very active. Another one is physical samples. For example, with a field station: if you are a geologist and go out to sample minerals or something like that, you can set up a field station, and it can be described as a field station using an International
Geo Sample Number, an IGSN. So there's also a PID for that, and the whole community is also very active in maintaining and taking care of the PID systems behind this. And of course, as we already talked about, there is also a new movement to provide PIDs for data management plans. So here is a question we were asked at a conference last year: do researchers really need to care about these? It was
a conference with many repository admins coming together, and they asked: is it really important for the researchers themselves to know about these identifier systems? There, we think it's safe to say yes. But what do they need to know? What does the researcher actually need to care about here? Because he or she wants to focus on their research and not so much on the infrastructure behind it. PIDs are actually infrastructure. That's very important; it is a service, but it's an infrastructure aspect. It's like the power from your power socket: it should be there all the time and you shouldn't have to care too much about it, but you need a certain awareness, I guess, of how to use it and what is good use and what is not. We did a survey last year across about 1,400 scientists in the natural sciences and engineering across Germany. About 70 percent said that they are using DOIs for their journal publications, and they thought: of course, it's a natural thing to do, because our journal or the editors are providing us with DOIs, and we are familiar with the DOI system coming from that side. About 10 percent were also familiar with the use of digital object identifiers for research data, and even fewer were familiar with other persistent identifiers for research data. So it was a first indication that DOIs, and maybe a few others, are going to be the ones most frequently used for research data. Then we asked why they are not so familiar: why didn't you use PIDs, DOIs especially, for other objects besides publications? About half of them said that they didn't know about the option to use DOIs for other publications. And some said: we do not want any counseling services; we are trusting our journals or infrastructure providers that they know what's best, and when they say we do not need a DOI or another PID for a paper, then
it's fine that way. So we do not care about that; it's not that important for us as long as we have our DOI for our article. Let's say it like that: that's the most important thing. And it reflects somewhat the situation we still have. Of course we, as infrastructure providers, are also still struggling to make it as easy and as natural as possible, within the research data workflow, the research workflow at your home laboratory, at your home institute, in your working group, to use these PID services and not even know that you are using them. Now I would also invite you to have a look at re3data, if you are not familiar with it. It is a registry of research data repositories around the globe. If you look them up and just do a quick search, you will see that there are about 2,000 repositories listed there. That's quite a lot. If you are a researcher in a certain discipline and you have to pick, let's say, for your discipline out of maybe 100 data repositories available, it's already a tough choice, and even 50 is still a tough choice. When you look them up you see them all, but if you search specifically for the ones that include PID services, there will only be about 800 of the 2,000 that include PID services. And if you narrow down the search even further, for example not to DOIs but, as a physicist, to another persistent identifier that is more common in your community, then you are down to only a few, maybe. So depending on the discipline and the PID services offered, you can narrow it down quite quickly. Then you maybe have a closer look at the options they offer, for example whether they offer a certain license that you need for your data publication, and then you quickly come down to the repository which may be the most suitable for
you. So that would be one way to include PIDs in your data management, in your choice, and in how to choose a good data repository. It's just one way, but we always recommend re3data, because at the moment it's the most up-to-date information on data repositories available globally that we have. OK.
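The narrowing-down just described (discipline first, then PID service, then license) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Repository` record, the `narrow_down` helper and the four catalogue entries are all made up for the example and do not reflect the real re3data schema or API.

```python
from dataclasses import dataclass, field


@dataclass
class Repository:
    """Hypothetical, simplified registry record (not the re3data schema)."""
    name: str
    subjects: set = field(default_factory=set)
    pid_systems: set = field(default_factory=set)
    licenses: set = field(default_factory=set)


def narrow_down(repos, subject=None, pid_system=None, license_name=None):
    """Keep only repositories that match every criterion that was given."""
    result = []
    for repo in repos:
        if subject is not None and subject not in repo.subjects:
            continue
        if pid_system is not None and pid_system not in repo.pid_systems:
            continue
        if license_name is not None and license_name not in repo.licenses:
            continue
        result.append(repo)
    return result


# Invented example entries, only to show how each filter shrinks the list.
CATALOGUE = [
    Repository("Repo A", {"physics"}, {"DOI"}, {"CC BY"}),
    Repository("Repo B", {"physics"}, set(), {"CC0"}),
    Repository("Repo C", {"geology"}, {"DOI", "Handle"}, {"CC BY"}),
    Repository("Repo D", {"physics"}, {"Handle"}, {"CC BY", "CC0"}),
]

if __name__ == "__main__":
    step1 = narrow_down(CATALOGUE, subject="physics")
    step2 = narrow_down(step1, pid_system="DOI")
    step3 = narrow_down(step2, license_name="CC BY")
    print([r.name for r in step1])
    print([r.name for r in step2])
    print([r.name for r in step3])
```

Each extra criterion can only shrink the list, which is why a few well-chosen filters get you from thousands of entries to a short list quite quickly.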
Continuing with the PIDs: why are they such an important part of the FAIR principles, and what is behind
the PIDs? Maybe here in a little more detail. So PIDs,
if the metadata behind them is maintained well, can provide a lot of provenance information behind a digital object, so not only behind research data but behind any digital object. As we heard before, it can also be a description of a field station, for example, or a description of a ship cruise, and there are many more aspects of the research workflow. Besides provenance and metadata, a PID always also touches on legal aspects, for example data policies and guarantees. The metadata information on research data can also include, for example, licenses; it can include aspects of data metrics; and it can include references to other publications which are important for the usage of the particular data set or data resource we are describing. So there are many aspects here, and we can try to narrow them down a bit to the central points you should keep in mind when it comes to this information and the PID concept. Provenance means that we have validation and credibility, that the researcher should comply with good scientific practice and be sure about what should get a PID and what shouldn't. Some data repositories and institutions have very focused and strict data policies in place, and also use, for example, different kinds of PID systems for different purposes. One example is in climate science: what is sometimes done there is that a Handle is provided for the data set or data package as a whole, which is used for modeling, for a first analysis of the data. But the actual data publication, what is mentioned in a paper later on, is only a part of it; only a small part of the whole data package actually goes into the public repository system, and it gets a DOI. So they use different kinds of PID systems for different
purposes. But again, it strongly depends on the discipline and the community, and also on the local data policies you have, what the requirements here would be. So this is the aspect of provenance. Metadata, of course, is central to the visibility and citability of a data set, and it should be provided with careful consideration. The policies behind the PID systems, I guess, are not so interesting for many researchers, so let's just say they ensure persistence on the World Wide Web. That's basically the key point behind the policies of many different PID systems: at least the metadata behind the persistent identifier should be available for a long period of time. This is one of the key messages. The different PID systems and agencies differ on what "long" means and what "persistent" means; they are not harmonized in that respect, so they interpret different time spans here. So it is basically also up to the repository administrators to ensure this metadata availability behind the PID system. Again, it depends strongly on the policies. As for machine readability, that is of course an essential part of the future discoverability of the data set. It is always good to check, for example, which metadata formats are provided behind the PID system and whether they can be reused. Behind the DataCite DOI persistent identifier system, for example, you can use the API, which is open, to get the metadata in many different formats if you want to index it. You can get it in JSON, you can get it in XML, and you can get it in RDF/XML as well. Those are just three possibilities for how you can get the metadata information, so it's quite flexible. But not all persistent identifier systems are that flexible. So if you are really interested in maybe setting up your own
scholarly index or anything like that, just keep in mind that there are different styles, let's say, out there, and check which metadata formats you would like to use. On metrics, which I will touch on briefly: they are supported by many PID systems, but they are still at the very beginning, and there is no common system that is widely accepted by research communities, repository administrators and so on. There are many different systems out there for the different kinds of metrics you would like to carry out. There are some projects; I would like to mention one here, the Make Data Count project, based in the U.S. and led by the California Digital Library, I guess, and DataCite members are also participating here. It's one of many projects in this whole metrics area, and we will have to see what comes here in the following years. Also, for example, some current research information systems, CRIS systems, have now included metrics in the form of the small colorful donuts offered by Altmetric, so you will maybe see some of these donuts out there as well. But of course they are not focused on data publications much, but basically still on research papers. Maybe this will become similar when it comes to data publications, but we don't know. PIDs provide interoperable metadata, and most of the PID providers also talk to each other to get a lot of harmonization in place. One example of this harmonization is the close connection between the ORCID profile of a researcher and the DOI system offered by Crossref and DataCite.
So we have a collaboration going on between those three PID providers to actually share their metadata and to offer an auto-update function, which is completely optional for the researchers. As I said, they can say that their ORCID profile should be updated automatically when a new publication is accepted and published, for example by a journal. I would say 95 percent of them use Crossref DOIs, so the probability that your next paper is going to have a Crossref DOI is quite high, and if you also have an ORCID profile, you can authorize this collaboration, basically, to update your ORCID profile if you wish. So in this example the choice would again go for the DOI system, but maybe other PID providers will also join this collaboration in the future. There are talks going on with the whole Handle community, and of course there is also exchange going on, for example, from an EU-funded project. So there are a lot of harmonization efforts going on at the moment. Now, I'm talking a lot about DOIs. An example of the structure of such a DOI can be seen here. Basically it consists of a proxy, which is the same all the time, and a prefix that is specific to a certain repository, for example. It is also always structured the same way: it starts with a 10, and then it can include a number, or it can even indicate an acronym for the data repository itself. This practice of including a name is often used for reasons of usability. I would say it's not recommended, actually, because the name of a repository can change, especially when you consider the persistence aspect of a PID, including a DOI. Persistent means: who guarantees that the repository will use
the same name, let's say, in the next 15, 20, 30 years or something like that? So as a DOI agency we recommend using a neutral format here, for example number combinations. But again, it's up to the repositories to choose, not up to the scientists. This is not up for negotiation: you cannot decide yourself which DOI you would like to have. In general it is created automatically and given to you by the repository or by the journal you are using. And the last part here is the suffix. That's the part that is specific, and of course unique, to the data source or resource you are referring to with this DOI. I should also mention that the DOI has been an internationally recognized and supported standard for a couple of years now. It actually means that the DOI always refers to the object itself and not to its location. That means that the URLs behind a DOI can change, and they do change quite often. So as a repository admin you have to take care that the references behind the DOIs are up to date, so that your clients, the researchers, the customers, whoever, do not get a broken link. There are link checkers for that, so it's automated as well, and that's usually not a problem. We have offered the DOI service since 2004/2005. As a PID provider we provide a DOI service, and we were among the first ones who started providing the service, in cooperation with the German Climate Computing Center in Hamburg. They were the first to approach us and ask: we have a data set here that goes into the report of the IPCC, the Intergovernmental Panel on Climate Change, and we want to make a reference to this data set and not only to journal publications; how do we do that? Back then there were not many data repositories, at least
public ones, let's say, in place. The DOI system was already there; it was used by journals back then already. So we adapted it, and DataCite was founded together with other founding members, among them the California Digital Library, and the British Library was also among them. They came together and decided to put an organization behind this DOI referencing. Over time there also came the necessity to adjust the metadata behind the DOI system, to include not only research data but also all the other kinds of resources that are produced in the research life cycle. Here at TIB, of course, we have our focus on the science and technology part as a DOI agency. In Germany it is structured like this: we have five DOI agencies in Germany. GESIS is one of them; the ZBW, the national library for economics, is one on the national level; the one for medicine is another; and there are two more. So we are split up, actually, according to disciplines. We provide this DOI service within TIB for academic customers, for academic purposes, which are financed at least 50 percent by public funding, and the DOI service is free. So if you set up a repository at a university located in Germany and it is funded by public money to at least 50 percent, the DOI service is completely free for you. You get all the support, we have a support service behind that, and you can apply for one or more prefixes, and you get the support you need to use this DOI system. Some of the services on offer are, for
example, managing the prefix assignment, and then of course first-level support: if you are troubleshooting technical issues using the DataCite API or something like that, we provide first-level support. We also offer training and counseling services as a DOI agency, and of course we provide access to the registration platform, which can also be accessed manually, because some repositories only have one to three data sets a year that they want to publish and provide a DOI for. And sometimes, of course, you have 50,000 a month, or maybe 10,000 a month; that has happened before. So it really depends on the system you want to use. And what does it look like? You have the International DOI Foundation on top, the global managing body, let's say, for the DOI. Then you have the global registration agencies like Crossref and DataCite here. Then on the national side, for other countries and here in Germany for the different research areas, you have the registration agencies, which then provide support for the clients. And briefly, on numbers: maybe not so important from the researcher's point of view, but maybe you want to give this information back to your institution or your research data team. About 1.3 million DOIs have been registered so far by TIB. Here is some of the distribution: most of them are actually research datasets; some are for grey literature, which includes cruise reports and funding reports for German funding agencies; about 10 percent are images, and a few percent are audiovisual media, which also use the DOI service. In total we have well over a hundred data centers, but the numbers of course change daily, so we have one to two new data
centers a month coming to TIB. One could say most of them are universities, but of course there are also research institutions from different areas, Max Planck being one example, and many more. We also have, of course, data centers from other countries. The policy here is that we actually recommend that they have a local agency in place, so country-specific, but of course that's not always the case. So within Europe and also with the U.S. we of course also have cooperation agreements to have clients, data centers, from other countries until they have their own national DOI agency. And of course we also use the system ourselves: our own cataloguing team and metadata experts use it, and in the AV-Portal; we have reports, we have audiovisual media. So we use the system as well. So, to sum up: what are the essential and non-essential aspects of persistent identifiers for researchers? What is a PID and what is it not? The most common definition that is widely agreed on: a PID is a long-lasting reference to a digital resource. It can be really any digital resource, so it's no longer focused only on article publications or data publications. It can be a video, it can be a digital description of a field station, it can be a lot of things. So again, a PID is not always a reference to a research data set or article. There are different sorts of PIDs, as we have shown, and different use cases, so it really depends on the type of PID you are looking at, and you should check whether the PID is the one you should use for articles, for data, for persons, organizations, field stations and so on. It strongly depends, of course, on
the intention of the organization, on the organizations which offer the DOIs, and on the metadata schema behind them. PIDs such as DOIs are provided by organizations. So if you want one, just ask your local institute or your library, or, for example, if you submit a paper, they are mostly offered by the journals. As a researcher, very important: you do not have to pay for a DOI. You should never, ever pay to get a DOI yourself. You shouldn't, because that is the job of your institute, of your library, of the publisher where you are going to publish your paper, or of the repository where you want to submit your data publications. No PID system is intended to be used individually by one researcher. That is not the case, even if some organizations or someone is telling you otherwise. It is not the original intention, it never was, and it is not supported by the big PID agencies either. It's just not the case. PIDs are mostly used for persistent citation; that's another key aspect. All published resources should have one. Not all of them have one as of yet, but they should have a PID. It doesn't have to be a DOI or any other particular system. One widely accepted by the community would be nice, but it doesn't have to be. But there should be a system, an organization, a library or an institute that is supporting this persistent identifier. The next one, very important: a good citation should always include a PID, and a PID doesn't necessarily mean a DOI; it can mean other identifiers as well. If there is a PID system available and your object has a PID, you should use it in a citation, and you should adjust your
citation style in your reference tool, for example Zotero, Mendeley or whatever you use, so that the PID is always included in your references. This is very important, and a good journal also checks that. The next one: the metadata behind a PID is really important, so please take care when providing it. Ask your data curator, your institute, your support team to help you take care of that. And PIDs are not perfect. PID systems are run by humans, and we are not perfect; machines are definitely not either, and there are organizations behind them. So please bear in mind that there can be broken links behind a DOI. The systems are not perfect. They try to improve, they collaborate, they try to do it as well as possible. But if you have issues, if you have suggestions, for your repository, for the data curators, for the persons involved in issuing persistent identifier services, please feel free to contact them. I guess most of them would agree that they can improve their services, and some issues can be solved really quickly. We already had an example of a data center where about 150,000 datasets were, from one day to the next, no longer reachable. We looked at it, asked what was going on, picked up the phone, and the issue was solved in two hours. They had just forgotten to update the URLs behind these 150,000 datasets. So it can be solved very quickly, but someone has to say: here is an issue. Then we can find a solution for it. And last but not least, PIDs are useful and also fun to use, if you do some scientometrics or something like that and play around; it can actually be quite a lot of fun. So, if you have some time, see for yourself. And they help to make
your work more reusable. Then some aspects on the other list: things you maybe do not need to know. Maybe some of you want to know, or want to have them in the back of your mind, like something I showed: the total number of PIDs registered. Numbers are always nice for institutions, but as a researcher maybe you do not need to know them that much. The same goes for the names of the agencies behind PID systems. It should just work: depending on the repository you use or the journal you submit the publication to, it should work that you get a DOI and that the persistence works. That's still an issue PID providers are struggling with, one of many. But you shouldn't have to care about the infrastructure behind it, or about the fights over why one is better than the other or something like that; of course that happens at conferences and so on. Nor about how perfect the PID system is, because they are certainly not; there's room for improvement. One can keep that in mind, but it's not that important. As a researcher you should care about your passion, your research. As long as you are not in information science yourself, it should actually be the job of the PID providers to focus on communicating the most practical points, and the citability and usability aspect, as a benefit for the researcher, should be crystal clear. We will have some live examples coming up now to actually show you the benefits of having a PID system in place. We like to refer to PIDs, in our case as TIB to DOIs, as the glue that can keep it all together: the different digital systems, the current research information systems, the different kinds of digital repositories out there, discovery services, the researcher profile
like ORCID here in this case, which is just an example. DOIs, and PIDs in general, can be one way to make it interoperable, to keep it together, and actually to have open APIs in place that you can use to build up your own system that is harmonized in one way or another and can be accessed more easily. And there is still the issue of other formats, let's say, also using persistent identifiers. One that has to be mentioned here is of course the data journals. For data publications there is now a possibility, for some disciplines, like Nature's Scientific Data or the Biomedical Data Journal, and there will be many more popping up in the next years. Besides repositories, that's also a way. They can act as both, as a data repository and in a journal function, if you report on your research data or something like that as well. So it's one more source, actually, to use DOIs and PIDs. It's going to be pretty nice to see, just as a personal note, which DOI agency, DataCite, Crossref and so on, will take the lead and will be the PID provider for this new kind of digital publication format. And with that we come to the more practical side and back to software as well. So, a first round of questions: who uses GitHub? OK, almost everybody, two-thirds of the audience. And who uses Zenodo? A few, two or three people. OK. We have prepared a demo, and it's live, so we will see how it goes. The main gist here is that there is now an official integration between GitHub and Zenodo, which comes from a project of the Mozilla Science Lab. We heard just now that PIDs are usually managed by an organization, and the customers are, on the one
hand the users will want to find the resources but also the institutions were
hosting resources and the the video
eyes are so-called mint it's a persistent identifiers but there's also a different type of identifiers and they're called intrinsic so they are generated from the object itself and on get her up as you probably know the Git hashes also indicate system generally you get hashes are such a unique identifier which are not assigned to resource but they are generated from the system so therefore always uniquely identifying something and therefore the the question of persistence then is shifted to the to the from in the DOI system for a summary of his contractual obligations of the organization with resources to update the or else we just heard from 1 example where they did a bit later than they should have done it so this is the habits and but there's also at a technical level where for example you can do the redirects on your cell that or you can use the skid hashes for example and these this integration between get happens inaudible now takes advantage of both basically a sea of on the 1 hand the Get system was the intrinsic identify us and on the other hand this this academic preference for having your eyes so you can't simply sink the 2 accounts there are the 2 systems and then get for your repository and your I. 
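
To make the difference between intrinsic and minted identifiers concrete, here is a minimal Python sketch. The function `git_blob_id` reproduces how Git derives the identifier of a file from its content. The class `MintedIdentifierRegistry`, the DOI-like string `10.1234/example` and the example URLs are hypothetical stand-ins for a resolver, not any real service.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 that Git derives for a file ("blob") with this content."""
    # Git hashes a short header ("blob <size>\0") followed by the raw bytes.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


class MintedIdentifierRegistry:
    """Toy resolver (hypothetical): maps an assigned identifier to a location."""

    def __init__(self):
        self._locations = {}

    def register(self, pid: str, url: str) -> None:
        # Registering the same identifier again updates where it points.
        self._locations[pid] = url

    def resolve(self, pid: str) -> str:
        return self._locations[pid]


# Intrinsic: the same bytes always give the same identifier,
# and any change to the bytes gives a different one.
unchanged = git_blob_id(b"hello\n") == git_blob_id(b"hello\n")
modified = git_blob_id(b"hello\n") != git_blob_id(b"hello!\n")

# Minted: the identifier stays fixed, but somebody has to keep the URL current.
registry = MintedIdentifierRegistry()
registry.register("10.1234/example", "https://old.example.org/dataset")
registry.register("10.1234/example", "https://new.example.org/dataset")
print(unchanged, modified, registry.resolve("10.1234/example"))
```

The intrinsic identifier needs no organization to maintain it, but it says nothing about where the object lives; the minted one does the opposite, which is why the two complement each other.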
That is why it says here: if you are in Rome, dress like the Romans. So if the DOI system is more accepted in the academic world, here is one option to give your code a DOI. We are going to have a little demo of that in a minute. Zenodo is, as we mentioned already, a rather generic research data repository, so it is not discipline-specific, which in turn means that they cannot provide you with too much that is predetermined, for example metadata fields that you should fill out. So here it is a question of our own responsibility: we have to provide this rich metadata so that the DOI actually becomes useful.
There is one tool that is being built currently and has also recently opened its doors to the public. It is called Software Heritage, and what they are aiming for is, for example, ingesting all the source code from GitHub, from, I think, [unclear] repositories, and several others. GitLab, for example, is also an up-and-coming, or already quite popular, code hosting service. On Software Heritage the goal is to have all this public source code available as well, as a copy, in a rather persistent manner, and citable through that resource, should the original copy, for example on GitHub, ever be removed for any kind of reason, whether that is technical problems or GitHub having been bought by Microsoft, or whatever; you have heard the news. I wouldn't be too worried about this, but for much of the code there is the goal of getting a second copy, basically, on Software Heritage.
As an interesting side note, there is a workshop going on, I think on Wednesday, from Jisc and the Software Sustainability Institute. If you want to follow up on your own notes from this workshop, please also have a look at the software preservation tag on the blog of the Software Sustainability Institute, because probably in the next few days or weeks there will be more info available there as well.
As I said, live demo now; we will see how that goes. I am going to switch over, and let me first explain the background a little bit here.
For a different project I recently, I don't know whether you have worked with R, started writing an R package, which had the lucky side effect, because we were preparing this workshop in parallel, that we can use this R package for some demonstrations. For example, I did not mint a DOI for this repository yet, but we are going to walk through the process right now. We will probably not see the DOI itself, because Zenodo might need some time to review the application and the upload, but we can set up all of the necessary steps.
Now we are here in the issue tracker of this project, of this repository, and I made myself a little task list. I tried installing this package already, simply to check that it technically works, and the next item would be getting a DOI. I tested it in the Zenodo sandbox, and that is when I found out that Zenodo does not assign the DOI right in the next second; there appears to be some delay. So maybe tomorrow, when we talk more about Git and GitHub, we can look up whether it actually worked.
So what do you do when you have a GitHub repository that you want to have upgraded with a DOI? First, of course, we look at the official documentation from GitHub, "Making Your Code Citable". They explain a little bit about DOIs, they explain that you should make sure that you know which repository you want to have with the DOI, and then you are supposed to log in to Zenodo. You can log in to Zenodo either with ORCID or with GitHub, but you can also set up a specific Zenodo account. I am going to use my GitHub account now.
There we go, with my e-mail address from the library, and now [unclear]. Let's have a look. So we authorized the application. Now we need to tell Zenodo to pick the repository that we want. So we are going to go to the GitHub menu item here, and this page is a long page, so we will just jump to where it is. And there is the repository, and we are just going to switch it on. So, "Get started": we are almost ready, actually.
The next step in the "Making Your Code Citable" guide is to check whether the repository has this hook enabled. That is in Settings; it was "Webhooks", right? Webhooks. We have it here; looks good.
And now, creating a new release. You know this terminology from the software world, I guess: a release, a release version. In Git, the version control system, that works by tagging. We will generally talk more about Git and GitHub tomorrow, so if some of this stuff doesn't make sense yet, that's OK. Zenodo is now basically waiting for us to tag a release, and the release we can do on GitHub; there is a form for this.
And because I had prepared my little task list, I want to look it up again. See, that happens when you move a repository to a different organization: bookmarks break. But this will no longer happen when we have a DOI, right? So we have switched on Zenodo's hook, and this is good. Now we are going to tag a release. I had already prepared some version numbering, which we will also talk about on one of the next days; this package is at version 0.4.2, so we are just going to use that. Oops.
Create a release. So there was a form; that was here. And there are some helpful suggestions: if you want to use the version number here, just use a "v" and the version number.
The title: now we come to this rich metadata aspect. What should the title be? I am going for something simple and will just say [unclear]. And you should describe this release; again, that is another kind of metadata. As I showed you just now, I had already compiled this list of changes, so basically the change log from one version to the next: what has been fixed, what has been changed. I think it is good practice to include the release notes in this release description. It is just in Markdown format, and because this is the first release I am creating to get the DOI, I will also include the changes from the last few versions.
So we can close this intermediate page. "This release includes" this and this fix, and this and this has changed. This page shows a line break here somewhere in between, so some minor reformatting here, which of course did not happen in the preparation of this demo; there is always something. So does it look good? I think it looks OK.
And then the last piece of meta-information here is a binary check: is this a production-ready release or a pre-release? As you have seen in the version number, which again we will talk about more in the next few days, it is a zero-point-something release, so I am going to say this is more like a pre-release version still, because it is ready to be used and tested, but it is maybe not completely mature. In addition to the version here, I am also going to append the beta tag. So this is the form, basically.
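
The fields of the release form can also be written down as data. The following Python sketch only assembles such a record and sends nothing anywhere. The key names `tag_name`, `name`, `body` and `prerelease` are modelled on the fields GitHub's REST API accepts when creating a release; the version number, title and notes are placeholders, and treating every 0.x version as a pre-release is simply the rule of thumb used in the demo.

```python
import json


def is_prerelease(version: str) -> bool:
    """Rule of thumb from the demo: a 0.x version is not yet considered mature."""
    major = version.lstrip("v").split(".")[0]
    return int(major) == 0


def build_release_payload(version: str, title: str, notes: str) -> dict:
    """Assemble the release form fields as a JSON-ready dictionary."""
    return {
        "tag_name": "v" + version.lstrip("v"),  # a "v" plus the version number
        "name": title,                          # title of the release
        "body": notes,                          # release notes, in Markdown
        "prerelease": is_prerelease(version),   # the pre-release checkbox
    }


# Placeholder values, for illustration only.
payload = build_release_payload(
    "0.4.0",
    "Example release title",
    "## Fixed\n- ...\n\n## Changed\n- ...",
)
print(json.dumps(payload, indent=2))
```

Keeping the change log in the `body` field means the same text later shows up as the description of the archived record, which is what happens in the demo.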
It is not too many steps, I would argue. So I press "Publish". Now we are going to switch to the issue again: after the release, [unclear], so I think my plan here is fulfilled. So there we go: the pre-release is online. But is it also on Zenodo already?
So we are going to switch to Zenodo. Well, let's look again. There we are. By the way, in the test it took several hours of processing, but OK, the DOI is already there; this was even better than expected.
So there we have it: this is the publication date. We have just published a piece of software as open access; it is available on Zenodo. If we follow this link, we will get to the GitHub page of this project, exactly at this version here. So this is the release tag. Where is it? I am not sure I can search for it now, but it is this specific point in time when I made this tag. As this project evolves, this link from Zenodo will still point to this old state at that time point. But of course users can always switch to the master branch; as you probably know, in Git the master is just the default branch where all the work happens. And in this case the tag we just created is also the most current tag, so that is why it is displayed.
And now the only thing that is left to do, basically, is to label the code itself with this DOI. There is a nice shortcut here to get the DOI badge. We will use this Markdown code here, just copying and pasting it into the README file. We have a README here, so we are going to edit it. There is already a badge for the life cycle, which we can leave, and I will just add the DOI badge directly below it.
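
A DOI badge is just a Markdown image wrapped in a link. The sketch below builds such a snippet in Python. The URL pattern is an assumption modelled on the badges Zenodo offers for copying, and the DOI is a placeholder rather than the one minted in the demo.

```python
def doi_badge_markdown(doi: str) -> str:
    """Return Markdown for a badge image that links to the DOI resolver."""
    badge_url = f"https://zenodo.org/badge/DOI/{doi}.svg"  # assumed URL pattern
    landing_url = f"https://doi.org/{doi}"                 # resolver link
    return f"[![DOI]({badge_url})]({landing_url})"


# Placeholder DOI, for illustration only.
print(doi_badge_markdown("10.5281/zenodo.0000000"))
```

Because the link goes through the resolver and not directly to the repository page, the badge keeps working even if the landing page moves.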
Now the commit message, which is something I had not planned in advance, but I think "Add Zenodo DOI"... oops, I think that is the correct spelling. If we are really pedantic about metadata: commit messages are metadata as well. And I can demonstrate one nice little feature here on GitHub: if you type "fixes" and then the issue number... I wanted to demonstrate this on the next days, but we are going to do it now. That was number 14, right? Yes, that was 14. Then this will automatically close this issue. So I think there is no need to add more description or to extend this; I mean, we are getting a DOI and adding the DOI badge.
So there we are. If we go back to the main page of the repository, people now see that we have a DOI for this, and from there they can also get back to this nice landing page.
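
The "fixes" plus issue number convention can be illustrated with a small parser. This Python sketch recognizes the closing keywords that GitHub documents (close, fix, resolve and their variants); it is a simplified approximation, not GitHub's own implementation, and it ignores cross-repository references.

```python
import re

# Closing keywords as documented by GitHub: close(s/d), fix(es/ed), resolve(s/d).
CLOSING_KEYWORDS = r"close[sd]?|fix(?:e[sd])?|resolve[sd]?"
PATTERN = re.compile(rf"\b(?:{CLOSING_KEYWORDS})\s+#(\d+)", re.IGNORECASE)


def issues_closed_by(message: str) -> list:
    """Return the issue numbers that a commit message would close."""
    return [int(number) for number in PATTERN.findall(message)]


print(issues_closed_by("Add DOI badge to README, fixes #14"))  # prints [14]
print(issues_closed_by("See #14 for background"))              # prints []
```

A plain mention such as "see #14" only creates a cross-reference; it is the keyword in front of the number that triggers the closing.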
So what we have now done, basically, is created a persistent identifier for a repository on GitHub. That means Zenodo has grabbed a zip copy, an archive of exactly this point in time. So it is not the entire edit history of this project; it is just the current set of files. You can browse the list here. We have this description, which in this case is simply automatically ingested from the GitHub release page, so it is the same thing. I mean, these issue numbers don't make a lot of sense in this case, but it is probably something we can edit here; in the tests I haven't got to this point.
And then it automatically recognized that this is software, and there is the publication date. On Zenodo here I can add my ORCID. I think the ORCID will not be ingested when I put it in on GitHub, so I think I should add it. By the way, this was now maybe a ten-minute process, including all the descriptions and explanations. The affiliation, well, I guess it would be more like this. So apparently the affiliation is automatically grabbed from the GitHub organization, which in some cases will be correct and in some cases will maybe be a bit too squashed. The description, I think I am going to edit. The language is interesting here; of course they mean the human language, not the programming language. And here would be the possibility to add more keywords. How much time do we have for this? We have one more hour. Should I demonstrate all of this form? Because I think it is pretty straightforward.
So I can show you which keywords I had already set on GitHub as well. They call them topics. You can add topics; there are even some suggestions here. In this case it is a microbiology biobank, the BacDive database, and the package is therefore a BacDive client. It is written in R, it is about microorganisms, about a database, and all of these things I already tagged on GitHub. So I am just going to copy and paste the keywords. I am not sure how this works; it probably has to be one after the other. This is not the most interesting part of a demo, sorry. [unclear]
The access right is open access, and the license is another one in this case, because it is a software license, which we will learn more about in the next few days, on Friday specifically. There were no grants in this case, so I guess I could remove this. Then related or external identifiers, alternate identifiers as well.
We have exactly here one reason why the DOI system was created, right? We have, of course, a domain with the top-level domain, we have a group name on the GitHub platform, and we have a project name on the GitHub platform. But here the question of persistence is basically answered by the people managing it. A repository can be renamed, a group can be renamed, GitHub could finally close. All of these are reasons why a URL would break. It is not a natural law that URLs break, but some people decide to break the URL, unfortunately, and this is also something we have to fight, to be honest. But for now we have ensured that with the DOI we at least have a copy: a copy of the metadata and a copy of the actual software code. The related identifier would be the GitHub intrinsic identifier and the Git release tag in this case.
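
The anatomy of such a URL can be shown in a few lines of Python. The helper `describe_repository_url` and both example URLs are hypothetical; the point is only that each part of the address is a name that somebody can change.

```python
from urllib.parse import urlsplit


def describe_repository_url(url: str) -> dict:
    """Split a code-hosting URL into the parts that someone could rename."""
    parts = urlsplit(url)
    group, project = parts.path.strip("/").split("/")[:2]
    return {"domain": parts.netloc, "group": group, "project": project}


# Hypothetical URLs: the same project before and after its group was renamed.
before = describe_repository_url("https://github.com/example-group/example-project")
after = describe_repository_url("https://github.com/renamed-group/example-project")

# The files are identical, yet the old address no longer matches the new one.
print(before["group"], "->", after["group"])
```

A DOI sidesteps this by adding one level of indirection: the identifier stays the same and only the location it resolves to is updated.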
So the explanation here on Zenodo was: the reason Zenodo recommends filling this out is that you want to have the software in this case, or the data set as well, maybe, in your case, linked back, for example, to the paper that you published. Before, we saw an example where a paper referenced downwards, basically, to the data sets, but then you don't really know: maybe the repository where the data set is listed will never take care of referencing upwards to the paper, which was of course published later, for example after the data set. So here you have the possibility to create a bidirectional link. If I had already published a paper about this, I would put the DOI of the paper here and then probably use the option "cites this upload", because the paper cites my software, so that people who discover the software first can also find the paper; as we saw before, the other way round is currently the more common way.
And what Zenodo will also do, I didn't mention it before: every time I create a new release, so I develop my software and release it under a new version, it will automatically grab a new copy of it. So I have a chain, basically, of DOIs: one general DOI for the whole project, but also one for each successive version. In this case there is probably an automatic feature for Zenodo to add the DOI of the next version to the old records. Otherwise one should, where was it, here, select this qualifier for this resource: that DOI "is a new version of this upload", of the old version. So this would be one option here to enrich the metadata set considerably, and there is a lot more here.
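
The idea of a bidirectional link can be sketched as pairs of inverse relations. The relation names in this Python example are modelled on the DataCite relation types; the exact spelling a given repository expects is an assumption, and the identifiers are placeholders, not real DOIs.

```python
# Pairs of inverse relations, modelled on DataCite relation types (assumed spelling).
INVERSE_RELATION = {
    "cites": "isCitedBy",
    "isCitedBy": "cites",
    "isSupplementTo": "isSupplementedBy",
    "isSupplementedBy": "isSupplementTo",
    "isNewVersionOf": "isPreviousVersionOf",
    "isPreviousVersionOf": "isNewVersionOf",
}


def link_both_ways(source: str, relation: str, target: str) -> list:
    """Return the forward link and the matching backward link as triples."""
    return [
        (source, relation, target),
        (target, INVERSE_RELATION[relation], source),
    ]


# Placeholder identifiers: software that is cited by a paper.
for triple in link_both_ways("doi:software", "isCitedBy", "doi:paper"):
    print(triple)
```

In practice only the record you control can be edited, which is why filling in the related identifier on the software or data side matters: the paper's publisher may never add the reverse link.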
We could go through all of this. Contributors: they are in many cases also read automatically from the Git history. So what you will see in many cases where, for example, a larger package is uploaded to Zenodo is that there are so many authors, but those are Git contributors, so even somebody who just fixed a typo will appear in the author list on Zenodo. There are of course pros and cons to this behavior. I think I want to add my ORCID here as well... no, sorry, I am already listed up here as author, right, so I am not a contributor to the project, and I don't need to do this. But if there had been people who substantially contributed to this, though maybe not by committing into the project, I could list them here as well, and so enrich the data set in a way that Zenodo could not automatically ingest from the Git repository itself. OK, I am not going to go through all of these; you see here, you can enrich the data later. So are there any questions about this part?
I am going to do one more thing: I will change the description to just the first few sentences here of the README, so that I don't need to paste the raw text, but can paste the formatted version. Yes, that already looks a bit better. OK. "Identifier is required", so I have to remove this empty field, sorry. Now I am saving it, and now it should be possible to publish. Exactly.
So did this already create a new version? No, it did not create a new version. That is the only one we have currently. As you can see here, there will be a version list if I do this repeatedly. But there is a separate DOI here for this project in general, so I can also use this. The specific version has this DOI with a 61 at the end, and it appears that the general one has a 60, so I got a round number.
So the question was whether it is possible to update the Zenodo record and have the same DOI. For the metadata, yes, you can have some updates. But if I publish a new version, every new release on GitHub, then that new release will get a new DOI, while the "cite all versions" DOI will remain the same as long as this record exists. So you get a whole family of DOIs, basically, that are interconnected when you go through the normal project life cycle of updating, for example, an R package several times; having new versions, you get a bunch of DOIs.
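
The family of DOIs can be modelled with a small class. This Python sketch is a toy model, not Zenodo's data structure. It assumes that the project-level DOI always leads to the newest release, and all DOIs and version numbers in it are placeholders.

```python
class SoftwareRecord:
    """Toy model: one DOI for the whole project plus one DOI per release."""

    def __init__(self, concept_doi: str):
        self.concept_doi = concept_doi
        self.releases = []  # (version, version_doi) pairs, oldest first

    def add_release(self, version: str, version_doi: str) -> None:
        self.releases.append((version, version_doi))

    def resolve(self, doi: str) -> str:
        """Return the version that a DOI leads to."""
        if doi == self.concept_doi:
            # Assumption: the project-level DOI follows the newest release.
            return self.releases[-1][0]
        for version, version_doi in self.releases:
            if doi == version_doi:
                return version
        raise KeyError(doi)


# Placeholder identifiers and version numbers.
record = SoftwareRecord("10.5281/zenodo.0000000")
record.add_release("1.0.0", "10.5281/zenodo.0000001")
record.add_release("1.1.0", "10.5281/zenodo.0000002")

print(record.resolve("10.5281/zenodo.0000001"))  # stays on the cited release
print(record.resolve("10.5281/zenodo.0000000"))  # moves along with new releases
```

Citing the version DOI therefore pins a reference to what was actually used, while the project-level DOI is the better choice when referring to the software as a whole.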
So the question was how to ensure that people get exactly the version of a data set, for example, when they click on a DOI in the reference list. It is by using this version DOI, not the general one for the project. In some cases you also want to reference the project itself, and then you can use this one. But Zenodo would push the version DOI to you; this is the more visible and more convenient option, in order to have this time-travel-like feature, where we go to the different time points where the thing was published, or was released, in that version that you cited.
Any further questions? OK. I am very, very happy that this went so quickly, especially that in the technical parts there was so little stumbling. I am going to close all of this, and we are back in the presentation.
So, to summarize the PID part: a PID is an abstraction layer that directs you to the actual location of objects. That is what it is, and therefore the underlying location can change for any number of reasons, which maybe we can discuss in the evening.
And now we come to an exercise which will probably fill the rest of the day for us: how to choose a FAIR repository. First a bold statement, or something that may also be very obvious: there is no really perfect FAIR repository. As my colleague mentioned before, it depends on which data quality you have and which the repository requires. It depends on your discipline, your subject. Your institution or your funder may have requirements as well, or it may be predetermined where you should put your stuff. Of course reputation is still also a factor; you will probably learn about useful repositories just from word of mouth, from your colleagues. Some repositories will also help you make your research visible, maybe simply by tweeting out a link to a new upload that you have. There are many different ways in which they can help you; some do, some don't.
One interesting question is, of course, the exit strategy: do they have, for example, legal contracts in place with a back-up location to which they can transfer the data in case their own funding runs out? And has this back-up strategy been tested? That is also something you could ask the repositories. And as we have talked about with the FAIR principles a lot, making the data valuable could mean that the repository forces you a little bit to, for example, add some more metadata. [unclear]
Then there is also this whole family of certificates which repositories can get for themselves through a certification process. Some of these certificates will be based on a document review, some will be based on an audit, where somebody comes and checks, and some will be based also on very hard technical things. So there is a huge variety in this. In the end you have to balance the different needs: of yourself, maybe of your colleagues, of the funders, of institutions and so on. So it is not a super easy decision, we admit it. The Digital Curation Centre has a checklist that can help you think about which considerations you should take into account.
But in general, of course, I mean, you are all here because you are interested in this topic, but still we want to evangelize a little bit and we want to promote the idea of data sharing through repositories. You are basically handing over your data to data curation experts, so they can help you make sure that the data is safeguarded, is regularly backed up and preserved long-term, something that you as an individual researcher, or even skilled research groups, may not have the ability to do for the time frames we are interested in here. All of this is done with the hope of enabling other people to use the research, to find it, and then of course to cite it and therefore give the credit back to the creators of the data sets, the writers of software, the developers of software.
We can give some general recommendations. Of course, we said PIDs are extremely important; they are almost the first point for getting any kind of resource findable.
You should look out for the use of standards that are widely recognized. Some of them we have talked a lot about, like DataCite; Dublin Core is another one that comes from the library experts; and there may also be discipline-specific metadata formats. So at least make sure it is not something they cooked up themselves. The licensing we will talk more about in the next few days. And a certificate is a good guideline, but as I mentioned, there is a huge variety of factors that go into a certification process, and some are easier to fulfil and some more difficult; it is a complicated process.
Besides the re3data repository search engine there is also FAIRsharing, which is a bit younger. I am not going to read all of this, but you can see here that they don't have as many records, but they also integrate databases and policies, so you get a bit more complete overview of what is available in that domain, not just metadata standards, repositories and identifier schemas. However, we will use re3data for a little exercise, because there are just more repositories to search. And again, see, the number here is even lower: these slides were made just maybe one or two weeks earlier, and I looked it up just now; there are even more repositories in re3data in the last few days.
What both of them have in common is that they are, on the one hand, curated by experts, but both of them welcome user input. You will probably see it in a few minutes when you find a repository on re3data, for example: there is a button there, I think it is "send correction suggestion" or something like this. So if you see that some of the information is outdated, update it for everyone; this is what we would ask you to do.
Let's switch over to re3data. So yes, it is three more than on Friday already, so there are people busy even on the weekends, incorporating more repositories. Then you can of course dive in right away, but I want to show you a few other ways first. We can browse the entire re3data database by subject, by content type or by country. So, for example, we can get this nice map. If, for example, a national funding agency requires you to use a repository from that country, it would be a good idea to jump into the data this way. If you are more interested in what is popular in your subject area, maybe you could do it like this. So let's see whether there is something for biology. And there is even biology, biological chemistry, food chemistry, and there is microbiology. I am going to just jump in here; I guess I would be a microbial researcher.
Up here: repositories that are tagged as microbiology. And then we can add to the subject filter more boxes. You know, I mentioned all of these aspects that you really need to consider. [unclear] So let's say I want to use the DOI system, and there are only half left. Maybe I need to have the option of uploading my data, and I am down to a single one; so in the microbiology section we are down to one repository that fulfils these requirements. In this case it would be Dryad. It is not that I want to make any special advertising for them, but that is just how it came out the more I filtered by my requirements.
As you see, there is some summary here: they have an open access option, there is licensing advice, there is the DOI that we filtered for, and we can look up some more information. And if any of this, as I mentioned before, was wrong, and maybe you know the subject area and you know that this record is wrong, there is, I think, a "suggest" option. What I want to show you last is the curation option here. Exactly. So if any of this information was outdated and you knew it, then you could submit a change request here to notify the curators of re3data that something is outdated. And you can even cite this whole record. So if for some reason you want to cite the curated version of the record, maybe because you are writing a report or a review of repositories in your research domain, then you can even do that.
And with that, I would suggest that you go onto re3data yourselves. Please look up, for example, a repository from your own country or from your research domain, as I demonstrated just now, and think about which of these filters are relevant for you, and think about why some of the filters are maybe restricting the results too much to be useful, and which compromises you may have to make when you choose an as-FAIR-as-possible repository.
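
For the exercise, it may help to see why stacking filters shrinks the result list so quickly. The Python sketch below applies filters one after the other to a small, entirely invented catalogue; the record fields, names and values are assumptions for illustration and do not reflect the re3data schema or any real repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Repository:
    name: str
    subjects: frozenset
    pid_systems: frozenset
    data_upload: str  # for example "open" or "restricted"


def apply_filters(repos, subject=None, pid_system=None, data_upload=None):
    """Keep only the repositories that satisfy every filter that is set."""
    result = list(repos)
    if subject is not None:
        result = [r for r in result if subject in r.subjects]
    if pid_system is not None:
        result = [r for r in result if pid_system in r.pid_systems]
    if data_upload is not None:
        result = [r for r in result if r.data_upload == data_upload]
    return result


# Invented records, purely to show how each extra filter narrows the list.
catalogue = [
    Repository("Repo A", frozenset({"Microbiology"}), frozenset({"DOI"}), "open"),
    Repository("Repo B", frozenset({"Microbiology"}), frozenset({"hdl"}), "open"),
    Repository("Repo C", frozenset({"Microbiology"}), frozenset({"DOI"}), "restricted"),
    Repository("Repo D", frozenset({"Chemistry"}), frozenset({"DOI"}), "open"),
]

by_subject = apply_filters(catalogue, subject="Microbiology")
with_doi = apply_filters(by_subject, pid_system="DOI")
with_upload = apply_filters(with_doi, data_upload="open")
print(len(by_subject), len(with_doi), len(with_upload))  # prints 3 2 1
```

Each filter is a hard requirement here, so a single missing or outdated field in a record is enough to drop an otherwise suitable repository from the list, which is one reason to relax filters and to report outdated entries.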