Workflows for assigning and tracking DOIs for scientific software

Speech transcript
Thank you for the introduction, thank you for the invitation, and a special thanks to TIB, because obviously DataCite would not exist without TIB. It was one of the founding members and got this going, first in Germany as a DFG-funded project and then internationally when DataCite started in 2009. I have the luxury of speaking toward the end of the day, so a lot of what was said today has laid the groundwork and I can be quick on a few slides. In particular, Dan's presentation did that, because what I try to do is talk about how you would actually do all this if we agree on the principles, and usually that is the difficult part. I will talk about a few issues and a few solutions, some of them in more detail. It is very much work in progress, because I think people are going in different directions and there is no community agreement on how we do this. Probably there will be multiple ways to cite software. Of course my focus is on those that use DOIs as an integral part.
The picture for this talk is a Lagotto dog, a breed used to find truffles, and I will get back to this picture. I got it from a colleague some of you might know.
I just want to start with an xkcd comic, which is always good. This one is about good code, and you see that when you reach the lower right corner you sort of start again: what is good code?
More specifically for scientific software, this is already from 2010, from Greg Wilson, who started Software Carpentry: there is common agreement that scientific software is important, we have heard this many times, and there is also common agreement that some best practices for writing software are maybe not followed for scientific software, including things like formal training and testing. If you go to a Software Carpentry class (I am not involved in this), one of the first things you learn is that if you write code, you use version control. This is my personal history of version control systems, just to remind everyone that this changes: every 5 or 10 years we use something else again.
Once your code is in version control, the next step is to make it available publicly, and not just on your hard drive or in some system that is closed. I am very much assuming open source software here, which is my focus.
Most of this applies to closed source as well, but there are additional things to consider. This is obviously GitHub, which brings up something I heard several times today: GitHub is commercial and can go away at any moment, so we should host code at our institutions. I very much disagree with that. As far as public source code is concerned, GitHub is the most popular place right now, and there are a lot of advantages in having that. If in 5 years 50% of the code is spread over a hundred thousand places, findability is bleak, and sustainability is probably much harder to achieve in a thousand places than in one place. GitHub is private infrastructure, but we all use private infrastructure elsewhere as well. So it is not perfect, but I think it is a very good solution. Before we do something else we have to be very clear about what exactly is missing, and I will talk about a few of those things in a moment.
Once you make your source code publicly available, there is other stuff you can build on top of that. You can add tests and use continuous integration, as in this example. All these things can be built on top of publicly available source code. This is of course happening, and it improves the quality of the software. Here is one example, Lagotto again, which is a piece of software I started working on a while ago.
There might be two big things that are missing in the GitHub world, and I think also in the Bitbucket world and similar commercial source code hosting places. What I am not going to talk about is package registries. We heard about some of these this morning; they are about a specific language or a specific community, and more about publishing, compiling, etc. That is a very important part, but there is just not enough time to go into it as well. Obviously all these places also start from source code repositories, whether it is CRAN, PyPI, RubyGems, or whatnot.
The two pieces missing are metadata and archiving. Source code repositories have tons of metadata, but not the metadata we care about. That starts with simple things like authors, which might be just real names instead of user names, and also the question of who is an author, which is not simply the list of people who have made at least one commit to the source code repository. Then it goes deeper, in particular linking to other scholarly things, which is what we heard about earlier. A perfect example of something that is important to us but that nobody cares about in a general source code repository is funding information: who funded the writing of this software? There is no metadata for this in general source code repositories. The other part, obviously, is archiving. We are all aware that source code repositories are for working with software; that is their focus. Not only are you able to delete everything whenever you want, but we also do not know whether they will still be around 5 years later.
I was talking about metadata. In particular when we talk about citations, it is about linking things together. This is the Research Graph project, which came out of the Australian National Data Service. You have research data in the middle, but also researchers, publications, and grants, and you have relations between them. They try to model this in metadata so that you know, for example, what software was produced by the researchers at your institution in the last year, with what funding, in collaboration with which other institutions, etc. That is something we all want for papers and data, and obviously also for other research products.
I do not expect you to know what happened on this date, so here is a friendly reminder: this is when GitHub started. It is everywhere now, but it is actually only 8 years old. So if you think more long term, we do not know what we will be doing with source code in 8 years. And this is a reminder that the problem is real: Google Code decided to close down around 2016. If you read the fine print, your source code will still be available somewhere, but it is no longer a place where you can work. If that happened to GitHub, which is probably more popular now than Google Code ever was, and for whatever reason they closed down, it would create a lot of problems, way beyond our own community. Here is another reminder: on any day GitHub is down, it is basically as if the Internet went down, because everything depends on it. It is infrastructure. And for GitHub, science is unfortunately only a tiny piece. That is actually my main issue with GitHub: they do not care about science enough. The person who made this possible, Arfon Smith, who is also the first author of the software citation principles paper, has left GitHub, so the science ecosystem there has gotten smaller. Dan talked about this already.
So if GitHub and the other source code repositories do not give you long-term archiving and do not give you the metadata you need, you have to go somewhere else, and from our perspective the most widely used solution is DOIs.
These are the numbers. Because Zenodo gives you DOIs for releases, these are counts of DOIs, not of source code repositories. The figure for 2017 is a projection: we are now at a little over 4 thousand, so we are on track to see nice growth. You can also say that these are still small numbers. Zenodo has 75% of the roughly 26 thousand DOIs for software. The rest is mostly from a provider that has been doing this longer than anybody else, and then there are very small numbers for everybody else. That is interesting, because Zenodo has been around for a while, there is nothing specific about it, and every repository could do exactly the same, but that has not really happened.
This is an example: something released on GitHub, version 2.3.7, pushed to Zenodo, archived, and given a DOI. It is the type of software used to crowdsource projects, for example for ecological data, so it is an interesting piece of software and quite popular. Another example is again Lagotto, which I bring up all the time. This is actually from my own ORCID profile. Because it is part of this workflow, it is very easy with the tools DataCite provides to push it to my ORCID profile, and that is one place where I can control the list of things I have done. Of course that goes for everything that is in Zenodo or in other places. Which brings up the interesting
question: what is scientific software? I was thinking about this while preparing this presentation, and nobody really touched on it today. It is not so easy. It is not that all scientific software uses a particular language or a particular way of doing things. But for the things I talked about in the last 5 minutes it is important, because why should you give DOIs to random software, care whether anybody is still using it in 6 months, or care about archiving it? So I think scientific software is a subset of all software. It is also an interesting question if you want to do research on quite simple things, such as which languages are most popular for writing scientific software in astronomy today versus 5 years ago. That is a simple question, and it is not so easy to answer. Maybe it is for astronomers, because they have such a centralized infrastructure, but take ecology instead and you get the idea.
One way of defining scientific software is something that is familiar to people who program: the concept of duck typing. Instead of saying this thing is of whatever class, you say this thing can do that thing. The idea has a long history that goes back before programming: it looks like a duck, it quacks like a duck, so it is a duck. This means that if somebody cites software in a paper, or it has a DOI, or something similar, then it is scientific software. That is much easier than, for example, what one service that tracks the impact of scientific software has done, which is mining documentation and source code for keywords that are scientific. So there are other approaches you can take, but I like this one because it is very easy.
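The duck-typing definition can be sketched in a few lines of Python. This is only an illustration of the idea from the talk: the `SoftwareRecord` fields and the `is_scientific_software` helper are hypothetical names, and the DOI in the example is made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SoftwareRecord:
    """Hypothetical description of a piece of software (illustration only)."""
    name: str
    doi: Optional[str] = None   # a DOI assigned to the software, if any
    cited_in_papers: int = 0    # number of known citations in papers


def is_scientific_software(record: SoftwareRecord) -> bool:
    """Duck-typing test: judge software by how it is used, not by what it is.

    Nothing here inspects the language, keywords, or documentation. The
    software counts as scientific if it has a DOI or has been cited in a paper.
    """
    return record.doi is not None or record.cited_in_papers > 0


if __name__ == "__main__":
    records = [
        SoftwareRecord("analysis-pipeline", doi="10.1234/example"),  # made-up DOI
        SoftwareRecord("plotting-helper", cited_in_papers=3),
        SoftwareRecord("todo-app"),
    ]
    for record in records:
        print(record.name, is_scientific_software(record))
```

The keyword-mining alternative mentioned above would need access to the source code and documentation; this check only needs metadata that a DOI registry or citation index already holds.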
Easy is something I want to get to in the last few minutes. Who knows Rich Hickey? He is the author of the Clojure language, and even if you are not writing Clojure code he is worth listening to. He is a great speaker and talks about things that go beyond the specific language. Something that is important for him, and the subject of one of his talks, is "simple". The workflow that we have in GitHub, and then the downstream steps for making software citable, I consider still too complicated, but I think there is a nice way forward. One example: if there is functionality missing in GitHub, you have integrations, and you can do all kinds of crazy things. There is a place where you can find all of them. If you search it for "archive" there is nothing, but nothing stops an organization from providing an archiving service for GitHub repositories, which is what Zenodo is doing. There are a lot of other things that could be in this directory as well.
As was mentioned in a previous presentation, you can of course also take a different approach to archiving software, exemplified here by the Software Heritage project. It is somewhat similar to what the Internet Archive does: somebody else is doing the archiving, and I do not have to worry if GitHub goes away, because at least there is a fallback.
Another issue, which is actually not simple if you think about it: you have software, and in 5 years you want to execute it. That goes beyond archiving the source code; you also need the environment. Code Ocean is an organization that focuses on that. This is an example of R code that is relatively straightforward, but the general idea is that they use Docker containers and keep them around, so that if you want to re-run exactly the same thing in 2 years, you can do so with the service they provide. They use DataCite DOIs. The code in this example actually led to a publication, so there is the source code, and then the publication has a different DOI.
Now I get to the question: you use DOIs, so what does that enable? This is a simple example. If you search DataCite for everything that is software, using the keywords that this publication was about, you find this Code Ocean DOI, and there are links, metadata, etc. So with the DataCite model, if you give something a DOI, you can search everything that has a DOI for keywords in the title, abstract, authors, etc. This is not perfect, because you see that the author shown here is the service that provides the DOI, so the actual authors get lost in this case, but I think that is just a learning curve.
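The kind of search described here, everything with a DOI that is typed as software and matches some keywords, can be sketched as a query URL for the DataCite REST API. This is a sketch under assumptions: the endpoint and the `query`, `resource-type-id`, and `page[size]` parameter names reflect my understanding of the current public API and should be checked against the DataCite API documentation, and `software_search_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Assumed endpoint of the public DataCite REST API.
API_URL = "https://api.datacite.org/dois"


def software_search_url(keywords: str, page_size: int = 10) -> str:
    """Build a search URL for DOIs whose resource type is software.

    The parameter names are assumptions (see the note above); the function
    only builds the URL and does not send a request.
    """
    params = {
        "query": keywords,               # free-text search over the metadata
        "resource-type-id": "software",  # restrict results to software
        "page[size]": page_size,         # number of results per page
    }
    return f"{API_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(software_search_url("climate model", page_size=5))
```

The returned URL can be opened in a browser or fetched with any HTTP client; the response format is not shown here because it depends on the API version.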
This is Bolognese. Naming software is really hard, so at DataCite we basically just use Italian dog names. I think if we continue we will have to switch to other animals, because there are not so many of them, only 30 or so. Bolognese is a library that I wrote for the conversion of metadata. We actually relaunched our content negotiation yesterday, for when you have a DOI but want the metadata in different formats, for example in RDF or other things, and this is the library that we use for that. You see here that this library has a DOI. Of course, because we are DataCite and we like that, we have to eat our own dog food, and we have to think about how we can move this forward. We could have used Zenodo, but maybe we can do something else, and if you look here
you see a file called codemeta.json. That is a JSON file that has all the metadata we need to mint a DOI, so this is a workflow we can automate, and that is how we did it: I added extra functionality to this tool so that it generates its own DOI. CodeMeta is an NSF-funded project which has basically ended and is in the phase of writing everything up and sorting everything out. It is the usual story (I did not put the comic here): there are so many metadata standards for software, so let's just create one more. But of course that is not what they do. I would take this as one of the topics for discussion for the software citation group: can we agree as a community on this JSON file? We do this for other things as well, for example for software packages. Maybe in the future we can automate generating the file, and then you can take it and have a very straightforward process for minting a DOI. You have to think about things like the discussion we had after the previous presentation: maybe you only do this for a major version, or for every version, whatever policy you have. And because the file is in the repository you can run code on it, for example with continuous integration tools, which makes this much easier than the already easy workflow in Zenodo. DataCite then knows about the software, and with our
content negotiation we can create a citation. What you see here is that there is no mention that this is actually software. The reason is that the citation styles, that is CSL and citeproc, the system for generating citations, do not understand software yet. It is on the list as another to-do. I think there is good reason for a citation to say that this is a dataset or this is software, because people reading a long list of references may want to know. We have the metadata, but CSL does not understand it yet, and this will happen soon.
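DOI content negotiation, as described above, means asking the DOI resolver for the same record in a different format by sending an `Accept` header. A minimal sketch using only the Python standard library follows; the `FORMATS` table and the function names are illustrative choices, the DOI used in the demo is the one listed for this talk, and which formats are actually available depends on the registration agency behind a given DOI.

```python
import urllib.request

# Media types commonly supported by DOI content negotiation.
FORMATS = {
    "bibtex": "application/x-bibtex",
    "csl-json": "application/vnd.citationstyles.csl+json",
    "rdf-xml": "application/rdf+xml",
    "citation": "text/x-bibliography; style=apa",  # formatted citation text
}


def build_request(doi: str, fmt: str) -> urllib.request.Request:
    """Prepare a content negotiation request for a DOI (nothing is sent yet)."""
    return urllib.request.Request(
        f"https://doi.org/{doi}",
        headers={"Accept": FORMATS[fmt]},
    )


def fetch(doi: str, fmt: str, timeout: float = 10.0) -> str:
    """Send the request and return the response body as text (needs network)."""
    with urllib.request.urlopen(build_request(doi, fmt), timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    request = build_request("10.5446/31032", "citation")
    print(request.full_url, request.get_header("Accept"))
```

Calling `fetch("10.5446/31032", "citation")` would return a formatted reference if the resolver supports that format for this DOI. As noted in the talk, the formatted text may not say that the item is software, because the citation style decides what is shown.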
And this is the last slide, about the brief discussion in Dan's presentation about versioning. The paper model, where you have one version and that is it, works well until you have 100 versions or whatever, and both software and dynamic data are examples where you have a lot of versions. You also have something that does not have a version: the simplest way to refer to software is the code repository. What I showed you with the R example is actually how scientists cite software. When people say "we used R", they often do not care which specific version. So there is a use case where you do not have a version, and if you have versions because you need them for specificity, you have to link them together to the one canonical record. That is the discussion Dan was referring to. I think this is very much work in progress. The next version of the DataCite metadata schema, 4.1, planned for the end of the year, will support this; the relation already exists in Dublin Core. The other big change is a real focus on software as part of the documentation: how to use DataCite metadata for software, because there are the same concepts with different terms. Dan already mentioned this. The work is starting now, and we are happy for input on what exactly we will do and how we go about it.
I will finish with two conferences. The first is the FORCE11 conference, which will happen in Berlin in October. It is a great place to have the working group meet in person, and a good conference to go to in general, so I encourage you to go there and ideally submit interesting proposals; we open submissions at the end of the month. And if you are really crazy about persistent identifiers, we have PIDapalooza, a conference that took place in Reykjavik last year, which of course also includes thinking about persistent identifiers for software. We thought that maybe Reykjavik was too cold, so this time it will most likely be in Spain. Thank you.

Metadata

Formal metadata

Title Workflows for assigning and tracking DOIs for scientific software
Series title 2nd Conference on Non-Textual Information: Software and Services for Science (S3), May 10-11, 2017 in Hannover
Part 6
Number of parts 13
Author Fenner, Martin
License CC Attribution 3.0 Germany:
You may use and modify the work or its content for any legal purpose, and reproduce, distribute, and make it publicly available in unchanged or modified form, provided that you credit the author/rights holder in the manner they specify.
DOI 10.5446/31032
Publisher Technische Informationsbibliothek (TIB)
Year of publication 2017
Language English

Content metadata

Subject area Computer science
