
How do open infrastructures work together?

Formal Metadata

Title
How do open infrastructures work together?
Author
Contributors
License
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
Providers of open infrastructures collaborate to enable seamless scholarly communication. But what are they actually doing to realize this? How do open infrastructures integrate metadata of other providers into their services? This open community session gives open infrastructures the opportunity to let us see scholarly communication through their eyes. Throughout the session you will be able to ask the panelists anything about their services. Speakers: 00:00 Introduction by Xiaoli Chen (DataCite), 03:20 Patricia Feeney (Crossref), 07:37 Helena Cousijn (DataCite), 12:07 Jens Klump (IGSN), 15:03 Paloma Marín-Arraiza (ORCID), 19:17 Shawn Ross (RAiD), 25:26 Amanda French (ROR), 35:42 Panel discussion. Slides - DOI: 10.5281/zenodo.7129812
Transcript: English (auto-generated)
It started. Okay. So hello everyone. Welcome to the open infrastructure roundtable session of the DataCite member meeting day.
I'm very happy for you to join us today. My name is Xiaoli Chen, and I'm the project lead of the Implementing FAIR Workflows project at DataCite, and I will be your moderator today. As a short introduction: we all know that it is important for infrastructures to be robust and integrated, and this requires a shared vision and continued collaboration among stakeholders.
Today we're very happy to have speakers from Crossref, ORCID, ROR, ARDC, and IGSN, alongside our own Helena from DataCite, to join us and talk about how infrastructure organizations work together to enable metadata interoperability and seamless integration of services.
We'll kick off this session with short introduction talks from each of the speakers, who will tell us about their approach to interoperability and collaboration on three levels: the strategic level, the engagement level, and the technological level.
At that point, if we have questions from the audience that are generally applicable, we will address them. After that, we will change our perspective and look at how the infrastructures and the services they provide support each stage of the research lifecycle. That's what we'll base our curated discussion session on.
So first, a little bit of housekeeping. Please tweet about the open session using the hashtag DataCite22, and feel free to introduce yourself in the chat.
The session is being recorded. I hope you noticed that at the beginning of the session. The recording will be shared publicly together with all the slides that are presented today. And please use the Q&A tool for your questions.
Today's panelists are Patricia Feeney, the head of metadata at Crossref; Helena Cousijn, the community engagement director here at DataCite; Jens Klump, the vice president of the IGSN executive board; Paloma Marín-Arraiza, sorry if I mispronounced your name, my apologies, the engagement manager at ORCID; Shawn Ross, product manager at ARDC, representing RAiD, the Research Activity Identifier; and Amanda French, the technical community manager at ROR.
So without further ado, I will give the floor to Patricia to start the first talk. She can share her screen. Thank you.
Just briefly: I could talk about this stuff for a long time, but I'm supposed to keep it brief, so I'll do my best. I think most of you are familiar with Crossref, but for those of you who aren't, we're a membership organization consisting of publishers, funders, and other organizations that create and impact scholarly communication.
We help our members register metadata records and persistent identifiers, DOIs, for the content, grants, and other resources that they create.
The metadata registered with us is increasingly vast, and we're able to picture a new kind of vision, one that I think is shared across a lot of organizations, for how we want to grow, what we collect, and what we connect to. We want to create a rich and reusable open network of relationships connecting research organizations, people, things, and actions.
We see it as a scholarly record that the global community can build on forever for the benefit of society. It's a very grand vision, but I think it's very important. This idea of what we're calling the Research Nexus goes beyond the basic idea of just having persistent identifiers for content.
Objects and entities such as journal articles, book chapters, grants, preprints, data, of course, software, statements, dissertations, protocols, affiliations, contributors, it's an endless list, all need to be identified. And that is a very important part of the picture.
But what is most important, increasingly, is how they relate to each other, how we can place them in context, and how they combine to create this whole research ecosystem. So we're working with our membership and other organizations to provide richer metadata for a widening set of objects, and to make these really granular connections between them.
Very briefly: we have begun building this nexus already, mostly with the resources under our control, but to make this vision a reality we really need to work with other open infrastructure organizations to connect and collaborate.
We already collect ORCID identifiers, we've started collecting ROR identifiers, and we expect that will grow a lot in the future. So how do we plan to do this? We plan to do this through open metadata and persistent identifiers that span organizations, not just Crossref identifiers.
We are going to work with members of our community like DataCite, ORCID, and ROR, and other organizations like the Initiative for Open Abstracts and the Initiative for Open Citations, and we are also embracing the principles of openness.
We have open APIs, and we adopt the standards and best practices put forth by our community. Crossref staff participate in all kinds of working groups and committees, on all kinds of levels, that shape metadata. And even beyond metadata, we have a lot of discussions about how our APIs and infrastructure can communicate with other organizations.
We work with invested organizations like PKP and DOAJ to lessen barriers to registering metadata records with us, so that we can provide the rich metadata records that other organizations require. We also have a number of internal working groups and committees that welcome participation from invested organizations.
So really it's all about collaboration: how we collaborate technically and also on more of a social level. And that's it for me. Thank you.
Well, thank you, Patricia. That's a very good overview of all the collaborative efforts. Next, over to Helena. Go ahead.
Yeah, thanks Xiaoli. I thought, with an eye on the clock, I'll just share my screen straight away. So hi everyone. Thanks so much for attending this session of our member meeting, and also thanks to all the other speakers. We're really happy to have you participate and contribute to the member meeting today. I will provide a very brief introduction to DataCite. I think you will already know some of this.
DataCite is, like Crossref, a global nonprofit membership organization. We work with over 2,500 repositories around the world to provide DOIs for data sets and also other research outputs. I wanted to demonstrate how that works with a simple infographic. The building that you see on my slide is a research institution, and at research institutions many different types of research outputs are being generated.
But if these just sit on someone's laptop or on an external hard drive, then they don't become part of the research ecosystem. And that's why it's important to think about persistent identifiers and metadata.
Because if a persistent identifier is assigned, then the entity becomes part of the research lifecycle. And through the metadata, it's possible to connect the outputs to all the other entities within the research lifecycle. The metadata makes the output more discoverable, so other researchers can find and reuse the output as well.
It also makes it easier to track what's happening with the output, and that information feeds back into the research institution again. Now, our vision at DataCite is connecting research, identifying knowledge. And I think that's really relevant to this session today, which is all about these connections between open infrastructures.
So metadata is really key for that. I think Patricia also referred to that, and I wanted to show it on this slide. Imagine that in the middle, in the blue circle, let's say for the purposes of this talk, is a data set, and you assign a DataCite DOI to your data set.
Then through the related identifier property in the metadata, you can connect your data set to related research outputs. This can be the software used to analyze the data, but also an article you published about the data. You can connect it to yourself as a researcher by adding an ORCID iD in the name identifier field.
And you can connect it to your organization by adding a ROR ID in the affiliation identifier field. It's also important to consider the funder of the research, and for that we have the funding reference property.
Now, all these things together, all these connections we're making, form a graph, and this is also related to the Research Nexus that was mentioned. Through the connections in the metadata, we can get a more complete picture of how everything in the research ecosystem is connected, and how data sets connect to publications and to software and to researchers and organizations and funders.
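The connections described here can be sketched in a few lines of Python. This is only an illustration, not DataCite's implementation: the key names are modeled on the properties named in the talk (related identifier, name identifier, affiliation identifier, funding reference), while the record layout, the relation labels such as `createdBy`, and every identifier value are placeholders assumed for the example.

```python
# Hypothetical dataset record. Key names follow the properties mentioned in
# the talk; the layout and all identifier values are made-up placeholders.
dataset = {
    "doi": "10.1234/example-dataset",
    "creators": [
        {
            "name": "Example, Researcher",
            "nameIdentifiers": [
                {"nameIdentifier": "https://orcid.org/0000-0000-0000-0000",
                 "nameIdentifierScheme": "ORCID"}
            ],
            "affiliation": [
                {"name": "Example University",
                 "affiliationIdentifier": "https://ror.org/000000000",
                 "affiliationIdentifierScheme": "ROR"}
            ],
        }
    ],
    "relatedIdentifiers": [
        {"relatedIdentifier": "10.1234/example-software",
         "relatedIdentifierType": "DOI", "relationType": "IsCompiledBy"},
        {"relatedIdentifier": "10.1234/example-article",
         "relatedIdentifierType": "DOI", "relationType": "IsSupplementTo"},
    ],
    "fundingReferences": [
        {"funderName": "Example Funder",
         "funderIdentifier": "https://ror.org/000000001",
         "funderIdentifierType": "ROR"}
    ],
}


def edges(record):
    """Return (source, relation, target) triples found in one record."""
    source = record["doi"]
    triples = []
    for creator in record.get("creators", []):
        # Link the output to each identified person.
        for ident in creator.get("nameIdentifiers", []):
            triples.append((source, "createdBy", ident["nameIdentifier"]))
        # Link the output to each identified organization.
        for aff in creator.get("affiliation", []):
            if "affiliationIdentifier" in aff:
                triples.append(
                    (source, "affiliatedWith", aff["affiliationIdentifier"]))
    # Link the output to related outputs such as software and articles.
    for rel in record.get("relatedIdentifiers", []):
        triples.append((source, rel["relationType"], rel["relatedIdentifier"]))
    # Link the output to each identified funder.
    for fund in record.get("fundingReferences", []):
        if "funderIdentifier" in fund:
            triples.append((source, "fundedBy", fund["funderIdentifier"]))
    return triples


for triple in edges(dataset):
    print(triple)
```

Collecting such triples across many records is one simple way to picture the graph mentioned in the talk: each identifier becomes a node and each metadata property becomes an edge.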
And very briefly, something a bit more practical about current collaborations, because I don't know how much time we'll have later in this session. So I already wanted to mention now some of the things we're working on with the other partners that are on the call today. With Crossref, we have a service we call Event Data, and both Crossref and DataCite contribute information we have about the relationships between Crossref DOIs and DataCite DOIs.
ROR we've integrated into our metadata and also into DataCite Commons. With IGSN, and you may have seen the announcement earlier this week, IGSN registration is now available through DataCite services.
With ORCID, we collaborate on auto-update, where you can enable your ORCID record to be updated automatically with outputs that have DataCite DOIs. And with RAiD, which is a bit newer, we're currently working on how we integrate that into the metadata.
So I will stop there. Thank you, Xiaoli, and I'll hand over to the next speaker. Thank you, Helena. Next up is Jens.
Yeah, so Helena gave me a good segue into the next presentation, on IGSN. I'll start with the elevator pitch: IGSN is about acknowledging that it's difficult to track samples across institutional and system boundaries and to unambiguously link them with data and literature.
To solve this problem, IGSN provides a persistent identifier system for physical samples through a globally unique, resolvable identifier that is compatible with other persistent identifier systems.
IGSN persistent identifiers are already used by major research centers, universities, and government geological surveys, and they are endorsed by scientific publishers. And as Helena just mentioned, IGSNs are now available through DataCite to all DataCite members, so it's one service, one subscription.
What Helena also talked about is this cross-linking through the related identifier element. This is one of the important features of how IGSN is set up and what it does. On this slide, there's an example of a specimen of, in this case, kaolinite, a mineral, that is identified by an IGSN.
An optical spectrum was then measured on this specimen. This spectrum is published through, in this case, the CSIRO Data Access Portal and identified by a DOI.
And this data set of the spectrum is discussed in a publication, which itself is identified by a digital object identifier. So you can see how you can now cross-reference all these materials and move between those entities to learn more about them.
What Helena also already mentioned is that DataCite and IGSN work together in a partnership to provide registration for physical samples, to give guidance on best practices, and to provide technical solutions.
The IGSN organization develops publisher guidelines on the information that is displayed on the landing pages and on how to populate the DataCite metadata fields so that they are semantically coherent. And the DataCite samples community manager works with the DataCite members who want to assign DOIs to physical samples.
For more information, you can go to the DataCite pages or to igsn.org. Thank you. Thank you, Jens.
Next up is Paloma. Yes, so it's loading. There we are. So, most of you probably already know ORCID. Like Crossref and DataCite, ORCID is also a not-for-profit organization, and our main mission is to enable all these trustworthy connections between researchers, who are always going to be in the middle, and their contributions and other activities in the scientific and research ecosystem, such as their affiliations or any other type of action related to innovation.
So the main point is that we have people in the middle, researchers and contributors, and then we have all their production and their interactions in this scholarly ecosystem. We have, for example, their publications, their data sets, software, ideas they might have, or funding they receive.
And how can we actually connect all those? Basically, through infrastructures that communicate with each other. So this is the context of open infrastructures and of harnessing the power of APIs, application programming interfaces.
Here, for example, we see how the ORCID record might communicate with the institutional repository, with a publisher, or with the funding entity, and also how these funding entities, publishers, and institutions communicate with each other.
Using metadata and protocols, they can actually exchange all these data and reuse them, and actually contribute to that metadata reuse. Looking at one example more closely: at ORCID we work with different persistent identifiers for works, and we continue adding new identifiers to that list.
One of them is DOIs, both from DataCite and Crossref, along with the corresponding auto-update with them. We also work with organization identifiers such as ROR, Ringgold, and others.
But of course, what matters is prioritizing the inclusion of resolvable PIDs as part of the metadata associated with every item that we have, and that we provide those metadata in the public domain under a CC0 license as part of our public data file, so that others can reuse them and also integrate them into their platforms as well.
When it comes to community and collaboration, we try to keep to the community standards and adopt them, and also adapt them through this community collaboration.
Some examples are the CRediT taxonomy or the CASRAI catalog of elements, and we also stay in touch with the community through groups such as the Researcher Advisory Council or the Funders Interest Group.
Two examples in practice, or use cases, are affiliations using ROR, and works using the contributor roles. If we pay a bit more attention, we see that ORCID is connected as well with the corresponding DOIs, ROR, or the contributor roles.
In the case of affiliations, it is also connected with other identifiers such as ISNI or FundRef. It is also important that the source appears there, so that we can track metadata provenance as well.
And that will be it. Thank you, Paloma. We have Shawn talking about RAiD coming up next. There, that should hopefully have shared.
So, I'm going to provide just a bit more background, because I think RAiD is the newest of these identifiers and might be less well known, and I may even indulge in making a few comparisons to the other identifiers that have been around for longer.
So first, the Australian Research Data Commons: we're a not-for-profit company providing national infrastructure in Australia, and our relationship to RAiD is essentially that we're in the late stages of getting an ISO standard for RAiDs, under which the ARDC will be the registration authority, establishing policies and guidelines globally for RAiDs, and will also be the Australasian registration agency that operates a service for minting RAiDs. We envision other registration agencies around the world as well.
Essentially, the idea behind a RAiD is that most research takes place in the context of something that, for lack of a better term, we'll call a project: there is an individual or collaborative, time-limited entity that most research happens in.
What we're doing is providing a PID for research projects and the activities that they undertake. That consists of a handle service that provides a unique and persistent identifier, and a metadata envelope that contains a collection of other persistent identifiers and project information that is found nowhere else. It can also include named relationships to other RAiDs.
So we're expecting, for example, hierarchical use of RAiDs: parent and child, sub-projects, or activities undertaken by a project. And we're in the early stages.
I've been doing a lot of consultations recently, and we are close to having an early version of the metadata envelope ready, along with some other definitions and scope documents that I'd be happy to share if anybody's interested. Our design principles behind this are that we are a source of truth for research projects, which allows, for example, organizations to share information about projects without having to rekey it or having things get out of sync from one participating organization to the next.
But we absolutely don't have any interest in duplicating information that's held elsewhere. The metadata envelope is fairly streamlined.
We do include project information not found anywhere else, but again, it's a relatively short list of metadata attributes. The RAiD links the project to particular other entities, but we don't want to tread on anyone else's toes.
We're not trying to link the other entities to each other. So, for example, this would tie a project to its funder, its institutions, the data sets that it produces or consumes, and so on. But we don't try to, for example, tie contributors to publications, or tie any of the identifiers or entities in the outside circle here to one another.
We always get asked what a project is. I've provided a kind of definition here, but I don't want to go into too much depth with this.
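The envelope and the parent-child relationships described above can be pictured with a small Python sketch. The talk stresses that the metadata envelope is still at an early stage, so nothing here reflects the actual RAiD schema: the `Raid` class, its field names, the `hasChild` relation label, and all handles are hypothetical and exist only to illustrate the idea.

```python
from dataclasses import dataclass, field


@dataclass
class Raid:
    """Hypothetical envelope: an identifier plus links to other PIDs."""
    handle: str
    title: str
    # Links from the project to other entities, grouped by kind,
    # e.g. {"funder": [...], "dataset": [...]}. Entities are not
    # linked to each other here, only to the project.
    linked_pids: dict = field(default_factory=dict)
    # Named relationships to other RAiDs, e.g. {"hasChild": [handles]}.
    related_raids: dict = field(default_factory=dict)


def descendants(registry, handle):
    """Collect all sub-projects and activities below one RAiD."""
    found, stack, seen = [], [handle], {handle}
    while stack:
        current = stack.pop()
        for child in registry[current].related_raids.get("hasChild", []):
            if child not in seen:  # guard against accidental cycles
                seen.add(child)
                found.append(child)
                stack.append(child)
    return found


# Made-up three-level hierarchy: project -> sub-projects -> activity.
registry = {
    "raid/project": Raid(
        "raid/project", "Example project",
        linked_pids={"funder": ["https://ror.org/000000001"]},
        related_raids={"hasChild": ["raid/sub-1", "raid/sub-2"]},
    ),
    "raid/sub-1": Raid("raid/sub-1", "Sub-project 1",
                       related_raids={"hasChild": ["raid/activity-1"]}),
    "raid/sub-2": Raid("raid/sub-2", "Sub-project 2"),
    "raid/activity-1": Raid("raid/activity-1", "Field survey"),
}

print(sorted(descendants(registry, "raid/project")))
```

Keeping the links on the project record, rather than between the linked entities themselves, mirrors the design principle stated in the talk of not duplicating relationships that other identifier systems already hold.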
But essentially, we're attempting here not to duplicate anything that others are doing. So you can see a list of what a project is not. Essentially, we're defining a project as the vehicle or enterprise that undertakes research, and we do allow, through hierarchical deployment of RAiDs, for you to include sub-projects, activities, and so on.
We've gotten to the point of modeling this from a sort of typical small academic project up through quite large projects that may have three or four levels of hierarchy and 10 to 50 entities at each level.
We're really in the outreach and engagement phase right now. We've got an advisory group consisting of global experts, including representatives of most of the other PID services, and partnerships with early adopters in Australasia who are currently using an early version of the RAiD service, which we're currently rebuilding.
We've got other partnerships with EOSC, SURF in the Netherlands, Jisc in the UK, and others, and with many of the organizations who are at this webinar now. We are also undertaking other consultations, so if anyone would like to contact me, I would be happy to talk to you about use cases and requirements.
And that's RAiD. Okay. Thank you very much, Shawn. I think that generated a lot of interest in the chat and questions for RAiD as well, but first we'll move on to our last talk, by Amanda.
Hello, are you seeing the correct screen. Now we're seeing or
from what's it called the speaker notes. Yeah. Always you would think I would be able to manage this. Okay. I have too many monitors, really.
How's that work? Perfect. Great. Thanks, all. ROR has been mentioned in a number of the presentations already, beginning with Crossref, which has added support for ROR within the past year. DataCite itself supports ROR, ORCID supports ROR, so I'm happy to go last and tell you a little bit more about what it is, if you didn't already know. So ROR is the Research Organization Registry. It is a global, community-led persistent identifier project for identifying research organizations. ROR links research outputs to organizations that employ, fund, and publish scholarly researchers. It's been in existence officially since 2019 and has over 102,000 research organizations currently identified. There was a question that came up earlier about ISNI, which is of course also a somewhat open persistent identifier. And ISNI, as I mentioned in the Q&A, has 13 million objects, people, and organizations identified, whereas ROR only has 102,000. And oddly enough, that precision, we think, is one of the major value propositions of ROR. The primary use case of ROR is to identify researcher affiliations with universities that employ them, with funders that have funded the research, and with publishers that are publishing their research. With ISNI, you know, you will get identifiers for 18th-century painters, for barbershops, and all kinds of things. So the really specific use case for ROR is to identify top-level organizations that are connected with research.
And as you can see here in just a screenshot of the web interface to a ROR record, each ROR record does include external identifiers, including, where available, identifiers for ISNI. This is the actual ROR ID. It's a unique string that is randomly generated, with nine characters following the ROR domain. The ROR registry began with seed data from Digital Science's GRID, which is an identifier system that many systems did use. And in fact, Digital Science was really very much engaged in handing over, you know, their purpose to ROR, and they were very much formative in the early days. And so ROR was originally based on seed data from GRID. We've since diverged, and ROR is now the official successor to GRID.
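As a rough illustration of the ROR ID format described above (nine characters following the ROR domain), the sketch below checks whether a string has the shape of a ROR ID. The exact character rules in the pattern (a leading zero, six characters from a base32-style alphabet without i, l, o, u, and two trailing digits) are an assumption to be checked against the current ROR documentation, the function name is hypothetical, and the sample ID is used only to show the shape. The checksum is not verified.

```python
import re
from typing import Optional

ROR_DOMAIN = "https://ror.org/"

# Assumed shape of the nine-character suffix: "0", then six characters from
# a base32-style alphabet that omits i, l, o and u, then two digits.
# This only tests the shape; it does not verify any checksum.
ROR_SUFFIX_PATTERN = re.compile(r"^0[0-9a-hjkmnp-tv-z]{6}[0-9]{2}$")


def extract_ror_suffix(value: str) -> Optional[str]:
    """Return the nine-character suffix if `value` looks like a ROR ID, else None."""
    candidate = value.strip().lower()
    if candidate.startswith(ROR_DOMAIN):
        candidate = candidate[len(ROR_DOMAIN):]
    return candidate if ROR_SUFFIX_PATTERN.match(candidate) else None


# A string with the right shape is accepted with or without the domain.
sample = extract_ror_suffix("https://ror.org/03yrm5c26")
```

A shape check like this can catch truncated or mistyped identifiers before they are sent on to another service, but it cannot tell whether the ID is actually registered.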
So in terms of strategic collaborations and engagement, I actually did want to talk about the extent to which ROR itself really is a strategic collaboration. ROR is not an organization. It is really primarily supported by three organizations: DataCite, Crossref, and the California Digital Library. Maria Gould is here from the California Digital Library. She is the ROR project lead and has been for several years, and she is a core member of the team, really the lead of the ROR team. I am the technical community manager for ROR, but I am actually employed full time by Crossref. So I am Crossref's contribution to the ROR initiative. Crossref is very much supportive of ROR, and not just in sort of a technical collaboration, but really willing to support ROR in all kinds of strategic and outreach ways as well.
Our technical lead is actually employed full time at DataCite. So that's the extent to which ROR itself really is kind of a poster child, in a way, for strategic collaborations. I actually also think, if you're interested in reading about the history of ROR, that it's a kind of fabulous object lesson in collaboration as well, because these discussions really began in 2016. So there were sort of three years of continual collaboration between major players in the scholarly communication space, in which, you know, everyone realized that there needed to be an organization identifier and that the existing solutions were not really sufficient, either because they weren't open enough or because they didn't really properly express the necessary use case. So if you're interested in looking at how ROR came to be, I'd encourage you to go read about that history.
ORCID was involved in the steering group that developed the pilot, Digital Science, as I mentioned, helped form ROR, and there were just plenty and plenty of collaborative workshops speccing out what was needed for an organizational identifier, which is what ROR has become. And then currently as well, I think Patricia, in her presentation, gave a great overview of the same kinds of outreach and engagement activities that we are involved in. We are continually working with groups like FORCE11 and various open access groups to help promote ROR. But really, to me, it's been fascinating, having joined ROR relatively recently, to see how baked into the day-to-day management of ROR collaboration is. So, for instance, when we make changes to the registry, that is overseen by an advisory board that is made up of representatives from the Department of Energy and from, you know, other organizations all around the world. We have many sustaining supporters that have shown confidence in ROR by helping sustain us financially.
And we have a community advisory board that meets, you know, usually once a month, once every couple of months, to really help us see the way forward. So really just a sort of day-to-day management of ROR is continually a collaborative endeavor.
One of the things that we're doing, and we have lots and lots of events coming up, is that DataCite and ROR are putting together, for instance, a best practices workshop for integrating ROR into repository systems. So we're currently co-organizing that, and we'll be announcing that as soon as we can.
Technical collaborations. ROR has a really entirely open API that you don't need to register for and you don't need to pay for, which is not true for a lot of other identifier APIs or identifier databases. So we certainly find sometimes that, you know, in a way collaboration isn't so much necessary, because we are just sort of, as it were, leaving goods out on the sidewalk for people to pick up and use as they will. But that being said, we do work quite closely with, for instance, DataCite and Crossref, who of course are strategic collaborators, to make sure that ROR can be supported in their APIs. This screenshot is from the DataCite API, where you can see ROR being used as an affiliation identifier. DataCite is currently calling for feedback on its new schema and asking for input on, for instance, using ROR IDs as a unique identifier for publishers. So if you haven't already taken a look at the DataCite schema proposal for changes, do take a look at that. ROR has been widely adopted by what I think of as meta-infrastructure projects.
Big list here. So anyone who's really interested in tracking something across the global system of scholarly communication has found ROR really useful. OpenAlex, which is the successor to the Microsoft Academic Graph, has done a really tremendous lot of work mapping affiliations using ROR in its product, and so I encourage you to take a look at that. But we're also seeing a lot of individual publishers and repositories beginning to sort of bake ROR into their systems and then, in many cases, send those identifiers to Crossref and DataCite.
And that's really what we hope we'll see more and more of in the next couple of years in particular. So ROR is most often used to indicate researcher affiliation as it is here.
But we're also seeing kind of additional use cases for the ROR identifier, including to indicate who funded research. There is already the Crossref funder ID and discussions have just begun about how to potentially merge the Crossref funder ID with the ROR ID.
So we're just beginning to look at how that's working. And then just I guess I'll say finally that, you know, ROR is a completely open project, as I mentioned before. So we provide an open data dump. We have an entirely open REST API with no fees and no registration.
So we think that that's one of the reasons why people are finding it increasingly easy to adopt ROR, whereas with some other identifiers, that might not be the case. And I'll stop there.
Thank you, Amanda. A really thorough overview. During this presentation, I saw a lot of Q&A happening, and we have a couple of open questions. I think we'll address some of them. Now we have a question that has four upvotes, from Allison: do ROR identifiers reflect organizational hierarchies, for example? Do we want to address that, Amanda?
I'm sorry. I was reading the chat questions. The Q&A questions.
Right. The question is, does ROR address hierarchies? I was just typing an answer to that. And the answer is no. And that is a question that we get a lot. You know, we heard that RAiD gets the question a lot: what is a project? ROR gets the question a lot: are you going to represent departments?
And the answer is essentially no. So in a way, I think of it as sort of, you know, number one, the most urgent use case that people have is for that institutional level identifier. Which doesn't really exist. So I think of it at least as we need to, you know, help get a really robust solution in place for that problem first before doing departments.
But the truth is that we may never do departments because when you think of it, ROR is a global identifier, right? I mean, we're even just sort of identifying all of the companies and labs and facilities and research institutes and universities that are touching research as a big project in and of itself.
When you think about all the churn that goes on at the department level at universities worldwide, you know, I mean, departments get shut down, they get renamed, they get moved, you know, that would be a lot to handle. So it's really kind of out of scope for ROR right now to handle the department level and just sort of really extremely difficult.
So that being said, because ROR is entirely open, and the code is entirely open and the data is entirely open, people can build their own departmental ontologies on top of ROR to interoperate with it. And in fact, there is a sort of demo pilot project that has been done to do that with the VIVO ontology, that comes out of the University of California San Diego, I think. Okay, I hope that answered your question.
Thanks, Amanda. I do want to move us on to our discussion part. There are still some open questions in the Q&A and we can still look at them.
So, let me actually share my screen here. As I mentioned, we want to conduct this discussion within the framework of the research lifecycle, and this corresponds to an ongoing effort at DataCite: we have the Implementing FAIR Workflows project, in which we will address all of the identifier and metadata use cases throughout the research lifecycle. So we want to base this on the PID-optimized research lifecycle that is summarized by the MoreBrains project, which roughly divides the process into four stages: the pre- and during-grant-application stage, the research stage, the publication stage, and the reporting stage. And we have our guest speakers commenting on the particular stages for which they have, you know, targeted services based on their infrastructure. So starting with the grant application and approval stage, I would particularly like to hear from Patricia from Crossref about the funder and grant IDs they have, and from RAiD about how they coordinate the RAiD ID. So, Patricia, would you like to start?
Sure, sure. I'm in the middle of a really big thunderstorm, so if you hear booming, or if I disappear from the screen, that's what's happening. It's very intense. Yeah, so at Crossref we maintain a funder registry where we assign identifiers, DOIs, to funders. The registry is updated regularly. We try to update it monthly, but it really depends. Currently, Elsevier actually donated the initial data for that registry and is curating it, and then we do some stuff to make it work. And the funder registry is completely open, in that you can download an RDF file and look at what funders are available.
In all of the records registered with Crossref, you can provide funder information that includes the funder identifier. But we also, as of a few years ago, started working with funders to register identifiers for grants, and those identifiers actually identify a lot of information surrounding the grant: who's the funding organization, what type of funding is it, how much money, who's involved. There's a lot of project metadata involved in that as well. And those are really starting to pick up a bit. So we are hoping, you know, as we grow our funder membership and grow the number of grants registered with us, they'll become a really important part of this cycle.
Thank you, Patricia. Shawn?
Yeah, one of the most common use cases that we get, and this is why I lobbied Xiaoli to put RAiD in the early part
of the life cycle as well, is that, say, multiple universities are preparing a grant application to an external funder they need to
Many local research information systems, CRISs, have the concept of a project and, you know, quite a bit of, according to the
There's likely to be a lot of double entry of data across organizations right now, and we're trying to combat that and provide a single source of truth. So you're putting in a grant application. The lead organization might push the information from their local CRIS into RAiD. We will be providing a landing page for each project, which we are provisionally trying to model on an ORCID landing page, but it's an API-first service. So they might push that information up to RAiD, and then the other organizations that are on the grant application can pull it down locally into their systems. And yeah, in my own experience as a researcher, the project often precedes the grant and may exist for some time, trying to get some runs on the board before a grant is even applied for.
So I think that's another one of the questions that come up all the time: what is a project? And another one: what's the difference between a project and a grant? And I guess I have an easier time with that coming from a HASS discipline, where we do lots of research without grants. But then also, you know, I'm an archaeologist, and the big archaeology projects I've been part of have gone on for years and have had multiple grants. So it's a sort of many-to-many relationship between grants and projects, and I think RAiD is a great way to sort that out and coordinate across them. Even in some CRISs, like the one that will remain nameless that we use at my university, there is a bit too much of an equation between grants and projects, and it makes some aspects of managing things difficult, in that there's a lot of rekeying of information as you go from one grant to the next, so longitudinally on a project as well.
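The many-to-many relationship between projects and grants mentioned here can be sketched as a small two-way index. This is only an illustration of the idea, not how RAiD or any CRIS stores its data; the class name, method names, and identifiers below are all hypothetical placeholders.

```python
from collections import defaultdict


class ProjectGrantLinks:
    """Hypothetical in-memory index of many-to-many project/grant links."""

    def __init__(self):
        self._grants_by_project = defaultdict(set)
        self._projects_by_grant = defaultdict(set)

    def link(self, project_id, grant_id):
        # Record the link in both directions so either side can be queried.
        self._grants_by_project[project_id].add(grant_id)
        self._projects_by_grant[grant_id].add(project_id)

    def grants_for(self, project_id):
        return sorted(self._grants_by_project.get(project_id, set()))

    def projects_for(self, grant_id):
        return sorted(self._projects_by_grant.get(grant_id, set()))


# Placeholder identifiers: one long-running project funded by two grants,
# and one grant that also supports a second project.
links = ProjectGrantLinks()
links.link("project:excavation", "grant:first")
links.link("project:excavation", "grant:second")
links.link("project:survey", "grant:second")
```

Keeping the project as its own entity, rather than equating it with a single grant, means the project description is entered once and each new grant is just another link.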
So, yeah, I think those are a couple of the use cases in the context of grant applications.
Thank you very much, Shawn. Now we move on to the next stage, which is the research stage, and here I think that DataCite and IGSN have a lot to say.
Would you like me to start? Yeah, I think an important thing to say here, and I didn't have time in my short introduction, is that DataCite DOIs can be registered for a wide range of research outputs. And some of these sit really early in the research process and in the research lifecycle. So for example, even before the research project starts, a researcher may want to register a DOI for a data management plan, a DMP ID, and also, for example, preregister the study and assign a DOI for that. And then during the research process, before the research is completely finalized, there may already be intermediate outputs. Those can be data sets or software or workflows that can already be shared and get a DOI. And then one other thing I think is important to consider when these outputs are being made available along the way is that you can continue to update the metadata, and you can continue to establish those connections to all these other entities and these other persistent identifiers. So this is really a process, not something that should be done once but something that should continue over time to ensure this richness of connections.
So yeah, maybe I'll pause there. I know there are other people that probably want to comment.
Yes, I also want to call on Jens here in this stage. I think that's the primary use case for IGSN.
Yeah, absolutely. I think that IGSN is a key element in this stage of the lifecycle, where it allows us to identify the objects that we work with, or that we produce as physical objects, in the course of our research. And I call them the anchor into reality, where data and publications are kind of abstract things. They don't have a physical reality unless you print them out, but who does that? But the physical samples really have a physical presence and, in this way, some properties they don't share with the digital objects. And to be able to unambiguously identify what you work with is something that is crucial for reproducibility and transparency of science and other fields of research. So IGSN plays an important role in this phase.
Oh, thank you, Jens. With an eye on the clock, I would merge the last two stages, the publication and reporting stages, and I think these are both stages where ORCID and ROR have very strong roles. So, Amanda and Paloma, maybe you have some comments. Paloma, would you like to go first?
Yeah, I can go first. So, actually, ORCID has a transversal role, I think, in all stages. For example, Helena was mentioning the data management plans. In those plans, you also need to state who is going to be responsible for that data, and in order to identify the people responsible, ORCID might be there playing a role. Then, of course, when applying for grants, people can identify themselves with their ORCID iD. And, of course, there is the part of publication, or showing the results and the outputs, and guaranteeing that those outputs are connected with the corresponding contributors. And this can be done at the level of authorship, but also at the level of other types of contributorship, for example if someone has curated the data, or if someone has coded a particular piece of software, or if someone has reviewed the final writing.
Thank you, Paloma.
Amanda?
Sure, yeah. One interesting thing is that I think a lot of the publishers who have been adopting ORCID are beginning to use that for internal reporting, including compliance with OA policies. We haven't seen a lot of tools yet that are built on ROR, allowing an institution to track its own research output. But that is absolutely a thing that ROR is designed to do, and I have every confidence that those tools will continue to be built. I actually think that DataCite Commons is a really good example of the kinds of things you can do with the ROR ID and the kinds of institutional browse you can get. Because if you go to DataCite Commons and begin to look at the works by institution that DataCite Commons enables, you'll see the kind of thing that ROR can help do. It's just that not a lot of that has been adopted in public interfaces, public tools, yet. But it's certainly one of the things. Crossref did a survey, I think a couple of years ago, in which they asked their own members, which are mostly publishers, what one thing they wanted. And one of the key things was the ability to find their own work by institution.
Right. Thank you, Amanda. So, I noticed that we have a good question in the Q&A that is still open, that is specifically addressed to Shawn about RAiD: you said RAiD could be used between project partners to share information as well as for grant applications. How open would the RAiD landing page be? Surely people won't want information relating to a grant application to be open before the grant has been awarded?
Yeah, there has been heated discussion about this internally at the ARDC. No, it's been quite productive. We are looking at the ORCID model of open, trusted entities, private. It's a bit more complicated because a project will often, though not always, have multiple participants or contributors. At minimum, in the next version of the service that's out, we're going to look at doing embargo periods and open/closed as a first step, and then do a bit more requirements gathering on what additional level of nuance is needed in open and closed. Is there ever a use case where some metadata might be open and other metadata might be closed, or is it really just an all-or-nothing thing? We recognize these use cases, like the one here, where people want to keep the project under wraps until the grant's won. We also recognize a use case that was brought to us in some of our conversations about industry-funded research: sometimes a university has to sign an NDA, a nondisclosure agreement, and, you know, they can't tell people about the research, usually for a period of time. And so we're looking at an embargo period as the sort of first stop on this. And if we need more nuance than that, bring me use cases and I'll try to see what the simplest solution to them is.
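The "open/closed plus embargo period" first step described here could look something like the following minimal sketch. It is an assumption about how such a rule might be expressed, not the RAiD service's actual data model; the record fields and function name are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class ProjectRecord:
    """Hypothetical project metadata record with all-or-nothing visibility."""
    title: str
    is_open: bool = True
    embargo_until: Optional[date] = None  # hidden before this date, if set


def is_publicly_visible(record: ProjectRecord, today: date) -> bool:
    # A closed record is never shown; an open record is shown once any
    # embargo date has been reached.
    if not record.is_open:
        return False
    if record.embargo_until is not None and today < record.embargo_until:
        return False
    return True


# Example: a grant application kept under wraps until a placeholder date.
pending = ProjectRecord("Pending application", embargo_until=date(2030, 1, 1))
```

This treats the whole record as either visible or hidden. The open question raised in the discussion, whether some metadata fields should be open while others stay closed, would need a per-field rule instead.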
Thank you, Shawn. And anonymous question asker, I hope that answers your question. And lastly, yes, you have your hand up.
Yeah, I just wanted to add to what Shawn just said that a couple of years ago the Royal Society published this report, Science as an Open Enterprise. And they, I think, put it quite nicely, this principle of intelligent openness: to be as open as possible and as closed as necessary. And Shawn touched on a couple of use cases, but there are others as well. There's not just confidentiality, but there are also cultural norms, and vulnerable subjects or vulnerable objects that need to be protected, where you cannot put that information online. So there are a lot of legitimate reasons why we have to consider intelligent openness and provide means of controlling what is released to the public and what isn't, and also what happens when you cross-link, so that you cannot deduce hidden information implicitly from the knowledge graph.
Yeah, sorry, I'll just add very briefly: I mean, we do want this information to be as open as possible, but we also recognize the "as closed as necessary". We're optimistic, I'd say, at the beginning, in the sense that, in speaking to one of the operators of one of the largest, if not the largest, data repositories in Australia, the Australian Data Archive, which does social science data, most of their data sets are sensitive, but in almost all cases they can make the metadata about the data set available. And since we're only dealing in metadata, we're sort of cautiously optimistic that most of the time we will be able to make this metadata available. And because we do track the history of the project, we have start and end dates for all the contributors and organizations and so on, it does make a nice encapsulation of the history of a project that could be important metadata for outputs.
So with that, we're going to conclude the session. Thank you to all of the panelists, and thank you to all of the attendees for joining us today. I just want to say that you are seeing that DataCite is working closely together with all of our sister organizations, who address all of these use cases, and we can only grow stronger and more robust with our members' continuous support and with you pouring your metadata into this graph of identified infrastructures. So please remember to share accurate and rich metadata. And that's a takeaway message
for you. And that's where I'm going to wrap up, and let's make a pitter place together. That's a call out to that. That's your session if you were there. Thank you.
Thanks everyone. Thanks. I'm going to close this now so we can start the next session so. Thanks. That's great. Thank you.