
ROR and ORCID

Formal Metadata

Title
ROR and ORCID
Title of Series
Number of Parts
36
Author
License
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
36
Transcript: English (auto-generated)
Hi everyone, welcome back. So we're joined by Tom and Maria now, who are going to talk to us about ROR and ORCID. I hope you're both well; it's great to have you here with us. Sorry, we're getting some feedback there. Great, shall we begin? Yeah, let me just add your slides and then, oh okay, so I'll remove Jason and myself from the presentation and then the floor is yours.
All right, hello, can you hear me all right? Yep, I can hear you, Maria. Okay, excellent. Sorry, apologies for the
technical glitch just now. I think I had too many versions of this going in various tabs. So let's get started. Thanks everyone for tuning in. In this session, I'm going to be talking about the Research Organization Registry, better known as ROR, which provides open
identifiers for institutional affiliations. A quick introduction to myself: I am the ROR project lead and I work on persistent identifier services at California Digital Library, which is
one of the steering organizations behind ROR. So first of all, I just wanted to quickly thank Simon for inviting me to do this talk. Very happy to be here. Also a little intimidated: I feel like a bit of an outsider or interloper and consider myself very much a wiki newbie, but I'm excited to be here to talk about ROR and identifiers and also to have Tom as part
of this discussion as well. I'll get to that in a second. So four goals for this session today. First, my goal primarily is to make sure that everyone tuning in knows about ROR. Second, I hope you
get excited about ROR and what we're trying to do, and third, similarly excited about how ROR might be used in conjunction with Wikidata and other platforms and data to link up research. And lastly, as mentioned, being a bit of a newbie, I myself am hoping to learn from this session as well, and I'm really eager to hear your thoughts about ROR and Wikidata. So it'll be great
to share this session with Tom Demeranville from ORCID, who will be talking after me about what ORCID is working on with regard to institutional affiliation data. So I'm going to do my bit, then Tom will do his bit, and then we'll have some discussion and Q&A after both of
us have spoken. So I hope that sounds good to everyone. So let's start with a quick overview of ROR. ROR emerged in the context of this ever-increasing challenge of how to connect different components of the scholarly research landscape together. So how do you identify and
track researchers and research outputs and research institutions and how do you connect them to each other? We've already heard earlier today about some of the challenges with identifying institutional affiliation data. That's what I'll be focused on today, but this is obviously a much larger challenge that goes beyond affiliations.
And persistent identifiers are really key to making these connections. And in this context, there had already been a lot of effort to solve part of this puzzle with developing ORCID IDs for researchers and DOIs for research outputs. But for a long time, there has been this missing piece in the puzzle as there was no open and community-driven standard for identifying
research institutions. So what we've ended up with is affiliation data that's all over the place: multiple versions of names and multiple free-text variants, or perhaps no affiliation data collected at all, or perhaps some proprietary affiliation data locked in a commercial database.
So we launched the ROR registry in 2019 to develop open, community-driven identifiers for organizations that could be plugged into existing research infrastructure that already relies on ORCIDs and DOIs. So the bigger-picture goal with ROR is to be able to use ROR IDs in
scholarly systems to identify institutional affiliations, to then get those ROR IDs embedded in research infrastructure and metadata, such as sending ROR IDs to Crossref and DataCite when DOI metadata is collected, and then to be able to really complete the puzzle that
shows the full picture of who is doing the research and where the research is coming from. Bottom line, we think that this institutional metadata should be freely and openly available to all. We believe that institutions and libraries and researchers should not have to pay to access data on their research activities. So in a nutshell, just to summarize what
ROR is trying to do, it's an open and non-commercial registry of organization IDs and metadata. It is specifically focused on capturing top-level affiliations for research organizations. So it's explicitly not focused on mapping department-level hierarchies or identifying
legal entities. ROR is especially designed to connect research outputs to research organizations, and it is also completely open: it is being developed as a community-led project driven by leaders in open research infrastructure. So these characteristics
are what set ROR apart from similar types of identifiers for institutions that are also out there. So I'm going to take you all on a very quick tour of what's in the registry right now. We already have a bunch of data that anyone can work with. There are at present about 99,000 organizations in the registry, and there are a few different
ways that you can access and use ROR data. One way is through a simple search on the ROR website. In this example, I'm showing the results of a search for California Digital Library, where I am affiliated. Some of you might be wondering how this data gets into
ROR in the first place and how we're maintaining it. We actually launched ROR with seed data from the GRID database, which was created by Digital Science, and Digital Science was part of the early startup phases of the ROR project to help launch the registry. Starting with GRID meant that we didn't have to build ROR from scratch. And what we're
working on now is setting up ROR to curate its own data independently and with community input. In the meantime, ROR is essentially mirroring GRID, and the work that we're doing right now is building out ROR on top of that. Every organization in ROR has a unique ROR ID. This is a URL that will always resolve to the organization's record.
The ID itself is an opaque string. It starts with a leading zero followed by six characters and a checksum. In addition to the ROR ID, we have metadata about each organization in the registry, including multiple versions of the organization name, among them names in other languages.
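As a rough illustration of the ID structure just described, here is a minimal Python sketch that checks whether a string has the shape of a ROR ID. The leading zero, the six characters, and the trailing checksum come from the talk; the exact character alphabet, the two-digit checksum length, and the `https://ror.org/` prefix are assumptions made for this example, and the helper name `extract_ror_id` is hypothetical. The sketch checks only the shape and does not verify the checksum.

```python
import re
from typing import Optional

# Assumed shape: a leading "0", six lowercase base-32 style characters
# (the alphabet here is an assumption), and a two-digit checksum.
ROR_ID_PATTERN = re.compile(r"^0[a-hj-km-np-tv-z0-9]{6}[0-9]{2}$")
ROR_URL_PREFIX = "https://ror.org/"  # assumed resolver prefix


def extract_ror_id(value: str) -> Optional[str]:
    """Return the bare ID if `value` is shaped like a ROR ID, else None.

    Only the shape is checked; the checksum itself is not validated.
    """
    candidate = value.strip().lower()
    if candidate.startswith(ROR_URL_PREFIX):
        candidate = candidate[len(ROR_URL_PREFIX):]
    return candidate if ROR_ID_PATTERN.match(candidate) else None


# The sample string below is used purely to illustrate the shape.
print(extract_ror_id("https://ror.org/03yrm5c26"))
```

A shape check like this can be useful for catching obvious typos before an ID is stored, but resolving the URL remains the only way to confirm that a record actually exists.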
We can support multiple character sets. We have acronyms and aliases for the organization, as well as the URL and location. There is some metadata that is currently available in GRID that is not yet available in ROR, but that ROR will be incorporating soon, such as some very basic parent-child relationships and some more granular location data. The ROR records also include
other identifiers for the organization when these exist. That interoperability is really important to help support wider adoption of ROR and to have these crosswalks between different types of identifiers. So, first and foremost, I want to point out that ROR includes Wikidata IDs
in the data set, which is really exciting. ROR also maps to GRID and to ISNI, the Crossref Funder Registry, and a few other identifier types. In this example that I'm showing here, you don't see the Funder Registry, for instance, because CDL is not a funder, but it would be present if CDL were one. Another way you can search and filter organizations
in the registry is through ROR's open API. We also have a full public data dump that can be downloaded. We do a public data dump each time there is a new release. We have a few tools as well to help facilitate working with ROR data in the registry and cleaning up
affiliations. So, we have an affiliation matching function in the ROR API that allows you to feed it a free-text affiliation string and then have it matched to the corresponding ROR ID. So, that's what I'm showing on the screen here. And we also have a reconciler that works with OpenRefine, so you can load a list of affiliation strings into OpenRefine and map
those to ROR IDs. So, that was a quick whistle-stop tour of what's in the registry right now. I'm just going to share a couple of quick examples of how ROR IDs are starting to be implemented in research infrastructure. So, the point that I really want to emphasize
is that the registry really isn't useful on its own. The aim is for ROR IDs to be implemented in various systems and captured in the metadata that those systems produce. So, we have one example that I wanted to show today, which is this integration with Dryad for data
publishing. So, Dryad maps their institutional affiliation lookup to the ROR API. So, when a researcher is submitting a data set to Dryad, the system is making a call to the API, and the researcher is choosing their affiliation from the controlled list of options
coming from ROR. So, to the researcher, it's pretty invisible and seamless. They don't even have to know that ROR is operating in the background. Dryad can then store the ROR IDs for its data sets in its database and then deposit those ROR IDs when it sends metadata to DataCite. And then you can query DataCite. I'm showing the public search interface here,
but this would also be available in the DataCite API, to be able to search research by specific affiliations coming from those ROR IDs. DataCite is also starting to do some interesting work with research discovery using identifiers. So, I'm showing an example here
of the DataCite Commons discovery tool, which basically maps ROR IDs into DOI metadata and ORCIDs and other types of identifiers as well, to be able to visualize different research activities. So, there are a number of other integrations that are already completed or
underway since ROR launched last year. This is a list of various implementation examples that I am aware of. There may be others out there. We're really going to be encouraging wider adoption this year and in the coming years, especially focused on publisher metadata and encouraging the implementation of ROR IDs for affiliations so that those affiliations can
be sent to Crossref, which will be supporting ROR in its metadata schema. So, just to close with a couple of thoughts and information about where ROR is going next, we're really focused
right now on, as I mentioned earlier, developing curation in ROR and incorporating extensive community input into how we will be updating and maintaining the records going forward, and into the kinds of processes and infrastructure that we need to add records
and maintain them over time. We're also working on driving wider adoption and then, lastly, supporting those who are doing things with ROR data or want to do things with ROR data, getting feedback from those users, and understanding more about downstream use cases. So, if you are excited and interested in learning more about ROR and getting involved,
I just wanted to share some links here about how you can do so. We also have a community advisory group that meets periodically to share updates and give feedback, so that's another way that people can get involved. So, this is going to conclude the ROR portion of the session.
Thank you very much for listening. I'm really excited now to turn this over to Tom, who will talk about ORCID and ROR and Wikidata and how all of these things can be connected.
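The affiliation matching function mentioned earlier in the talk takes a free-text affiliation string and returns candidate ROR records. The Python sketch below shows one way a client might build such a request and pick a match from the response. It is a sketch under stated assumptions: the endpoint path, the `affiliation` query parameter, and the response fields (`items`, `chosen`, `organization`, `id`) reflect the public ROR API as best understood here and should be checked against the current ROR documentation; the helper names are hypothetical, and the sample response is hand-written for illustration rather than real API output.

```python
from typing import Optional
from urllib.parse import urlencode

# Assumed endpoint; the path may differ between ROR API versions.
ROR_API_BASE = "https://api.ror.org/organizations"


def build_affiliation_query(affiliation: str) -> str:
    """Build a request URL asking the API to match a free-text affiliation."""
    return f"{ROR_API_BASE}?{urlencode({'affiliation': affiliation})}"


def pick_chosen_match(response: dict) -> Optional[str]:
    """Return the ID of the candidate flagged as chosen, if there is one.

    The response shape used here is an assumption for this sketch.
    """
    for item in response.get("items", []):
        if item.get("chosen"):
            return item["organization"]["id"]
    return None


# Hand-written sample in the assumed response shape (not real API output).
sample_response = {
    "items": [
        {"chosen": False, "organization": {"id": "https://ror.org/00000000"}},
        {"chosen": True, "organization": {"id": "https://ror.org/03yrm5c26"}},
    ]
}
print(build_affiliation_query("California Digital Library"))
print(pick_chosen_match(sample_response))
```

Keeping URL construction and response handling separate from the network call makes it easier to use the same logic in a submission form like the Dryad lookup described above, where a person still confirms the suggested organization.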
So, I'm going to assume everyone can see this. I only have one monitor, so I can't tell, but I'm going to assume and plow on regardless. So, I'm going to follow on from Maria's presentation and talk a little bit about ORCID, ROR, and Wikidata. I'll be talking
about where we are, how we got there, and where we're going to go next in terms of affiliations, but I'll also be talking a little bit about what makes ORCID different from other approaches we've seen to author disambiguation. So, first off, about ORCID. For those that are unaware, we're a not-for-profit
community-governed organisation. We were formed around 10 years ago as an initiative to really try and solve this disambiguation problem, and it was a collaboration of institutions, funders, publishers, all sorts, in fact, came together and realised they wanted a single independent
community solution to this problem rather than any one commercial provider going it alone. Our vision really is a world where everyone is connected to the things they've created and they're uniquely identified, and we do this by providing people with an ID,
and they use it during their normal workflows, and this helps disambiguate them from other similarly named people. What's key here, and the key difference I think between ORCID and other ways of doing it, is that the ID and the metadata attached to it are all owned and controlled by the researcher, so researchers essentially curate their own record. They add
things, they can update things, but they can also delegate permission when they authorise their ID on another service, so they sign in. They can grant that service permission to kind of take over some of the tasks on their behalf, and they get to choose what they make public and what
they make private. So we see, for example, the vast majority of academic outputs are made public and the vast majority of email addresses, for example, are made private. So we really help to connect people, places, and things, so we connect to works, affiliations, which we're talking about now, but also things like funding and peer review, and when we do that we try and use other persistent
identifiers so that there's always a link to more information, more detailed metadata about those items. For affiliations, which I'll get to in a minute, we support Ringgold, FundRef, GRID, and soon ROR, but for works we support a vast range of different identifiers, but
mainly we see DOIs. We provide a set of APIs to the public and to our members, and this enables the authentication, but it also enables things like reading and writing metadata that are attached to IDs. By the numbers, we recently stopped talking really about how many IDs we've created, and we started talking a bit more about active users, and there are around 4.8
million of those, depending on how you count them, but this number is defined by the number of people that have actually used their record in the last 12 months. We're a community organization, sustained by just over 1,100 member organizations spread all across the globe, and between them there are two and a half thousand integrations, and 353 of them
are actually updating records. They add over 1.1 million works to the registry every month, and that's been consistent for over a year now, and as I've seen in numerous presentations today,
there are 1.6 million ORCID IDs in Wikidata at the moment. I'm just going to take a little aside away from affiliations, just to talk about one particular project, to highlight how ORCID and Wikidata are being used in the wild together to solve problems. So, Bionomia: I know
someone from this project is on the call, so he'll know far more about this than I do, but I've watched them grow, and they're using not only ORCID and Wikidata, but they're also using Zenodo, which means DOIs, to link together the people that collect natural history specimens
with the specimens themselves using GBIF IDs, and they're doing this by giving ORCID IDs to living researchers, but Q numbers, Wikidata Q numbers, to the deceased ones, and then they're linking it all together with DOIs as well, and that's automatically updating ORCID records, and it's
just a brilliant example of this kind of hybrid model where you have researchers on the one hand curating their own pieces of information, but also you know professional curators, or just volunteers helping out as well, and between them they've created 13 million links, which I think is amazing, and there's more information to be found in their footnote there. So getting back
to the topic, ORCID and affiliations: we added support back in 2013 for the employment and education affiliations, of which I think employment is the one that's most important to the
discussions we're having today, but we have expanded it to other things, right, so you can, for example, note that a person is a member of a scholarly society, or that they are on a conference review panel. They're usually associated with at least one org ID, but not always. Three and a half million IDs within ORCID have 11 million affiliations
between them, and I think out of those 3.5 million, 2.9 million have at least one organisation identifier attached to their record, and these are added to the record either by the individuals themselves or by ORCID member integrations, so we're increasingly seeing
universities updating records for the researchers with the researcher's permission, and that accounts for about 10% of affiliations at the moment. So as I said at launch we supported Ringgold, which for those who don't know is a proprietary organisation ID solution, quite often used in publishing. Later on we added support for
Crossref Funder IDs and GRID identifiers, and as Maria mentioned, GRID is run by another commercial entity called Digital Science, but, unlike Ringgold's, the GRID database is in fact CC0. We're developing ROR support right now, and this is kind of what it looks like. So this is an education affiliation that's been added by Oxford University themselves to one
of their scholars. This scholar studies condensed matter physics, and what we can see here is the Ringgold ID, and we can see that we also know that that Ringgold ID is associated with that ISNI and that OFR ID, because that's what's found in the Ringgold organisation ID database. I'm just
going to talk about some of the decisions we've made in the past and the problems they are causing us. We've come an awful long way, but we have definitely made mistakes in the past that we're trying to rectify now. So problem one: I think Simon showed this exact
screenshot earlier, and in fact he brought this particular instance to my attention during the week, well last week, but when we first launched we thought it'd be a great idea for people to be able to rename the metadata that comes from organisation identifiers in their own record, because essentially they're in control and they should be in control of how this information is displayed. So we see people, for example, renaming University of Oxford to Oxford University
or Trinity College, Oxford University, but it turns out it was a terrible idea. You can see from this example the person who has entered this has renamed one university to a completely different university, and what this means is a human being looking at this would see the University of
São Paulo, but a machine will look and see the University of Santo Amaro, which are obviously completely different, so it does definitely cause us problems, and given we have 4.8 million active users, it's pretty hard for us to curate this actively due to scaling issues. So we have
to find other methods to help people help themselves. The second problem was that as we introduced new org IDs beyond our initial one, we realised that these databases or systems contain overlapping sets of organisations. So this is the University of
Sheffield. This is one of the worst examples I could find, in that there are two entries in this drop-down for a user to pick from, and they're both exactly the same from a normal user's point of view. We wanted to remain neutral, which I think was really important, but in other ways it's really bad. This is not a great user experience,
and it causes other problems, right, so it causes problems in the metadata when you can't tell that one ID is the same as another. So here we can see the two representations of the British Library from two different identifier systems, and we actually have no way of saying
that they're the same. Quite often GRID records will have ISNIs, but this one does not for some reason, and this means that we can't develop an algorithm to match these reliably. However, we can do this for around 40,000. We can actually match between
ROR and Ringgold, which is really going to help us, and I'll show you why in a minute. So problem four was around governance and openness, right, so ORCID is open infrastructure, community governed, all of the metadata in our APIs and data files is CC0, and openness is one of our core values, right, but of the organization IDs I've talked about, two of them
don't have community governance, and one doesn't even have open data, and we have to have a special arrangement where we can essentially use subsets of their data and put it in our own open data and apply a CC0 license to it, which is problematic. So where do we go next?
Well, getting back to ROR, we really want to use ROR, and we're going to use ROR to help solve those problems in various ways: we want to reduce user confusion, reduce complexity for our integrators, and reduce the reliance on non-open IDs. We want to increase use of our
open infrastructure, add more connections to the registry, and hopefully increase the utility and value we can offer to the community. So we've added ROR to the roadmap, and we're going to use it to join other identifiers together where possible. I think we will be able to create about 40,000 groups where we group things like Ringgold and Wikidata numbers
and ROR together, and that will cover around half of the 11 million affiliations in the registry at the moment. What this means in practice is that once this goes live, we should see overnight half the affiliations within ORCID have a ROR ID and a Wikidata ID. That's the hope.
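The talk does not say how these groups will be built. As one illustrative possibility, equivalent identifiers can be grouped with a union-find structure: every known crosswalk pair merges two identifiers into the same group. The Python sketch below is not ORCID's implementation; the function name `group_identifiers` and all identifier values in the usage example are hypothetical placeholders.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def group_identifiers(pairs: Iterable[Tuple[str, str]]) -> List[Set[str]]:
    """Group identifiers that are linked, directly or transitively, by pairs."""
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        # Register unseen identifiers as their own group root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups short
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two groups

    groups: Dict[str, Set[str]] = defaultdict(set)
    for identifier in list(parent):
        groups[find(identifier)].add(identifier)
    return list(groups.values())


# Hypothetical crosswalk pairs; none of these are real identifiers.
crosswalk = [
    ("ringgold:1111", "ror:0aaaaaa11"),
    ("ror:0aaaaaa11", "wikidata:Q1"),
    ("grid:x.1", "ror:0bbbbbb22"),
]
for group in group_identifiers(crosswalk):
    print(sorted(group))
```

With groups like these, an affiliation recorded under any one identifier in a group could be reported with the group's ROR ID, which is the effect described above.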
We're going to update the user experience to get rid of the problems I've shown you earlier, so hopefully we can use raw as a baseline given it is another piece of open infrastructure. I think we should be able to make it that the kind of the default go-to identifier moving
forward, and we'll do that for our API response and our metadata responses as well, so you'll be able to tell and just pull out raws. When we know them, we'll let you know, and the same for search. And the other thing is we're just going to, and this is kind of a side effect, but we are going to prevent manually created affiliations from being renamed.
So this means if it's got an org ID, it'll have the name of the org ID, and if a person wants to rename it, the affiliation will no longer be associated with the ID. So that was kind of a whirlwind tour of ORCID and affiliations, and yeah, I'll hand back to Maria and the group, and hopefully we can have some questions. Thanks so much, Tom.
Yes, thank you both very much. That was really interesting. So yeah, I'm going to make a quick comment that leads into one of the questions that we've got on
the etherpad. It's a question for Maria. So the first thing I did when I discovered ROR as an ID was go into Wikidata, and I found that it's been added to the National Library of Wales. So thumbs up for that. And so the question is, what type of organizations qualify for a ROR ID
and how flexible is that? For example, someone's asking, could you add IDs for Wikimedia user groups? Right. Yes, Jason, that's a great question.
Thanks for asking. So as I mentioned in the talk, the focus of ROR is on capturing affiliations for research organizations. So there's quite a lot of flexibility within that definition, and I'll get into some of the nuances in a moment, but first and foremost, it really means that ROR is meant
to identify organizations that are in some way involved in research, primarily producing research if you think about an affiliation that would appear next to an author's name on a research article or on some other research output, but it also encompasses
other types of research activities like employing researchers, disseminating research, in some cases, publishing, funding, facilitating research. For instance, there's a facility category in the registry, and so there are some organizations classified as
facilities like a large telescope or some kind of laboratory, for instance. So I would say the primary criterion is that it must be an organization, so an informal group or a time-limited kind of project or some sort of volunteer effort
wouldn't necessarily qualify. For instance, ROR is not an organization, so it does not have its own ROR ID, if that makes any sense to you. So it's really hinging on that notion that it must be an actual organization that exists and is involved in research or touches research in
some way, but within that there's, I would say, quite a lot of flexibility. You can scan through the list of entries right now and get a sense for that range. Thank you, fantastic. There's another question on the etherpad. It's a general question, I'm
not sure whether you both want to try and answer it. Is there any plan to try and reconcile existing affiliation data with organizational IDs, maybe to be confirmed by the ORCID users? I think I understand where this is coming from. So we have ORCID iDs
with affiliations, but the affiliations don't necessarily have org IDs on them. So I know we're doing a lot of user experience work next year to improve this form, like I was just showing you little bits of it, but that's actually a really interesting idea.
It's not something we could do in bulk, right? It has to be some kind of scalable solution and it has to be researcher driven. We can't just go in there and edit people's records, even if we think we're doing them a favor and helping them out. That's not the way it works, but we are putting in mechanisms to interact with researchers and prompt them to take certain actions and that seems like a completely reasonable action to try and prompt them to do. So I will take
that away and back to the rest of the product team and see if we can do something there. Brilliant. I can touch on that too, just from outside the ORCID context, because if I'm interpreting the question appropriately, I can just say that from the ROR standpoint,
this is a conversation that we've been having with a number of different users or stakeholders. For instance, a publisher might be very interested in adopting ROR, but they might have a bunch of affiliation strings without any IDs, or they might have a bunch of other identifiers
like Ringgold, for instance, or something else, and they want to map those to ROR. So those are a couple of the use cases that we're trying to support with tools like the affiliation matching function and the OpenRefine reconciler, and understanding
a bit more about what that adoption journey looks like and what kind of tools or support people might need to get to ROR. And I think there's definitely a lot of interesting potential with how ORCID fits into this as well. Excellent, thank you. Simon, do you have any
comments or questions you'd like to ask? Yeah, thank you both for speaking. It's really interesting to hear about the development of ROR and particularly how it's going to be implemented in ORCID; to have them both side by side is really useful. One thing that I noted is that both of you were talking about Wikidata, in ROR and in ORCID, and obviously that's music to my ears
to hear that. What I'm really wondering is how can the Wikidata community support your work? Is there any way we can be involved? So I'm going to confess my ignorance
because I'm not an expert on Wikidata, but I did wonder where GRID found all the ISNIs that match Ringgold's. And I'm assuming someone in the Wikidata community worked it out for them, but maybe it was Ringgold and GRID themselves. But that kind of action is fantastic. That's the mapping of one identifier to another that is reusable. I mean, it's obviously
something that we're quite good at because we've got a community of human editors, so we're not just relying on machines to do that. So it's good to hear that there is some use coming out of all of this. Yeah, that's going to be fantastically useful. And like I say,
I'm slightly ignorant of how those mappings were made. But I noticed that the Q numbers are in ROR for that. So that's just going to be super useful for us. Yeah, and I can say a few things to your question, Simon. First of all, thank you, because
you were instrumental in helping to get ROR IDs into Wikidata. So that's been really exciting and a great thing to talk about. I think one of the things that will be helpful, you know, you're asking about going forward, what might be helpful. ROR is at this point now where we are
developing our independent infrastructure and workflows for curating the registry. We want to be putting out updates on a predictable and regular basis. Right now, the registry is being updated, you know, approximately on a quarterly basis. But going forward, I'd like to see that
be more frequent, like monthly. And so it's about figuring out how to coordinate those flows back and forth between ROR and Wikidata. So if there's an entry in ROR right now that doesn't have a Wikidata ID, what's the best way for ROR to know about that and get that into the registry? And then vice versa, when we're adding records to ROR or
updating records in ROR, what is the best way to feed that to Wikidata so that can be populated there as well? So that's just one thing that's been on my mind as we're going forward with the ROR curation development work. Yeah, I think at the moment we don't have the
most efficient way of keeping things up to date: generally, after updates, GRID sends me an email, and then I update ROR. We perhaps need to work on something a bit more sustainable than that. Yeah, we can do that. I suppose just a similar question. Is there any way
the Wikidata community can provide you with information if we spot problems in either ROR or ORCID? Do you want us to let you know somehow? And what's the best way of doing that? So that's the flow of information the other way.
I would say for ROR, yes, definitely. Right now the somewhat temporary process that we're working with is we have a form on the ROR website where you can submit feedback. If there are questions around a large number of records, like a batch kind of update,
that's a little trickier right now. So better to just email ROR to talk about that at the moment. But this kind of feedback is super helpful, especially as we're building out the infrastructure further. So yes, ROR would like to know if there are changes or
suggestions to make. And for ORCID, it's always welcome to have these things pointed out to us, but there is the question of scale for us. So we do have a defined process. So if someone says, for example, I think this ID is the same person's other ID, we've got this defined
process, but we have to actually talk to the researchers themselves in order to make any kind of change to their record because they're in charge of it, not us. We don't actually own that data. So we can do these things. And we do these things a lot. We essentially turn off one and have it point to the other one in that particular instance. But it takes time. And
scaling it to millions of people is quite tough. And we haven't really worked out how to scale that yet, I suppose is what I'd say. So yes, it's always welcome. And the support desk always do follow these things up. So the normal process is go via the support desk and go,
I think these need to be the same, or I think there's something wrong with the work on this record. But all we can do is essentially point out to the researcher that there might be an error in their record and they should probably do something to correct it. I think we'll be able to point out quite a few duplicate ORCID iDs. It's the sort of thing that human editors do come across that might just slip under the radar for quite a while otherwise.
Yeah, they're not things we've noticed. I think they're very difficult to detect with any reliability anyway, put it that way. But with human eyeballs, much, much easier. Yeah, that's great. I could ask you questions for hours, but I won't. I'll leave it at that.
I think the most important thing is that we've brought you in contact and we've started a conversation. I hope you can continue to be part of it as we continue to work to improve our author items in Wikidata. So thank you very much for both of you coming along today and taking the time to speak to us.
No, no, thanks so much. I've actually learned a huge amount today, so it's been really useful for me. So thanks for inviting me. Yeah, likewise. It's been really great. Thanks, guys. Okay, thank you. So we're at the end of the session. I've really enjoyed this afternoon.
There have been some very interesting talks and I've learned a lot. And there are still so many questions. So I'm going to try and organize a workshop like I mentioned earlier. I saw a bit of enthusiasm for it. That's all I need. I'll work with that. And hopefully sometime next year, we can
get together online and have a proper conversation about this. Don't know if Jason has anything else to say? I'd like to say thank you very much to all the speakers this afternoon and everyone who's tuned in. And to you as well, Simon, because everyone should know that although we are co-hosts,
you did all the organizing and I literally just turned up on the day to help out. So thanks very much for putting it all together. It's one of those subjects that we definitely need to talk about. And hopefully this will be the start of some really interesting conversations.
Yeah, I hope so. And thank you too, Jason, for coming along today. It's been a pleasure to spend the afternoon with you. And thank you to all our speakers again. Bye everyone.