PiDs Short bites #1 - DOIs to support citation of grey literature - 24 May 2017
Formal Metadata
Title |
| |
Title of Series | ||
Number of Parts | 9 | |
Author | 0000-0003-0635-1998 (ORCID) | |
License | CC Attribution 4.0 International: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor. | |
Identifiers | 10.5446/19193 (DOI) | |
Publisher | ||
Release Date | ||
Language |
Content Metadata
Subject Area | ||
Genre | ||
Abstract |
|
Transcript: English (auto-generated)
00:00
Hello everyone and welcome to today's webinar. Today's webinar is the first in a series looking at persistent identifiers, or PIDs for short, which we've called the PIDs Short Bites webinar series. The first one today is on DOIs to support citation of grey literature. The second one is on identifying and linking physical samples with data using the International
00:24
Geo Sample Number (IGSN), and the third one is on linking data and publications through the international Scholix initiative. So today's webinar is on DOIs to support citation of grey literature. I'm Natasha Simons from ANDS and I'm going to start with a brief introduction to persistent identifiers
00:42
and I'm then going to hand over to my excellent colleague Dr. Daniel Bangert who's the Senior Data Librarian in Library Repository Services at the University of New South Wales. Okay first of all what's the problem that persistent identifiers are trying to address? Well I'm sure everyone will be familiar with this particular problem when you click on a
01:02
web link that takes you either to a 404 page not found error like this one or it takes you to content that's not actually related to the link that you clicked and both of these things usually happen because the web resource has been moved to another location and you have the old link. The
01:20
page not found error is frustrating, and in the context of research it's disastrous because it means that a scholarly resource which may have been cited cannot be found, verified, potentially cited again and so forth, and this is the problem which persistent identifiers are trying to address. So a persistent identifier is simply a long-lasting reference to a digital
01:42
resource. Even if the resource moves location on the web the persistent identifier is there to make sure the link always resolves. So if a PID is used as a citation link in scholarly literature it will always resolve to information about the resource either a descriptive metadata page, the resource
02:00
itself or information about the removal of the resource from the web. PIDs are key to facilitating the discovery of scholarly resources and play a key role in linking scholarly resources for example publications and data as well as tracking the impact of these resources. But it's important to note that PIDs do not guarantee a link will never be broken but what they do
02:23
is create a framework which helps to guarantee it. So PIDs have evolved quite a lot over the last 20 or so years. This slide is taken from Jonathan Clark's presentation at the THOR webinar last week, and he notes that now we have identifiers for people as well. We want to know what persistence means
02:42
and how long a PID will last. Metadata has grown so there is a lot more value in retrieving the metadata as much as retrieving the object itself and that object may no longer be digital because you can refer to digital information on a physical object which is a big growth area and we're looking at that in the IGSN webinar coming up. And last but not least we want our
03:05
machines to be able to interpret PIDs. So in this webinar series we hope to explore more of these topics in more detail. What PIDs apply to research data is a very good question. There are many different types of persistent identifiers that apply to research data. I've put on the screen some common
03:24
examples that ANDS actively promotes or provides a service for, such as Handles for identifying data, DOIs for citing data and related materials, or ORCID iDs for identifying people, and there are really so many more. All of these persistent identifier schemes differ in some way. For example, they might differ in
03:43
purpose: some apply generally to all scholarly resource types, some are discipline specific. The underlying technology differs between persistent identifiers, as does the governance structure. For example, some are non-profit, some are company driven. They also differ in the metadata that is collected. Some require
04:03
more metadata than others. They also differ in the extent of use, so PIDs vary in their uptake. If you'd like to know more about persistent identifiers, there is a PID guide on the ANDS website and there's a lot of information about our DOI and Handle services as well. There's also this Short Bites webinar
04:22
series and I highly recommend the THOR webinar series on PIDs. The first one happened last week and it was a general introductory one and that's been recorded and they are making the recording available by the end of this week and then there are another two coming up on that series so if you'd like to register you can click on those links. So I'd like to finish now
04:43
and hand over to my colleague Dr. Daniel Bangert. Okay thank you Natasha. Thank you for joining this webinar. Thanks Natasha for the invite. Today I'll be talking through a service that we implemented at UNSW library to support the citation of grey literature held in our repositories.
05:02
This is based on a presentation that I gave late last year at the CAUL Research Repositories Community Days, and that presentation, the slides and video, can be found at the link on your screen. I'm going to briefly cover digital object identifiers and say a few words about what they are,
05:21
then take you through the environmental scan that we did to design our service and then some of the details of the UNSW DOI service including the conditions around DOI assignment, the workflows that we're following and integration with ORCID identifiers and a few words in conclusion. A DOI, a
05:42
digital object identifier, is a type of PID that is optimised for scholarly resources. Importantly it's the identifier that is digital and the object can be digital or physical. DOIs are assigned to an object by the publisher or a long-term custodian and the persistence of that identifier
06:01
and the resource is managed by the organisation and its policies. There are a few facets to a DOI. We can start with the DOI name itself which is an alphanumeric string and that can be converted to a URL by adding a DOI resolver like doi.org. When that URL is entered into a browser it takes you
06:26
to a landing page with human-readable metadata about the resource, about the object. So basic information about the resource is required to mint a DOI, and that metadata is both human and machine readable. So why are DOIs important?
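Editorial sketch: the conversion from a DOI name to a URL described above can be illustrated in a few lines of Python. This is a minimal sketch, not part of the talk. The function name `doi_to_url` and the tolerance for a leading `doi:` label are assumptions made for illustration; the example DOI is the one listed in this recording's metadata.

```python
from urllib.parse import quote

# Resolver named in the talk; a DOI name becomes a URL by prefixing it.
DOI_RESOLVER = "https://doi.org/"


def doi_to_url(doi_name: str, resolver: str = DOI_RESOLVER) -> str:
    """Turn a bare DOI name into a resolvable URL (illustrative helper)."""
    name = doi_name.strip()
    # Accepting a leading "doi:" label is a convenience assumed by this sketch.
    if name.lower().startswith("doi:"):
        name = name[4:].strip()
    prefix, slash, suffix = name.partition("/")
    # A DOI name is a prefix starting with "10." and a suffix, split at the first slash.
    if not (prefix.startswith("10.") and slash and suffix):
        raise ValueError(f"not a DOI name: {doi_name!r}")
    # Percent-encode characters that are unsafe in a URL path, keeping "/".
    return resolver + quote(name, safe="/")


print(doi_to_url("10.5446/19193"))
```

Entering the resulting URL in a browser should lead to the landing page for the identified resource, which is the behaviour the speaker describes.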
06:41
They've emerged as a relatively simple but powerful piece of technical infrastructure in improving scholarly communication. They make it easier for outputs to be discovered and used by others and to be cited and measured for impact. A useful way to think about DOIs is as a trusted identifier which is a term introduced a few years ago by a project called ODIN,
07:04
the ORCID and DataCite Interoperability Network. That's the predecessor project to THOR, which Natasha mentioned at the beginning. This term captures a set of characteristics. Trusted identifiers are unique, so they're unique on a global scale. They resolve persistently as HTTP URIs.
07:23
They're descriptive so they come with metadata that describe their most relevant properties. For instance, there's a mandatory set of metadata elements like creators, title, publisher, publication year, resource type and then you can add recommended or optional elements like alternate identifiers,
07:43
subjects, dates, rights information, description and so on. And lastly, trusted identifiers are governed. So they're issued and managed by an organization that has a sustainable business model and it's managed by that body which is usually a publisher or custodian. You can read more about trusted identifiers at the
08:04
link below. When we were looking at designing a service, the impetus for this came from requests from academics. Most commonly they had grey literature, like a series of reports, that they wanted to assign DOIs to, and we were also able to implement something based on the ANDS Cite My Data service. In April
08:27
last year, 2016, that was extended to account for grey literature. So we were looking at the possibilities of implementing something here at UNSW. In preparation for an options paper, we looked at grey literature and DOI
08:44
assignment in several repositories, whether institutional, disciplinary or national. We also looked at options for registration agencies and the resource types that we would cover. A few things that we found that might be useful
09:01
included a project conducted in the UK called Unlocking Thesis Data. It's a Jisc-funded project led by the universities of East London and Southampton as well as EThOS, the national thesis service at the British Library. They have a number of reports and case studies where they outline options for the workflow to assign DOIs to theses. Another idea
09:24
that we eventually incorporated into our own service was from the University of Southampton. They have a role called a trusted partner. That allows certain staff, academics or faculty administrators, to authorise
09:41
their own DOIs or the DOIs of a research group. I will come back to that idea later in the presentation. In the latter half of last year, we presented an options paper to the library and went ahead with a pilot which involved a manual workflow for a certain resource type, reports, to start
10:03
with and we had workflows for both library staff and trusted partners to mint DOIs. We then moved on to implement a web tool which I'll show you later and at the link on your screen you can look at the DOIs minted by that service. I think now we have about 330 DOIs minted for
10:27
grey literature. It was important at the outset to think about the conditions around DOI assignment and the first one is that the resource is deposited in a UNSW library repository. Our institutional repository called UNSWorks
10:43
holds a large amount of the grey literature created by UNSW staff, and resources in the repository are managed in accordance with the UNSWorks digital preservation policy. So for that repository we have governance, we have preservation procedures in place, and we're then able to assign the DOI and
11:05
then potentially if the resources move or the repository moves then we can make sure those DOIs continue to resolve. The second one is that it's an eligible resource type and it needs to be within a certain set of grey literature that we've defined. There should be no existing DOI for the
11:23
resource, as that defeats the purpose of a unique identifier. There should be no existing DOI request, and it needs to meet the mandatory metadata requirements set by the ANDS service, which links to DataCite. So in the user interface, as
11:40
you'll see later, the requester is given this set of conditions which they need to agree to before they submit. So they need to agree that they're an author or creator of the resource or have authorization from an author to request a DOI. The resource doesn't already have a DOI. They don't plan to mint a DOI using a different service. The resource is unpublished or published
12:04
by UNSW. A library repository is the primary publication point for this resource, meaning that when people resolve the DOI they'll be taken to the repository page, the landing page. The resource is not
12:20
subject to a permanent embargo and the resource is not likely to change significantly. That's just flagging that major elements, like anything that would be part of a citation, shouldn't be changed, as that would require a new DOI. These are the workflows that we're following. So for all users we allow
12:42
them to request a DOI. They go into the tool, they select the repository where the resource is, most commonly UNSWorks. They search for their record. They select it. The system checks if there's a DOI existing already or if there's an existing request. It also checks if the mandatory metadata is
13:04
already held by the system. If not, they need to enter or confirm the metadata and then submit a request. The second part of this is for a DOI service administrator which is currently library staff. They go back in and review any request that has been submitted. They check that it
13:25
meets the conditions that we've already outlined. If it does, that's approved. It goes to the ANDS service, which mints the DOI, and the system then emails the requester with the DOI. The administrator then updates the metadata
13:40
and then that information is sent back into the repository and is displayed on the repository page. The trusted partners are faculty administrators or researchers that we allow to mint DOIs directly, and that needs to be approved by a relevant authority like the head of school or
14:00
associate dean, and we give those people training and access to the tools they need. So they follow a very similar process except that instead of requesting they're able to mint the DOI directly. So they select their record, they mint, do the administration and the cycle is complete. Back to the UNSWorks
14:22
page, the institutional repository. If you then resolve the DOI, it takes you to that landing page and the DOI itself is displayed in the record details as part of the metadata about the publication. We are also aiming where possible to include ORCID IDs, so identifiers for researchers and
14:43
contributors to research outputs, in the DOI metadata. The way we do that at UNSW is through our research output system. Users can link their ORCID profile within that system. That ORCID iD is then pushed to the
15:01
repository if they deposit full text. Then they can go back into the DOI service, select that record, submit a request and that DOI gets put back into the research output system. There is then an update. So both the DOI and the ORCID go into the repository and both of those, the ORCID ID and DOI can
15:24
be exposed via external harvesters like Trove and aggregators as well as through the ORCID profile. Because of the connection between ORCID and DataCite, that can be easily claimed through DataCite and added to the
15:42
user's ORCID profile. So I have a short video here which I will take you through. This is showing an early release. So this is the version that was available at the end of last year. There have been some minor modifications since then but it gives you a sense of what the service looks like and how those workflows actually look in practice. So there's a login
16:03
screen that uses the usual credentials. So we'll start with the request a DOI workflow. Here the repository can be selected and there's a search box to pull in the information from the repository. So the user selects the
16:21
record and they're given a preview of the metadata. So what we show here is the mandatory metadata for assigning a DOI and if any of that is missing, they're given the opportunity to add it. They see those conditions for requesting and they submit the request. There's a confirmation message
16:43
and they also have a list of their requests and they can see the status whether that's pending or whether the DOI has been minted or declined. The next step is for the DOI service administrator to log into the system. They have access to a tab called review where they can see the pending requests.
17:04
They can then review each request and the metadata. Based on that information, they can either decline or mint. If they choose to decline, there's an option to send a personalized message and to follow up with the requester. If they mint, then that request goes to the ANDS
17:21
service and comes back with the DOI. And then this is emailed back to the requester. So they're given the DOI immediately. The administrator then updates the metadata in the system. You can see that the DOI is active immediately and that's the end of that process. And the last part to show you
17:42
is the mint function for the trusted partners. And this just means that people who have high volumes of publications, grey literature that they need to assign DOIs to, or an ongoing series, are given the option of actually doing that directly and having responsibility for the whole
18:03
process. The procedure is much the same. They can search for the record. They select the record, review the metadata, and then they complete the minting process immediately. Okay, that's a repetition of before, so I'll just skip through the rest of this. Okay, so in conclusion,
18:22
the UNSW DOI service was designed to meet existing and future use cases. So it's flexible and scalable with future cases in mind. A priority for us was ease of use. So we're reusing metadata where possible. So anything that
18:41
we hold in the repository that we need for the DOI metadata, we reuse that metadata, which is reviewed. It integrates with existing workflows, for instance with the research output system and with the repository itself. And it connects with other PIDs and platforms like ORCID. There are
19:01
conditions set around it, so we ensure that the identifier is governed correctly, that the resources remain persistent, and that the link can continue to be resolved and be a citable, enduring part of the scholarly record. So that's handled by preservation policies, by the reviewing process, and by the ability to track our DOI requests.
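Editorial sketch: the request, review and mint workflows described in the talk can be modelled as a small state machine. This is a minimal sketch under stated assumptions, not the UNSW implementation. The class and function names, the metadata key names, and the `mint` callable (a stand-in for the call to the external minting service) are all hypothetical, and the DOI shown is a placeholder.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional

# Mandatory metadata elements named in the talk (key names are illustrative).
MANDATORY = ("creators", "title", "publisher", "publication_year", "resource_type")


class Status(Enum):
    PENDING = "pending"
    MINTED = "minted"
    DECLINED = "declined"


@dataclass
class Record:
    """Hypothetical repository record; not the real UNSWorks data model."""
    metadata: dict
    doi: Optional[str] = None
    open_request: bool = False


@dataclass
class Request:
    record: Record
    status: Status = Status.PENDING


def missing_metadata(record: Record) -> list:
    """Return the mandatory elements that are absent or empty."""
    return [key for key in MANDATORY if not record.metadata.get(key)]


def check_eligible(record: Record, agreed: bool) -> None:
    """Apply the checks described in the talk before a request or a mint."""
    if record.doi is not None:
        raise ValueError("record already has a DOI")
    if record.open_request:
        raise ValueError("record already has a pending DOI request")
    if missing_metadata(record):
        raise ValueError(f"missing metadata: {missing_metadata(record)}")
    if not agreed:
        raise ValueError("requester must agree to the conditions")


def submit_request(record: Record, agreed: bool) -> Request:
    """Any user: create a pending request for an administrator to review."""
    check_eligible(record, agreed)
    record.open_request = True
    return Request(record)


def review(request: Request, approve: bool, mint: Callable[[Record], str]) -> Request:
    """Administrator: mint or decline a pending request."""
    if request.status is not Status.PENDING:
        raise ValueError("only pending requests can be reviewed")
    if approve:
        request.record.doi = mint(request.record)
        request.status = Status.MINTED
    else:
        request.status = Status.DECLINED
    request.record.open_request = False
    return request


def mint_directly(record: Record, agreed: bool, mint: Callable[[Record], str]) -> str:
    """Trusted partner: same checks, but no separate review step."""
    check_eligible(record, agreed)
    record.doi = mint(record)
    return record.doi


# Usage with a stand-in minting function and a placeholder DOI.
report = Record({"creators": ["A. Author"], "title": "A report", "publisher": "UNSW",
                 "publication_year": 2016, "resource_type": "Report"})
request = review(submit_request(report, agreed=True), approve=True,
                 mint=lambda record: "10.0000/placeholder")
print(request.status.value, report.doi)
```

Keeping the eligibility checks in one function means the requester path and the trusted partner path enforce the same conditions, which mirrors the speaker's point that the two workflows differ only in who performs the minting step.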
19:25
Okay, thank you very much. There are a couple of links at the end of the slide there to both the slides and video. If you're interested in the software itself, we've made the code available, and both of those, of course, have DOIs to access. Thank you very much, Daniel. That was a brilliant presentation.
19:44
So I suppose what your presentation shows is that there's quite a bit of thinking involved in assigning DOIs to grey literature in terms of what DOIs, you know, what they should be assigned to and how it should be done. Can you just tell us how you got that thinking process started at UNSW?
20:02
Sure. So primarily it was based around the infrastructure we already have in place and the policies that already govern our repository material. So we knew there were conditions about what was in the repository and what we could govern. So that was a starting point. Then we wanted to cater for the
20:23
greatest use cases and make the most impact. So we wanted to start with resource types that were requested by the community. Then in negotiating the actual conditions, that was partly based on the ANDS guidelines. So making sure they actually fit in with what ANDS requires,
20:42
the agreement that we have with ANDS, and also what DataCite considers best practice. And from there it was basically a process of just testing those and whether that could be worked into the existing workflows and implemented efficiently. Okay, thanks Daniel. There's a couple of questions here.
21:01
A question from Gillian Elliott is what happens to the handle associated with the thesis used in this example? So the handle will still resolve and can be used as it usually would. The way we've implemented the DOIs is by resolving to the handle. So there are a number of different ways to do that, but for us it's best if the
21:24
resolving URL for the DOI is the handle itself. That ensures that if we ever migrate, the handles would be migrated as well, and therefore the DOIs. Okay, thank you. A question from Julie Gardner is are you able to determine how often these DOIs have been cited?
21:43
I believe there is some event tracking or data around events being implemented by agencies like Crossref and DataCite. One way to do it is we also have Altmetric implemented at the institution, so mentions that include the
22:02
DOI can be tracked, so that's one advantage. And then through DataCite itself, I think that would be the best way to have a look at what kinds of events are happening with particular DOIs. Okay, thanks very much. Well, that's the end of our questions. I'd like to
22:21
thank Daniel very much for his presentation today. Thank you all. Bye.
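Editorial sketch: in the question and answer session, the speaker explains that each DOI resolves to the record's handle, so that migrating the handles also carries the DOIs along. The two-step lookup can be illustrated as below. This is a minimal sketch only. The dictionaries stand in for the DOI and Handle resolvers, and every identifier and URL in it is a made-up placeholder.

```python
# Hypothetical lookup tables standing in for the DOI and Handle resolvers.
DOI_TO_HANDLE = {"10.0000/placeholder": "hdl:0000/1"}
HANDLE_TO_URL = {"hdl:0000/1": "https://old-repository.example.org/record/1"}


def resolve_doi(doi: str) -> str:
    """Follow DOI -> handle -> current repository URL."""
    handle = DOI_TO_HANDLE[doi]
    return HANDLE_TO_URL[handle]


def migrate_handle(handle: str, new_url: str) -> None:
    """Repoint a handle after a repository migration; DOI records are untouched."""
    HANDLE_TO_URL[handle] = new_url


before = resolve_doi("10.0000/placeholder")
migrate_handle("hdl:0000/1", "https://new-repository.example.org/record/1")
after = resolve_doi("10.0000/placeholder")
print(before)
print(after)
```

Because the DOI points at the handle rather than at the repository page, only the handle table changes during a migration, which is the design choice the speaker gives for resolving DOIs to handles.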