Research Grant Data in the Griffith University Research Hub

Formal Metadata

Title
Research Grant Data in the Griffith University Research Hub
Subtitle
A case study on use of the ANDS Research Grant API
Alternative Title
Research Grant Data at Griffith University
License
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Abstract
Jan Hettenhausen presents a case study on Griffith University's use of the ANDS Research Grant API to pull in grant records from Research Data Australia (RDA) for Griffith researchers, covering the periods when those researchers were not at Griffith. Explore the new Research Data Australia Grants and Projects portal. Research Data Australia aggregates: 1. research grant information supplied by multiple funders, currently ARC and NHMRC; 2. research project information supplied by some of our data contributors.
Transcript: English (auto-generated)
So, yeah, I work at Griffith University in eResearch Services, and a while ago we used the ANDS Research Grant API to improve the data that we present in the Griffith University Research Hub. So the Research Hub is our publicly facing researcher profile system, and we built that for two main purposes. One, to make Griffith research more discoverable, to show what we are doing, and the other
one, to give researchers a profile that they can use for their own purposes, that they can share, and that shows their work individually. And to give a bit of background, the Research Hub is built using VIVO, which is a semantic web application, and it's becoming quite popular.
There's a large number of universities worldwide that build their researcher profile systems based on this. It came from Cornell University originally, so there's a huge uptake in the US in particular, and as a semantic web application it has a couple of very nice benefits for this sort
of purpose. One is that it provides a very rich ontology to model information about researchers, research-related activities, and organizations such as institutes, schools and groups, and in terms of activities we can model publications, grants and other research output. And of course it's also easy to add third-party or your own ontologies to add even more data to this.
Now when we developed the Research Hub, one of the main aspects that we wanted to cover was that people would not have to maintain their profiles themselves, and so in that spirit we tried to get as much data as possible from various enterprise systems at Griffith
and external systems if available. So in the end, at the moment, researchers really only have to add their photo if they want one, a short bio statement and maybe a research statement, and everything else, including academic degrees, employment history, publications, grants, supervision and so on, gets drawn
from enterprise systems, and we get the same information about institutes, groups and schools. However, one problem that we came across was that enterprise systems were at some point built for a specific purpose, and that was usually not that the data would be displayed publicly. For a lot of the data that's not a huge issue: publication records are fairly standardized, so we didn't have any problems there, but grant information in
particular was not very well covered in our systems. Sometimes just because we weren't the managing organization, so if things changed later on in terms of titles and amounts and whatnot, that wasn't necessarily reflected in our systems, and the other reason is that we didn't necessarily need descriptions and whatnot for the reporting
purposes the systems were built for. So for the Research Hub, we identified two business cases where we could use external grant data and really add some value to the Research Hub. One was to improve data on existing grants: get better descriptions, get full funding amounts, like the total
grant amount and not just the share that Griffith University got from it. The other business case was that while we knew about grants that had some affiliation with Griffith, we didn't know anything about grants that researchers had while they were not at Griffith University, and so adding that information became quite important because, while it doesn't
showcase any Griffith research, it is an important part of the biography of our researchers and it gives a much more complete picture, especially because we do have historic information about publications and whatnot, so not having the grants left a gap that many people were
sort of eager to close. And again, we didn't want people to enter this information manually, so getting as much of that done automatically as possible was the end goal. And this is where the ANDS Research Grant API came in and, yeah, as was said in the previous talks, it draws from the same data sources as the Research Data Australia portal and
so it has very comprehensive information, especially about ARC and NHMRC grants, and it also provides us with a very nicely cleaned-up version of this grant information, so information that is maybe not well captured in a standardized vocabulary in the source data was actually cleaned up and is now provided in a very nice form.
And the API is based on Solr, which is a very simple to use, very nice and very well documented enterprise search engine, and so using this data was actually quite easy for us. So for the first business case we didn't actually have to do very much. We could basically look up grants based on their grant ID and the funding body.
Grant ID is not necessarily unique across funding bodies, but doing this lookup was quite easy, and so we would get back the record as a JSON-formatted record, and all we really had to do was map those fields to our RDF vocabulary and do a few related lookups for people
in our database and whatnot to link it up properly, but all in all it was a very easy process. And, well, we did this work quite a while ago, so about a year and a half I think, most of it, a bit longer, and initially a lot of the text fields still contained a lot of the actual
information in terms of funding amounts and whatnot, and we did a fair bit of text processing to extract it as well. Nowadays ANDS has done a lot of work on improving this, and so we're now getting a much cleaner version of the data, so whoever wants to get into this area now and use this information is in a really good position to get very nice and clean data from this.
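As a rough illustration of the lookup and field mapping described for the first business case, here is a minimal Python sketch. It is not the Griffith implementation: the endpoint URL, the JSON field names (`grant_id`, `funder`, `title`, `description`, `total_amount`) and the RDF predicates are all hypothetical placeholders. Only the general Solr conventions (the `q`, `wt` and `rows` parameters and the `response.docs` layout of a JSON result) are standard.

```python
import json
from typing import Dict, List, Tuple
from urllib.parse import urlencode

# ASSUMPTION: placeholder endpoint, not the real ANDS API address.
BASE_URL = "https://example.org/grants/select"

# ASSUMPTION: hypothetical JSON field names mapped to hypothetical RDF predicates.
FIELD_TO_PREDICATE: Dict[str, str] = {
    "title": "rdfs:label",
    "description": "ex:description",
    "total_amount": "ex:totalAwardAmount",
}


def build_lookup_url(grant_id: str, funder: str) -> str:
    """Query on grant ID and funder together, since IDs can repeat across funders."""
    params = {
        "q": f'grant_id:"{grant_id}" AND funder:"{funder}"',
        "wt": "json",
        "rows": 1,
    }
    return BASE_URL + "?" + urlencode(params)


def first_doc(response_text: str) -> dict:
    """Return the first document of a Solr-style JSON response, or {} if there is none."""
    docs = json.loads(response_text).get("response", {}).get("docs", [])
    return docs[0] if docs else {}


def record_to_triples(subject: str, record: dict) -> List[Tuple[str, str, str]]:
    """Map the known JSON fields of one record to (subject, predicate, object) triples."""
    return [
        (subject, predicate, str(record[field]))
        for field, predicate in FIELD_TO_PREDICATE.items()
        if record.get(field) is not None
    ]


# Made-up sample response so the sketch runs without network access.
SAMPLE = json.dumps({"response": {"docs": [
    {"grant_id": "X123", "funder": "ARC", "title": "Example grant", "total_amount": 100000}
]}})

url = build_lookup_url("X123", "ARC")
triples = record_to_triples("ex:grant-X123", first_doc(SAMPLE))
```

Fields missing from a record (here `description`) are simply skipped, which matches the situation in the talk where not every grant record carried every field.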
The second business case was a lot more difficult. So we just heard about researcher identifiers; it's still very difficult to get that information for our researchers at the moment, ORCID is not very common yet, and we don't get ORCID identifiers from the API or from the funding body. So what we had to do to get
historic grants for researchers that had nothing to do with Griffith was to come up with a way of matching researchers by name, and for that we built a two-stage scoring function. One stage simply looked at name similarity and gave us some idea whether two names could be referring to the same
person, and we put a lot of empirical work into that because sometimes people go by a preferred name, sometimes by the actual first name, some people always include their middle name, some people don't, so there's a lot of work to do about that. And then we still have the problem that names are not unique, and so we added a second score that was based on the fields of
research people published in, and we have very good information about that in the Research Hub, so we could build a portfolio of FoR codes that people had published in previously, and we just went by the assumption that if they had a grant in the past that had a certain FoR code
that they would have at least one publication that had that FoR code as well. Yeah, then we had to implement some additional handling for edge cases where grants were actually managed by Griffith and we had information about them, but people were at different institutions and still attached to them, and linking all that up, but that was all relatively easy once we had the linking up and running. Well, I can't actually give any numbers about how
well we're doing. Empirically it worked quite well, and in practice over the last one and a half years I think we had about two or three false positives where people informed us that the data was incorrect, and we built in functionality to manually add and remove grants but still
automatically ingest the data. And yeah, so both of these cases were very successful, and that was largely thanks to how easy the ANDS API was for us to access and to use. And yeah, I thought to wrap it up I'd quickly put up some links to the systems involved. The first one
is our Research Hub. The second one, for those who are interested and who may not know about it already, that's the VIVO project, which is definitely worth a look for everyone who's interested in getting into the space of researcher profile systems. And the last one is the documentation for the ANDS API, and since it's based on Solr there's a lot of additional
resources everywhere on the web. And yeah, that's all from me.
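The two-stage matching described in the talk (name similarity first, then an overlap check on FoR codes) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the scoring function Griffith built: the function names, the 0.7 score for an initial-only match, the 0.8 threshold and the "Given Surname" name order are all hypothetical choices, and the talk notes that the real name scoring took a lot of empirical tuning.

```python
from difflib import SequenceMatcher
from typing import Iterable, List


def _tokens(name: str) -> List[str]:
    """Lowercase a name and split it into tokens, dropping full stops."""
    return name.lower().replace(".", " ").split()


def name_score(a: str, b: str) -> float:
    """Stage 1: crude name similarity in [0, 1].

    ASSUMPTION: names are written "Given [Middle] Surname". Middle names
    are ignored, and an initial counts as a weaker match than a full name.
    """
    ta, tb = _tokens(a), _tokens(b)
    if not ta or not tb or ta[-1] != tb[-1]:
        return 0.0  # surnames must agree
    given_a, given_b = ta[:-1], tb[:-1]
    if not given_a or not given_b:
        return 0.5  # surname only: weak evidence
    first_a, first_b = given_a[0], given_b[0]
    if len(first_a) == 1 or len(first_b) == 1:
        return 0.7 if first_a[0] == first_b[0] else 0.0
    return SequenceMatcher(None, first_a, first_b).ratio()


def for_score(grant_codes: Iterable[str], published_codes: Iterable[str]) -> float:
    """Stage 2: share of the grant's FoR codes the researcher has published in."""
    grant_set = set(grant_codes)
    if not grant_set:
        return 0.0
    return len(grant_set & set(published_codes)) / len(grant_set)


def is_probable_match(grant_name: str, researcher_name: str,
                      grant_codes: Iterable[str], published_codes: Iterable[str],
                      name_threshold: float = 0.8) -> bool:
    """Accept only if the names are similar AND at least one FoR code overlaps."""
    return (name_score(grant_name, researcher_name) >= name_threshold
            and for_score(grant_codes, published_codes) > 0)


# Made-up example: same person with and without a middle initial.
match = is_probable_match("Jane A. Smith", "Jane Smith", ["0801"], ["0801", "0806"])
```

Requiring both stages to pass reflects the point made in the talk that names alone are not unique. With these placeholder values an initial-only name such as "J. Smith" stays below the threshold and would be left for manual handling.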