Snippet #4 - Data Publishing at UQ Library

Video in TIB AV-Portal: Snippet #4 - Data Publishing at UQ Library

Formal Metadata

CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Release Date

Content Metadata

Subject Area
Helen Morgan discusses the data publishing process and systems at UQ and how they link to the management of research data throughout its lifecycle. Data publishing is the process of preparing and publishing research data so that researchers and the broader community can find, access and (re)use the data. Data publishing may include sharing information about research datasets with Research Data Australia or discipline-specific portals or repositories, as well as making collections available to other end users of data, either on an individual basis or through a formal publication process. It can also include activities like minting a persistent identifier for the data and assigning a licence to the data. Most research institutions offer support for data publishing; however, approaches vary, and a common challenge is working out how to link the process of data publishing with other systems that handle the data during its lifecycle. The Series is jointly sponsored by ANDS and CAUL.
Helen is the UQ project manager of scholarly communication and the repository service. Today Helen will talk about data publishing at UQ. Helen, over to you.

Good afternoon, everyone. I was really pleased when Natasha asked me to talk about data publishing, because here at UQ Library we've been trying to build a bit of a solution around that, particularly for the long tail of research data here at UQ. I have said it's specifically about data publishing at UQ Library because I'm very aware there are some groups at UQ doing some fabulous work in this space, but I will focus very much on what we're doing here in the repository. Thinking about data publishing really raises these questions of why we're going to do it, how we're going to do it, how researchers are going to get credit for the data that they produce, and particularly how they're going to gain credit separate from, and in addition to, the analysis of that data in publications. So what we're really trying to look at is how we can build those meaningful connections between publishing the data and publishing the scholarly work. It's actually my favorite part of the
data life cycle, data publishing, because it can be both the beginning and the end. It might be that you're tidying up your data at the end of the project and looking to archive it, but if you go one step beyond the idea of archiving your work and you really start looking at depositing the data or publishing it, then you really are giving sort of the start of another project; you're really putting your data out to become the beginning as well as the end. So there's been talk for
a while now about making data a first-class scientific output, and here in this paper in 2012 they discussed achieving that through formalizing the methods for citation and publication, thereby sort of incentivizing data sharing. I think what's really important when we go around talking to researchers here at UQ is to really make sure that they understand the incentives behind sharing data. If we talk with them about the data being a primary research output, then that really starts to click with them, and they start to understand more of where we're coming from. Crucially,
a point of difference which we talk about with researchers is archiving data versus publishing data. When you archive your research data, that can obviously be very beneficial in terms of preserving the data, but when you publish it, it allows for things like validation and peer review of the data, which really enhances science as a whole. So we're going to researchers and talking to them about not only the academic credit that they'll get, but also that the results of their work will be verified by others, and that they'll be able to expose their data to peer review, which to some of them can be quite scary, especially as we're talking, like I said, to that long tail of research data, people who perhaps aren't as familiar with these ideas of data sharing as others. But we're really trying to provide a mechanism to ensure the quality of the data sets available. So at UQ, what do researchers
want? When we go out and talk to them, what is it that we're saying that they want? They would like, I think, research data archiving: somewhere to preserve their research data and a way of sharing it, a way to publish their research data in a way that treats it as a primary research output. That's crucial, I think, as to why we've implemented the data publishing infrastructure here in our institutional repository. We very much wanted researchers to feel that they were going through that process of publication in a similar way to how they would with their other scholarly work. And we do talk to them about peer review and verifiable results, making sure the results are validated and reproducible, and the idea of getting academic credit. But am I just putting all these words in their mouths? I'm not sure: do researchers really know that they want that? So when we go and talk to people, we've
done a lot of work in this area, and we're very lucky here in the library to have a team of librarians who work in research output services, as well as our client service liaison librarians, so we're able to go out and talk to researchers about what they actually do want. We did a couple of things: we've continually evaluated our data management service since 2014, as well as collected user stories from people. So they tell us
that the largest datasets they work with perhaps aren't that big. You know, we are really trying to target this facility here at people who don't have other options, who aren't working in these big, big areas which perhaps provide nice fancy workflows for them. So they're telling us they're not working with huge datasets,
but that they have many different types of data, and that their storage locations for
archived data are a little concerning: they will store it on an external hard drive or on their own computer. So we know they want to preserve their data and they want to save it into the future, but perhaps they're not sure how to do that. We know that fifty-three percent of them wanted to keep their data permanently, so the idea of data archiving isn't something that they are averse to; they're happy to keep their data permanently. It is taking that next step, actually publishing and sharing the data, that is the part we're trying to facilitate. So these are
some of the real user stories, from real researchers; these aren't things I've made up. They want to store their research data in such a way that it will be cited; that seems to be really important, that they get credit for their work. They need access to institutional repository storage solutions for the data, as required by the journals they intend to publish in.

So we did a bit of an environmental scan recently, where we looked at everything that UQ has published over the past five years. We analyzed those publications by journal and by funder, and then one of our data librarians went and dug out the top 25 journals by productivity, so by sheer number of publications at UQ, and also by overall times cited, so by overall total number of cites for papers in certain journals. So we got two lists of top 25 journals, one for productivity and one for times cited. What we found with the policies for those journals was that only seven out of 25 in terms of sheer weight of numbers required data sharing; that's still a lot, seven out of 25. But in the highly cited journal list, 18 out of the 25 had a data sharing policy in place. So we know that UQ researchers are publishing in journals a huge number of which, 18 out of 25 of which, require them to share their data.

So this researcher here is usually the most frequent phone call we're getting at the moment in the team: people who are trying to publish their research in a journal that's requiring them to deposit their data somewhere, and they're looking for a solution to that problem. They also would like stats on who has downloaded it; that's a little bit more difficult to work through for them, but they are interested in who's looking at their data. This researcher said they needed to be able to securely store their sensitive data but also share it with other researchers and collaborators, so we knew that we had to build infrastructure that made sense to people who
had data that perhaps needed mediated access. This person needed to be able to permanently store their research data in a way that was open and accessible in order to meet the requirements of a funding agency. So as well as analyzing UQ's research output by journal requirements, we did the same for funding agency requirements. We looked at all the funding agencies named on UQ research outputs over the last five years, and we found that there are multiple funding agencies putting this pressure on researchers to make sure that their data is open and accessible; that's Australian ones as well as international ones named on UQ research publications. We knew that they wanted to store and accommodate all the research data along with everything that goes with it, so they needed to upload data dictionaries, metadata and lab notebooks, so that the data can be used by researchers in the future. So these are all really great user stories to come from our researchers; these are really good use cases that we were able to accommodate using our institutional repository. And I do think that over the years that we've been here talking to researchers, the conversation is very much starting to change, and we really are changing that terminology now,
so people are beginning to talk to us about data publication instead of data sharing. It's really, I think, the start of a culture change here at UQ, which is very good to see. The idea that researchers should share data to advance knowledge and promote the common good is quite an old idea, but in recent years it has really seen renewed enthusiasm, I think because people are starting to look at how they can get academic credit, and how it can lead to very much a conversation around research integrity and an audit trail from raw to published data, but then also from published data to the publication; I think that's where you get very strong verifiability.

What we're working towards is really the idea that data is deposited alongside, and at the same time as, publication of any scholarly output. So at the time that a UQ researcher is publishing a paper, we give them an easy workflow and a trusted system to deposit the data that goes along with that publication and link the two things together. By integrating the data publishing with the other publishing, we're giving them real credibility and verifiability. It says here in this paper: data stewardship is best accomplished in systems or repositories where the custodian has trusted status within the relevant communities. Again, I think that's why it fits really well in the repository and really well with the library. But it also requires robust infrastructure that's quick and simple to use. We first implemented the form, which I will show you very shortly, in our repository a couple of years ago, and it has gone through a number of iterations where we've tried to make it very user-centric and very straightforward for researchers to use. We do want them to do it; we want them to deposit the data, we want them to describe it, and so we're trying to make it so that they can use it and be confident that it's a straightforward
workflow. If it's going to become part of normal scientific practice, it really does just have to be easy to achieve. So when
researchers come and talk to us about publishing their research data, we will quite often talk to them about a discipline-specific repository, because I do feel those are very, very relevant to certain researchers. We talk to them about, instead of archiving their data on their external hard drive, perhaps going to use a specific repository like that, or we tell them also about UQ eSpace, so the fact that they
can actually describe their research data in UQ eSpace. We talk to them about the idea that the data that underpin a journal article should be made concurrently available, and we talk to them about the fact that we can link that data metadata record with their publication metadata record, so they can be shown to be related objects. I think that's where they start to really understand the value behind what we're trying to achieve. And we make it discoverable: we obviously send all our research data records through to Research Data Australia, and then we also send them through to the Data Citation Index, so we're able to track citations of the data sets through that, which has been a key thing, I think, for people to really comprehend the impact that this can have, which is really good. So I'll show you a little bit more about how.

But we have this: need some extra help? Email data at library. We have a generic email address there which comes through to the team. Here in the library we're very lucky to have some very skilled and specialist data librarians working here. It's a relatively small team but a very dedicated team who work very hard to process these records as they come through, and to really have these conversations with researchers, articulating clearly the relevant funder and journal requirements, and that they can use the institutional repository: that it's known, that it's trusted, that it can integrate with their publication workflows and link to other related publications or data sets. We try to really keep it very researcher-centric. We can build them a profile of their data sets, and we can give them DOIs for the data sets. We show them how to license the dataset, we show them how to cite it, and how they should be showing other people how to cite their data correctly. We still find a lot of people just either acknowledge the data set or mention it somewhere in the paper, so we're
really trying to push the notion of proper citation. And then we can do things like: if their data is actually stored in a trusted subject-specific repository, we can link out to that; or they can upload their data directly if it is very small. They can choose mediated access to their data or they can choose open access: they can actually allow people to download it, or they can just have a contact person so that people are aware that it exists but that they would like to mediate access to it. We can also add an embargo period if required, so if somebody comes to us and says there's a six-month or twelve-month embargo period on a data set, we can facilitate that as well. So this is
what the UQ eSpace homepage looks like, and when you log in you go into My UQ eSpace. You can see a few more options here that only an admin can see. Researchers start with the tabs My Research, Possibly My Research and Add Missing Publication, and then they have two more options: My Research Data and Add Missing Research Data. I really think having the data sets up there in that prominent position along with the publications gives the right message: it gives research data the status of a primary research output. So they know they're getting the list of their research publications, they know they can claim publications that might possibly be theirs, which the system will present to them, and that they can add publications if they think we're missing them, but they can also get a list of their research data.

It looks like this: "The datasets below are currently attributed to you". People really like this page. Then they can also go to Add Missing Research Data Set, and this is what they get. It's a fairly simple form; as I said, we've gone through a couple of different iterations, and we are actually looking to redesign all the forms in eSpace, at which point it will get a bit of a facelift, but I think we're pretty happy with the fields that we've got in there at the moment. The person goes in and adds a simple amount of metadata, not too much. All the mandatory fields are up top, so they can fill it in and get a lot of that done very quickly. They go through and they can add access
conditions; this is where they'll tell us if they'd like open access or mediated access, and this is where they'll pick a license and terms of access for the data set, which we talk to them about in great detail, because obviously if you're making your data available online you need to make sure that you're releasing it under conditions that you feel comfortable with and that also allow for reuse. So we talk to them about what the different restrictions on the different licenses mean, and we talk to them about copyright and whether or not copyright exists in their data. If copyright doesn't exist in their data, which is quite often the case in Australia, we talk to them about UQ's terms and conditions, which is a very simple thing that says: you're very welcome to use my data, do anything you want with it, but I'd like you to attribute me. So we talk about various options around licensing and terms of access, just to make sure that they're feeling comfortable; I think for some people it's quite a new idea that they're going to put that data out there online, or publish that data online. Then we go through various things: they can upload their work, or they can add links to the location of the data if it's in PANGAEA or Dryad or another repository, for example. And then they tick a little deposit
agreement that says they're the creator or the co-creator; that they're authorized to deposit it; that they've got permission to include any third-party content; that it's original and doesn't infringe any legal rights; that by depositing it they're granting UQ eSpace a license to reproduce it and make it available; and that the creators' moral rights to be associated with the work will be respected by UQ eSpace. And then, before the record's published,
it's checked by one of our specialist research output librarians. So for every record that comes through, every time a researcher says "add missing data collection" and enters the metadata, it doesn't automatically get published online. It comes through to our team, and we check very carefully through the record and, where required, contact the researcher and speak to them about the metadata that they provided, to make sure that it's a rich resource, because I do feel that if you're publishing data, the metadata you provide around it is very important. Making sure that the data and metadata are of a consistently high standard is certainly the aim that we have here at UQ Library. So then you end up with a final
record. This is a record from the fish genomics database repository; they're a great group here at UQ. They analyze all these amazing fish and sharks and they get all the genomic information, of which they say they use roughly three percent of the information that they collect, and so they're very happy to make a whole lot of information available online. You can see here we've got the file actually attached there, so people can just download it, and we also have a link through to the full-text publication. So we're making sure that you've got that trail from the data set to the publication, and also to any other related publications or data sets. I do think that's the main thing here: by popping all this information directly into the institutional repository, it's really giving us that advantage of integration with other academic publishing, which is where you're going to get the credibility, I think, with researchers. This is the
second half of the record. You can see they picked the Creative Commons Attribution-NonCommercial license, and they tell us about the type of data; all very standard metadata, but enough for you to go off if you're going to try and discover the data set. So in the future
here at UQ Library, the plans that we have really center around creating more of this researcher-centric data management infrastructure. We have a couple of different projects on the go at the moment, funded by the Enhancing Systems and Services suite of projects, I guess you would call them, here at UQ. They are trying very much to provide this umbrella, university-wide infrastructure that's really going to help researchers sort out their workflows, and that includes management and use of the data, from the DMP all the way through to storage, preservation and reuse. So we expect this to tie in very closely with the existing information that we have in eSpace. We
know from our user research that researchers require easy-to-use infrastructure that's available to them at no cost, and that allows for best-practice workflows but with minimal administrative intervention. So we're not trying to give them administrative tasks to do. What we're trying to do also is to be collecting that metadata earlier in the process, so that by the time they come to the end, to publishing, they're not having to remember everything; they've already got quite a well-established set of metadata by that point. So currently at
the UQ Library we do have a DMP online tool, but there's no flow of metadata from that into the repository, there are no links to storage provisioning, and there are no links to published record metadata. However, we are well positioned to capture that information in eSpace, because we've got the infrastructure I've shown you just now, and we've got those complementary projects around it. We know how to do the licensing and the DOI; we can send it to RDA and we can send it to the DCI. So we know we're in a good position to do this. We've done
an awful lot of brainstorming, and I like the little bit there on the wall that says "can do": we know we can do it. There's another one that says "it's my data, I'm not publishing it", but I don't believe that one. So we really are going to work towards thinking about, and having an idea of
project-level minimum viable metadata, which can be fleshed out into a DMP, which can have other information added to it. Here at UQ we're trying to look across a huge number of different disciplines, and they all require something slightly different and have different ideas as to what data publishing even is. So keeping this idea of minimum viable metadata at the project level, keeping it very simple, allows, I think, as wide a coverage as we can possibly get, though we're not trying to go for everyone. As I said at the beginning, there are people at UQ doing this really well without us, so we're not marching out trying to convert all of those people. But for the people that don't have working systems, the new system will allow project-level metadata captured in a DMP to cascade through the data life cycle and automatically provision data storage, and then we can use that information to publish one or more dataset metadata records, linking back to the original raw data and also linking forwards to the set of publications that came from that project-level data collection. So I think that's a really good situation to be getting into, and certainly that's the vision, although I think it's not coming for probably, I'm going to say, 12 months, give or take. But it's certainly the direction that we're heading in.

And I've got a quote from Vincent Smith; he says the power of published data is amplified by ingenuity through applications and uses unimagined by the authors and distant from the original field, and without connecting these disparate data sets the true potential of data reuse and repurposing is lost. That's from his paper on data publication, "towards a database of everything", in which he has the idea that perhaps we can, I want to say, coagulate everything into one large, huge database that we can query to solve all kinds of interesting problems. So I really do think that
publishing data is something worth investing our infrastructure and money in. There's a lot of interest in the research community, and it's something we're very excited to be part of here at the library today.

Thanks,
Helen. We look forward to hearing more from you in one year's time, but for now let's move
to question time. Any questions? Yes, there is one: "Is it possible that you can share the list of those journals that require data publishing? I'm about to start working this out for the journals that we publish in, and it would be great to have a central repository for this information."

So yes, I'm very happy to share that information. We did look specifically at UQ publications, so the list just reflects where we've been publishing, but I'd imagine it would be similar across other universities. So I'm very happy to share that information.

That sounds fantastic. And "what software are you
using?" is a question.

We use an in-house system; it's just the software we used to build our institutional repository. I do believe it's all open source and online. It's all an in-house development story, apart from DMPonline, which is an implementation of the DCC's DMPonline from the UK. [unclear]

That's all the questions we have for the moment. Thanks, everyone, for your time. Thank you.

