Assessing Data Management Needs and Practices to Enable Research Data Support Services
Formal Metadata
Number of Parts | 14
License | CC Attribution 3.0 Germany: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/37264 (DOI)
Transcript: English (auto-generated)
00:00
Good morning. My name is Plato Smith. I'm the Data Management Librarian at the University of Florida. My colleague, Jean Bossert, is not here, but she contributed to the presentation by doing the visualization for some of the graphics. We're also working together on a PLOS ONE article as a result of the data survey.
00:21
We launched a data survey that ran between January 3rd and April 31st of 2017. Basically, it's an IRB-approved survey of the researchers at the University of Florida to assess their data management needs and practices. This presentation is going to briefly touch on some of the questions from the survey, but also on some additional support.
00:48
This is the table of contents: background and context, and research purpose. Okay, that's pretty much it. I'm a little bit nervous. Sorry about that.
01:06
Background and context: basically, I have done multiple data management workshops. The first two, at the Marston Science Library and at the Informatics Institute, were pretty much just general data management workshops, introducing faculty to data management plans and the key components of a
01:24
data management plan according to the Digital Curation Centre checklist for data management plans. We also covered the key life cycle processes using the USGS science data life cycle. As a result of these workshops, we were invited as guest speakers for the University of Florida Division of Student Affairs.
01:45
And also biomedical science and the Graduate Linguistic Society Seminar. For the first presentation, the training workshop at Informatics Institute, I'm a little nervous. Sorry, Dean.
02:01
This is the first presentation where I have had the Dean present, so I really apologize. The first workshop at the Marston Science Library was one where the director of the Nature Coast Biological Station was present with his whole research team. We helped develop a data management plan for him and his staff, which resulted in a funded grant proposal of $1.1 million.
02:28
And then we also helped one of the attendees of the first workshop with a data management plan, which resulted in a $486,000 grant award.
02:42
Number three is a data management use case, a genomics data set. I used it in quite a few of the trainings. Basically, this was a use case working with a faculty member in molecular genetics and microbiology. She contacted the library because she had large supplemental data sets that could not be attached to the peer-reviewed article because they exceeded the file size limit.
03:04
They ranged anywhere from 34 megabytes to 43 gigabytes. At the time, Zenodo had a limit of two gigabytes per record. Towards the end of the funded project, the internal grant, Zenodo's limit had increased to 50 gigabytes per record.
03:23
So at the end of this internal Strategic Opportunity Program grant, for number three, we were able to load all of her data sets to Zenodo. She wanted a data repository for her data sets. She wanted a digital object identifier to link her data sets to a peer-reviewed article.
03:41
And she wanted the data accessible for at least seven years. She wanted to share it with faculty and other researchers in her field so they can duplicate and use these supplemental files. Typically, some researchers do not need the supplemental files beyond seven years, even though Zenodo guarantees up to 20 years.
04:01
Recently I was awarded an internal grant to hire a graduate intern. He is Ryan Rushing, an educational technology graduate student from the College of Education. He has a background in IT, and he's helping me develop data management learning experience modules. One thing I learned from all of the training workshops is that
04:20
the faculty want data management plans or data management training specific to their discipline. So he's helping me develop different scenarios based on social science, humanities, and physics, getting use cases from the scientists and researchers themselves and then mapping those to the key components of a data management plan. It will be an online module built with Articulate 360.
04:45
I also wanted to introduce basic concepts, such as what the Open Archival Information System is and the definition of a digital object identifier. For number one, UF biomedical science, I did a presentation for Dr. Sylvan Dorr. He's an anesthesiologist, and he had his lab staff there.
05:12
So I asked the question, do you know what a DOI is? None of the staff, people who work with data, knew what that was, or what ORCID was. So there's an opportunity to work with the library liaisons on outreach and training for some of these specific labs.
05:30
Here's the research purpose: to investigate researchers' current data management practices across campus. This is very important. Who are the data owners, who are the data managers, and what are some of the data management support and training needs?
05:44
So here's the research design and methodology. Basically, I'm averaging about six workshops. I met with the director of the Informatics Institute, and he hosts at least two a semester in his venue, the Informatics Institute.
06:03
So they provide the venue and also the food for the participants, which is very helpful. The Qualtrics survey had 26 questions. It was adapted from a previous data management survey from before I joined UF, and also from the Data Asset Framework, which was developed by the Digital Curation Centre, JISC, and the University of Glasgow.
06:22
We are currently on phase three, so I want to analyze the quantitative data. Question 26 asked who would be willing to contribute to building data management use cases and to more in-depth interviews. We had 36 responses to that.
06:41
So I submitted an NSF CRII proposal (National Science Foundation, Computer and Information Science and Engineering Research Initiation Initiative) on August the 8th, for $474,000. I won't find out if it's accepted for about five months. Basically, that is to hire a doctoral or graduate student from the top three
07:04
research disciplines at UF, according to the Office of Research annual report. Those would be the College of Medicine, the Institute of Food and Agricultural Sciences, and Engineering. So basically we want to hire a graduate student from each one of those disciplines and identify and
07:22
get data from those disciplines and help to create a data management module specific to those communities of practice, while also integrating a Data Carpentry workshop, which teaches basic research computing skills. And here is just a brief look at the responses on data owners: pretty much 96 PIs.
07:46
For the survey, we had 159 starts and 133 completes, for an 83 percent completion rate. So it was pretty good.
08:02
So the top data owners are PIs, institutions, research collaborators, and graduate students. As for data managers, the key group is the graduate students. So there's a need to address data management training specific to the graduate students who are managing data, and also to the PIs, who range from assistant and associate professors to directors of research facilities.
08:27
We asked what resources, support, and training they need from outside their specific departments. The top answer was short-term and long-term data storage and capacity. A lot of the training workshops turn into something like town hall
08:44
meetings where scientists come together, compare notes, and suggest what they need. One of the big needs is data storage, short-term and long-term, and then also the management of sensitive data. I found out from one individual who attended from Research IT that we have some
09:03
scientists who are not using Research Vault, which is a secure environment for managing sensitive data. We need to do a better job in information literacy, reaching out to those research faculty to let them know that we have facilities and resources at UF for them to properly manage and back up their data,
09:24
as opposed to using external hard drives or USB drives and not having a standard practice for sensitive and open data. Number two is training in data management planning and sharing. So basically I do general training and then also discipline-specific training.
09:45
I've been at UF now for 21 months, so moving forward, the plan for the next year is to develop discipline-specific use cases across campus.
10:00
And so for question 26 I asked, would you please provide your name and email address if you're interested in building use cases? We had 35 responses, and I had my graduate intern identify the roles and departments of the faculty and staff who responded to this question.
10:26
A large majority come from the top three disciplines: medicine, engineering, and IFAS. Future plans: data management development. I already have that instructional intern for two semesters.
10:41
As I mentioned, we will develop relevant, discipline-specific data management use cases. And number three is very interesting; it actually exceeded expectations. This will be the first annual data symposium at UF. It was inspired by Caltech's data symposium earlier this year and also the University of Alberta Research Data Management Week.
11:04
So we have representatives from IFAS and Engineering, and the keynote is the Associate Dean of Engineering. We also have the Center for Open Science, which developed the Open Science Framework, participating, along with ORCID and GitHub.
11:22
It's interesting that GitHub, have my phone over? Private and public GitHub repositories that are being used by UF faculty and students.
11:42
In speaking with the GitHub representative, he said it would be wise to go for a university-wide license for GitHub. As for the return on investment, it would cost about $30,000 a year, and although he wouldn't provide all the details, he said that UF is currently spending more than that on a limited number of GitHub repositories.
12:05
Thank you. And so that is the goal. The University of Minnesota has a university-wide GitHub license. Quite a few scientists use GitHub to store their code and software, but there is no university-wide or central code or software repository.
12:24
Having spoken with people from UF IT and research computing, they are in support of the idea. Also part of the program is DSI. DSI stands for Data Science and Informatics. It's a student-run, student-organized program where students learn R, Python, and the latest research computing skills.
12:52
This was very good. The president extended the invitation and we have them on the program as well. So the program ranges from assistant and associate deans to faculty members and students. External speakers include Dr. Karl Benedict.
13:10
He's the Director of Research Data Services at the University of New Mexico. He has over 30 years of experience. Then there is also Viv Hutchison from USGS, who is the branch chief of data and also the lead of the Community for Data Integration.
13:26
So the purpose of this symposium is to bring faculty and students from across campus together to share their best practices for how they manage and share data, while also having the students involved as well, and then also the nonprofits.
13:43
So hopefully if it's successful we will continue to have this in the future. Thank you.