23 (Research data) things Catchup - 06.09.2016

Video in TIB AV-Portal: 23 (Research data) things Catchup - 06.09.2016

Formal Metadata

23 (Research data) things Catchup - 06.09.2016
Title of Series
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Release Date

Content Metadata

Subject Area
Ivan Hanigan takes us through using reproducible research pipelines to help disentangle health effects of environmental changes from social factors. Dave Connell covers the project cycle used at the Australian Antarctic Data Centre.
Since our last catch-up we've worked through Things 17 to 20, and we'll come back and reflect on those a little bit later in the webinar. We'll also spend some time having a quick look through some new material we've recently put up on the 23 Things website, we'll have a look at some of the Sprint to the Finish workshops that are coming to a town near you, and we'll also have a quick look forward at the last few Things. We are getting really close to the end of 23 Things now, so it's getting to an exciting time, and hopefully your motivation to keep going is staying with you. But let's start today with our guest presenters. It's my pleasure today to introduce Ivan Hanigan and Dave Connell, who are the people behind some of the most talked about metadata records in the 23 Things program, and today Ivan and Dave have very kindly agreed to tell us their data stories. Ivan Hanigan is a data scientist working with the Centre for Research and Action in Public Health at the University of Canberra, and today Ivan will share with us the journey he undertook to publish the Hutchinson Drought Index record, which I think we
have here. Hopefully you can see it on the screen now. Now, we looked at this record as part of our activities in Thing 7, and I believe Ivan's got quite a story to tell about this particular record, so we'll hear that shortly. It's interesting to note that this record has been viewed over fourteen hundred times, and this time last year, when we look back at our records, we can see it had been viewed around two hundred and nine times. So it may be that it's all of us looking at the record, but at least it has generated some interest in that record. We love this record for a number of reasons, but it's a really good description of a data set, and one of the things I guess we asked you to look at was how it's connected up, with relationships indicated to related publications, open access repositories, GitHub and a whole bunch of other resources that really add value to the data description. So Ivan will be talking about that record in a moment, and our
second speaker today is Dave Connell, who we won't be seeing on screen but is with us. He's scientific data coordinator at the Australian Antarctic Division down in Tasmania, and he's the person behind the Weddell seals record that we looked at in Things 4 and 7. And there's that record. This record's now been viewed 2385 times, and this time last year it had been viewed around 300 times. One of the reasons we looked at this record was because it shows us a citation count, which was part of the activity for Thing 7, but it's also a nice example of a really descriptive record and a record that you can find in various locations, and Dave will talk to us a little bit about that today. So we'll start now. I'll hand over to Ivan, who's going to speak to us first. Ivan, if you wish to commence. And so this is Ivan Hanigan,
our friend behind the Hutchinson Drought Index record that we looked at in Thing 7, and Ivan, I'll hand over to you now to tell us the story behind this record. Thanks Jerry. So the reproducibility crisis has become known; I think Roger Peng coined the phrase in 2006, and what it basically boils down to is that it's incredibly difficult to achieve the seemingly trivial task of loading up some data that has been talked about in a paper and recalculating the exact results from that paper. It seems trivial, but it's actually very, very difficult, and there are a lot of points on the road from research to data publication that stand in the way of this. It's nice to hear the feedback on the Hutchinson record being a linked record between papers, software code and data, and this is absolutely critical to the solution to the reproducibility crisis. So what I wanted to do today was to talk about the entire pipeline from my perspective, what Peng and others have termed a reproducible research pipeline, which leads through the chain of linkages from the author to the reader and the distribution of data, code and papers. Now, the interesting part of being a science student, designing your experiments and measuring data in a lab in a fairly basic educational setting, turns into a bit of a bigger enterprise when you start a large interdisciplinary research project, and measured data becomes a very difficult to define beast. But you can imagine that you've got a fairly good idea about what data you want to find; you go out and get it, or you create it, and then you've got some electronic information. That's where coding can come in handy, to take those often messy and dirty, full-of-errors kinds of data, often distinct across different data sets such as environmental, health and demographic, and combine all that data into something that you can analyze. If you want to correlate drought with a suicide incidence rate ratio, then you've got to at least join some climate data, some
deaths data and population and demographic data, often with spatial and temporal dimensions. Once you've done all that processing you've got a computational process that can also create new data, and you want to pay attention to analyzing that in a systematic and rigorous way, and code comes in handy here. One of the things that the Hutchinson data set was connected to is a published software package full of analytic code that takes the analytic data, pumps out some computational results and some presentation output such as figures, tables and numbers, and then puts it all together with some text that we wrote into the report that you can download from the journal. All of these components of the pipeline can go down to the reader through the distribution channel, which is what data publication and all the associated activities of the reproducible research pipeline are designed to do. Now this solves a big problem called the reproducibility crisis, but there's an even bigger problem, where you might have heard that
some, or quite a lot of, papers cannot be replicated in new study environments, and adherence to the reproducibility pipeline framework goes some way to solving this more serious scientific problem by allowing readers who may want to replicate a study to develop an understanding of the analytical methods and test new things to do with their code and different ideas, whilst benchmarking against published, validated computational results. That leaves the
pipeline open to be extended with new measured data in new populations, with new errors and potentially new methods and insights, so that the findings of the original papers can be replicated, and in this way we can winnow out the wheat from the chaff in our hypothesizing and design new experiments that more quickly give us scientific progress. So that's the overview of the pipeline, and it is how I've tried to operate in my more recent papers, where the process of finding and getting the data is all systematically organized with the data management plan. I can get the data through the licensing and understand all the authorizations and ethics. I have to put the data somewhere, and that's quite challenging: it gets quite big, and you have to back it up. Then you start doing stuff with the data; now there are multiple versions to control, and different types. The measured data is not often something that the downstream reader will be able to use, or probably wants to, but the analytical data is. The reader will want to find out more about what has been done with the data to arrive at the analytic data, and so scripts can do a lot there. And then you've got this sharing of data process, with the distribution channel, the digital object identifier and other links and licensing all being held down there at the bottom of the pipeline. So that kind of brings me to the end of the description of the pipeline, but obviously in a real world environment it is never as simple as that.
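The pipeline steps described above (join measured data sets, derive analytic data, compute a result) can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the code from the published software package: the region and year keys, the function names and every number below are invented for the example.

```python
# Toy pipeline: measured data -> analytic data -> computational result.
# ALL values below are invented for illustration; they are not real data.

# Hypothetical "measured" inputs keyed by (region, year).
drought = {("A", 2001): True, ("A", 2002): False,
           ("B", 2001): True, ("B", 2002): False}
deaths = {("A", 2001): 6, ("A", 2002): 4,
          ("B", 2001): 9, ("B", 2002): 6}
population = {("A", 2001): 10_000, ("A", 2002): 10_000,
              ("B", 2001): 20_000, ("B", 2002): 20_000}


def build_analytic_data(drought, deaths, population):
    """Join the three sources on their shared (region, year) key."""
    keys = sorted(set(drought) & set(deaths) & set(population))
    return [
        {"region": r, "year": y, "drought": drought[(r, y)],
         "deaths": deaths[(r, y)], "population": population[(r, y)]}
        for (r, y) in keys
    ]


def incidence_rate_ratio(rows):
    """Crude death rate in drought rows divided by the rate in other rows."""
    def rate(flag):
        d = sum(row["deaths"] for row in rows if row["drought"] == flag)
        p = sum(row["population"] for row in rows if row["drought"] == flag)
        return d / p
    return rate(True) / rate(False)


analytic = build_analytic_data(drought, deaths, population)
irr = incidence_rate_ratio(analytic)
print(f"rows: {len(analytic)}, crude IRR: {irr:.2f}")
```

Keeping the join and the calculation in separate functions mirrors the split between analytic data and computational results in the talk: a reader can rerun the second step against published analytic data, or swap in new measured data at the first step.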
This, I hope now you can see, is how I use flowchart diagramming software to track the multiple steps in protracted pipelines, where data is fed in from the top and keeps going down to the bottom. Actually it goes off the screen; this is about twice as big, but I could only show you the top half with any legibility. And the final comment that I'd like to make comes from a famous quote: data and analyses are like sausages, it's really better not to see them being made. They've extended that to say that even so, you do want to know that what goes into them is high quality ingredients. Thank you. Thank you Ivan for that interesting
insight into the creation of that Hutchinson Drought Index record. It's really interesting, I think, for people involved in the 23 Things program to hear from somebody who actually works with data and who has to actually unravel things like license requirements, DOIs, reproducibility, and ensuring that the data is of good quality and appropriate for reuse. Now, have we got any questions at this stage? Susanna, have we got any questions at the moment? Wonderful comments, so basically there's "love the quote about data analyses being like sausages", "what a great analogy" and then "thank you, a great presentation". Okay, well thank you Susanna and thank you Ivan. We may get more questions as we listen through to what Dave Connell has to say, so I'll hand over now to Dave. As I mentioned before, Dave Connell is scientific data coordinator at the
Australian Antarctic Division. Oh, there's a nice picture. So it's my
pleasure to hand over now to Dave to tell us a little bit about the inner workings of the Australian Antarctic Data Centre, from somebody who actually works there and whose day job is data management. Thanks Dave. Thanks Jerry. Yeah, so as Jerry explained, I work for the Australian Antarctic Division. I'm part of the Australian Antarctic Data Centre, and our job is to manage all the data that the Antarctic Division produces. So Jerry asked me to talk a little bit about the project cycle that we use here, as well as what we do with our metadata regarding harvesting. Okay, so the AADC: we were
established about 20 years ago, in 1995, and as I just mentioned, we have responsibility for management of all the data that comes out of the Australian Antarctic Program, so that's the Australian Antarctic Division as well as all our university affiliations and also some international collaborators. There's an Antarctic Treaty which sort of governs what we do, and that was signed in 1959, initially by 12 countries I think, but there are many more signatories now, and one of the crucial parts of that is that it says that scientific observations and results from Antarctica shall be exchanged and made freely available. So we have this big international mandate that says that we have to push our data out there and make sure that people can access it and use it whenever they want to. We're not as big as we used to be, but we still run a fairly big science program: we run about 60 science projects each summer season, and throughout the whole year. It's very multidisciplinary, so we have science covering everything from geology and biology to atmospheric physics to glaciology and oceanography; you name it, there's a good chance we'll do something like it or fairly close to it. So, the project cycle. I also should point out, since
this is an Antarctic project, I have put a few gratuitous Antarctic images through here to keep you amused and entertained. Right, the AAP project cycle. The way that things work in our organization is that scientists will come up with an idea for what they want to study, they'll put this together as an application, and then that gets sent off to an independent committee who reviews all the applications and decides which scientific projects are worth pursuing. Once that's happened the scientists, and this is where it comes in from a data management perspective, are then required to write a data management plan. That spells out all of the data that they think they are likely to be collecting during the course of the project, as well as estimates as to how much data there will be and when they expect to hand it in. That's been a fairly invaluable tool for us in the data centre, because it lets us forecast how much data storage we're likely to need, as well as knowing what data we can expect to come in, because at the end of the project we then have to start chasing people to get all the data, and these data management plans really help us. So that gives us a shopping list of things to look for. After the data management plan's approved the scientists go off and do their research, and then at the end of that process they start cataloguing their data with metadata records, they archive the data in our data centre, and that means we then publish the data and we will assign the data set a DOI so that the scientists can use that in citations, and also just to really get their name out there a little bit more. Obviously once they've got the DOI they can use that in their own papers, so they can cite their own data. All of that is underpinned by the AAP data policy; there's a link there on screen. What then follows all
that is that the scientists then come back and say we'd like to do another project, and we review how they've previously gone with their data management. So we'll have a look over their previous track record to see if they have been good at archiving and cataloguing their data, or if it's something that they don't particularly care about. What happens then is that we will give them a score for their data management practices, and then the external committee that reviews all the project applications will take that into account when they're deciding whether to approve or reject projects. So if scientists don't do the right thing and properly manage their data, it can have an impact on any future work they might want to do. Okay, well,
regarding metadata, at the Antarctic data centre we use the DIF metadata standard, which stands for Directory Interchange Format. That's a standard that was developed and is maintained by the Global Change Master Directory; they are a part of NASA, based out of the Goddard Space Center in Maryland, just north of Washington DC. The reason why we use this standard is because it's the standard that's used by the entire international Antarctic community. About 15 or 16 years ago the entire community got together and decided they needed to settle on a metadata standard rather than using the disparate standards each country was looking at, and DIF was the one that was chosen. As part of that, the Global Change Master Directory provide metadata support to the entire community, as well as providing an Antarctic metadata portal. Currently we have just over two and a half thousand metadata records, ninety percent of which point to data, a lot of which is public, although not all of it, because we do have some commercial-in-confidence data, and obviously we don't make data public until the scientists have had a chance to publish themselves. So as for
our metadata, as I said we have that in DIF format, which you can see in blue up the top, but we also convert that to a large number of other formats. On this page here the blue squares are metadata formats and the red squares are metadata repositories. We convert our DIF standard into the ISO 19115 standard, which is just listed there as ISO. We also have a Marine Community Profile version of that, an ANZLIC profile version of that, and we also have this other funky one, this ANDS version. That's just an intermediate step: ANDS then take our ISO ANDS metadata record and convert that into RIF-CS, which is then made available via the ANDS repository. As you can see, I've listed the organizations I can think of that harvest our metadata records, and there are quite a few. Most of them just take the plain ISO. I have just realized I've put BoM and Thomson in the wrong section, because they actually take our plain DIF metadata; they don't take the ISO, they just take the regular DIF. But anyway, that's all by the by. One thing I should point out is that all these people that are harvesting our metadata records do so via something called WebDAV, so we have all of our records
there in web-accessible folders, and we put those up there and then people just grab them and harvest them when they're ready. I should point out that the Australian metadata directory require us to use a network for us to get our records into their system, but we haven't quite got that running properly yet, so that's more of a wait and watch this space. The reason why we do this harvesting is because it helps increase the exposure not only for us but also for our scientists. The more metadata repositories we can get our records out into, the greater the spread and dissemination of all these metadata records, and that means that they're giving our scientists much more chance of their data being noticed and therefore being picked up by other scientists, and then obviously they get cited, or it could lead to further collaborations for the scientists themselves. Okay, there are a few issues
though with this. The top point there: as I say, in our experience harvesting is usually a bit more pull than push. So whereas people like BoM or ANDS or whoever are grabbing the metadata records off us when it's convenient for them, some organizations are good at doing that regularly or automatically; others don't do it quite so often, and that can lead to sync problems, because we're updating our metadata records all the time, and if somebody hasn't harvested our records for a few months then the set that they've got obviously gets out of date. We monitor that a little bit every now and then, but that's more of an "I remember to go and check every now and then" process rather than anything automatic, so we could probably improve that. Also there's a slight issue with re-harvesting, which I'll just show you in this diagram. I've greyed out sort
of most of it, but what happens here is that ANDS, you can see, get our metadata from us, so it comes down through DIF and into ANDS, but ANDS also harvest metadata records from the Australian Ocean Data Network, who grab an ISO version of our data. Likewise SOOS, the Southern Ocean Observing System, also harvests metadata from the AODN as well as getting it directly from us. So the problem there is that we then have two copies of one metadata record
which can end up in the same system, and sometimes these records aren't quite the same. That's because other organizations like the Australian Ocean Data Network and ANDS, say, will tailor the metadata records very slightly to suit their purposes, which is fair enough, because it's their repository, it's their catalogue, so they should make them look how it fits for them. What that has the effect of is that when the records all end up in this other repository, even though they're the same records, they won't look quite the same. We're trying to address that by adding a UUID to all of our records, so that gets preserved all the way through, so that users, if they see two records which look to be similar, can at least compare the UUID and go, okay, that is the same record, so I only need to worry about looking at this one. We also try to add a point-of-truth URL to all of our records, to point back to the Australian Antarctic Data Centre original, so that all these users can then get back to the master record. Okay, so that's basically it for harvesting and for our project cycle, so thanks for listening, and if
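The UUID and point-of-truth idea described above can be sketched with Python's standard uuid module. This is a hedged illustration only: the field names (uuid, title, source_url, harvested_via), the record titles and the example.org URLs are hypothetical and are not the AADC's actual schema or addresses.

```python
import uuid

# Hypothetical record structure; field names and URLs are assumptions
# made for this sketch, not the data centre's real metadata schema.
SOURCE_URL = "https://example.org/metadata/record-1"  # placeholder URL

record_id = str(uuid.uuid4())  # random UUID: 128 bits, 32 hex digits

direct_copy = {"uuid": record_id, "title": "Weddell seal observations",
               "source_url": SOURCE_URL, "harvested_via": "direct"}
# The same record after another harvester lightly tailored it.
relayed_copy = {"uuid": record_id, "title": "Weddell Seal Observations (AADC)",
                "source_url": SOURCE_URL, "harvested_via": "relay"}
other_record = {"uuid": str(uuid.uuid4()), "title": "Sea ice thickness",
                "source_url": "https://example.org/metadata/record-2",
                "harvested_via": "direct"}


def group_by_uuid(records):
    """Group harvested copies so duplicates of one record sit together."""
    groups = {}
    for rec in records:
        groups.setdefault(rec["uuid"], []).append(rec)
    return groups


groups = group_by_uuid([direct_copy, relayed_copy, other_record])
for rec_id, copies in groups.items():
    routes = ", ".join(c["harvested_via"] for c in copies)
    print(f"{rec_id}: {len(copies)} copies via {routes}; "
          f"master at {copies[0]['source_url']}")
```

Because the identifier survives every conversion, two copies whose titles differ still group together, and the shared source URL gives the reader one place to find the master record.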
anybody's got any questions I'm happy to answer. Thanks. Thanks Dave. It's almost like a 23 Things bingo in the number of things you covered off that we've discussed in 23 Things: data policies, funder policies, where you mentioned how future projects are assessed partly on past performance with data management, metadata standards and the fact that there are many of them, and that's why we love them so much, crosswalking, and also those multiple discovery points that you get from having those records harvested. One of the things that we did look at in the 23 Things program was the impact, in a way, of harvesting and the differences between records: how they start from a source, as you indicated, but as they get harvested to different repositories they may look a little different even though they've all come from the original source. So that was actually a really nice summary of a number of things that we've looked at, with some real life examples there, so thank you for that. Does anybody have any questions yet? Sure, Jerry, there are a couple of questions. Well, this one is a comment; it says "like the view of data management plans as shopping lists for what is needed to support a particular project, am going to borrow that analogy". And then there's another comment, which I suppose is a question: "it is a surprise that SOOS does not use the marine standard like IMOS and AODN do; do we know why that is?" Yeah, I sort of put this slide together yesterday quite quickly, and I wrote off to the SOOS data person just for clarification, and they're unfortunately travelling to the US at the moment so I didn't quite get an answer back, so I put my best guess in. Okay, and then the question is: what is a UUID? A UUID is a universally unique identifier. I think it's a 16-digit number; it's got a lot of numbers and letters and stuff, so it's essentially a unique code that you can just assign to anything you like. There are random UUID
generators on the net and I just use those, and every time I make a metadata record we make one and stick it into our record. Okay, and then another question is: how many scientists would be working on data at any one time? Any one time, okay, let's see how we're organized; it's kind of hard to answer. We have 60 projects roughly each summer season. There are probably 50 to 55 chief investigators across those projects; each project is headed up by a single scientist, and some of those scientists manage multiple projects. Within each project there can be up to 15 people working, but I think it's probably more likely to be five or six. So there are 60 projects, so maybe say 300 scientists at any particular time working in the program. That's obviously a very rough number, and a lot of those will be PhD students working for their supervisors. I think that's the end of the questions that have come through. Jerry, back to you. Okay, thanks Susanna. I'd like to thank Ivan and Dave both for their time today. I think both offered some really good insights into the real world of working with data, some of the drivers for doing data management and also some of the solutions and ways of going about it. So many thanks Ivan and Dave, and a sort of virtual applause from our audience. We appreciate your time today and your insights into your activities as data managers, so thank you for that. Okay, people, just before they go, there's a high complexity, but there's also another question, and that's for Ivan: do you think that the Journal of Visualized Experiments helps reproducibility? I'm sorry, I don't know that journal and so I can't comment. Well, that was the last one that came through. Okay, there's a general comment: if everyone wrote data analysis code and made that available along with data and papers, that would be the minimum requirement to solve the reproducibility crisis; is that something journals require or assist you
to do that and then the it's a big yes thumbs up thumbs up for that one okay thank you thank you for your time and we'll move on now because we've got quite a lot to cover so we've
got a lot to cover today. So what I thought we might do next is just have a look at some of the new things that have cropped up on the 23 Things web page. Some of you may have already noticed them, but I'd like to point out a couple of things in particular.

One, here on the left-hand side, is our repurpose toolkit. This is something we've been talking about for quite some time, and it's now available for you to use and reuse as you wish. What we've done here is create one page where we've brought together all the 23 Things and their activities into a single long Word document that you can download and then edit, to contextualize it and change the examples to suit your needs. So if you wish to run all or part of the 23 Things program in your organization, the materials are there now that would allow you to adapt what we prepared for this program for your own particular purposes.

We've also got some ideas about how you might want to use it. One of them that you can see here on the screen is the 10 Medical and Health Things. A colleague, Kate LeMay, has extracted 10 out of the 23 Things to create a short course, and in it she's reused some of the things but changed the examples to be quite focused around the health and medical area. So you can see there would be options there to contextualize it for different audiences and different subject domains, or to just run a short course on sensitive data and health data. There are a number of ways that you could reuse the materials and repackage them for different sorts of audiences.

Another thing that we have here is the materials that we used to run our crash courses earlier this year, so that if you wanted to run something similar in your organization, the materials are all there in a format that you can edit and adapt however you wish. We'll be putting up the Sprint to the Finish course materials when they're available, so they'll be up in the next couple of weeks. And we've got some examples down here of how the materials could be used, just to give you some ideas to spark a few thoughts about how you might be able to use some or all of the activities in the 23 Things program for your own purposes.

What we'll also be encouraging you to do is this: if you do create a short course or some other adaptation of the 23 Things program, let us know if you're happy to share it, and we'll put those examples up on this page so that others
can see what other people are doing. It can become a bit of a resource-sharing area, so that if you've created something and you're happy for others to use it, we've got a place here that can be used to share those materials.

Another section on our website is the resources section. Again, this supports reuse of the materials: we've put up all the things that we've used during the program in a downloadable format. So there are the posters that you would all have had or seen at some point, which provide the overview summary of the program. The reward stickers: you can either ask us to send you some or you can print your own, as we've got the template up there. The same goes for the bookmarks and the other materials that we've used, like the sensitive data decision tree and the data citation poster and pamphlet. They're all here together, easy to find, and easy to download, or you can just email us and we'll send you printed copies if that's easier.

So I guess the message here is that we really are very keen for you to reuse the materials as much as possible. They're there for you to exploit in any way you can. Everything in the toolkit is licensed CC BY, so please have a bit of a look through, see whether there are some things there that you might find useful, and we'd love to hear about any ideas that you have for how you might reuse these materials in the future.

So that's the reuse toolkit, including the 10 Health and Medical Things, and the resources. Please have a look through and see how you can use those materials. Now,
also on the 23 Things home page here is information about all our Sprint to the Finish workshops. We ran the crash courses earlier this year and they were quite a hit, so we've decided that we will run what we're calling Sprint to the Finish courses. It's a similar idea, where you get together for half a day and, in this case, work through Things 13 to 23. Some of you might like to take that opportunity to whip through the final few things, or to catch up, or just to do things with other people. All the details are here: dates and registration details. If you click on the register link for the event that you're interested in, it will give you all the details about the location, time, et cetera, and all the background information. These, as you can see, are starting from next week, so it's a good idea to get in early and register for the workshop that you're interested in. They're free to attend, but you do need to register so that we have some idea of numbers.

Also, while we're talking about events, eResearch Australasia is coming up in October. It normally has quite a few people there, and we'd love to see you if you are coming along to the eResearch conference in Melbourne. We will have an informal catch-up of 23 Things participants during the workshop. It will involve some form of sugar; we don't know yet what that form will be. It's just intended to be an informal catch-up of people who've participated in the program, so stay tuned for more information if you're planning to be at eResearch this year.

So that's a quick whip through recent resources and upcoming events. Now let's just have a quick look back at some of the things that we've looked at
since our last catch-up. If you're keeping up with the focus weeks, we've worked through Thing 17, which is about data literacy and outreach. This thing was an opportunity to explore data literacy resources for all sorts of people: students, citizen scientists, teachers and librarians were just some of the user groups that we were targeting. It was also an opportunity to read an introduction to the four different types of carpentry courses that aim to build technical skills in wrangling data. Some of you will have heard Belinda Weaver from QCIF in Brisbane talk about the software and library carpentry courses that she runs. These have certainly been generating a lot of interest in the library community, so we thought it was a good opportunity to introduce those carpentries.

Thing 18 was about data interviews, and we knew that would be a popular thing, because a lot of people are involved in talking to researchers about data, whether it's formally, for data interviews, or informally, to talk to them about services that are on offer at their institution. So in this thing we shared some really great resources for conducting formal data interviews or just starting a conversation about data. We had some great feedback about the Monash toolkit that had been shared, and also the "What's my pitch" document, which was just a series of conversation starters. People seemed to find those really helpful, and I expect some people will adapt these for their own purposes. That's terrific; that's the whole point of sharing them and what this program is really all about.

Then we moved on to Thing 19, which was about APIs and apps, and this is where the tech rubber really started to hit the road. We all use apps, but what do we really know about them and how they relate to research data? So Thing 19 gave us an opportunity to explore some apps and to get hands-on with the Trove API. I'm sure many of you are familiar with Trove and have probably used it numerous times. What this thing provided you with is a bit more about how it works, and perhaps new ways of using it. So hopefully that was an interesting thing for people who hadn't had a chance to look under the hood of APIs and apps previously.

In Thing 20 last week, "find it with data", it was all about spatial data. In this thing we searched for data using a map search function, we had a look at the wonderful Atlas of Living Australia to see what fauna and flora had been spotted in our suburb or street, and there was also an opportunity to get hands-on with some free online tools to manipulate spatial data. So hopefully there was something in there for everyone. If you are interested in learning more about tools and getting hands-on with data, this is one to perhaps cherry-pick or come back to when you have time, if you didn't have a chance to have a look at the Challenge Me stream last week.

So that's a quick whip through what we've done in the last few weeks. What's coming up next? Well, we have a break week next week: again, another chance to catch up, catch a breath, and cherry-pick over some things that you may have skipped over previously. Then we move on to Thing 21. Thing 21 is another sort of
hands-on, tech-type activity, where the focus is on dirty data. What we have here is an opportunity to find out why we should care about dirty data, and also to roll up our sleeves and have a go at cleaning up some dirty data using some tools, in particular the OpenRefine tool, which is also part of the data carpentry courses. So if the earlier thing around the carpentries sparked a bit of interest for you, this one gives you an opportunity to get a little bit hands-on with what data carpentry is all about, using the OpenRefine tool.

Next we will move on to Thing 22, and this is where we learn a little bit more about Australia's research data management ecosystem and unlock some of the mysteries of the acronym soup, ANDS being one of them, along with many others that we've referred to during the course of 23 Things. This is an opportunity to find out a little bit more about the Australian research data management system, but also to have a closer look at some of the research data related facilities and services that you may not be aware of. There are some really interesting things to look at, like the Data Cube and the Square Kilometre Array, some of our big data initiatives.

And then finally, woohoo, we'll get to Thing 23. This is really going to be an opportunity to think about where to from here: what are some of the other opportunities to keep on learning, and to think about your data future. So I encourage you to have a look at this one and, as part of a group or individually, to think about what you might do next as part of your data management learning journey. Also in Thing 23 there is a short survey that we would really love you to complete. We're keen to evaluate the program and hear what you've got out of it. So if you stick with us to Thing 23, or even if you don't, please come to Thing 23 and click through to the link for the short survey and let us know how you went. Also in Thing 23 there's a certificate of participation you can download, and of course you can claim every badge if you've been collecting your credit badges.

After that, we will be having our final webinar catch-up on October 16, where we will wrap up the program for this year. So it's probably a good time to remind you all that we won't be running the program in its current format again next year. But all the materials that you can see on the ANDS website now, the 23 Things, the repurpose toolkit, the
resources: all those materials will stay up there into the future. But at the end of October we will be winding down the current program, so we won't be running the regular webinars, catch-ups et cetera again next year. So if you can, finish the program with us: keep on doing your things, or think about joining a Sprint to the Finish course when it comes to a location near you.

But we here at ANDS have been thinking a little bit about next year. The program has been well received and we've had some great feedback about it, so we are thinking about what we might be able to do next year to maintain a connection to the 23 Things program, and perhaps offer some workshops or sessions that build on what we've been doing this year. We've just got a few ideas that we're kicking around at the moment, so keep an eye out at ANDS or come along to our final webinar to find out what might be happening with the 23 Things program next year.

Also, we've got some great speakers lined up for our final webinar, and it's going to be an opportunity for us to celebrate getting to the end of the program, so hopefully you'll be able to join us for that. And if you are planning a celebration in your group for the completion of 23 Things, we'd love to see your plans reflected on Twitter, or drop us an email, so that we've got some idea of how people are winding up the program amongst their groups.

That was really all I wanted to cover today, but if anybody has any questions, comments or suggestions, please put them in the chat box or in the question pod. Anything there, Susanna?

None at the moment, Gerry.

Okay. Well, thank you. I've been rabbiting on for the last 20 minutes, so it's probably time for me to wrap up. Another thank you to Dave and Ivan, our speakers today, for giving up some of their valuable time to talk to us and give us some great insights. And thank you all for coming along today and sticking with us through to this stage of the program. It's been quite a long journey. 23 is a large number of things, and we've worked out it's a long time to sustain motivation and to stay engaged, but we're really delighted with the number of people that have stayed with the program in some form since we started back in March.
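
As an aside on the UUID question answered earlier in the session, here is a minimal sketch of generating an identifier for a new metadata record. It assumes Python and its standard-library uuid module rather than the online generators Dave mentions; the function name new_record_id is hypothetical, and nothing here is taken from the data centre's own tooling. For reference, a standard UUID is a 128-bit value, usually written as 32 hexadecimal digits in five hyphen-separated groups.

```python
import uuid


def new_record_id() -> str:
    """Return a random (version 4) UUID as text, e.g. for a new metadata record.

    Hypothetical helper for illustration only; the transcript does not say
    which generator or UUID version the data centre actually uses.
    """
    return str(uuid.uuid4())


record_id = new_record_id()
print(record_id)                        # 36 characters including hyphens
print(len(record_id.replace("-", "")))  # number of hexadecimal digits
```

Because version 4 UUIDs are drawn at random from a very large space, two records created independently are extremely unlikely to receive the same identifier, which is what makes them convenient to mint without a central registry.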