Introducing the ANDS Guide + A Sensitive Data Success Story - 15 Oct 14

Video in TIB AV-Portal: Introducing the ANDS Guide + A Sensitive Data Success Story - 15 Oct 14

Formal Metadata

Introducing the ANDS Guide + A Sensitive Data Success Story - 15 Oct 14
Alternative Title
A Sensitive Data Success Story
Title of Series
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Release Date

Content Metadata

Subject Area
Dr. Sarah Olesen, the author of the ANDS Guide to Publishing & Sharing Sensitive Data, will define and describe the sensitive data landscape and take you through the steps involved in publishing sensitive data. Sarah will cover topics from the Guide, including navigating legal and ethical requirements, publishing and sharing data with conditional access, and how to confidentialise sensitive data. In the second part of this webinar, researchers from the Australian Longitudinal Study on Women's Health will describe how they have published and shared health and social data from this influential research project for almost 20 years.
I'm first up. What I'll do today is provide some context
around the sensitive data guide and what the goals of the guide were. I'll introduce its content and the key places that might pique your interest, because obviously I won't have time to run through the detail of everything that's in the guide. I'll then talk about where to from here: having put this guide on the web, and having had some interesting responses to it already, what are we going to do from now on and how are we going to add to it? Then I've set aside some time for discussion and any new ideas that you as the community have following the presentation today. So,
what is the Guide to Publishing and Sharing Sensitive Data? We released the guide on the twenty-third of September, so just a couple of weeks ago, alongside one of the ANDS newsletters. If you want to look at that newsletter, because it's got a summary of the guide as well and some other bits in it, the link is there. The guide's written for, well, anyone who's managing sensitive data, so that includes data managers, researchers, members of the research office, librarians, anybody really. It is introductory-level information; however, the guidance through the steps and decisions within the guide, in terms of directing you to what to do and what sequence to do things in, is relevant to everybody regardless of your level of understanding and expertise. Why did we
create the guide in the first place? Well, as you know, and most of you are probably here because you do use or deal with sensitive data, it can be a little bit trickier than other forms of data. It involves extra steps to publication and sharing that need consideration, mainly things around the legal and ethical side. At the moment, unfortunately, there is little publicly available guidance to help navigate the process and to guide decisions about publishing sensitive data. What we found out there was quite disparate in terms of the level it was aimed at, the level of detail and its inclusiveness as well. By that I mean that often parts of the steps involved in publishing sensitive data might have been included or spoken about, but not the full sort of whole, which is what we've tried to do in the guide. Consistently there was an expressed need amongst the data community for navigation and some consensus around publishing sensitive data. That isn't to say that there's ever going to be a one-size-fits-all for sensitive data, but some consensus on the process of how this is done: what steps everybody needs to consider to get from "here's my data" to having it published. Probably most importantly, all of these points have prevented researchers and data managers from reaping the potential benefits of publishing their sensitive data. So those are benefits for
you as the researcher or the data manager, in terms of the data becoming discoverable, which can lead to citations and collaborations; reputation and profile, which can lead to future funding; tracking the reach, output and reuse of your data; and the ability to publish in leading journals. A number of leading journals, and PLOS is always a good example of that, now mandate that data, regardless of what kind of data it is, needs to be published alongside the paper. There is also data security in terms of storing the sensitive data, as well as meeting funding and paper publishing obligations. Most funders in Australia at this point are not mandating the publication of data, including sensitive data, although there is encouragement along that way. But what you may find is that if you have collaborations with institutions overseas, particularly in the US and the UK, major funders there like the Wellcome Trust or NIH do mandate publication of data, including some forms of sensitive data. And of course there are benefits for wider science in terms of scientific rigor, since any forms of data that can be published can of course be checked by others, and in terms of the values of open access. Particular to sensitive data, and something which might not apply to other forms of data, is that the kinds of data that we include in this category are often those that are most expensive and time-consuming to collect and most taxing on participants. If that data can be published and potentially reused, then there are big points for the efficiency of research along those lines. So how did the guide
take shape? What did we do? Well, we know there's an absence of sensitive data records in repositories, including our own, Research Data Australia, and this is for the reasons that I mentioned in the earlier slide. So we had some discussion and community feedback around this, a review of the literature that was out there, and much consultation and editing with experts in legal and ethical fields, as well as people who were experts in particular forms of sensitive data, such as psychological data. What we opted for was to focus on the user-friendliness of the guide: a user-focused guide that included the major decisions and the steps to publication in a clear, easy-to-follow way. This is based around a flow diagram or a decision tree, which I'll get to very shortly. The focus, or the key
features of the sensitive data guide are, firstly, to clearly outline the sequential steps involved in publishing and sharing sensitive data specifically, although the steps to publishing and sharing any form of data are relevant in this. So if you're not dealing with sensitive data you might still find the processes or the steps outlined in the guide quite useful. Secondly, to provide a decision framework for going through these steps: I've got this kind of data, what do I do next, what's the appropriate sequence through each of those steps? You can tick them off as you go along. Thirdly, to give encompassing definitions and some methodology for each step. It's quite hard to write an all-encompassing, inclusive definition of sensitive data, because many kinds of data will fit in that category, and I'll go through some of those shortly. So keeping that in mind in writing definitions, or describing for example what sensitive data is, we kept it relatively inclusive, so that the focus is on encouraging the reader to think about what it is that might make their data sensitive from a legal and ethical point of view. And lastly, legal and ethical experts provided advice throughout and before the release, as did sensitive data managers in some fields, which I mentioned.
So in terms of thinking about data management, ANDS is obviously involved in, and we've given many webinars on, various aspects of data management. Just to provide a bit of context, the guide is focused largely on the publication and sharing part of that cycle. It is of course predicated on good data management, so your data has to be in good form before you can consider publishing it, and then the point of doing it is obviously to reap the benefits of that: data citation, collaboration, potentially future funding. The guide's not intended to replace or
override any institutional policies that you've come across. For example, you might have within your institution quite specific policies regarding data treatment, or how the data is confidentialised, or how it's stored, or what repositories to use, and intellectual property policies can vary a bit between institutions as well. It's intended to be a guide rather than a technical manual, so the focus, as I mentioned before, is really on looking at the overview of the process of publishing data, rather than detailed instructions on each of the processes along the way. There is some information there, but that's not to say that there won't be scenarios where you'll need greater detail or quite specific methods for some kinds of sensitive data. We aim to keep updating a bibliography of where more specific instructions can be found for each of those steps, which will fit within the guide. And of
course there's more to come. The sensitive data guide is iterative in its content: we'd like to keep updating it and adding more following feedback from the community, including feedback from you. We're looking in 2015 to provide perhaps some more comprehensive mini-modules or add-ons on specific aspects of sensitive data. Some that have come up already in feedback from releasing the guide are along the lines of ecological data specifically, linked data and cultural data. So I look forward to more discussions about these with yourselves, and there will be plenty of opportunities for input. Topics covered in the guide: now, I'm
not going to go through all of these in detail because we simply won't have time, but just to pique your interest, in case you're thinking "well, what's in there, should I actually go and read further?", these are the main topics that are in there: defining what sensitive data is; confidentialising sensitive data; ethical considerations; legal matters and licensing the data, and I'll talk a little bit more about those today; making data discoverable via a data repository; and what to publish and share, or under what conditions to publish and share the data. The guide includes some definitions; it flags where institutional policies may come in and might differ, so that you know to go and look for those within your own institution; and it has extra information that might help you in making your decisions. It includes guidance not only for new data but also if you're managing or you have existing data, and whether the data is owned by you or owned by others. Key messages that have come out of the
guide, and I'll probably hit you across the head with these a couple more times before the end of the presentation, are as follows. You can publish a description of your data, that is the metadata, without making the sensitive data itself openly accessible. You might have heard this before in terms of the public-private contrast: making metadata public but the data itself private, or accessible only under some conditions. You can place conditions around access to the data. Publishing your data, or just a description of your data, means that others can discover it and cite it, so that should be, or probably will be, the thing that you're most interested in when getting it out there. Sensitive data that has been confidentialised, that is, modified in a way that it is no longer sensitive, may be shared in many circumstances. And lastly, be a scout and plan ahead: there are things that you can do before you collect your data, or when you're collecting your data, to make the process of publishing and sharing the sensitive data a lot easier. So the guide is based around the
idea, as I mentioned, of how we go from having the data to publishing the data: what are the steps involved, how do I make the decision about what those steps are, and what sequence do I do them in? For example, I work in an area of epidemiology, so I'm predominantly using other people's data. So I would say yes, I have sensitive data, and then I would follow along: am I collecting new data or have I already got the data? Is the data mine or is it somebody else's? Was it collected by me? As you can see, the idea is to be able to quickly tick off boxes and work through each of those steps until you get to your desired end point. So to begin with I thought we'd
start with the first step in that box, which is probably the reason why many of you are here: is my data sensitive, or are my data sensitive? This is the definition that we've got in the guide; I'll give you a moment so that you can all read that. "Sensitive data are data that can be used to identify an individual, species, object or location that introduces a risk of discrimination, harm or unwanted attention. Under law and the research ethics governance of most institutions, sensitive data cannot typically be shared in this form, with few exceptions." It's actually very difficult to define sensitive data inclusively, which many of you in the field would know, even though this is the very starting point to our guide and to the process. This is because the definitions out there at the moment can be quite disparate, and they're typically discipline-specific or sometimes institutionally specific. What we wanted to do was to define sensitive data in a way that went back to the core principles of what it is. So the first thing is that it includes data which identifies a person or thing, or perhaps sometimes even an event or activity, and that this identification may introduce a potential risk of discrimination or harm. In being inclusive, and in being somewhat broad, it encourages the reader of the guide to think about what and whether their data are sensitive. Sensitive
data, as you would know, crosses many disciplines of research. Generally it's separated into two main categories, the first being human data. This is probably what most people think of when they think of sensitive data. This includes, and this is not an exhaustive list, health data, so medical records, data from clinical trials and epidemiological records, and data from areas of social science. Social science is obviously research or data looking at the relationships between individuals and any aspect of society really, so common fields are political science, sociology, psychology and also some fields of the humanities. A common example of social science data which may be sensitive is data from surveys such as the census. There is also cultural data, for example from research projects which collect information on sacred practices, events, locations and other information such as that. The other main category is
ecological data. A common example of that might be data about the locations or practices surrounding vulnerable animal and plant species. Geospatial data, which is now collected alongside human and ecological data quite routinely, can lead to the data being sensitive because it can pinpoint the identity of who or where somebody is. Sensitive data also includes data that's quantitative, so involving numbers, and qualitative, as well as of course geospatial data. So pretty much any form of data, if it includes identifying information and information which can potentially put a person or an object at risk of discrimination, fits into this category. How can sensitive data be
published? Now, as I said, I won't go through the entire guide, but I thought I'd cover the two areas that you're probably most interested in and that tend to be most controversial: the legal point of view and the ethical point of view. So legally, when we talk about
sensitive data, the legal act that is triggered around sensitive data is mostly the Privacy Act. The Privacy Act covers data that contains identifying information about people, and personal information, and you can have a look for more detail in Table 1 of the guide. For example, personal information might be around cultural practices or criminal records, and this kind of data triggers the Act, so these data cannot legally be shared in their original form. If people are no longer identifiable, that is, if the identifying information and relevant personal information that could lead to identification is removed, then technically the Act is no longer triggered. But of course this must meet the definitions of identifiability and confidentiality in that Act, and we go through that in quite a lot of detail in the guide, because it is a point that people are concerned about, quite rightly. We were very pleased to have an expert in this field review that section; he'll be speaking at the next webinar. The other relevant section is the chapter on confidentialising data, which you might be interested to read too. Also from a
legal point of view, something to think about is licensing data. Any kind of data, including sensitive or confidentialised data, should have a license before it's published in Australia. A license explains how the data can be used and attributed; without a license it will be unclear to the reuser how the data can be reused, and this might discourage reuse as well. Some repositories do have their own licenses, but anybody is also able to use the suite of AusGOAL-endorsed licenses, and the link's there if you'd like to go and have a look at those. I strongly encourage you to use it; it's very user-friendly. How can sensitive
data be published from an ethical point of view? Again, this is what is written in the guide, and I think it's a really important part to start from, so I'll just read out the introduction to that section. "In addition to meeting legal standards, researchers have ethical obligations towards participants and research subjects. These include preserving privacy and avoiding any possible harm arising from participation in research and its subsequent publication. The ethical management of data must be the primary concern of researchers to maintain the discipline's trust and research integrity." So of course our primary concern in writing the guide was how to look at publishing sensitive data from an ethical point of view. That of course includes how a researcher or a data manager operates with the ethics applications and committees within their institution. So the key
message for publishing sensitive data in an ethical manner is to plan ahead. Include plans to publish confidentialised sensitive data in your ethics applications, so before the data is even collected if you can. By confidentialised I mean sensitive data from which identifying information, and information which would place an individual at risk of identification and potential harm, has been removed; again, please have a look at the more detailed information about confidentialising within the guide. Also include those plans in any information to participants and in consent forms for human participants. We've got some great examples, from us and from other places, that are being used around the world, about how to include information about the publication of human data when asking permission in consent forms of participants in the research study; that's really handy. The story is of course a bit more complex for existing data, where specific consent to publication wasn't asked of participants. We're largely talking about human participants here. But it can still often be done in some situations, and we've got some very clear steps as to when and how that can be done in the guide, so check out section 4.2 for that. So what
are your options in terms of publishing sensitive data? How do you get it out there in a legal and an ethical way, so that you can reap the benefits or perhaps meet the funding obligations involved in your research? You can place conditions around access to the confidentialised sensitive data, and this would be the recommended action for the vast majority of cases of sensitive data publication. I keep saying "sensitive data publication", but as you've picked up from earlier, what I'm talking about when I say that is sensitive data that has already been confidentialised. Obviously data that's got people's names, addresses or other forms of identifying information is not data that you can legally publish unless it's been modified, and ethically you shouldn't publish it unless it's been modified. So we're talking about data that's been treated in some way so that it no longer places the participants at risk. If you place conditions around access to the data, this is what we would call conditional access. This is where the metadata, so a description of your project and of your data set, is available to the public, but access to the data itself only occurs after predetermined conditions are met. This is already happening in many fields, and we'll go through a great example of that shortly for the Australian Longitudinal Study on Women's Health. Common conditions that can be placed around access to sensitive data are providing information about who the reuser is and how they want to use, store or manage the data. They usually have to agree to conditions of data security and register or provide contact details, and in some cases also agree that they may be contacted by the original data owners for the purposes of collaboration or for other reasons as well. You can set the conditions around access; the majority of repositories will allow you to do that. This is, and you might not be able to read
the detail, but this is just a screenshot from the Australian Data Archive, which deals largely in social science data, and this is actually the record for the Australian Longitudinal Study on Women's Health. It shows the metadata, a description of the data set, but to actually get access to the data itself, you can just see where the two highlighted sections are. These direct the potential reuser, or the reader of this metadata record, as to how they would do that. So if you click on that, it will tell you under what conditions and how you can gain access to the data itself, or what conditions you would need to meet to do that. I'm going to hand over now to Associate
Professor Leigh Tooth, who we're very lucky to have with us today. Leigh is the deputy director of the Australian Longitudinal Study on Women's Health, and she's also the chair of the Publications, Substudies and Analyses Committee, which is the committee that deals with applications to reuse this particular data set. Leigh's going to talk about how their sensitive data is confidentialised so it can be reused, how they have public metadata or descriptions of the data set but with conditions around access to it, and what the benefits of publishing and sharing this study are. Okay, well, thank you very
much there, and I would like to thank Sarah for inviting me to give an overview of our study and exactly how we go about sharing some of the very sensitive and personal data that our women have provided us. Just to start with, this quote from our study director, Professor Gita Mishra, illustrates how fundamental data sharing is to the study, and that it is very much a public resource, funded by the government and available to all people who want to use it, providing they follow certain conditions. So that's just a really nice overview to start. So just for those of you who are not
aware, the longitudinal study is a collaborative project of the University of Newcastle and the University of Queensland, and it's been going since 1995, so it's one of the longest-running longitudinal studies in Australia. We were lucky to be able to recruit over 40,000 Australian women aged between 18 and 75 back in 1996, and we added another 17,000 young women to our study last year. So just to give you
an overview of exactly who our women are: we've got three age cohorts in the original sample, who were recruited in 1995 and 1996. We have women born between 1921 and 1926, who were aged 70 to 75 when they joined the study; then we have a cohort of women born between 1946 and 1951, who were then aged 45 to 50; and we had a group of women born between 1973 and 1978, who were then aged 18 to 23. You can see now that these women have aged: our oldest women are now entering their late 80s and early 90s, our mid-aged women are now in their mid 60s, and our very young women are now in their late 30s and early 40s. So after the 2010 National Women's Health Policy and discussions we had with the government, the government agreed to fund us to recruit a new group of young women, because we argued we were no longer able to represent or provide data about young women in Australia today, because our young cohort was ageing. So we were fortunate enough to be able to recruit another 17,000 last year to become our new young cohort of women. So that's who's in the study.
Basically, what we have done with the original cohorts is we have surveyed them approximately three-yearly since 1996, so we have a wealth of information about them. We are now up to our seventh survey of our young women, which is currently underway as I speak, and we've recently completed our seventh survey of our mid-aged women. With the new young cohort, however, because of the online technologies that are available today, we hope to actually survey them annually now. We collect a whole range of data on all aspects of women's lives, including mental, physical, reproductive and social aspects of their health, asking questions about life transitions, about life events, issues such as employment or caring, looking at health services and so on. The other advantage that we have recently been able to implement is data linkage with national and state-based administrative data sets, and these include information about health service use through the MBS, pharmaceutical use through the PBS, data on incidence of cancer, on hospitalisations, and also perinatal information as well. And just to give you a bit of a feel, we have over 600 people who have used our data, both nationally and internationally, so it's been a very big data source with a lot of people who have used it. So
what impact has the study had? Basically, our data have been reported in over 500 papers across a whole range of journals, and we have had a significant impact in informing national health policies in all sorts of areas, including chronic health conditions, physical activity, violence, nutrition, caring and so on. Possibly our most significant output was towards the 2010 National Women's Health Policy, where our data were cited extensively throughout the policy. Another way that we contribute is
by adding value to other data sources, for example through the data linkage projects that I mentioned earlier. An example of that is a paper published by one of our chief investigators in 2011 in the Medical Journal of Australia, which looked at women's use of mental health services, and at the numbers of women across the country who were accessing mental health services by whether they actually reported having a mental health condition or not. These data were able to help the government in terms of its mental health policy provision. Another example is the way that we can support substudies and also large data-pooling research. An example of that is the DYNOPTA project; some of you may have heard of this. This was called the Dynamic Analyses to Optimise Ageing project, and we were part of a pooled data set from nine other longitudinal studies in Australia that had included some aspect of ageing. So turning back to the
other topic of today's seminar, and this is: how do we actually manage the sensitive information that we collect? Because we collect very personal information, including questions about reproductive events, about sexual identity, about violence, and these are questions that are incredibly personal to people, and we can't just go out sharing that information. Yet our study is considered to be a public resource, funded by the government and open to anybody who really wants to use our data. So how do we actually go about sharing these data legally and ethically? In the next couple of slides I'll just talk about our processes. So basically, when
the women join the study, and in every survey that they complete, they are informed and asked to consent that their survey data will be linked with their previous survey responses, so that we can follow women longitudinally. Women have the choice to say no, I do not want this to happen, in which case we can only use their data in a cross-sectional way. All women can withdraw from the study altogether. We also get women to sign a consent form and to read an information sheet, which openly says to them that your data will be used, but you will not be personally identifiable from your data; we will de-identify your data. So we've set that condition upfront with anybody who's part of the study. And then in
terms of how we manage the data practically, all the surveys that we receive are de-identified and confidential. When a woman first joined the study, her personal identifying information was removed from the survey, and in its place she was given what's called an ID alias, which is just an alias number; for example, 206069 is the ID alias of one of the women in our mid-age cohort. The unique identifier linking the ID alias to the personal identifying information of a particular woman is held securely at the University of Newcastle, and only the data manager at the University of Newcastle has access to these. So the rest of the staff, me included, at the University of Newcastle and the University of Queensland have no idea who the women are who are in our study. In any data set that we send out for people to use for analysis, the first column is the ID alias, so that they can then link that ID alias to future surveys and future data, but again they have no idea who that woman is. Another level of protection that we offer: because we have women from all over Australia, including rural and remote areas, we have the geocoded data of those women's addresses, and some researchers who are doing research on geocoding, looking at, for example, women's response to drought, may want to know the postcode that a woman lives in. We have strict criteria that we do not release data for anything smaller than a certain geographical area, so that there is no way that anyone could work out that the ID alias that comes from that postcode could be a particular woman. So we have metadata available
about the ALSWH in several national repositories, including Research Data Australia, the Australian Data Archive and Trove, and this basically is just a description of the sort of data that we have. But if you want to access our data, you must come and ask us and get permission to do that. We have our own public website that's very complex and has an awful lot of information in it, but we have a specific section about how to access the data. If you click on that link, then you have to come and put an application in. So basically you
have to complete an expression of interest form. You have to provide information about yourself, about what you want to do with the data, your research question, the variables you want to ask for, your analysis plan, and all the details about exactly what you want to do with the data. If you want to use the linked data, you have to provide a justification for why you need the linked data, and you have to provide information about what you intend to do with the data: what publications you're planning, what conference presentations and so on. This application is reviewed by the Publications, Substudies and Analyses Committee, which consists of all of the steering committee members plus a couple of other experts. This process takes a couple of months, and each application is reviewed on merit to make sure that it's an appropriate use of the women's data and that it's feasible. If that is approved, then as a data re-user you need to sign a statement of data use and also confidentiality statements. Anybody who uses our data must sign these statements, and these basically cover aspects of what you are going to do with the data, that you're not going to send the data on to somebody else who hasn't signed these documents, and that you're going to treat the data properly. Once you have signed these documents and received the data, and you can begin your analyses, we ask you then to provide a six-monthly progress report and update, so that we can keep a rough idea that you're doing what you said you were going to do and not breaching any agreements that you may have had with us. Right, thank you very much, Leigh. Following
on from Leigh, just in summary: we discussed earlier what's included in the guide, and then heard a great story about how sensitive data, which we often think about as something that's too difficult to share, is something where there already are, and for a long time have been, some great examples of data publication and sharing on a very large scale and in very influential and successful ways. So, given that data publication and sharing can be done, what you might like to go away today and think about is: what can I do to start, or what can I do now? That is to familiarise yourself with local policies for your institution around intellectual property and licensing of data, whether you have any recommended policies around the confidentialising of data, and about repositories to use. And if you're at the stage where you haven't already collected your data, take advantage of that and plan ahead. In terms of ethics applications and consent forms, include information about potential data sharing for your participants and for your ethics committees before the data is collected. That will save you a lot of grief and make this process of publication and sharing, and the potential benefits of that for you, much, much easier down the track. You can start by
publishing your metadata. If you've already got your sensitive data set there and you're wondering, well, what can I do now, how can I start taking advantage of this data set now, you can publish a description of your data without making the data itself openly accessible. As we've said a number of times, this is the recommended way to go about publishing confidentialised sensitive data, because no one will know about your data if you don't publish the description of the data, at least to begin with. It's very rare to be unable to put a good description of your data out there, with further information about when or how, if it's already available, access to the data can be gained. If you put the description of your data out there, there's nothing to stop you doing that and then continuing to work on access. Maybe you need to treat your data in some way in terms of confidentialisation, and you can include information about that in the description of your project. You can keep working on access conditions if you require some further expertise, advice or time to do that, but getting a description of your data out there is something that can be done almost immediately. We really greatly value
your feedback on this topic, particularly because, as I mentioned, it is an iterative process and we're going to keep adding more detail and more references to the sensitive data section of the ANDS website. If you've got content to share, if your institution has some guides or information around this topic, please send them either directly to me or via our main
contact page. We'd love to hear from you.
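
The ID alias and minimum-geography steps described in the talk could be sketched roughly as below. This is only an illustrative sketch, not the study's actual procedure: the function names, the field names, the six-digit alias format and the `min_population` threshold are all assumptions made for the example, and the talk does not state what geographic criteria the study really applies.

```python
import secrets


def pseudonymise(records, id_fields=("name", "address")):
    """Split raw records into a key table and de-identified research rows.

    The key table (alias -> identifying details) would be held securely by
    a data manager; only the research rows would ever be sent to re-users.
    Field names and the alias format here are hypothetical.
    """
    key_table = {}
    research_rows = []
    for rec in records:
        # Draw a random six-digit alias, retrying if it is already taken.
        while True:
            alias = 100_000 + secrets.randbelow(900_000)
            if alias not in key_table:
                break
        key_table[alias] = {f: rec[f] for f in id_fields}
        # The alias goes first, mirroring "the first column is the ID alias".
        row = {"id_alias": alias}
        row.update({k: v for k, v in rec.items() if k not in id_fields})
        research_rows.append(row)
    return key_table, research_rows


def suppress_small_areas(rows, area_population, min_population=1000):
    """Blank the postcode where the area is too small to release safely.

    `area_population` maps postcode -> population; the threshold is an
    assumed placeholder, not a figure taken from the talk.
    """
    released = []
    for row in rows:
        row = dict(row)  # copy, so the caller's rows are left untouched
        if area_population.get(row.get("postcode"), 0) < min_population:
            row["postcode"] = None
        released.append(row)
    return released
```

A re-user would only ever receive the output of `suppress_small_areas`, which lets them link one woman's surveys over time through `id_alias` without being able to tell who she is.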