Introducing the ANDS Guide + A Sensitive Data Success Story - 15 Oct 14
Formal Metadata
Title | Introducing the ANDS Guide + A Sensitive Data Success Story - 15 Oct 14
Alternative Title | A Sensitive Data Success Story
License | CC Attribution 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Release Date | 2014
Language | English
Content Metadata
Abstract | Dr. Sarah Olesen, the author of the ANDS Guide to Publishing & Sharing Sensitive Data, will define and describe the sensitive data landscape and take you through the steps involved in publishing sensitive data. Sarah will cover topics from the Guide, including navigating legal and ethical requirements, publishing and sharing data with conditional access, and how to confidentialise sensitive data. In the second part of this webinar, researchers from the Australian Longitudinal Study on Women's Health will describe how they have published and shared health and social data from this influential research project for almost 20 years.

00:00
I'm first up. What I'll talk about today is to provide context
00:07
around the sensitive data guide and what the goals of the guide were. I'll introduce its content and the key points that might pique your interest, because obviously I won't have time to run through the detail of everything that's in the guide. I'll then talk about where to from here: having put this guide on the web and having had some interesting responses to it already, what are we going to do from now on, and how are we going to add to it? Then I've set aside some time for discussion and any new ideas that you as the community have following the presentation today. So
00:40
what is the Guide to Publishing and Sharing Sensitive Data? We released the guide on the twenty-third of September, so just a couple of weeks ago, alongside one of the ANDS newsletters. If you want to look at that newsletter, because it's got a summary of the guide as well and some other bits in it, the link is there. The guide is written for, well, anyone who's managing sensitive data, so that includes data managers, researchers, members of the research office, librarians, anybody really. It is introductory-level information; however, the guidance through the steps and the decisions made within the guide, in terms of directing us to what to do and what sequence to do things in, is relevant to everybody regardless of your level of understanding and expertise. Why did we
01:24
create the guide in the first place? Well, as most of you are probably here because you use or deal with sensitive data, you know that it can be a little bit trickier than other forms of data. It involves extra steps to publication and sharing that need consideration, mainly around the legal and ethical side of things. At the moment, unfortunately, there is little publicly available guidance to help navigate the process and to guide decisions about publishing sensitive data. What we found out there was quite disparate in terms of the level it was aimed at, the level of detail, and its inclusiveness as well. By that I mean that often parts of the steps involved in publishing sensitive data might have been included or spoken about, but not the full whole, which is what we've tried to do in the guide. Consistently, there was an expressed need amongst the data community for navigation and some consensus around publishing sensitive data. That isn't to say that there's ever going to be a one-size-fits-all for sensitive data, but some consensus on the process of how this is done and what steps everybody needs to consider to get from "here's my data" to having it published. And probably most importantly, all of these points above have prevented researchers and data managers from reaping the potential benefits of publishing their sensitive data. So those are benefits for
02:53
you as the researcher or the data manager, in terms of the data becoming discoverable, which can lead to citations, collaborations, reputation and profile, which can lead to future funding; tracking the reach, output and reuse of your data; and the ability to publish in leading journals. A number of leading journals, PLOS is always a good example of that, now mandate that data, regardless of what kind of data it is, needs to be published alongside the paper. There is also data security in terms of storage of the sensitive data, as well as meeting funding and paper-publishing obligations. Most funders in Australia at this point are not mandating the publication of data, including sensitive data, although there is encouragement along that way. But what you may find is that if you have collaborations with institutions overseas, particularly in the US and the UK, major funders there like the Wellcome Trust or NIH do mandate publication of data, including some forms of sensitive data. And of course there are benefits for wider science in terms of scientific rigour, since any forms of data that can be published can of course be checked by others, and there are the values of open access. I think, particularly for sensitive data, which might not apply to other forms of data, the kinds of data that we include in this category of sensitive data are often those that are most expensive and time-consuming to collect and most taxing on participants. If that data can be published and potentially reused, then there are big points for the efficiency of research along those lines. So how did the guide
04:31
take shape? What did we do? Well, we noticed there was an absence of sensitive data records in repositories, including our own, Research Data Australia, and these are for the reasons that I mentioned in the earlier slide. So we had some discussion and community feedback around this, a review of the literature that was out there, and much consultation and editing with experts in legal and ethical fields, as well as people who were experts in particular forms of sensitive data, such as psychological data. What we opted for was to focus on the user-friendliness of the guide: a user-focused guide that included the major decisions and the steps to publication in a clear, easy-to-follow way. This is based around a flow diagram or decision tree, which I'll get to very shortly. The focus, or the key
05:21
features of the sensitive data guide are, firstly, to clearly outline the sequential steps involved in publishing and sharing sensitive data specifically, although the steps to publishing and sharing any form of data are relevant here, so if you're not dealing with sensitive data you might still find the processes or steps outlined in the guide quite useful. Secondly, to provide a decision framework for going through these steps: I've got this kind of data, what do I do next, what's the appropriate sequence through each of those steps? You can tick them off as you go along. Thirdly, encompassing definitions and some methodology for each step. It's quite hard to write an all-encompassing, inclusive definition of sensitive data because many kinds of data fit in that category, and I'll go through some of those shortly. So the aim in writing definitions, for example describing what sensitive data is, was to keep them relatively inclusive, so that the focus is on encouraging the reader to think about what it is that might make their data sensitive from a legal and ethical point of view. And lastly, legal and ethical experts provided advice throughout and before the release, as did sensitive data managers in some fields, which I mentioned.
06:41
So, in terms of thinking about the data management cycle, which ANDS is obviously involved in, and we've given many webinars on various aspects of data management, just to provide a bit of context: the guide is focused largely on the publication and sharing part of that cycle. It is of course predicated on good data management, so your data has to be in good form before you can consider publishing it. And then the point of doing it is obviously to reap the benefits of that: data citation, collaboration, potentially future funding. The guide's not intended to replace or
07:16
override any institutional policies that you've come across. For example, you might have within your institution quite specific policies regarding data treatment, or how the data is confidentialised, or how it's stored, or what repositories to use, and intellectual property policies can vary a bit between institutions as well. It's intended to be a guide rather than a technical manual, so the focus, as I mentioned before, is really on the overview of the process of publishing data rather than detailed instructions on each of the processes along the way. There is some information there, but that's not to say that there won't be scenarios where you'll need greater detail or quite specific methods for some kinds of sensitive data. We aim to keep updating a bibliography of where more specific instructions can be found for each of those steps, which will sit within the guide. And of
08:15
course there's more to come. The sensitive data guide is iterative in its content: we'd like to keep updating it and adding more following feedback from the community, feedback from you. We're looking in 2015 to provide perhaps some more comprehensive mini-modules or add-ons on specific aspects of sensitive data. Some that have come up already in feedback from releasing the guide are along the lines of ecological data specifically, linked data, and cultural data. So I look forward to more discussions about these with yourselves, and there will be plenty of opportunities for input. As for the topics covered in the guide, I'm
08:57
not going to go through all of these in detail because we simply won't have time, but just to pique your interest, in case you're thinking "what's in there, should I actually go and read further?", these are the main topics: defining what sensitive data is; confidentialising sensitive data; ethical considerations; legal matters and licensing the data, and I'll talk a little bit more about those today; making data discoverable via a data repository; and what to publish and share, or under what conditions to publish and share the data. The guide includes some definitions, notes on where institutional policies may come in and might differ, so that you can go and look for those within your own institution, and extra information that might help you in making your decisions. It includes guidance not only for new data but also for existing data that you're managing, whether the data is owned by you or owned by others. The key messages that have come out of the
09:51
guide, and I'll probably hit you over the head with these a couple more times before the end of the presentation, are these. You can publish a description of your data, that is, the metadata, without making the sensitive data itself openly accessible. You might have heard this before in terms of the public-private contrast: making the metadata public but the data itself private, or accessible only under some conditions. You can place conditions around access to the data. Publishing your data, or just a description of your data, means that others can discover it and cite it, so getting it out there will probably be the thing that you're most interested in. Sensitive data that has been confidentialised, that is, modified in a way that it is no longer sensitive, may be shared in many circumstances. And lastly, be a scout: plan ahead. There are things that you can do before you collect your data or when you're collecting your data to make the process of publishing and sharing the sensitive data a lot easier. So the guide is based around the,
10:52
as I mentioned, this idea of how we go from having the data to publishing the data: what are the steps involved, and how do I make the decision about what those steps are and what sequence to do them in? For example, I work in an area of epidemiology, so I'm predominantly using other people's data. So I would say yes, I have sensitive data, and then I would follow along: am I collecting new data or have I already got the data? Is the data mine or is it somebody else's, was it collected by me? As you can see, the idea is to be able to quickly tick off boxes and work through each of those steps until you get to your desired end point. So to begin with I thought we'd
11:34
start with the first step in that box, probably the reason why many of you are here: is my data sensitive, or are my data sensitive? This is the definition that we've got in the guide, and I'll take a moment so that you can all read it: sensitive data are data that can be used to identify an individual, species, object or location that introduces a risk of discrimination, harm or unwanted attention. Under law and the research ethics governance of most institutions, sensitive data cannot typically be shared in this form, with few exceptions. It's actually very difficult to define sensitive data inclusively, as many of you in the field would know, even though this is the very starting point to our guide and to the process. This is because the definitions out there at the moment can be quite disparate, and they're typically discipline-specific or sometimes institution-specific. What we wanted to do was to define sensitive data in a way that went back to the core principles of what that is. So, first, it includes data which identifies a person or thing, or perhaps sometimes even an event or activity; and second, this identification may introduce a potential risk of discrimination or harm. In being inclusive, and in being somewhat broad, it encourages the reader of the guide to think about what and whether their data are sensitive. Sensitive
12:59
data, as you would know, crosses many disciplines of research. Generally it's separated into two main categories, the first being human data. This is probably what most people think of when they think of sensitive data. This includes, and this is not an exhaustive list, health data, so medical records from clinical trials and epidemiological records; data from areas of social science, so research or data looking at the relationships between individuals and any aspect of society really, with common fields being political science, sociology, psychology and also some fields of the humanities. A common example of social science data which may be sensitive is from surveys such as the census. And also cultural data, for example research projects which collect information on sacred practices, events, locations and other information such as that. The other main category is
14:02
ecological data. A common example of that might be data about the locations or practices surrounding vulnerable animal and plant species. Geospatial data, which is now collected alongside human and ecological data quite routinely, can lead to the data being sensitive because it can pinpoint the identity of who or where somebody is. Sensitive data also includes data that's quantitative, so numbers, and qualitative, as well as of course geospatial data. So pretty much any form of data, if it includes identifying information and information which can potentially put a person or an object at risk of discrimination, fits into this category. How can sensitive data be
14:49
published? As I said, I won't go through the entire guide, but I thought I'd cover the two areas that you're probably most interested in and that tend to be most controversial: publishing from a legal point of view and from an ethical point of view. So legally, when we talk about
15:03
sensitive data, the legal act that is triggered around sensitive data is mostly the Privacy Act. The Privacy Act covers data that contains identifying information about people and personal information, and you can look for more detail in the guide. For example, personal information might be around cultural practices or criminal records, and this kind of data triggers the Act, so these data cannot legally be shared in their original form. If people are no longer identifiable, that is, the identifying information and relevant personal information that could lead to identification is removed, then technically the Act is no longer triggered. But of course this must meet the definitions of identifiability and confidentiality in that Act, and we go through that in quite a lot of detail in the guide, because it is a point that people are concerned about, quite rightly. We're very pleased to have had an expert in this field review that section; he'll be speaking at the next webinar. The other relevant section is the chapter on confidentialising data, which you might be interested to read too. Also, from a
16:20
legal point of view, something to think about is licensing data. Any kind of data, including sensitive or confidentialised data, should have a licence before it's published in Australia. The licence explains how the data can be used and attributed; without a licence it will be unclear to the reuser how the data can be reused, and that might discourage reuse as well. Some repositories do have their own licences, but anybody is also able to use the suite of endorsed licences, and the link is there if you'd like to go and have a look at those. I strongly encourage you to use it; it's very user-friendly. How can sensitive
17:00
data be published from an ethical point of view? Again, this is what is written in the guide, and I think it's a really important place to start from, so I'll just read out the introduction to that section: "In addition to meeting legal standards, researchers have ethical obligations towards participants and research subjects. These include preserving privacy and avoiding any possible harm arising from participation in research and its subsequent publication. The ethical management of data must be the primary concern of researchers to maintain the discipline's trust and research integrity." So of course one of our primary concerns in writing the guide was how to look at publishing sensitive data from an ethical point of view, and that of course includes how a researcher or a data manager operates with the ethical applications and committees within their institution. So the key
17:52
message for publishing sensitive data in an ethical manner is to plan ahead. Include plans to publish confidentialised sensitive data in your ethics applications, so before the data is even collected if you can, and also include this in any information to participants and in consent forms. By confidentialised I mean sensitive data from which identifying information, and information which would place an individual at risk of identification and potential harm, has been removed; again, please have a look at the more detailed information about confidentialising within the guide. We've got some great examples from places around the world about how to include information about the publication of human data when asking permission in consent forms from participants in a research study; that's really handy. The story is of course a bit more complex for existing data, where specific consent for publication wasn't asked of participants, so we're largely talking about human participants here, but it can still often be done in some situations. We've got some very clear steps as to when and how that can be done in the guide, so check out section 4.2 for that. So what
19:06
are your options in terms of publishing sensitive data? How do you get it out there in a legal and ethical way so that you can reap the benefits, or perhaps meet the funding obligations involved in your research? You can place conditions around access to the confidentialised sensitive data, and this would be the recommended action for the vast majority of cases of sensitive data publication. I keep saying "sensitive data publication", but as you've picked up earlier, what I'm talking about when I say that is sensitive data that has been confidentialised already. Obviously data that's got people's names, addresses or other forms of identifying information is not data that you can legally publish unless it's been modified, and ethically you shouldn't publish it unless it's been modified. So we're talking about data that's been treated in some way so that it no longer places the participants at risk. If you place conditions around access to the data, this is what we would call conditional access. This is where the metadata, so a description of your project and of your data set, is available to the public, but access to the data itself only occurs after predetermined conditions are met. This is already happening in a number of fields, and we'll go through a great example of that shortly with the Australian Longitudinal Study on Women's Health. Common conditions that can be placed around access to sensitive data are: providing information about who the reuser is and how they want to use, store or manage the data; agreeing to conditions of data security; registering or providing contact details; and in some cases also agreeing that they may be contacted by the original data owners for the purpose of collaboration or for other reasons. You can set the conditions around access; the majority of repositories will allow you to do that. This is, you might not be able to read
21:06
the detail, but this is just a screenshot from the Australian Data Archive, which deals largely in social science data, and this is actually the record for the Australian Longitudinal Study on Women's Health. It shows a metadata description of the data set, but to actually get access to the data itself, you can see the two highlighted sections there, which direct the potential reuser, or the reader of this metadata record, as to how they would do that. So you click on that and it will tell you under what conditions and how you can gain access to the data itself, or what conditions you would need to meet to do that. I'm going to hand over now to Associate
21:56
Professor Leigh Tooth, who we're very lucky to have with us today. Leigh is the deputy director of the Australian Longitudinal Study on Women's Health and she's also the chair of the Publications, Substudies and Analyses Committee, which is the committee that deals with applications to reuse this particular data set. Leigh's going to talk about how their sensitive data is confidentialised so it can be reused, how they have public metadata or descriptions of the data set but with conditions around access to the data, and what the benefits of publishing and sharing have been for this study.
Okay, well, thank you very
22:37
much, and I would like to thank Sarah for inviting me to give an overview of our study and exactly how we go about sharing some of the very sensitive and personal data that our women have provided us. Just to start with, this quote from our study director, Professor Gita Mishra, illustrates how fundamental data sharing is to the study, and that it is very much a public resource, funded by the government and available to all people who want to use it, provided they follow certain conditions. So that's just a really nice overview to start. For those of you who are not
23:26
aware, the longitudinal study is a collaborative project of the University of Newcastle and the University of Queensland, and it's been going since 1995, so it's one of the longest-running longitudinal studies in Australia. We were lucky to be able to recruit over 40,000 Australian women aged between 18 and 75 back in 1996, and we added another 17,000 young women to our study last year. So just to give you
24:05
an overview of exactly who our women are: we've got three age cohorts in the original sample, who were recruited in 1995 and 1996. We have women born between 1921 and 1926, who when they joined the study were aged 70 to 75. Then we have a cohort of women born between 1946 and 1951, who were then aged 45 to 50, and we had a group of women born between 1973 and 1978, who were then aged 18 to 23. You can see now that these women have aged: our oldest women are now entering their late 80s and early 90s, our mids are now in their mid 60s, and our very young women are now in their late 30s and early 40s. So after the 2010 National Women's Health Policy, and discussions we had with the government, the government agreed to fund us to recruit a new group of young women, because we argued we were no longer able to represent or provide data about young women in Australia today, because our young cohort was aging. So we were fortunate enough to be able to recruit another 17,000 last year to become our new young cohort of women. So that's who's in the study.
25:35
Basically, what we have done with the original cohorts is we have surveyed them approximately three-yearly since 1996, so we have a wealth of information about them. We are now up to our seventh survey of our young women, which is currently underway as I speak, and we've recently completed our seventh survey of our mid-aged women. With the new young cohort, however, because of the online technologies that are available today, we actually survey them annually now. We collect a whole range of data on all aspects of women's lives, including mental, physical, reproductive and social aspects of their health, asking questions about life transitions, about life events, issues such as employment or caring, looking at health services and so on. The other advantage that we have recently been able to implement is data linkage with national and state-based administrative data sets. These include information about health service use through the MBS, pharmaceutical use through the PBS, data on incidence of cancer, on hospitalizations, and also perinatal information as well. And just to give you a bit of a feel, we have over 600 people who have used our data, both nationally and internationally, so it's been a very, very big data source with a lot of people who have used it. So
27:16
what impact has the study had? Basically, our data have been reported in over 500 papers across a whole range of journals, and we have had a significant impact in informing national health policies in all sorts of areas, including chronic health conditions, physical activity, violence, nutrition, caring and so on. Possibly our most significant sort of output was towards the 2010 National Women's Health Policy, where our data were cited extensively throughout the policy. Another way that we contribute is
27:59
by adding value to other data sources, for example by the data linkage projects that I mentioned earlier. An example of that is a paper published by one of our chief investigators in 2011 in the Medical Journal of Australia, which was looking at women's use of mental health services and looked at the numbers of women across the country who were accessing mental health services by whether they actually reported having a mental health condition or not. This information and these data were able to help the government in terms of its mental health policy provision. Another example is the way that we can support substudies and also large data-pooling research. An example of that is the DYNOPTA project, some of you may have heard of this, which was called the Dynamic Analyses to Optimise Ageing project, and we were part of a pooled data set from nine other longitudinal studies in Australia that had included some aspect of aging. So turning back to the
29:05
other topic of today's seminar, and this is how do we actually manage the sensitive information that we collect? Because we collect very personal information, including questions about reproductive events, about sexual identity, about violence, and these are questions that are incredibly personal to people, and we can't just go out sharing that information. And our study is considered to be a public resource, funded by the government and open to anybody who really wants to use our data. So how do we actually go about sharing these data legally and ethically? In the next couple of slides I'll just talk about our processes. So basically when
29:48
the women join the study, and in every survey that they complete, they are informed and asked to consent that their survey data will be linked with their previous survey responses, so that we can follow women longitudinally. Women have the choice to say no, I do not want this to happen, in which case we can only use their data in a cross-sectional way, or women can withdraw from the study altogether. We also get women to sign a consent form and to read an information sheet, which openly says to them that your data will be used, but you will not be personally identifiable from your data: we will de-identify your data. So we've set that condition upfront with anybody who's part of the study. And then in
30:38
terms of how we manage the data practically, all the surveys that we receive are de-identified and confidential. So when a woman first joined the study, her personal identifying information was removed from the survey, and in its place she was awarded what's called an ID alias, which is just an alias number. For example, 206069 is the ID alias of one of the women in our mid-age cohort. The unique identifiers linking the ID alias to the personal identifying information of a particular woman are held securely at the University of Newcastle, and only the data manager at the University of Newcastle has access to these. So the rest of the staff, me included, at the University of Newcastle and the University of Queensland have no idea who the women are who are in our study. And so in any data set that we send out for people to use for analysis, the first column is the ID alias, so that they can then link that ID alias to future surveys and future data, but again they have no idea who that woman is. Another level of protection that we offer: because we have women from all over Australia, including rural and remote areas, we have the geocoded data of those women's addresses, and some researchers who are doing research on geocoding, looking at, for example, women's response to drought, may want to know the postcode that a woman lives in. We have strict criteria that we do not release data that is smaller than a certain geographical area, so that there is no way that anyone could work out that an ID alias that comes from that postcode could be a particular woman. So we have metadata available
32:36
about the ALSWH in several national repositories, including Research Data Australia, the Australian Data Archive and Trove, and this basically is just a description of the sort of data that we have. But if you want to access our data, you must come and ask us and get permission to do that. We have our own public website that's very complex and has an awful lot of information in it, but we have a specific section about how to access the data. So if you click on that link, then you have to come and put an application in. And so basically you
33:16
have to complete an expression of interest form. You have to provide information about yourself and about what you want to do with the data: your research question, the variables you want to ask for, your analysis plan, and all the details about exactly what you want to do with the data. If you want to use the linked data, you have to provide a justification for why you need the linked data, and you have to provide information about what you intend to do with the data: what publications you're planning, what conference presentations and so on. This application is reviewed by the Publications, Substudies and Analyses Committee, which consists of all of the steering committee members plus a couple of other experts. This process takes a couple of months, and each application is reviewed on merit to make sure that it's an appropriate use of the women's data and that it's feasible. Then, if that is approved, as a data re-user you need to sign a statement of data use and also confidentiality statements. Anybody who uses our data must sign these statements, and these basically cover aspects of what you are going to do with the data, that you're not going to send the data on to somebody else who hasn't signed these documents, and that you're going to treat the data properly. And also, once you have signed these documents and received the data and you can begin your analyses, we ask you then to provide a six-monthly progress report and update, so that we can keep a rough idea that you're doing what you said you were going to do and not breaching any agreements that you may have had with us. Right, thank you very much, Leigh. Following
34:57
on from Leigh, just in summary: we discussed earlier what's included in the guide, and then heard a great story about how sensitive data, which we often think about as something that's too difficult to share, is something where there already are, and for a long time have been, some great examples of data publication and sharing at a very large scale and in very influential and successful ways. So, since you've seen that publication and sharing can be done, what you might like to go away today and think about, what can I do to start or what can I do now, is to familiarize yourself with local policies for your institution around intellectual property and licensing of data, whether you have any recommended policies around the confidentialising of data, and about repositories to use. And if you're at the stage where you haven't already collected your data, take advantage of that and plan ahead. Think in terms of ethics applications and consent forms: include information about potential data sharing for your participants and for your ethics committees before the data is collected. That will save you a lot of grief and make this process of publication and sharing, and the potential benefits of that for you, much, much easier down the track. You can start by
36:15
publishing your metadata. If you've already got your sensitive data set there and you're wondering, well, what can I do now, how can I start taking advantage of this data set now, you can publish a description of your data without making the data itself openly accessible. As we've said a number of times, this is the recommended way to go about publishing confidential, sensitive data, because no one will know about your data if you don't publish the description of the data, at least to begin with. And it's very rare to be unable to put a good description of your data out there, with further information about when or how, if it's already available, access to the data can be gained. There's nothing to stop you putting the description of your data out there and then continuing to work on access. Maybe you need to treat your data in some way in terms of confidentialisation, and you can include information about that, obviously, in the description of your project. You can keep working on access conditions if you require some further expertise, advice or time to do that, but getting a description of your data out there is something that can be done almost immediately. We really greatly value
37:24
your feedback on this topic, particularly as, as I mentioned, it is an iterative process: we're going to keep adding more detail and more references to the sensitive data section of the ANDS website. And if you've got any content to share, for example if your institution has some guides or information around this topic, please send them either directly to me or through our main
37:44
contact page. We'd love to hear from you.
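
Editor's sketch. The de-identification steps described in the talk (replacing personal identifiers with an ID alias, holding the alias-to-identity key separately, and withholding fine-grained geography) can be illustrated roughly as follows. This is a minimal sketch under stated assumptions, not the study's actual procedure: the function names, field names, six-digit alias format and `min_count` rule are all hypothetical. The talk says only that data below a certain geographical area are not released; the sketch substitutes a simple small-count suppression rule for that.

```python
import secrets
from collections import Counter


def deidentify(records, identifying_fields=("name", "address")):
    """Return (released_rows, key_table).

    released_rows carry an ID alias in place of the identifying fields.
    key_table maps alias -> identifying fields and would be held
    separately and securely (in the talk, by the data manager only).
    """
    released_rows, key_table = [], {}
    for record in records:
        # Hypothetical alias format: a random six-digit number.
        alias = secrets.randbelow(900_000) + 100_000
        while alias in key_table:  # retry on the rare collision
            alias = secrets.randbelow(900_000) + 100_000
        key_table[alias] = {f: record[f] for f in identifying_fields}
        row = {"id_alias": alias}  # alias is the first column of each row
        row.update(
            {k: v for k, v in record.items() if k not in identifying_fields}
        )
        released_rows.append(row)
    return released_rows, key_table


def suppress_small_areas(rows, area_field="postcode", min_count=2):
    """Blank the area code wherever fewer than min_count rows share it.

    Assumption: a minimum-count threshold stands in for the study's
    unspecified "minimum geographical area" criterion.
    """
    counts = Counter(row[area_field] for row in rows)
    return [
        {
            **row,
            area_field: row[area_field]
            if counts[row[area_field]] >= min_count
            else None,
        }
        for row in rows
    ]
```

A re-user would receive only the output of `suppress_small_areas(released_rows)`. Because the same alias is kept across survey waves, records can still be linked longitudinally without anyone outside the key holder knowing who a participant is.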
