Ensuring Climate Data Remains Public

Automated media analysis

Speech transcript
On my left side is dcwalk. She is a member of [inaudible]. The talk is "Ensuring Climate Data Remains Public" [inaudible]. Please give her a warm round of applause. Thank you.
OK. So, as I was introduced, this talk is "Ensuring Climate Data Remains Public", and in it I'll speak to the question of how we keep important environmental and climate data accessible amidst political instability and risk. In particular, in this past year I think many of us have been paying attention to the United States, and I'll speak to recent data preservation efforts there.
The plan is to have an intro on why I'm talking here today and what makes now a pressing moment, then a kind of whirlwind tour of efforts to identify, preserve, and rethink access to climate data, and hopefully some sort of rousing call about futures for climate and environmental data. This isn't work I'm doing alone. I'll speak about many projects and organizations that thousands of people, variously coordinated, have worked on. If you leave with one impression, I hope it's that climate science, climate data collection and reuse, web archiving, and grassroots organizing around data are all collaborative efforts. My plan was to leave room for one or two burning questions, but I'm more than happy to talk afterwards by the stage. I also have stickers, so please find me; I want to give them to you.
First, I'm not an expert in actual climate science; that isn't my background. I'm a PhD student really interested in how designing with a framework of data justice can ensure more equitable outcomes, both in the forms of data collected and in access to technologies. To think this through, I've been looking to those actively using data to push for otherwise. This takes the form of DIY science, counter-mapping, and, increasingly, decentralized web projects. I actually got involved with thinking about climate data somewhat circuitously, through a local civic tech group (shout-out to Civic Tech Toronto) that served as a meeting space and anchor for many of EDGI's early efforts.
So what is EDGI? It's kind of a mouthful: the Environmental Data and Governance Initiative, a distributed, consensus-based organization of more than 150 scholars, organizers, and nonprofit groups. It was formed from an e-mail thread that started in November 2016, in the immediate wake of the US presidential election. For a little over a year now we've been documenting, contextualizing, and analyzing changes to environmental data and governance practices in the US. I've tried to include at least a portion of the people who have been involved in EDGI projects on the slide; many more exist.
So first, to unpack the data infrastructures of climate and environment a bit more.
Climate and environmental science rely on a collaborative, often state-supported research infrastructure. I think there have been many talks earlier at Congress that highlighted how data contributes to knowledge about climate change: climate modeling, satellites, and building your own DIY satellite ground station [inaudible]. I would point to those talks for examples. A common thread is the coordinated, global scale of this collection and processing, something that scholar Paul Edwards has described as a global knowledge infrastructure, "making global data".
In the United States, at the federal level, there are a handful of agencies, departments, and institutions involved with the creation and publishing of this data: NOAA, USGS, NASA, DOE, EPA, and more. In addition to these there are research institutions like Columbia University, where the Center for International Earth Science Information Network is based.
Given the coordinated collection and holding of this data, there is certainly no singular form of public access. The publishing of data and data products has become increasingly public through a combination of policies, portals, libraries and archives, and open government data initiatives. In the US, under Title 17, Section 105, most data, with some exemptions, is considered a work of the US government and is therefore in the public domain. Historical climate and environmental data is critically important for contextualizing and understanding currently observed phenomena. In addition to the data itself, there are reports, summaries, and analyses that really open up the topic to a broader audience, and I consider myself part of that broader audience, beyond those with domain expertise.
I want to pause for a moment here on the terms climate data and environmental data. People can use them interchangeably, and I've been doing so just now, but I think there are important differences in the way certain communities use them. In many cases, when people say climate data they are really referring to atmospheric, weather, and hydrologic conditions data, whereas when people say environmental data they are often explicitly referring to environmental health and hazards. This includes air and water quality, toxicants, and pollutants, as well as waste. Both are really vital to help characterize and navigate our relationship to our environment, though I think at different scales [inaudible].
Access to climate data and methods already faced challenges prior to the past year, in many cases from those disputing global warming. This has led to motivated targeting of climate scientists and their datasets, in some cases with financial support from other groups. One of the better-known examples is the hockey stick controversy, where a graph showing gradual cooling followed by recent rapid warming, roughly resembling a hockey stick, was highlighted in an Intergovernmental Panel on Climate Change report. In subsequent years the results have been replicated numerous times with different and additional data, but at the time the results were new and compelling, and as a result they were disputed. Michael Mann and his co-authors were personally targeted online, subjected to Freedom of Information Act requests, and drawn into court proceedings that lasted many years. There are more examples, but in the interest of time I have to skip them; maybe the other most visible one would be the 2009 Climategate e-mail leaks.
My sense is that before 2017 this form of targeting would have been identified as the most likely public risk to climate science: ways to introduce doubt around climate change in public opinion through concerted efforts to discredit results or scientists. I want to say one more time that Paul Edwards' discussion of environmental data systems as under siege is really instructive, and his book A Vast Machine, as well as more recent research, unpacks the history of climate data. I think his work and these previous examples raise important questions about access to climate data. What Mann's opponents and other skeptics said they wanted, in many cases, was the raw data, the full record, and in one case in particular a project actually audited the siting of surface temperature instruments. I think it's important to note what scientists stressed at the time: how necessary context is to interpreting data. I think we need to better understand that we're working with complex data, including climate and environmental data.
So, to this moment in particular. On November 8, 2016, Donald Trump was elected. For many people there was an immediate sense that we have to be ready, we have to do something. Scientists, environmentalists, and environmental justice organizers saw statements made during the campaign as indicating that climate and environmental data infrastructures could be erased and actively targeted. But this isn't the same risk as before. Instead, the risk is: how do you ensure continued access to data about climate and the environment when the supporting institutions may no longer be able or willing to provide it? However, many from an environmental justice background have long recognized existing environmental data infrastructures as imperfect, for example in cases where they rely on industry-reported data or are not representative of communities' embodied experience of pollution and toxics. This puts people into the position of being concerned for the preservation of imperfect data in order to avoid the alternative of no data.
As may have been mentioned, I'm from Toronto, in Canada. Canadians experienced our own mobilizing moment under a previous prime minister, Stephen Harper, and I think this highlighted a new form of threat to climate and environmental data infrastructure. Stephen Harper was able to really quickly and successfully implement an agenda of systematically undercutting environmental and climate research budgets, closing labs, including a research station, weakening government environmental regulations, and shutting down libraries and reducing historical periodical and record collections. The speed and immediate impact served as a rallying moment and highlighted facets of vulnerability that many had not been considering.
So, "do something". For EDGI members, that something quickly became preserving existing federal environmental data by helping facilitate grassroots archiving efforts, monitoring changes to federal websites, and documenting the political transition through interviews and timely academic analysis. Between December 2016 and June 2017, local organizers hosted 49 data rescue events in cities across the US and Canada, with support from EDGI and the Data Refuge project at the University of Pennsylvania. At events ranging in size from a couple dozen to over 200, people gathered to nominate key federal environmental datasets for archiving as part of the Internet Archive's pre-existing End of Term crawl. In addition, they strategically organized how to deal with datasets that could not be preserved through automated methods. At these events attendees nominated over 63,000 web pages as seeds for subsequent crawling.
It's hard not to go into a really extended conversation about crawler software here, and I'm probably not the best person to do so, but crawler software is not easily able to fully discover and archive links to datasets in web pages on all sites, partially because of underlying web development practices and Internet infrastructure, and partially because of resource and storage constraints. So, in addition, more than 20,000 datasets were identified as candidates for non-automated preservation, deemed not able to be successfully crawled, some 100 of which went through a workflow of developing custom solutions to scrape links and datasets and upload them to a Data Refuge repository using an open source toolkit.
Given the benefit of hindsight, I now want to avoid falling into a narrative that portrays us as underdogs who alone accomplished a project of massive scale. We were, in many cases, people without expertise in digital preservation and archiving or a long track record in it, and we hadn't fully appreciated the scale.
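
The custom link-scraping step mentioned in the talk can be pictured with a small sketch. This is not the toolkit used at the events, which the transcript does not name; it is a minimal illustration using only Python's standard library. The names `LinkCollector`, `find_dataset_links`, and `DATASET_EXTENSIONS`, the extension list, and the example.org URLs are all assumptions made for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Hypothetical list of file extensions treated as "datasets" in this sketch.
DATASET_EXTENSIONS = (".csv", ".zip", ".nc", ".xls", ".xlsx", ".json")


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag seen while parsing."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def find_dataset_links(html, base_url, extensions=DATASET_EXTENSIONS):
    """Return absolute URLs of links whose path ends in a dataset-like extension."""
    collector = LinkCollector()
    collector.feed(html)
    found = []
    for href in collector.hrefs:
        absolute = urljoin(base_url, href)          # resolve relative links
        path = urlparse(absolute).path.lower()      # ignore query strings
        if path.endswith(tuple(extensions)) and absolute not in found:
            found.append(absolute)
    return found


if __name__ == "__main__":
    # Made-up page content, for illustration only.
    sample = """
    <a href="/about.html">About</a>
    <a href="data/emissions-2016.csv">Emissions (CSV)</a>
    <a href="https://example.org/files/stations.zip?download=1">Stations</a>
    """
    for url in find_dataset_links(sample, "https://example.org/climate/"):
        print(url)
```

A parser like this only sees static links in the page source. Datasets reachable only through forms, scripts, or query interfaces stay invisible to it, which is one reason such datasets needed custom, non-automated handling.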
Along the way we discovered, rediscovered, or stressed long-standing issues of archiving and digital preservation that many groups were already navigating. So rather than forging ahead alone, we quickly found affinities with existing advocates, projects, and institutions, many of which have been operating in this space for a long time. In addition to us and Data Refuge, Climate Mirror, Project Azimuth, and Archive Team, which had existed for years prior, also became rallying projects for those who wanted to quickly organize around preserving data. I'm just going to mention three projects, as there are way too many, that untangle some things I think are interesting around access, coverage, and risk. So first, the Internet Archive.
The Internet Archive is an unparalleled resource for web archiving. In this particular case, with the End of Term crawl, they archived over 200 terabytes of the government web. Because of the additional focus, they caught sections of websites that might otherwise have been missed, based on the way they configured that crawl. So while it may not include an archived copy of all the datasets, for the reasons mentioned earlier, it provides an important snapshot of how the data was presented on websites at the end of the previous administration, and it further provides the ability to browse previous versions of a site in a way that reflects how the content was initially presented. That speaks to the question of how we think about access.
The next one: Code for Science & Society spearheaded Project Svalbard (named after the seed vault), a collection of over 38 gigabytes of metadata that tries to create a single catalog of research data files. While data.gov has put together a catalog, not all data that could be there is there. Without a comprehensive view, assessing where data is and how much of it is preserved is difficult, as you can imagine.
And finally, as existing data center practitioners, the Earth Science Information Partners made a case for a collaborative effort to understand risk, stressing that existing preservation and backup methods may not be visible, particularly for climate data. This surfaced different understandings of risk, from a public lens versus a data practitioner perspective, and I think that's really important. I put a bad slide up there, so I urge you to look up this quote. They frame long-standing factors of risk, though I would say there's a new dimension under certain administrations: obsolete technology or data formats, lack of metadata, lack of expertise, and lack of funding to maintain the data. I would add a lack of funding for additional or extended collection in the future.
So, a year later, what has happened? Generally, we haven't seen a mass removal of datasets, and the few that have been taken down were taken down for reasons that are not clearly linkable to a politically motivated goal of removing them from public access. But executive orders and Scott Pruitt's appointment to the EPA have led to a reversed ban on a neurotoxic pesticide, a proposal to rescind Obama's Clean Power Plan is in the works, and cuts to environmental programs, notably those that protect marginalized and vulnerable populations, are underway. Further, budget proposals are aimed at severely cutting funding to federal agencies involved in environmental data collection.
In terms of data, what we've actually seen is a shift in how it's presented on federal websites. The screenshot on the slide is from a recent EDGI website monitoring report documenting the removals and changes in access to resources on the EPA's "Climate and Energy Resources for State, Local, and Tribal Governments" site. Since January, EDGI's website monitoring team has released over 25 reports like this, documenting changes to how environmental and climate data is presented.
So what next? The biggest opportunity I see is in the public conversation and attention toward continued access to this data. The fact that librarians, web archivists, and research scientists showed up and stayed involved attests to this. EDGI's website monitoring work is one way to attempt to mobilize that continued public conversation, but there could be more. In the wake of the recent FCC decision on net neutrality, I could see another wave of public conversation around infrastructure, but operating at a lower level.
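
At its simplest, website monitoring of the kind described in this part of the talk means comparing two saved snapshots of the same page and listing what was removed and what was added. The sketch below is only an illustration of that idea using Python's standard `difflib` module. It is not EDGI's monitoring tooling, which the transcript does not describe; the function name `summarize_changes` and the sample page text are invented for the example.

```python
import difflib


def summarize_changes(old_text, new_text):
    """Return (removed, added): lines present only in the old or only in the new snapshot."""
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    removed, added = [], []
    # ndiff prefixes each line with "- " (only in old), "+ " (only in new),
    # "  " (unchanged), or "? " (hint line, ignored here).
    for line in difflib.ndiff(old_lines, new_lines):
        if line.startswith("- "):
            removed.append(line[2:])
        elif line.startswith("+ "):
            added.append(line[2:])
    return removed, added


if __name__ == "__main__":
    # Made-up snapshots of a page's visible text, for illustration only.
    before = "Home\nClimate change resources\nContact"
    after = "Home\nContact\nEnergy resources"
    removed, added = summarize_changes(before, after)
    print("removed:", removed)
    print("added:", added)
```

A line-level diff like this still needs a human to judge whether a change matters, for instance to tell a reworded heading from a removed resource.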
Since the summer, EDGI has been working with Protocol Labs, the creators of IPFS (the InterPlanetary File System), and qri, who are developing dataset research tools on the distributed web, on a project called Data Together. It aims to convene a conversation around building better data infrastructures. We want to explore how decentralized patterns can support community data stewardship, in part via content-addressed web archiving, and we're having those conversations out in the open where people can join. That could be a whole talk in itself, and I'm probably not the one to give it. I have more questions than answers, so I think it works better as a conversation, and one I'm hoping at least some of you want to participate in.
Maybe just to suggest as well: I think many in the hacker, software, open hardware, and open science communities have recognized the ways that technology is not neutral, that it can come with embedded biases and decisions about how it is to be used. If that recognition can be coupled with the recognition, from environmental justice advocates and academics, of the ways that data is not neutral, and with attention to the vital data about climate and environment that is critical to navigating a changing relationship to the environment, I really think we have a chance to build better data together.
In conclusion, I want to say that EDGI is always looking for people interested in volunteering. Our projects have room for people from a variety of backgrounds [inaudible]. On the second website you can sign up to our mailing list, the Data Together mailing list, and we can have some of these conversations online. Thank you.
Thank you for this important work [inaudible]. We now come to the Q&A. Please go to the microphones. Is there a question from the Internet? I get informed by the Signal Angels [inaudible].
There's somebody at microphone 1.
You often hear that the datasets are very fragmented, or not very easily accessible [inaudible]. Is that something you ran into while trying to rescue and crawl it, and what has your experience been there? And do you see any opportunities to simplify this in the future?
Yes, absolutely, I think we did run aground on that fragmentation. If we could offer one thing to other people, it's the experience of not actually being that familiar and stumbling through, making all the mistakes possible: where is this, how can we find it? I think there are already processes trying to address fragmentation. data.gov is becoming this open data portal, in the way that certain countries have a kind of one-stop portal to try and find datasets, as well as coordination between them. The IPCC has the Data Distribution Centre. I think those projects are good first attempts. But I think there's still a problem of access, in the sense that it's unclear to me how people who are not within a certain community of practice would know how to get to that data. One thing that came up in conversations I've had with others, including in the US Climate Alliance, is that there's a certain set of people who care a lot, whose decisions and work are really heavily impacted by climate data, but they're not going to look at the data. They're going to look at the reports, and getting access to those is extremely important. I think portals are a big help. I think library depository programs, the things that already exist, are really important, and I don't want to see them go away. But I still think there's something slightly missing about usability, and I'm not entirely sure how to address that. I think one-stop portals and work around opening datasets are a really good first step.
OK, we have 10 minutes for questions from the Internet and in the room. Internet first, please.
OK, so one person asks: is the bar for putting data into the World Data Center for Climate too high, in terms of providing [inaudible], which is a lot of work?
My understanding is that it operates selectively, in the sense that it's opt-in, and often at the data publisher level. I may be incorrect there, but working with that assumption, I think the barrier we found is that not everyone has opted in. If you're a person who cares about the data but not the person who made the data, you're kind of stuck if the publisher has not included it in these repositories. I think an interesting approach could be to figure out ways to incentivize more people to get in there, and what those hooks could be. Even if there were a way that I could request that they push it there, a way to motivate that behavior, I think that would be awesome.
Microphone 5, please.
A question regarding the creation of new data. One goal is to preserve the data from old scientific research, but what would happen if, for example, there is a lack of funding for the next research and long time series for climate research are lost? Are there people in this community who try to reach [inaudible] and tell them that we need funding for preserving the data and for creating new data, for continuing to measure all these climate things?
Yes, I agree that's really critical. In the States and in Canada, people mobilize around this issue, thinking about the knock-on effects of limiting budgets now, continued constraining of budgets, a lack of funding, and cutting jobs instead of growing jobs. The group I'm most familiar with that has really strong advocacy around this in the States is the Union of Concerned Scientists. So there are definitely groups who are flagging what the outcomes or impacts of the budget proposals would be, and groups who are advocating on it. I'm not an expert in government policy or in how budgets are implemented, but I think there are constraints on advocacy as a tool to effect change in what budget gets adopted, which means that advocacy alone, as a strategy, is maybe not going to prevent it from happening.
OK, microphone 1, please.
Is the distributed data digitally signed? I could imagine that there are some groups of people who might be interested in fiddling around with it.
Yes. Through the data rescue process we worked really closely with the Data Refuge project, and many of them are librarians, so there was a strong concern with the [inaudible] of data and also about integrity and verification. That raised a lot of really interesting questions, for me at least, about how you would do that verification in a very volunteer-driven, human-intensive process. There was a workflow management tool that was developed where you would have a log of who had touched each item, whether it was a dataset or a web page. We then used existing librarian and Library of Congress tools to generate a checksum, to ensure that what was uploaded was what people thought was uploaded, so that when you download it you can verify that. So it was trying to do a parallel social and technical implementation. In some of the Data Together work we actually have a reference implementation of generating WARCs, the web archiving format, and writing them directly into the InterPlanetary File System. With these content-addressed protocols there are additional ways to do verification and to ensure that what you retrieve is the data you think you're retrieving. Those questions were really interesting. I'm not a librarian by practice, so I'm probably not as sensitive to what it takes to be very trustworthy.
Thank you so much. Please give a big round of applause for dcwalk [inaudible].
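
The checksum-based verification discussed in the last answer can be sketched in a few lines. The talk does not name the specific tools that were used, so this is not a reproduction of them. It is a minimal illustration with Python's standard library of recording a SHA-256 digest per file at upload time and re-checking it after download. The function names `sha256_of`, `build_manifest`, and `verify`, and the sample file, are assumptions made for the example.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(directory):
    """Map each file's path (relative to directory) to its SHA-256 digest."""
    root = Path(directory)
    return {
        str(path.relative_to(root)): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def verify(directory, manifest):
    """Return the relative paths that are missing or whose digest has changed."""
    root = Path(directory)
    problems = []
    for relative, expected in manifest.items():
        target = root / relative
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(relative)
    return problems


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        # Made-up dataset file, for illustration only.
        dataset = Path(workdir) / "stations.csv"
        dataset.write_text("station,temp_c\nA,11.2\n")
        manifest = build_manifest(workdir)      # recorded at upload time
        print(verify(workdir, manifest))        # nothing has changed yet
        dataset.write_text("station,temp_c\nA,99.9\n")
        print(verify(workdir, manifest))        # the altered file is reported
```

Content-addressed systems such as IPFS build on the same idea: an object's address is derived from a cryptographic hash of its content, so the data you retrieve can be checked against the address you asked for. A checksum alone shows that a file is unchanged, not who produced it; that would need a digital signature, which is what the question was asking about.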

Metadata

Formal metadata

Title Ensuring Climate Data Remains Public
Series title 34th Chaos Communication Congress
Author dcwalk
License CC Attribution 4.0 International:
You may use and modify the work or its content for any legal purpose and reproduce, distribute, and make it publicly available in unchanged or modified form, provided that you credit the author/rights holder in the manner they have specified.
DOI 10.5446/34854
Publisher Chaos Computer Club e.V.
Release year 2017
Language English

Content metadata

Subject area Computer science
Abstract How do we keep important environmental and climate data accessible amidst political instability and risk? What even counts as an “accessible” dataset? Could we imagine better infrastructures for vital data? By describing the rapid data preservation efforts of U.S. environmental data that started in the wake of the recent election, I’ll address these questions and the new and existing issues that preservation surfaced about the vulnerability of data infrastructures. I'll focus on specific projects, including the work of EDGI, that are trying to address these challenges by creating alternate forms of access and infrastructure!
Keywords Science
