Protecting Data in Times of Political Turmoil – A Panel and Workshop, January 20, 2017
Overview and Outcomes
More than 60 participants turned up on a rainy Los Angeles day for our morning panel and workshops. Our panel discussion centered on the importance of the SAVE EVERYTHING default mode (the operational mantra of the Internet Archive), the vulnerabilities of small scientific data, decentralized archiving as an activist tactic, and the often physical pains climate scientists go through to produce a single datum. We then broke into groups to nominate federal websites to the Internet Archive, archive important federal datasets, create an advocacy toolkit for democratic action, and discuss future research projects around climate change science and archival practice.
Our Archive-a-thon nominated 568 new URLs, 528 of them unique, through the Internet Archive's nomination tool spreadsheet (a process known as 'seeding'). We focused on the Department of Energy, using primers created by EDGI to coordinate the DataRescue events; we got through the first primer before beginning the second on the list. Several people also experimented with scraping, and we were able to upload an uncrawlable dataset to the CKAN repository.
JANUARY 20 EVENT SCHEDULE
9:00 - 10:30 Panel Discussion | Data Politics, Management and Rescue
Christine Borgman (Information Studies & Center for Knowledge Infrastructures, UCLA)
Big Data, Little Data, or No Data? Sustaining Access to Research Data
Steve Diggs (Hydrographic Data Group - Scripps Institution of Oceanography)
As the oceans go, so goes the earth
Katie Mika (Institute of the Environment and Sustainability, UCLA)
On the space between science and policy
Jason Scott (Internet Archive)
Everything is Fine
Joan Donovan (Center for the Study of Genetics)
Today Won't Be Like Yesterday: Mining, Archiving, and Decentralizing Data to Preserve Scientific Futures
Followed by Q&A Session
10:30 - 11:00 Orientation for breakout sessions; start working
SESSION DESCRIPTIONS
Citizen Data Advocacy, or An Intervention Toolkit for Those Who Care About Facts and Data
This workshop will explore and identify allies in data collection and archiving, review tried and tested methods for advocating the issue with your elected officials, present templates (email, telephone, and letter) for contacting representatives, and, time permitting, write emails and letters to identified local (municipal and county), state, and regional representatives, as well as other allies.
Protecting Climate Data Over the Long Haul - A Research Agenda
This workshop will draft a proposal for carrying out climate data protection and archiving research over the challenging years to come. The outcomes from this workshop will be used to generate an ongoing agenda that we hope to incorporate as part of the Environmental Data Governance Initiative (EDGI), an international network of academics and non-profits that believes in evidence-based policy making and public interest science. EDGI projects work to proactively archive public environmental data, as well as track and respond to the undermining of evidence-based environmental governance in the United States.
Best Practices for Archiving Scientific Data
Climate change data are of a special kind and need dedicated care. These datasets contain unique longitudinal observations that cannot be repeated. Participants in this workshop will discuss and develop guidelines and best practices for preserving climate change data and ensuring they can be reused over time. We will focus on identifying the data mining, storing, updating, and sharing practices needed to keep climate datasets valuable sources of information for the research community. Our primary focus will be on the datasets archived by the data rescue team.
DataRescue and Archive-a-thon
This hands-on workshop will employ web harvesting as well as other methods to target data at special risk. We invite participants with all skill levels for activities including seeding and scraping the End of Term Harvest Project; downloading, describing, and uploading datasets; creating chain-of-custody between datasets and related website assets; and designing new tools to make web scraping and archiving for this project easier.
ACKNOWLEDGMENTS
Jonathan Furner, Chair, Information Studies, UCLA
Christine L. Borgman, Distinguished Professor and Presidential Chair in Information Studies, UCLA
Milena Golshan, Center for Knowledge Infrastructures, UCLA
Lisa Snyder, Institute for Digital Research and Education, UCLA Libraries
Jason Scott, Internet Archive
Katie Mika, UCLA Institute of the Environment and Sustainability
Joan Donovan, Center for the Study of Society and Genetics
Laurie Allen, Penn Libraries #DataRefuge
Bethany Wiggin, Penn Program for the Environmental Humanities
Michelle Murphy, University of Toronto
Steve Diggs, Scripps Institution of Oceanography
Environmental Data Governance Initiative (EDGI)
End of Term Harvest, Internet Archive
Mike Hucka, Caltech
Renee Rother, UC Santa Barbara
UCLA Libraries
Peter Broadwell
Andy Rutkowski
ORGANIZING TEAM
Morgan Currie
Britt S. Paris
Irene Pasquetto
Jennifer Pierre