Exploring Slave Narratives with ICPSR #AdoptaDataset

The team at ICPSR is doing some clever promotions of data for Love Data Week, including Adopt a Dataset! I adopted the Quantitative Data Coded from the Federal Writers’ Project Slave Narratives, United States, 1936-1938. I’ve read so much about this project and it seemed appropriate for February and Black History Month. You can read the actual interview transcripts on the Library of Congress website: Born in Slavery: Slave Narratives from the Federal Writers’ Project, 1936 to 1938. In the late 1970s, Paul Escott read and coded 2,358 of the slave narratives to create this dataset.

The narratives provide insight into both the interview process and the experiences of the formerly enslaved people. One of the most controversial questions was about attitudes toward the master, with some writers pointing to “favorable” attitudes toward masters as an indicator of slavery being a “less harsh” institution. But that ignores the fact that there were 771 who did not answer the question (or gave no indication of an answer in the narrative). In addition, around 1200 of the interviewers were white as opposed to 400 who were black. In the 1930s American South, it would have been difficult for a person of color to speak ill of a white person in front of another white person. The coder’s interpretation of favorability also needs to be taken into account.


ICPSR has made the dataset easy to use in R. The only trick is that the variables are mostly factors that need to be converted to numeric. ICPSR helpfully provides the R library and functions that can help with the conversion. Just remember to read the documentation closely before jumping in! Below are some of my explorations, including creating a subset of NC and another of NC women.
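If you want a feel for the factor-to-numeric step before opening the real file, here is a minimal base R sketch. It is not ICPSR's own code. It uses a toy data frame, and the column names (STATE, SEX), the codes, and the helper code_from_label are all hypothetical. It also assumes the factor labels carry the numeric code in parentheses, such as "(1) Male", so check the codebook for the real variable names and codes.

```r
# Toy stand-in for the ICPSR data frame. Column names and codes are
# hypothetical; the real ones are listed in the codebook.
narratives <- data.frame(
  STATE = factor(c("(01) North Carolina", "(02) Georgia",
                   "(01) North Carolina", "(03) Texas")),
  SEX   = factor(c("(1) Male", "(2) Female", "(2) Female", "(1) Male"))
)

# Hypothetical helper: pull the numeric code out of a label such as
# "(01) North Carolina". Converting through as.character() avoids the
# classic mistake of as.numeric(factor), which returns the internal
# level index rather than the code printed in the label.
code_from_label <- function(x) {
  as.numeric(sub("^\\(0*([0-9]+)\\).*$", "\\1", as.character(x)))
}

narratives$STATE_NUM <- code_from_label(narratives$STATE)
narratives$SEX_NUM   <- code_from_label(narratives$SEX)

# Subsets in the spirit of the post: one state, then women in that state.
nc       <- subset(narratives, STATE_NUM == 1)
nc_women <- subset(nc, SEX_NUM == 2)
```

Keeping the original factor columns alongside the numeric ones makes it easy to spot-check that each code still lines up with its label.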


You should adopt a dataset and explore some data! You don’t need to know statistical software because the codebooks can provide some basic overviews of the dataset. In addition, many of their datasets have online analyses available.

Tomorrow you can join their tweetchat starting at 12:30 pm. Go and give some love to your data!

December Help! Webinar on Social Work Resources

Time to sign up for our December Help! Gov Info webinar.  And our November recording on presidential research is posted below!

Help! I’m an Accidental Government Information Librarian presents … Government Information for Social Workers: From students to professionals

The Government Resources Section of the North Carolina Library Association welcomes you to a series of webinars designed to help us increase our familiarity with government information. All are welcome because government information wants to be free.

From regulations to statistics, government information resources are highly utilized and embedded within the field of social work. The type of resources and depth of information needs often vary based on career stages. This webinar will examine the various government information needs ranging from undergrads in BSW programs to licensed professionals. The presenter will cover strategies for subject liaisons to better understand local social work information needs. It will also include relevant website recommendations for inclusion on library online research guides.

Presenter: Michelle Donlin is the Scholarly Communications and Research Librarian and subject liaison for Social Work at East Stroudsburg University in East Stroudsburg, PA. She also serves as Kemp Library’s depository coordinator and holds an MLS with a concentration in E-Government Librarianship from the University of Maryland.

We will meet together online on Thursday December 6th from 12:00 – 1:00 p.m. (Eastern). Please RSVP for the session using this link: https://tinyurl.com/grssession82

We will use WebEx for the live session. Information on testing and accessing the session will be made available when you register.

The session will be recorded and available after the live session, linked from the NCLA GRS web page (http://www.nclaonline.org/government-resources).


And from November …

Promoting ICPSR @UNCG

Did you know UNC Greensboro staff, students, & faculty can access ICPSR social/behavioral research data?

The ICPSR Data Fair is always a great opportunity for learning more about their new data tools and services. They are creating lots of new tools for promotions, so I encourage you to check those out.

This year I participated by talking about how to promote ICPSR on campus, including social media outreach, graduate student promotions, and creating targeted messages.

How do you promote ICPSR on your campus? I would love to get some new ideas too!

Beyond the Numbers Day 2

Because I sometimes teach a semester-long course and have duties elsewhere, I haven’t been able to attend many smaller conferences lately. Even NCLA has been a struggle. Beyond the Numbers is the perfect small conference that brings together people really interested in and knowledgeable about a concentrated topic (or related topics). Not only was I able to connect with librarians I haven’t seen in a while, I was able to put several names to faces and meet new people. Day 1 didn’t disappoint. Day 2 was a half-day, so there was not as much going on, but there were some interesting sessions.

The keynote was by Wendy Stephens, a professor at Jacksonville State University. Her presentation, titled All About You, Up For Sale: How Data Brokers Like Cambridge Analytica Construct Consumer Identities, looked at data as a commodity and ways that organizations collect information about us. She made the case for controlling the data that we put out online or allow others to collect. She suggested a number of readings for more information, some of which are well known and others I’ve not heard of.

Next, Jennifer Boettcher, ALA Councilor and Business and Economic Liaison and Reference Librarian at Georgetown, talked about intellectual property governance for government data. Her slides were quite good and complete, so I will post a link when they are up on the BTN website. She talked about the difference between copyright and public domain, the open data movement, intellectual property, and more. She mentioned her article in Online Searcher, so definitely check that out for more info: Boettcher, J., & Dames, K. (2018). Government data as intellectual property: Is public domain the same as open access? Online Searcher, 42(4).

After Jennifer, Marie Concannon, Katrina Stierholz, and I presented on the PEGI project looking at the preservation of economic data and information. Marie is the Head of the Government Information and Data Archives at University of Missouri and Katrina is the Vice President and Director of Library and Research Information Services at the Federal Reserve Bank of St. Louis. I discussed the history of PEGI and current focus of the project. Marie talked about issues she had come across in her work with economic data (and by the way check out her awesome Prices and Wages by Decade LibGuide). She discussed several of the issues that we’ve encountered, including lack of data documentation, the move to cloud services that require a fee for extraction of government data, and the commercialization of government data. Finally, she mentioned the decreasing number of electronic documents available through the GPO, despite the move to electronic formats. For example, she searched the GPO’s Catalog of Government Publications for the “L” SuDoc, which includes the documents for the Department of Labor, and only 30 items came up. Keep in mind, these aren’t print documents; these are the electronic documents. Her presentation brought home the scale of the problem that we face regarding the loss of government information.

Katrina then talked about the process of revising economic data and the importance of capturing those revisions over time. She talked about how current versions of economic data are less accurate, but those are the ones on which policy is often made. Therefore, we need to collect the past data so that we can better understand how policy was decided and what the errors were. Moreover, ALFRED, the historical economic data database, only captures series that are in FRED, but there are a lot of data series that aren’t in FRED. Furthermore, the Federal Reserve Banks aren’t government agencies and aren’t subject to the same rules for retention. So, the question becomes how do we coordinate with these kinds of special nongov organizations that are producing information necessary to the functioning of our government? What become the highest priorities?

Lots to think about. PEGI will hold a national forum in December at the CNI meeting with the goal of bringing in stakeholders from the wider communities (librarians, community leaders, activists, archivists, journalists, and government employees). More to come on those discussions soon.


Finally, for our working lunch, representatives from Census, the Federal Reserve Board and Banks, the Bureau of Labor Statistics, the World Bank, the OECD, and FRED sat on a panel to discuss various issues. Ron Nakao, the Data and Economics Librarian at Stanford, asked an interesting question about the priorities for data in these organizations. He noted that there are three threads in data: data creation/collection, metadata creation/collection, and tool creation/collection, and that the metadata curation aspects often do not have enough infrastructural support. Several of the representatives agreed and noted the activities at their institutions for metadata creation. For example, the Census Bureau is requiring all surveys to use the same metadata and the BLS is working towards a glossary of terms. Hopefully those efforts will help to reduce the creation of metadata as an afterthought in the data collection/creation process.

Great conference. Really happy that I went, although it was a whirlwind! BTN is on the list for next year!

P.S. Thanks for the mug! It’s like they know me … and my coffee addiction.

Beyond the Numbers 2018 Day 1 #data #BTN2018

I had the opportunity to attend the Beyond the Numbers conference at the Federal Reserve Bank of St. Louis this week. This biennial event brings together librarians, archivists, and economists from all over the country to talk about the challenges in economic information access and use. They usually post the presentation materials afterward, so check back for slides. I had never been to the conference before but have heard a lot about it from IASSIST members since it started in 2014. I arrived late because of teaching and plane malfunctions, but I was able to attend a few sessions on Thursday.

Data Play

The first was with Christine Murray from Bates College talking about using R for economics data. She did a great job showing both the basics of using R and then how to use the pdfetch package to work with time series from economic data vendors like FRED, BLS, and others. I’ve imported data using APIs, but this package makes it much easier to work with these particular vendors. You can also visualize and layer time series within R. She created a great LibGuide showing how to use R for economics. Definitely going on my data play list for winter break.

The second was Kate McNamara’s Evidence-Based Research with the Census Bureau Data Linkage Infrastructure. Kate talked about the new efforts in the Census Bureau’s Data Linkage Infrastructure program. This is related to the Federal Statistical Research Data Centers (FSRDCs) located around the country (our closest is at Duke) that have administrative data from a wide variety of government agencies that are linked together. Researchers must apply to access the data, and it has been a lengthy (and slightly cumbersome) process. One of their efforts is to promote evidence-building projects that are collaborations between Bureau researchers and academics. The difficulty for academics in the past has been that, while there is a data inventory, the CB hasn’t provided detailed metadata about the available datasets or information on what unique identifiers are available for linking datasets. Without that information it can be extremely difficult for researchers to know before they apply if the data will be useful. The CB is preparing, though, to post that metadata on ICPSR and create a new inventory available to the public. That is REALLY exciting news for data users.

Finally, Kristin Fontichiaro and Wendy Stephens presented on From “Skip the Numbers” to “Great Stuff”: A Data Education Project. These LIS professors created a project geared to high school teachers and media center specialists to help them integrate statistical literacy into their curricula. Their project, Creating Data Literate Students, made the rounds a while back and they have recordings from past virtual conferences if you are interested. For lower-level or data-averse students, the principles and teaching suggestions are very helpful. They also have two free books on teaching statistical and data literacy. I’m excited to read Lynette Hoelter’s chapter! She does some great work at ICPSR.

So, day 1 is a wrap. Today we learn more about data and I am presenting on the PEGI Project. Exciting stuff and more to come!

Presidential Research Resources webinar!

Our next NCLA Gov Resources Section Help! I’m an Accidental Government Information Librarian webinar is November 7, 2018 from 12-1 pm (eastern). It is on Presidential Research Resources.

This talk will discuss digital and archival resources for Presidential Research with librarians and archivists from the Miller Center, the Fred W. Smith National Library for the Study of George Washington, and Seton Hall University. The presenters will include Rebecca Baird, Archivist, Mount Vernon Ladies’ Association (MVLA); Sheila Blackford, Librarian, Scripps Library, Miller Center, University of Virginia; Lisa DeLuca, Social Sciences Librarian, Seton Hall University; and Katherine Hoarn, Special Collections Librarian, The Fred W. Smith National Library for the Study of George Washington at Mount Vernon.

Sign up here: http://bit.ly/NCLAGRS-81-1

For more of our webinars, see our YouTube channel: http://tinyurl.com/nclagrsonyoutube