Python for Humanities: an Intro for Researchers

By: Louisa Gaylord

Last week the eScience Institute and the UW Libraries Open Scholarship Commons co-hosted “Python, your personal research assistant,” a workshop that introduced humanities researchers to the Python programming language as a tool for qualitative work. Led by eScience Technical Education Specialist Naomi Alterman, the program encouraged students to decipher lines of Python and learn how to use it to complete repetitive tasks. “I’m expecting folks to show up to the workshop with no experience with computer code,” Alterman said. “And I want them to leave with a suitable argument as to why it’s useful for them in the future.”

The program emphasized that the goal was not to learn how to program, but to practice reading pieces of code and beginning to comprehend them, so that participants can then reshape existing code and eventually write original Python scripts. “Feeling comfortable working with code gives us agency as researchers,” explained Alterman. “If we can glean the workings of a tool that we couldn’t necessarily build from scratch, we’ll gain the power to apply it to our own purposes and assess it critically when we see others using it in the wild.”

Participants first interacted with Python through a group generator exercise, in which they examined a block of code that sorts a list of names into random groups. They learned to decipher recognizable pieces of code and their functions, and to notice similarities between Python and sentences built from words. An interactive JupyterLab workspace enabled participants to experiment with the code themselves. “I’m not interested in what happens when you hit the play button and run your code. I’m interested in analyzing the gap between what you thought the code would do and then what actually happened,” Alterman said.
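To give a sense of what a group generator like the one described above might look like, here is a minimal sketch. It is not the workshop's actual code: the function name `make_groups`, the sample names, and the group size are all assumptions chosen for illustration.

```python
import random


def make_groups(names, group_size, seed=None):
    """Shuffle a list of names and split it into groups of at most group_size.

    Hypothetical helper for illustration; not taken from the workshop materials.
    """
    rng = random.Random(seed)   # a seed makes the shuffle repeatable
    shuffled = list(names)      # copy, so the original list is left untouched
    rng.shuffle(shuffled)
    # Slice the shuffled list into consecutive chunks of group_size names.
    return [shuffled[i:i + group_size] for i in range(0, len(shuffled), group_size)]


# Sample names are placeholders, not workshop participants.
names = ["Ada", "Grace", "Alan", "Katherine", "Linus", "Margaret", "Guido"]
groups = make_groups(names, group_size=3, seed=42)

for number, group in enumerate(groups, start=1):
    print(f"Group {number}: {', '.join(group)}")
```

Reading it top to bottom mirrors the exercise: the code copies the list, shuffles it, and then slices it into chunks. Changing `group_size` or removing the `seed` and rerunning is one quick way to compare what you expected with what actually happened.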

The second half of the workshop focused on applying Python to online content and analyzing it for varying degrees of COVID-19 disinformation, in a project contributed by eScience’s Text Mining Student Specialist Shachi Sonar. Sorting through 27,000 Tweets and manually assigning values to specific words would be a monumental task. With Python, researchers can automate repetitive tasks and dedicate more time to the meaningfully human parts of their work. A project that would have previously taken months to complete now takes only a fraction of that time.

Alterman guided participants through the sentiment analysis process, in which positive and negative words are assigned values between -1 and 1 to help quantify their tone. For example, a Tweet like “Mushrooms are just the worst and need to back off before I lose it” might receive a score of -0.704, since words like “worst” and “lose” are clearly towards the negative end of the spectrum. Conversely, a phrase like “Mushrooms give me life, they are lovely and I want to be surrounded” would receive a positive score at the opposite end of the spectrum. These tools can be applied to a number of research areas within the humanities, such as tracking language use over time, or visualizing similarities and trends across authors from a particular era.
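The idea of scoring text by the words it contains can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the tool used in the workshop: the word values in `LEXICON` are made up, the averaging rule is an assumption, and the results will not reproduce the -0.704 score quoted above.

```python
import re

# Hypothetical word scores between -1 and 1, invented for this example.
LEXICON = {"worst": -0.8, "lose": -0.5, "lovely": 0.8, "life": 0.4}


def score_text(text, lexicon=LEXICON):
    """Average the scores of the known words in text; 0.0 if none are known."""
    words = re.findall(r"[a-z']+", text.lower())   # crude lowercase word split
    scores = [lexicon[word] for word in words if word in lexicon]
    if not scores:
        return 0.0
    # Averaging keeps the result inside the same -1 to 1 range as the lexicon.
    return sum(scores) / len(scores)


negative = score_text("Mushrooms are just the worst and need to back off before I lose it")
positive = score_text("Mushrooms give me life, they are lovely and I want to be surrounded")

print(f"negative example: {negative:.2f}")
print(f"positive example: {positive:.2f}")
```

Real sentiment analysis libraries use much larger word lists and also account for things like negation and emphasis, which is why their scores differ from this toy version. The loop-free structure is the point: once a scoring function exists, applying it to thousands of Tweets is a matter of calling it on each one.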

If you are interested in learning more about Python and how it can help you with your research, the eScience Institute offers Software Carpentries courses that teach basic research computing skills. You can also reach out to our data scientists with questions through remote Office Hours.