Course materials for INFO-664 Programming Cultural Heritage

Introduction to Python for Programming Cultural Heritage

By Filipa Calado

Pratt Institute School of Information

Welcome to Programming for Cultural Heritage!

This website offers a series of Python lessons organized into different sections, like "Introduction to Python Fundamentals" and "Python for Data Cleaning". These materials introduce participants to the Python programming language for working in cultural heritage contexts like libraries, archives, and museums.

Section overviews

1: “Introduction to Python Fundamentals”

  • Offers a basic introduction to core concepts in Python programming, grounded in a critical awareness of data and of what happens to data at various levels of transformation and abstraction.

2: “Python for Web Scraping and APIs”

  • Introduces the ethics, legality, and programmatic methods of extracting data from the web. Advances core concepts from the introductory session (like loops and conditional statements) and adds new concepts on object-oriented programming and working with Python libraries. Participants practice scraping metadata from current “anti-trans” bills in the USA.
  • libraries: requests, bs4, and pandas

3: “Python for Data Cleaning”

  • Experiments with approaches for wrangling text data into formats for analysis, with emphasis on removing unwanted elements that may skew analysis. While building on skills for writing loops and conditional statements and working with external libraries, participants will learn to write functions and scripts for running customized text cleaning processes.
  • libraries: pandas, spacy
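
As a rough illustration of the kind of cleaning function this section builds toward, here is a minimal sketch. It assumes only Python's standard library rather than pandas or spacy, and the function name `clean_text`, its `unwanted` parameter, and the sample string are hypothetical, not taken from the course materials.

```python
import re


def clean_text(text, unwanted=("\ufeff",)):
    """Return text with tag-like markup and extra whitespace removed."""
    # Replace anything that looks like an HTML tag (e.g. <p>, </div>) with a space.
    text = re.sub(r"<[^>]+>", " ", text)
    # Remove each unwanted substring supplied by the caller (hypothetical parameter).
    for item in unwanted:
        text = text.replace(item, "")
    # Collapse runs of spaces, tabs, and newlines into a single space.
    text = re.sub(r"\s+", " ", text)
    return text.strip()


# Made-up sample input, used only to show the function at work.
sample = "<p>Section 1.\n\n   Short   title.</p>"
cleaned = clean_text(sample)
print(cleaned)
```

Wrapping the steps in a function means the same cleaning process can be applied inside a loop over many documents, and the steps can be reordered or extended as the data requires.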

4: “Python for Data Analysis”

  • Explores methods for finding and analyzing textual patterns through popular tasks in Natural Language Processing. Participants practice writing code to annotate and extract text according to specific features from current “anti-trans” bills in the USA.
  • libraries: spacy

5: “Python for Machine Learning”

  • Using the anti-trans bills data prepared in previous workshops, participants practice fine-tuning a small text generation model and learn how to use machine learning for research.
  • libraries: transformers

6: “Python for Publishing”

  • In this session, participants learn to use Jekyll and GitHub Pages to publish a project as a website that others can access on the internet.

Sources

This curriculum is inspired by the Graduate Center Digital Initiatives Digital Humanities Research Institute Python workshop.

The opening challenge takes text from the Feminist Data Manifest-No by M. Cifor, P. Garcia, et al.

For more instruction with Python, please see these books:

All of the above workshops were first developed and piloted at the Princeton University Library in the 2023-2024 academic year. Thank you to Princeton students, faculty, and staff for their generous participation and suggestions.

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
