Topical Reading: Digital Humanities

Course: BA3 Korean Studies, Leiden University
Instructor: Dr. Steven Denney
Time & Place: Fridays, 11:15–13:00, Huizinga 0.09
Duration: 6 seminars starting October 10 and ending November 21


Course Description

This is the Digital Humanities (DH) strand of the BA3 course Contemporary Korea and Digital Humanities. It introduces students to DH methods, with a focus on text-as-data approaches. Using Orange Data Mining and pre-prepared Korean corpora, students will learn how to clean, analyze, and interpret textual data.

The DH strand complements the topical reading seminars by equipping students with methodological skills that may support their undergraduate thesis research. The course has no programming requirements, although students will have the opportunity to explore ways to acquire such skills.

This page provides a short overview of the course. See the syllabus (syllabus/syllabus.md) for complete information on lessons and other course details.


Learning Objectives

By the end of the DH strand, students will be able to:

  • Understand the role of Digital Humanities in Korean Studies.
  • Apply text preprocessing techniques to prepare data.
  • Conduct descriptive text analysis (frequency, keywords, word clouds).
  • Use classification, clustering, and topic modeling for analysis.
  • Practice data management and transparency with GitHub.
  • Reflect on how computational methods may strengthen thesis projects.

Weekly Schedule

  • Week 1 (Oct. 10): Introduction to DH, GitHub & Data Management
  • Week 2 (Oct. 17): Text Preprocessing
  • Week 3 (Oct. 24): Descriptive Patterns
  • Week 4 (Nov. 7): Classification & Prediction
  • Week 5 (Nov. 14): Clustering & Similarity
  • Week 6 (Nov. 21): Topic Modeling & Wrap-Up
  • Final Project (Dec. 5): Text-as-Data Analysis

Note: Course content is subject to change.


Tools

  • Orange Data Mining (main application)
  • GitHub (data management & transparency)
  • Provided Korean corpora (in /data)

License & Use

This repository is for educational use in the Korean Studies program at Leiden University.


Repository Structure

ba3_text_as_data/
├── syllabus/          
│   ├── syllabus.md    # the course syllabus
│   └── repo_how-to.md # how to set up your GitHub repo
├── lectures/          # slides and lecture materials (uploaded after lessons)
├── assignments/       # weekly assignment instructions
├── data/              # provided corpora and supplementary information
└── README.md          # this file
