Text processing for linguists and literary scholars with R

This course is a hands-on introduction to using the programming language R for the analysis of textual data (such as corpora, literary works, and web data). It is based on the second edition (2016) of my textbook Quantitative corpus linguistics with R and introduces a variety of programming constructs required for text processing: functions and relevant data structures (e.g., vectors), control flow structures such as loops and conditionals, and a sizable number of regular expressions. In addition, and time permitting, we will also cover the basics of data visualization. The data dealt with in this course come from a variety of differently formatted/annotated corpora and will also include one or two examples of literary works and/or XML processing.