Germain Gauthier



(Tuesday, 16th May 2023)

Title: Text as Data for the Social Sciences

Advances in computational methods and the widespread availability of digital text data have made it increasingly feasible to extract meaningful information from large text corpora. This workshop will provide an overview of computer science techniques for analyzing text, including simple word counts, topic models, semantic information extraction, text embeddings, and general-purpose large language models (e.g., ChatGPT). To make things concrete, we will highlight important applications of text-as-data methods in economics and political science. Recent papers have successfully measured news sentiment, racial and misogynistic bias, economic narratives, political frames, and speech polarisation, among many other outcomes. Finally, we will discuss some challenges and limitations of working with text as data and sketch various avenues for future research.