activeText
Description: With Saki Kuzushima (University of Michigan), Yuki Shiraito (University of Michigan) and Ted Enamorado (Washington University in St. Louis). We develop a semi-supervised active learning algorithm for text classification and show that it reduces the amount of labeled data needed to train a classifier. We also introduce activeText, an R package for active learning in text classification.
Papers:
- Improving Probabilistic Models in Text Classification via Active Learning
- Revise & Resubmit at American Political Science Review (APSR), 2023.
Keywords: active learning, text classification, semi-supervised learning, machine learning, R package
Language Models for Political Science
Description: With Musashi Jacobs-Harukawa (Princeton University), Alexander Hoyle (University of Maryland), Hauke Licht (University of Cologne). Given the rapid development of large language models (LLMs), we argue that researchers using LLMs must make three critical decisions: model selection, domain-adaptation strategy, and prompt design. To provide guidance on these choices, we establish a set of benchmarks covering a wide range of natural language processing (NLP) tasks common in political science research.
Papers:
- Do we still need BERT in the age of GPT? Comparing the benefits of domain-adaptation and in-context-learning approaches to using LLMs for Political Science Research
- Presented at the 2023 Annual Meeting of the Midwest Political Science Association (MPSA), Chicago, IL, April 2023.
Keywords: language models, BERT, GPT, NLP, political science
Legislative Records from Colonial India, 1919-1947
Description: With Thiha Zaw (University of Michigan). A new dataset of legislative records from colonial India, 1919-1947.
Papers:
- Institutional Change after Franchise Expansion: Evidence from British India
- Presented at the 2022 Annual Meeting of the Midwest Political Science Association (MPSA), Chicago, IL, April 2022.
Keywords: India, colonialism, legislative records, text analysis