INST 798/808: A.I.-Powered Research Assistants
Location + Time: TWS 0207, Thursdays, 2:00 to 4:45 p.m.
Course Description
This course explores how Large Language Models (LLMs) can transform labor-intensive research tasks in the social sciences. Using the challenge of tracing ideas and concepts through text as our primary lens, we examine how these powerful tools can aid both qualitative and quantitative research methodologies. Measuring the evolution of ideas through text presents uniquely complex challenges: concepts may be expressed through varied language, their meaning often shifts over time, and understanding them requires deep contextual knowledge that has traditionally relied heavily on human expertise.
The course begins by examining traditional approaches to concept measurement, from word embeddings to early neural architectures, before exploring how transformer-based models have revolutionized our ability to detect and track complex ideas in text. We then delve into recent advances in mechanistic interpretability to understand how these models internally represent and manipulate concepts. This foundation allows us to evaluate various approaches to concept tracing, from using LLMs to scale up qualitative research methods to exploring how modern neural topic modeling can capture evolving ideas across large corpora.
Throughout the course, we maintain a strong focus on validation and methodology, culminating in an examination of how to properly conduct downstream analyses using LLM-processed data. Through paper presentations, class discussions, hands-on labs, and a research paper, students will develop both theoretical understanding and practical experience applying these tools to real research problems.
By the course’s end, students will be equipped to evaluate when and how to effectively integrate LLMs into their research workflows, understand the methodological implications of using these tools, and implement appropriate validation strategies for LLM-assisted research. Most importantly, they will develop a critical perspective on both the transformative potential and the limitations of using LLMs as research assistants in the social and information sciences.
Course Objectives
After completing this course, students will be able to:
- Understand the fundamental concepts of natural language processing and their evolution
- Evaluate the capabilities and limitations of AI research tools
- Implement proper validation and error analysis techniques
- Design research workflows that appropriately incorporate AI assistance
Prerequisites
While you aren't expected to have deep expertise in all of these areas, you should be comfortable with the following (a short self-check sketch follows this list):
- Python programming fundamentals (working with common data structures, functions, pandas)
- Basic linear algebra. Understanding how language models represent and manipulate text requires familiarity with vector and matrix operations (addition, multiplication, transpose, distance, similarity).
- Basic machine learning concepts (supervised vs unsupervised learning, common evaluation metrics)
- Fundamental probability concepts (conditional probability, independence)
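To gauge whether you are at the expected level, here is a minimal self-check sketch in Python; the vectors and counts are made up purely for illustration. If every line reads naturally to you, you meet the programming, linear algebra, and probability prerequisites.

```python
# Prerequisite self-check (hypothetical values, not course data).
import numpy as np

# Linear algebra: two toy "word vectors" and their cosine similarity.
u = np.array([0.2, 0.7, 0.1])
v = np.array([0.3, 0.6, 0.4])
cosine = u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

# Probability: conditional probability from joint counts, P(A=1 | B=1).
counts = np.array([[30, 10],   # rows: A = 0, 1
                   [20, 40]])  # cols: B = 0, 1
p_a1_given_b1 = counts[1, 1] / counts[:, 1].sum()

print(f"cosine similarity: {cosine:.3f}")
print(f"P(A=1 | B=1): {p_a1_given_b1:.3f}")
```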
Technical Requirements
This course explores cutting-edge AI technologies, but we’ll be working within practical computational constraints. While large language models like GPT-4 or Claude 3.5 require significant computing resources, we’ll focus on working with smaller, more manageable models that can run on personal computers. Students will need a laptop capable of running Python and handling lightweight language models (8GB RAM minimum, 16GB+ recommended). We’ll use TerpAI for tasks requiring more computational power, but part of the learning experience will involve understanding how to conduct meaningful research within resource limitations. The course will emphasize understanding core concepts and developing practical workflows that can scale from limited to abundant computational resources.
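As a rough illustration of the scale we will be working at, the following is a minimal sketch that loads a small open model locally with the Hugging Face transformers library. It assumes distilgpt2 purely as a stand-in for a lightweight model that runs on a laptop CPU, not necessarily a model we will use in class.

```python
# Minimal sketch: running a small language model locally with Hugging Face
# transformers. distilgpt2 is an assumed stand-in, not an assigned course model.
from transformers import pipeline

generator = pipeline("text-generation", model="distilgpt2")
out = generator(
    "Tracing ideas through text is hard because",
    max_new_tokens=30,
    do_sample=False,  # greedy decoding, so output is reproducible across machines
)
print(out[0]["generated_text"])
```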
Assessment
Weekly questions/reflections (20%)
Students will submit weekly questions or reflections (maximum 500 words) by Tuesday at 5 p.m. before each class. These submissions serve two purposes: they demonstrate your engagement with the readings, and they help shape our class discussion, so be prepared to elaborate on your question or reflection during class. Your submission should do one of the following:
- Pose a substantive question sparked by the readings. This could be about methodology, implications, connections to other work, or potential applications. Questions should go beyond basic clarification to engage with the material's concepts or implications.
- Offer a reflection that connects multiple readings, relates the material to your own research, or critically examines the methodology or assumptions. Your reflection might consider how different papers approach similar problems, identify potential limitations, or propose new applications.
- Express and explore areas of confusion in the readings. Some of our papers are technically challenging, and identifying what you don't understand is an important part of the learning process. When discussing confusing aspects, please:
- Describe your current understanding of the concept
- Identify specifically what aspect is unclear
Presentations (20%)
Throughout the semester, you will present assigned papers to the class. Each 20-minute presentation should include 12 to 15 minutes of content and 5 to 8 minutes of led discussion. In your presentation, you should:
- Clearly state the paper's main contribution and why it matters
- Walk through one or two illustrative examples from the paper
- Discuss limitations and potential extensions
- Prepare 2-3 discussion questions for the class
- Be ready to facilitate a brief discussion of these questions
Research Project Proposal (20%)
The research proposal (3-5 pages) outlines your planned investigation into either evaluating LLMs for specific research tasks or using LLMs to study a substantive research question. Before submitting your proposal, you must schedule a meeting with me to discuss your ideas. You may work alone or with one other student in the course; working in pairs is highly encouraged. Prepare a one-paragraph summary of your idea and 2-3 specific questions for our discussion. This meeting should take place at least one week before the proposal deadline. The proposal is due in week 8 of the class (3/20). Your proposal should be structured as follows:
- Introduction (~1 page)
Present your core research question or evaluation task. Whether you're assessing LLM capabilities or studying a substantive topic, explain why this question matters and how LLMs offer unique insights for addressing it. For instance, you might explore how LLMs could help trace the evolution of methodological discussions in your field, or evaluate how well they can identify theoretical frameworks in academic papers.
- Proposed Methodology (1-2 pages)
Detail your research design, including:
- Data sources
- Which models or tools you'll use
- Your analytical approach
- Why your chosen methods are appropriate for your question
- Validation Strategy (~1 page)
Describe how you'll verify your results and ensure methodological rigor. This might include:
- Comparison with human coding (see the agreement-checking sketch after this list)
- Use of multiple models
- Development of benchmarks
- Strategies for addressing potential biases
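For the human-comparison option in particular, validation often reduces to measuring agreement between LLM-generated labels and human-coded labels on a shared sample. Below is a minimal sketch, assuming scikit-learn and entirely hypothetical labels.

```python
# Minimal sketch of one validation step: agreement between LLM-generated labels
# and human-coded labels on the same documents. Labels here are hypothetical.
from sklearn.metrics import cohen_kappa_score, classification_report

human_labels = ["policy", "theory", "policy", "methods", "theory", "policy"]
llm_labels   = ["policy", "theory", "methods", "methods", "theory", "policy"]

# Chance-corrected agreement between the two sets of codes.
kappa = cohen_kappa_score(human_labels, llm_labels)
print(f"Cohen's kappa: {kappa:.2f}")

# Per-category precision/recall, treating the human codes as the reference standard.
print(classification_report(human_labels, llm_labels, zero_division=0))
```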
- Timeline and Feasibility (~0.5 page)
Provide a realistic schedule showing how you'll complete the work within the semester. Include key milestones such as:
- Data collection
- Initial analysis
- Validation steps
- Writing and revision
Final Paper (40%)
The final paper will be due on the first day of the finals period.
Weekly Schedule
Week 1: Introduction to Natural Language Processing
Required Reading:
- Textbook: Jurafsky & Martin (2025). Speech and Language Processing
- Chapter 3: 3.1 to 3.3, 3.7
- Chapter 4: 4.1, 4.6 to 4.8
Week 2: Word Embeddings (No Class)
Required Reading:
- Rodriguez & Spirling (2022). Word Embeddings: What Works, What Doesn’t, and How to Tell the Difference for Applied Research
- Kutuzov et al. (2018). Diachronic word embeddings and semantic shifts: a survey
- Hofmann et al. (2021). Dynamic Contextualized Word Embeddings
Optional Reading:
- Hamilton et al. (2018). Diachronic Word Embeddings Reveal Statistical Laws
- Yao et al. (2018). Dynamic Word Embeddings for Evolving Semantic Discovery
- Charlesworth et al. (2022). Historical representations of social groups
Week 3: Neural Networks
Required Reading:
- Textbook: Jurafsky & Martin (2025). Speech and Language Processing
- Chapter 5: 5.1 to 5.5
- Chapter 6: 6.1 to 6.4, 6.8 to 6.12
- Chapter 7
Week 4: RNNs and Attention
Required Reading:
- Textbook: Jurafsky & Martin (2025). Speech and Language Processing
- Chapter 8: 8 to 8.2.1, 8.3, 8.7 (not 8.7.1), 8.8
- Chapter 9
- Alammar (2018). The Illustrated Transformer
- Rush (2018). The Annotated Transformer
Highly Recommended Videos:
- ritvikmath: Recurrent Neural Networks
- AI Coffee Break: Deepdive into transformers
- 3Blue1Brown: Neural Networks - Chapters 5 and 6
Recommended Videos:
- Transformer explanation
- Serrano.Academy: The Attention Mechanism in Large Language Models
- Neel Nanda: What is a Transformer
Optional Reading:
- Devlin et al. (2019). BERT: Pre-training of Deep Bidirectional Transformers
- Vaswani et al. (2017). Attention Is All You Need
Week 5: Transformers and Mechanistic Interpretability
Required Reading:
- Tennenholtz et al. (2024). Demystifying Embedding Spaces using Large Language Models
- Huben et al. (2023). Sparse Autoencoders Find Highly Interpretable Features in Language Models
- Lieberum et al. (2024). Gemma Scope: Open Sparse Autoencoders
- Luo et al. (2024). PaCE: Parsimonious Concept Engineering for Large Language Models
- Valois et al. (2024). Frame Representation Hypothesis: Multi-Token LLM Interpretability and Concept-Guided Text Generation
Optional Reading:
- Park et al. (2024). The Linear Representation Hypothesis and the Geometry of Large Language Models
- Park et al. (2024). The Geometry of Categorical and Hierarchical Concepts in Large Language Models
- Engels et al. (2024). Not All Language Model Features Are Linear
- Jiang et al. (2024). On the Origins of Linear Representations in Large Language Models
- Ahuja et al. (2024). Learning Syntax Without Planting Trees: Understanding When and Why Transformers Generalize Hierarchically
- Csordas et al. (2024). Recurrent Neural Networks Learn to Store and Generate Sequences using Non-Linear Representations
- Rajendran et al. (2024). Learning Interpretable Concepts: Unifying Causal Representation Learning and Foundation Models
- Karvonen et al. (2024). Evaluating Sparse Autoencoders
Week 6: Modern Topic Modeling
Required Reading:
- Hoyle et al. (2022). Are Neural Topic Models Broken?
- Wu et al. (2024). FASTopic: Pretrained Transformer is a Fast, Adaptive, Stable, and Transferable Topic Model
- Pham et al. (2024). TopicGPT: A Prompt-based Topic Modeling Framework
Optional Reading:
- Hastings & Pesando (2024). What’s a parent to do?
- Egger & Yu (2022). A Topic Modeling Comparison Between LDA, NMF, Top2Vec, and BERTopic to Demystify Twitter Posts
Week 7: Qualitative Coding with LLMs I
Required Reading:
- Zhang et al. (2024). When Qualitative Research Meets Large Language Model: Exploring the Potential of QualiGPT as a Tool for Qualitative Coding
- Eschrich & Sterman (2024). A Framework For Discussing LLMs as Tools for Qualitative Analysis
- Pangakis et al. (2023). Automated Annotation with Generative AI Requires Validation
Week 8: Qualitative Coding with LLMs II
Required Reading:
- Rasheed et al. (2024). Can Large Language Models Serve as Data Analysts? A Multi-Agent Assisted Approach for Qualitative Data Analysis
- Dunivin (2024). Scalable Qualitative Coding with LLMs: Chain-of-Thought Reasoning Matches Human Performance in Some Hermeneutic Tasks
- Arlinghaus et al. (2024). Inductive Coding with ChatGPT - An Evaluation of Different GPT Models Clustering Qualitative Data into Categories
Week 9: Error Analysis and Validation
Required Reading:
- Pangakis et al. (2023). Automated Annotation with Generative AI Requires Validation
- Ludwig et al. (2024). Large Language Models: An Applied Econometric Framework
- Carlson et al. (2024). The Use of LLMs to Annotate Data in Management Research: Warnings, Guidelines, and an Application to Organizational Communication
Optional Reading:
- Barrie et al. (2024). Replication for Large Language Models
- Thalken et al. (2023). Modeling Legal Reasoning: LM Annotation at the Edge of Human Agreement
- Egami et al. (2024). Using Imperfect Surrogates for Downstream Inference
Week 10: Concept Tracing I
Required Reading:
- Nascimento et al. (2022). Concept Detection in Philosophical Corpora
- Li (2024). Tracing the Genealogies of Ideas with Large Language Model Embeddings
- Ganguli et al. (2024). Mapping Inventions in the Space of Ideas, 1836–2022: Representation, Measurement, and Validation
Optional Reading:
- Vicinanza et al. (2022). Deep-learning model of prescient ideas demonstrates that they emerge from the periphery
- Lucy et al. (2023). Words as Gatekeepers
- Rosin et al. (2022). Time Masking for Temporal Language Models
Week 11: Concept Tracing II
Required Reading:
- Garg & Fetzer (2025). Causal Claims in Economics
- Cong et al. (2024). Textual Factors: A Scalable, Interpretable, and Data-driven Approach to Analyzing Unstructured Information
- Lam et al. (2024). Concept Induction: Analyzing Unstructured Text with High-Level Concepts Using LLooM