INST 798/808: A.I.-Powered Research Assistants
Location + Time: TWS 0207, Thursdays, 2:00 to 4:45 p.m.
Course Description
This course explores how Large Language Models (LLMs) can transform labor-intensive research tasks in the social sciences. Using the challenge of tracing ideas and concepts through text as our primary lens, we examine how these powerful tools can aid both qualitative and quantitative research methodologies. Measuring the evolution of ideas through text presents uniquely complex challenges: concepts may be expressed through varied language, their meaning often shifts over time, and understanding them requires deep contextual knowledge that has traditionally relied heavily on human expertise.
The course begins by examining traditional approaches to concept measurement, from word embeddings to early neural architectures, before exploring how transformer-based models have revolutionized our ability to detect and track complex ideas in text. We then delve into recent advances in mechanistic interpretability to understand how these models internally represent and manipulate concepts. This foundation allows us to evaluate various approaches to concept tracing, from using LLMs to scale up qualitative research methods to exploring how modern neural topic modeling can capture evolving ideas across large corpora.
Throughout the course, we maintain a strong focus on validation and methodology, culminating in an examination of how to properly conduct downstream analyses using LLM-processed data. Through paper presentations, class discussions, hands-on labs, and a research paper, students will develop both theoretical understanding and practical experience applying these tools to real research problems.
By the course’s end, students will be equipped to evaluate when and how to effectively integrate LLMs into their research workflows, understand the methodological implications of using these tools, and implement appropriate validation strategies for LLM-assisted research. Most importantly, they will develop a critical perspective on both the transformative potential and the limitations of using LLMs as research assistants in the social and information sciences.
Course Objectives
After completing this course, students will be able to:
- Understand the fundamental concepts of natural language processing and their evolution
- Evaluate the capabilities and limitations of AI research tools
- Implement proper validation and error analysis techniques
- Design research workflows that appropriately incorporate AI assistance
Prerequisites
You aren’t expected to have deep expertise in all of these areas, but you should be comfortable with the following:
- Python programming fundamentals (working with common data structures, functions, pandas)
- Basic linear algebra. Understanding how language models represent and manipulate text requires familiarity with vector and matrix operations (addition, multiplication, transpose, distance, similarity).
- Basic machine learning concepts (supervised vs unsupervised learning, common evaluation metrics)
- Fundamental probability concepts (conditional probability, independence)
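As a quick self-check on the Python and linear algebra prerequisites, the short sketch below (assuming only NumPy, which we will use throughout the course) exercises each of the vector and matrix operations listed above on two toy vectors:

```python
import numpy as np

# Two toy "word vectors"
a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 0.0, 1.0])

# Addition, a matrix built from the vectors, and its transpose
s = a + b           # element-wise sum -> [3., 2., 4.]
M = np.outer(a, b)  # 3x3 outer-product matrix
Mt = M.T            # transpose

# Euclidean distance and cosine similarity
dist = np.linalg.norm(a - b)
cos = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(round(dist, 3), round(cos, 3))
```

If every line here makes sense to you, you have the background needed for the embedding material in weeks 2 and 5.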
Technical Requirements
This course explores cutting-edge AI technologies, but we’ll be working within practical computational constraints. While large language models like GPT-4 or Claude 3.5 require significant computing resources, we’ll focus on working with smaller, more manageable models that can run on personal computers. Students will need a laptop capable of running Python and handling lightweight language models (8GB RAM minimum, 16GB+ recommended). We’ll use TerpAI for tasks requiring more computational power, but part of the learning experience will involve understanding how to conduct meaningful research within resource limitations. The course will emphasize understanding core concepts and developing practical workflows that can scale from limited to abundant computational resources.
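To estimate whether a given model will fit on your laptop, a common rule of thumb is that weight memory is roughly parameter count times bytes per parameter (this illustrative sketch ignores activations, KV cache, and framework overhead, which add more on top):

```python
def approx_weight_memory_gb(n_params: float, bytes_per_param: float = 2.0) -> float:
    """Approximate RAM needed for model weights alone (fp16 = 2 bytes per parameter)."""
    return n_params * bytes_per_param / 1e9

# A 1B-parameter model in fp16 needs roughly 2 GB just for weights;
# 4-bit quantization (0.5 bytes per parameter) cuts that to about 0.5 GB.
print(approx_weight_memory_gb(1e9))       # 2.0
print(approx_weight_memory_gb(1e9, 0.5))  # 0.5
```

This is why models in the 1-3B parameter range, especially quantized ones, are practical on an 8-16GB laptop while GPT-4-class models are not.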
Assessment
Weekly questions/reflections (20%)
Students will submit weekly questions or reflections (maximum 500 words) by Tuesday at 5 p.m. before each class. These submissions serve two purposes: they help shape our class discussion and demonstrate your engagement with the readings. Because they will guide our discussions, be prepared to elaborate on your question or reflection during class. Your submission should do one of the following:
- Pose a substantive question sparked by the readings. This could be about methodology, implications, connections to other work, or potential applications. Questions should go beyond basic clarification to engage with the material's concepts or implications.
- Offer a reflection that connects multiple readings, relates the material to your own research, or critically examines the methodology or assumptions. Your reflection might consider how different papers approach similar problems, identify potential limitations, or propose new applications.
- Express and explore areas of confusion in the readings. Some of our papers are technically challenging, and identifying what you don't understand is an important part of the learning process. When discussing confusing aspects, please:
- Describe your current understanding of the concept
- Identify specifically what aspect is unclear
Presentations (20%)
Throughout the semester, you will present assigned papers to the class. These 20-minute presentations should include 12-15 minutes of content and 5-8 minutes of discussion leading. Each presentation should:
- Clearly state the paper's main contribution and why it matters
- Walk through one or two illustrative examples from the paper
- Discuss limitations and potential extensions
- Prepare 2-3 discussion questions for the class
- Be ready to facilitate brief discussion of these questions
Research Project Proposal (20%)
The research proposal (3-5 pages) outlines your planned investigation into either evaluating LLMs for specific research tasks or using LLMs to study a substantive research question. Before submitting your proposal, you must schedule a meeting with me to discuss your ideas. You may work alone or with one other student in the course; I highly encourage working in pairs. For our meeting, prepare a one-paragraph summary of your idea and 2-3 specific questions. This meeting should take place at least one week before the proposal deadline. The proposal is due in week 8 of the class (3/20). Your proposal should be structured as follows:
- Introduction (~1 page)
Present your core research question or evaluation task. Whether you're assessing LLM capabilities or studying a substantive topic, explain why this question matters and how LLMs offer unique insights for addressing it. For instance, you might explore how LLMs could help trace the evolution of methodological discussions in your field, or evaluate how well they can identify theoretical frameworks in academic papers.
- Proposed Methodology (1-2 pages)
Detail your research design, including:
- Data sources
- Which models or tools you'll use
- Your analytical approach
- Why your chosen methods are appropriate for your question
- Validation Strategy (~1 page)
Describe how you'll verify your results and ensure methodological rigor. This might include:
- Comparison with human coding
- Use of multiple models
- Development of benchmarks
- Strategies for addressing potential biases
- Timeline and Feasibility (~0.5 page)
Provide a realistic schedule showing how you'll complete the work within the semester. Include key milestones such as:
- Data collection
- Initial analysis
- Validation steps
- Writing and revision
Final Paper (40%)
The final paper will be due on the first day of the finals period.
Weekly Schedule
Week 1: Introduction to Natural Language Processing
Slides | Lab Notebook | Solutions
Required Reading:
- Textbook: Jurafsky & Martin (2025). Speech and Language Processing
- Chapter 3: 3.1 to 3.3, 3.7
- Chapter 4: 4.1, 4.6 to 4.8
Week 2: Word Embeddings (No Class)
Required Reading:
- Hofmann et al. (2021). Dynamic Contextualized Word Embeddings
- Kutuzov et al. (2018). Diachronic word embeddings and semantic shifts: a survey
- Rodriguez & Spirling (2022). Word Embeddings: What Works, What Doesn’t, and How to Tell the Difference for Applied Research
Optional Reading:
- Charlesworth et al. (2022). Historical representations of social groups
- Hamilton et al. (2018). Diachronic Word Embeddings Reveal Statistical Laws
- Yao et al. (2018). Dynamic Word Embeddings for Evolving Semantic Discovery
Week 3: Neural Networks
Required Reading:
- Textbook: Jurafsky & Martin (2025). Speech and Language Processing
- Chapter 5: 5.1 to 5.5
- Chapter 6: 6.1 to 6.4, 6.8 to 6.12
- Chapter 7
Week 4: RNNs and Attention
Slides | Lab Notebook | Solutions
Required Reading:
- Textbook: Jurafsky & Martin (2025). Speech and Language Processing
- Chapter 8: 8 to 8.2.1, 8.3, 8.7 (not 8.7.1), 8.8
- Chapter 9
Recommended Videos:
- Serrano.Academy: The Attention Mechanism in Large Language Models
- Transformer explanation
- Neel Nanda: What is a Transformer
Optional Reading:
- Devlin et al. (2019). BERT: Pre-training of Deep Bidirectional Transformers
- Vaswani et al. (2017). Attention Is All You Need
Week 5: Transformers and Sentence Embeddings
Required Reading:
- Textbook: Jurafsky & Martin (2025). Speech and Language Processing
- Chapter 11: 11.1 to 11.4
- Alammar (2018). The Illustrated Transformer
- Rush (2018). The Annotated Transformer
Optional Reading:
- Devlin et al. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
- Blog Post: Foundation Models, Transformers, BERT and GPT
Week 6: Mechanistic Interpretability
Required Reading:
- Bricken et al. (2023). Towards Monosemanticity: Decomposing Language Models With Dictionary Learning
- Tennenholtz et al. (2024). Demystifying Embedding Spaces using Large Language Models
Optional Reading:
- Ahuja et al. (2024). Learning Syntax Without Planting Trees: Understanding When and Why Transformers Generalize Hierarchically
- Csordas et al. (2024). Recurrent Neural Networks Learn to Store and Generate Sequences using Non-Linear Representations
- Engels et al. (2024). Not All Language Model Features Are Linear
- Huben et al. (2023). Sparse Autoencoders Find Highly Interpretable Features in Language Models
- Jiang et al. (2024). On the Origins of Linear Representations in Large Language Models
- Karvonen et al. (2024). Evaluating Sparse Autoencoders
- Lieberum et al. (2024). Gemma Scope: Open Sparse Autoencoders
- Luo et al. (2024). PaCE: Parsimonious Concept Engineering for Large Language Models
- Park et al. (2024). The Linear Representation Hypothesis and the Geometry of Large Language Models
- Park et al. (2024). The Geometry of Categorical and Hierarchical Concepts in Large Language Models
- Rajendran et al. (2024). Learning Interpretable Concepts: Unifying Causal Representation Learning and Foundation Models
- Valois et al. (2024). Frame Representation Hypothesis: Multi-Token LLM Interpretability and Concept-Guided Text Generation
Week 7: Qualitative Coding with LLMs I
Slides (Includes Final Project guidelines)
Required Reading:
- Textbook: Jurafsky & Martin (2025). Speech and Language Processing
- Chapter 12
- Lam et al. (2024). Concept Induction: Analyzing Unstructured Text with High-Level Concepts Using LLooM
- Schroeder et al. (2025). Large Language Models in Qualitative Research: Uses, Tensions, and Intentions
- Zhang et al. (2024). When Qualitative Research Meets Large Language Model: Exploring the Potential of QualiGPT as a Tool for Qualitative Coding
Optional Reading:
- Chew et al. (2023). LLM-Assisted Content Analysis
- Dai et al. (2023). LLM-in-the-loop: Leveraging Large Language Model for Thematic Analysis
- Eschrich & Sterman (2024). A Framework For Discussing LLMs as Tools for Qualitative Analysis
- Raza et al. (2025). LLM-TA: An LLM-Enhanced Thematic Analysis Pipeline for Transcripts from Parents of Children with Congenital Heart Disease
Week 8: Qualitative Coding with LLMs II
Lab Notebook - Qualitative Coding with APIs | Lab Notebook - Analyze Results
Required Reading:
- Dunivin (2024). Scalable Qualitative Coding with LLMs: Chain-of-Thought Reasoning Matches Human Performance in Some Hermeneutic Tasks
- Rasheed et al. (2024). Can Large Language Models Serve as Data Analysts? A Multi-Agent Assisted Approach for Qualitative Data Analysis
- Relin et al. (2024). Using Instruction-Tuned Large Language Models to Identify Indicators of Vulnerability in Police Incident Narratives
Optional Reading:
- De Paoli (2024). Further Explorations on the Use of Large Language Models for Thematic Analysis. Open-Ended Prompts, Better Terminologies and Thematic Maps
- Gao et al. (2025). Using Large Language Model to Support Flexible and Structural Inductive Qualitative Analysis
- Overney et al. (2025). SenseMate: An Accessible and Beginner-Friendly Human-AI Platform for Qualitative Data Analysis
Week 9: Modern Topic Modeling
Required Reading:
- Hoyle et al. (2022). Are Neural Topic Models Broken?
- Pham et al. (2024). TopicGPT: A Prompt-based Topic Modeling Framework
- Wu et al. (2024). FASTopic: Pretrained Transformer is a Fast, Adaptive, Stable, and Transferable Topic Model
Optional Reading:
- Egger & Yu (2022). A Topic Modeling Comparison Between LDA, NMF, Top2Vec, and BERTopic to Demystify Twitter Posts
- Hastings & Pesando (2024). What’s a parent to do?
Week 10: Error Analysis and Validation
Required Reading:
- Carlson et al. (2024). The Use of LLMs to Annotate Data in Management Research: Warnings, Guidelines, and an Application to Organizational Communication
- Ludwig et al. (2024). Large Language Models: An Applied Econometric Framework
- Pangakis et al. (2023). Automated Annotation with Generative AI Requires Validation
Optional Reading:
- Barrie et al. (2024). Replication for Large Language Models
- Egami et al. (2024). Using Imperfect Surrogates for Downstream Inference
- Thalken et al. (2023). Modeling Legal Reasoning: LM Annotation at the Edge of Human Agreement
Week 11: Concept Tracing I
Required Reading:
- Ganguli et al. (2024). Mapping Inventions in the Space of Ideas, 1836–2022: Representation, Measurement, and Validation
- Li (2024). Tracing the Genealogies of Ideas with Large Language Model Embeddings
- Nascimento et al. (2022). Concept Detection in Philosophical Corpora
Optional Reading:
- Lucy et al. (2023). Words as Gatekeepers
- Rosin et al. (2022). Time Masking for Temporal Language Models
- Vicinanza et al. (2022). Deep-learning model of prescient ideas demonstrates that they emerge from the periphery
Week 12: Concept Tracing II
Required Reading:
- Cong et al. (2024). Textual Factors: A Scalable, Interpretable, and Data-driven Approach to Analyzing Unstructured Information
- Garg & Fetzer (2025). Causal Claims in Economics
- Lam et al. (2024). Concept Induction: Analyzing Unstructured Text with High-Level Concepts Using LLooM