INST 414 / SDSI 414: Data Science Techniques (Spring 2026)
Format: Blended — async video lectures (online) + in-person Friday labs
Lab Location: BLD4 3321
Lab Time: Fridays 9:00 to 10:30 a.m.
Instructor: Zubin Jelveh (zjelveh@umd.edu)
Office Hours: TBD
Course Description
This course provides a strong foundation in key data science techniques. As the amount of data available to public- and private-sector organizations continues to grow, it has become critical to have data scientists who can leverage this data to generate insights and enable data-driven decision-making, while ensuring that these decisions are based on fair and ethical principles.
You will learn foundational data science techniques including basic probability and statistics, supervised machine learning algorithms (such as Naive Bayes, linear and logistic regression, decision trees, and random forests), feature engineering, working with Python libraries, principles of model selection and evaluation (including cross-validation), and considerations around fairness and bias in machine learning applications. We’ll also touch on text mining techniques and explore generative AI.
Course format: Lecture content is delivered asynchronously through video. Friday sessions are hands-on labs where you’ll apply techniques to real datasets with instructor support.
Note: This course primarily focuses on foundational supervised learning techniques.
Note: Knowledge of Python is assumed.
Course Objectives
After completing this course, students will be able to:
- Apply basic probability concepts and relate them to uncertainty in predictions
- Implement standard machine learning algorithms (linear regression, logistic regression, decision trees, random forests)
- Select and evaluate models using cross-validation and appropriate performance metrics
- Engineer features from raw data to improve model performance
- Identify and address fairness considerations in machine learning applications
Prerequisites
Minimum grade of C- in MATH115 and STAT100; minimum C- in INST126 or GEOG276; minimum C- in INST201, INST301, or BSOS233; minimum C- in a social science course (AASP101, ANTH210, ECON200, GVPT170, PSYC100, SOCY100, etc.); and minimum C- in BSOS233 or INST314.
Technical Requirements
- A laptop capable of running Python (bring to Friday labs)
- No software purchases required — we use free, open-source tools
- Access to ELMS for async lecture videos, assignments, and announcements
Working with Lab Notebooks
Lab notebooks open in Google Colab directly from the links below.
Important: When you open a lab notebook, immediately click File → Save a copy in Drive. This creates your own version in your Google Drive that you can edit and save.
If you skip this step, your work will not be saved.
Turn off AI assistance: Go to Settings → AI Assistance and uncheck everything. AI-generated code is not allowed on assignments in this course.
Assessment
Homework Assignments (30%)
Four homework assignments throughout the semester that build on lecture and lab content. Assignments will be distributed and submitted through ELMS.
Weekly Quizzes (20%)
10 weekly quizzes covering lecture material. These are administered through ELMS.
Midterm Exam (25%)
Covers material from Weeks 1-8 (probability through random forests). Scheduled for Week 9 (4/3).
Final Project or Exam (25%)
Details to come.
Weekly Schedule
Week 1 (1/30): Data Science Background
- Video Lecture: Slides | Video
- Lab: Introduction to Pandas — Lab 1 Slides | Lab 1 Notebook | Notes
- Resource: Resources for Learning Pandas
Week 2 (2/6): Probability
- Video Lecture: Slides (Part 1: Probability Basics) | Slides (Part 2: Joint and Marginal Distributions) | Video
- Practice Problems
- Lab: Lab 2 Slides | Lab 2 Notebook
- Resource: Web Resources for Probability
Week 3 (2/13): Conditional Probability
- Video Lecture: Slides
- Lab: Lab 3 Slides | Lab 3 Notebook
Problem Set 1 Out
Week 4 (2/20): Bayes' Theorem / Making Predictions
- Video Lecture: Slides
- Lab: Lab 4 Slides | Lab 4 Notebook
Week 5 (2/27): Performance Metrics / Naive Bayes
- Video Lecture: Slides
- Lab: Lab 5 Slides | Lab 5 Notebook
Problem Set 1 Due
Week 6 (3/6): Linear and Logistic Regression
- Video Lecture: Slides
- Lab: Lab 6 Slides | Lab 6 Notebook
Problem Set 2 Out
Week 7 (3/13): Overfitting / Decision Trees
- Video Lecture: Slides
- Lab: Lab 7 Slides | Lab 7 Notebook
Spring Break (3/20): No Class
Week 8 (3/27): Random Forest / Gradient Boosting
- Video Lecture: Slides
- Lab: Lab 8 Slides | Lab 8 Notebook
Problem Set 2 Due
Week 9 (4/3): Midterm (No Lab)
Week 10 (4/10): Fairness
- Video Lecture: Slides
- Lab: Lab 9 Slides | Lab 9 Notebook
Problem Set 3 Out
Week 11 (4/17): Sources of Bias, Part I
- Video Lecture: Slides
Week 12 (4/24): Sources of Bias, Part II
- Video Lecture: Slides
Problem Set 3 Due / Problem Set 4 Out
Week 13 (5/1): Text Mining
- Video Lecture: Slides
Week 14 (5/8): Text Mining / Generative AI
- Video Lecture: Slides
Problem Set 4 Due
Course Policies
Communications
ELMS — Official course site for video lectures, materials, assignments, announcements, and grades. Make sure your email and announcement notifications are enabled.
Email — Administrative requests, quick clarifications. Please prefix the subject line with [INST414]. If you haven’t received a reply within 2 days, please email again.
Office Hours — Complex technical questions.
Accessibility and Disability Support
The University of Maryland is committed to creating and maintaining a welcoming and inclusive educational, working, and living environment for people of all abilities. Students with disabilities who require accommodations should contact the Accessibility and Disability Service (ADS) at 301-314-7682 or adsfrontdesk@umd.edu. Please inform me of any accommodations as soon as possible, preferably within the first two weeks of the semester.
More information: https://www.counseling.umd.edu/ads/
LLM Policy
While you are allowed and encouraged to use LLMs (ChatGPT, Copilot, etc.) to help you learn the material — for example, to explain concepts, debug error messages, or explore topics in more depth — do not use them to complete assignments. If it is clear that an LLM has been used on an assignment, you will receive an automatic zero.
UMD Policies
It is our shared responsibility to know and abide by the University of Maryland’s policies that relate to all courses, which include topics like academic integrity, student and instructor conduct, accessibility and accommodations, attendance and excused absences, grades and appeals, and copyright and intellectual property.
Please visit www.ugst.umd.edu/courserelatedpolicies.html for the full list of campus-wide policies.