Instructor: Ethan Levien
Prerequisites: Some exposure to probability/statistics (e.g., Math 10, 20) and comfort with coding (e.g., CS 1, internships). Also, see Unit 0.
Not registered? Fill out this form.
Course objectives:
This is an introductory/intermediate statistics class focusing on regression modeling, especially linear regression. These models are the foundation of many widely used data analysis techniques, including machine learning algorithms. You will learn what is going on “under the hood” in regression models, not just how to implement them. Compared to more mathematical statistics courses, the emphasis will be on computational experiments with real and simulated data, rather than theorems and proofs.
Topics include basic probability theory and statistical inference, single- and multiple-predictor linear regression models, model evaluation, overfitting, nonlinear models, and connections between classical statistics and machine learning. The course will also emphasize the responsible and safe use of AI tools to supplement traditional coding and mathematical calculations. Applications to the social and natural sciences will be discussed.
See weekly schedule for details.
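To give a sense of the kind of computational experiment with simulated data mentioned in the course objectives, here is a minimal sketch. It is an illustration only and is not taken from the course notes. It assumes NumPy is installed; the function name `simulate_and_fit` and all parameter values are hypothetical choices made for this example.

```python
import numpy as np


def simulate_and_fit(n=500, intercept=1.0, slope=2.0, noise_sd=0.5, seed=0):
    """Simulate data from a single-predictor linear model and fit it by least squares.

    All argument values are illustrative assumptions, not course material.
    """
    rng = np.random.default_rng(seed)

    # Simulate a predictor and a noisy linear response: y = intercept + slope * x + noise
    x = rng.uniform(0.0, 1.0, size=n)
    y = intercept + slope * x + rng.normal(0.0, noise_sd, size=n)

    # Design matrix with a column of ones so the fit includes an intercept
    X = np.column_stack([np.ones(n), x])

    # Ordinary least squares estimate of (intercept, slope)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


coef = simulate_and_fit()
print("estimated intercept and slope:", coef)
```

Because the true intercept and slope are known in a simulation, the estimates can be compared directly against them, and rerunning with a different `seed` or `n` shows how much the estimates vary from sample to sample.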
Attendance and class rules: Come to class. Please do not use phones or computers in class except during in-class problem-solving sessions. If you have an issue with this policy, please come talk to me. I may ask you to leave the room if your use of technology is distracting.
My availability:
Textbooks: My notes are mostly self-contained, although I will reference material from a few textbooks:
You should be able to find PDFs of all these books online.
Software:
All coding will be done using Python in Colab Notebooks. Within Python, we will use several packages throughout the course, including:
Your grade will be based on the following. You should see Canvas for the specific grading scheme and see the linked pages for assignment details.
Exams: There will be two exams:
Project: You will complete a project as described on the project page. You may work in groups of up to 3, and all students in a group will receive the same grade. The project should adhere to the guidelines on the Canvas assignment.
Contributing to course material: You can earn extra credit by contributing to course material. Contributions can include improving the class notes, adding exercises, and suggesting exam problems. All contributions must be made in the form of pull requests on GitHub, as described on the contribution page.
Exercises: At the end of each section there are a number of exercises. You should complete all of them and ask questions if you have any. Exercises marked with a ❐ are specifically recommended for the exams.
Students with disabilities who may need disability-related academic adjustments and services for this course are encouraged to see me privately as early in the term as possible. Students requiring disability-related academic adjustments and services must consult the Student Accessibility Services office (Carson Hall, Suite 125, 646-9900). Once SAS has authorized services, students must show the originally signed SAS Services and Consent Form and/or a letter on SAS letterhead to me.
As a first step, if you have questions about whether you qualify to receive academic adjustments and services, please contact the SAS office. All inquiries and discussions will remain confidential.