Instructor: Ethan Levien
Prerequisites: There are many paths to prepare you for this course. The important thing is that you have some exposure to probability theory and are comfortable coding.
Not registered? Fill out this form
Course objectives: This is an introductory/intermediate statistics class with an emphasis on using simulation to explore statistical models. You will learn how to build, fit, and make predictions with regression models. Such models form the basis of many widely used data analysis techniques, including machine learning algorithms. This is an applied course, and we won’t spend much time deriving equations or proving theorems (at least compared to MATH 40 and 70). Instead, we will learn by playing around with real and simulated data. We will challenge the underlying assumptions of a method and see when things “break”. The goal is to help you develop an intuition about statistical inference, which can be generalized to real-world settings where theorems and analytical formulas are not applicable. See weekly schedule for details.
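To give a flavor of what "using simulation to explore statistical models" can look like, here is a minimal sketch rather than actual course material. It simulates data from a straight-line model, fits a regression by least squares, and makes a prediction. The "true" parameter values, the sample size, and all variable names are assumptions chosen purely for illustration.

```python
import numpy as np

# Seeded random number generator so the simulation is reproducible.
rng = np.random.default_rng(0)

# Hypothetical "true" model, assumed only for this illustration:
# y = intercept + slope * x + Gaussian noise
true_intercept, true_slope, noise_sd = 1.0, 2.0, 0.5

# Simulate 200 data points from the model.
x = rng.uniform(0, 10, size=200)
y = true_intercept + true_slope * x + rng.normal(0, noise_sd, size=200)

# Fit a straight line by least squares.
# For deg=1, np.polyfit returns the coefficients as [slope, intercept].
slope_hat, intercept_hat = np.polyfit(x, y, deg=1)

# Use the fitted model to predict y at a new value of x.
x_new = 4.0
y_pred = intercept_hat + slope_hat * x_new

print("estimated slope:", slope_hat)
print("estimated intercept:", intercept_hat)
print("prediction at x = 4:", y_pred)
```

Because the data are simulated, the estimates can be compared directly with the values used to generate them. Changing `noise_sd` or the sample size shows how the quality of the fit responds.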
Note on coding: All coding for the course will be done in Python. Some advantages of Python (over R) are that: (1) We can easily run everything directly in the browser using Google Colab Notebooks, so there is no need to download anything on your machine. (2) Python is widely used in data science and machine learning, both in academia and in industry. (3) While some very basic things are more difficult to implement, I think you’ll find it’s easier to generalize to more advanced methods. (4) I know Python and I’ve basically never coded in R.
My availability: I will hold office hours in person:
I will generally be available to answer questions on Slack (please use Slack over email for course-related matters) throughout the week. I will occasionally answer questions on the weekend, but there is no guarantee.
Attendance and class rules: The course meets twice a week and attendance is mandatory. You should bring a laptop computer or tablet on Fridays when I will do live coding exercises. Please do not use phones or computers in class except on Friday. If you have an issue with this policy, please come talk to me. I may ask you to leave the room if I find your use of technology to be distracting.
Exams: There will be an in-class midterm quiz and a final. I will provide practice exams and additional practice problems for each one.
Exercises: Your homework is to submit solutions to a set of exercises. You will submit your solutions to Canvas (approximately) every week. Then, the following week you will self-evaluate (i.e., grade) your solutions and submit the evaluation. You should use the following point scale, which I will elaborate on in class.
The number of points you get for each self-evaluation is the score you’ve given yourself plus an additional point for providing an explanation of what you did wrong. The graders will review your self-evaluations.
Guidelines for turning in exercises:
Final grades: See Canvas for details on how your final grade will be computed.
Use of Large Language Models (LLMs) such as ChatGPT: I STRONGLY encourage you to use LLMs to assist with exercises and the final project. However, it is important that you use them in a way that supports learning the material and not as a crutch. Therefore, usage of ChatGPT and other AI programs is subject to the following guidelines.
Here are some examples of acceptable LLM usage:
"... x, y, and z, how do I make a scatter plot where x and y are the coordinates and z is the color of the point?"
[Note: I reserve the right to change the guidelines above at any point during the term.]
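For reference, the scatter-plot question in the example prompt above has a short answer in matplotlib. The following is a minimal sketch, not part of the course guidelines. The arrays `x`, `y`, and `z` are made-up placeholder data, and the colormap choice is an assumption.

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder data, assumed only for illustration.
rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = rng.normal(size=100)
z = x + y  # any numeric array with the same length as x and y

fig, ax = plt.subplots()

# x and y give the coordinates; c=z maps each value of z to a color.
points = ax.scatter(x, y, c=z, cmap="viridis")

# The colorbar shows which colors correspond to which values of z.
fig.colorbar(points, ax=ax, label="z")
ax.set_xlabel("x")
ax.set_ylabel("y")

plt.show()
```

In a Colab notebook the figure appears directly below the cell.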
Textbooks: The course is based on two textbooks:
Software: All coding will be done using Python in Colab Notebooks. Within Python, there are a number of packages we will use throughout the course, including:
Students with disabilities who may need disability-related academic adjustments and services for this course are encouraged to see me privately as early in the term as possible. Students requiring disability-related academic adjustments and services must consult the Student Accessibility Services office (Carson Hall, Suite 125, 646-9900). Once SAS has authorized services, students must show the originally signed SAS Services and Consent Form and/or a letter on SAS letterhead to me. As a first step, if students have questions about whether they qualify to receive academic adjustments and services, they should contact the SAS office. All inquiries and discussions will remain confidential.