Instructor: Ethan Levien.
Meeting time: T/TH @ 10-12
Office Hours: T @ 1
Prerequisites: Technically Math 10; however, there are many other paths to prepare you for this course. The important thing is that you have some exposure to probability theory and are comfortable coding.
Course objectives: You will learn how to build, fit and make predictions with regression models. Such models form the basis of many widely used data analysis techniques, including most machine learning algorithms. This is an applied course, and we won’t spend much time deriving equations or proving theorems. Instead, we will learn by playing around with real and simulated data. We will challenge the underlying assumptions of a method and see when things “break”. The goal is to help you develop an intuition about statistical inference, which can be generalized to real world settings where theorems and analytical formulas are not applicable.
Coding: All coding for the course will be done in Python. Some advantages of Python (over R) are: (1) We can easily run everything directly in the browser using Google Colab notebooks, so there is no need to install anything on your machine. (2) Python is widely used in data science and machine learning, both in academia and in industry. (3) While some very basic things are more difficult to implement, I think you’ll find it’s easier to generalize to more advanced methods. (4) I know Python and I’ve basically never coded in R.
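To give a flavor of the simulate-then-fit workflow described in the course objectives, here is a minimal sketch. It uses only the Python standard library so that it runs anywhere; the function name `fit_line` and all of the numbers (true intercept, slope, noise level, sample size) are made up for illustration and are not taken from the course materials.

```python
import random

def fit_line(x, y):
    """Ordinary least squares for the model y = a + b*x; returns (a, b)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx            # estimated slope
    a = y_bar - b * x_bar    # estimated intercept
    return a, b

# Simulate data from a known line plus Gaussian noise (illustrative values).
rng = random.Random(0)
true_a, true_b, noise_sd = 1.0, 2.0, 0.5
x = [rng.uniform(0, 10) for _ in range(200)]
y = [true_a + true_b * xi + rng.gauss(0, noise_sd) for xi in x]

# Fit the model and predict y at a new point, x = 5.
a_hat, b_hat = fit_line(x, y)
prediction = a_hat + b_hat * 5.0
print(a_hat, b_hat, prediction)
```

Because the data are simulated, we know the true intercept and slope and can check how close the estimates land. Changing `noise_sd` or the sample size, or generating `y` from something other than a straight line, is one simple way to see when the method starts to "break".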
Attendance: The course meets twice a week and attendance is mandatory.
Grading: Your grade will be based on the following assignments:
Grades will be based on a combination of self-evaluation and my own assessment (more on this in class).
Exercises: Your “homework” is to submit solutions to a set of exercises. I say “homework” because I plan to incorporate problem-solving sessions into the lectures, giving you more time to discuss problems with me and your peers. You will submit your solutions to Gradescope (approximately) every week before I release the solutions. Then, the following week, you will self-evaluate (i.e. grade) your solutions and submit the evaluation. You should use the following point scale, which I will elaborate on in class:
The number of points you get for each self-evaluation is simply the score you’ve given yourself plus an additional point for providing an explanation of what you did wrong. The graders will simply review your self-evaluations.
Textbooks: The course will be self-contained in the Python notebooks and class notes. However, I will occasionally reference the following textbooks for additional readings (note that the first two texts use R rather than Python, so the coding components can be skipped):
Software: All coding will be done using Python in Colab notebooks. Within Python there are a number of packages we will use throughout the course, including:
You do not need to be an expert in any of these packages, and I will provide you with skeleton code for most tasks. That said, I expect you to have some basic proficiency in debugging, which will occasionally involve referencing the documentation for these packages.
Week | Topics | Reading | Assignments |
---|---|---|---|
1 | | | |
2 | | | |
3 | | | |
4 | | | |
5 | | | |
6 | | | |
7 | | | |
8 | | | |
9 | | | |
Students with disabilities who may need disability-related academic adjustments and services for this course are encouraged to see me privately as early in the term as possible. Students requiring disability-related academic adjustments and services must consult the Student Accessibility Services (SAS) office (Carson Hall, Suite 125, 646-9900). Once SAS has authorized services, students must show me the originally signed SAS Services and Consent Form and/or a letter on SAS letterhead. As a first step, students who have questions about whether they qualify to receive academic adjustments and services should contact the SAS office. All inquiries and discussions will remain confidential.