As I say many times in class, the linear regression model is the basis for many other models in statistics and machine learning. In this unit we will begin to see how the basic linear regression model can take us beyond strictly linear relationships, starting by adding interactions to regression models. Interactions occur when the effect of one predictor depends on the values of the others, a departure from the additivity that the linear regression models studied thus far have assumed. We can include them by adding products of predictors to our model (see the sketch below). Using the same idea, we can in principle model any nonlinear relationship; this is the idea of a feature space in machine learning. It allows us to build much more complex models, and with that flexibility come concerns about overfitting. To that end, we will investigate the tradeoffs that arise when building complex regression models.
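To make the "products of predictors are just more columns" point concrete, here is a minimal sketch (my own illustration on synthetic data, not code from the course): an interaction term and a polynomial term are each just an extra column in the design matrix, so ordinary least squares fits both without any change to the machinery.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# --- Interaction: the effect of x1 depends on the value of x2 ---
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 1.0 * x2 + 3.0 * x1 * x2 + rng.normal(scale=0.5, size=n)

# Add a product column; the model is still linear in the coefficients.
X = np.column_stack([np.ones(n), x1, x2, x1 * x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print("interaction fit:", np.round(beta, 2))  # approx [1, 2, -1, 3]

# --- Polynomial: nonlinear in x, but linear in the coefficients ---
x = rng.uniform(-2, 2, size=n)
y_poly = 0.5 - x + 0.8 * x**3 + rng.normal(scale=0.3, size=n)
X_poly = np.column_stack([x**p for p in range(4)])  # columns 1, x, x^2, x^3
beta_poly, *_ = np.linalg.lstsq(X_poly, y_poly, rcond=None)
print("polynomial fit:", np.round(beta_poly, 2))  # approx [0.5, -1, 0, 0.8]
```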
Material:
Interactions; residual plots and diagnostics (interaction detection); nonlinear models within the linear framework (focusing on polynomials); cross-validation, test/training data, and overfitting.
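As a preview of the last topic, the following sketch (again my own illustration, not the course's code) compares polynomial degrees by 5-fold cross-validation on synthetic data. Training error keeps shrinking as the degree grows, while the held-out error eventually turns back up: that gap is overfitting.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
x = rng.uniform(-2, 2, size=n)
y = np.sin(x) + rng.normal(scale=0.3, size=n)  # truth is nonlinear but smooth

def design(x, degree):
    """Polynomial design matrix: columns 1, x, ..., x^degree."""
    return np.column_stack([x**p for p in range(degree + 1)])

def cv_mse(x, y, degree, k=5):
    """Mean squared error on held-out folds, averaged over k folds."""
    idx = rng.permutation(len(x))
    errs = []
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)  # indices not in the held-out fold
        beta, *_ = np.linalg.lstsq(design(x[train], degree), y[train], rcond=None)
        pred = design(x[fold], degree) @ beta
        errs.append(np.mean((y[fold] - pred) ** 2))
    return np.mean(errs)

for d in [1, 3, 5, 9, 15]:
    beta, *_ = np.linalg.lstsq(design(x, d), y, rcond=None)
    train_mse = np.mean((y - design(x, d) @ beta) ** 2)
    print(f"degree {d:2d}: train MSE {train_mse:.3f}, CV MSE {cv_mse(x, y, d):.3f}")
```

The training MSE here can only decrease as the degree grows, since each larger model nests the smaller ones; the cross-validated MSE is what tells us when added flexibility stops paying for itself.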