Module 2: Univariate Linear Regression

What this module covers

This module lays the groundwork for all supervised Machine Learning algorithms covered in this course — all the way to neural networks. You’ll learn how to find the best-fitting parameters of a linear model by defining a cost function (the Mean Squared Error) and minimizing it using gradient descent.
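For reference, the model and cost function named above can be written out as follows. The symbols (w_0 for the intercept, w_1 for the slope, n for the number of data points) are notational assumptions made here; the course materials may use different symbols or a 1/(2n) prefactor, which changes the scale of the cost but not the location of its minimum.

```latex
\hat{y}_i = w_0 + w_1 x_i, \qquad
J(w_0, w_1) = \frac{1}{n} \sum_{i=1}^{n} \left( \hat{y}_i - y_i \right)^2
```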

Starting from any initial guess of the fitting parameters (intercept and slope), gradient descent iteratively improves those guesses by taking steps “downhill in the error landscape,” that is, in the direction of the negative gradient of the cost. The size of each step is controlled by a user-selected learning rate, and the process repeats until a stopping criterion is met (for example, the error or its change between steps falls below a tolerance), giving you your final model.
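The loop described above might look roughly like the following minimal sketch. It is not taken from the course notebook: the function names, the default learning rate and tolerance, and the synthetic data are all assumptions chosen for illustration. The stopping rule used here is one common choice, namely that a step no longer reduces the cost by more than the tolerance.

```python
import numpy as np

def mse(w0, w1, x, y):
    """Mean squared error of the line w0 + w1 * x on the data (x, y)."""
    return np.mean((w0 + w1 * x - y) ** 2)

def gradient_descent(x, y, w0=0.0, w1=0.0, learning_rate=0.1,
                     tol=1e-10, max_iters=10_000):
    """Fit intercept w0 and slope w1 by batch gradient descent on the MSE."""
    prev_cost = mse(w0, w1, x, y)
    for _ in range(max_iters):
        residual = w0 + w1 * x - y
        # Partial derivatives of the MSE with respect to w0 and w1.
        grad_w0 = 2.0 * np.mean(residual)
        grad_w1 = 2.0 * np.mean(residual * x)
        # Step downhill, scaled by the learning rate.
        w0 -= learning_rate * grad_w0
        w1 -= learning_rate * grad_w1
        cost = mse(w0, w1, x, y)
        # Stop once a step no longer reduces the cost noticeably.
        if abs(prev_cost - cost) < tol:
            break
        prev_cost = cost
    return w0, w1

# Synthetic, noise-free data on the line y = 1 + 2x (made up for illustration).
x = np.linspace(0.0, 1.0, 50)
y = 1.0 + 2.0 * x
w0, w1 = gradient_descent(x, y)
print(f"intercept = {w0:.3f}, slope = {w1:.3f}")
```

With a learning rate that is too large the updates overshoot and the cost grows instead of shrinking; with one that is too small the loop may hit max_iters before the stopping rule triggers.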

You’ll also learn about:

Batch gradient descent — the foundational optimization algorithm, made efficient through vectorized (matrix) operations
Mini-batch and stochastic gradient descent — variants that update the parameters using a small subset of the data (or a single example) per step, which scales better to large datasets
Accelerated methods (e.g., Adam) that automatically adapt the step size for each parameter using running averages of past gradients and their squares (first-derivative information only, no second derivatives)
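As a rough illustration of the first two items in the list above, the sketch below combines a vectorized gradient (written with a design matrix) with mini-batch updates. The function name, defaults, and synthetic data are hypothetical, chosen for this example rather than taken from the course materials. Setting batch_size to the number of data points would give batch gradient descent, and setting it to 1 would give stochastic gradient descent.

```python
import numpy as np

def minibatch_gradient_descent(x, y, batch_size=16, learning_rate=0.1,
                               epochs=200, seed=0):
    """Fit [intercept, slope] by mini-batch gradient descent on the MSE."""
    rng = np.random.default_rng(seed)
    # Design matrix: a column of ones (intercept) next to the feature column.
    X = np.column_stack([np.ones_like(x), x])
    w = np.zeros(2)
    n = len(x)
    for _ in range(epochs):
        order = rng.permutation(n)  # reshuffle the data every epoch
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            Xb, yb = X[idx], y[idx]
            # Vectorized MSE gradient on the batch: (2/m) * Xb^T (Xb w - yb).
            grad = 2.0 / len(idx) * Xb.T @ (Xb @ w - yb)
            w -= learning_rate * grad
    return w

# Synthetic, noise-free data on the line y = 1 + 2x (made up for illustration).
x = np.linspace(0.0, 1.0, 64)
y = 1.0 + 2.0 * x
w = minibatch_gradient_descent(x, y)
print(f"intercept = {w[0]:.3f}, slope = {w[1]:.3f}")
```

Each update here touches only batch_size rows of the data, so the cost per step stays fixed as the dataset grows; the trade-off is that each step follows a noisier estimate of the full gradient.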

Materials

Slides: Univariate Linear Regression (pdf)

Lecture: Linear Regression, Cost Functions, and Gradient Descent — An interactive notebook developing the theory with code and visualizations. The code snippets and figures in the slides are taken from this notebook, which goes into much more detail.

Practice: Linear Regression — Hands-on exercise applying gradient descent to fit real data.

Prerequisites

Module 1 — basic Python, NumPy, and Matplotlib.

Next module: Module 3: Multivariate Regression — generalizing to multiple features and nonlinear models.