Module 7: Artificial Neural Networks
What this module covers
This module introduces neural networks, a family of machine learning algorithms and the stepping stone to deep learning.
Artificial neural networks (ANNs) are, to a first approximation, combinations of many logistic regressions (for classification) or many linear regressions (for continuous targets). They are loosely inspired by how neurons in the human brain work:
- The building blocks are perceptrons — each one computes a weighted sum of inputs (the logit) and applies an activation function (like a sigmoid) to determine whether the “neuron” fires
- By combining many perceptrons into hidden layers, ANNs create complex nonlinear combinations of input features — the network does the feature engineering for you
- This is more powerful and more scalable than manually constructing polynomial features, whose number grows rapidly with the number of inputs and the polynomial degree
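The bullets above can be sketched in a few lines of plain Python. This is a minimal illustration, not the lecture notebook's code: the function names (`sigmoid`, `perceptron`, `hidden_layer`) and all weights are hypothetical, and the weights are picked by hand rather than learned.

```python
import math

def sigmoid(z):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def perceptron(inputs, weights, bias):
    """Weighted sum of the inputs (the logit), then a sigmoid activation."""
    logit = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(logit)

def hidden_layer(inputs, weight_rows, biases):
    """A layer is several perceptrons that all read the same inputs."""
    return [perceptron(inputs, w, b) for w, b in zip(weight_rows, biases)]

# Hypothetical hand-picked weights, for illustration only (not learned).
x = [0.5, -1.0]
hidden = hidden_layer(x, weight_rows=[[1.0, 2.0], [-1.5, 0.5]], biases=[0.0, 0.1])

# The output neuron sees the hidden activations, i.e. nonlinear
# combinations of the original features, instead of the raw inputs.
output = perceptron(hidden, weights=[2.0, -1.0], bias=0.0)
print(hidden, output)
```

Each hidden value is itself the output of a small logistic regression, so the final neuron is a logistic regression on features the network built for itself.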
The lecture notebook codes an ANN from scratch so you can look inside the usual “black box” and see how neural networks learn. The practice exercise guides you through applying an ANN to a different dataset to develop intuition for hyperparameter tuning (number of layers, number of nodes per layer, learning rate, regularization strength).
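To give a feel for two of those hyperparameters, here is a minimal sketch of gradient descent for a single sigmoid neuron, assuming a cross-entropy loss and an L2 penalty. It is not the notebook's implementation: `train_neuron`, its default values, and the toy OR dataset are all hypothetical choices made for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_neuron(data, learning_rate=0.5, l2=0.0, epochs=500):
    """Fit one sigmoid neuron with batch gradient descent (hypothetical helper)."""
    n_features = len(data[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    m = len(data)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in data:
            pred = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            error = pred - y  # gradient of the cross-entropy loss w.r.t. the logit
            for j, xj in enumerate(x):
                grad_w[j] += error * xj
            grad_b += error
        # learning_rate scales the step size; the l2 * w term pulls each
        # weight toward zero (the bias is left unpenalized here).
        weights = [w - learning_rate * (g / m + l2 * w)
                   for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / m
    return weights, bias

# Toy dataset (logical OR), assumed purely for illustration.
data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]

weights, bias = train_neuron(data)
weights_reg, bias_reg = train_neuron(data, l2=0.1)

predictions = [round(sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias))
               for x, _ in data]
print(predictions)
print(weights, weights_reg)
```

Rerunning with different `learning_rate` and `l2` values shows the trade-off the practice exercise explores: larger steps learn faster but can become unstable, and a stronger penalty keeps weights small at the cost of fitting the training data less closely.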
Materials
Slides: Artificial Neural Networks (pdf)
Lecture: Artificial Neural Networks. Covers feedforward networks, backpropagation, and activation functions, with an ANN coded from scratch on the MNIST handwritten-digit dataset.
Practice: Neural Networks. Apply an ANN to the Fashion-MNIST dataset, experimenting with network architecture and hyperparameters to achieve accurate classification without overfitting.
These exercises use small, preprocessed datasets (handwritten digits, fashion items) so that training completes in minutes on a CPU. The same ANN architectures apply to more complex scientific problems, but those typically require GPU acceleration, which you’ll explore in Module 8.
Prerequisites
Module 6 — cross-validation and regularization (these concepts apply directly to neural networks).
Next module: Module 8: Convolutional Neural Networks, a widely used architecture for computer vision and satellite image analysis.