Code
!pip install rasterio
!pip install rioxarrayWhich platform should I choose?
📖 Review the Lecture notes
This lab is intended to give anyone who hasn’t worked with multispectral satellite data (in python) before some hands-on experience in doing so. The application (separate lab section) is the classification of water (though other classes, such as vegetation, are very similar) from images over the Alaskan and Canadian Arctic. The examples are from our paper on super-resolution fully convolutional neural networks (FCN).
The notebook covers how to import satellite images with rasterio, which is a python wrapper for gdal, a popular library for working with geospatial data. You can see how to visualize the data stored in an xarray with matplotlib, similar to earlier labs. Next, we turn the data into a regular numpy array, which you can manipulate with the python commands that you learned throughout the semester. Specifically, we perform some basic algebraic operations to compute the (Modified) Normalized Water Index (MNDWI) as a traditional way to classify water. The classification from NDWI is compared to one from the FCN in our paper and we compute various accuracy metrics between those two classification predictions.
We will try to do the full CNN implementation and training in a separate notebook.
There are not many specific lab questions; rather you can tinker with the codes to see if you understand every step and change any number of parameters. For example, instead of plotting a RGB image, you can explore different band combinations to see if water (or other features) stand out more in, e.g., green, blue, and near infrared. Instead of NDWI, you can plot the Normalized Different Vegetation Index (NDVI) and see what that looks like. You are asked to play around with the NDWI threshold to see how easy or hard it is to separate out the water with this approach. And you can do all of this with 8 different satellite images (you have to do ‘Restart and Run all’ when switching between images).
Enjoy.
To work with geospatial data, i.e. satellite imagery, we need to actually install some python packages that aren’t installed by default in Google Colab. pip, or ‘package installer for python’ is one way to do this. Another common one is (ana)conda. We can run pip directly from a jupyter notebook by preceding the usual linux command by an exclamation mark.
Rasterio and xarray are python packages to work with rasterized geo-referenced data.
!pip install rasterio
!pip install rioxarraySome other packages, not all of which may be strictly necessary for this notebook. The last couple of packages allows us to directly import images from URLs, rather than working with locally stored files.
%matplotlib inline
import os
import numpy as np
import cv2
import rasterio as rio
import xarray as xr
import rioxarray
from osgeo import gdal
# for TOA correction:
import pandas as pd
from lxml import etree
import datetime
import tifffile as tiff
import matplotlib.pyplot as plt
plt.rcParams.update({'figure.figsize': (10, 6)})
# Importing images:
from io import BytesIO
import requests
from requests import get
from PIL import ImageI selected a few Sentinel-2 images over rivers in the Alaskan and Canadian Arctic and hosted those on my own Google Drive. Let’s see if we can import these into this notebook.
For anyone that may want to do something similar in the future. There is a trick to edit the ‘share’ link that google provides into a direct link to the file that we can use to download it from a python code. If you have a file in your Google Drive, click Share and select ‘anyone with link can view’, google will give you a link that looks like this:
https://drive.google.com/file/d/YOUR_ID/view?usp=sharing
to open this link from a python script, like this one, we need to edit this to:
https://drive.google.com/uc?export=download&id=YOUR_ID
For example:
# Note, for some reason that I have not yet been able to debug,
# if you want to select a different image you should do 'Restart and Rerun All'
# from the Runtime menu in the top.
choose_image = 1
if choose_image == 1: # yukon
url = 'https://drive.google.com/uc?export=download&id=1fnk58WulRWKt5HJ-lbfG60y4d-Vo3RSN'
url_mask = 'https://drive.google.com/uc?export=download&id=1dUV-Q0DehUlyG4D9_vxpml9eAixrjKvn'
img_name = 'image1.tif'
mask_name = 'mask1.tif'
elif choose_image == 2: #gage7
url = 'https://drive.google.com/uc?export=download&id=1d3YQCy3PJHZ2ZULaGDY5-eMb5-BZAHVL'
url_mask = 'https://drive.google.com/uc?export=download&id=1cKFPzQtWwW1UKgUb2rhnnj38jFetUayZ'
img_name = 'image2.tif'
mask_name = 'mask2.tif'
elif choose_image == 3: #gage2
url = 'https://drive.google.com/uc?export=download&id=179PPA1ZOblysZqssXhNkDH68CuvjQ3S2'
url_mask = 'https://drive.google.com/uc?export=download&id=1GyN7QE8_EJ2eOwqIkcMyG3V5D8F7RSIN'
img_name = 'image3.tif'
mask_name = 'mask3.tif'
elif choose_image == 4: # tanana
url = 'https://drive.google.com/uc?export=download&id=1ZIN7DZOer2hJCVPEwHOBwokEWnII1wLN'
url_mask = 'https://drive.google.com/uc?export=download&id=1P8zrL4bTqRg9xfA2wwSgx8MaqqDSvPxl'
img_name = 'image4.tif'
mask_name = 'mask4.tif'
elif choose_image == 5: # nenana
url = 'https://drive.google.com/uc?export=download&id=1Xa1ALarTTh6tG4uEQfKdaetE7WkEO1FP'
url_mask = 'https://drive.google.com/uc?export=download&id=1gRIHOEW4aztRfCQ81wYM9zFR5f7O-fyf'
img_name = 'image5.tif'
mask_name = 'mask5.tif'
elif choose_image == 6: # gage1
url = 'https://drive.google.com/uc?export=download&id=1pRNrkEYQyafq25Rwdic6kN4eVE--VHCV'
url_mask = 'https://drive.google.com/uc?export=download&id=1WyQH-jRfEPOcYXKvZoiD3L4nCtIZdf-Z'
img_name = 'image6.tif'
mask_name = 'mask6.tif'
elif choose_image == 7: # gage5
url = 'https://drive.google.com/uc?export=download&id=1jOnk9TXbz_0Rn-pr7TIwNGjtC8CpLXLt'
url_mask = 'https://drive.google.com/uc?export=download&id=1cd4hqoWORX3cBeOlTbBlTrrzhoElDp3a'
img_name = 'image7.tif'
mask_name = 'mask7.tif'
elif choose_image == 8: # gage4
url = 'https://drive.google.com/uc?export=download&id=1Ra3FciJN3D2zMgK1QUNilp5kOayG898I'
url_mask = 'https://drive.google.com/uc?export=download&id=1KT1EoCZsV_nmzVkjtGEfz8p_Mu-zf_Q7'
img_name = 'image8.tif'
mask_name = 'mask8.tif'Next, we create an easy-to-use function to download a file from a URL into the jupyter notebook, and give this file a name as if we had it locally stored. Here is the function:
def download_file(url, file_name):
with open(file_name, "wb") as file:
response = get(url)
file.write(response.content)And here is how easy it now is to download a satellite image from a URL link:
download_file(url, img_name)The file we downloaded is not a regular ‘tif’ image, but a geotiff. It includes the intensities (reflectance) of each pixel for each color, as for a regular image, but allows for any number of colors (multi- or hyper-spectral data) and also includes a bunch of meta-data, such as the coordinates of each pixel, the coordinate reference system (CRS), the physical size of each pixel at the surface (e.g., 10 m by 10 m), and more.
For this reason, we cannot open this file with a regular python library for images, but use rasterio or rioxarray, which will turn the data into a rasterized image/array that we can work with. Rasterio is a python wrapper for gdal, which is a common software to work with geospatial data (Matlab has its own functions to work with GDAL).
Load our image into an x-array:
sat_xarr = rioxarray.open_rasterio(img_name)We can easily check out some of the properties by just call this array:
sat_xarrWhich shows in the top the number of spectral bands (6 in this case) and nr of pixels in the \(y\) and \(x\) directions. The next line shows the total nr of values in this array, which is \(x\ \times y\ \times\) bands \(=13,436,040\) and those are of integer type. In the Coordinates section you see that the coordinates are simply called \(x\) and \(y\) and are of floating type and that the bands are numbered as 1, 2, …, 6 (they could be given different names, such as R, G, B). The final section lists some of the meta-data attributes.
You can extract those separately as:
sat_xarr.attrsOther important geospatial metadata are stored in the “accessor” .rio. To see all of those, you can always use python’s dir:
dir(sat_xarr.rio)One majorly important geospatial attribute is the coordinate reference system, CRS, which is EPSG:32609 in this case:
sat_xarr.rio.crsThis looks a bit messy. Here’s a few ways to extract information more cleanly:
crs = sat_xarr.rio.crs
print(crs.to_epsg()) # → 32609
print(crs.to_authority())# → ('EPSG', '32609')
print(crs.to_string()) # → 'EPSG:32609'
print(crs.to_proj4()) # → '+proj=utm +zone=9 +datum=WGS84 +units=m +no_defs'
# print(crs.to_wkt()) # → the full WKT stringWe can also get the spatial resolution of this image, i.e. the size of each pixels in units of meters:
sat_xarr.rio.resolution()You can also extract \(x\) and \(y\) coordinates, e.g., as
sat_xarr.xwhich themselves are xarrays. Even though these are xarrays instead of numpy arrays, some numpy functionality works directly on these xarrays, for example:
np.shape(sat_xarr)np.shape(sat_xarr.x)print("Minimum, maximum, and mean x-coordinates:", np.min(sat_xarr.x).values, np.max(sat_xarr.x).values, np.mean(sat_xarr.x).values)By now perhaps you’re getting curious about that this image actually looks like!? Below we use matplotlib as usual to make a nice figure. Because satellite images can have any number of spectral bands, rather than just RGB, but computer monitors can actually only show 3 colors at once, we have to specify which colors we want to plot. RGB is what our eyes interpret as ‘true color’, but you can also select any other combination as ‘false color’ which can make certain features stand out more clearly (try it!).
When working with xarrays, we can use .sel to select spectral bands by their names, which are just 1, 2, 3, …, 6, in this case.
The actual six bands and their order in these Sentinel-2 images are:
So, note, that the first three bands are BGR rather than the common RGB. Let’s plot these in the usual RGB order:
plt.figure(figsize=(16, 12));
sat_xarr.sel(band=[3, 2, 1]).plot.imshow(rgb='band', robust=True,add_colorbar=True);
plt.axis("off");Exercise: check out plotting different band combinations and see if the water ‘pops’ out more by selecting different bands.
Most/all things you can do with numpy arrays, you can also do with xarrays, plus much more for georeferenced data. For purposes where we don’t explicitly need the georeferencing data, though, the easiest thing to do is to simply load the actual pixel values for multispectral values into a numpy array, so we can use all the familiar codes for standard vector/matrix linear algebra.
Turning an xarray into a numpy array is quite trivial (we’ll save both here with different variable names):
sat_arr = np.asarray(sat_xarr)
sat_arrNote the very last bit of this output: the array is of type 16-bit integer. Check for yourself what the maximum value is for 16-bits integers.
Check dimensions again:
# check dimensions of sat_arrTypically, to show RGB images or to do linear algebra in neural networks, we want the number of channels (colors/bands) to be the last dimension, and not the first as in this (x)array. We can do this easily as:
sat_arr = np.moveaxis(sat_arr, 0, -1)
[nry, nrx, nrbands] = np.shape(sat_arr)
print("Nr of pixels in y and x",nry, nrx, "and nr of bands/channels/colors:", nrbands )Check the range of values (min/max) for each band:
# print min/max values in each bandTraining and then making predictions for a fully convolutional neural network for image segmentation (classifying each pixel) can certainly be done with a jupyter notebook like this, but does take several hours for a sufficiently complex model. Here, we stick with the ‘old school’ approach of classifying water (or, e.g., vegetation) with pixel-based indices that simply compare the brightness of a pixel in different spectral bands. You can then compare those predictions to those from a deep FCN. The main purpose of this lab is to expose you to a little bit of python code for working with satellite imagery, more than practicing with code related to neural networks.
The Normalized Difference Water Index (NDWI) is defined as: \[ \mathrm{NDWI} = \frac{\mathrm{green} - \mathrm{NIR} }{\mathrm{green} + \mathrm{NIR}} \]
In other words, it checks how green each pixel is (not blue, as we colloquially refer to the color of water), relative to near-infrared (NIR).
Technical detail: In python, and some other programming languages, you have to be a bit carefull of the type of your variables. For example, when you perform devisions of variables that are explicitly defined as integers, you can get some weird results. Look at the two examples below, which essentially perform the same operation of \(\frac{204-205}{204+205}\). In the first case, the values are defined as numpy arrays with just 1 integer values and in the second case they are defined as floats. The second answer is correct, whereas the former is probably not what anyone would want.
(np.array(204,np.uint8) - np.array(205,np.uint8)) / (np.array(204,np.uint8) + np.array(205,np.uint8))To be super explicit, I therefore just turn our Sentinel image data into floats:
sat_arrf = sat_arr.astype('float');Now we can calculate the NDWI index for each pixel. We can do so either by treating the arrays as single numbers, similar to how you would do this in Matlab (i.e. just adding, subtracting, multiplying, and dividing arrays element-by-element):
green = sat_arrf[:,:,1]
nir = sat_arrf[:,:,3]
ndwi = (green - nir)/(green + nir)
print("Min and max of NDWI", np.round(np.min(ndwi),2), np.round(np.max(ndwi),2))Or you can use numpy functions that spell out the operations more explicitly.
ndwi2 = np.divide( np.subtract(green,nir), np.add(green,nir))Can you check that these two approaches give the same results?
np.sum(1*(ndwi != ndwi2))
# check that the above two definitions of NDWI give the same resultMake a plot of the NDWI values for this image:
# plot NDWIFor each pixel, NDWI is a single number and if you think about the definition of the NDWI index \[ \mathrm{NDWI} = \frac{\mathrm{green} - \mathrm{NIR} }{\mathrm{green} + \mathrm{NIR}} \] a bit, you’ll realize that these values are –by definition– between -1 and +1. So this is a grayscale image (but ranging from -1 to 1 instead of the usual 0 to 1).
The idea behind using NDWI and similar indices is that the NDWI values are high for water and smaller for not-water. More specifically, if NDWI is higher than some threshold value, we can classify that pixel as water and if it is less than that threshold, we classify it as not-water, or land. In other words, we need to threshold the continuous (\(-1<\mathrm{NDWI}<+1\)) NDWI index to obtain a binary classification result of water vs. land.
In the next cell, I do this for one particular choice of threshold, define a separate variable for the resulting binary water-land mask, and plot the result.
Exercise: * Define a variablethreshold with some value of your choice. * Create a new variable like ndwi_bin to hold a binary water/land mask.
ndwi_bin = np.copy(ndwi)
A[A>0]=1
# Create binary land/water mask
# plotAnother common approach to do the same thing is to use the modified normalized water index, or MNDWI. There is nothing more to this than to take a short-wave inferared (SWIR) band instead of NIR. In other words, we compare the green band of a pixel to an ‘even-more-red’ band for that same pixel. \[ \mathrm{NDWI} = \frac{\mathrm{green} - \mathrm{SWIR} }{\mathrm{green} + \mathrm{SWIR}} \] Which wavelengths all of these bands refer to is different for each satellite, and as a result, the NDWI/MNDWI values and the corresponding thresholds are also different for every satellite. For Sentinel-2, we actually have 2 SWIR bands that are included in the data provided to your here as the last 2 bands.
Exercise: Copy-paste and edit the cells above to explore MNDWI indices based on the green and SWIR-1 as well as SWIR-2 bands, adjust thresholds, and see which combination gives you the best water classification.
As mentioned above, performing a full FCN classification from within a jupyter notebook within the time of a single lab is perhaps not feasible, so I just provide you with masks that my group produced for these same images using our super-resolution FCN. In other words, we actually classify the 10-m resolution Sentinel-2 image at 2-m resolution.
In the cell below, we download the mask, import into a georeferenced xarray and then immediately convert into a regular numpy array. For storage efficiency, the mask labels were stored as integers 0 (land) and 255 (water), whereas our NDWI derived masks was labeled as 0 (land) and 1 (water). To be consistent, I rescale the FCN labels to 0 and 1 as well.
download_file(url_mask, mask_name)mask_xarr = rioxarray.open_rasterio(mask_name)mask_arr = np.squeeze(np.asarray(mask_xarr,dtype='float')/255)
np.shape(mask_arr)Note the \(5 \times\) larger pixel dimensions.
** Exercise** Plot the FCN and NDWI classifications side-by-side.
# Plot your own land/water mask from (M)NDWI thresholding and the FCN predictions side-by-side.Question: which one looks better?
We will treat the FCN prediction as the ‘ground truth’ to see how well, or poorly, a thresholded NDWI index performs relative to the deep neural network prediction. The specific FCN labels that I provide are from our super-resolution FCN, so they are actually at 5x higher resolution than the native Sentinel-2 image and the derived NDWI classification. We could either downscale the FCN predictions to the 10-m Sentinel resolution or upsample the NDWI classification to 2-m resolution. Let’s do the latter to obtain a high-resolution mask. There are several ways to do the upsampling. Below you can choose between Nearest Neighbor or a more sophisticated cubic interpolation.
# ndwi_fine = cv2.resize(ndwi_bin, dsize=(np.shape(mask_arr)[1],np.shape(mask_arr)[0]), interpolation=cv2.INTER_CUBIC)
ndwi_fine = cv2.resize(ndwi_bin, dsize=(np.shape(mask_arr)[1],np.shape(mask_arr)[0]), interpolation=cv2.INTER_NEAREST)**Exercise* As a first comparison between the FCN and NDWI predictions, take the difference between the two masks and plot that, together with a color bar to show pixels that are -1, 0, 1?
# Plot difference between FCN and NDWI water/land masksWhen evaluating image segmentation, each pixel is treated as an individual prediction (rather than each image).
A segmentation model assigns every pixel to a class — for example, water, forest, or urban — and we compare those predictions to the ground-truth mask.
| Metric | Definition | Intuition |
|---|---|---|
| Precision | \(\displaystyle \text{Precision} = \frac{TP}{TP + FP}\) | Of all pixels the model labeled as a class (e.g., “water”), what fraction were actually that class? High precision → few false positives (the model rarely labels a non-water pixel as water). |
| Recall | \(\displaystyle \text{Recall} = \frac{TP}{TP + FN}\) | Of all true pixels belonging to that class, what fraction did the model correctly identify? High recall → few false negatives (the model rarely misses real water pixels). |
| F1-score | \(\displaystyle F_1 = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}\) | The harmonic mean of precision and recall — a balanced measure that penalizes extreme trade-offs. \(F_1 = 1\) means perfect segmentation; \(F_1 = 0\) means total failure. |
A confusion matrix summarizes how well a classification or segmentation model performs by comparing predicted and true labels.
For a segmentation problem with N classes (e.g., water, forest, urban), the confusion matrix is an N×N table:
| Predicted: Class 1 | Predicted: Class 2 | … | Predicted: Class N | |
|---|---|---|---|---|
| True: Class 1 | TP1 | FP1→2 | … | FP1→N |
| True: Class 2 | FP2→1 | TP2 | … | FP2→N |
| … | … | … | … | … |
| True: Class N | FPN→1 | FPN→2 | … | TPN |
For example: - If many “forest” pixels are misclassified as “pasture,” the cell at (true=forest, predicted=pasture) will have a large value. - A perfect segmentation would produce a diagonal matrix (all predictions match ground truth).
Confusion matrices are often shown as heatmaps, where brighter diagonal cells mean better performance:
plt.imshow(confusion_matrix, cmap="Blues")
plt.xlabel("Predicted class")
plt.ylabel("True class")
plt.title("Confusion Matrix")
plt.colorbar()
## Example: Confusion Matrix for Binary Land–Water Segmentation
For **binary segmentation**, we only have two possible classes:
- **Positive class:** e.g., *Water*
- **Negative class:** e.g., *Land*
The confusion matrix becomes a simple **2×2 table**:
| | **Predicted: Water (Positive)** | **Predicted: Land (Negative)** |
|:-----------------------------|:-------------------------------:|:------------------------------:|
| **True: Water (Positive)** | **True Positive (TP)** — pixels correctly classified as water | **False Negative (FN)** — water pixels incorrectly labeled as land |
| **True: Land (Negative)** | **False Positive (FP)** — land pixels incorrectly labeled as water | **True Negative (TN)** — pixels correctly classified as land |
A **2×2 heatmap** of these values highlights where your segmentation is missing water (false negatives) or mislabeling land (false positives).
Let's code this up (using code stolen from [here](https://stackoverflow.com/questions/50021928/build-confusion-matrix-from-two-vector)). So the first element of the confusion matrix `cm` is the total number of pixels that were both predicted and labeled as land (by far the largest number), the last (bottom-right) element is the nr of pixels both predicted and labeled as water, etc.
::: {#cell-88 .cell}
``` {.python .cell-code}
cm = np.zeros((2, 2), dtype=int)
np.add.at(cm, (ndwi_fine.astype('uint8'),mask_arr.astype('uint8')), 1)
[[TN,FN],[FP,TP]] = cm
cm:::
We can now use the true-positive, true-negative, false-positive, and false-negative elements of the confusion matrix to define the precision, recall, and F1 score of the NDWI classification, relative to the FCN prediction, which we treat here as the ground truth:
precision = TP/(TP+FP)
recall = TP/(TP+FN)
F1 = 2*precision*recall/(precision+recall)
print("Precision is:", int(100*precision),"%")
print("Recall is:", int(100*recall),"%")
print("F1 score is harmonic average of precision and recall:", int(100*F1),"%")The numbers that you see here will depend on what image you are considering and whether you used NDWI or some MNDWI, and finally what threshold you selected. In general, you will see that NDWI can have a high recall, i.e. it catches most pixels that should be labeled as water, but a (often much) lower prediction, because it labels many pixels as water that are not water. Big rivers are easier to classify than small water bodies; and the middle of a big river is easier to classify than the precise shorelines. So NDWI can look pretty good for a huge river like the Yukon, but less and less good when looking at more complex scenes.
This short lab is primarily intended to just give you some very basic/introductory exposure to importing and manipulating multispectral satellite imagery in python / jupyter notebooks. Together with the more rigorous exploration of various machine learning algorithms throughout this semester, and their implementation from scratch in python, I hope this course will have given you a pretty solid starting point for any future research questions you may want to pursue that would benefit from python coding and/or machine learning.