Chris Holden (ceholden@gmail.com) - https://github.com/ceholden

Now that we can read our data into the computer, let's calculate some vegetation indices.

The Normalized Difference Vegetation Index (NDVI) is so ubiquitous that it even has a Wikipedia entry. If you're here for learning how to do remote sensing image processing using GDAL and Python, I suspect you don't need any introduction to this section. If you need a refresher, please visit the Wikipedia URL for NDVI.

This chapter will be very straightfoward. We've already seen how we can read our imagery into a NumPy array -- this chapter will simply extend these tools by showing how to do simple calculations on NumPy objects.

Let's bring up our previous code for opening our image and reading in the data:

In [6]:

```
# Import the Python 3 print function
from __future__ import print_function
# Import the "gdal" and "gdal_array" submodules from within the "osgeo" module
from osgeo import gdal
from osgeo import gdal_array
# Import the NumPy module
import numpy as np
# Open a GDAL dataset
dataset = gdal.Open('../../example/LE70220491999322EDC01_stack.gtif', gdal.GA_ReadOnly)
# Allocate our array using the first band's datatype
image_datatype = dataset.GetRasterBand(1).DataType
image = np.zeros((dataset.RasterYSize, dataset.RasterXSize, dataset.RasterCount),
dtype=gdal_array.GDALTypeCodeToNumericTypeCode(image_datatype))
# Loop over all bands in dataset
for b in range(dataset.RasterCount):
# Remember, GDAL index is on 1, but Python is on 0 -- so we add 1 for our GDAL calls
band = dataset.GetRasterBand(b + 1)
# Read in the band's data into the third dimension of our array
image[:, :, b] = band.ReadAsArray()
print('Red band mean: {r}'.format(r=image[:, :, 2].mean()))
print('NIR band mean: {nir}'.format(nir=image[:, :, 3].mean()))
```

Even from simple mean statistics over the entire image we can see the contrast between the red and the near-infared (NIR) bands.

To calculate NDVI, we can simply use standard arithmetic operators in Python because these operations in NumPy are vectorized. Just like MATLAB, R, and other higher level languages, **NEVER** loop over a NumPy array unless you can avoid it.

In [4]:

```
b_red = 2
b_nir = 3
ndvi = (image[:, :, b_nir] - image[:, :, b_red]) / (image[:, :, b_red] + image[:, :, b_nir])
print(ndvi)
print(ndvi.max())
```

In Python 2 an integer divided by an integer produces an integer, even if the division would have produced a float point number. Python 3 changed this behavior, but if we run the NDVI calculation with Python 2 we would end up with all of our NDVI values equal to 0 because our input image is an integer datatype (int16). See documentation for division in NumPy for more information.

While we don't necessarily need to change anything in Python 3, it is generally useful to be explicit with the datatypes involved in our calculations for the sake of clarity. Additionally, we generally also want code written using Python 3 to work with Python 2.

In order to ensure we perform our calculation using floating point numbers, we can cast the demoninator or numerator of the calculation to a float:

In [13]:

```
ndvi = (image[:, :, b_nir] - image[:, :, b_red]) / \
(image[:, :, b_nir] + image[:, :, b_red]).astype(np.float64)
print('NDVI matrix: ')
print(ndvi)
print('\nMax NDVI: {m}'.format(m=ndvi.max()))
print('Mean NDVI: {m}'.format(m=ndvi.mean()))
print('Median NDVI: {m}'.format(m=np.median(ndvi)))
print('Min NDVI: {m}'.format(m=ndvi.min()))
```