Chris Holden (ceholden@gmail.com) - https://github.com/ceholden
One of the most fundamental components of the GDAL raster library is a "class)", or an object, that stores all of the information one might want about a raster image. This class, the GDALDataset
, combines the information about a raster file with a few actions one might want to perform with a raster image, like reading from an image. The information stored by a class is generally referred to as "properties)" or attributes and the actions a class can perform are called "methods)".
If you're coming from another language and want an overview of object oriented programming in Python, see the Python tutorial on classes or the LearnPython class tutorial.
For those wishing for a verbose, full reference to the the GDAL "Application Programming Interface" (API), you can find the C and Python API here:
Some class methods include:
GetDriver
GetRasterBand
GetGeoTransform
GetProjection
GetSubDatasets
These class methods are so called "getter" methods that allow you to access class attributes (remember: class attributes are just variables that belong to the class). When you call the class method, GetDriver
, the GDAL dataset will return the image format driver (e.g., ENVI driver, GeoTIFF driver, HDF driver) responsibile for handling the input and output operations for this raster file format. Similarly, the GetGeoTransform
method will the transformation that can be used to translate between pixel coordinates and projection coordinates.
Note: the "getter" and "setter" class methods for accessing and settting class properties is not "Pythonic" - these methods exist because the API was originally written for C++ where such methods are normal.
Another suite of class methods allow you to set class attributes. These include:
SetGeoTransform
SetProjection
which allow you to modify the geographic projection and location of the image.
Now that we've seen some of how the GDALDataset object encapsulates many of the ideas relevant to the concept of a raster image, let's see how we can implement these ideas in Python.
Before we can get started, we need to tell Python that we will be using functions, classes, and variables from the GDAL Python package. The technical wording for this is that we need to import the GDAL module into our namespace (see Python's documentation on the module
system here).
We will do this using some import
statements:
# Import the Python 3 print function
from __future__ import print_function
# Import the "gdal" submodule from within the "osgeo" module
from osgeo import gdal
# We can check which version we're running by printing the "__version__" variable
print("GDAL's version is: " + gdal.__version__)
print(gdal)
Once we import the gdal
submodule, Python will know where to look on our system for the code that implements the GDAL API. When we want to access classes, variables, or functions within the gdal
submodule, we will need to reference the full path, which includes the gdal
reference:
# Let's print out the value of the GDAL Byte data type (GDT_Byte)
# the number printed out is how GDAL keeps track of the various data types
# this variable, which has a fixed numeric value, is what's called an enumerated type, or enum
# Works
print(gdal.GDT_Byte)
# Doesn't work
print(GDT_Byte)
The datatype GDT_Byte
doesn't exist in the global namespace. We need to tell Python where to look for it to find it.
With some basic explanation of Python's namespace setup and how it applies to GDAL out of the way, let's get into some examples:
When we open an image in GDAL we create a GDALDataset object. As the name would suggest, we can open an image with the "Open" function within gdal
.
We will use an example image provided in this repository for this chapter. This image is a subset of a Landsat 7 image containing the 7 bands on this sensor rearranged in order of wavelength (e.g., Landsat 7's second SWIR channel comes before thermal channel in our stack). The last band in this image is a cloud and cloud shadow mask from Fmask.
# Open a GDAL dataset
dataset = gdal.Open('../../example/LE70220491999322EDC01_stack.gtif', gdal.GA_ReadOnly)
print(dataset)
Now that we have this dataset open, let's explore some of its capabilities.
# How many bands does this image have?
num_bands = dataset.RasterCount
print('Number of bands in image: {n}\n'.format(n=num_bands))
# How many rows and columns?
rows = dataset.RasterYSize
cols = dataset.RasterXSize
print('Image size is: {r} rows x {c} columns\n'.format(r=rows, c=cols))
# Does the raster have a description or metadata?
desc = dataset.GetDescription()
metadata = dataset.GetMetadata()
print('Raster description: {desc}'.format(desc=desc))
print('Raster metadata:')
print(metadata)
print('\n')
# What driver was used to open the raster?
driver = dataset.GetDriver()
print('Raster driver: {d}\n'.format(d=driver.ShortName))
# What is the raster's projection?
proj = dataset.GetProjection()
print('Image projection:')
print(proj + '\n')
# What is the raster's "geo-transform"
gt = dataset.GetGeoTransform()
print('Image geo-transform: {gt}\n'.format(gt=gt))
The first few pieces of information we obtained are fairly straightforward - the raster size, the number of bands, a description, some metadata, and the raster's file format.
The image's projection is formatted in what's known as "Well Known Text". For more information on specific projections and for format conversions among projection description formats (e.g., proj4 string, WKT, ESRI WKT, JSON, etc.) see Spatial Reference.
The last piece of information we accessed is something called a "geotransform". This set of 6 numbers provides all the information required to and from transform pixel and projected coordinates. In this example, the first number (462405) and the fourth number (1741815) are the top left of the upper left pixel of the raster. The pixel size in x and y dimensions of the raster is listed as the second (30) and the sixth (-30) numbers. Since our raster is north up oriented, the third and fifth numbers are 0. For more information on the GDAL data model, visit this web page.
The GDALDataset object we created contains a lot of useful information but it is not directly used to read in the raster image. Instead we will need to access each of the raster's bands individually using the method GetRasterBand
:
# Open the blue band in our image
blue = dataset.GetRasterBand(1)
print(blue)
Following our guide of the GDALDataset, let's explore some of the attributes and methods of the GDALRasterBand:
# What is the band's datatype?
datatype = blue.DataType
print('Band datatype: {dt}'.format(dt=blue.DataType))
# If you recall from our discussion of enumerated types, this "3" we printed has a more useful definition for us to use
datatype_name = gdal.GetDataTypeName(blue.DataType)
print('Band datatype: {dt}'.format(dt=datatype_name))
# We can also ask how much space does this datatype take up
bytes = gdal.GetDataTypeSize(blue.DataType)
print('Band datatype size: {b} bytes\n'.format(b=bytes))
# How about some band statistics?
band_max, band_min, band_mean, band_stddev = blue.GetStatistics(0, 1)
print('Band range: {minimum} - {maximum}'.format(maximum=band_max,
minimum=band_min))
print('Band mean, stddev: {m}, {s}\n'.format(m=band_mean, s=band_stddev))
Note that we didn't need to read the image into Python's memory to calculate these statistics - GDAL did all of this for us.
For most applications, however, we will need to use GDAL to read our raster bands into memory. When we load our raster band into memory we will read it into a NumPy 2 dimensional array. NumPy is, "the fundamental package for scientific computing with Python", because it allows us to represent our data in a very memory efficient way.
NumPy arrays are the cornerstone or building block of the rest of the Scientific Python suite of software. Get familiar with them:
Just as we made the routines and data types from GDAL available to us using import
, we'll load up NumPy. When we import NumPy, we'll also give it an alias so that we don't have to type numpy
every time we want to use it.
# No alias
import numpy
print(numpy.__version__)
# Alias or rename to "np" -- a very common practice
import numpy as np
print(np.__version__)
In order to read our band into one of these wonderful np.array
objects, we will use the ReadAsArray
method from our GDALRasterBand
object:
help(blue.ReadAsArray)
The method ReadAsArray
takes arguments that allow us to specify a subset of the raster band image using X and Y offsets and sizes. Remember this ability when you want to process large images or are working with a limited amount of memory. In these circumstances, you will run out of memory if you read the entire dataset in at once. Instead, read in a block of some number of columns and rows at one time, perform your computation and store your output, and then chunk through the rest of the image.
For now, we'll just read in the entire image:
blue_data = blue.ReadAsArray()
print(blue_data)
print()
print('Blue band mean is: {m}'.format(m=blue_data.mean()))
print('Size is: {sz}'.format(sz=blue_data.shape))
With our data read into a NumPy array, we can print it to console and even perform statistics on it. In addition to helping us store massive amounts of data efficiently, NumPy will help us with some basic linear algebra, numerical operations, and summary statistics.
Let's say we want to read all of our bands into one 3 dimensional (nrow x ncol x nband) dataset. We will begin by initializing a 3 dimensional array. Next, we will loop over all bands in our raster image dataset and read them into our newly allocated 3 dimensional array:
# Initialize a 3d array -- use the size properties of our image for portability!
image = np.zeros((dataset.RasterYSize, dataset.RasterXSize, dataset.RasterCount))
# Loop over all bands in dataset
for b in range(dataset.RasterCount):
# Remember, GDAL index is on 1, but Python is on 0 -- so we add 1 for our GDAL calls
band = dataset.GetRasterBand(b + 1)
# Read in the band's data into the third dimension of our array
image[:, :, b] = band.ReadAsArray()
print(image)
print(image.dtype)
One minor tweak we can make is to ensure that we read our GDAL image into a NumPy array of a matching datatype. GDAL has a function which can make this GDAL
<-> NumPy
translation for us:
dataset.GetRasterBand(1).DataType
from osgeo import gdal_array
# DataType is a property of the individual raster bands
image_datatype = dataset.GetRasterBand(1).DataType
# Allocate our array, but in a more efficient way
image_correct = np.zeros((dataset.RasterYSize, dataset.RasterXSize, dataset.RasterCount),
dtype=gdal_array.GDALTypeCodeToNumericTypeCode(image_datatype))
# Loop over all bands in dataset
for b in range(dataset.RasterCount):
# Remember, GDAL index is on 1, but Python is on 0 -- so we add 1 for our GDAL calls
band = dataset.GetRasterBand(b + 1)
# Read in the band's data into the third dimension of our array
image_correct[:, :, b] = band.ReadAsArray()
print("Compare datatypes: ")
print(" when unspecified: {dt}".format(dt=image.dtype))
print(" when specified: {dt}".format(dt=image_correct.dtype))
Much more efficiently done this way -- we're saving 4x the memory!
As you proceed into the next chapter, the last key concept to understand is how to de-allocate memory within Python.
In order to close out your GDAL datasets and to signal that your NumPy arrays can be de-allocated, you can set them to None
:
dataset = None
image = None
image_correct = None