XArray Integration

The stems project tries to help better integrate XArray and NetCDF4 data with GDAL/rasterio, especially with respect to handling geospatial attributes and data. While you may import various functions from this module to use within your own code, there is also a convenient way of accessing many of these integrations that uses xarray “accessors”.

For more information on XArray accessors, please refer to their documentation on extending xarray.

Initialize The Accessor

In order to use the .stems xarray accessor, you need to import the module from stems:

In [1]: import stems.xarray_accessor

Examples

Attributes

In the following example, we use some functions from our test suite that generate example data:

In [2]: from stems.tests.build_data import create_test_dataset

In [3]: test_ds = create_test_dataset()

In [4]: test_ds
Out[4]: 
<xarray.Dataset>
Dimensions:  (time: 100, x: 11, y: 7)
Coordinates:
  * y        (y) float64 185.0 155.0 125.0 95.0 65.0 35.0 5.0
  * x        (x) float64 115.0 145.0 175.0 205.0 ... 325.0 355.0 385.0 415.0
  * time     (time) datetime64[ns] 2000-01-01 ... 2010-01-01
    crs      int32 32619
Data variables:
    blu      (y, x, time) int16 dask.array<shape=(7, 11, 100), chunksize=(3, 5, 25)>
    grn      (y, x, time) int16 dask.array<shape=(7, 11, 100), chunksize=(3, 5, 25)>
    red      (y, x, time) int16 dask.array<shape=(7, 11, 100), chunksize=(3, 5, 25)>
    nir      (y, x, time) int16 dask.array<shape=(7, 11, 100), chunksize=(3, 5, 25)>

We can inspect the “crs” coordinate, which are Climate Forecast (CF) formatted Grid Mapping variables. These conventions govern how various projection and ellipsoid parameters should be represented. The conventions are different than what the Open Geospatial Consortium (OSGEO/GDAL/etc) use, but can be converted.

In [5]: test_ds['crs']
Out[5]: 
<xarray.DataArray 'crs' ()>
array(32619, dtype=int32)
Coordinates:
    crs      int32 32619
Attributes:
    grid_mapping_name:                 transverse_mercator
    projected_coordinate_system_name:  WGS 84 / UTM zone 19N
    horizontal_datum_name:             WGS_1984
    reference_ellipsoid_name:          WGS 84
    prime_meridian_name:               Greenwich
    latitude_of_projection_origin:     0.0
    longitude_of_central_meridian:     -69.0
    scale_factor_at_central_meridian:  0.9996
    false_easting:                     500000.0
    false_northing:                    0.0
    semi_major_axis:                   6378137.0
    semi_minor_axis:                   6356752.314245179
    inverse_flattening:                298.257223563
    spatial_ref:                       PROJCS["WGS 84 / UTM zone 19N",GEOGCS[...
    GeoTransform:                      [100.  30.   0. 200.   0. -30.]

The stems XArray accessor makes accessing geospatial metadata much easier:

In [6]: print('CRS:\n', test_ds.stems.crs)
CRS:
 EPSG:32619

In [7]: print('Affine transform:\n', test_ds.stems.transform)
Affine transform:
 | 30.00, 0.00, 100.00|
| 0.00,-30.00, 200.00|
| 0.00, 0.00, 1.00|

In [8]: print('Bounds:\n', test_ds.stems.bounds)
Bounds:
 BoundingBox(left=100.0, bottom=-10.0, right=430.0, top=200.0)

In [9]: print('Bounding box:\n', test_ds.stems.bbox)
Bounding box:
 POLYGON ((430 -10, 430 200, 100 200, 100 -10, 430 -10))

Georeferencing

We can also assign georeferencing information for new data that we might create:

# Calculate (and drop the CRS to show example)
In [10]: ratio = (test_ds['nir'] / test_ds['red']).drop('crs')

In [11]: ratio.stems.is_georeferenced()
Out[11]: False

In [12]: ratio
Out[12]: 
<xarray.DataArray (y: 7, x: 11, time: 100)>
dask.array<shape=(7, 11, 100), dtype=float64, chunksize=(3, 5, 25)>
Coordinates:
  * y        (y) float64 185.0 155.0 125.0 95.0 65.0 35.0 5.0
  * x        (x) float64 115.0 145.0 175.0 205.0 ... 325.0 355.0 385.0 415.0
  * time     (time) datetime64[ns] 2000-01-01 ... 2010-01-01

Now we can use the stems accessor to georeference the data:

In [13]: ratio_ = ratio.stems.georeference(test_ds.stems.crs,
   ....:                                   test_ds.stems.transform)
   ....: 

In [14]: ratio_.stems.is_georeferenced()
Out[14]: True

In [15]: ratio_
Out[15]: 
<xarray.DataArray (y: 7, x: 11, time: 100)>
dask.array<shape=(7, 11, 100), dtype=float64, chunksize=(3, 5, 25)>
Coordinates:
  * y        (y) float64 185.0 155.0 125.0 95.0 65.0 35.0 5.0
  * x        (x) float64 115.0 145.0 175.0 205.0 ... 325.0 355.0 385.0 415.0
  * time     (time) datetime64[ns] 2000-01-01 ... 2010-01-01
    crs      int32 32619
Attributes:
    grid_mapping:  crs
    Conventions:   CF-1.7

For comparison, you could also directly use components from stems to do the same thing:

In [16]: from stems.gis import convert, conventions, coords

# Get the CRS/transform from source data "test_ds"
In [17]: grid_mapping = conventions.get_grid_mapping(test_ds)

In [18]: crs_ = convert.to_crs(grid_mapping.attrs['spatial_ref'])

In [19]: transform_ = coords.coords_to_transform(ratio['y'], ratio['x'])

# Apply
In [20]: ratio_ = conventions.georeference(ratio, crs_, transform_)

In [21]: ratio_.stems.is_georeferenced()
Out[21]: True

In [22]: ratio_
Out[22]: 
<xarray.DataArray (y: 7, x: 11, time: 100)>
dask.array<shape=(7, 11, 100), dtype=float64, chunksize=(3, 5, 25)>
Coordinates:
  * y        (y) float64 185.0 155.0 125.0 95.0 65.0 35.0 5.0
  * x        (x) float64 115.0 145.0 175.0 205.0 ... 325.0 355.0 385.0 415.0
  * time     (time) datetime64[ns] 2000-01-01 ... 2010-01-01
    crs      int32 32619
Attributes:
    grid_mapping:  crs
    Conventions:   CF-1.7

As you can see, it’s a lot faster and more straightforward to use the accessors!

API