Coverage for stems/io/encoding.py : 98%

#: NumPy string types (str, bytes, unicode)
import logging

import numpy as np
import xarray as xr

logger = logging.getLogger(__name__)


def netcdf_encoding(data, dtype=None, chunks=None, zlib=True, complevel=4,
                    nodata=None, **encoding_kwds):
    """ Return "good" NetCDF encoding information for some data

    The returned encoding is the default or "good" known standard for data
    used in ``stems``. Each default determined in this function is given as
    a keyword argument to allow overriding, and you can also pass additional
    encoding items via ``**encoding_kwds``. You may pass one override for
    all data, or overrides for each data variable (as a dict).

    For more information, see the NetCDF4 documentation for
    ``createVariable`` [1]_.

    Parameters
    ----------
    data : xr.DataArray or xr.Dataset
        Define encoding for this data. If ``xr.Dataset``, map function
        across all ``xr.DataArray`` in ``data.data_vars``
    dtype : np.dtype, optional
        The data type used for the encoded data. Defaults to the input data
        type(s), but can be set to facilitate discretization based
        compression (typically alongside ``scale_factor`` and ``_FillValue``)
    chunks : None, tuple, or dict, optional
        Chunksizes used to encode NetCDF. If given as a `tuple`, chunks
        should be given for each dimension. Chunks for dimensions not
        specified when given as a `dict` will default to 1. Passing
        ``False`` will not use chunks.
    zlib : bool, optional
        Use compression
    complevel : int, optional
        Compression level
    nodata : int, float, or sequence, optional
        NoDataValue(s). Specify one for each ``DataArray`` in ``data`` if a
        :py:class:`xarray.Dataset`. Used for ``_FillValue``
    encoding_kwds : dict
        Additional encoding data to pass

    Returns
    -------
    dict
        Dict mapping band name (e.g., variable name) to relevant encoding
        information

    See Also
    --------
    xarray.Dataset.to_netcdf
        Encoding information designed to be passed to
        :py:meth:`xarray.Dataset.to_netcdf`.

    References
    ----------
    .. [1] http://unidata.github.io/netcdf4-python/#netCDF4.Dataset.createVariable
    """
    if isinstance(data, xr.Dataset):
        return _encoding_dataset(data, dtype=dtype, chunks=chunks, zlib=zlib,
                                 complevel=complevel, nodata=nodata,
                                 **encoding_kwds)
    elif isinstance(data, xr.DataArray):
        return _encoding_dataarray(data, dtype=dtype, chunks=chunks,
                                   zlib=zlib, complevel=complevel,
                                   nodata=nodata, **encoding_kwds)
    else:
        raise TypeError(f'Unknown type for input ``data`` "{type(data)}"')
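The shape of the returned mapping can be illustrated with a hand-built example. The variable name ``'ndvi'`` and every value below are assumptions for illustration only; the inner keys mirror netCDF4 ``createVariable`` keywords plus xarray's ``_FillValue``/``scale_factor`` encoding items:

```python
import numpy as np

# Hypothetical encoding for a single variable named 'ndvi'
encoding = {
    'ndvi': {
        'dtype': np.dtype('int16'),   # discretized storage type
        'scale_factor': 0.0001,       # pairs with the int16 discretization
        '_FillValue': -9999,          # derived from ``nodata``
        'chunksizes': (512, 512),     # one chunk size per dimension
        'zlib': True,                 # enable compression...
        'complevel': 4,               # ...at this level
    },
}

# A mapping of this shape is passed straight through to xarray:
#   ds.to_netcdf('file.nc', encoding=encoding)
```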
def _encoding_dataarray(xarr, dtype=None, chunks=None, zlib=True, complevel=4,
                        nodata=None, **encoding_kwds):
    # dtype: Determine & guard/fixup
    # chunksizes: Determine and guard/fixup
    # _FillValue
    # complevel & zlib: compression
    # Fill in user input
    ...
def _encoding_dataset(data, dtype=None, chunks=None, zlib=True, complevel=4,
                      nodata=None, **encoding_kwds):
    encoding = {}
    for var in data.data_vars:
        # Allow user to specify local (key exists) or global (KeyError) kwds
        kwds = (encoding_kwds[var] if _has_key(encoding_kwds, var)
                else encoding_kwds)
        # Construct, unpacking if needed
        encoding.update(_encoding_dataarray(
            data[var],
            dtype=dtype[var] if _is_dict(dtype) else dtype,
            chunks=chunks[var] if _has_key(chunks, var) else chunks,
            zlib=zlib[var] if _is_dict(zlib) else zlib,
            complevel=complevel[var] if _is_dict(complevel) else complevel,
            nodata=nodata[var] if _is_dict(nodata) else nodata,
            **kwds
        ))
    return encoding
# ----------------------------------------------------------------------------
# Encoding components for DataArray(s)
def encoding_name(xarr):
    """ Return the name of the variable to provide encoding for

    Either returns the name of the DataArray, or the name that XArray will
    assign it when writing to disk
    (:py:data:`xarray.backends.api.DATAARRAY_VARIABLE`).

    Parameters
    ----------
    xarr : xarray.DataArray
        Provide the name of this DataArray used for encoding

    Returns
    -------
    str
        Encoding variable name
    """
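Assuming xarray's default name constant (``xarray.backends.api.DATAARRAY_VARIABLE``, the string ``'__xarray_dataarray_variable__'``), the fallback logic can be sketched with a hypothetical stand-in:

```python
# Value of xarray.backends.api.DATAARRAY_VARIABLE
DATAARRAY_VARIABLE = '__xarray_dataarray_variable__'

def variable_name(name):
    # Hypothetical sketch: unnamed DataArrays are written to disk under
    # xarray's default variable name
    return name if name is not None else DATAARRAY_VARIABLE

print(variable_name('ndvi'))
print(variable_name(None))
```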
"""Get dtype encoding info
Parameters ---------- xarr : xarray.DataArray or np.ndarray DataArray to consider
Returns ------- dict[str, np.dtype] Datatype information for encoding (e.g., ``{'dtype': np.float32}``)
"""
""" Find/resolve chunksize for a DataArray
Parameters ---------- xarr : xarray.DataArray DataArray to consider chunks : tuple[int] or Mapping[str, int] Chunks per dimension
Returns ------- tuple[int] Chunksizes per dimension """ # Grab chunks from DataArray # Default to 1 chunk per dimension if none found for dim in xarr.dims)
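The dict-to-tuple resolution described above (dimensions missing from the mapping default to a chunksize of 1) amounts to a one-line helper; ``chunks_to_chunksizes`` and its dimension names are illustrative only:

```python
def chunks_to_chunksizes(dims, chunks):
    # Dimensions absent from the ``chunks`` mapping default to 1
    return tuple(chunks.get(dim, 1) for dim in dims)

print(chunks_to_chunksizes(('time', 'y', 'x'), {'y': 256, 'x': 256}))
```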
# ----------------------------------------------------------------------------
# Encoding checks, safeguards, and fixes
def guard_chunksizes(xarr, chunksizes):
    """ Guard chunksize to be <= dimension sizes

    Parameters
    ----------
    xarr : xarray.DataArray
        DataArray to consider
    chunksizes : tuple[int]
        Chunks per dimension

    Returns
    -------
    tuple[int]
        Guarded chunksizes
    """
    chunksizes_ = []
    for csize, (dim, size) in zip(chunksizes, xarr.sizes.items()):
        if csize > size:
            logger.debug(f'Chunk size ({csize}) for dim "{dim}" exceeds its '
                         f'size ({size}). Resetting ({csize}->{size})')
            chunksizes_.append(size)
        else:
            chunksizes_.append(csize)
    return tuple(chunksizes_)
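Stripped of logging, the guard reduces to an elementwise minimum between the requested chunk sizes and the dimension sizes; a minimal sketch with a hypothetical helper name:

```python
def clamp_chunksizes(chunksizes, dim_sizes):
    # A chunk may never be larger than the dimension it tiles
    return tuple(min(csize, size)
                 for csize, size in zip(chunksizes, dim_sizes))

print(clamp_chunksizes((512, 512), (400, 1000)))
```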
"""Guard chunk sizes for str datatypes
Chunks for ``str`` need to include string length dimension since python-netcdf represents, for example, 1d char array as 2d array
Parameters ---------- xarr : xarray.DataArray DataArray to consider chunksizes : tuple[int] Chunk sizes per dimension
Returns ------- tuple[int] Guarded chunk sizes """ f'"{xarr.name}" corresponding to the string length') f'"{xarr.name}"')
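The same idea works on a plain NumPy array, which makes the char-array behavior easy to see. ``str_chunksizes`` is a hypothetical stand-in; note that NumPy unicode dtypes use 4 bytes per character, so ``itemsize`` must be divided accordingly:

```python
import numpy as np

def str_chunksizes(arr, chunksizes):
    # python-netcdf stores a 1d str array as a 2d char array, so append a
    # trailing chunk size equal to the string length (sketch only)
    if arr.dtype.kind in ('S', 'U') and len(chunksizes) == arr.ndim:
        strlen = arr.dtype.itemsize // (4 if arr.dtype.kind == 'U' else 1)
        return tuple(chunksizes) + (strlen, )
    return tuple(chunksizes)

arr = np.array(['abc', 'defgh'])   # dtype '<U5'
print(str_chunksizes(arr, (2, )))  # trailing 5 is the string length
```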
"""Guard dtype encoding for datetime datatypes
Parameters ---------- xarr : xarray.DataArray or np.ndarray DataArray to consider dtype_ : dict[str, np.dtype] Datatype information for encoding (e.g., ``{'dtype': np.float32}``)
Returns ------- dict[str, np.dtype] Datatype information for encoding (e.g., ``{'dtype': np.float32}``), if valid. Otherwise returns empty dict """ # Don't encode datatype for datetime types, since xarray changes it
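The guard can be exercised on bare dtypes without building a DataArray. ``drop_datetime_dtype`` is a hypothetical stand-in that also covers ``timedelta64``, which xarray encodes itself as well (an assumption beyond the original comment):

```python
import numpy as np

def drop_datetime_dtype(dtype, dtype_info):
    # xarray manages encoding of datetime64/timedelta64 data itself, so a
    # user-supplied dtype for those kinds is dropped (sketch only)
    if (np.issubdtype(dtype, np.datetime64)
            or np.issubdtype(dtype, np.timedelta64)):
        return {}
    return dtype_info

print(drop_datetime_dtype(np.dtype('datetime64[ns]'), {'dtype': 'i8'}))
print(drop_datetime_dtype(np.dtype('float32'), {'dtype': 'f4'}))
```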