stems.datasets.simulate module

Tools for generating simulated datasets

This module provides tools for generating simulated data that is useful for testing, debugging, and learning. It is inspired by sklearn.datasets.make_classification() and sklearn.datasets.make_regression(), among others.

stems.datasets.simulate.make_dates(date_start=None, date_end=None, date_freq=None)

Return datetime64 dates

Parameters
  • date_start (str, datetime, and more, optional) – Starting date (in a format known to Pandas)

  • date_end (str, datetime, and more, optional) – Ending date (in a format known to Pandas)

  • date_freq (str, optional) – Date frequency

Returns

Dates as np.datetime64

Return type

np.ndarray
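As a rough illustration of the behavior described above, the following sketch builds a datetime64 array with pandas. It does not call stems; the helper name sketch_make_dates and its default values are hypothetical, and using pandas.date_range is an assumption based on the "format known to Pandas" wording.

```python
import numpy as np
import pandas as pd


def sketch_make_dates(date_start="2000-01-01", date_end="2010-01-01", date_freq="16D"):
    """Hypothetical stand-in for make_dates; the defaults here are assumptions."""
    # pandas parses strings, datetime objects, and other date-like inputs
    index = pd.date_range(start=date_start, end=date_end, freq=date_freq)
    # .values exposes the dates as a NumPy datetime64 array
    return index.values


dates = sketch_make_dates("2000-01-01", "2000-03-01", "16D")
print(dates.dtype, len(dates))
```

A frequency string such as "16D" (every 16 days) is only an example; any pandas offset alias should work the same way in this sketch.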

stems.datasets.simulate.make_segments(date_start=None, date_end=None, date_freq=None, n_series=3, n_segments=2, seg_sep=None, means=None, stds=None, trends=None, amplitudes=None, phases=None)

Simulate data from multiple temporal segments

Parameters
  • date_start (str, datetime, and more, optional) – Starting date (in a format known to Pandas)

  • date_end (str, datetime, and more, optional) – Ending date (in a format known to Pandas)

  • date_freq (str, optional) – Date frequency

  • n_series (int) – Number of series/spectral bands to simulate

  • n_segments (int) – Number of segments to simulate

  • seg_sep (Sequence[float]) – Separability of segments (i.e., the size of the disturbance)

  • means (Sequence[float]) – The mean value for each series/spectral band (passed to make_time_series_mean())

  • stds (Sequence[float]) – The standard deviation for each series/spectral band (passed to make_time_series_mean())

  • trends (Sequence[float]) – The time trend for each series/spectral band (passed to make_time_series_trend())

  • amplitudes (Sequence[float]) – The harmonic amplitude value for each series/spectral band (passed to make_time_series_harmonic())

  • phases (Sequence[float]) – The harmonic phase value for each series/spectral band (passed to make_time_series_harmonic())

Returns

  • xr.DataArray – Simulated data for n_segments across n_series series/spectral bands

  • np.ndarray – Array of datetime64 indicating the dates of change (size = n_segments - 1)
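To make the idea of segments and seg_sep concrete, here is a minimal NumPy sketch of two segments separated by a level shift. It is not the stems implementation: it returns a plain array rather than an xr.DataArray, it places the single change date at the midpoint, and it assumes Gaussian noise; all of these choices and the parameter values are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Dates every 16 days (illustrative values only)
dates = np.arange(
    np.datetime64("2000-01-01"),
    np.datetime64("2004-01-01"),
    np.timedelta64(16, "D"),
)

n_series = 3
means = np.array([0.10, 0.30, 0.20])     # per-series mean
stds = np.array([0.01, 0.02, 0.01])      # per-series noise level
seg_sep = np.array([0.05, -0.10, 0.08])  # per-series shift after the change

# Assumption: one change date, placed at the midpoint of the record
change_date = dates[len(dates) // 2]
after_change = dates >= change_date

# Noise around each series mean, shape (n_series, n_dates)
data = rng.normal(means[:, None], stds[:, None], size=(n_series, len(dates)))
# Shift the second segment by seg_sep to simulate the disturbance
data = data + seg_sep[:, None] * after_change[None, :]

# With 2 segments there is exactly one change date (n_segments - 1)
change_dates = np.array([change_date])
```

Larger absolute values of seg_sep relative to stds make the change easier to detect, which is a convenient knob when testing change detection code.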

stems.datasets.simulate.make_time_series_harmonic(dates, amplitude=None, phase=None)

stems.datasets.simulate.make_time_series_mean(dates, mean=None, std=None)

stems.datasets.simulate.make_time_series_noise(dates, mean=0.0, std=1.0)

Generate a time series of noise
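A noise series of this kind could be sketched as one random draw per date. The example below assumes Gaussian noise, which the documentation does not state; the helper name sketch_noise and its seed parameter are hypothetical and are not part of stems.

```python
import numpy as np


def sketch_noise(dates, mean=0.0, std=1.0, seed=None):
    """Hypothetical stand-in: one Gaussian draw per date (an assumption)."""
    rng = np.random.default_rng(seed)
    return rng.normal(loc=mean, scale=std, size=len(dates))


noise_dates = np.arange(
    np.datetime64("2000-01-01"),
    np.datetime64("2001-01-01"),
    np.timedelta64(16, "D"),
)
noise = sketch_noise(noise_dates, mean=0.0, std=1.0, seed=42)
```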

stems.datasets.simulate.make_time_series_trend(dates, trend=None)
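The trend and harmonic functions are not described here, so the following is only a guess at what such components might look like: a linear trend in elapsed years plus an annual cosine. The units (per year), the cosine form, the 365.25-day year, and the helper names are all assumptions, not the documented behavior of stems.

```python
import numpy as np

DAYS_PER_YEAR = 365.25  # assumed length of the annual cycle


def years_since_start(dates):
    """Elapsed time since the first date, in fractional years."""
    days = (dates - dates[0]) / np.timedelta64(1, "D")
    return days / DAYS_PER_YEAR


def sketch_trend(dates, trend=0.0):
    """Hypothetical linear trend: `trend` units per year (an assumption)."""
    return trend * years_since_start(dates)


def sketch_harmonic(dates, amplitude=1.0, phase=0.0):
    """Hypothetical annual harmonic with the given amplitude and phase."""
    return amplitude * np.cos(2 * np.pi * years_since_start(dates) + phase)


ts_dates = np.arange(
    np.datetime64("2000-01-01"),
    np.datetime64("2002-01-01"),
    np.timedelta64(16, "D"),
)
series = sketch_trend(ts_dates, trend=0.02) + sketch_harmonic(ts_dates, amplitude=0.1)
```

Summing components like this, together with a mean and noise term, is one common way to build a synthetic seasonal time series for testing.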