etna.datasets.generate_hierarchical_df

generate_hierarchical_df(periods: int, n_segments: List[int], freq: str | None = 'D', start_time: Timestamp | int | str | None = None, ar_coef: list | None = None, sigma: float = 1, random_seed: int = 1) → DataFrame

Create a DataFrame with a hierarchical structure and data generated by an AR process.

The hierarchical structure is generated as follows:
  1. The number of levels in the structure equals the length of the n_segments parameter.

  2. Each level contains the number of segments given by the corresponding element of n_segments.

  3. Connections from each parent level to its child level are generated randomly.
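
The three rules above can be sketched in plain Python. This is a minimal illustration, not etna's actual implementation: the helper name sketch_hierarchy, the segment naming scheme, and the choice to give every parent at least one child are all assumptions made for the example.

```python
import random
from typing import Dict, List


def sketch_hierarchy(n_segments: List[int], random_seed: int = 1) -> List[Dict[str, List[str]]]:
    """Hypothetical helper: build random parent -> children maps between adjacent levels."""
    rng = random.Random(random_seed)
    # Rules 1 and 2: one level per element of n_segments, with that many segments.
    # The "l{level}s{index}" names are an assumed naming scheme.
    levels = [[f"l{i}s{j}" for j in range(n)] for i, n in enumerate(n_segments)]

    links = []
    for parents, children in zip(levels[:-1], levels[1:]):
        shuffled = children[:]
        rng.shuffle(shuffled)
        # Assumption: hand each parent one child first so no parent is childless.
        # This needs len(children) >= len(parents), i.e. a non-decreasing n_segments.
        mapping = {parent: [shuffled[k]] for k, parent in enumerate(parents)}
        # Rule 3: the remaining children are attached to randomly chosen parents.
        for child in shuffled[len(parents):]:
            mapping[rng.choice(parents)].append(child)
        links.append(mapping)
    return links


links = sketch_hierarchy([2, 4, 7])
for level_id, mapping in enumerate(links):
    print(level_id, mapping)
```

With n_segments=[2, 4, 7] the sketch produces two parent-to-child maps, one for each pair of adjacent levels, and every child segment belongs to exactly one parent.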

Parameters:
  • periods (int) – number of timestamps

  • n_segments (List[int]) – number of segments on each level.

  • freq (str | None) – pandas frequency string passed to pandas.date_range() to generate the timestamps

  • start_time (Timestamp | int | str | None) – start timestamp

  • ar_coef (list | None) – AR coefficients

  • sigma (float) – scale of AR noise

  • random_seed (int) – random seed
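
To show how ar_coef, sigma, and random_seed could interact, here is a standard-library sketch of an AR recursion. It is not the generator etna uses: the helper name sketch_ar_series, the zero initial conditions, and the treatment of ar_coef=None as a single coefficient of 1 (a random walk) are assumptions for illustration.

```python
import random
from typing import List, Optional


def sketch_ar_series(
    periods: int,
    ar_coef: Optional[List[float]] = None,
    sigma: float = 1.0,
    random_seed: int = 1,
) -> List[float]:
    """Hypothetical helper: x_t = sum_k ar_coef[k] * x_{t-k-1} + noise_t."""
    rng = random.Random(random_seed)
    # Assumption: a missing ar_coef is treated as [1.0], i.e. a random walk.
    coefs = [1.0] if ar_coef is None else list(ar_coef)
    # Assumption: the series starts from zero initial conditions.
    history = [0.0] * len(coefs)  # most recent value is stored last

    series = []
    for _ in range(periods):
        noise = rng.gauss(0.0, sigma)  # sigma scales the noise term
        # reversed(history) yields x_{t-1}, x_{t-2}, ... to pair with coefs[0], coefs[1], ...
        value = sum(c * h for c, h in zip(coefs, reversed(history))) + noise
        series.append(value)
        history = history[1:] + [value]
    return series


series = sketch_ar_series(periods=5, ar_coef=[0.5, 0.2], sigma=1.0, random_seed=1)
```

Calling the sketch twice with the same random_seed returns the same series, and a larger sigma produces proportionally larger noise around the autoregressive trend.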

Returns:

DataFrame at the bottom level of the hierarchy

Raises:
  • ValueError – n_segments is empty.

  • ValueError – n_segments contains non-positive integers.

  • ValueError – n_segments is not a non-decreasing sequence.

  • ValueError – a non-integer timestamp parameter is used with an integer-indexed timestamp.
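
The n_segments conditions listed above can be written as a short validation routine. This is a sketch under assumptions: check_n_segments is a hypothetical name and the error messages are illustrative, not the ones raised by etna.

```python
from typing import List


def check_n_segments(n_segments: List[int]) -> None:
    """Hypothetical validator mirroring the documented n_segments conditions."""
    if len(n_segments) == 0:
        raise ValueError("n_segments must not be empty")
    if any(n <= 0 for n in n_segments):
        raise ValueError("n_segments must contain only positive integers")
    # Compare each level with the next one: a child level may not be smaller.
    if any(upper > lower for upper, lower in zip(n_segments, n_segments[1:])):
        raise ValueError("n_segments must be a non-decreasing sequence")


check_n_segments([1, 3, 3])  # valid: positive and non-decreasing
```

Inputs such as [], [0, 2], or [3, 2] each trigger one of the three checks.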

Return type:

DataFrame