etna.datasets.DataFrameFormat
- class DataFrameFormat(value)
Enum for the different kinds of pd.DataFrame that can be used.
This dataframe stores:
Timestamps;
Segments;
Features. In this context, ‘target’ is also a feature.
Currently, there are two formats:
Wide
Has index to store timestamps.
Columns have two levels with names ‘segment’, ‘feature’. Each column stores values for a given feature in a given segment.
The list of columns isn’t empty.
All (segment, feature) combinations are present in the columns.
Long
Has column ‘timestamp’ to store timestamps.
Has column ‘segment’ to store segments.
Has at least one more column besides ‘timestamp’ and ‘segment’.
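To make the two layouts concrete, here is a minimal pandas-only sketch that builds the same toy data in both formats. The segment names (`a`, `b`), the single `target` feature, and the values are made up for illustration.

```python
import pandas as pd

timestamps = pd.date_range("2024-01-01", periods=3, freq="D")

# Wide format: timestamps live in the index, and the columns have two
# levels named "segment" and "feature".
wide_columns = pd.MultiIndex.from_product(
    [["a", "b"], ["target"]], names=["segment", "feature"]
)
wide_df = pd.DataFrame(
    [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]],
    index=timestamps,
    columns=wide_columns,
)

# Long format: "timestamp" and "segment" are ordinary columns, followed by
# at least one more column (here, "target").
long_df = pd.DataFrame(
    {
        "timestamp": list(timestamps) * 2,
        "segment": ["a"] * 3 + ["b"] * 3,
        "target": [1.0, 2.0, 3.0, 10.0, 20.0, 30.0],
    }
)
```

Both frames hold the same six observations; only the layout differs.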
Currently, we don’t check the types of columns in order to preserve compatibility, but it is expected that:
Timestamps have type int or pd.Timestamp. If they don’t, TSDataset converts them for you.
Segments have type str. If they don’t, TSDataset converts them for you.
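Although TSDataset is documented to convert these types for you, normalizing them explicitly makes the resulting types visible up front. The following is a minimal pandas-only sketch with made-up values; it is not necessarily how TSDataset performs its own conversion.

```python
import pandas as pd

raw = pd.DataFrame(
    {
        "timestamp": ["2024-01-01", "2024-01-02"],  # strings, not pd.Timestamp
        "segment": [1, 1],  # integers, not str
        "target": [5.0, 6.0],
    }
)

# Convert the columns to the expected types: pd.Timestamp and str.
normalized = raw.assign(
    timestamp=pd.to_datetime(raw["timestamp"]),
    segment=raw["segment"].astype(str),
)
```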
Methods
determine(df) – Determine format of the given dataframe.
Attributes
Wide format.
Long format.
- classmethod determine(df: DataFrame) → DataFrameFormat
Determine format of the given dataframe.
- Parameters:
df (DataFrame) – Dataframe whose format should be inferred.
- Returns:
Format of the given dataframe.
- Raises:
ValueError – Given long dataframe doesn’t have required column ‘timestamp’
ValueError – Given long dataframe doesn’t have required column ‘segment’
ValueError – Given long dataframe doesn’t have any columns except for ‘timestamp’ and ‘segment’
ValueError – Given wide dataframe doesn’t have levels of columns [‘segment’, ‘feature’]
ValueError – Given wide dataframe doesn’t have any features
ValueError – Given wide dataframe doesn’t have all combinations of pairs (segment, feature)
- Return type:
DataFrameFormat
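The documented checks can be approximated in plain pandas. The sketch below is a hypothetical re-creation, not etna's actual implementation: the function name `sketch_determine_format`, the string return values, and the rule "multi-level columns mean wide" are all assumptions made for illustration.

```python
import pandas as pd


def sketch_determine_format(df: pd.DataFrame) -> str:
    """Hypothetical re-creation of the documented checks; returns "wide" or "long"."""
    # Assumption: multi-level columns signal the wide format.
    if isinstance(df.columns, pd.MultiIndex):
        if list(df.columns.names) != ["segment", "feature"]:
            raise ValueError(
                "Given wide dataframe doesn't have levels of columns ['segment', 'feature']"
            )
        if len(df.columns) == 0:
            raise ValueError("Given wide dataframe doesn't have any features")
        # Every segment must carry every feature.
        segments = df.columns.get_level_values("segment").unique()
        features = df.columns.get_level_values("feature").unique()
        expected = pd.MultiIndex.from_product([segments, features])
        if set(df.columns) != set(expected):
            raise ValueError(
                "Given wide dataframe doesn't have all combinations of pairs (segment, feature)"
            )
        return "wide"

    # Otherwise treat the frame as long and check its required columns.
    if "timestamp" not in df.columns:
        raise ValueError("Given long dataframe doesn't have required column 'timestamp'")
    if "segment" not in df.columns:
        raise ValueError("Given long dataframe doesn't have required column 'segment'")
    if len(df.columns) <= 2:
        raise ValueError(
            "Given long dataframe doesn't have any columns except for 'timestamp' and 'segment'"
        )
    return "long"


long_df = pd.DataFrame({"timestamp": [0, 1], "segment": ["a", "a"], "target": [1.0, 2.0]})
print(sketch_determine_format(long_df))  # long
```

With etna installed, the documented entry point is `DataFrameFormat.determine(df)`, which returns a `DataFrameFormat` member rather than a string.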