etna.transforms.ResampleWithDistributionTransform
- class ResampleWithDistributionTransform(in_column: str, distribution_column: str, inplace: bool = True, out_column: str | None = None)
Bases: IrreversiblePerSegmentWrapper
ResampleWithDistributionTransform resamples the given column using the distribution of another column.
This transform expects in_column to have non-NaN values separated by the same number of timestamps, so that they form a cycle. The cycle starts with a non-NaN value, and each position within the cycle is numbered from 0 to cycle size - 1.
During fit, the fraction that each cycle position contributes to the total sum of values is calculated from distribution_column. During transform, the NaNs within in_column are filled using the learned distribution.
The most common application of this transform is to fill NaNs in in_column that come from data with a different frequency. For example, a dataset has an hourly frequency, but an exogenous variable has only a daily frequency.
Warning
This transform can suffer from look-ahead bias: to transform data at a given timestamp, it uses information from the whole train part.
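The fit and transform steps described above can be sketched in plain Python. This is a simplified illustration, not the library's implementation: the helper names fit_distribution and resample are hypothetical, and the sketch assumes that each observed value is carried forward over its cycle and multiplied by the learned fraction for the cycle position.

```python
import math


def fit_distribution(in_values, dist_values):
    """Learn the share of each cycle position in the total of dist_values."""
    # Positions of observed (non-NaN) values define where cycles start
    # and how long a cycle is.
    observed = [i for i, v in enumerate(in_values) if not math.isnan(v)]
    start = observed[0]
    cycle = observed[1] - observed[0]
    totals = [0.0] * cycle
    for i, d in enumerate(dist_values):
        totals[(i - start) % cycle] += d
    grand_total = sum(totals)
    return start, [t / grand_total for t in totals]


def resample(in_values, start, fractions):
    """Spread each observed value over its cycle using the learned fractions."""
    cycle = len(fractions)
    out = []
    last = math.nan  # no observed value seen yet
    for i, v in enumerate(in_values):
        if not math.isnan(v):
            last = v
        out.append(last * fractions[(i - start) % cycle])
    return out


nan = math.nan
in_column = [8.0, nan, nan, 16.0, nan, nan]        # one value per 3 timestamps
distribution_column = [1.0, 1.0, 2.0, 1.0, 1.0, 2.0]

start, fractions = fit_distribution(in_column, distribution_column)
resampled = resample(in_column, start, fractions)
print(fractions)
print(resampled)
```

In this toy input the third position of each cycle accounts for half of the total in distribution_column, so it receives half of each observed value.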
Init ResampleWithDistributionTransform.
- Parameters:
in_column (str) – name of column to be resampled
distribution_column (str) – name of column to obtain the distribution from
inplace (bool) – if True, apply resampling inplace to in_column; if False, add transformed column to dataset
out_column (str | None) – name of added column. If not given, use self.__repr__()
Methods
fit(ts) – Fit the transform.
fit_transform(ts) – Fit and transform TSDataset.
get_regressors_info() – Return the list with regressors created by the transform.
inverse_transform(ts) – Inverse transform TSDataset.
load(path) – Load an object.
params_to_tune() – Get grid for tuning hyperparameters.
save(path) – Save the object.
set_params(**params) – Return new object instance with modified parameters.
to_dict() – Collect all information about etna object in dict.
transform(ts) – Transform TSDataset inplace.
Attributes
This class stores its __init__ parameters as attributes.
- fit(ts: TSDataset) → ResampleWithDistributionTransform
Fit the transform.
- Parameters:
ts (TSDataset) – dataset to fit the transform on
- Return type:
ResampleWithDistributionTransform
- fit_transform(ts: TSDataset) → TSDataset
Fit and transform TSDataset.
May be reimplemented, but this is not recommended.
- classmethod load(path: Path) → Self
Load an object.
Warning
This method uses the dill module, which is not secure. It is possible to construct malicious data which will execute arbitrary code during loading. Never load data that could have come from an untrusted source, or that could have been tampered with.
- Parameters:
path (Path) – Path to load object from.
- Returns:
Loaded object.
- Return type:
Self
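To make the warning above concrete, here is a harmless sketch of why loading serialized data can run code. It uses the standard-library pickle module, on which dill builds, rather than dill itself, and the class name Payload is hypothetical. The callable here is the built-in len; malicious data could reference any importable callable instead.

```python
import pickle


class Payload:
    """Stand-in for attacker-controlled data (hypothetical name)."""

    def __reduce__(self):
        # Tell pickle to "rebuild" this object by calling len(...).
        # Malicious data could name any importable callable here.
        return (len, ("attacker-chosen call",))


blob = pickle.dumps(Payload())

# Loading runs the callable embedded in the data and returns its result,
# so an int comes back instead of a Payload instance.
result = pickle.loads(blob)
print(type(result).__name__, result)
```

The loader never checks what the embedded callable does, which is why only files from trusted sources should be passed to load.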
- params_to_tune() → Dict[str, BaseDistribution]
Get grid for tuning hyperparameters.
This is the default implementation with an empty grid.
- Returns:
Empty grid.
- Return type:
Dict[str, BaseDistribution]
- set_params(**params: dict) → Self
Return new object instance with modified parameters.
The method also allows changing parameters of nested objects within the current object. For example, it is possible to change parameters of a model in a Pipeline.
Nested parameters are expected to be in a <component_1>.<...>.<parameter> form, where components are separated by a dot.
- Parameters:
**params (dict) – Estimator parameters
- Returns:
New instance with changed parameters
- Return type:
Self
Examples
>>> from etna.pipeline import Pipeline
>>> from etna.models import NaiveModel
>>> from etna.transforms import AddConstTransform
>>> model = NaiveModel(lag=1)
>>> transforms = [AddConstTransform(in_column="target", value=1)]
>>> pipeline = Pipeline(model, transforms=transforms, horizon=3)
>>> pipeline.set_params(**{"model.lag": 3, "transforms.0.value": 2})
Pipeline(model = NaiveModel(lag = 3, ), transforms = [AddConstTransform(in_column = 'target', value = 2, inplace = True, out_column = None, )], horizon = 3, )
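The dotted form of nested parameters can be illustrated on plain dictionaries and lists. This is only a sketch of the naming convention, not how etna resolves parameters internally; the helper apply_nested and the config layout are hypothetical.

```python
from copy import deepcopy


def apply_nested(config, updates):
    """Return a copy of config with dotted-key updates applied."""
    result = deepcopy(config)  # leave the original untouched
    for dotted_key, value in updates.items():
        *path, last = dotted_key.split(".")
        node = result
        # Walk down to the container that holds the final parameter.
        for part in path:
            node = node[int(part)] if isinstance(node, list) else node[part]
        if isinstance(node, list):
            node[int(last)] = value
        else:
            node[last] = value
    return result


config = {"model": {"lag": 1}, "transforms": [{"value": 1}], "horizon": 3}
updated = apply_nested(config, {"model.lag": 3, "transforms.0.value": 2})
print(updated)
```

As with set_params, the sketch returns a new object and leaves the original configuration unchanged.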