etna.transforms.embeddings.models.TS2VecEmbeddingModel#
- class TS2VecEmbeddingModel(input_dims: int, output_dims: int = 320, hidden_dims: int = 64, depth: int = 10, device: Literal['cpu', 'cuda'] = 'cpu', batch_size: int = 16, num_workers: int = 0, max_train_length: int | None = None, temporal_unit: int = 0, is_freezed: bool = False)[source]#
Bases: BaseEmbeddingModel
TS2Vec embedding model.
If there are NaNs in series, embeddings will not contain NaNs.
Each subsequent call of the fit method continues training the same model. For more details read the paper.
Notes
Model’s weights are transferred to cpu during loading.
Init TS2VecEmbeddingModel.
- Parameters:
input_dims (int) – The input dimension. For a univariate time series, this should be set to 1.
output_dims (int) – The representation dimension.
hidden_dims (int) – The hidden dimension of the encoder.
depth (int) – The number of hidden residual blocks in the encoder.
device (Literal['cpu', 'cuda']) – The device used for training and inference. To swap device, change this attribute.
batch_size (int) – The batch size. To swap batch_size, change this attribute.
num_workers (int) – How many subprocesses to use for data loading. See the api reference for torch.utils.data.DataLoader. To swap num_workers, change this attribute.
max_train_length (int | None) – The maximum allowed sequence length for training. A sequence with a length greater than max_train_length is cropped into several sequences, each of which has a length less than max_train_length.
temporal_unit (int) – The minimum unit to perform temporal contrast. When training on a very long sequence, this param helps to reduce the cost of time and memory.
is_freezed (bool) – Whether to freeze the model in the constructor or not. For more details see the freeze method.
Notes
In case of long series to reduce memory consumption it is recommended to use max_train_length parameter or manually break the series into smaller subseries.
Methods
encode_segment(x[, mask, sliding_length, ...]) – Create embeddings of the whole series.
encode_window(x[, mask, sliding_length, ...]) – Create embeddings of each series timestamp.
fit(x[, lr, n_epochs, n_iters, verbose]) – Fit TS2Vec embedding model.
freeze([is_freezed]) – Enable or disable skipping training in fit.
list_models() – Return a list of available pretrained models.
load([path, model_name]) – Load an object.
save(path) – Save the object.
set_params(**params) – Return new object instance with modified parameters.
to_dict() – Collect all information about etna object in dict.
Attributes
This class stores its __init__ parameters as attributes.
is_freezed – Return whether to skip training during fit.
- encode_segment(x: ndarray, mask: Literal['binomial', 'continuous', 'all_true', 'all_false', 'mask_last'] = 'all_true', sliding_length: int | None = None, sliding_padding: int = 0) ndarray [source]#
Create embeddings of the whole series.
- Parameters:
x (ndarray) – data with shapes (n_segments, n_timestamps, input_dims).
mask (Literal['binomial', 'continuous', 'all_true', 'all_false', 'mask_last']) –
The mask used by the encoder on the test phase can be specified with this parameter. The possible options are:
'binomial' – mask each timestamp with probability 0.5 (the default, used in the paper). It is used on the training phase.
'continuous' – mask random windows of timestamps
'all_true' – mask none of the timestamps
'all_false' – mask all timestamps
'mask_last' – mask the last timestamp
sliding_length (int | None) – the length of the sliding window. When this param is specified, sliding inference would be applied on the time series.
sliding_padding (int) – the contextual data length used for inference on every sliding window.
- Returns:
array with embeddings of shape (n_segments, output_dim)
- Return type:
ndarray
- encode_window(x: ndarray, mask: Literal['binomial', 'continuous', 'all_true', 'all_false', 'mask_last'] = 'all_true', sliding_length: int | None = None, sliding_padding: int = 0, encoding_window: int | None = None) ndarray [source]#
Create embeddings of each series timestamp.
- Parameters:
x (ndarray) – data with shapes (n_segments, n_timestamps, input_dims).
mask (Literal['binomial', 'continuous', 'all_true', 'all_false', 'mask_last']) –
The mask used by the encoder on the test phase can be specified with this parameter. The possible options are:
'binomial' – mask each timestamp with probability 0.5 (the default, used in the paper). It is used on the training phase.
'continuous' – mask random windows of timestamps
'all_true' – mask none of the timestamps
'all_false' – mask all timestamps
'mask_last' – mask the last timestamp
sliding_length (int | None) – the length of the sliding window. When this param is specified, sliding inference would be applied on the time series.
sliding_padding (int) – the contextual data length used for inference on every sliding window.
encoding_window (int | None) – when this param is specified, the computed representation would be the max pooling over this window. This param will be ignored when encoding a full series.
- Returns:
array with embeddings of shape (n_segments, n_timestamps, output_dim)
- Return type:
ndarray
- fit(x: ndarray, lr: float = 0.001, n_epochs: int | None = None, n_iters: int | None = None, verbose: bool | None = None) TS2VecEmbeddingModel [source]#
Fit TS2Vec embedding model.
- Parameters:
x (ndarray) – data with shapes (n_segments, n_timestamps, input_dims).
lr (float) – The learning rate.
n_epochs (int | None) – The number of epochs. When this number is reached, the training stops.
n_iters (int | None) – The number of iterations. When this number is reached, the training stops. If both n_epochs and n_iters are not specified, a default setting is used: n_iters is set to 200 for a dataset with size <= 100000, and to 600 otherwise.
verbose (bool | None) – Whether to print the training loss after each epoch.
- Return type:
TS2VecEmbeddingModel
- freeze(is_freezed: bool = True)[source]#
Enable or disable skipping training in fit.
- Parameters:
is_freezed (bool) – whether to skip training during fit.
- static list_models() List[str] [source]#
Return a list of available pretrained models.
Main information about available models:
ts2vec_tiny:
Number of parameters - 40k
Dimension of output embeddings - 16
- classmethod load(path: Path | None = None, model_name: str | None = None) TS2VecEmbeddingModel [source]#
Load an object.
Model’s weights are transferred to cpu during loading.
- Parameters:
path (Path | None) –
Path to load object from.
If path is not None and model_name is None, load the local model from path.
If path is None and model_name is not None, save the external model_name model to the etna folder in the home directory and load it. If path exists, the external model will not be downloaded.
If path is not None and model_name is not None, save the external model_name model to path and load it. If path exists, the external model will not be downloaded.
model_name (str | None) – Name of the external model to load. To get the list of available models use the list_models method.
- Returns:
Loaded object.
- Raises:
ValueError – If none of the parameters path and model_name are set.
NotImplementedError – If model_name isn't in the list of available model names.
- Return type:
TS2VecEmbeddingModel
- set_params(**params: dict) Self [source]#
Return new object instance with modified parameters.
Method also allows to change parameters of nested objects within the current object. For example, it is possible to change parameters of a model in a Pipeline.
Nested parameters are expected to be in a <component_1>.<...>.<parameter> form, where components are separated by a dot.
- Parameters:
**params (dict) – Estimator parameters
- Returns:
New instance with changed parameters
- Return type:
Self
Examples
>>> from etna.pipeline import Pipeline
>>> from etna.models import NaiveModel
>>> from etna.transforms import AddConstTransform
>>> model = NaiveModel(lag=1)
>>> transforms = [AddConstTransform(in_column="target", value=1)]
>>> pipeline = Pipeline(model, transforms=transforms, horizon=3)
>>> pipeline.set_params(**{"model.lag": 3, "transforms.0.value": 2})
Pipeline(model = NaiveModel(lag = 3, ), transforms = [AddConstTransform(in_column = 'target', value = 2, inplace = True, out_column = None, )], horizon = 3, )