View the Jupyter notebook on GitHub.

Deep learning examples#


This notebook contains examples of neural network models.

Table of contents

  • Loading dataset

  • Architecture

  • Testing models

    • Baseline

    • DeepAR

    • DeepARNative

    • TFT

    • TFTNative

    • RNN

    • MLP

    • Deep State Model

    • N-BEATS Model

    • PatchTS Model

[1]:
!pip install "etna[torch]" -q
[2]:
import warnings

warnings.filterwarnings("ignore")
[3]:
import random

import numpy as np
import pandas as pd
import torch

from etna.analysis import plot_backtest
from etna.datasets.tsdataset import TSDataset
from etna.metrics import MAE
from etna.metrics import MAPE
from etna.metrics import SMAPE
from etna.models import SeasonalMovingAverageModel
from etna.pipeline import Pipeline
from etna.transforms import DateFlagsTransform
from etna.transforms import LabelEncoderTransform
from etna.transforms import LagTransform
from etna.transforms import LinearTrendTransform
from etna.transforms import SegmentEncoderTransform
from etna.transforms import StandardScalerTransform
[4]:
def set_seed(seed: int = 42):
    """Set random seed for reproducibility."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)

1. Loading dataset#

We are going to use a toy dataset. Let’s load it and take a look.

[5]:
df = pd.read_csv("data/example_dataset.csv")
df.head()
[5]:
timestamp segment target
0 2019-01-01 segment_a 170
1 2019-01-02 segment_a 243
2 2019-01-03 segment_a 267
3 2019-01-04 segment_a 287
4 2019-01-05 segment_a 279

Our library works with its own data structure, TSDataset. Let’s create it as shown in the “Get started” notebook.

[6]:
ts = TSDataset(df, freq="D")
ts.head(5)
[6]:
segment segment_a segment_b segment_c segment_d
feature target target target target
timestamp
2019-01-01 170 102 92 238
2019-01-02 243 123 107 358
2019-01-03 267 130 103 366
2019-01-04 287 138 103 385
2019-01-05 279 137 104 384

2. Architecture#

Our library has two types of models: wrappers around pytorch-forecasting models and native implementations.

First, let’s describe the pytorch-forecasting models, because they require special handling. There are two ways to use these models: the default one, and via the PytorchForecastingDatasetBuilder class when extra features are needed.

Let’s look at it closer.

[7]:
from etna.models.nn.utils import PytorchForecastingDatasetBuilder
[8]:
?PytorchForecastingDatasetBuilder
Init signature:
PytorchForecastingDatasetBuilder(
    max_encoder_length: int = 30,
    min_encoder_length: Optional[int] = None,
    min_prediction_idx: Optional[int] = None,
    min_prediction_length: Optional[int] = None,
    max_prediction_length: int = 1,
    static_categoricals: Optional[List[str]] = None,
    static_reals: Optional[List[str]] = None,
    time_varying_known_categoricals: Optional[List[str]] = None,
    time_varying_known_reals: Optional[List[str]] = None,
    time_varying_unknown_categoricals: Optional[List[str]] = None,
    time_varying_unknown_reals: Optional[List[str]] = None,
    variable_groups: Optional[Dict[str, List[int]]] = None,
    constant_fill_strategy: Optional[Dict[str, Union[str, float, int, bool]]] = None,
    allow_missing_timesteps: bool = True,
    lags: Optional[Dict[str, List[int]]] = None,
    add_relative_time_idx: bool = True,
    add_target_scales: bool = True,
    add_encoder_length: Union[bool, str] = True,
    target_normalizer: Union[pytorch_forecasting.data.encoders.TorchNormalizer, pytorch_forecasting.data.encoders.NaNLabelEncoder, pytorch_forecasting.data.encoders.EncoderNormalizer, str, List[Union[pytorch_forecasting.data.encoders.TorchNormalizer, pytorch_forecasting.data.encoders.NaNLabelEncoder, pytorch_forecasting.data.encoders.EncoderNormalizer]], Tuple[Union[pytorch_forecasting.data.encoders.TorchNormalizer, pytorch_forecasting.data.encoders.NaNLabelEncoder, pytorch_forecasting.data.encoders.EncoderNormalizer]]] = 'auto',
    categorical_encoders: Optional[Dict[str, pytorch_forecasting.data.encoders.NaNLabelEncoder]] = None,
    scalers: Optional[Dict[str, Union[sklearn.preprocessing._data.StandardScaler, sklearn.preprocessing._data.RobustScaler, pytorch_forecasting.data.encoders.TorchNormalizer, pytorch_forecasting.data.encoders.EncoderNormalizer]]] = None,
)
Docstring:
Builder for PytorchForecasting dataset.

Note
----
This class requires ``torch`` extension to be installed.
Read more about this at :ref:`installation page <installation>`.
Init docstring:
Init dataset builder.

Parameters here is used for initialization of :py:class:`pytorch_forecasting.data.timeseries.TimeSeriesDataSet` object.
File:           /opt/conda/lib/python3.9/site-packages/etna/models/nn/utils.py
Type:           type
Subclasses:

The signature looks pretty scary, but don’t panic: we will look only at the most important parameters.

  • time_varying_known_reals — real values that are known in advance and change over time (real regressors); currently it is necessary to add the “time_idx” variable to this list;

  • time_varying_unknown_reals — our real-valued target; set it to ["target"];

  • max_prediction_length — our horizon for forecasting;

  • max_encoder_length — length of past context to use;

  • static_categoricals — static categorical values; for example, when working with multiple segments these can be per-segment characteristics, including the identifier “segment”;

  • time_varying_known_categoricals — categorical values that are known in advance and change over time (categorical regressors);

  • target_normalizer — class for normalizing the target across different segments.
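
For example, a minimal builder using just these parameters might look like the following sketch (the specific values here are illustrative assumptions, not recommendations):

from pytorch_forecasting.data import GroupNormalizer

from etna.models.nn.utils import PytorchForecastingDatasetBuilder

dataset_builder = PytorchForecastingDatasetBuilder(
    max_encoder_length=30,  # length of past context
    max_prediction_length=7,  # forecasting horizon
    time_varying_known_reals=["time_idx"],  # "time_idx" always goes into this list
    time_varying_unknown_reals=["target"],  # the target itself
    static_categoricals=["segment"],  # segment identifier as a static feature
    target_normalizer=GroupNormalizer(groups=["segment"]),  # per-segment normalization
)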

Our library currently supports these pytorch-forecasting models:

  • DeepAR (will be removed in version 3.0),

  • TFT (will be removed in version 3.0).

As for the native neural network models, they are simpler to use because they don’t require a PytorchForecastingDatasetBuilder. We will see how to use them in the examples below.
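
For instance, a native model plugs into a Pipeline directly. A minimal sketch (parameter values here are assumptions; full examples follow below):

from etna.models.nn import RNNModel
from etna.pipeline import Pipeline

model = RNNModel(
    input_size=1,  # only the target series, no extra features
    encoder_length=14,  # length of past context
    decoder_length=7,  # forecasting horizon
    trainer_params=dict(max_epochs=5),
)
pipeline = Pipeline(model=model, horizon=7)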

3. Testing models#

In this section we will test our models on the example dataset.

[9]:
HORIZON = 7
metrics = [SMAPE(), MAPE(), MAE()]

3.1 Baseline#

For comparison, let’s train a simple model as a baseline.
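
SeasonalMovingAverageModel forecasts each point as the mean of the last window values taken at the same phase of the season, i.e. with seasonality s and window w it predicts ŷ_t = (y_{t−s} + y_{t−2s} + … + y_{t−ws}) / w, while LinearTrendTransform removes a per-segment linear trend before fitting and restores it on the forecast.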

[10]:
model_sma = SeasonalMovingAverageModel(window=5, seasonality=7)
linear_trend_transform = LinearTrendTransform(in_column="target")

pipeline_sma = Pipeline(model=model_sma, horizon=HORIZON, transforms=[linear_trend_transform])
[11]:
metrics_sma, forecast_sma, fold_info_sma = pipeline_sma.backtest(ts, metrics=metrics, n_folds=3, n_jobs=1)
[Parallel(n_jobs=1)]: Done   1 tasks      | elapsed:    0.1s
[Parallel(n_jobs=1)]: Done   2 tasks      | elapsed:    0.2s
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:    0.3s
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:    0.3s
[Parallel(n_jobs=1)]: Done   1 tasks      | elapsed:    0.1s
[Parallel(n_jobs=1)]: Done   2 tasks      | elapsed:    0.2s
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:    0.4s
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:    0.4s
[Parallel(n_jobs=1)]: Done   1 tasks      | elapsed:    0.1s
[Parallel(n_jobs=1)]: Done   2 tasks      | elapsed:    0.1s
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:    0.2s
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:    0.2s
[12]:
metrics_sma
[12]:
segment SMAPE MAPE MAE fold_number
0 segment_a 6.343943 6.124296 33.196532 0
0 segment_a 5.346946 5.192455 27.938101 1
0 segment_a 7.510347 7.189999 40.028565 2
1 segment_b 7.178822 6.920176 17.818102 0
1 segment_b 5.672504 5.554555 13.719200 1
1 segment_b 3.327846 3.359712 7.680919 2
2 segment_c 6.430429 6.200580 10.877718 0
2 segment_c 5.947090 5.727531 10.701336 1
2 segment_c 6.186545 5.943679 11.359563 2
3 segment_d 4.707899 4.644170 39.918646 0
3 segment_d 5.403426 5.600978 43.047332 1
3 segment_d 2.505279 2.543719 19.347565 2
[13]:
score = metrics_sma["SMAPE"].mean()
print(f"Average SMAPE for Seasonal MA: {score:.3f}")
Average SMAPE for Seasonal MA: 5.547
[14]:
plot_backtest(forecast_sma, ts, history_len=20)
../_images/tutorials_202-NN_examples_28_0.png

3.2 DeepAR#

[15]:
from etna.models.nn import DeepARModel

Before training, let’s fix the seeds for reproducibility.

[16]:
set_seed()

Default way#

[17]:
model_deepar = DeepARModel(
    encoder_length=HORIZON,
    decoder_length=HORIZON,
    trainer_params=dict(max_epochs=20, gradient_clip_val=0.1),
    lr=0.01,
    train_batch_size=16,
)
metrics = [SMAPE(), MAPE(), MAE()]

pipeline_deepar = Pipeline(model=model_deepar, horizon=HORIZON)
[18]:
metrics_deepar, forecast_deepar, fold_info_deepar = pipeline_deepar.backtest(ts, metrics=metrics, n_folds=3, n_jobs=1)
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs

  | Name                   | Type                   | Params
------------------------------------------------------------------
0 | loss                   | NormalDistributionLoss | 0
1 | logging_metrics        | ModuleList             | 0
2 | embeddings             | MultiEmbedding         | 0
3 | rnn                    | LSTM                   | 1.6 K
4 | distribution_projector | Linear                 | 22
------------------------------------------------------------------
1.6 K     Trainable params
0         Non-trainable params
1.6 K     Total params
0.006     Total estimated model params size (MB)
`Trainer.fit` stopped: `max_epochs=20` reached.
[Parallel(n_jobs=1)]: Done   1 tasks      | elapsed:   54.2s
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs

  | Name                   | Type                   | Params
------------------------------------------------------------------
0 | loss                   | NormalDistributionLoss | 0
1 | logging_metrics        | ModuleList             | 0
2 | embeddings             | MultiEmbedding         | 0
3 | rnn                    | LSTM                   | 1.6 K
4 | distribution_projector | Linear                 | 22
------------------------------------------------------------------
1.6 K     Trainable params
0         Non-trainable params
1.6 K     Total params
0.006     Total estimated model params size (MB)
`Trainer.fit` stopped: `max_epochs=20` reached.
[Parallel(n_jobs=1)]: Done   2 tasks      | elapsed:  1.8min
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs

  | Name                   | Type                   | Params
------------------------------------------------------------------
0 | loss                   | NormalDistributionLoss | 0
1 | logging_metrics        | ModuleList             | 0
2 | embeddings             | MultiEmbedding         | 0
3 | rnn                    | LSTM                   | 1.6 K
4 | distribution_projector | Linear                 | 22
------------------------------------------------------------------
1.6 K     Trainable params
0         Non-trainable params
1.6 K     Total params
0.006     Total estimated model params size (MB)
`Trainer.fit` stopped: `max_epochs=20` reached.
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:  2.7min
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:  2.7min
[Parallel(n_jobs=1)]: Done   1 tasks      | elapsed:    0.2s
[Parallel(n_jobs=1)]: Done   2 tasks      | elapsed:    0.3s
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:    0.4s
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:    0.4s
[Parallel(n_jobs=1)]: Done   1 tasks      | elapsed:    0.0s
[Parallel(n_jobs=1)]: Done   2 tasks      | elapsed:    0.1s
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:    0.1s
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:    0.1s
[19]:
metrics_deepar
[19]:
segment SMAPE MAPE MAE fold_number
0 segment_a 11.991780 11.136381 60.220559 0
0 segment_a 3.738702 3.802561 19.429988 1
0 segment_a 9.372203 9.128006 47.781721 2
1 segment_b 8.085429 7.699512 20.147574 0
1 segment_b 4.951003 4.924274 11.986943 1
1 segment_b 5.498260 5.821215 12.443900 2
2 segment_c 5.561443 5.633259 9.442315 0
2 segment_c 7.060727 6.841533 12.387767 1
2 segment_c 5.387421 5.496503 9.709660 2
3 segment_d 6.425421 6.287131 55.566650 0
3 segment_d 3.536986 3.587421 27.874773 1
3 segment_d 6.445379 6.539991 50.958557 2

To summarize, we will take the mean value of the SMAPE metric because it is scale-tolerant.
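
As a reminder, under its usual definition SMAPE is bounded and does not depend on the scale of the target:

$$\mathrm{SMAPE} = \frac{100}{n} \sum_{t=1}^{n} \frac{2\,|y_t - \hat{y}_t|}{|y_t| + |\hat{y}_t|}$$

so averaging it across segments with very different scales is meaningful.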

[20]:
score = metrics_deepar["SMAPE"].mean()
print(f"Average SMAPE for DeepAR: {score:.3f}")
Average SMAPE for DeepAR: 6.505

Dataset Builder: creating a dataset for DeepAR with extra features#

[21]:
from pytorch_forecasting.data import GroupNormalizer

num_lags = 10

transform_date = DateFlagsTransform(
    day_number_in_week=True,
    day_number_in_month=False,
    day_number_in_year=False,
    week_number_in_month=False,
    week_number_in_year=False,
    month_number_in_year=False,
    season_number=False,
    year_number=False,
    is_weekend=False,
    out_column="dateflag",
)
transform_lag = LagTransform(
    in_column="target",
    lags=[HORIZON + i for i in range(num_lags)],
    out_column="target_lag",
)
lag_columns = [f"target_lag_{HORIZON+i}" for i in range(num_lags)]

dataset_builder_deepar = PytorchForecastingDatasetBuilder(
    max_encoder_length=HORIZON,
    max_prediction_length=HORIZON,
    time_varying_known_reals=["time_idx"] + lag_columns,
    time_varying_unknown_reals=["target"],
    time_varying_known_categoricals=["dateflag_day_number_in_week"],
    target_normalizer=GroupNormalizer(groups=["segment"]),
)

Now we are going to run the backtest.

[22]:
set_seed()

model_deepar = DeepARModel(
    dataset_builder=dataset_builder_deepar,
    trainer_params=dict(max_epochs=20, gradient_clip_val=0.1),
    lr=0.01,
    train_batch_size=16,
)

pipeline_deepar = Pipeline(
    model=model_deepar,
    horizon=HORIZON,
    transforms=[transform_lag, transform_date],
)
[23]:
metrics_deepar, forecast_deepar, fold_info_deepar = pipeline_deepar.backtest(ts, metrics=metrics, n_folds=3, n_jobs=1)
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs

  | Name                   | Type                   | Params
------------------------------------------------------------------
0 | loss                   | NormalDistributionLoss | 0
1 | logging_metrics        | ModuleList             | 0
2 | embeddings             | MultiEmbedding         | 35
3 | rnn                    | LSTM                   | 2.2 K
4 | distribution_projector | Linear                 | 22
------------------------------------------------------------------
2.3 K     Trainable params
0         Non-trainable params
2.3 K     Total params
0.009     Total estimated model params size (MB)
`Trainer.fit` stopped: `max_epochs=20` reached.
[Parallel(n_jobs=1)]: Done   1 tasks      | elapsed:   52.0s
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs

  | Name                   | Type                   | Params
------------------------------------------------------------------
0 | loss                   | NormalDistributionLoss | 0
1 | logging_metrics        | ModuleList             | 0
2 | embeddings             | MultiEmbedding         | 35
3 | rnn                    | LSTM                   | 2.2 K
4 | distribution_projector | Linear                 | 22
------------------------------------------------------------------
2.3 K     Trainable params
0         Non-trainable params
2.3 K     Total params
0.009     Total estimated model params size (MB)
`Trainer.fit` stopped: `max_epochs=20` reached.
[Parallel(n_jobs=1)]: Done   2 tasks      | elapsed:  1.8min
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs

  | Name                   | Type                   | Params
------------------------------------------------------------------
0 | loss                   | NormalDistributionLoss | 0
1 | logging_metrics        | ModuleList             | 0
2 | embeddings             | MultiEmbedding         | 35
3 | rnn                    | LSTM                   | 2.2 K
4 | distribution_projector | Linear                 | 22
------------------------------------------------------------------
2.3 K     Trainable params
0         Non-trainable params
2.3 K     Total params
0.009     Total estimated model params size (MB)
`Trainer.fit` stopped: `max_epochs=20` reached.
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:  2.7min
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:  2.7min
[Parallel(n_jobs=1)]: Done   1 tasks      | elapsed:    0.2s
[Parallel(n_jobs=1)]: Done   2 tasks      | elapsed:    0.4s
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:    0.7s
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:    0.7s
[Parallel(n_jobs=1)]: Done   1 tasks      | elapsed:    0.0s
[Parallel(n_jobs=1)]: Done   2 tasks      | elapsed:    0.1s
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:    0.1s
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:    0.1s

Let’s compare results across different segments.

[24]:
metrics_deepar
[24]:
segment SMAPE MAPE MAE fold_number
0 segment_a 6.805589 6.531080 35.181074 0
0 segment_a 3.220955 3.193370 16.990396 1
0 segment_a 6.250621 5.970211 33.195578 2
1 segment_b 7.510025 7.240365 18.528839 0
1 segment_b 3.394227 3.402080 8.272507 1
1 segment_b 2.550606 2.556617 6.122042 2
2 segment_c 2.711822 2.738570 4.532950 0
2 segment_c 4.593303 4.459809 8.320413 1
2 segment_c 4.944583 4.842134 9.074206 2
3 segment_d 5.825117 5.610581 51.673697 0
3 segment_d 4.951742 5.077606 39.208156 1
3 segment_d 5.460450 5.284514 46.397400 2

To summarize, we will take the mean value of the SMAPE metric because it is scale-tolerant.

[25]:
score = metrics_deepar["SMAPE"].mean()
print(f"Average SMAPE for DeepAR: {score:.3f}")
Average SMAPE for DeepAR: 4.852

Let’s visualize the results.

[26]:
plot_backtest(forecast_deepar, ts, history_len=20)
../_images/tutorials_202-NN_examples_49_0.png

3.3 DeepARNative#

It is recommended to use our native implementation of DeepAR; the Pytorch Forecasting version will be removed in etna 3.0.0.

[27]:
from etna.models.nn import DeepARNativeModel
[28]:
num_lags = 7

scaler = StandardScalerTransform(in_column="target")
transform_date = DateFlagsTransform(
    day_number_in_week=True,
    day_number_in_month=False,
    day_number_in_year=False,
    week_number_in_month=False,
    week_number_in_year=False,
    month_number_in_year=False,
    season_number=False,
    year_number=False,
    is_weekend=False,
    out_column="dateflag",
)
transform_lag = LagTransform(
    in_column="target",
    lags=[HORIZON + i for i in range(num_lags)],
    out_column="target_lag",
)
label_encoder = LabelEncoderTransform(
    in_column="dateflag_day_number_in_week", strategy="none", out_column="dateflag_day_number_in_week_label"
)

# embedding_sizes maps each categorical feature to (number of categories, embedding dimension)
embedding_sizes = {"dateflag_day_number_in_week_label": (7, 7)}
[29]:
set_seed()

model_deepar_native = DeepARNativeModel(
    input_size=num_lags + 1,
    encoder_length=2 * HORIZON,
    decoder_length=HORIZON,
    embedding_sizes=embedding_sizes,
    lr=0.01,
    scale=False,
    n_samples=100,
    trainer_params=dict(max_epochs=2),
)

pipeline_deepar_native = Pipeline(
    model=model_deepar_native,
    horizon=HORIZON,
    transforms=[scaler, transform_lag, transform_date, label_encoder],
)
[30]:
metrics_deepar_native, forecast_deepar_native, fold_info_deepar_native = pipeline_deepar_native.backtest(
    ts, metrics=metrics, n_folds=3, n_jobs=1
)
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs

  | Name       | Type           | Params
----------------------------------------------
0 | loss       | GaussianLoss   | 0
1 | embedding  | MultiEmbedding | 56
2 | rnn        | LSTM           | 4.3 K
3 | projection | ModuleDict     | 34
----------------------------------------------
4.4 K     Trainable params
0         Non-trainable params
4.4 K     Total params
0.018     Total estimated model params size (MB)
`Trainer.fit` stopped: `max_epochs=2` reached.
[Parallel(n_jobs=1)]: Done   1 tasks      | elapsed:    3.3s
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs

  | Name       | Type           | Params
----------------------------------------------
0 | loss       | GaussianLoss   | 0
1 | embedding  | MultiEmbedding | 56
2 | rnn        | LSTM           | 4.3 K
3 | projection | ModuleDict     | 34
----------------------------------------------
4.4 K     Trainable params
0         Non-trainable params
4.4 K     Total params
0.018     Total estimated model params size (MB)
`Trainer.fit` stopped: `max_epochs=2` reached.
[Parallel(n_jobs=1)]: Done   2 tasks      | elapsed:    6.9s
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs

  | Name       | Type           | Params
----------------------------------------------
0 | loss       | GaussianLoss   | 0
1 | embedding  | MultiEmbedding | 56
2 | rnn        | LSTM           | 4.3 K
3 | projection | ModuleDict     | 34
----------------------------------------------
4.4 K     Trainable params
0         Non-trainable params
4.4 K     Total params
0.018     Total estimated model params size (MB)
`Trainer.fit` stopped: `max_epochs=2` reached.
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:   10.5s
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:   10.5s
[Parallel(n_jobs=1)]: Done   1 tasks      | elapsed:    0.9s
[Parallel(n_jobs=1)]: Done   2 tasks      | elapsed:    1.8s
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:    2.7s
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:    2.7s
[Parallel(n_jobs=1)]: Done   1 tasks      | elapsed:    0.0s
[Parallel(n_jobs=1)]: Done   2 tasks      | elapsed:    0.1s
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:    0.1s
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:    0.1s
[31]:
score = metrics_deepar_native["SMAPE"].mean()
print(f"Average SMAPE for DeepARNative: {score:.3f}")
Average SMAPE for DeepARNative: 5.816
[32]:
plot_backtest(forecast_deepar_native, ts, history_len=20)
../_images/tutorials_202-NN_examples_57_0.png

3.4 TFT#

Let’s move to the next model.

[33]:
from etna.models.nn import TFTModel
[34]:
set_seed()

Default way#

[35]:
model_tft = TFTModel(
    encoder_length=2 * HORIZON,
    decoder_length=HORIZON,
    trainer_params=dict(max_epochs=60, gradient_clip_val=0.1),
    lr=0.01,
    train_batch_size=32,
)

pipeline_tft = Pipeline(
    model=model_tft,
    horizon=HORIZON,
)
[36]:
metrics_tft, forecast_tft, fold_info_tft = pipeline_tft.backtest(ts, metrics=metrics, n_folds=3, n_jobs=1)
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs

   | Name                               | Type                            | Params
----------------------------------------------------------------------------------------
0  | loss                               | QuantileLoss                    | 0
1  | logging_metrics                    | ModuleList                      | 0
2  | input_embeddings                   | MultiEmbedding                  | 0
3  | prescalers                         | ModuleDict                      | 96
4  | static_variable_selection          | VariableSelectionNetwork        | 1.7 K
5  | encoder_variable_selection         | VariableSelectionNetwork        | 1.8 K
6  | decoder_variable_selection         | VariableSelectionNetwork        | 1.2 K
7  | static_context_variable_selection  | GatedResidualNetwork            | 1.1 K
8  | static_context_initial_hidden_lstm | GatedResidualNetwork            | 1.1 K
9  | static_context_initial_cell_lstm   | GatedResidualNetwork            | 1.1 K
10 | static_context_enrichment          | GatedResidualNetwork            | 1.1 K
11 | lstm_encoder                       | LSTM                            | 2.2 K
12 | lstm_decoder                       | LSTM                            | 2.2 K
13 | post_lstm_gate_encoder             | GatedLinearUnit                 | 544
14 | post_lstm_add_norm_encoder         | AddNorm                         | 32
15 | static_enrichment                  | GatedResidualNetwork            | 1.4 K
16 | multihead_attn                     | InterpretableMultiHeadAttention | 676
17 | post_attn_gate_norm                | GateAddNorm                     | 576
18 | pos_wise_ff                        | GatedResidualNetwork            | 1.1 K
19 | pre_output_gate_norm               | GateAddNorm                     | 576
20 | output_layer                       | Linear                          | 119
----------------------------------------------------------------------------------------
18.4 K    Trainable params
0         Non-trainable params
18.4 K    Total params
0.074     Total estimated model params size (MB)
`Trainer.fit` stopped: `max_epochs=60` reached.
[Parallel(n_jobs=1)]: Done   1 tasks      | elapsed:  2.3min
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs

   | Name                               | Type                            | Params
----------------------------------------------------------------------------------------
0  | loss                               | QuantileLoss                    | 0
1  | logging_metrics                    | ModuleList                      | 0
2  | input_embeddings                   | MultiEmbedding                  | 0
3  | prescalers                         | ModuleDict                      | 96
4  | static_variable_selection          | VariableSelectionNetwork        | 1.7 K
5  | encoder_variable_selection         | VariableSelectionNetwork        | 1.8 K
6  | decoder_variable_selection         | VariableSelectionNetwork        | 1.2 K
7  | static_context_variable_selection  | GatedResidualNetwork            | 1.1 K
8  | static_context_initial_hidden_lstm | GatedResidualNetwork            | 1.1 K
9  | static_context_initial_cell_lstm   | GatedResidualNetwork            | 1.1 K
10 | static_context_enrichment          | GatedResidualNetwork            | 1.1 K
11 | lstm_encoder                       | LSTM                            | 2.2 K
12 | lstm_decoder                       | LSTM                            | 2.2 K
13 | post_lstm_gate_encoder             | GatedLinearUnit                 | 544
14 | post_lstm_add_norm_encoder         | AddNorm                         | 32
15 | static_enrichment                  | GatedResidualNetwork            | 1.4 K
16 | multihead_attn                     | InterpretableMultiHeadAttention | 676
17 | post_attn_gate_norm                | GateAddNorm                     | 576
18 | pos_wise_ff                        | GatedResidualNetwork            | 1.1 K
19 | pre_output_gate_norm               | GateAddNorm                     | 576
20 | output_layer                       | Linear                          | 119
----------------------------------------------------------------------------------------
18.4 K    Trainable params
0         Non-trainable params
18.4 K    Total params
0.074     Total estimated model params size (MB)
`Trainer.fit` stopped: `max_epochs=60` reached.
[Parallel(n_jobs=1)]: Done   2 tasks      | elapsed:  4.8min
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs

   | Name                               | Type                            | Params
----------------------------------------------------------------------------------------
0  | loss                               | QuantileLoss                    | 0
1  | logging_metrics                    | ModuleList                      | 0
2  | input_embeddings                   | MultiEmbedding                  | 0
3  | prescalers                         | ModuleDict                      | 96
4  | static_variable_selection          | VariableSelectionNetwork        | 1.7 K
5  | encoder_variable_selection         | VariableSelectionNetwork        | 1.8 K
6  | decoder_variable_selection         | VariableSelectionNetwork        | 1.2 K
7  | static_context_variable_selection  | GatedResidualNetwork            | 1.1 K
8  | static_context_initial_hidden_lstm | GatedResidualNetwork            | 1.1 K
9  | static_context_initial_cell_lstm   | GatedResidualNetwork            | 1.1 K
10 | static_context_enrichment          | GatedResidualNetwork            | 1.1 K
11 | lstm_encoder                       | LSTM                            | 2.2 K
12 | lstm_decoder                       | LSTM                            | 2.2 K
13 | post_lstm_gate_encoder             | GatedLinearUnit                 | 544
14 | post_lstm_add_norm_encoder         | AddNorm                         | 32
15 | static_enrichment                  | GatedResidualNetwork            | 1.4 K
16 | multihead_attn                     | InterpretableMultiHeadAttention | 676
17 | post_attn_gate_norm                | GateAddNorm                     | 576
18 | pos_wise_ff                        | GatedResidualNetwork            | 1.1 K
19 | pre_output_gate_norm               | GateAddNorm                     | 576
20 | output_layer                       | Linear                          | 119
----------------------------------------------------------------------------------------
18.4 K    Trainable params
0         Non-trainable params
18.4 K    Total params
0.074     Total estimated model params size (MB)
`Trainer.fit` stopped: `max_epochs=60` reached.
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:  7.3min
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:  7.3min
[Parallel(n_jobs=1)]: Done   1 tasks      | elapsed:    0.1s
[Parallel(n_jobs=1)]: Done   2 tasks      | elapsed:    0.2s
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:    0.3s
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:    0.3s
[Parallel(n_jobs=1)]: Done   1 tasks      | elapsed:    0.0s
[Parallel(n_jobs=1)]: Done   2 tasks      | elapsed:    0.1s
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:    0.1s
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:    0.1s
[37]:
metrics_tft
[37]:
segment SMAPE MAPE MAE fold_number
0 segment_a 48.136045 38.466858 207.882860 0
0 segment_a 15.005679 13.808033 75.980979 1
0 segment_a 78.258300 56.042976 307.516689 2
1 segment_b 25.119316 29.855394 70.259988 0
1 segment_b 11.968204 11.546382 28.715138 1
1 segment_b 10.453402 10.686422 24.270442 2
2 segment_c 60.551222 87.741155 148.974278 0
2 segment_c 10.676096 10.220948 18.654946 1
2 segment_c 25.062330 29.356065 51.625296 2
3 segment_d 89.548155 61.552517 536.311436 0
3 segment_d 39.707204 32.576548 273.843384 1
3 segment_d 111.788257 71.553335 610.088037 2
[38]:
score = metrics_tft["SMAPE"].mean()
print(f"Average SMAPE for TFT: {score:.3f}")
Average SMAPE for TFT: 43.856

Dataset Builder#

[39]:
set_seed()

num_lags = 10
lag_columns = [f"target_lag_{HORIZON+i}" for i in range(num_lags)]

transform_date = DateFlagsTransform(day_number_in_week=True, day_number_in_month=False, out_column="dateflag")
transform_lag = LagTransform(
    in_column="target",
    lags=[HORIZON + i for i in range(num_lags)],
    out_column="target_lag",
)

dataset_builder_tft = PytorchForecastingDatasetBuilder(
    max_encoder_length=HORIZON,
    max_prediction_length=HORIZON,
    time_varying_known_reals=["time_idx"],
    time_varying_unknown_reals=["target"],
    time_varying_known_categoricals=["dateflag_day_number_in_week"],
    static_categoricals=["segment"],
    target_normalizer=GroupNormalizer(groups=["segment"]),
)
[40]:
model_tft = TFTModel(
    dataset_builder=dataset_builder_tft,
    trainer_params=dict(max_epochs=50, gradient_clip_val=0.1),
    lr=0.01,
    train_batch_size=32,
)

pipeline_tft = Pipeline(
    model=model_tft,
    horizon=HORIZON,
    transforms=[transform_lag, transform_date],
)
[41]:
metrics_tft, forecast_tft, fold_info_tft = pipeline_tft.backtest(ts, metrics=metrics, n_folds=3, n_jobs=1)
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs

   | Name                               | Type                            | Params
----------------------------------------------------------------------------------------
0  | loss                               | QuantileLoss                    | 0
1  | logging_metrics                    | ModuleList                      | 0
2  | input_embeddings                   | MultiEmbedding                  | 47
3  | prescalers                         | ModuleDict                      | 96
4  | static_variable_selection          | VariableSelectionNetwork        | 1.8 K
5  | encoder_variable_selection         | VariableSelectionNetwork        | 1.9 K
6  | decoder_variable_selection         | VariableSelectionNetwork        | 1.3 K
7  | static_context_variable_selection  | GatedResidualNetwork            | 1.1 K
8  | static_context_initial_hidden_lstm | GatedResidualNetwork            | 1.1 K
9  | static_context_initial_cell_lstm   | GatedResidualNetwork            | 1.1 K
10 | static_context_enrichment          | GatedResidualNetwork            | 1.1 K
11 | lstm_encoder                       | LSTM                            | 2.2 K
12 | lstm_decoder                       | LSTM                            | 2.2 K
13 | post_lstm_gate_encoder             | GatedLinearUnit                 | 544
14 | post_lstm_add_norm_encoder         | AddNorm                         | 32
15 | static_enrichment                  | GatedResidualNetwork            | 1.4 K
16 | multihead_attn                     | InterpretableMultiHeadAttention | 676
17 | post_attn_gate_norm                | GateAddNorm                     | 576
18 | pos_wise_ff                        | GatedResidualNetwork            | 1.1 K
19 | pre_output_gate_norm               | GateAddNorm                     | 576
20 | output_layer                       | Linear                          | 119
----------------------------------------------------------------------------------------
18.9 K    Trainable params
0         Non-trainable params
18.9 K    Total params
0.075     Total estimated model params size (MB)
`Trainer.fit` stopped: `max_epochs=50` reached.
[Parallel(n_jobs=1)]: Done   1 tasks      | elapsed:  2.5min
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs

   | Name                               | Type                            | Params
----------------------------------------------------------------------------------------
0  | loss                               | QuantileLoss                    | 0
1  | logging_metrics                    | ModuleList                      | 0
2  | input_embeddings                   | MultiEmbedding                  | 47
3  | prescalers                         | ModuleDict                      | 96
4  | static_variable_selection          | VariableSelectionNetwork        | 1.8 K
5  | encoder_variable_selection         | VariableSelectionNetwork        | 1.9 K
6  | decoder_variable_selection         | VariableSelectionNetwork        | 1.3 K
7  | static_context_variable_selection  | GatedResidualNetwork            | 1.1 K
8  | static_context_initial_hidden_lstm | GatedResidualNetwork            | 1.1 K
9  | static_context_initial_cell_lstm   | GatedResidualNetwork            | 1.1 K
10 | static_context_enrichment          | GatedResidualNetwork            | 1.1 K
11 | lstm_encoder                       | LSTM                            | 2.2 K
12 | lstm_decoder                       | LSTM                            | 2.2 K
13 | post_lstm_gate_encoder             | GatedLinearUnit                 | 544
14 | post_lstm_add_norm_encoder         | AddNorm                         | 32
15 | static_enrichment                  | GatedResidualNetwork            | 1.4 K
16 | multihead_attn                     | InterpretableMultiHeadAttention | 676
17 | post_attn_gate_norm                | GateAddNorm                     | 576
18 | pos_wise_ff                        | GatedResidualNetwork            | 1.1 K
19 | pre_output_gate_norm               | GateAddNorm                     | 576
20 | output_layer                       | Linear                          | 119
----------------------------------------------------------------------------------------
18.9 K    Trainable params
0         Non-trainable params
18.9 K    Total params
0.075     Total estimated model params size (MB)
`Trainer.fit` stopped: `max_epochs=50` reached.
[Parallel(n_jobs=1)]: Done   2 tasks      | elapsed:  5.2min
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs

   | Name                               | Type                            | Params
----------------------------------------------------------------------------------------
0  | loss                               | QuantileLoss                    | 0
1  | logging_metrics                    | ModuleList                      | 0
2  | input_embeddings                   | MultiEmbedding                  | 47
3  | prescalers                         | ModuleDict                      | 96
4  | static_variable_selection          | VariableSelectionNetwork        | 1.8 K
5  | encoder_variable_selection         | VariableSelectionNetwork        | 1.9 K
6  | decoder_variable_selection         | VariableSelectionNetwork        | 1.3 K
7  | static_context_variable_selection  | GatedResidualNetwork            | 1.1 K
8  | static_context_initial_hidden_lstm | GatedResidualNetwork            | 1.1 K
9  | static_context_initial_cell_lstm   | GatedResidualNetwork            | 1.1 K
10 | static_context_enrichment          | GatedResidualNetwork            | 1.1 K
11 | lstm_encoder                       | LSTM                            | 2.2 K
12 | lstm_decoder                       | LSTM                            | 2.2 K
13 | post_lstm_gate_encoder             | GatedLinearUnit                 | 544
14 | post_lstm_add_norm_encoder         | AddNorm                         | 32
15 | static_enrichment                  | GatedResidualNetwork            | 1.4 K
16 | multihead_attn                     | InterpretableMultiHeadAttention | 676
17 | post_attn_gate_norm                | GateAddNorm                     | 576
18 | pos_wise_ff                        | GatedResidualNetwork            | 1.1 K
19 | pre_output_gate_norm               | GateAddNorm                     | 576
20 | output_layer                       | Linear                          | 119
----------------------------------------------------------------------------------------
18.9 K    Trainable params
0         Non-trainable params
18.9 K    Total params
0.075     Total estimated model params size (MB)
`Trainer.fit` stopped: `max_epochs=50` reached.
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:  7.8min
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:  7.8min
[Parallel(n_jobs=1)]: Done   1 tasks      | elapsed:    0.2s
[Parallel(n_jobs=1)]: Done   2 tasks      | elapsed:    0.4s
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:    0.6s
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:    0.6s
[Parallel(n_jobs=1)]: Done   1 tasks      | elapsed:    0.0s
[Parallel(n_jobs=1)]: Done   2 tasks      | elapsed:    0.1s
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:    0.1s
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:    0.1s
[42]:
metrics_tft
[42]:
segment SMAPE MAPE MAE fold_number
0 segment_a 6.116235 5.901792 31.657100 0
0 segment_a 6.543858 6.257777 33.911211 1
0 segment_a 7.527751 7.201543 39.835205 2
1 segment_b 6.692551 6.420435 16.839042 0
1 segment_b 6.789591 6.519271 16.461197 1
1 segment_b 5.155557 4.980205 12.515908 2
2 segment_c 3.931971 3.846683 6.791909 0
2 segment_c 3.867445 3.760706 7.001602 1
2 segment_c 6.231263 5.934746 11.344711 2
3 segment_d 7.338428 7.018524 64.302726 0
3 segment_d 3.332532 3.341312 26.611154 1
3 segment_d 4.325608 4.201900 36.394749 2
[43]:
score = metrics_tft["SMAPE"].mean()
print(f"Average SMAPE for TFT: {score:.3f}")
Average SMAPE for TFT: 5.654
[44]:
plot_backtest(forecast_tft, ts, history_len=20)
../_images/tutorials_202-NN_examples_73_0.png

3.5 TFTNative#

It is recommended to use our native implementation of TFT; the Pytorch Forecasting version will be removed in etna 3.0.0.

[45]:
from etna.models.nn import TFTNativeModel
[46]:
num_lags = 7
lag_columns = [f"target_lag_{HORIZON+i}" for i in range(num_lags)]

transform_lag = LagTransform(
    in_column="target",
    lags=[HORIZON + i for i in range(num_lags)],
    out_column="target_lag",
)
transform_date = DateFlagsTransform(
    day_number_in_week=True,
    day_number_in_month=False,
    day_number_in_year=False,
    week_number_in_month=False,
    week_number_in_year=False,
    month_number_in_year=False,
    season_number=False,
    year_number=False,
    is_weekend=False,
    out_column="dateflag",
)
scaler = StandardScalerTransform(in_column=["target"])

encoder = SegmentEncoderTransform()
label_encoder = LabelEncoderTransform(
    in_column="dateflag_day_number_in_week", strategy="none", out_column="dateflag_day_number_in_week_label"
)
[47]:
set_seed()

model_tft_native = TFTNativeModel(
    encoder_length=2 * HORIZON,
    decoder_length=HORIZON,
    static_categoricals=["segment_code"],
    time_varying_categoricals_encoder=["dateflag_day_number_in_week_label"],
    time_varying_categoricals_decoder=["dateflag_day_number_in_week_label"],
    time_varying_reals_encoder=["target"] + lag_columns,
    time_varying_reals_decoder=lag_columns,
    num_embeddings={"segment_code": len(ts.segments), "dateflag_day_number_in_week_label": 7},
    n_heads=1,
    num_layers=2,
    hidden_size=32,
    lr=0.0001,
    train_batch_size=16,
    trainer_params=dict(max_epochs=5, gradient_clip_val=0.1),
)
pipeline_tft_native = Pipeline(
    model=model_tft_native, horizon=HORIZON, transforms=[transform_lag, scaler, transform_date, encoder, label_encoder]
)
[48]:
metrics_tft_native, forecast_tft_native, fold_info_tft_native = pipeline_tft_native.backtest(
    ts, metrics=metrics, n_folds=3, n_jobs=1
)
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs

   | Name                            | Type                     | Params
------------------------------------------------------------------------------
0  | loss                            | MSELoss                  | 0
1  | static_scalers                  | ModuleDict               | 0
2  | static_embeddings               | ModuleDict               | 160
3  | time_varying_scalers_encoder    | ModuleDict               | 512
4  | time_varying_embeddings_encoder | ModuleDict               | 256
5  | time_varying_scalers_decoder    | ModuleDict               | 448
6  | time_varying_embeddings_decoder | ModuleDict               | 256
7  | static_variable_selection       | VariableSelectionNetwork | 6.5 K
8  | encoder_variable_selection      | VariableSelectionNetwork | 222 K
9  | decoder_variable_selection      | VariableSelectionNetwork | 180 K
10 | static_covariate_encoder        | StaticCovariateEncoder   | 17.2 K
11 | lstm_encoder                    | LSTM                     | 16.9 K
12 | lstm_decoder                    | LSTM                     | 16.9 K
13 | gated_norm1                     | GateAddNorm              | 2.2 K
14 | temporal_fusion_decoder         | TemporalFusionDecoder    | 16.0 K
15 | gated_norm2                     | GateAddNorm              | 2.2 K
16 | output_fc                       | Linear                   | 33
------------------------------------------------------------------------------
481 K     Trainable params
0         Non-trainable params
481 K     Total params
1.927     Total estimated model params size (MB)
`Trainer.fit` stopped: `max_epochs=5` reached.
[Parallel(n_jobs=1)]: Done   1 tasks      | elapsed:   37.2s
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs

   | Name                            | Type                     | Params
------------------------------------------------------------------------------
0  | loss                            | MSELoss                  | 0
1  | static_scalers                  | ModuleDict               | 0
2  | static_embeddings               | ModuleDict               | 160
3  | time_varying_scalers_encoder    | ModuleDict               | 512
4  | time_varying_embeddings_encoder | ModuleDict               | 256
5  | time_varying_scalers_decoder    | ModuleDict               | 448
6  | time_varying_embeddings_decoder | ModuleDict               | 256
7  | static_variable_selection       | VariableSelectionNetwork | 6.5 K
8  | encoder_variable_selection      | VariableSelectionNetwork | 222 K
9  | decoder_variable_selection      | VariableSelectionNetwork | 180 K
10 | static_covariate_encoder        | StaticCovariateEncoder   | 17.2 K
11 | lstm_encoder                    | LSTM                     | 16.9 K
12 | lstm_decoder                    | LSTM                     | 16.9 K
13 | gated_norm1                     | GateAddNorm              | 2.2 K
14 | temporal_fusion_decoder         | TemporalFusionDecoder    | 16.0 K
15 | gated_norm2                     | GateAddNorm              | 2.2 K
16 | output_fc                       | Linear                   | 33
------------------------------------------------------------------------------
481 K     Trainable params
0         Non-trainable params
481 K     Total params
1.927     Total estimated model params size (MB)
`Trainer.fit` stopped: `max_epochs=5` reached.
[Parallel(n_jobs=1)]: Done   2 tasks      | elapsed:  1.3min
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs

   | Name                            | Type                     | Params
------------------------------------------------------------------------------
0  | loss                            | MSELoss                  | 0
1  | static_scalers                  | ModuleDict               | 0
2  | static_embeddings               | ModuleDict               | 160
3  | time_varying_scalers_encoder    | ModuleDict               | 512
4  | time_varying_embeddings_encoder | ModuleDict               | 256
5  | time_varying_scalers_decoder    | ModuleDict               | 448
6  | time_varying_embeddings_decoder | ModuleDict               | 256
7  | static_variable_selection       | VariableSelectionNetwork | 6.5 K
8  | encoder_variable_selection      | VariableSelectionNetwork | 222 K
9  | decoder_variable_selection      | VariableSelectionNetwork | 180 K
10 | static_covariate_encoder        | StaticCovariateEncoder   | 17.2 K
11 | lstm_encoder                    | LSTM                     | 16.9 K
12 | lstm_decoder                    | LSTM                     | 16.9 K
13 | gated_norm1                     | GateAddNorm              | 2.2 K
14 | temporal_fusion_decoder         | TemporalFusionDecoder    | 16.0 K
15 | gated_norm2                     | GateAddNorm              | 2.2 K
16 | output_fc                       | Linear                   | 33
------------------------------------------------------------------------------
481 K     Trainable params
0         Non-trainable params
481 K     Total params
1.927     Total estimated model params size (MB)
`Trainer.fit` stopped: `max_epochs=5` reached.
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:  1.9min
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:  1.9min
[Parallel(n_jobs=1)]: Done   1 tasks      | elapsed:    0.2s
[Parallel(n_jobs=1)]: Done   2 tasks      | elapsed:    0.4s
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:    0.5s
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:    0.5s
[Parallel(n_jobs=1)]: Done   1 tasks      | elapsed:    0.0s
[Parallel(n_jobs=1)]: Done   2 tasks      | elapsed:    0.1s
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:    0.1s
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:    0.1s
[49]:
score = metrics_tft_native["SMAPE"].mean()
print(f"Average SMAPE for TFTNative: {score:.3f}")
Average SMAPE for TFTNative: 6.151
[50]:
plot_backtest(forecast_tft_native, ts, history_len=20)
../_images/tutorials_202-NN_examples_81_0.png

3.6 RNN#

We’ll use an RNN model based on an LSTM cell.
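
The model consumes the scaled target together with the lag features, which is why input_size below is set to num_lags + 1 (the lags plus the target itself).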

[51]:
from etna.models.nn import RNNModel
[52]:
num_lags = 7

scaler = StandardScalerTransform(in_column="target")
transform_lag = LagTransform(
    in_column="target",
    lags=[HORIZON + i for i in range(num_lags)],
    out_column="target_lag",
)
transform_date = DateFlagsTransform(
    day_number_in_week=True,
    day_number_in_month=False,
    day_number_in_year=False,
    week_number_in_month=False,
    week_number_in_year=False,
    month_number_in_year=False,
    season_number=False,
    year_number=False,
    is_weekend=False,
    out_column="dateflag",
)
segment_encoder = SegmentEncoderTransform()
label_encoder = LabelEncoderTransform(
    in_column="dateflag_day_number_in_week", strategy="none", out_column="dateflag_day_number_in_week_label"
)

embedding_sizes = {"dateflag_day_number_in_week_label": (7, 7)}
[53]:
set_seed()

model_rnn = RNNModel(
    input_size=num_lags + 1,
    encoder_length=2 * HORIZON,
    decoder_length=HORIZON,
    embedding_sizes=embedding_sizes,
    trainer_params=dict(max_epochs=5),
    lr=1e-3,
)

pipeline_rnn = Pipeline(
    model=model_rnn,
    horizon=HORIZON,
    transforms=[scaler, transform_lag, transform_date, label_encoder],
)
[54]:
metrics_rnn, forecast_rnn, fold_info_rnn = pipeline_rnn.backtest(ts, metrics=metrics, n_folds=3, n_jobs=1)
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs

  | Name       | Type           | Params
----------------------------------------------
0 | loss       | MSELoss        | 0
1 | embedding  | MultiEmbedding | 56
2 | rnn        | LSTM           | 4.3 K
3 | projection | Linear         | 17
----------------------------------------------
4.4 K     Trainable params
0         Non-trainable params
4.4 K     Total params
0.017     Total estimated model params size (MB)
`Trainer.fit` stopped: `max_epochs=5` reached.
[Parallel(n_jobs=1)]: Done   1 tasks      | elapsed:    7.9s
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs

  | Name       | Type           | Params
----------------------------------------------
0 | loss       | MSELoss        | 0
1 | embedding  | MultiEmbedding | 56
2 | rnn        | LSTM           | 4.3 K
3 | projection | Linear         | 17
----------------------------------------------
4.4 K     Trainable params
0         Non-trainable params
4.4 K     Total params
0.017     Total estimated model params size (MB)
`Trainer.fit` stopped: `max_epochs=5` reached.
[Parallel(n_jobs=1)]: Done   2 tasks      | elapsed:   15.8s
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs

  | Name       | Type           | Params
----------------------------------------------
0 | loss       | MSELoss        | 0
1 | embedding  | MultiEmbedding | 56
2 | rnn        | LSTM           | 4.3 K
3 | projection | Linear         | 17
----------------------------------------------
4.4 K     Trainable params
0         Non-trainable params
4.4 K     Total params
0.017     Total estimated model params size (MB)
`Trainer.fit` stopped: `max_epochs=5` reached.
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:   23.7s
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:   23.7s
[Parallel(n_jobs=1)]: Done   1 tasks      | elapsed:    0.2s
[Parallel(n_jobs=1)]: Done   2 tasks      | elapsed:    0.3s
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:    0.5s
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:    0.5s
[Parallel(n_jobs=1)]: Done   1 tasks      | elapsed:    0.0s
[Parallel(n_jobs=1)]: Done   2 tasks      | elapsed:    0.1s
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:    0.1s
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:    0.1s
[55]:
score = metrics_rnn["SMAPE"].mean()
print(f"Average SMAPE for LSTM: {score:.3f}")
Average SMAPE for LSTM: 5.653
[56]:
plot_backtest(forecast_rnn, ts, history_len=20)
../_images/tutorials_202-NN_examples_88_0.png

3.7 MLP#

A basic model with linear layers and activations.
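
Note that, unlike the recurrent models above, MLPModel does not consume the raw target history: input_size counts only the real-valued features (here, the lag columns), while categorical features are handled through embedding_sizes.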

[57]:
from etna.models.nn import MLPModel
[58]:
num_lags = 14

transform_lag = LagTransform(
    in_column="target",
    lags=[HORIZON + i for i in range(num_lags)],
    out_column="target_lag",
)
transform_date = DateFlagsTransform(
    day_number_in_week=True,
    day_number_in_month=False,
    day_number_in_year=False,
    week_number_in_month=False,
    week_number_in_year=False,
    month_number_in_year=False,
    season_number=False,
    year_number=False,
    is_weekend=False,
    out_column="dateflag",
)
segment_encoder = SegmentEncoderTransform()
label_encoder = LabelEncoderTransform(
    in_column="dateflag_day_number_in_week", strategy="none", out_column="dateflag_day_number_in_week_label"
)

embedding_sizes = {"dateflag_day_number_in_week_label": (7, 7)}
[59]:
set_seed()

model_mlp = MLPModel(
    input_size=num_lags,
    hidden_size=[16],
    embedding_sizes=embedding_sizes,
    decoder_length=HORIZON,
    trainer_params=dict(max_epochs=50, gradient_clip_val=0.1),
    lr=0.001,
    train_batch_size=16,
)
metrics = [SMAPE(), MAPE(), MAE()]

pipeline_mlp = Pipeline(model=model_mlp, transforms=[transform_lag, transform_date, label_encoder], horizon=HORIZON)
[60]:
metrics_mlp, forecast_mlp, fold_info_mlp = pipeline_mlp.backtest(ts, metrics=metrics, n_folds=3, n_jobs=1)
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs

  | Name      | Type           | Params
---------------------------------------------
0 | loss      | MSELoss        | 0
1 | embedding | MultiEmbedding | 56
2 | mlp       | Sequential     | 369
---------------------------------------------
425       Trainable params
0         Non-trainable params
425       Total params
0.002     Total estimated model params size (MB)
`Trainer.fit` stopped: `max_epochs=50` reached.
[Parallel(n_jobs=1)]: Done   1 tasks      | elapsed:    4.8s
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs

  | Name      | Type           | Params
---------------------------------------------
0 | loss      | MSELoss        | 0
1 | embedding | MultiEmbedding | 56
2 | mlp       | Sequential     | 369
---------------------------------------------
425       Trainable params
0         Non-trainable params
425       Total params
0.002     Total estimated model params size (MB)
`Trainer.fit` stopped: `max_epochs=50` reached.
[Parallel(n_jobs=1)]: Done   2 tasks      | elapsed:    9.8s
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs

  | Name      | Type           | Params
---------------------------------------------
0 | loss      | MSELoss        | 0
1 | embedding | MultiEmbedding | 56
2 | mlp       | Sequential     | 369
---------------------------------------------
425       Trainable params
0         Non-trainable params
425       Total params
0.002     Total estimated model params size (MB)
`Trainer.fit` stopped: `max_epochs=50` reached.
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:   14.8s
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:   14.8s
[Parallel(n_jobs=1)]: Done   1 tasks      | elapsed:    0.2s
[Parallel(n_jobs=1)]: Done   2 tasks      | elapsed:    0.3s
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:    0.4s
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:    0.4s
[Parallel(n_jobs=1)]: Done   1 tasks      | elapsed:    0.0s
[Parallel(n_jobs=1)]: Done   2 tasks      | elapsed:    0.1s
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:    0.1s
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:    0.1s
[61]:
score = metrics_mlp["SMAPE"].mean()
print(f"Average SMAPE for MLP: {score:.3f}")
Average SMAPE for MLP: 5.970
[62]:
plot_backtest(forecast_mlp, ts, history_len=20)
[image: backtest plot for the MLP model]

3.8 Deep State Model#

Deep State Model works well with multiple similar time series: it infers shared patterns across them.

We have to specify the type of seasonality in the data (based on its granularity); the SeasonalitySSM class is responsible for this. In this example we have daily data, so we use day-of-week (7 seasons) and day-of-month (31 seasons) seasonal models. We add the trend component using the LevelTrendSSM class. As features, the model uses time-based ones such as day-of-week and day-of-month, plus a time-independent feature representing the segment of the time series.

[63]:
from etna.models.nn import DeepStateModel
from etna.models.nn.deepstate import CompositeSSM
from etna.models.nn.deepstate import LevelTrendSSM
from etna.models.nn.deepstate import SeasonalitySSM
[64]:
from etna.transforms import FilterFeaturesTransform
[65]:
num_lags = 7

transforms = [
    SegmentEncoderTransform(),
    StandardScalerTransform(in_column="target"),
    DateFlagsTransform(
        day_number_in_week=True,
        day_number_in_month=True,
        day_number_in_year=False,
        week_number_in_month=False,
        week_number_in_year=False,
        month_number_in_year=False,
        season_number=False,
        year_number=False,
        is_weekend=False,
        out_column="dateflag",
    ),
    LagTransform(
        in_column="target",
        lags=[HORIZON + i for i in range(num_lags)],
        out_column="target_lag",
    ),
    LabelEncoderTransform(
        in_column="dateflag_day_number_in_week", strategy="none", out_column="dateflag_day_number_in_week_label"
    ),
    LabelEncoderTransform(
        in_column="dateflag_day_number_in_month", strategy="none", out_column="dateflag_day_number_in_month_label"
    ),
    FilterFeaturesTransform(exclude=["dateflag_day_number_in_week", "dateflag_day_number_in_month"]),
]


embedding_sizes = {
    "dateflag_day_number_in_week_label": (7, 7),
    "dateflag_day_number_in_month_label": (31, 7),
    "segment_code": (4, 7),
}
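The cardinalities here come directly from the data: 7 weekdays, 31 possible days of the month, and 4 segments. The last one is easy to double-check on the dataset:

print(len(ts.segments))  # 4 -> number of categories behind "segment_code"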
[66]:
# Map each timestamp to its 0-based season index: day of month (0-30) and day of week (0-6)
monthly_ssm = SeasonalitySSM(num_seasons=31, timestamp_transform=lambda x: x.day - 1)
weekly_ssm = SeasonalitySSM(num_seasons=7, timestamp_transform=lambda x: x.weekday())
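To make the timestamp_transform callables concrete, here is what they return for a sample timestamp (illustration only; inside the model they are applied to the dataset's own timestamps):

import pandas as pd

example = pd.Timestamp("2019-01-01")  # a Tuesday, the 1st of the month
print(example.weekday())  # 1 -> weekly season index in [0, 6]
print(example.day - 1)    # 0 -> monthly season index in [0, 30]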
[67]:
set_seed()

model_dsm = DeepStateModel(
    ssm=CompositeSSM(seasonal_ssms=[weekly_ssm, monthly_ssm], nonseasonal_ssm=LevelTrendSSM()),
    input_size=num_lags,
    encoder_length=2 * HORIZON,
    decoder_length=HORIZON,
    embedding_sizes=embedding_sizes,
    trainer_params=dict(max_epochs=5),
    lr=1e-3,
)

pipeline_dsm = Pipeline(
    model=model_dsm,
    horizon=HORIZON,
    transforms=transforms,
)
[68]:
metrics_dsm, forecast_dsm, fold_info_dsm = pipeline_dsm.backtest(ts, metrics=metrics, n_folds=3, n_jobs=1)
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs

  | Name       | Type           | Params
----------------------------------------------
0 | embedding  | MultiEmbedding | 315
1 | RNN        | LSTM           | 11.2 K
2 | projectors | ModuleDict     | 5.0 K
----------------------------------------------
16.5 K    Trainable params
0         Non-trainable params
16.5 K    Total params
0.066     Total estimated model params size (MB)
`Trainer.fit` stopped: `max_epochs=5` reached.
[Parallel(n_jobs=1)]: Done   1 tasks      | elapsed:   19.9s
(identical logs for folds 2 and 3 trimmed)
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:  1.0min
[69]:
score = metrics_dsm["SMAPE"].mean()
print(f"Average SMAPE for DeepStateModel: {score:.3f}")
Average SMAPE for DeepStateModel: 5.520
[70]:
plot_backtest(forecast_dsm, ts, history_len=20)
[image: backtest plot for the Deep State Model]

3.9 N-BEATS Model#

This architecture is based on backward and forward residual links and a deep stack of fully connected layers.

The library provides two variants of this model. The NBeatsGenericModel class implements the generic deep learning architecture, while NBeatsInterpretableModel adds inductive biases (dedicated trend and seasonality stacks) that make the model interpretable.

[71]:
from etna.models.nn import NBeatsGenericModel
from etna.models.nn import NBeatsInterpretableModel
[72]:
set_seed()

model_nbeats_generic = NBeatsGenericModel(
    input_size=2 * HORIZON,
    output_size=HORIZON,
    loss="smape",
    stacks=30,
    layers=4,
    layer_size=256,
    trainer_params=dict(max_epochs=1000),
    lr=1e-3,
)

pipeline_nbeats_generic = Pipeline(
    model=model_nbeats_generic,
    horizon=HORIZON,
    transforms=[],
)
[73]:
metrics_nbeats_generic, forecast_nbeats_generic, _ = pipeline_nbeats_generic.backtest(
    ts, metrics=metrics, n_folds=3, n_jobs=1
)
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs

  | Name  | Type        | Params
--------------------------------------
0 | model | NBeats      | 206 K
1 | loss  | NBeatsSMAPE | 0
--------------------------------------
206 K     Trainable params
0         Non-trainable params
206 K     Total params
0.826     Total estimated model params size (MB)
`Trainer.fit` stopped: `max_epochs=1000` reached.
[Parallel(n_jobs=1)]: Done   1 tasks      | elapsed:  1.4min
(identical logs for folds 2 and 3 trimmed)
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:  4.1min
[74]:
score = metrics_nbeats_generic["SMAPE"].mean()
print(f"Average SMAPE for N-BEATS Generic: {score:.3f}")
Average SMAPE for N-BEATS Generic: 5.674
[75]:
plot_backtest(forecast_nbeats_generic, ts, history_len=20)
[image: backtest plot for N-BEATS Generic]
[76]:
model_nbeats_interp = NBeatsInterpretableModel(
    input_size=4 * HORIZON,
    output_size=HORIZON,
    loss="smape",
    trend_layer_size=64,
    seasonality_layer_size=256,
    trainer_params=dict(max_epochs=2000),
    lr=1e-3,
)

pipeline_nbeats_interp = Pipeline(
    model=model_nbeats_interp,
    horizon=HORIZON,
    transforms=[],
)
[77]:
metrics_nbeats_interp, forecast_nbeats_interp, _ = pipeline_nbeats_interp.backtest(
    ts, metrics=metrics, n_folds=3, n_jobs=1
)
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs

  | Name  | Type        | Params
--------------------------------------
0 | model | NBeats      | 224 K
1 | loss  | NBeatsSMAPE | 0
--------------------------------------
223 K     Trainable params
385       Non-trainable params
224 K     Total params
0.896     Total estimated model params size (MB)
`Trainer.fit` stopped: `max_epochs=2000` reached.
[Parallel(n_jobs=1)]: Done   1 tasks      | elapsed:  1.6min
(identical logs for folds 2 and 3 trimmed)
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed:  4.8min
[78]:
score = metrics_nbeats_interp["SMAPE"].mean()
print(f"Average SMAPE for N-BEATS Interpretable: {score:.3f}")
Average SMAPE for N-BEATS Interpretable: 5.441
[79]:
plot_backtest(forecast_nbeats_interp, ts, history_len=20)
[image: backtest plot for N-BEATS Interpretable]

3.10 PatchTS Model#

PatchTS is a model with a transformer encoder that consumes patches of the time series as input tokens ("words") and a linear decoder that produces the forecast.
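The patch_len parameter controls how many consecutive points form one token; with patch_len=1 every time step becomes its own token. A back-of-the-envelope count, assuming non-overlapping patches (the exact striding is an implementation detail of the model):

encoder_length = 2 * HORIZON  # matches the configuration below
for patch_len in [1, 2, 7]:
    print(f"patch_len={patch_len}: encoder sees ~{encoder_length // patch_len} tokens")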

[80]:
from etna.models.nn import PatchTSModel
[81]:
set_seed()

model_patchts = PatchTSModel(
    decoder_length=HORIZON,
    encoder_length=2 * HORIZON,
    patch_len=1,
    trainer_params=dict(max_epochs=30),
    lr=1e-3,
    train_batch_size=64,
)

pipeline_patchts = Pipeline(
    model=model_patchts, horizon=HORIZON, transforms=[StandardScalerTransform(in_column="target")]
)

metrics_patchts, forecast_patchts, fold_info_patchts = pipeline_patchts.backtest(
    ts, metrics=metrics, n_folds=3, n_jobs=1
)
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs

  | Name       | Type       | Params
------------------------------------------
0 | loss       | MSELoss    | 0
1 | model      | Sequential | 397 K
2 | projection | Sequential | 1.8 K
------------------------------------------
399 K     Trainable params
0         Non-trainable params
399 K     Total params
1.598     Total estimated model params size (MB)
`Trainer.fit` stopped: `max_epochs=30` reached.
[Parallel(n_jobs=1)]: Done   1 tasks      | elapsed:  3.7min
(identical logs for the remaining folds trimmed)
[Parallel(n_jobs=1)]: Done   3 tasks      | elapsed: 11.5min
[82]:
score = metrics_patchts["SMAPE"].mean()
print(f"Average SMAPE for PatchTS: {score:.3f}")
Average SMAPE for PatchTS: 5.852
[83]:
plot_backtest(forecast_patchts, ts, history_len=20)
[image: backtest plot for PatchTS]
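To wrap up, we can put the averaged SMAPE scores of the models benchmarked in this section side by side (a convenience snippet; the values are the ones printed above):

import pandas as pd

summary = pd.Series(
    {
        "MLP": metrics_mlp["SMAPE"].mean(),
        "DeepStateModel": metrics_dsm["SMAPE"].mean(),
        "N-BEATS Generic": metrics_nbeats_generic["SMAPE"].mean(),
        "N-BEATS Interpretable": metrics_nbeats_interp["SMAPE"].mean(),
        "PatchTS": metrics_patchts["SMAPE"].mean(),
    },
    name="SMAPE",
)
print(summary.sort_values())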