
lstm time series prediction in R


It turns out that deep learning, with all its power, can also be used for forecasting, especially with the LSTM (Long Short Term Memory) model, which has proved useful for problems involving sequences with autocorrelation.

Long Short Term Memory networks are a kind of Recurrent Neural Network (RNN) capable of learning long-term dependencies. An LSTM can persist a long-term state in addition to the short-term one, which traditional RNNs have difficulty with. This ability to maintain state and recognize patterns over the length of the series makes LSTMs quite useful in time series prediction tasks involving autocorrelation.

Here I show how to implement a forecasting LSTM model in R.

HOW TO

The LSTM model is available in the keras R package, which runs on top of TensorFlow.

Before we start, we need to install and load both of them:
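A minimal setup might look like this (install_keras() sets up the Python backend, including TensorFlow):

install.packages("keras")
library(keras)
install_keras() # installs the TensorFlow backend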

 

data preparation

For the purpose of this example I used the economics dataset from the ggplot2 package. I want to predict unemployment over the following 12 months (the unemploy column).

The LSTM model requires us to rescale the input data. The mean and standard deviation of the training dataset can be used as the scaling coefficients for the training and testing datasets as well as for the predicted values. This way we ensure that the scaling does not impact the model.

If you wish to train-test the model, you should start with a data split. As I focused on creating the prediction rather than on the accuracy of the model itself, I used the full dataset for training.
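A sketch of the loading and scaling step (the scale_factors and scaled_train names are reused throughout the rest of the post):

library(ggplot2) # provides the economics dataset
library(dplyr)

# mean and sd of the training set become the scaling coefficients
scale_factors <- c(mean(economics$unemploy), sd(economics$unemploy))

scaled_train <- economics %>%
  dplyr::select(unemploy) %>%
  dplyr::mutate(unemploy = (unemploy - scale_factors[1]) / scale_factors[2])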

 

The LSTM algorithm creates predictions based on lagged values: it needs to look back at as many previous values as the number of points we wish to predict. As we want to do a 12-month forecast, we need to base each prediction on 12 data points.

For demonstration purposes let’s say our series has 10 data points [1, 2, 3, …, 10] and we want to predict 3 values. The predictors (X) and the target (Y) then take the form:

X = [1, 2, 3]   Y = [4, 5, 6]
X = [2, 3, 4]   Y = [5, 6, 7]
X = [3, 4, 5]   Y = [6, 7, 8]
X = [4, 5, 6]   Y = [7, 8, 9]
X = [5, 6, 7]   Y = [8, 9, 10]

 

Additionally, the keras LSTM expects a specific tensor format: a 3D array of the form [samples, timesteps, features], both for the predictor (X) and the target (Y) values:

  • samples specifies the number of observations which will be processed in batches.
  • timesteps tells us the number of time steps (lags), or in other words, how many units back in time we want our network to see.
  • features specifies the number of predictors (1 for a univariate series, n for a multivariate one).

For the predictors this translates to an array of dimensions (nrow(data) - lag - prediction + 1, 12, 1), where lag = prediction = 12.

We lag the data 11 times, so that each prediction is based on 12 values, and arrange the lagged values into columns. Then we transform it all into the desired 3D form.
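A sketch of this step (with scaled_train converted to a matrix first):

prediction <- 12
lag <- prediction

scaled_train <- as.matrix(scaled_train)

# each row holds 12 consecutive values; consecutive rows shift by one step
x_train_data <- t(sapply(
  1:(length(scaled_train) - lag - prediction + 1),
  function(x) scaled_train[x:(x + lag - 1), 1]
))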

Basically, every column is a lagged version of the previous one; the last one is lagged by 11 steps compared with the first.

And the final version:
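A sketch, keeping the [samples, timesteps, features] order:

x_train_arr <- array(
  data = as.numeric(unlist(x_train_data)),
  dim = c(nrow(x_train_data), lag, 1)
)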

The data was turned into a 3D array. As we have only one predictor, the last dimension equals one.

 

Now we apply a similar transformation to the Y values:
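A sketch, mirroring the X transformation:

y_train_data <- t(sapply(
  (1 + lag):(length(scaled_train) - prediction + 1),
  function(x) scaled_train[x:(x + prediction - 1)]
))

y_train_arr <- array(
  data = as.numeric(unlist(y_train_data)),
  dim = c(nrow(y_train_data), prediction, 1)
)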

 

In the same manner we need to prepare the input data for the prediction, which consists of the last 12 observations from our training set.

We need to scale and transform them:
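A sketch, scaling with the same coefficients as the training data:

x_test <- economics$unemploy[(nrow(economics) - prediction + 1):nrow(economics)]
x_test_scaled <- (x_test - scale_factors[1]) / scale_factors[2]

# a single sample of 12 timesteps and 1 feature
x_pred_arr <- array(
  data = x_test_scaled,
  dim = c(1, lag, 1)
)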

 

lstm prediction

We can build an LSTM model using the keras_model_sequential function and adding layers on top of it. The first LSTM layer takes the required input shape of [batch size, timesteps, features]. We set return_sequences = TRUE and stateful = TRUE for both layers. The second layer is the same, with the exception of batch_input_shape, which only needs to be specified in the first layer.
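A sketch of such a model (the layer size of 50 units and the dropout rate of 0.5 are illustrative choices):

lstm_model <- keras_model_sequential()

lstm_model %>%
  layer_lstm(units = 50, # size of the layer
             batch_input_shape = c(1, 12, 1), # batch size, timesteps, features
             return_sequences = TRUE,
             stateful = TRUE) %>%
  # fraction of the units to drop for the linear transformation of the inputs
  layer_dropout(rate = 0.5) %>%
  layer_lstm(units = 50,
             return_sequences = TRUE,
             stateful = TRUE) %>%
  layer_dropout(rate = 0.5) %>%
  time_distributed(layer_dense(units = 1))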

You can also try out different activation functions with the activation parameter (the hyperbolic tangent, tanh, is the default).

Also choose a loss function for the optimization, a type of optimizer, and a metric for assessing the model performance. Info about different optimizers can be found here.
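For instance, with the mean absolute error and the adam optimizer:

lstm_model %>%
  compile(loss = 'mae', optimizer = 'adam', metrics = 'accuracy')

summary(lstm_model)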

Next, we can fit our stateful LSTM. We set shuffle = FALSE to preserve the order of the time series:
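A sketch (20 epochs is an illustrative choice):

lstm_model %>% fit(
  x = x_train_arr,
  y = y_train_arr,
  batch_size = 1,
  epochs = 20,
  verbose = 0,
  shuffle = FALSE
)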

And perform the prediction:
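A sketch, remembering to reverse the scaling afterwards:

lstm_forecast <- lstm_model %>%
  predict(x_pred_arr, batch_size = 1) %>%
  .[, , 1]

# rescale back to the original units
lstm_forecast <- lstm_forecast * scale_factors[2] + scale_factors[1]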

 

The LSTM model is trickier than regular time series models, because you do not pass an explicit number of prediction points for the forecast. Instead, you need to design the model so that it forecasts the desired number of periods. If you wish to predict more, you need to provide additional columns in your prediction set, containing values predicted for the previous periods. Following this example, predicting the 13th month ahead requires “knowing” the result for the 1st month ahead. The other option is to rebuild the model to predict 13 values instead of 12.

 

forecast object

As we have the predicted values, we can turn the results into a forecast object, like the one we would get when using the forecast package. That will allow us, for example, to use the forecast::autoplot function to plot the results of the prediction. In order to do so, we need to define several objects that make up a forecast object.

 
prediction on a train set

Prediction on the training set will provide us with 12 results for each input period, so we need to transform the data to get only one prediction per date.

Because our forecast starts with a 12-month offset, we need to provide artificial (or real) values for those first months:
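A sketch of one way to do this: take the first predicted value for each date, rescale, and mark the initial 12 months as not available:

fitted <- predict(lstm_model, x_train_arr, batch_size = 1)[, , 1]

# keep one prediction per date: the first forecast made for each period,
# plus the remaining forecasts of the last sample
fit <- c(fitted[, 1], fitted[nrow(fitted), 2:ncol(fitted)])

# rescale and pad the first 12 months, which have no fitted values
fitted <- fit * scale_factors[2] + scale_factors[1]
fitted <- c(rep(NA, lag), fitted)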

 
prediction in a form of ts object

We need to change the predicted values into a time series object:
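For example, with timetk (the economics series ends in April 2015, so the 12 forecast months run from May 2015 to April 2016):

lstm_forecast <- timetk::tk_ts(
  lstm_forecast,
  start = c(2015, 5),
  end = c(2016, 4),
  frequency = 12
)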

 
input series

Additionally, we need to transform the economics data into a time series object:
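A sketch (the series starts in July 1967):

input_ts <- timetk::tk_ts(
  economics$unemploy,
  start = c(1967, 7),
  frequency = 12
)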

 
forecast object

Finally we can define the forecast object:
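A sketch of the minimal set of components (no prediction intervals here, so lower and upper bounds are left out):

forecast_list <- list(
  model = NULL,
  method = "LSTM",
  mean = lstm_forecast, # point forecasts as a ts object
  x = input_ts, # the input series
  fitted = fitted, # fitted values on the training set
  residuals = as.numeric(input_ts) - as.numeric(fitted)
)
class(forecast_list) <- "forecast"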

Now we can easily plot the data:
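With the forecast package loaded, a one-liner does the job:

forecast::autoplot(forecast_list)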

 

lstm prediction with regressors

Handling regressors in an LSTM comes down to treating the series as multivariate instead of univariate. Let’s create a random regressor for this example:
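A sketch, drawing the values at random (set.seed keeps it reproducible):

set.seed(123)
regressor <- rnorm(nrow(economics))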

As with the training set, we need to scale the regressor as well:
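Using the same mean/sd approach:

scale_factors_reg <- c(mean(regressor), sd(regressor))
scaled_reg <- (regressor - scale_factors_reg[1]) / scale_factors_reg[2]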

Now we can add it to the training data and transform everything into tensors as previously.
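A sketch, with the lag matrix built once per feature (the loop generalizes to any number of regressors):

# combine training data with regressors
x_train <- cbind(scaled_train, scaled_reg)

x_train_data <- list()

# build one lagged matrix per feature
for (i in 1:ncol(x_train)) {
  x_train_data[[i]] <- t(sapply(
    1:(nrow(x_train) - lag - prediction + 1),
    function(x) x_train[x:(x + lag - 1), i]
  ))
}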

This time we end up with 2 records in our list: the input data in the exact same form as previously, and the regressor following the same logic.

Again, we transform this list into a 3D array:
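A sketch; the third dimension now holds the 2 features:

x_train_arr <- array(
  data = as.numeric(unlist(x_train_data)),
  dim = c(nrow(x_train_data[[1]]), lag, 2)
)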

 

We also need to modify the prediction data to include the regressor, in the same manner as for the training:
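A sketch, stacking the last 12 observations of the series and of the regressor:

x_test_data <- c(
  scaled_train[(nrow(scaled_train) - prediction + 1):nrow(scaled_train), 1],
  scaled_reg[(length(scaled_reg) - prediction + 1):length(scaled_reg)]
)

x_pred_arr <- array(
  data = x_test_data,
  dim = c(1, lag, 2)
)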

The rest of the modeling stays the same, except that batch_input_shape becomes c(1, 12, 2) to account for the second feature.

14 thoughts on “lstm time series prediction in R”

  • Many thanks for this article.
    I am comfortable with R; Python’s system management and version compatibilities are insurmountable for many R users.
    It is a lot easier to install TF and keras as the root user than to install and configure them for a non-admin user. As root, everything ran on the first go.
    I have a project on sparse time series and I am/was determined to stick to R whatever it takes, and your article was perfect in every manner.
    Just a few things I had to guess, such as installing forecast and timetk.
    Thanks once again, really obliged.

  • Good evening!

    In the section “lstm prediction with regressors”, where it says # combine training data with regressors
    x_train <- cbind(scaled_train, scaled_reg)
    x_train_data <- list()

    Which ones are the vectors "scaled_train" and "scaled_reg"?

    Thanks a lot for your help.

    • Hi,
      the “scaled_train” object is defined in the “data preparation” section; generally, it is a standardized time series.

      “scaled_reg” is described in “lstm prediction with regressors”. Here again we have a standardized regressor for the prediction.

  • I’ve done a lot of searching to find tutorials on applying LSTM to time series forecasting; yours is the best one.
    Very helpful. I’ll be following your work.
    Many thanks for this.

  • I’m trying to reproduce your work here, but am running into the following:

    > lstm_model <- keras_model_sequential()

    *** caught illegal operation ***
    address 0x187ecf874, cause 'illegal trap'

    Traceback:
    1: py_module_import(module, convert = convert)
    2: import(module)
    3: doTryCatch(return(expr), name, parentenv, handler)
    4: tryCatchOne(expr, names, parentenv, handlers[[1L]])
    5: tryCatchList(expr, classes, parentenv, handlers)
    6: tryCatch(import(module), error = clear_error_handler())
    7: py_resolve_module_proxy(x)
    8: $.python.builtin.module(keras, "models")
    9: keras$models
    10: keras_model_sequential()

    Possible actions:
    1: abort (with core dump, if enabled)
    2: normal R exit
    3: exit R without saving workspace
    4: exit R saving workspace
    Selection:

    Any thoughts how to fix this? I'm not that familiar with Python unfortunately. Working on a MacBook Pro M1.

  • Thank you for sharing this very nice model!

    I have an unsolved problem while developing an LSTM time series model.
    I want to develop a multivariate LSTM prediction model, but I don’t know how to set the features parameter for it, and I don’t know how to prepare multivariate datasets to feed into the model.
    Can you let me know how to solve this problem?

  • This is super helpful, but any suggestions on how to generate predictions on new data into the future? I am guessing you’d have to predict one point at a time into the future, and use the preceding prediction as the lagged value?

    • Actually, the example generates a prediction 12 months ahead. While building the model you need to set the data lag equal to the desired prediction period. Then, as above, each 12 observations predict the next 12 observations, so the last 12 points from your dataset are used to generate the next 12. If you wish to predict 24 periods, you can lag the data 24 times or perform the 12-point prediction twice 🙂

  • Thank you for example. The first part with one regressor works but the second with two doesn’t. Would you please send me the whole code of the example or publish it. That is what I wrote:
    lstm_model_m <- keras_model_sequential() %>%
    layer_lstm(units = 50, # size of the layer
    batch_input_shape = c(1, 12, 2), # batch size, timesteps, features
    return_sequences = TRUE,
    stateful = TRUE) %>%
    # fraction of the units to drop for the linear transformation of the inputs
    layer_dropout(rate = 0.5) %>%
    layer_lstm(units = 50,
    return_sequences = TRUE,
    stateful = TRUE) %>%
    layer_dropout(rate = 0.5) %>%
    time_distributed(keras::layer_dense(units = 1))
    lstm_model %>%
    compile(loss = 'mae', optimizer = 'adam', metrics = 'accuracy')
    summary(lstm_model)
    lstm_model %>% fit(
    x = x_train_arr_m,
    y = y_train_arr,
    batch_size = 1,
    epochs = 20,
    verbose = 0,
    shuffle = FALSE
    )
    I included features=2 in
    batch_input_shape = c(1, 12, 2)
    but response was
    WARNING:tensorflow:Model was constructed with shape (1, 12, 1) for input KerasTensor(type_spec=TensorSpec(shape=(1, 12, 1), dtype=tf.float32, name='lstm_7_input'), name='lstm_7_input', description="created by layer 'lstm_7_input'"), but it was called on an input with incompatible shape (None, 12, 2).
    Thank you

    • Sure, hope that helps.

  • Hi, I am facing this issue, please help:
    Error in loadNamespace(name): there is no package called ‘timetk’
    Traceback:

    1. timetk::tk_ts
    2. getExportedValue(pkg, name)
    3. asNamespace(ns)
    4. getNamespace(ns)
    5. loadNamespace(name)
    6. withRestarts(stop(cond), retry_loadNamespace = function() NULL)
    7. withOneRestart(expr, restarts[[1L]])
    8. doWithOneRestart(return(expr), restart)

  • Thanks for posting this blog. It’s really helpful.

    I have a question regarding your code.
    y_train_data <- t(sapply(
    (1 + lag):(length(scaled_train) - prediction + 1),
    function(x) scaled_train[x:(x + prediction - 1)]
    ))

    I think you don't need t() for this matrix?

    Have a great day 🙂
