Title: | Resampling Tools for Time Series Forecasting |
---|---|
Description: | A 'modeltime' extension that implements forecast resampling tools that assess time-based model performance and stability for a single time series, panel data, and cross-sectional time series analysis. |
Authors: | Matt Dancho [aut, cre], Business Science [cph] |
Maintainer: | Matt Dancho <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.2.3.9000 |
Built: | 2024-10-31 04:22:19 UTC |
Source: | https://github.com/business-science/modeltime.resample |
An internal function used by unnest_modeltime_resamples().
get_target_text_from_resamples(data, column_before_target = ".row")
data |
Unnested resample results |
column_before_target |
The text column located before the target variable. This is ".row". |
# The .resample_results column is deeply nested
m750_training_resamples_fitted

# Unnest and prepare the resample predictions for evaluation
unnest_modeltime_resamples(m750_training_resamples_fitted) %>%
    get_target_text_from_resamples()
Time Series Cross Validation Resample Predictions (Results) from the M750 Data (Training Set)
m750_training_resamples_fitted
A Modeltime Table that has been fitted to resamples, with predictions in the .resample_results column.
m750_training_resamples_fitted <- m750_models %>%
    modeltime_fit_resamples(
        resamples = m750_training_resamples,
        control   = control_resamples(verbose = TRUE)
    )
m750_training_resamples_fitted
Resampled predictions are commonly used for:
Analyzing the accuracy and stability of models
Providing inputs to ensemble methods (refer to the modeltime.ensemble package)
modeltime_fit_resamples(object, resamples, control = control_resamples())
object |
A Modeltime Table |
resamples |
An |
control |
A |
The function is a wrapper for tune::fit_resamples() to iteratively train and predict models contained in a Modeltime Table on resample objects.
One difference between tune::fit_resamples() and modeltime_fit_resamples() is that predictions are always returned (i.e. control = tune::control_resamples(save_pred = TRUE)). This is needed for ensemble_model_spec().
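As a minimal sketch of the control setting mentioned above, the following builds a control object with predictions saved. It assumes the tune package is installed; the variable name ctrl is illustrative only and does not come from the package documentation.

```r
library(tune)

# Build a resampling control object that keeps the out-of-sample predictions.
# `ctrl` is a hypothetical name chosen for this sketch.
ctrl <- control_resamples(verbose = FALSE, save_pred = TRUE)

# The flag is stored on the returned control object
ctrl$save_pred
```

Because modeltime_fit_resamples() is described as always returning predictions, setting save_pred explicitly should only matter when calling tune::fit_resamples() directly.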
Resampled Prediction Accuracy
Calculating accuracy metrics on models fit to resamples helps in understanding model performance and stability under different forecasting windows. See modeltime_resample_accuracy() for getting resampled prediction accuracy for each model.
Ensembles
Fitting and predicting resamples is useful for creating stacked ensembles using modeltime.ensemble::ensemble_model_spec(). The sub-model cross-validation predictions are used as the input to the meta-learner model.
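The stacking idea can be illustrated with a conceptual sketch in base R. This is not how modeltime.ensemble is implemented: the data frame cv_preds, its values, and the choice of lm() as the meta-learner are all assumptions made for illustration.

```r
# Hypothetical out-of-sample (cross-validation) predictions from two sub-models,
# alongside the actual values. All numbers are made up for this sketch.
cv_preds <- data.frame(
  actual  = c(10, 12, 15, 11, 14, 18),
  model_1 = c( 9, 13, 14, 12, 13, 17),
  model_2 = c(11, 11, 16, 10, 15, 20)
)

# The meta-learner treats the sub-model predictions as its input features
meta_learner <- lm(actual ~ model_1 + model_2, data = cv_preds)

# Combine new sub-model predictions into a single stacked prediction
new_preds    <- data.frame(model_1 = 16, model_2 = 17)
stacked_pred <- predict(meta_learner, newdata = new_preds)
stacked_pred
```

Using resampled predictions rather than in-sample fitted values means the meta-learner is trained on predictions the sub-models made for data they had not seen.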
A Modeltime Table (mdl_time_tbl) object with a column containing resample results (.resample_results).
library(tidymodels)
library(modeltime)
library(timetk)
library(magrittr)

# Make resamples
resamples_tscv <- training(m750_splits) %>%
    time_series_cv(
        assess      = "2 years",
        initial     = "5 years",
        skip        = "2 years",
        # Normally we do more than one slice, but this speeds up the example
        slice_limit = 1
    )

# Fit and generate resample predictions
m750_models_resample <- m750_models %>%
    modeltime_fit_resamples(
        resamples = resamples_tscv,
        control   = control_resamples(verbose = TRUE)
    )

# A new data frame is created from the Modeltime Table
# with a column labeled, '.resample_results'
m750_models_resample
This is a wrapper for yardstick that simplifies time series regression accuracy metric calculations from a Modeltime Table that has been resampled and fitted using modeltime_fit_resamples().
modeltime_resample_accuracy(
  object,
  summary_fns = mean,
  metric_set = default_forecast_accuracy_metric_set(),
  ...
)
object |
A Modeltime Table with a column '.resample_results' (the output of modeltime_fit_resamples()). |
summary_fns |
One or more functions to analyze resamples. The default is mean. |
metric_set |
A |
... |
Additional arguments passed to the function calls in summary_fns. |
Default Accuracy Metrics
The following accuracy metrics are included by default via modeltime::default_forecast_accuracy_metric_set():
MAE - Mean absolute error, yardstick::mae()
MAPE - Mean absolute percentage error, yardstick::mape()
MASE - Mean absolute scaled error, yardstick::mase()
SMAPE - Symmetric mean absolute percentage error, yardstick::smape()
RMSE - Root mean squared error, yardstick::rmse()
RSQ - R-squared, yardstick::rsq()
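The metrics listed above are ordinary yardstick functions, so a smaller custom set can be assembled with yardstick::metric_set(). The sketch below applies such a set to a toy data frame; the data, column names, and the name custom_metrics are assumptions for illustration. Passing a custom set through the metric_set argument is presumably how the default would be overridden.

```r
library(yardstick)

# Toy actuals and predictions (made-up values, not from the M750 data)
toy <- data.frame(
  actual = c(100, 110, 120, 130),
  pred   = c( 98, 113, 119, 135)
)

# A custom metric set containing a subset of the default metrics
custom_metrics <- metric_set(mae, rmse, rsq)

# Returns one row per metric with the columns .metric, .estimator, .estimate
metric_results <- custom_metrics(toy, truth = actual, estimate = pred)
metric_results
```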
Summary Functions
By default, modeltime_resample_accuracy() returns the average accuracy metrics for each resample prediction.
The user can change this default behavior using summary_fns. Pass one or more summary functions. Internally, the functions are passed to dplyr::across(.fns), which applies the summary functions.
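To show what passing a named list of functions to dplyr::across(.fns) produces, here is a small standalone sketch. The tibble resample_metrics and its values are hypothetical and only mimic per-resample metrics; the actual internal code of modeltime_resample_accuracy() may differ.

```r
library(dplyr)

# Hypothetical per-resample metrics (made-up numbers, not M750 results)
resample_metrics <- tibble(
  .model_id    = c(1, 1, 1, 2, 2, 2),
  .resample_id = rep(c("Slice1", "Slice2", "Slice3"), times = 2),
  mae          = c(10, 12, 14, 20, 22, 27),
  rmse         = c(15, 18, 21, 30, 31, 38)
)

# A named list of summary functions yields one column per metric/function pair
summary_tbl <- resample_metrics %>%
  group_by(.model_id) %>%
  summarise(
    across(c(mae, rmse), .fns = list(mean = mean, sd = sd)),
    .groups = "drop"
  )

summary_tbl
```

With a named list, across() names the output columns by joining the metric and function names with an underscore (for example mae_mean and mae_sd), which is why naming the list elements is helpful.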
Returning Unsummarized Results
You can pass summary_fns = NULL to return unsummarized results by .resample_id.
Professional Tables (Interactive & Static)
Use modeltime::table_modeltime_accuracy() to format the results for reporting in reactable (interactive) or gt (static) formats, which are well suited to Shiny Apps (interactive) and PDF Reports (static).
library(modeltime)

# Mean (Default)
m750_training_resamples_fitted %>%
    modeltime_resample_accuracy() %>%
    table_modeltime_accuracy(.interactive = FALSE)

# Mean and Standard Deviation
m750_training_resamples_fitted %>%
    modeltime_resample_accuracy(
        summary_fns = list(mean = mean, sd = sd)
    ) %>%
    table_modeltime_accuracy(.interactive = FALSE)

# When summary_fns = NULL, returns the unsummarized resample results
m750_training_resamples_fitted %>%
    modeltime_resample_accuracy(
        summary_fns = NULL
    )
A convenient plotting function for visualizing resampling accuracy by resample set for each model in a Modeltime Table.
plot_modeltime_resamples(
  .data,
  .metric_set = default_forecast_accuracy_metric_set(),
  .summary_fn = mean,
  ...,
  .facet_ncol = NULL,
  .facet_scales = "free_x",
  .point_show = TRUE,
  .point_size = 1,
  .point_shape = 16,
  .point_alpha = 1,
  .summary_line_show = TRUE,
  .summary_line_size = 0.5,
  .summary_line_type = 1,
  .summary_line_alpha = 1,
  .x_intercept = NULL,
  .x_intercept_color = "red",
  .x_intercept_size = 0.5,
  .legend_show = TRUE,
  .legend_max_width = 40,
  .title = "Resample Accuracy Plot",
  .x_lab = "",
  .y_lab = "",
  .color_lab = "Legend",
  .interactive = TRUE
)
.data |
A modeltime table that includes a column .resample_results |
.metric_set |
A |
.summary_fn |
A single summary function that is applied to aggregate the metrics across resample sets. Default: mean. |
... |
Additional arguments passed to the .summary_fn. |
.facet_ncol |
Default: NULL. |
.facet_scales |
Default: "free_x". |
.point_show |
Whether or not to show the individual points for each combination of models and metrics. Default: TRUE. |
.point_size |
Controls the point size. Default: 1. |
.point_shape |
Controls the point shape. Default: 16. |
.point_alpha |
Controls the opacity of the points. Default: 1 (full opacity). |
.summary_line_show |
Whether or not to show the summary lines. Default: TRUE. |
.summary_line_size |
Controls the summary line width. Default: 0.5. |
.summary_line_type |
Controls the summary line type. Default: 1. |
.summary_line_alpha |
Controls the summary line opacity. Default: 1 (full opacity). |
.x_intercept |
Numeric. Adds an x-intercept at a location (e.g. 0). Default: NULL. |
.x_intercept_color |
Controls the x-intercept color. Default: "red". |
.x_intercept_size |
Controls the x-intercept linewidth. Default: 0.5. |
.legend_show |
Logical. Whether or not to show the legend. Can save space with long model descriptions. |
.legend_max_width |
Numeric. The width of truncation to apply to the legend text. |
.title |
Title for the plot |
.x_lab |
X-axis label for the plot |
.y_lab |
Y-axis label for the plot |
.color_lab |
Legend label if a |
.interactive |
Returns either a static ( |
Default Accuracy Metrics
The following accuracy metrics are included by default via modeltime::default_forecast_accuracy_metric_set():
MAE - Mean absolute error, yardstick::mae()
MAPE - Mean absolute percentage error, yardstick::mape()
MASE - Mean absolute scaled error, yardstick::mase()
SMAPE - Symmetric mean absolute percentage error, yardstick::smape()
RMSE - Root mean squared error, yardstick::rmse()
RSQ - R-squared, yardstick::rsq()
Summary Function
Users can supply a single summary function (e.g. mean) to summarize the resample metrics by each model.
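As a hedged sketch of swapping the summary function, the call below passes median in place of the default mean. It assumes the modeltime.resample package and its bundled m750_training_resamples_fitted data set are available, and that median is accepted like any other single summary function; the object name p is illustrative.

```r
library(modeltime.resample)
library(magrittr)

# Assumption: any single summary function, such as median, can be supplied
p <- m750_training_resamples_fitted %>%
    plot_modeltime_resamples(
        .summary_fn  = median,
        .interactive = FALSE
    )

p
```

A median summary line is less sensitive than the mean to a single resample slice with unusually poor accuracy.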
m750_training_resamples_fitted %>%
    plot_modeltime_resamples(
        .interactive = FALSE
    )
An internal function used by modeltime_resample_accuracy().
unnest_modeltime_resamples(object)
object |
A Modeltime Table that has a column '.resample_results' |
The following data columns are unnested and prepared for evaluation:
.row_id - A unique identifier to compare observations.
.resample_id - A unique identifier given to the resample iteration.
.model_id and .model_desc - Modeltime Model ID and Description
.pred - The Resample Prediction Value
.row - The actual row value from the original dataset
Actual Value Column - Named after the target variable in the dataset
Tibble with columns for '.row_id', '.resample_id', '.model_id', '.model_desc', '.pred', '.row', and actual value name from the data set
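To illustrate how a tibble with these columns lends itself to evaluation, the sketch below computes MAE per resample and model, then averages across resamples. The tibble unnested, its values, and the target column name value are hypothetical, and this is not the package's internal implementation.

```r
library(dplyr)
library(yardstick)

# Hypothetical unnested resample predictions (made-up values).
# `value` stands in for the actual value column named after the target.
unnested <- tibble(
  .resample_id = rep(c("Slice1", "Slice2"), each = 4),
  .model_id    = rep(rep(c(1L, 2L), each = 2), times = 2),
  .model_desc  = rep(rep(c("MODEL A", "MODEL B"), each = 2), times = 2),
  .pred        = c(101, 99, 104, 98, 110, 108, 115, 103),
  .row         = c(11L, 12L, 11L, 12L, 9L, 10L, 9L, 10L),
  value        = c(100, 100, 100, 100, 110, 110, 110, 110)
)

# Accuracy for each resample slice and model
mae_by_slice <- unnested %>%
  group_by(.resample_id, .model_id, .model_desc) %>%
  mae(truth = value, estimate = .pred)

# Average across resample slices for each model
mae_by_model <- mae_by_slice %>%
  group_by(.model_id, .model_desc) %>%
  summarise(mae = mean(.estimate), .groups = "drop")

mae_by_model
```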
# The .resample_results column is deeply nested
m750_training_resamples_fitted

# Unnest and prepare the resample predictions for evaluation
unnest_modeltime_resamples(m750_training_resamples_fitted)