The anomalize R package is now available in timetk:
anomalize(): A single function that breaks down, identifies, and cleans anomalies.
plot_anomalies(): Visualize the anomalies and anomaly bands.
plot_anomalies_decomp(): Visualize the time series decomposition. Make adjustments as needed.
plot_anomalies_cleaned(): Visualize the before/after of cleaning anomalies.
Note - anomalize(.method): Only .method = "stl" is supported at this time. The "twitter" method is also planned.
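As a rough illustration of the workflow listed above, the sketch below runs anomalize() per group and then draws the anomaly plot. It assumes the walmart_sales_weekly dataset bundled with timetk (columns id, Date, Weekly_Sales) and the .date_var, .value, .message, .facet_ncol, and .interactive argument names; check the function help pages before relying on them.

```r
library(dplyr)
library(timetk)

# Assumption: walmart_sales_weekly ships with timetk and `id` identifies each series.
anomalize_tbl <- walmart_sales_weekly %>%
  group_by(id) %>%
  anomalize(
    .date_var = Date,
    .value    = Weekly_Sales,
    .method   = "stl",   # the only method supported at this time
    .message  = FALSE
  )

# Re-group before plotting so each series gets its own facet.
anomaly_plot <- anomalize_tbl %>%
  group_by(id) %>%
  plot_anomalies(Date, .facet_ncol = 2, .interactive = FALSE)

anomaly_plot
```

The same anomalize_tbl can be passed to plot_anomalies_decomp() and plot_anomalies_cleaned() to inspect the decomposition and the cleaned series.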
Update forecasting vignette: Use glmnet for time series forecasting.
CRAN Fixes:
tzdata time zone fixes:
@aliases to timetk-package
robets
tidyquant from examples
tidyverse from examples
FANG dataset to timetk (port from tidyquant)
New Features
plot_time_series(): Gets new arguments to specify .x_intercept and .x_intercept_color. #131
Fixes
plot_time_series() when .group_names is not found. #121
recipes >= 1.0.3 #132
facet_trelliscope() plotting parameters.
plot_time_series()
plot_time_series_boxplot()
plot_anomaly_diagnostics()
New Features
Many of the plotting functions have been upgraded for use with trelliscopejs for easier visualization of many time series.
plot_time_series():
trelliscope: Used for visualizing many time series.
.facet_strip_remove to remove facet strips since trelliscope is automatically labeled.
.facet_nrow to adjust grid with trelliscope.
.facet_collapse = TRUE was changed to FALSE for better compatibility with Trelliscope JS. This may cause some plots to have multiple groups take up extra space in the strip.
plot_time_series_boxplot():
trelliscope: Used for visualizing many time series.
.facet_strip_remove to remove facet strips since trelliscope is automatically labeled.
.facet_nrow to adjust grid with trelliscope.
.facet_collapse = TRUE was changed to FALSE for better compatibility with Trelliscope JS. This may cause some plots to have multiple groups take up extra space in the strip.
plot_anomaly_diagnostics():
trelliscope: Used for visualizing many time series.
.facet_strip_remove to remove facet strips since trelliscope is automatically labeled.
.facet_nrow to adjust grid with trelliscope.
.facet_collapse = TRUE was changed to FALSE for better compatibility with Trelliscope JS. This may cause some plots to have multiple groups take up extra space in the strip.
Updates & Bug Fixes
Recipes steps (e.g. step_timeseries_signature()) use the new recipes::print_step() function. Requires recipes >= 0.2.0. #110
Offset parameter in step_log_interval() was not working properly. Now works. #103
Potential Breaking Changes
.facet_collapse = TRUE was changed to FALSE for better compatibility with Trelliscope JS. This may cause some plots to have multiple groups take up extra space in the strip.
New Features
tk_tsfeatures(): A new function that makes it easy to generate a time series feature matrix using tsfeatures. The main benefit is that you can pipe time series data in tibbles with dplyr groups. The features will be produced by group. #95 #84
plot_time_series_boxplot(): A new function that makes plotting time series boxplots simple using a .period argument for time series aggregation.
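A minimal sketch of the boxplot function described above, assuming the m4_monthly dataset bundled with timetk (columns id, date, value). Setting .interactive = FALSE is assumed to return a static ggplot2 object instead of a plotly widget.

```r
library(dplyr)
library(timetk)

# Assumption: m4_monthly ships with timetk; one facet is drawn per `id` group.
boxplot_gg <- m4_monthly %>%
  group_by(id) %>%
  plot_time_series_boxplot(
    .date_var    = date,
    .value       = value,
    .period      = "1 year",   # aggregate observations into yearly boxes
    .facet_ncol  = 2,
    .interactive = FALSE
  )

boxplot_gg
```

Shorter or longer phrases for .period (for example "1 quarter") change how many observations fall into each box.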
New Vignettes
Time Series Clustering: Uses the new tk_tsfeatures() function to perform time series clustering. #95 #84
Time Series Visualization: Updated to include plot_time_series_boxplot() and plot_time_series_regression().
Improvements
Improvements for point forecasting when the target is n-periods into the future.
time_series_cv(), time_series_split(): New parameter point_forecast. This is useful for testing / assessing the n-th prediction in the future. When set to TRUE, it returns a single point: the last value in assess.
Fixes
plot_time_series(): Smoother no longer fails when time series has 1 observation. #106
Improvements
summarize_by_time(): Added a .week_start argument to allow specifying .week_start = 1 for a Monday start. The default is 7 for a Sunday start. This can also be changed by setting the lubridate.week.start option.
Plotting Functions:
.facet_dir argument for adjusting the direction of facet_wrap(dir). #94
plot_acf_diagnostics(): Change default parameter to .show_white_noise_bars = TRUE. #85
plot_time_series_regression(): Can now show_summary for group-wise models when visualizing groups
Time Series CV (time_series_cv()): Add Label for tune_results
Improve speed of pad_by_time(). #93
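To illustrate the .week_start argument of summarize_by_time() mentioned in this list, here is a small sketch on made-up daily data. The tibble and its column names are hypothetical; the expectation that weekly buckets are floored to Mondays follows from lubridate's week_start convention.

```r
library(tibble)
library(dplyr)
library(timetk)

# Hypothetical data: one observation per day for January 2021.
daily_tbl <- tibble(
  date  = seq(as.Date("2021-01-01"), as.Date("2021-01-31"), by = "day"),
  value = 1
)

# Aggregate to weeks that begin on Monday instead of the default Sunday.
weekly_tbl <- daily_tbl %>%
  summarise_by_time(
    .date_var   = date,
    .by         = "week",
    .week_start = 1,
    total       = sum(value)
  )

weekly_tbl
```

Setting options(lubridate.week.start = 1) is the global alternative when every weekly summary in a session should start on Monday.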
Bug Fixes
tk_make_timeseries() and tk_make_future_timeseries() are now able to handle end of months. #72
tk_tbl.zoo(): Fix an issue when readr::type_convert() produces warning messages about not having character columns in inputs. #89
plot_time_series_regression(): Fixed an issue when lags are added to .formula. Pads lags with NA.
step_fourier() and fourier_vec(): Fixed issue with step_fourier failing with one observation. Added scale_factor argument to override date sequences with the stored scale factor. #77
Improvements
tk_augment_slidify(), tk_augment_lags(), tk_augment_leads(), tk_augment_differences(): Now works with multiple columns (passed via .value) and tidyselect (e.g. contains()).
Fixes
#> New names:
#> * NA -> ...1
New Functions
filter_period() (#64): Applies filtering expressions within time-based periods (windows).
slice_period() (#64): Applies slices within time-based periods (windows).
condense_period() (#64): Converts a periodicity from a higher (e.g. daily) to lower (e.g. monthly) frequency. Similar to xts::to.period() and tibbletime::as_period().
tk_augment_leads() and lead_vec() (#65): Added to make it easier / more obvious how to create leads.
Fixes
time_series_cv(): Fix bug with Panel Data. Train/Test Splits only returning 1st observation in final time stamp. Should return all observations.
future_frame() and tk_make_future_timeseries(): Now sort the incoming index to ensure dates returned go into the future.
tk_augment_lags() and tk_augment_slidify(): Now overwrite column names to match the behavior of tk_augment_fourier() and tk_augment_differences().
Improvements
time_series_cv(): Now works with time series groups. This is great for working with panel data.
future_frame(): Gets a new argument called .bind_data. When set to TRUE, it performs a data binding operation with the incoming data and the future frame.
Miscellaneous
step_slidify_augment() - A variant of step slidify that adds multiple rolling columns inside of a recipe.
Bug Fixes
%+time% and %-time% return missing values
tk_make_timeseries() and tk_make_future_timeseries() providing odd results for regular time series. GitHub Issue 60
New Functionality
tk_time_series_cv_plan() - Now works with k-fold cross validation objects from vfold_cv() function.
pad_by_time() - Added new argument .fill_na_direction to specify a tidyr::fill() strategy for filling missing data.
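The padding behavior described above can be sketched on a tiny, made-up series with gaps. The tibble below is hypothetical; "down" is one of the tidyr::fill() directions and carries the last observed value forward into the inserted rows.

```r
library(tibble)
library(dplyr)
library(timetk)

# Hypothetical data: three observations with missing days in between.
gappy_tbl <- tibble(
  date  = as.Date(c("2021-01-01", "2021-01-03", "2021-01-06")),
  value = c(10, 30, 60)
)

# Insert the missing daily rows, then fill the new NA values downward.
padded_tbl <- gappy_tbl %>%
  pad_by_time(
    .date_var          = date,
    .by                = "day",
    .fill_na_direction = "down"
  )

padded_tbl
```

Leaving .fill_na_direction at its default keeps the inserted rows as NA, which is the safer choice when a later imputation step is planned.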
Bug Fixes
tk_augment_lags() - Fix bug with grouped functions not being exported
ts class
New Functions
step_log_interval_vec() - Extends the log_interval_vec() for recipes preprocessing.
Parallel Processing
tune and recipes
Bug Fixes
log_interval_vec() - Correct the messaging
complement.ts_cv_split - Helper to show time series cross validation splits in list explorer.
New Functions
mutate_by_time(): For applying mutates by time windows
log_interval_vec() & log_interval_inv_vec(): For constrained interval forecasting.
Improvements
plot_acf_diagnostics(): A new argument, .show_white_noise_bars for adding white noise bars to an ACF / PACF Plot.
pad_by_time(): New arguments .start_date and .end_date for expanding/contracting the padding windows.
New Functions
plot_time_series_regression(): Convenience function to visualize & explore features using Linear Regression (stats::lm() formula).
time_series_split(): A convenient way to return a single split from time_series_cv(). Returns the split in the same format as rsample::initial_time_split().
Improvements
summarise_by_time(), filter_by_time(), tk_summary_diagnostics
tk_time_series_cv_plan(): Allow a single resample from rsample::initial_time_split or timetk::time_series_split
modeltime and tidymodels.
Plotting Improvements
plot_time_series():
.legend_show to toggle on/off legends.
Breaking Changes
... with .facet_vars or .ccf_vars. This change is needed to improve tab-completion. It affects:
plot_time_series()
plot_acf_diagnostics()
plot_anomaly_diagnostics()
plot_seasonal_diagnostics()
plot_stl_diagnostics()
Bug Fixes
New Interactive Plotting Functions
plot_anomaly_diagnostics(): Visualize Anomalies for One or More Time Series
New Data Wrangling Functions
future_frame(): Make a future tibble from an existing time-based tibble.
New Diagnostic / Data Processing Functions
tk_anomaly_diagnostics() - Group-wise anomaly detection and diagnostics. A wrapper for the anomalize R package functions without importing anomalize.
New Vectorized Functions:
ts_clean_vec() - Replace Outliers & Missing Values in a Time Series
standardize_vec() - Centers and scales a time series to mean 0, standard deviation 1
normalize_vec() - Normalizes a time series to Range: (0, 1)
New Recipes Preprocessing Steps:
step_ts_pad() - Preprocessing for padding time series data. Adds rows to fill in gaps and can be used with step_ts_impute() to interpolate going from low to high frequency!
step_ts_clean() - Preprocessing step for cleaning outliers and imputing missing values in a time series.
New Parsing Functions
parse_date2() and parse_datetime2(): These are similar to readr::parse_date() and lubridate::as_date() in that they parse character vectors to dates and datetimes. The key advantage is SPEED. parse_date2() uses the anytime package to process using the C++ Boost.Date_Time library.
Improvements:
plot_acf_diagnostics(): The .lags argument now handles time-based phrases (e.g. .lags = "1 month").
time_series_cv(): Implements time-based phrases (e.g. initial = "5 years" and assess = "1 year")
tk_make_future_timeseries(): The n_future argument has been deprecated for a new length_out argument that accepts both numeric input (e.g. length_out = 12) and time-based phrases (e.g. length_out = "12 months"). A major improvement is that numeric values define the number of timestamps returned even if weekends are removed or holidays are removed. Thus, you can always anticipate the length. (Issue #19).
diff_vec: Now reports the initial values used in the differencing calculation.
Bug Fixes:
plot_time_series():
.value = .value.
tk_make_future_timeseries():
time_series_cv():
skip = 1 default. skip = 0 does not make sense.
skip adding 1 to stops.
plot_time_series_cv_plan() & tk_time_series_cv_plan():
tk_make_future_timeseries():
period() returns NA. Fix implemented with ceiling_date().
pad_by_time():
pad_value so only inserts pad values where new row was inserted.
step_ts_clean(), step_ts_impute():
lambda = NULL
Breaking Changes:
These should not be of major impact since the 1.0.0 version was just released.
impute_ts_vec() to ts_impute_vec() for consistency with ts_clean_vec()
step_impute_ts() to step_ts_impute() for consistency with underlying function
roll_apply_vec() to slidify_vec() for consistency with slidify() & relationship to slider R package
step_roll_apply to step_slidify() for consistency with slidify() & relationship to slider R package
tk_augment_roll_apply to tk_augment_slidify() for consistency with slidify() & relationship to slider R package
plot_time_series_cv_plan() and tk_time_series_cv_plan(): Changed argument from .rset to .data.
New Interactive Plotting Functions:
plot_time_series() - A workhorse time-series plotting function that generates interactive plotly plots, consolidates 20+ lines of ggplot2 code, and scales well to many time series using dplyr groups.
plot_acf_diagnostics() - Visualize the ACF, PACF, and any number of CCFs in one plot for Multiple Time Series. Interactive plotly by default.
plot_seasonal_diagnostics() - Visualize Multiple Seasonality Features for One or More Time Series. Interactive plotly by default.
plot_stl_diagnostics() - Visualize STL Decomposition Features for One or More Time Series.
plot_time_series_cv_plan() - Visualize the Time Series Cross Validation plan made with time_series_cv().
New Time Series Data Wrangling:
summarise_by_time() - A time-based variant of dplyr::summarise() for flexible summarization using common time-based criteria.
filter_by_time() - A time-based variant of dplyr::filter() for flexible filtering by time-ranges.
pad_by_time() - Insert time series rows with regularly spaced timestamps.
slidify() - Make any function a rolling / sliding function.
between_time() - A time-based variant of dplyr::between() for flexible time-range detection.
add_time() - Add time to a time series index. Shifts an index by a period.
New Recipe Functions:
Feature Generators:
step_holiday_signature() - New recipe step for adding 130 holiday features based on individual holidays, locales, and stock exchanges / business holidays.
step_fourier() - New recipe step for adding fourier transforms that provide seasonal features for time series data
step_roll_apply() - New recipe step for adding rolling summary functions. Similar to recipes::step_window() but is more flexible by enabling application of any summary function.
step_smooth() - New recipe step for adding Local Polynomial Regression (LOESS) for smoothing noisy time series
step_diff() - New recipe for adding multiple differenced columns. Similar to recipes::step_lag().
step_box_cox() - New recipe for transforming predictors. Similar to step_BoxCox() with improvements for forecasting including "guerrero" method for lambda selection and handling of negative data.
step_impute_ts() - New recipe for imputing a time series.
New Rsample Functions
time_series_cv() - Create rsample cross validation sets for time series. This function produces a sampling plan starting with the most recent time series observations, rolling backwards.
New Vector Functions:
These functions are useful on their own inside of mutate() and power many of the new plotting and recipes functions.
roll_apply_vec() - Vectorized rolling apply function - wraps slider::slide_vec()
smooth_vec() - Vectorized smoothing function - Applies Local Polynomial Regression (LOESS)
diff_vec() and diff_inv_vec() - Vectorized differencing function. Pads NA's by default (unlike stats::diff).
lag_vec() - Vectorized lag functions. Returns both lags and leads (negative lags) by adjusting the .lag argument.
box_cox_vec(), box_cox_inv_vec(), & auto_lambda() - Vectorized Box Cox transformation. Leverages forecast::BoxCox.lambda() for automatic lambda selection.
fourier_vec() - Vectorized Fourier Series calculation.
impute_ts_vec() - Vectorized imputation of missing values for time series. Leverages forecast::na.interp().
New Augment Functions:
All of the functions are designed for scale. They respect dplyr::group_by().
tk_augment_holiday_signature() - Add holiday features to a data.frame using only a time-series index.
tk_augment_roll_apply() - Add multiple columns of rolling window calculations to a data.frame.
tk_augment_differences() - Add multiple columns of differences to a data.frame.
tk_augment_lags() - Add multiple columns of lags to a data.frame.
tk_augment_fourier() - Add multiple columns of fourier series to a data.frame.
New Make Functions:
Make date and date-time sequences between start and end dates.
tk_make_timeseries() - Super flexible function for creating daily and sub-daily time series.
tk_make_weekday_sequence() - Weekday sequence that accounts for both stripping weekends and holidays
tk_make_holiday_sequence() - Makes a sequence of dates corresponding to business holidays in calendars from timeDate (common non-working days)
tk_make_weekend_sequence() - Weekend sequence of dates for Saturday and Sunday (common non-working days)
New Get Functions:
tk_get_holiday_signature() - Get 100+ holiday features using only a time-series index.
tk_get_frequency() and tk_get_trend() - Automatic frequency and trend calculation from a time series index.
New Diagnostic / Data Processing Functions
tk_summary_diagnostics() - Group-wise time series summary.
tk_acf_diagnostics() - The data preparation function for plot_acf_diagnostics()
tk_seasonal_diagnostics() - The data preparation function for plot_seasonal_diagnostics()
tk_stl_diagnostics() - Group-wise STL Decomposition (Season, Trend, Remainder). Data prep for plot_stl_diagnostics().
tk_time_series_cv_plan - The data preparation function for plot_time_series_cv_plan()
New Datasets
Improvements:
tk_make_future_timeseries() - Now accepts n_future as a time-based phrase like "12 seconds" or "1 year".
Bug Fixes:
lubridate::tz<- which now returns POSIXct when used with Date objects. Fixed in PR32 by @vspinu.
Potential Breaking Changes:
tk_augment_timeseries_signature() - Changed from data to .data to prevent name collisions when piping.
New Features:
recipes Integration - Ability to apply time series feature engineering in the tidymodels machine learning workflow.
step_timeseries_signature() - New step_timeseries_signature() for adding date and date-time features.
Bug Fixes:
xts::indexTZ is deprecated. Use tzone instead.
arrange_ with arrange.
tidyquant 1.0.0 upgrade (single stocks now return an extra symbol column).
tidyquant v0.5.7 - Removed dependency on tidyverse
timeSeries to Suggests to satisfy a CRAN issue.
timetk. Was formerly timekit.
robets