--- title: "Frequency and Trend Selection" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Frequency and Trend Selection} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, echo = FALSE, message = FALSE, warning = FALSE} knitr::opts_chunk$set( # message = FALSE, # warning = FALSE, fig.width = 8, fig.height = 4.5, fig.align = 'center', out.width='95%', dpi = 100 ) # devtools::load_all() # Travis CI fails on load_all() ``` __Frequency and trend cycles__ are used in many time series applications including Seasonal ARIMA (SARIMA) forecasting and STL Decomposition. `timetk` includes functionality for __Automatic Frequency and Trend Selection.__ These tools use only the the timestamp information to make logical guesses about the frequency and trend. # Prerequisites Before we get started, load the following packages. ```{r, message = FALSE, warning=FALSE} library(dplyr) library(timetk) ``` # Data __Daily Irregular Data__ The daily stock prices of Facebook from 2013 to 2016. Note that trading days only occur on "business days" (non-weekends and non-business-holidays). ```{r} data(FANG) FB_tbl <- FANG %>% dplyr::filter(symbol == "FB") FB_tbl ``` __Sub-Daily Data__ Taylor's Energy Demand data at a 30-minute timestamp interval. ```{r} taylor_30_min ``` # Applications An example of where automatic frequency detection occurs is in the `plot_stl_diagnostics()` function. ```{r, fig.height=8} taylor_30_min %>% plot_stl_diagnostics(date, value, .frequency = "auto", .trend = "auto", .interactive = FALSE) ``` # Automatic Frequency & Trend Selection ## Specifying a Frequency or Trend The `period` argument has three basic options for returning a frequency. Options include: - "auto": A target frequency is determined using a pre-defined ___Time Scale Template___ (see below). - time-based duration: (e.g. "7 days" or "2 quarters" per cycle) - numeric number of observations: (e.g. 5 for 5 observations per cycle) ## Frequency A _frequency_ is loosely defined as the number of observations that comprise a cycle in a data set. Using `tk_get_frequency()`, we can pick a number of observations that will roughly define a frequency for the series. __Daily Irregular Data__ Because `FB_tbl` is irregular (weekends and holidays are not present), the frequency selected is weekly but each week is only 5-days typically. So 5 is selected. ```{r, message = TRUE} FB_tbl %>% tk_index() %>% tk_get_frequency(period = "auto") ``` __Sub-Daily Data__ This works as well for a sub-daily time series. Here we'll use `taylor_30_min` for a 30-minute timestamp series. The frequency selected is 48 because there are 48 timestamps (observations) in 1 day for the 30-minute cycle. ```{r} taylor_30_min %>% tk_index() %>% tk_get_frequency("1 day") ``` ## Trend The trend is loosely defined as time span that can be aggregated across to visualize the central tendency of the data. Using `tk_get_trend()`, we can pick a number of observations that will help describe a trend for the data. __Daily Irregular Data__ Because `FB_tbl` is irregular (weekends and holidays are not present), the trend selected is 3 months but each week is only 5-days typically. So 64 observations is selected. ```{r, message = TRUE} FB_tbl %>% tk_index() %>% tk_get_trend(period = "auto") ``` __Sub-Daily Data__ A 14-day (2 week) interval is selected for the "30-minute" interval data. ```{r} taylor_30_min %>% tk_index() %>% tk_get_trend("auto") ``` # Time Scale Template A ___Time-Scale Template___ is used to get and set the time scale template, which is used by `tk_get_frequency()` and `tk_get_trend()` when `period = "auto"`. The predefined template is stored in a function `tk_time_scale_template()`. This is the default used by `timetk`. __Accessing the Default Template__ You can access the current template with `get_tk_time_scale_template()`. ```{r} get_tk_time_scale_template() ``` __Changing the Default Template__ You can modify the current template with `set_tk_time_scale_template()`. # Learning More

_My Talk on High-Performance Time Series Forecasting_ Time series is changing. __Businesses now need 10,000+ time series forecasts every day.__ __High-Performance Forecasting Systems will save companies MILLIONS of dollars.__ Imagine what will happen to your career if you can provide your organization a "High-Performance Time Series Forecasting System" (HPTSF System). I teach how to build a HPTFS System in my __High-Performance Time Series Forecasting Course__. If interested in learning Scalable High-Performance Forecasting Strategies then [take my course](https://university.business-science.io/p/ds4b-203-r-high-performance-time-series-forecasting). You will learn: - Time Series Machine Learning (cutting-edge) with `Modeltime` - 30+ Models (Prophet, ARIMA, XGBoost, Random Forest, & many more) - NEW - Deep Learning with `GluonTS` (Competition Winners) - Time Series Preprocessing, Noise Reduction, & Anomaly Detection - Feature engineering using lagged variables & external regressors - Hyperparameter Tuning - Time series cross-validation - Ensembling Multiple Machine Learning & Univariate Modeling Techniques (Competition Winner) - Scalable Forecasting - Forecast 1000+ time series in parallel - and more.

Unlock the High-Performance Time Series Forecasting Course