MultivariateEDA.Rd
R6 class MultivariateEDA
new()
Initialize an object to perform EDA on a Multivariate Time Series
MultivariateEDA$new(data = NA, var_interest = NA, var_time = NA, verbose = 0)
data
The dataframe (or type that can be coerced to a dataframe) containing the time series realizations
var_interest
The output variable of interest (dependent variable)
var_time
The name of the time column in the dataframe, if it has one
verbose
How much to print during the model building and other processes (Default = 0)
A new `MultivariateEDA` object.
get_data()
Returns the time series realization
MultivariateEDA$get_data(time = NA)
time
NA: returns the original data without the 'var_time' column
'original': returns the original data with any 'var_time' column (if applicable)
'sub': returns the original data without any 'var_time' column (if applicable), but with a substitute 'Time' column equal to the observation number
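The three `time` options can be illustrated with a small base R sketch. The helper `select_time_view()` and the example data are hypothetical and written only to mirror the behavior described above; they are not the class's actual implementation.

```r
# Hypothetical helper mirroring the documented `time` options (not package code)
select_time_view <- function(data, var_time = NA, time = NA) {
  has_time <- !is.na(var_time) && var_time %in% names(data)

  # 'original': return the data untouched, time column included
  if (!is.na(time) && time == "original") return(data)

  # NA and 'sub': drop the time column when there is one
  out <- if (has_time) data[, names(data) != var_time, drop = FALSE] else data

  # 'sub': add a substitute Time column equal to the observation number
  if (!is.na(time) && time == "sub") out$Time <- seq_len(nrow(out))
  out
}

# Made-up example data with a time column named "date"
df <- data.frame(
  date = as.Date("2020-01-01") + 0:2,
  y = c(1.2, 1.5, 1.1),
  x = c(10, 11, 9)
)

view_default  <- select_time_view(df, "date")              # no time column
view_original <- select_time_view(df, "date", "original")  # time column kept
view_sub      <- select_time_view(df, "date", "sub")       # substitute Time column
```

With this sketch, `view_sub$Time` is simply `1, 2, 3`, which is what "equal to the observation number" is taken to mean here.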
The Time Series Realization
get_var_interest()
Returns the dependent variable name
MultivariateEDA$get_var_interest()
The dependent variable name
get_var_time()
Returns the time variable
MultivariateEDA$get_var_time()
The time variable
get_data_var_interest()
Returns the dependent variable data only
MultivariateEDA$get_data_var_interest()
The dependent variable data only
set_verbose()
Adjust the verbosity level
MultivariateEDA$set_verbose(verbose = 0)
verbose
0 = Minimal printing only (usually limited to the step being performed)
1 = Basic printing of model builds, etc.
2 = Reserved for debugging mode. May slow down the run due to excessive printing, especially when using batches
plot_data()
Plots the time series with all the dependent variables
MultivariateEDA$plot_data(ncol = 1, scales = "free_y", ...)
ncol
Number of columns to use to show the data
scales
The scales argument to be passed to the ggplot facet_wrap layer (Default = 'free_y'). Other appropriate options: 'fixed'
...
Arguments to pass to facet_wrap. Example: "ncol = 3, scales = 'free_y'"
plot_scatterplots()
Plots the scatterplot matrix of all the variables in the data
MultivariateEDA$plot_scatterplots()
plot_ccf_analysis()
Plots the cross-correlation function (CCF) of the dependent variable against each of the independent variables
MultivariateEDA$plot_ccf_analysis(lag.max = 12, negative_only = TRUE)
lag.max
The maximum lag to evaluate
negative_only
Whether to take the max cross correlation over only the negative lags of the independent variables. During prediction, future values of the independent variables are often not available. In such cases, positive lag values cannot be used for predictions. (Default = TRUE)
A dataframe containing:
(1) 'variable': the dependent variable name
(2) 'max_ccf_index': lag at which the max cross correlation occurs
(3) 'max_ccf_value': max cross correlation value (absolute)
(4) 'max_ccf_index_adjusted': adjusted index (if negative_only is FALSE, this shows a value capped at lag = 0 for any positive lag index)
The user may then decide to use either this value or 'max_ccf_index', depending on whether the positive lag values of the dependent variable will be available for the prediction.
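The idea behind restricting the search to non-positive lags can be sketched with base R's `stats::ccf()`. This is an illustration under stated assumptions, not the method's actual code: the simulated series, the argument order passed to `ccf()`, and the choice to keep lag 0 in the search are all assumptions.

```r
# Sketch only: base R stats::ccf(), not the package's implementation
set.seed(1)
n <- 200
x <- as.numeric(arima.sim(list(ar = 0.5), n = n))     # simulated independent variable
y <- c(rep(0, 3), x[1:(n - 3)]) + rnorm(n, sd = 0.1)  # dependent variable follows x by 3 steps

# ccf(x, y) at lag k estimates cor(x[t + k], y[t]); a negative k means x leads y
cc <- stats::ccf(x, y, lag.max = 12, plot = FALSE)
lags <- as.vector(cc$lag)
vals <- as.vector(cc$acf)

# Assumed reading of negative_only = TRUE: drop positive lags before searching
keep <- lags <= 0
best <- which.max(abs(vals[keep]))
best_lag <- lags[keep][best]       # lag at which the max cross correlation occurs
best_val <- abs(vals[keep][best])  # max cross correlation value (absolute)
```

Because `y` is built from `x` shifted by three observations, the strongest cross correlation in this simulated example sits at lag -3, a lag whose `x` values would already be known at prediction time.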
plot_lag_plots()
MultivariateEDA$plot_lag_plots()
clone()
The objects of this class are cloneable with this method.
MultivariateEDA$clone(deep = FALSE)
deep
Whether to make a deep clone.