R/na_treatment.r
na_treatment.Rd
A wrapper function that treats missing values (removes, imputes, etc.) found in a data set. This function is set up to handle data geared towards univariate analysis (e.g. a single response predicted by multiple covariates).
na_treatment(
  data_df,
  na_thresh,
  treatment_type,
  id_var,
  response_var,
  covariate_vec,
  random_seed,
  verbose = FALSE
)
A data.frame in which missing values (e.g. NAs) are to be treated.
A proportion between 0 and 1. Any covariate whose proportion of missing data is greater than this threshold is excluded.
Specifies how the missing values are treated.
Missing values are omitted from the data set.
Missing values are replaced with the median and modal values for continuous and discrete covariates, respectively.
A list with entries either (type = "resample", random_seed = i) or (type = "impute", ntree = n). For type "resample", missing values are replaced with randomly re-sampled observations, where i sets the random seed. For type "impute", the missRanger package is used to impute missing values with a random forest algorithm.
The column name of 'data_df' containing the observation id or row id.
The column name of 'data_df' containing the response variable. This column is not treated for missing values.
The column names of covariates to treat for missing values.
Sets a random seed.
Logical. Should the report be printed? Defaults to FALSE.
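The threshold rule described for na_thresh can be illustrated with a minimal base R sketch. This is not the package's implementation; the example data, the column names (id, y, x1, x2) and the intermediate objects na_prop, kept_covariates and filtered_df are assumptions made for illustration only.

```r
# Hypothetical example data; not taken from the package.
data_df <- data.frame(
  id = 1:5,
  y  = c(2.1, 3.4, 1.9, 4.0, 2.8),
  x1 = c(1, NA, 3, 4, 5),        # 1 of 5 values missing (0.2)
  x2 = c(NA, NA, NA, "a", "b"),  # 3 of 5 values missing (0.6)
  stringsAsFactors = FALSE
)
covariate_vec <- c("x1", "x2")
na_thresh <- 0.5

# Proportion of missing values in each covariate.
na_prop <- colMeans(is.na(data_df[, covariate_vec, drop = FALSE]))

# Keep only covariates whose missing proportion does not exceed the threshold;
# anything greater than na_thresh is excluded.
kept_covariates <- covariate_vec[na_prop <= na_thresh]
filtered_df <- data_df[, c("id", "y", kept_covariates), drop = FALSE]
```

With these assumed values, x2 exceeds the threshold and is dropped, while x1 is retained for later treatment.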
A named list is returned.
A data.frame without any missing values.
A text report of the processing that occurred.
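As an illustration of the median and modal replacement option described under the arguments, the following base R sketch fills numeric covariates with the median and all other covariates with the most frequent value. It is a sketch under stated assumptions, not the package's code: the helpers mode_value() and fill_median_mode() and the example columns are hypothetical, and treating every numeric column as continuous is a simplification.

```r
# Hypothetical helper: most frequent non-missing value (first one wins ties).
mode_value <- function(x) {
  x <- x[!is.na(x)]
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

# Hypothetical helper: median for numeric vectors, mode for everything else.
# Assumption: numeric columns are treated as continuous.
fill_median_mode <- function(x) {
  if (is.numeric(x)) {
    x[is.na(x)] <- median(x, na.rm = TRUE)
  } else {
    x[is.na(x)] <- mode_value(x)
  }
  x
}

# Hypothetical example data; not taken from the package.
data_df <- data.frame(
  id    = 1:5,
  y     = c(2.1, 3.4, 1.9, 4.0, 2.8),
  age   = c(30, NA, 50, 40, 20),
  group = c("a", "b", NA, "b", "b"),
  stringsAsFactors = FALSE
)
covariate_vec <- c("age", "group")

# Treat only the covariates; the id and response columns are left untouched.
data_df[covariate_vec] <- lapply(data_df[covariate_vec], fill_median_mode)
```

Only the columns named in covariate_vec are modified here, which mirrors the documented behaviour that the response column is not treated for missing values.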