A wrapper function that treats missing values (removes them, imputes them, etc.) in a data set. This function is set up to handle data geared towards univariate analysis (e.g. a single response predicted by multiple covariates).

na_treatment(
  data_df,
  na_thresh,
  treatment_type,
  id_var,
  response_var,
  covariate_vec,
  random_seed,
  verbose = FALSE
)

Arguments

data_df

A data.frame object whose missing values (e.g. NAs) are to be treated.

na_thresh

A proportion between 0 and 1. Any covariate whose proportion of missing data is greater than this threshold is excluded from the data set.
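
The threshold rule can be illustrated with a minimal base R sketch. This is not the function's actual implementation; the toy data frame and the names na_prop, kept_covariates, and data_kept are hypothetical and used only for illustration.

```r
# Toy data (hypothetical): x1 is 25% missing, x2 is 75% missing.
data_df <- data.frame(
  id = 1:4,
  y  = c(1.2, 3.4, 2.2, 5.1),
  x1 = c(10, NA, 30, 40),
  x2 = c(NA, NA, NA, "a")
)
covariate_vec <- c("x1", "x2")
na_thresh <- 0.5

# Proportion of missing values in each covariate column.
na_prop <- colMeans(is.na(data_df[, covariate_vec, drop = FALSE]))

# Keep covariates whose missing proportion does not exceed the threshold.
kept_covariates <- names(na_prop)[na_prop <= na_thresh]

# Assumption: the id and response columns are always retained.
data_kept <- data_df[, c("id", "y", kept_covariates), drop = FALSE]
```

With na_thresh = 0.5, x2 is dropped and x1 is kept for later treatment.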

treatment_type

Specifies how the missing values are treated.

'omit': Missing values are omitted from the data set.
'central_tendency': Missing values are replaced with the median and modal values for continuous and discrete covariates, respectively.
A list (class list): A list with entries either (type = "resample", "random_seed" = i) or (type = "impute", ntree = n). For type "resample", missing values are replaced with randomly re-sampled observations, where i sets the random seed. For type "impute", the missRanger package is used to impute missing values with a random forest machine learning algorithm.
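
The 'central_tendency' and "resample" treatments described above can be sketched in base R as follows. These helpers (mode_value, impute_central, impute_resample) are hypothetical and are not taken from the function's source. The sketch assumes that "discrete" means factor or character columns and that resampling draws from the observed values of the same column; the actual function may differ on both points.

```r
# Hypothetical helper: most frequent non-missing value of a vector.
mode_value <- function(x) {
  counts <- table(x)  # table() excludes NA by default
  names(counts)[which.max(counts)]
}

# Hypothetical helper: median for numeric columns, mode for
# factor or character columns.
impute_central <- function(x) {
  if (!anyNA(x)) return(x)
  if (is.numeric(x)) {
    x[is.na(x)] <- median(x, na.rm = TRUE)
  } else {
    x[is.na(x)] <- mode_value(x)
  }
  x
}

# Hypothetical helper: replace NAs with values drawn at random
# (with replacement) from the observed values of the same vector.
impute_resample <- function(x, seed) {
  set.seed(seed)
  missing <- is.na(x)
  observed <- x[!missing]
  idx <- sample.int(length(observed), sum(missing), replace = TRUE)
  x[missing] <- observed[idx]
  x
}

impute_central(c(1, NA, 3, 10))         # NA replaced by the median, 3
impute_central(c("a", "b", NA, "b"))    # NA replaced by the mode, "b"
impute_resample(c(1, NA, 3, NA), seed = 1)
```

Fixing the seed makes the resampled values reproducible across runs, which is presumably why the list form of treatment_type carries its own seed entry.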

id_var

The column name of 'data_df' containing the observation id or row id.

response_var

The column name of 'data_df' containing the response variable. This column is not treated for missing values.

covariate_vec

The column names of covariates to treat for missing values.

random_seed

Sets a random seed.

verbose

Logical. If TRUE, the report is printed. Defaults to FALSE.

Value

A named list is returned.

data: A data.frame object without any missing values.
report: A text report of the processing that occurred.