This function provides a convenient way to train several model types at once. The fitted models can then be used to predict on new data, and the resulting metrics help the user decide which model's predictions to use. The models are built with Max Kuhn's caret package.
multi_model_1(
  old_data,
  yname,
  xname,
  method = NULL,
  metric = NULL,
  control = NULL,
  new_data = NULL,
  ...
)
| Argument | Description |
|---|---|
| old_data | The data frame holding the training dataset |
| yname | The name of the outcome (response) variable |
| xname | The predictor variable(s) |
| method | A character vector of model-fitting methods as defined in the caret package |
| metric | The performance metric to optimise, e.g. "Accuracy", "RMSE" or "MAE" |
| control | A control object; see ?caret::trainControl for details and the sketch after this table |
| new_data | A data set used to validate the model or for which predictions are required |
| ... | Other arguments passed on to caret's train function |
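As a minimal sketch of how the method, metric and control arguments might be put together (the resampling settings and method names below are illustrative choices, not requirements of multi_model_1):

library(caret)
# 5-fold cross-validation repeated twice; any scheme supported by trainControl works
ctrl <- trainControl(method = "repeatedcv", number = 5, repeats = 2)
# any model codes supported by caret::train
methods <- c("knn", "rpart")
metric <- "Accuracy"  # or "RMSE"/"MAE" for regression outcomes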
A list containing a tibble that summarises the metrics for each model, a tibble of predicted values, and information about the fitted models.
Most of the details of the parameters can be found in the caret package documentation. This function is meant to aid exploratory analysis by making it easy to compare candidate models and choose the best one.
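As a hedged illustration of that exploratory workflow, the metrics tibble could be ranked to pick the best-performing method; here m is the result of the call shown in the Examples below, and the column name Accuracy is an assumption about the tibble's layout rather than a documented guarantee.

library(dplyr)
# rank the fitted models by the chosen metric (column name is assumed)
m$Metrics %>%
  arrange(desc(Accuracy)) %>%
  slice(1)  # the method with the highest cross-validated accuracy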
Kuhn, M. (2014), "Futility Analysis in the Cross-Validation of Machine Learning Models", http://arxiv.org/abs/1405.6974
Kuhn, M. (2008), "Building Predictive Models in R Using the caret Package", Journal of Statistical Software, http://www.jstatsoft.org/article/view/v028i05/v28i05.pdf
data("yields", package="manymodelr") train_set<-createDataPartition(yields$normal,p=0.8,list=FALSE) valid_set<-yields[-train_set,] train_set<-yields[train_set,] ctrl<-trainControl(method="cv",number=5) set.seed(233) m<-multi_model_1(train_set,"normal",".",c("knn","rpart"), "Accuracy",ctrl,new_data =valid_set) m$Predictions #> NULL m$Metrics #> NULL m$modelInfo #> NULL