
Perform validations on the compound task ID set used to calculate an ensemble of component model outputs for the sample output type, including checks that (1) compound_taskid_set
is a subset of task_id_cols
, (2) the provided model_out_tbl
is compatible with the specified compound_taskid_set
, and (3) all models submit predictions for the same set of non compound_taskid_set
variables.
Source: R/validate_ensemble_inputs.R
validate_compound_taskid_set.Rd
Perform validations on the compound task ID set used to calculate an ensemble of
component model outputs for the sample output type, including checks that
(1) compound_taskid_set
is a subset of task_id_cols
, (2) the provided
model_out_tbl
is compatible with the specified compound_taskid_set
, and
(3) all models submit predictions for the same set of non compound_taskid_set
variables.
Usage
validate_compound_taskid_set(
model_out_tbl,
task_id_cols,
compound_taskid_set,
derived_tasks = NULL,
return_missing_combos = FALSE
)
Arguments
- model_out_tbl
an object of class
model_out_tbl
with component model outputs (e.g., predictions).- task_id_cols
character
vector with names of columns inmodel_out_tbl
that specify modeling tasks. Defaults toNULL
, in which case all columns inmodel_out_tbl
other than"model_id"
,"output_type"
,"output_type_id"
, and"value"
are used as task ids.- compound_taskid_set
character
vector of the compound task ID variable set. This argument is only relevant foroutput_type
"sample"
. Can be one of three possible values, with the following meanings:NA
: the compound_taskid_set is not relevant for the current modeling taskNULL
: samples are from a multivariate joint distribution across all levels of all task id variablesEquality to
task_id_cols
: samples are from separate univariate distributions for each individual prediction task
Defaults to NA. Derived task ids must be included if all of the task ids their values depend on are part of the
compound_taskid_set
.- derived_tasks
character
vector of derived task IDs (variables whose values depend on that of other task ID variables). Defaults to NULL, meaning there are no derived task IDs.- return_missing_combos
boolean
specifying whether to return adata.frame
summarizing the missing combinations of dependent tasks for each model. If TRUE, the columns of thedata.frame
will be "model_id" and one for each of the dependent tasks (complement of thecompound_taskid_set
). Defaults to FALSE.