Check that model output data tbl samples contain a single unique combination of non-compound task ID values across all samples
Source: R/check_tbl_spl_non_compound_tid.R
check_tbl_spl_non_compound_tid.Rd
Usage
check_tbl_spl_non_compound_tid(
  tbl,
  round_id,
  file_path,
  hub_path,
  compound_taskid_set = NULL,
  derived_task_ids = NULL
)
Arguments
- tbl
a tibble/data.frame of the contents of the file being validated. Column types must all be character.
- round_id
character string. The round identifier.
- file_path
character string. Path to the file being validated relative to the hub's model-output directory.
- hub_path
Either a character string path to a local Modeling Hub directory or an object of class <SubTreeFileSystem> created using functions s3_bucket() or gs_bucket() by providing a string S3 or GCS bucket name or path to a Modeling Hub directory stored in the cloud. For more details consult the Using cloud storage (S3, GCS) article in the arrow package. The hub must be fully configured with valid admin.json and tasks.json files within the hub-config directory.
- compound_taskid_set
a list of compound_taskid_sets (character vectors of compound task IDs), one for each modeling task. Used to override the compound task ID set in the config file, for example, when validating coarser samples.
- derived_task_ids
Character vector of derived task ID names (task IDs whose values depend on other task IDs) to ignore. Columns for such task IDs will contain NAs.
Value
Depending on whether validation has succeeded, one of:
- <message/check_success> condition class object.
- <error/check_error> condition class object.
Returned object also inherits from subclass <hub_check>.
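As a rough illustration of branching on the returned object, the sketch below uses base R's inherits() on a hand-built stand-in. The stand-in object, its class order, its message text, and the describe_check() helper are all assumptions made for this example; a real result would come from check_tbl_spl_non_compound_tid().

```r
# Hypothetical stand-in for a failed check result. Only the class names
# (check_error, error, hub_check) come from the Value section above; the
# class order and the message text are assumptions for illustration.
fake_check <- structure(
  list(message = "Illustrative failure message"),
  class = c("check_error", "hub_check", "error", "condition")
)

# Hypothetical helper: map a check result to a short status label.
describe_check <- function(x) {
  if (!inherits(x, "hub_check")) {
    return("not a hub check")
  }
  if (inherits(x, "check_success")) {
    "passed"
  } else if (inherits(x, "check_error")) {
    "failed"
  } else {
    "other"
  }
}

status <- describe_check(fake_check)
```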
Details
Output of the check includes an errors element, a list of items, one for each modeling task containing samples failing validation, with the following structure:
- mt_id: Index identifying the config modeling task the samples are associated with.
- output_type_ids: The output type IDs of samples that do not match the most frequent non-compound task ID value combination across all samples in the modeling task.
- frequent: The most frequent non-compound task ID value combination across all samples in the modeling task to which all samples were compared. See hubverse documentation on samples for more details.
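The comparison described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: the helper find_non_compound_mismatches(), its arguments, the example horizon column, and the string encoding of combinations are all hypothetical. It assumes each sample is identified by its output_type_id value and that all columns are character, as required for tbl.

```r
# Hypothetical helper, NOT the hubValidations implementation.
# samples: data.frame with one row per sample and task ID combination.
# non_compound_cols: names of the non-compound task ID columns (assumed).
find_non_compound_mismatches <- function(samples, non_compound_cols,
                                         id_col = "output_type_id") {
  # Collapse the non-compound task ID values of each row into one key.
  row_keys <- do.call(
    paste,
    c(unname(as.list(samples[non_compound_cols])), sep = "|")
  )
  # One signature per sample: the sorted set of distinct row keys it spans.
  signatures <- vapply(
    split(row_keys, samples[[id_col]]),
    function(keys) paste(sort(unique(keys)), collapse = ";"),
    character(1)
  )
  # The most frequent signature is the reference all samples are compared to.
  counts <- table(signatures)
  frequent <- names(counts)[which.max(counts)]
  list(
    output_type_ids = names(signatures)[signatures != frequent],
    frequent = frequent
  )
}

# Toy data: samples s1 and s2 span horizons 1 and 2; s3 only spans horizon 1.
samples <- data.frame(
  output_type_id = c("s1", "s1", "s2", "s2", "s3"),
  horizon = c("1", "2", "1", "2", "1"),
  stringsAsFactors = FALSE
)
res <- find_non_compound_mismatches(samples, "horizon")
```

In this toy case res$output_type_ids is "s3" and res$frequent is "1;2", loosely mirroring the output_type_ids and frequent fields of an errors item.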