
Check that oracle values in an oracle output target data file are valid
Source: R/check_target_tbl_oracle_value.R
check_target_tbl_oracle_value.Rd
This check is only performed when the target data file contains an
output_type_id column and cdf or pmf output types.
It verifies that distributional output type (cdf and pmf) oracle
values meet the following criteria:
- oracle_value values are either 0 or 1.
- pmf oracle values sum to 1 for each observation unit.
- cdf oracle values are non-decreasing for each observation unit when sorted by the output_type_id set defined in the hub config.
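The three criteria above can be sketched in base R for a single observation unit. This is only an illustration under stated assumptions, not the package's implementation: the toy data and the helper names all_binary(), pmf_sums_to_one() and cdf_non_decreasing() are hypothetical, and the numeric order of output_type_id stands in for the order defined in the hub config.

```r
# Toy oracle-output rows for one observation unit (hypothetical values).
pmf <- data.frame(
  output_type = "pmf",
  output_type_id = c("low", "moderate", "high"),
  oracle_value = c(0, 1, 0)
)
cdf <- data.frame(
  output_type = "cdf",
  output_type_id = c(10, 20, 30),
  oracle_value = c(0, 1, 1)
)

# Criterion 1: every oracle value is either 0 or 1.
all_binary <- function(x) all(x %in% c(0, 1))

# Criterion 2: pmf oracle values sum to 1 within the unit.
pmf_sums_to_one <- function(x) isTRUE(all.equal(sum(x), 1))

# Criterion 3: cdf oracle values are non-decreasing once ordered by
# output_type_id (assumption: numeric order matches the hub config order).
cdf_non_decreasing <- function(value, id) all(diff(value[order(id)]) >= 0)

ok_binary <- all_binary(c(pmf$oracle_value, cdf$oracle_value))
ok_pmf <- pmf_sums_to_one(pmf$oracle_value)
ok_cdf <- cdf_non_decreasing(cdf$oracle_value, cdf$output_type_id)
```

A cdf unit such as c(0, 1, 0) would fail the third helper, because the values decrease after the observed threshold.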
Usage
check_target_tbl_oracle_value(
  target_tbl,
  target_type = c("oracle-output", "time-series"),
  file_path,
  hub_path,
  config_target_data = NULL
)
Arguments
- target_tbl
A tibble/data.frame of the contents of the target data file being validated.
- target_type
Type of target data to validate. One of
"time-series" or "oracle-output". Defaults to "oracle-output".
- file_path
A character string representing the path to the target data file relative to the
target-data directory.
- hub_path
Either a character string path to a local Modeling Hub directory or an object of class
<SubTreeFileSystem> created using the functions s3_bucket() or gs_bucket() by providing a string S3 or GCS bucket name or path to a Modeling Hub directory stored in the cloud. For more details, consult the "Using cloud storage (S3, GCS)" article in the arrow package. The hub must be fully configured with valid admin.json and tasks.json files within the hub-config directory.
- config_target_data
Target data configuration object from
read_config(hub_path, "target-data"), or NULL (default) if config does not exist. When target-data.json exists, this should be provided to enable date column extraction for date relaxation. If NULL and date_col is not provided, date relaxation cannot be applied and a warning will be issued if allow_extra_dates is TRUE.
Value
Depending on whether validation has succeeded, one of:
- <message/check_success> condition class object.
- <error/check_failure> condition class object.
Returned object also inherits from subclass <hub_check>.
Details
When validating oracle values, data is grouped by observation unit to check PMF sums and CDF monotonicity within each unit.
With target-data.json config:
Observable unit is determined from the config's observable_unit specification.
Without target-data.json config:
Observable unit is inferred from task ID columns present in the data.
The as_of column is NOT included in the grouping. Oracle data is designed to
contain a single version per observable unit with a one-to-one mapping to model
output data.
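The grouping described above can be sketched in base R as follows. This is a rough illustration, not the package's code: the toy data and the choice of location and target_end_date as the task ID columns are assumptions made for the example. The as_of column is present in the data but left out of the grouping.

```r
# Hypothetical oracle-output rows covering two observable units.
oracle <- data.frame(
  location = c("US", "US", "US", "CA", "CA", "CA"),
  target_end_date = "2024-01-06",
  as_of = "2024-02-01",
  output_type = "pmf",
  output_type_id = rep(c("low", "moderate", "high"), 2),
  oracle_value = c(0, 1, 0, 1, 1, 0)
)

# Task ID columns assumed for this sketch; as_of is deliberately excluded.
unit_cols <- c("location", "target_end_date")

# Sum pmf oracle values within each observable unit.
pmf_sums <- aggregate(
  oracle["oracle_value"],
  by = oracle[unit_cols],
  FUN = sum
)

# Units whose pmf oracle values do not sum to 1 would fail the check.
invalid_units <- pmf_sums[pmf_sums$oracle_value != 1, ]
invalid_units
```

In this toy data only the CA unit is flagged, since two of its categories are marked as observed.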