
Validate file level properties of a target data file.
Source: R/validate_target_file.R
Usage
validate_target_file(
  hub_path,
  file_path,
  validations_cfg_path = NULL,
  round_id = "default"
)

Arguments
- hub_path
 Either a character string path to a local Modeling Hub directory or an object of class
<SubTreeFileSystem> created using functions s3_bucket() or gs_bucket() by providing a string S3 or GCS bucket name or path to a Modeling Hub directory stored in the cloud. For more details consult the article on using cloud storage (S3, GCS) in the arrow package. The hub must be fully configured with valid admin.json and tasks.json files within the hub-config directory.
- file_path
 A character string representing the path to the target data file relative to the
target-data directory.
- validations_cfg_path
 Path to a YAML file configuring custom validation checks. If
NULL, defaults to the standard hub-config/validations.yml path. For more details see the article on custom validation checks.
- round_id
 Character string. Not generally relevant to target datasets but can be used to specify a specific block of custom validation checks. Otherwise best set to
"default", which will deploy the default custom validation checks.
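
As a sketch of passing a cloud-hosted hub as hub_path, an arrow <SubTreeFileSystem> can be created with s3_bucket(). The bucket name below is hypothetical, not a real hub:

```r
library(arrow)

# Sketch: connect to a hypothetical S3-hosted Modeling Hub
# (bucket name is illustrative; anonymous access assumed)
hub_path <- s3_bucket("example-modeling-hub", anonymous = TRUE)

# Validate a target data file stored in the cloud hub
validate_target_file(hub_path, file_path = "time-series.csv")
```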
Value
An object of class hub_validations. Each named element contains
a hub_check class object reflecting the result of a given check. The function
returns early if a check returns an error.
For more details on the structure of <hub_validations> objects, including
how to access more information on individual checks,
see article on <hub_validations> S3 class objects.
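
Individual check results can be inspected by name; a minimal sketch using the test hub from the Examples (the assumption here is that list element names match the check names listed under Details):

```r
# Sketch: inspect individual checks in a hub_validations object
hub_path <- system.file("testhubs/v5/target_file", package = "hubUtils")
v <- validate_target_file(hub_path, file_path = "time-series.csv")

# Names of the checks that were run
names(v)

# Each named element is a hub_check object
v$target_file_exists
```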
Details
Details of checks performed by validate_target_file()
| Name | Check | Early return | Fail output | Extra info | 
|---|---|---|---|---|
| target_file_exists | File exists at `file_path` provided. | TRUE | check_error | |
| target_partition_file_name | Hive-style partition file path segments are valid and can be parsed successfully. Skipped if target dataset not hive-partitioned. | TRUE | check_error | |
| target_file_ext | Target data file extension is valid. | TRUE | check_error | |
Examples
hub_path <- system.file("testhubs/v5/target_file", package = "hubUtils")
validate_target_file(hub_path,
  file_path = "time-series.csv"
)
#> 
#> ── time-series.csv ────
#> 
#> ✔ [target_file_exists]: File exists at path target-data/time-series.csv.
#> ℹ [target_partition_file_name]: Target file path not hive-partitioned. Check
#>   skipped.
#> ✔ [target_file_ext]: Target data file extension is valid.
validate_target_file(hub_path,
  file_path = "oracle-output.csv"
)
#> 
#> ── oracle-output.csv ────
#> 
#> ✔ [target_file_exists]: File exists at path target-data/oracle-output.csv.
#> ℹ [target_partition_file_name]: Target file path not hive-partitioned. Check
#>   skipped.
#> ✔ [target_file_ext]: Target data file extension is valid.
hub_path <- system.file("testhubs/v5/target_dir", package = "hubUtils")
validate_target_file(hub_path,
  file_path = "time-series/target=wk%20flu%20hosp%20rate/part-0.parquet"
)
#> 
#> ── time-series/target=wk%20flu%20hosp%20rate/part-0.parquet ────
#> 
#> ✔ [target_file_exists]: File exists at path
#>   target-data/time-series/target=wk%20flu%20hosp%20rate/part-0.parquet.
#> ✔ [target_partition_file_name]: Hive-style partition file path segments are
#>   valid.
#> ✔ [target_file_ext]: Hive-partitioned target data file extension is valid.
validate_target_file(hub_path,
  file_path = "oracle-output/output_type=pmf/part-0.parquet"
)
#> 
#> ── oracle-output/output_type=pmf/part-0.parquet ────
#> 
#> ✔ [target_file_exists]: File exists at path
#>   target-data/oracle-output/output_type=pmf/part-0.parquet.
#> ✔ [target_partition_file_name]: Hive-style partition file path segments are
#>   valid.
#> ✔ [target_file_ext]: Hive-partitioned target data file extension is valid.