Create Hub Repository from Template
The first step in setting up a Modeling Hub is to make a copy of the hubTemplate GitHub template repository and clone it locally.
Configure Modeling Hub
Hubs are configured through two JSON config files, which are used to validate new submissions and to inform data users of the available data.
- admin.json: Contains Hub-wide administrative configuration settings.
- tasks.json: Contains round-specific metadata of modeling tasks, including task, target and output type metadata as well as details of submission windows.
These files should live in a directory called hub-config/ in the root of the Hub repository.
For more details on these files, guidance on how to set them up and access to templates, please see our central hubDocs as well as the JSON schema the config files should adhere to.
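As a quick sanity check of the layout described above, the following base R sketch reports whether the two config files are present under hub-config/. The helper name check_hub_config_files() and the throwaway demo directory are hypothetical and only for illustration; they are not part of any hubverse package, and the check says nothing about whether the files are valid.

```r
# Hypothetical helper (not part of any hubverse package): report which of the
# two expected config files exist under <hub_path>/hub-config/.
check_hub_config_files <- function(hub_path = ".") {
  expected <- c("admin.json", "tasks.json")
  paths <- file.path(hub_path, "hub-config", expected)
  stats::setNames(file.exists(paths), expected)
}

# Demonstrate on a throwaway directory that mimics the expected layout.
# The files created here are empty placeholders, not valid config files.
demo_hub <- file.path(tempdir(), "demo-hub")
dir.create(file.path(demo_hub, "hub-config"),
  recursive = TRUE, showWarnings = FALSE
)
file.create(file.path(demo_hub, "hub-config", c("admin.json", "tasks.json")))

check_hub_config_files(demo_hub)
```

A presence check like this only catches misplaced files; the content of each file still needs to be validated against the schema.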
Validate Config files
You can use function validate_config() to check whether individual Hub config files are valid. To specify the file you want to validate, you can either provide the path to the root of the Hub to argument hub_path (which assumes the config files are correctly located in directory hub-config/) or provide a direct path to a config file to argument config_path. You also need to specify the type of config file through argument config (one of "tasks" or "admin", defaults to "tasks").
The function will validate a given config file against a specific version of its schema, specified through argument schema_version. The default value of schema_version is "from_config", which uses the version specified in the schema_version property of the config file being validated.
validate_config(
hub_path = system.file("testhubs/simple/", package = "hubUtils"),
config = "tasks"
)
#> Loading required namespace: jsonvalidate
#> [1] TRUE
#> ✔ ok: hub-config/tasks.json (<file:///home/runner/work/_temp/Library/hubUtils/testhubs/simple/hub-config/tasks.json>) (via tasks-schema v2.0.0 (<https://raw.githubusercontent.com/hubverse-org/schemas/main/v2.0.0/tasks-schema.json>))
If validation succeeds, the function returns TRUE. The path to the config file validated and the version and URL of the schema used for validation are also attached as attributes "config_path", "schema_version" and "schema_url" respectively.
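The attributes described above can be read with base R's attr(). The sketch below assumes that validate_config() comes from an attached package (hubAdmin in recent hubverse releases, which is an assumption about your installation) and that the hubUtils test hub used elsewhere in this document is available.

```r
# Assumption: validate_config() is provided by hubAdmin; adjust the package
# name if your installation exposes it elsewhere.
library(hubAdmin)

res <- validate_config(
  hub_path = system.file("testhubs/simple/", package = "hubUtils"),
  config = "tasks"
)

# Metadata attached to the logical result.
attr(res, "config_path")    # path of the config file that was validated
attr(res, "schema_version") # schema version used for validation
attr(res, "schema_url")     # URL of the schema used for validation
```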
You can validate a config file against the latest version of the schema by using "latest", or you can choose a specific version, e.g. "v0.0.1".
The function defaults to using stable schema versions released to the main branch, but you can choose to validate against another branch (e.g. an upcoming development version) through argument branch.
validate_config(
hub_path = system.file("testhubs/simple/", package = "hubUtils"),
config = "tasks",
schema_version = "v0.0.0.9"
)
#> [1] FALSE
#> ! 4 schema errors: hub-config/tasks.json
#> (<file:///home/runner/work/_temp/Library/hubUtils/testhubs/simple/hub-config/tasks.json>)
#> (via tasks-schema v0.0.0.9
#> (<https://raw.githubusercontent.com/hubverse-org/schemas/main/v0.0.0.9/tasks-schema.json>))
#> ℹ use `view_config_val_errors()` to view table of error details.
validate_config(
hub_path = system.file("testhubs/simple/", package = "hubUtils"),
config = "tasks",
schema_version = "latest"
)
#> [1] FALSE
#> ! 8 schema errors: hub-config/tasks.json
#> (<file:///home/runner/work/_temp/Library/hubUtils/testhubs/simple/hub-config/tasks.json>)
#> (via tasks-schema v4.0.0
#> (<https://raw.githubusercontent.com/hubverse-org/schemas/main/v4.0.0/tasks-schema.json>))
#> ℹ use `view_config_val_errors()` to view table of error details.
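The branch argument mentioned earlier can also be set explicitly. The sketch below pins both the schema version and the branch; it uses "main" because that is the only branch name confirmed in this document, so substitute the name of a development branch that actually exists in the schemas repository if you need one. As before, attaching hubAdmin is an assumption about where validate_config() lives in your installation.

```r
# Assumption: validate_config() is provided by hubAdmin.
library(hubAdmin)

# Pin the schema version and the branch it is read from.
res_main <- validate_config(
  hub_path = system.file("testhubs/simple/", package = "hubUtils"),
  config = "tasks",
  schema_version = "v2.0.0",
  branch = "main" # replace with an existing development branch if needed
)
res_main
```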
Validation returning errors
If validation of the config file fails, the function returns FALSE. An additional list data frame of errors, returned by the ajv validation engine, is also attached as attribute "errors".
config_path <- system.file("error-schema/tasks-errors.json",
package = "hubUtils"
)
validate_config(config_path = config_path, config = "tasks")
#> Warning: Hub configured using schema version v0.0.0.9. Support for schema earlier than
#> v2.0.0 was deprecated in hubUtils 0.0.0.9010.
#> ℹ Please upgrade Hub config files to conform to, at minimum, version v2.0.0 as
#> soon as possible.
#> This warning is displayed once every 8 hours.
#> Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
#> generated.
#> [1] FALSE
#> ! 7 schema errors: error-schema/tasks-errors.json
#> (<file:///home/runner/work/_temp/Library/hubUtils/error-schema/tasks-errors.json>)
#> (via tasks-schema v0.0.0.9
#> (<https://raw.githubusercontent.com/hubverse-org/schemas/main/v0.0.0.9/tasks-schema.json>))
#> ℹ use `view_config_val_errors()` to view table of error details.
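To work with the raw errors programmatically, you can pull the "errors" attribute off the result. This is a minimal sketch assuming validate_config() is provided by hubAdmin; the exact columns of the errors object depend on the validation engine, so the code only inspects its top-level structure.

```r
# Assumption: validate_config() is provided by hubAdmin.
library(hubAdmin)

config_path <- system.file("error-schema/tasks-errors.json",
  package = "hubUtils"
)
validation <- validate_config(config_path = config_path, config = "tasks")

# The "errors" attribute is only informative when validation failed.
if (!isTRUE(validation)) {
  errors <- attr(validation, "errors")
  cat("Number of errors:", nrow(errors), "\n")
  str(errors, max.level = 1) # top-level structure only
}
```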
Because the default output of the validator can be unwieldy and difficult to review, you can use function view_config_val_errors() to launch a more user-friendly and concise version of the errors table in the Viewer panel in RStudio.
validation <- validate_config(config_path = config_path, config = "tasks")
view_config_val_errors(validation)
hubAdmin config validation error report
Report for file /home/runner/work/_temp/Library/hubUtils/error-schema/tasks-errors.json using schema version v0.0.0.9
Columns are grouped as Error location (instancePath), Schema details (schemaPath, keyword, message, schema) and Config (data).
instancePath | schemaPath | keyword | message | schema | data |
---|---|---|---|---|---|
rounds └1 └─model_tasks └──1 └───task_ids └────target └─────required | properties └rounds └─items └──properties └───model_tasks └────items └─────properties └──────task_ids └───────properties └────────target └─────────properties └──────────required └───────────type | type | ❌ must be array,null | array, null | wk inc flu hosp |
rounds └1 └─model_tasks └──1 └───output_type └────mean | properties └rounds └─items └──properties └───model_tasks └────items └─────properties └──────output_type └───────properties └────────mean └─────────required | required | ❌ must have required property 'type_id' | type_id, value | type_id |
rounds └1 └─model_tasks └──1 └───output_type └────quantile | properties └rounds └─items └──properties └───model_tasks └────items └─────properties └──────output_type └───────properties └────────quantile └─────────required | required | ❌ must have required property 'type_id' | type_id, value | type_id |
rounds └1 └─submissions_due | properties └rounds └─items └──properties └───submissions_due └────oneOf | oneOf | ❌ must match exactly one schema in oneOf | 1: relative_to-description: Name of task id variable in relation to which submission start and end dates are calculated.; relative_to-type: string; start-description: Difference in days between start and origin date.; start-type: integer; start-format: ‘NA’; end-description: Difference in days between end and origin date.; end-type: integer; end-format: ‘NA’; required1: relative_to; required2: start; required3: end. 2: relative_to-description: ‘NA’; relative_to-type: ‘NA’; start-description: Submission start date.; start-type: string; start-format: date; end-description: Submission end date.; end-type: string; end-format: date; required1: start; required2: end | start: -6, end: 1 |
For more information, please consult the hubDocs documentation.
In the example above:
- instancePath indicates the location of the validation error in the config file.
- schemaPath indicates the location of the element in the schema which is failing validation.
- keyword indicates the keyword causing the validation error.
- message is the validation error message returned by the validator.
- schema describes the valid schema values the failing keyword should conform to.
- data is the value of the property in the config file which is failing validation.
Validating all Hub config files
To validate both admin.json and tasks.json in a single call, you can use function validate_hub_config(). This function tests both files for validity and returns a list of the results of the validation checks for each file. By default it uses "from_config" as the schema_version argument and throws an error if the two files do not use the same schema version.
validate_hub_config(
hub_path = system.file("testhubs/simple/", package = "hubUtils")
)
#> ✔ Hub correctly configured!
#> admin.json, tasks.json and model-metadata-schema.json all valid.
#> ℹ schema version v2.0.0
#> (<https://github.com/hubverse-org/schemas/tree/main/v2.0.0>)
#>
#> ── $tasks
#> [1] TRUE
#> ✔ ok: hub-config/tasks.json (<file:///home/runner/work/_temp/Library/hubUtils/testhubs/simple/hub-config/tasks.json>) (via tasks-schema v2.0.0 (<https://raw.githubusercontent.com/hubverse-org/schemas/main/v2.0.0/tasks-schema.json>))
#>
#> ── $admin
#> [1] TRUE
#> ✔ ok: hub-config/admin.json (<file:///home/runner/work/_temp/Library/hubUtils/testhubs/simple/hub-config/admin.json>) (via admin-schema v2.0.0 (<https://raw.githubusercontent.com/hubverse-org/schemas/main/v2.0.0/admin-schema.json>))
#>
#> ── $model-metadata-schema
#> [1] TRUE
#> ✔ ok: hub-config/model-metadata-schema.json (<file:///home/runner/work/_temp/Library/hubUtils/testhubs/simple/hub-config/model-metadata-schema.json>) (from default json schema (<file:///home/runner/work/_temp/Library/hubUtils/testhubs/simple/hub-config/model-metadata-schema.json>))
You can also validate a Hub's config against the latest version of the schema by setting schema_version = "latest". The example below does this for tasks.json using validate_config().
validate_config(
hub_path = system.file("testhubs/simple/", package = "hubUtils"),
schema_version = "latest"
)
#> [1] FALSE
#> ! 8 schema errors: hub-config/tasks.json
#> (<file:///home/runner/work/_temp/Library/hubUtils/testhubs/simple/hub-config/tasks.json>)
#> (via tasks-schema v4.0.0
#> (<https://raw.githubusercontent.com/hubverse-org/schemas/main/v4.0.0/tasks-schema.json>))
#> ℹ use `view_config_val_errors()` to view table of error details.
You can also use view_config_val_errors() on the output of validate_hub_config() to review any detected validation errors.
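The two functions can be combined so that the error report is opened only when something fails. This is a sketch under two assumptions: that both functions are provided by hubAdmin, and that each element of the returned list is the TRUE or FALSE result for one file, as the printed output earlier in this section suggests.

```r
# Assumption: validate_hub_config() and view_config_val_errors() are provided
# by hubAdmin.
library(hubAdmin)

hub_validation <- validate_hub_config(
  hub_path = system.file("testhubs/simple/", package = "hubUtils")
)

# Assumption: each list element is the logical validation result for one file.
all_valid <- all(vapply(hub_validation, isTRUE, logical(1)))

if (!all_valid) {
  # Open the formatted error report only when at least one file failed.
  view_config_val_errors(hub_validation)
}
```

Collapsing the per-file results into a single logical in this way also makes it straightforward to stop a script early when a Hub is misconfigured.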