Time-based
Find a check under the time_variance directory to copy as a starting point.
Add the check to the time_variance/definitions.py file. For example:
"ocid": ocid,
"tender_title": tender_title,
"phase_stable": phase_stable,
Each check is an object (usually a module) that has two attributes: filter and evaluate.
Pairs of items with the same ocid are read in batches from the dataset and its ancestor. Each pair is passed to the filter function, which:
Accepts five arguments: an accumulator, an ancestor’s item and its ID, and a dataset’s item and its ID
Returns whether the check can be calculated against the pair of items (for example, if both are present)
If filter returns True, and if the new item is present, then the evaluate function:
Accepts five arguments, like filter
Determines whether the check passes
Returns the accumulator, and whether the check passes
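As a rough illustration of this interface, a check module might look like the sketch below. The check itself (comparing tender titles), the argument names, the _title helper, and the item structure are assumptions made for illustration; they are not taken from the project's code.

```python
# Hypothetical check module: the tender title should not change between
# the ancestor dataset and the new dataset. All names here are assumed.
# Note that "filter" shadows the Python built-in; it is named this way
# only because the check is expected to expose a "filter" attribute.


def _title(item):
    """Return the tender title of an item, or None if it is missing."""
    if not item:
        return None
    return (item.get("tender") or {}).get("title")


def filter(scope, ancestor_item, ancestor_item_id, new_item, new_item_id):
    """Return whether the check can be calculated against this pair."""
    return _title(ancestor_item) is not None


def evaluate(scope, ancestor_item, ancestor_item_id, new_item, new_item_id):
    """Return the accumulator and whether the check passes."""
    passed = _title(ancestor_item) == _title(new_item)
    return scope, passed
```

In this sketch, filter only inspects the ancestor's item, because evaluate is called only when the new item is present.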
The accumulator is initialized as:
"""
Initialize a time-based check result accumulator.
"""
return {
    "total_count": 0,
    "coverage_count": 0,
    "failed_count": 0,
    "ok_count": 0,
    "examples": ReservoirSampler(50),
}
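To make the role of the accumulator more concrete, the sketch below shows one way the counters could be updated for each pair of items. The counting rules, the update and new_accumulator functions, and the StandInSampler class are assumptions for illustration only; StandInSampler is a simple reservoir sampler (Algorithm R) standing in for the project's ReservoirSampler, whose actual interface is not shown here.

```python
import random


class StandInSampler:
    """Hypothetical stand-in: keeps a uniform sample of at most `limit` values."""

    def __init__(self, limit):
        self.limit = limit
        self.seen = 0
        self.sample = []

    def process(self, value):
        self.seen += 1
        if len(self.sample) < self.limit:
            self.sample.append(value)
        else:
            # Algorithm R: keep the new value with probability limit / seen.
            index = random.randrange(self.seen)
            if index < self.limit:
                self.sample[index] = value


def new_accumulator():
    """Build an accumulator with the same keys as the one shown above."""
    return {
        "total_count": 0,
        "coverage_count": 0,
        "failed_count": 0,
        "ok_count": 0,
        "examples": StandInSampler(50),
    }


def update(scope, check_filter, check_evaluate,
           ancestor_item, ancestor_item_id, new_item, new_item_id):
    """Apply one pair of items to the accumulator (assumed counting rules)."""
    args = (ancestor_item, ancestor_item_id, new_item, new_item_id)
    if not check_filter(scope, *args):
        return scope  # the check cannot be calculated for this pair
    scope["total_count"] += 1
    if new_item is None:
        return scope  # counted, but not covered by the new dataset
    scope["coverage_count"] += 1
    scope, passed = check_evaluate(scope, *args)
    scope["ok_count" if passed else "failed_count"] += 1
    scope["examples"].process(
        {"item_id": ancestor_item_id, "new_item_id": new_item_id, "result": passed}
    )
    return scope
```

Under these assumed rules, ok_count and failed_count always sum to coverage_count, and coverage_count never exceeds total_count.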
time_variance/processor.py then prepares the result dict. An empty result dict looks like:
"""
Initialize a time-based check result.
:param version: the check's version
"""
return {
    "check_result": None,
    "check_value": None,
    "coverage_value": None,
    "coverage_result": None,
    "meta": None,
    "version": version,
}
Storage
The result of each check for a given dataset is stored in a single row in the time_variance_level_check table.