Ed-Fi Validation API Design

Prepared for the Ed-Fi Alliance by: Jonathan Hickam, Learning Tapestry

Further Info

This design and its architectural options were discussed by the Ed-Fi Technical Advisory Group. Please see the notes on TAG Meeting 2021-03-31 - Subgroup to Review Validations API Design for important information on how this design would likely be surfaced within the Ed-Fi technology stack.

Introduction

This document has been prepared as part of a larger initiative that is looking at scalable, economical, and reusable solutions for level 2 validations. For further context, architecture, and vocabulary, refer to the associated Ed-Fi Data Validation Architecture document. This work also builds heavily upon the work on Ed-Fi validations that was presented at the 2018 Ed-Fi Technical Congress by Vinaya Maya, Software Development Lead – Ed-Fi, and Britto Augustine, Chief Technology Officer – Arizona Dept. of Education, as seen in this presentation.

The purpose of this document is to define an approach that will enable systems that are currently submitting data to an Ed-Fi ODS/API to consume the level 2 validation results that are associated with the data provided.  A common use case would be a scenario in which a Student Information System submitting one district's data to a statewide Ed-Fi implementation would be able to read the validation errors associated with that district's student information data. From here, it could display those errors back to the district users who are responsible for researching and correcting the data, without requiring the district user to interface with a different dashboard or data validation system.

This document proposes a data structure for the validation results and lists out some of the initial details for API functionality.  The complete list of API requirements and detailed technical implementation details for the actual API are beyond the scope of this document.

Data Validation Structure 

The data validation results (as defined in the Ed-Fi Validation Architecture) will be composed of two resources: a "validation rule run" resource and a "validation result" resource.

Validation Rule Run

This element will track the runs of the validation rules. The expectation is that a record would be written to this table with a status of 'Running' before any validation results are produced.

| DATA ELEMENT | DATA TYPE / OPTIONALITY | DESCRIPTION |
| --- | --- | --- |
| ValidationRunId | INTEGER, MANDATORY | This is a unique Id for each run, most likely sequential. |
| ValidationRunStartDateTime | DATETIME, MANDATORY | This is the time that the validation run was started. |
| ValidationRunFinishDateTime | DATETIME, OPTIONAL | This is the time that the validation run finished. |
| ValidationRunStatus | Restricted-list VARCHAR, MANDATORY | This denotes the status of the validation run. Possible values include 'Running', 'Finished', 'Stopped-manual', and 'Stopped-Error'. |
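As a minimal sketch of the run record described above, the following Python models the four data elements and the status values. The class names, the `close` helper, and the choice to record a finish time for stopped runs as well as finished ones are assumptions made for illustration; only the element names and status values come from this document.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class ValidationRunStatus(Enum):
    # Values taken from the restricted list in the table above.
    RUNNING = "Running"
    FINISHED = "Finished"
    STOPPED_MANUAL = "Stopped-manual"
    STOPPED_ERROR = "Stopped-Error"


@dataclass
class ValidationRuleRun:
    validation_run_id: int
    validation_run_start_date_time: datetime
    # A run is recorded as 'Running' before any results are produced.
    validation_run_status: ValidationRunStatus = ValidationRunStatus.RUNNING
    # Optional: unknown until the run ends.
    validation_run_finish_date_time: Optional[datetime] = None

    def close(self, status: ValidationRunStatus, finished_at: datetime) -> None:
        """Hypothetical helper: record how and when the run ended."""
        if status is ValidationRunStatus.RUNNING:
            raise ValueError("a closed run cannot have status 'Running'")
        self.validation_run_status = status
        self.validation_run_finish_date_time = finished_at


# Illustrative values only.
run = ValidationRuleRun(1, datetime(2021, 3, 31, 9, 0, tzinfo=timezone.utc))
run.close(ValidationRunStatus.FINISHED, datetime(2021, 3, 31, 9, 5, tzinfo=timezone.utc))
```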

Validation Results

These are the actual results produced by the validation rules.

| DATA ELEMENT | DATA TYPE / OPTIONALITY | DESCRIPTION |
| --- | --- | --- |
| ValidationResultId | VARCHAR, MANDATORY | This is a unique id. This id value would not be repeated when subsequent validation results from the same rule and the same target entity are produced. |
| ValidationRunId | INTEGER, MANDATORY | This refers (foreign key) back up to the validation rule run. |
| ValidationRuleId* | VARCHAR, MANDATORY | This is a unique id that points back to the validation rule that caused the result to be produced. If a validation rule caused multiple results (for example, multiple students with the same condition), they would share this id. This is part of the validation result signature. |
| ValidationResourceId* | Ed-Fi Resource Id, MANDATORY | This is the unique identifier in the ODS that is used to reference a specific resource. Examples include StudentUniqueId or EducationOrganizationId. This is part of the validation result signature. |
| ValidationResourceType | Ed-Fi Resource, MANDATORY | This is the resource associated with the validation rule. This is denormalized from the validation rule: every instance of a given ValidationRuleId will have the same Ed-Fi resource. |
| EducationOrganizationId | INTEGER, MANDATORY | Along with NameSpace, this is useful for limiting which systems can consume the validation results and for routing the validation results within the consuming system. |
| NameSpace | VARCHAR, OPTIONAL | Along with EducationOrganizationId, this can be used for limiting which systems can consume the validation results and for routing the validation results within the consuming system. |
| ValidationDetails | VARCHAR, MANDATORY | This is non-structured ASCII text that will include the details that were used in the evaluation of the validation rule. |
| ValidationCategory | Descriptor, OPTIONAL | This is a category for the type of validation result. Examples might be 'Student Demographics', 'Special Education', or 'Attendance'. |
| ValidationSeverity | Restricted-list VARCHAR, MANDATORY | This specifies whether the validation result is a 'Warning', 'Minor Validation Error', or 'Major Validation Error'. |
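The validation result record above could be represented roughly as follows. This is a sketch under stated assumptions: the class names are hypothetical, the "Ed-Fi Resource Id", "Ed-Fi Resource", and "Descriptor" types are modeled as plain strings, and the sample values are invented for illustration. The mandatory and optional elements follow the table.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ValidationSeverity(Enum):
    # Values taken from the ValidationSeverity description above.
    WARNING = "Warning"
    MINOR = "Minor Validation Error"
    MAJOR = "Major Validation Error"


@dataclass(frozen=True)
class ValidationResult:
    validation_result_id: str
    validation_run_id: int            # foreign key to the validation rule run
    validation_rule_id: str           # part of the validation result signature
    validation_resource_id: str       # part of the signature (assumed to be a string)
    validation_resource_type: str     # assumed to be the resource name as a string
    education_organization_id: int
    validation_details: str
    validation_severity: ValidationSeverity
    name_space: Optional[str] = None           # optional per the table
    validation_category: Optional[str] = None  # optional per the table


# Invented sample values, for illustration only.
example = ValidationResult(
    validation_result_id="result-0001",
    validation_run_id=1,
    validation_rule_id="rule-7",
    validation_resource_id="student-42",
    validation_resource_type="Student",
    education_organization_id=255901,
    validation_details="Birth date is after enrollment date.",
    validation_severity=ValidationSeverity.MAJOR,
    validation_category="Student Demographics",
)
```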

Validation Result Signature

A validation result will include the ValidationRuleId, which references back to the rule that caused it, and a ValidationResourceId, which will uniquely identify the Ed-Fi resource that is identified in the validation result.  If the underlying data is not corrected, then the next time that the validation rule runs the same validation rule will flag the same resource as having the same problem.
The system that is consuming the APIs may need to know that this is just an updated report on the same issue and not a new issue, especially if that issue has somehow been acknowledged or suppressed in that consuming system. The combination of ValidationRuleId and ValidationResourceId, marked with an asterisk (*) in the table above, forms the signature that can be used by that consuming system for implementing logic for deduplication and acknowledgement. The requirements of that logic are beyond the scope of this document.
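To illustrate how a consuming system might use the signature, the sketch below separates first-time issues from repeat reports of an issue it has already seen. The function name, the dictionary-based record shape, and the sample ids are assumptions; the document deliberately leaves the actual deduplication logic to each implementation.

```python
from typing import Dict, Iterable, List, NamedTuple, Set, Tuple


class Signature(NamedTuple):
    validation_rule_id: str
    validation_resource_id: str


def split_new_and_repeated(
    results: Iterable[Dict[str, str]], seen: Set[Signature]
) -> Tuple[List[Dict[str, str]], List[Dict[str, str]]]:
    """Partition results into first-time issues and repeats of known issues.

    `seen` is updated in place so it can be carried across validation runs.
    """
    new: List[Dict[str, str]] = []
    repeated: List[Dict[str, str]] = []
    for result in results:
        sig = Signature(result["ValidationRuleId"], result["ValidationResourceId"])
        if sig in seen:
            repeated.append(result)
        else:
            seen.add(sig)
            new.append(result)
    return new, repeated


# Two runs of the same rule: the uncorrected student is flagged again
# under a different ValidationResultId but the same signature.
first_run = [
    {"ValidationResultId": "a1", "ValidationRuleId": "rule-7", "ValidationResourceId": "student-42"},
]
second_run = [
    {"ValidationResultId": "b1", "ValidationRuleId": "rule-7", "ValidationResourceId": "student-42"},
    {"ValidationResultId": "b2", "ValidationRuleId": "rule-7", "ValidationResourceId": "student-99"},
]

seen: Set[Signature] = set()
new_1, repeated_1 = split_new_and_repeated(first_run, seen)
new_2, repeated_2 = split_new_and_repeated(second_run, seen)
```

Because ValidationResultId is never reused, it cannot serve this purpose on its own; the signature is what stays stable between runs.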

Validation Results API Requirements

The following are the proposed, initial requirements for the validation results API:

  1. There must be a repository for validation results.
    1. The validation results repository should account for all of the data elements, their associated data types, and their associated optionality as described in the data validation structure section above.
    2. The validation results repository should be stored in a way that it can intuitively be queried by administrators with the appropriate level of database access.
    3. The validation results repository should have a mechanism that prevents partial reads of validation results. In other words, the timestamps on the validation results must be sequential, so that a validation result never becomes available in the API with an older timestamp than validation results that are already available via the API.
  2. Validation results must be made available via a web-based pull API similar to other Ed-Fi resources.
    1. Validation results must be available to be pulled by any system that conforms to a published API.
    2. The validation results must include either Namespace or EducationOrganizationId. This is to enable the consuming system to route the validation result to the correct end user (e.g., a specific school in a SIS).
    3. The consumer API should reuse the data-level security mechanism that is used by the existing Ed-Fi ODS (the details of this are unresolved; see the 'open issues' section of the architecture document for more discussion).
  3. Validation results must have the ability to be submitted via a bulk database API.
    1. The validation results submittal API should work with bulk SQL statements from a variety of validation rules engines.
  4. The validation engine must have a back-door method for viewing validation results (e.g., database access).
  5. The validation result API must have a mechanism for handling incoming validation errors from systems other than the validation rule engine.
    1. The validation result API must be able to handle validation results that are not directly related to Ed-Fi ODS data.
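Several of the requirements above (a queryable repository, bulk SQL submission, no partial reads, and scoping by EducationOrganizationId) can be sketched together. The following uses Python's built-in `sqlite3` as a stand-in for the real repository. The reduced column set, the function names, and the sample ids are hypothetical. Hiding results until their run is marked 'Finished' is only one possible way to avoid partial reads; it is an assumption of this sketch, not the timestamp-based mechanism the requirement describes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ValidationRuleRun (
    ValidationRunId     INTEGER PRIMARY KEY,
    ValidationRunStatus TEXT NOT NULL
);
CREATE TABLE ValidationResult (
    ValidationResultId      TEXT PRIMARY KEY,
    ValidationRunId         INTEGER NOT NULL REFERENCES ValidationRuleRun(ValidationRunId),
    EducationOrganizationId INTEGER NOT NULL,
    ValidationDetails       TEXT NOT NULL
);
""")


def start_run(run_id):
    # The run row exists with status 'Running' before any results arrive.
    with conn:
        conn.execute("INSERT INTO ValidationRuleRun VALUES (?, 'Running')", (run_id,))


def bulk_submit(rows):
    # One bulk statement per batch; commits on success, rolls back on error.
    with conn:
        conn.executemany("INSERT INTO ValidationResult VALUES (?, ?, ?, ?)", rows)


def finish_run(run_id):
    with conn:
        conn.execute(
            "UPDATE ValidationRuleRun SET ValidationRunStatus = 'Finished' "
            "WHERE ValidationRunId = ?",
            (run_id,),
        )


def readable_results(education_organization_id):
    """Return result ids for one organization, from finished runs only."""
    cur = conn.execute(
        """SELECT r.ValidationResultId
           FROM ValidationResult r
           JOIN ValidationRuleRun run ON run.ValidationRunId = r.ValidationRunId
           WHERE run.ValidationRunStatus = 'Finished'
             AND r.EducationOrganizationId = ?
           ORDER BY r.ValidationResultId""",
        (education_organization_id,),
    )
    return [row[0] for row in cur.fetchall()]


start_run(1)
bulk_submit([
    ("a1", 1, 255901, "details for a1"),
    ("a2", 1, 255902, "details for a2"),
])
before_finish = readable_results(255901)  # run still 'Running'
finish_run(1)
after_finish = readable_results(255901)
```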

Validation Consumer Requirements

The following is a preliminary, incomplete list of functionality that would need to be enabled by the validation result consumer system (hereafter called 'consumer') responsible for pulling validation results from the above-mentioned API. The consumer is most likely to be a Student Information System (SIS).

  1. The consumer must be able to talk to the validation API.
  2. The consumer should recognize when validation results with the same unique signature as described above are duplicates and handle them according to the logic specific for that implementation.
  3. The consumer should have the ability to route validation results to the appropriate end user for resolution.
  4. The consumer should have a method for a user to 'acknowledge' a known issue so that future validation results with the same signature are suppressed. The detailed requirements for this are beyond the scope of this document.
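The routing and acknowledgement requirements above might look something like the following on the consumer side. This is a sketch only: `ConsumerInbox` and its methods are hypothetical names, results are represented as plain dictionaries keyed by the data element names, and routing by EducationOrganizationId alone is an assumption (a real consumer might also use NameSpace).

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# (ValidationRuleId, ValidationResourceId), as defined in the signature section.
Signature = Tuple[str, str]


class ConsumerInbox:
    """Hypothetical consumer-side store; not part of any Ed-Fi API."""

    def __init__(self) -> None:
        self.acknowledged: Set[Signature] = set()

    def acknowledge(self, rule_id: str, resource_id: str) -> None:
        """Mark an issue as known so later results with this signature are hidden."""
        self.acknowledged.add((rule_id, resource_id))

    def route(self, results: List[dict]) -> Dict[int, List[dict]]:
        """Group unsuppressed results by EducationOrganizationId."""
        queues: Dict[int, List[dict]] = defaultdict(list)
        for result in results:
            sig = (result["ValidationRuleId"], result["ValidationResourceId"])
            if sig in self.acknowledged:
                continue  # suppressed: a user already acknowledged this issue
            queues[result["EducationOrganizationId"]].append(result)
        return dict(queues)


inbox = ConsumerInbox()
inbox.acknowledge("rule-7", "student-42")

pulled = [
    {"ValidationRuleId": "rule-7", "ValidationResourceId": "student-42", "EducationOrganizationId": 255901},
    {"ValidationRuleId": "rule-7", "ValidationResourceId": "student-99", "EducationOrganizationId": 255901},
    {"ValidationRuleId": "rule-9", "ValidationResourceId": "student-42", "EducationOrganizationId": 255902},
]
queues = inbox.route(pulled)
```

Note that acknowledging one rule for a resource does not hide other rules that flag the same resource, since the signature includes both ids.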


Original Google doc: https://docs.google.com/document/d/1H2CaCUyg9cAcpAfoDZSeDjdIpliyLlOaYs-Axp6OGSg