Ed-Fi Validation API Design Rev1

Further Info

Note that this design is a revision to Ed-Fi Validation API Design

Introduction

This document has been prepared as part of a larger initiative that is looking at scalable, economical, and reusable solutions for level 2 validations.  For further context, architecture, and vocabulary, refer to the associated Ed-Fi Data Validation Architecture document. This work also builds heavily upon the work on Ed-Fi validations that was presented at the 2018 Ed-Fi Technical Congress by Vinaya Maya, Software Development Lead – Ed-Fi, and Britto Augustine, Chief Technology Officer – Arizona Dept. of Education, as seen in this presentation.

The purpose of this document is to define an approach that will enable systems that are currently submitting data to an Ed-Fi ODS/API to consume the level 2 validation results that are associated with the data provided.  A common use case would be a scenario in which a Student Information System submitting one district's data to a statewide Ed-Fi implementation would be able to read the validation errors associated with that district's student information data. From here, it could display those errors back to the district users who are responsible for researching and correcting the data, without requiring the district user to interface with a different dashboard or data validation system.

This document proposes a data structure for the validation results and lists out some of the initial details for API functionality.  The complete list of API requirements and detailed technical implementation details for the actual API are beyond the scope of this document.

Major Changes from Previous Version

  • A ValidationRule endpoint was added to capture rules and to allow API clients to reference them via the API. This endpoint will de facto be required for an API client to understand the severity and categorization of issues.
  • The model was normalized by moving Severity and Category from ValidationResult to ValidationRule.
  • ValidationRule.Severity was changed from a string datatype to a descriptor reference.
  • EducationOrganization identifiers were changed to be references; this is consistent with the main data API pattern, and is important for allowing the Validation API to be implemented as an API extension.
  • Repeated use of term "Validation" on API resource fields was dropped to shorten element naming.
  • ValidationRun.RunStatus resource was changed from a string to a descriptor reference.

Data Validation Structure 

The data validation results (as defined in the Ed-Fi Validation Architecture) will be composed of three resources: a "validation rule" resource, a "validation rule run" resource, and a "validation result" resource. A fourth resource, "rule collection", is also listed for this revision but is not detailed below.

Validation Rule

| DATA ELEMENT | DATA TYPE / OPTIONALITY | REVISION | DESCRIPTION |
|---|---|---|---|
| RuleIdentifier* | STRING, MANDATORY (IDENTITY) | New | This is the unique Id for a validation rule. |
| RuleSource | STRING, MANDATORY (IDENTITY) | New | The source or origin of the rule. |
| HelpUrl | STRING, OPTIONAL | New | A link to more information about the rule and how to resolve it. |
| ShortDescription | STRING, OPTIONAL | New | This is non-structured ASCII text that will include the short details that were used in the evaluation of the validation rule. |
| Description | STRING, MANDATORY | New | This is non-structured ASCII text that will include the details that were used in the evaluation of the validation rule. |
| RuleStatus | Restricted-list STRING, MANDATORY | New | The current status of the rule. Examples are “Active”, “Under Analysis”, “Inactive”, “Deprecated”. |
| Category | STRING, OPTIONAL | New | This is a category for the type of validation rule.  Examples might be 'Student Demographics', 'Special Education', or 'Attendance'. |
| Severity | DESCRIPTOR, MANDATORY | New | This specifies whether the validation rule is a 'Warning', 'Minor Validation Error', 'Major Validation Error' or other value standardized by the API. |
| ExternalRuleId | STRING, OPTIONAL | New | Refers back to a unique identifier for this rule in another system (such as a state-maintained repository of validation rules). |
| ValidationLogicType | DESCRIPTOR, OPTIONAL | New | Specifies the language that the validation logic is represented in, e.g., SQL or pseudo-code. |
| ValidationLogic | STRING, OPTIONAL | New | Has the actual code or pseudo-code that is used to find validation errors. |
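
To make the shape of this resource more concrete, the following is a minimal Python sketch of a validation rule as a data structure. It is an illustration only, not the proposed API contract: the snake_case attribute names, the descriptor URI format, and the sample rule "ATT-001" are all assumptions made for the example.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class ValidationRule:
    # Mandatory elements from the table above.
    rule_identifier: str
    rule_source: str
    description: str
    rule_status: str
    severity_descriptor: str  # descriptor value; the URI form used below is an assumption
    # Optional elements default to None.
    help_url: Optional[str] = None
    short_description: Optional[str] = None
    category: Optional[str] = None
    external_rule_id: Optional[str] = None
    validation_logic_type_descriptor: Optional[str] = None
    validation_logic: Optional[str] = None

    def identity(self) -> tuple:
        """RuleIdentifier and RuleSource are both marked IDENTITY in the table."""
        return (self.rule_identifier, self.rule_source)

    def to_json(self) -> str:
        """Serialize, leaving out optional elements that were not supplied."""
        populated = {k: v for k, v in asdict(self).items() if v is not None}
        return json.dumps(populated, indent=2)


# Hypothetical rule used only for illustration.
rule = ValidationRule(
    rule_identifier="ATT-001",
    rule_source="State DOE",
    description="Each enrolled student must have at least one attendance event.",
    rule_status="Active",
    severity_descriptor="uri://example.org/SeverityDescriptor#Warning",
    category="Attendance",
)
print(rule.to_json())
```

Because Severity and Category live on the rule, a client that reads a validation result would look up the referenced rule to learn how serious the issue is.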

Validation Rule Run

This element will track the runs of the validation rules. The expectation is that this table would be populated with a status of 'Running' before any validation results are produced.

| DATA ELEMENT | DATA TYPE / OPTIONALITY | REVISION | DESCRIPTION |
|---|---|---|---|
| RunIdentifier* | STRING, MANDATORY (IDENTITY) | Renamed | This is a unique Id for each run. |
| RunStartDateTime | DATETIME, MANDATORY | Renamed | This is the time that the validation run was started. |
| RunFinishDateTime | DATETIME, OPTIONAL | Renamed | This is the time the validation run finished. |
| RunStatus | DESCRIPTOR, MANDATORY | Renamed | This will denote the status of the validation run.  Possible values include 'Running', 'Finished', 'Stopped-manual', 'Stopped-Error'. |
| Host | STRING, OPTIONAL | New | The name of the Host or ODS that was evaluated in this run. |
| ValidationEngine | STRING, OPTIONAL | New | A reference to the validation engine that was responsible for this run. |
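
The run lifecycle described above (a record created as 'Running' before any results exist, then closed with a final status) could be sketched as follows. This is a hedged illustration: the function names `start_run` and `close_run`, the in-memory record, and the use of plain strings in place of descriptor references are assumptions, not part of the design.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

# Status values taken from the RunStatus description in the table above.
FINAL_STATUSES = {"Finished", "Stopped-manual", "Stopped-Error"}


@dataclass
class ValidationRun:
    run_identifier: str
    run_start_date_time: datetime
    run_status: str = "Running"
    run_finish_date_time: Optional[datetime] = None  # optional until the run ends
    host: Optional[str] = None
    validation_engine: Optional[str] = None


def start_run(run_identifier: str, host: Optional[str] = None,
              validation_engine: Optional[str] = None) -> ValidationRun:
    """Create the run record first, with status 'Running' and no finish time."""
    return ValidationRun(
        run_identifier=run_identifier,
        run_start_date_time=datetime.now(timezone.utc),
        host=host,
        validation_engine=validation_engine,
    )


def close_run(run: ValidationRun, status: str) -> ValidationRun:
    """Record the final status and the finish time of a run."""
    if status not in FINAL_STATUSES:
        raise ValueError(f"not a final run status: {status!r}")
    run.run_status = status
    run.run_finish_date_time = datetime.now(timezone.utc)
    return run


run = start_run("run-001", host="ods-example")  # hypothetical identifiers
# ... validation results would be produced here ...
close_run(run, "Finished")
```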

Validation Results

These are the actual results from the validation rules.

| DATA ELEMENT | DATA TYPE / OPTIONALITY | REVISION | DESCRIPTION |
|---|---|---|---|
| ResultIdentifier* | STRING, MANDATORY | Renamed | This is a unique id. |
| ValidationRuleRunReference | REFERENCE, OPTIONAL | Renamed | This refers (foreign key) back up to the validation rule run. |
| ValidationRuleReference | REFERENCE, MANDATORY | Renamed | This is a unique id that points back to the validation rule that caused the result to be produced. If a validation rule caused multiple results (for example, multiple students with the same condition), they would share this id. This is part of the validation result signature. |
| ResourceId | Ed-Fi Resource Id, OPTIONAL | Renamed | This is the unique identifier in the ODS that is used to reference a specific resource.  Examples include StudentUniqueId or EducationOrganizationId. This is part of the validation result signature. |
| ResourceType | Ed-Fi Resource, OPTIONAL | Renamed | This is the resource associated with the validation rule. This is denormalized from the validation rule; every instance of a given RuleId will have the same Ed-Fi resource. |
| EducationOrganizationReference | REFERENCE, MANDATORY | Renamed and Datatype changed | Along with NameSpace, this is useful for limiting what systems can consume the validation results and routing the validation results within the consuming system. As a reference, this JSON will follow this format: "educationOrganizationReference": { "educationOrganizationId": [ id ] } |
| StudentReference | REFERENCE, OPTIONAL | New | Reference back to an Ed-Fi student object, when applicable for that validation. |
| StaffReference | REFERENCE, OPTIONAL | New | Reference back to an Ed-Fi staff object, when applicable for that validation. |
| NameSpace | STRING, OPTIONAL | - | Along with EducationOrganization, this can be used for limiting what systems can consume the validation results and routing the validation results within the consuming system. |
| AdditionalContext | Array of name/value pairs, OPTIONAL | Renamed and Datatype changed | Includes the details that were used in the evaluation of the validation rule. |
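
As an illustration of how these elements might appear together, the following Python sketch assembles one validation result as a JSON-style dictionary. Only the shape of educationOrganizationReference comes from the table above; the other property names, the layout of validationRuleReference and additionalContext, and all sample values are assumptions made for this example.

```python
import json
from typing import Optional


def build_result(result_identifier: str, rule_identifier: str,
                 education_organization_id: int,
                 resource_id: Optional[str] = None,
                 resource_type: Optional[str] = None,
                 namespace: Optional[str] = None,
                 additional_context: Optional[dict] = None) -> dict:
    """Assemble a result payload, including optional elements only when given."""
    result = {
        "resultIdentifier": result_identifier,
        # Reference layout is an assumption for this sketch.
        "validationRuleReference": {"ruleIdentifier": rule_identifier},
        # Shape taken from the table above, with [ id ] filled by a value.
        "educationOrganizationReference": {
            "educationOrganizationId": education_organization_id
        },
    }
    optional = {
        "resourceId": resource_id,
        "resourceType": resource_type,
        "nameSpace": namespace,
    }
    result.update({k: v for k, v in optional.items() if v is not None})
    if additional_context:
        # AdditionalContext is an array of name/value pairs.
        result["additionalContext"] = [
            {"name": name, "value": value}
            for name, value in additional_context.items()
        ]
    return result


# Hypothetical values used only for illustration.
example = build_result(
    result_identifier="res-0001",
    rule_identifier="ATT-001",
    education_organization_id=255901,
    resource_id="604822",
    resource_type="Student",
    additional_context={"attendanceEventCount": "0"},
)
print(json.dumps(example, indent=2))
```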

Validation Result Signature

De-duplication by API Client

A validation result will include the RuleIdentifier, which references the rule that caused it, and the impacted resources, as identified by the EducationOrganization reference, the student reference, the staff reference, and the ResourceId, which uniquely identifies the Ed-Fi resource named in the validation result.  If the underlying data is not corrected, then the next time the validation rule runs it will flag the same resource as having the same problem.

The system that is consuming the APIs may need to know that this is just an updated report on the same issue and not a new issue, especially if that issue has somehow been acknowledged or suppressed in that consuming system. The combination of the rule identifier (ValidationRuleReference) and ResourceId, as noted in the table above, forms the signature that the consuming system can use to implement logic for deduplication and acknowledgement. The requirements of that logic are beyond the scope of this document.
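
One way a consuming system might use the signature is sketched below. It is an assumption-laden illustration rather than a required design: results are represented as flat dictionaries with hypothetical keys `ruleIdentifier` and `resourceId`, and previously seen signatures are held in an in-memory set.

```python
from typing import Iterable, List, Set, Tuple

Signature = Tuple[str, str]


def result_signature(result: dict) -> Signature:
    """Signature as described above: rule identifier plus ResourceId."""
    return (result["ruleIdentifier"], result["resourceId"])


def split_new_and_repeated(results: Iterable[dict],
                           seen: Set[Signature]) -> Tuple[List[dict], List[dict]]:
    """Separate results whose signature is new from repeats of known issues.

    `seen` is updated in place so that it can be carried across runs.
    """
    new, repeated = [], []
    for result in results:
        signature = result_signature(result)
        if signature in seen:
            repeated.append(result)
        else:
            seen.add(signature)
            new.append(result)
    return new, repeated


# Two hypothetical runs; the issue for student S2 is not fixed between them.
first_run = [
    {"ruleIdentifier": "ATT-001", "resourceId": "S1"},
    {"ruleIdentifier": "ATT-001", "resourceId": "S2"},
]
second_run = [
    {"ruleIdentifier": "ATT-001", "resourceId": "S2"},
    {"ruleIdentifier": "DEM-004", "resourceId": "S2"},
]

seen_signatures: Set[Signature] = set()
new_1, repeated_1 = split_new_and_repeated(first_run, seen_signatures)
new_2, repeated_2 = split_new_and_repeated(second_run, seen_signatures)
```

In the second run, the ATT-001 result for S2 is recognized as an updated report on a known issue, while the DEM-004 result for the same student is treated as new because its rule identifier differs.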

Validation Results API Requirements

The following are the proposed, initial requirements for the validation results API:

  1. There must be a repository for validation results.
    1. The validation results repository should account for all of the data elements, their associated data types, and their associated optionality as described in the data validation structure section above.
    2. The validation results repository should be stored in a way that it can intuitively be queried by administrators with the appropriate level of database access.
    3. The validation results repository should have a mechanism that will prevent partial reads of validation results.  In other words, the timestamps on validation results must be sequential, so that a validation result never becomes available in the API with an older timestamp than validation results that are already available via the API.
  2. Validation results must be made available via a web-based pull API similar to other Ed-Fi resources.
    1. Validation results must be available to be pulled by any system that conforms to a published API.
    2. The validation results must include either Namespace or EducationOrganizationId. This is to enable the consuming system to route the validation result to the correct end user (e.g., a specific school in a SIS).
    3. The consumer API should reuse the data-level security mechanism that is used via the existing Ed-Fi ODS (the details of this are unresolved, see the 'open issues' section of the architecture document for more discussion).
  3. Validation results must have the ability to be submitted via a bulk database API.
    1. The validation results submittal API should work with bulk SQL statements from a variety of validation rules engines.
  4. The validation engine must have a back-door method for viewing validation results (e.g., database access).
  5. The validation result API must have a mechanism for handling incoming validation errors from systems other than the validation rule engine.
    1. The validation result API must be able to handle validation results that are not directly related to Ed-Fi ODS data.
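
The partial-read requirement (1.3 above) matters because a pull client would typically remember the newest timestamp it has read and ask only for later results. The sketch below shows that polling pattern under stated assumptions: the tables above do not name a timestamp element, so `publishedAt` is hypothetical, as are the function name `fetch_since` and the in-memory list standing in for the repository.

```python
from datetime import datetime, timezone
from typing import List, Tuple


def fetch_since(published: List[dict],
                watermark: datetime) -> Tuple[List[dict], datetime]:
    """Return results newer than `watermark`, plus the advanced watermark.

    This is only safe if results never appear later with a timestamp at or
    below a watermark that a client has already read past.
    """
    batch = sorted(
        (r for r in published if r["publishedAt"] > watermark),
        key=lambda r: r["publishedAt"],
    )
    new_watermark = batch[-1]["publishedAt"] if batch else watermark
    return batch, new_watermark


def utc(hour: int) -> datetime:
    """Helper for building sample timestamps on one hypothetical day."""
    return datetime(2019, 1, 15, hour, tzinfo=timezone.utc)


repository = [
    {"resultIdentifier": "res-1", "publishedAt": utc(8)},
    {"resultIdentifier": "res-2", "publishedAt": utc(9)},
]

start = datetime(1970, 1, 1, tzinfo=timezone.utc)
first_batch, mark = fetch_since(repository, start)

# A later result becomes available; the next poll picks up only that one.
repository.append({"resultIdentifier": "res-3", "publishedAt": utc(10)})
second_batch, mark = fetch_since(repository, mark)
```

If "res-3" had instead been published with a timestamp earlier than 09:00, the second poll would have missed it, which is the situation the sequential-timestamp requirement is meant to rule out.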

Validation Consumer Requirements

The following is a preliminary, incomplete list of functionality that would need to be enabled by the validation result consumer system (hereafter called 'consumer') responsible for pulling validation results from the above-mentioned API. The consumer is most likely to be a Student Information System (SIS).

  1. The consumer must be able to talk to the validation API.
  2. The consumer should recognize when validation results with the same unique signature, as described above, are duplicates and handle them according to the logic specific to that implementation.
  3. The consumer should have the ability to route validation results to the appropriate end user for resolution.
  4. The consumer should have a method for a user to 'acknowledge' a known issue so that future validation results with the same signature are suppressed. The detailed requirements for this are beyond the scope of this document.
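
Requirements 3 and 4 above could be combined in a consumer roughly as follows. This is a minimal sketch under assumptions: results are dictionaries with hypothetical keys `ruleIdentifier` and `resourceId`, routing is keyed on educationOrganizationId alone, and acknowledged signatures are kept in a simple set.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def route_results(results: Iterable[dict],
                  acknowledged: Set[Tuple[str, str]]) -> Dict[int, List[dict]]:
    """Group results by education organization, skipping acknowledged issues."""
    inboxes: Dict[int, List[dict]] = defaultdict(list)
    for result in results:
        signature = (result["ruleIdentifier"], result["resourceId"])
        if signature in acknowledged:
            continue  # a user has already acknowledged this known issue
        ed_org_id = result["educationOrganizationReference"]["educationOrganizationId"]
        inboxes[ed_org_id].append(result)
    return dict(inboxes)


# Hypothetical results for two schools.
incoming = [
    {"ruleIdentifier": "ATT-001", "resourceId": "S1",
     "educationOrganizationReference": {"educationOrganizationId": 255901}},
    {"ruleIdentifier": "ATT-001", "resourceId": "S2",
     "educationOrganizationReference": {"educationOrganizationId": 255901}},
    {"ruleIdentifier": "DEM-004", "resourceId": "S3",
     "educationOrganizationReference": {"educationOrganizationId": 255902}},
]

# A user at school 255901 previously acknowledged the ATT-001 issue for S2.
acknowledged_signatures = {("ATT-001", "S2")}
routed = route_results(incoming, acknowledged_signatures)
```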