---
title: "Validate Behaverse Sample Data Files"
author: "Morteza Ansarinia"
date: "Nov 26, 2019"
output: 
  html_document: 
    highlight: espresso
---

```{r setup, include=FALSE}
# TODO: change the current directory to where `data` and `behaverse/validators` are available.
knitr::opts_knit$set(root.dir = "~/workspace/notebooks/")

knitr::opts_chunk$set(echo = TRUE)

library(reticulate)
use_condaenv("miniconda3", conda = "auto", required = TRUE)
```

The following code demonstrates how to validate the content of a sample JSON file generated by Behaverse.

TODO: BUILD?

```{python}
import jsonschema
import json
#import fastjsonschema

class BehaverseJsonValidator():
    """
    Loads a Behaverse JSON file from the local drive, aggregates its JSON lines into a
    single array, and validates the content and structure of the events.
    """

    def __init__(self, json_schema_path: str = None):
        # Always define `schema` so later checks do not hit an AttributeError.
        self.schema = None
        if json_schema_path is not None:
            with open(json_schema_path) as f:
                self.schema = json.load(f)

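    @staticmethod
    def __join_json_lines(src_path):
        with open(src_path) as f:
            # Skip blank lines; an empty entry would make the joined array invalid JSON.
            lines = [line.strip() for line in f if line.strip()]
        return f"[{','.join(lines)}]"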

    def convert(self, src_path, dest_path):
        with open(dest_path, "w") as f:
            f.write(self.__join_json_lines(src_path))

    def validate_json(self, content):
        """
        Validates a JSON object against a predefined schema (JSON Schema draft-07).
        If the content is erroneous, a validation error is raised.
        """
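        # Raise explicitly: `assert` is stripped when Python runs with -O.
        if getattr(self, "schema", None) is None:
            raise ValueError("No schema loaded; pass `json_schema_path` to the constructor.")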
        return jsonschema.validate(content, self.schema)

    def validate_json_file(self, json_path: str):
        """
        Loads and validates a JSON file. The corresponding validation error is raised
        if the structure is invalid.
        """
        with open(json_path) as f:
            content = json.load(f)
            return self.validate_json(content)

```

This code first concatenates all JSON lines into a single JSON array, and then checks the resulting file against a provided schema. The schema contains rules to validate content, structure, conditional structures (e.g., `custom` metadata), and the integrity of the stored fields.
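
To illustrate what a conditional rule can look like, here is a minimal sketch of a draft-07 schema using `if`/`then`. The field names (`type`, `timestamp`, `custom`) and the rule itself are assumptions made up for this example; they are not taken from `nback.schema.json`.

```python
import jsonschema

# Hypothetical draft-07 schema: every event needs `type` and `timestamp`;
# events whose `type` is "custom" must additionally carry a `custom` object.
event_schema = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "type": "array",
    "items": {
        "type": "object",
        "required": ["type", "timestamp"],
        "properties": {
            "type": {"type": "string"},
            "timestamp": {"type": "number", "minimum": 0},
        },
        "if": {"properties": {"type": {"const": "custom"}}},
        "then": {
            "required": ["custom"],
            "properties": {"custom": {"type": "object"}},
        },
    },
}

# Made-up events, not real Behaverse data.
valid_events = [
    {"type": "stimulus", "timestamp": 0.5},
    {"type": "custom", "timestamp": 1.2, "custom": {"note": "demo"}},
]
invalid_events = [{"type": "custom", "timestamp": 1.2}]  # missing `custom`

jsonschema.validate(valid_events, event_schema)  # passes silently

try:
    jsonschema.validate(invalid_events, event_schema)
    is_rejected = False
except jsonschema.ValidationError as err:
    is_rejected = True
    print("Rejected:", err.message)
```

The block uses a plain `python` fence so it is shown but not executed when the notebook is knitted.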

TODO: list all concerns that are being tested in a bullet list.


```{python}
import os
original_json_file = "data/samples/nback_demo.json"
valid_json_file = "data/samples/nback_demo_v.json"
schema_file = "behaverse/validators/nback.schema.json"

# Join JSON lines into a single JSON array (uncomment to regenerate `valid_json_file`)
#BehaverseJsonValidator().convert(original_json_file, valid_json_file)

# Load the joined JSON array and validate it against the schema
BehaverseJsonValidator(json_schema_path=schema_file).validate_json_file(valid_json_file)

print("Good news everyone! Behaverse JSON file is valid!")
```
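
`validate_json_file` expects the converted array file, so a malformed line in the original file only surfaces when `json.load` fails on the whole array. As a minimal sketch, assuming one event per line, the JSON lines can instead be parsed one at a time so that the offending line number is reported. `load_json_lines` is a hypothetical helper, not part of the validator class above, and it uses only the standard library.

```python
import json
import os
import tempfile


def load_json_lines(path):
    """Parse a JSON-lines file into a list, skipping blank lines.

    Hypothetical helper; raises ValueError naming the first malformed line.
    """
    events = []
    with open(path) as f:
        for line_number, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines, e.g. a trailing newline
            try:
                events.append(json.loads(line))
            except json.JSONDecodeError as err:
                raise ValueError(f"{path}, line {line_number}: {err.msg}") from err
    return events


# Demo on a throwaway file with made-up events (not real Behaverse data).
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.json")
    with open(demo_path, "w") as f:
        f.write('{"trial": 1}\n\n{"trial": 2}\n')
    events = load_json_lines(demo_path)

print(len(events))
```

The resulting list could then be passed directly to `validate_json`, which avoids writing the intermediate array file.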