Treehouse

JSON Validation with JSON Schema

By on  

It didn't take long for JSON to become the hottest thing since Pam Anderson slowly bounced her way down the BayWatch beaches. And why shouldn't it be? JSON is easy to understand visually, easy to parse on both the client and server sides, and is supported in just about every language except aborigine. There is however one problem I see with the way JSON is used by developers today: lack of validation. Most developers assume the JSON provide is not only error-free also in the proper format. Bad assumption. Let me show you how Kris Zyp's JSON Schema can help you validate JSON on both the client and server sides.

What is JSON Schema?

JSON Schema is a standard (currently in draft) which provides a coherent schema by which to validate a JSON "item" against. Properties within the schema are defined and with another object containing their expected type. For example:

"myObj" : {
	"type" : "array",
	"properties" : {
		"id": { "type": "number" },
		"username": { "type" : "string" }
	}
}

Besides providing the required type, other properties can be defined, including:

  • items: This should be a schema or an array of schemas. When this is an object/schema and the instance value is an array, all the items in the array must conform to this schema.
  • optional: Notes if the property should be considered optional
  • requires: This indicates that if this property is present in the containing instance object, the property given by requires attribute must also be present in the containing instance object.
  • maxItems: Defines the maximum number of items in the collection

Numerous other properties are available, all of which may be found at: http://tools.ietf.org/html/draft-zyp-json-schema-03

Defining a Simple JSON Schema

Let's say that our application requires data in the following format:

{
	users: [
		{ id: 1, username: "davidwalsh", numPosts: 404, realName: "David Walsh" },
		{ id: 2, username: "russianprince", numPosts: 12, realName: "Andrei Arshavin" }
	]
}

Right away we can see:

  • The object has a users property
  • The users property is an array
  • The users array contains objects
  • Each object has an id (number), username (string), numPosts (number), and realName (string)

With this structure in mind, we can create a simple schema to validate our expected format:

{
	"type" : "object",
	"properties" : {
		"users" : {
			"type" : "array", // remember that arrays are objects
			"items" : { // "items" represents the items within the "users" array
				"type" : "object",
				"properties" : {
					"id": { "type": "number" },
					"username": { "type" : "string" },
					"numPosts": { "type" : "number" },
					"realName": { "type" : "string", optional: true }
				}
			}
		}
	}
}

dojox.json.schema and JSON Schema - Client Side

A JSON Schema validation routine is available with dojox.json.schema. The validate method accepts two arguments: your JSON to validate and the schema. Let's load the schema we created above, along with the sample JSON we created, and validate it:

// Require the json scheme module
dojo.require("dojox.json.schema");

// When resources are ready
dojo.ready(function() {

	// Load the schema
	dojo.xhrGet({
		url: 'schema.json',
		handleAs: 'json',
		load: function(schema) {

			// Now load the JSON
			dojo.xhrGet({
				url: 'users.json',
				handleAs: 'json',
				load: function(users) {

					// Now validate it!
					var result = dojox.json.schema.validate(users,schema);

					// Show the result
					console.log(result);
				}
			});
		}
	});	
});

A true valid property signals that the JSON is valid. If the result fails validation, valid will be false and the errors property will contain an array of error messages detailing why the given property did not pass validation. Here's a sample return result with errors:

{
	errors: [
		{
			message: "is missing and not optional",
			property: "users"
		}
	]
	valid: false
}

How you handle invalid data is up to you; moving forward with invalid data could present a security risk for both your organization and the user.

CommonJS-Utils and JSON Schema - Server Side

Kris also provides a server side JSON Schema validation routine within his CommonJS Utils project on GitHub. I've installed this project using NPM for NodeJS:

npm install commonjs-utils

Within this package is a json-schema resource. The following snippet requires that resources, reads in the schema and data JSON files, and validates the data JSON against the schema:

// Require Sys and FileSystem
var sys = require('sys'), fs = require('fs');

// Require package
var validate = require('commonjs-utils/json-schema').validate;

// Load a schema by which to validate
fs.readFile('schema.json',function(err,data) {
	if(err) throw err;
	var schema = data;
	// Load data file
	fs.readFile('./users.json',function(err,data) {
		if(err) throw err;
		// Parse as JSON
		var posts = JSON.parse(data);
		// Validate
		var validation = validate(posts, schema);
		// Echo to command line
		sys.puts('The result of the validation:  ',validation.valid);
	});
});

To run this via the command line:

node server-validate.js

The server side uses the exact same schema and data as the client side, so your web application can be covered on both fronts.

Closing Thoughts on JSON Schema

JSON Schema is still a draft but I think Kris has done an outstanding job in creating the draft and coding server and client side validators. JSON validation is often overlooked and the data is wrongly assumed as correct. The resources for data validation are available -- it's up to you to use them!

ydkjs-4.png

Recent Features

  • How to Create a Twitter Card

    One of my favorite social APIs was the Open Graph API adopted by Facebook.  Adding just a few META tags to each page allowed links to my article to be styled and presented the way I wanted them to, giving me a bit of control...

  • Responsive and Infinitely Scalable JS Animations

    Back in late 2012 it was not easy to find open source projects using requestAnimationFrame() – this is the hook that allows Javascript code to synchronize with a web browser's native paint loop. Animations using this method can run at 60 fps and deliver fantastic...

Incredible Demos

  • Assign Anchor IDs Using MooTools 1.2

    One of my favorite uses of the MooTools JavaScript library is the SmoothScroll plugin. I use it on my website, my employer's website, and on many customer websites. The best part about the plugin is that it's so easy to implement. I recently ran...

  • Spoiler Prevention with CSS Filters

    No one likes a spoiler.  Whether it be an image from an upcoming film or the result of a football match you DVR'd, sometimes you just don't want to know.  As a possible provider of spoiler content, some sites may choose to warn users ahead...

Discussion

  1. Just to make sure that you are aware of http://www.schematron.com/ and the conceptual difference with grammar based schema like XSD.
    XML schema is a big part of the XML non-human complexity.
    Hope that JSON will not follow the same path !
    Wonder if an API wouldn’t be more suited for this purpose. Instead of an JSON Schema why not a piece of Javascript that would validate the data ?
    Good luck !

  2. Yo home boy!
    Rad, feels like unit-testing for json-responses.
    Hopefully it keeps on developing and can be incorporated as a plugin for moo, jquery and/or yui :)

  3. fred

    Thanks for this interesting post!
    But I tried the validation with nodejs, the schema and instance files load correctly, the result of the validation is always true even if the users instance contains invalid data. For instance:
    {
    "users": [
    { "id": 1, "username": "davidwalsh", "numPosts": 404, "realName": "David Walsh" },
    { "name": "Andrei Arshavin" }
    ]
    }

    Any idea?

  4. Chris Joslyn

    Draft 03 of the JSON Schema spec replaced the “optional” attribute with the “required” attribute. I think that means that the simple schema sample provided above should now be:


    {
    "type" : "object",
    "properties" : {
    "users" : {
    "type" : "object", // remember that arrays are objects
    "items" : { // "items" represents the items within the "users" array
    "type" : "object",
    "properties" : {
    "id": { "type": "number" },
    "username": { "type" : "string", "required" : true },
    "numPosts": { "type" : "number", "required" : true },
    "realName": { "type" : "string" }
    }
    }
    }
    }
    }

    Also note that JSON Schema “is a sub-type of the JSON format”, meaning that comments are not allowed, although they are very handy for explaining what’s going on in this post.

  5. ken

    Thx for implementing JSON schema in Java!

    I have built your library. The car.json test (JSONSchemaTest) fails. I suspect the car schema has changed since you ran your test. The test asserts there should be 4 errors but there are 9:

    $.adr.description: is missing and it is not optional
    $.adr.properties: is missing and it is not optional
    $.adr.type: is missing and it is not optional
    $.geo.description: is missing and it is not optional
    $.geo.properties: is missing and it is not optional
    $.geo.type: is missing and it is not optional
    $.email: string found, object expected
    $.email.value: is missing and it is not optional
    $.email.type: is missing and it is not optional

    A few other suggestions:
    * The test shouldn’t depend on the order of errors in errors[].
    * validate() should return a Map of errors so they can be bound to form fields.
    * Include a snapshot of the car schema in your resources.
    * It would be nice to have a suite which runs all of the tests.
    * The format attribute should be implemented so dates/times can validate.
    * The API should be standardized w/ the IETF so a programmer can switch impls.

    Do you need some help?

  6. ken

    Oops I meant to post this on Nico’s blog–please accept my apology.

  7. Ashton Harris

    Great read!
    Now I understand this whole JSON thing. Thanks!

  8. Fool Bart

    This code is unreadable on Chrome

  9. nrm

    please put dates on your blog entries! :)

  10. Alexander Magdenko

    Seems, that your schema has errors.
    “type” : “object”, // remember that arrays are objects

    Element “items” is not allowed for “type” : “object”
    You need to change it to “type” : “array”

    Try to validate it here http://jsonschemalint.com/

  11. Mayank

    Hey fred .. you have to parse schema.json file also after reading.
    var schema = JSON.parse(data);
    Now it will work :)

  12. Subeesh

    Its a nice tutorial. Just a question:

    How to validate the sequence of data in json? For example, ‘id’ and then ‘name’ is the real sequence in schema. If ‘name’ comes first then ‘id’, the validator should tell it as wrong. In XML, there is something like . Kindly provide the equivalent in json. By default, the sequence is not getting validated.

    Please help.

    Thank You,
    Subeesh KK

    • IIRC, JavaScript objects are unordered key-value pairs. Order shouldn’t matter in JSON schemas, as it’s not taken into account in the notation.

  13. Alwin

    using instead of:

    var schema = data;

    var schema = JSON.parse(data);

    would help so this little script REALLY validates against a schema.

  14. Roshan

    Dude! awesome thanks!

Wrap your code in <pre class="{language}"></pre> tags, link to a GitHub gist, JSFiddle fiddle, or CodePen pen to embed!