JSON Validation with JSON Schema
It didn't take long for JSON to become the hottest thing since Pam Anderson slowly bounced her way down the Baywatch beaches. And why shouldn't it be? JSON is easy to understand visually, easy to parse on both the client and server sides, and is supported in just about every language except aborigine. There is, however, one problem I see with the way JSON is used by developers today: lack of validation. Most developers assume the JSON provided is not only error-free but also in the proper format. Bad assumption. Let me show you how Kris Zyp's JSON Schema can help you validate JSON on both the client and server sides.
What is JSON Schema?
JSON Schema is a standard (currently in draft) that provides a coherent schema against which to validate a JSON "item". Each property within the schema is defined by another object describing its expected type. For example:
"myObj" : {
  "type" : "object",
  "properties" : {
    "id": { "type": "number" },
    "username": { "type" : "string" }
  }
}
Besides providing the required type, other properties can be defined, including:
- items: This should be a schema or an array of schemas. When this is an object/schema and the instance value is an array, all the items in the array must conform to this schema.
- optional: Notes if the property should be considered optional.
- requires: This indicates that if this property is present in the containing instance object, the property given by the requires attribute must also be present in the containing instance object.
- maxItems: Defines the maximum number of items in the collection.
Numerous other properties are available, all of which may be found at: http://tools.ietf.org/html/draft-zyp-json-schema-03
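To make the attributes above more concrete, here is a minimal sketch of how a validator might interpret type, properties, items, optional, and maxItems. It is not the code of any real library: miniValidate and typeOf are hypothetical names, only the single-schema form of items is handled, and the error wording is purely illustrative.

```javascript
// Illustrative only: a tiny validator covering "type", "properties",
// "items" (single-schema form), "optional" and "maxItems".
// This is NOT a complete JSON Schema implementation.
function typeOf(value) {
  if (Array.isArray(value)) return 'array';
  if (value === null) return 'null';
  return typeof value; // "object", "string", "number", "boolean", ...
}

function miniValidate(instance, schema, path, errors) {
  path = path || '';
  errors = errors || [];

  var actual = typeOf(instance);
  if (schema.type && schema.type !== actual) {
    errors.push({ property: path, message: actual + ' found, ' + schema.type + ' expected' });
    return errors; // do not descend into a value of the wrong type
  }

  if (actual === 'object' && schema.properties) {
    Object.keys(schema.properties).forEach(function (name) {
      var propSchema = schema.properties[name];
      var propPath = path ? path + '.' + name : name;
      if (!Object.prototype.hasOwnProperty.call(instance, name)) {
        // Missing properties are errors unless marked optional
        if (!propSchema.optional) {
          errors.push({ property: propPath, message: 'is missing and not optional' });
        }
        return;
      }
      miniValidate(instance[name], propSchema, propPath, errors);
    });
  }

  if (actual === 'array') {
    if (typeof schema.maxItems === 'number' && instance.length > schema.maxItems) {
      errors.push({ property: path, message: 'has more than ' + schema.maxItems + ' items' });
    }
    if (schema.items && !Array.isArray(schema.items)) {
      // Every array member must conform to the one "items" schema
      instance.forEach(function (item, i) {
        miniValidate(item, schema.items, path + '[' + i + ']', errors);
      });
    }
  }
  return errors;
}

// Usage: the second user has a string id, so one error is reported
var demoSchema = {
  type: 'object',
  properties: {
    users: {
      type: 'array',
      items: {
        type: 'object',
        properties: {
          id: { type: 'number' },
          realName: { type: 'string', optional: true }
        }
      }
    }
  }
};
console.log(miniValidate({ users: [{ id: 1 }, { id: 'two' }] }, demoSchema));
```

A real validator covers far more of the draft (requires, tuple-style items, and so on); the point here is only that each attribute maps onto a small, mechanical check.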
Defining a Simple JSON Schema
Let's say that our application requires data in the following format:
{
  "users": [
    { "id": 1, "username": "davidwalsh", "numPosts": 404, "realName": "David Walsh" },
    { "id": 2, "username": "russianprince", "numPosts": 12, "realName": "Andrei Arshavin" }
  ]
}
Right away we can see:
- The object has a users property
- The users property is an array
- The users array contains objects
- Each object has an id (number), username (string), numPosts (number), and realName (string)
With this structure in mind, we can create a simple schema to validate our expected format:
{
  "type" : "object",
  "properties" : {
    "users" : {
      "type" : "array", // remember that arrays are objects
      "items" : { // "items" represents the items within the "users" array
        "type" : "object",
        "properties" : {
          "id": { "type": "number" },
          "username": { "type" : "string" },
          "numPosts": { "type" : "number" },
          "realName": { "type" : "string", "optional": true }
        }
      }
    }
  }
}
dojox.json.schema and JSON Schema - Client Side
A JSON Schema validation routine is available with dojox.json.schema. The validate method accepts two arguments: your JSON to validate and the schema. Let's load the schema we created above, along with the sample JSON we created, and validate it:
// Require the JSON Schema module
dojo.require("dojox.json.schema");

// When resources are ready
dojo.ready(function() {
  // Load the schema
  dojo.xhrGet({
    url: 'schema.json',
    handleAs: 'json',
    load: function(schema) {
      // Now load the JSON
      dojo.xhrGet({
        url: 'users.json',
        handleAs: 'json',
        load: function(users) {
          // Now validate it!
          var result = dojox.json.schema.validate(users, schema);
          // Show the result
          console.log(result);
        }
      });
    }
  });
});
A true valid property signals that the JSON is valid. If the result fails validation, valid will be false and the errors property will contain an array of error messages detailing why the given property did not pass validation. Here's a sample return result with errors:
{
  errors: [
    {
      message: "is missing and not optional",
      property: "users"
    }
  ],
  valid: false
}
How you handle invalid data is up to you; moving forward with invalid data could present a security risk for both your organization and the user.
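One way to handle a failed validation is simply to refuse to continue. The sketch below assumes a result object shaped like the sample above (a valid flag plus an errors array of property/message pairs); rejectIfInvalid is a hypothetical helper name, not part of Dojo.

```javascript
// Hypothetical helper: stop processing when a validation result reports errors.
// Assumes result = { valid: Boolean, errors: [{ property: String, message: String }] }
function rejectIfInvalid(result, label) {
  if (result.valid) return;
  var details = result.errors.map(function (e) {
    return e.property + ' ' + e.message;
  });
  throw new Error('Invalid ' + label + ': ' + details.join('; '));
}

// Usage with the sample result shown above
try {
  rejectIfInvalid({
    valid: false,
    errors: [{ property: 'users', message: 'is missing and not optional' }]
  }, 'users.json');
} catch (err) {
  console.error(err.message);
}
```

Throwing is only one option; logging the errors and falling back to default data may suit some applications better.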
CommonJS-Utils and JSON Schema - Server Side
Kris also provides a server side JSON Schema validation routine within his CommonJS Utils project on GitHub. I've installed this project using NPM for NodeJS:
npm install commonjs-utils
Within this package is a json-schema resource. The following snippet requires that resource, reads in the schema and data JSON files, and validates the data JSON against the schema:
// Require FileSystem
var fs = require('fs');

// Require package
var validate = require('commonjs-utils/json-schema').validate;

// Load a schema by which to validate
fs.readFile('schema.json', function(err, data) {
  if(err) throw err;
  // Parse the schema as JSON too: the validator needs an object, not raw
  // file contents (schema.json must be plain JSON here, without comments)
  var schema = JSON.parse(data);
  // Load data file
  fs.readFile('./users.json', function(err, data) {
    if(err) throw err;
    // Parse as JSON
    var users = JSON.parse(data);
    // Validate
    var validation = validate(users, schema);
    // Echo to command line
    console.log('The result of the validation:', validation.valid);
  });
});
To run this via the command line:
node server-validate.js
The server side uses the exact same schema and data as the client side, so your web application can be covered on both fronts.
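Before any schema validation can happen on the server, the file contents have to survive JSON.parse, which throws on malformed input. A minimal sketch of guarding that step follows; parseJson is a hypothetical helper name, not something provided by CommonJS Utils.

```javascript
// Hypothetical helper: parse JSON text without letting a syntax error
// escape as an uncaught exception.
function parseJson(text) {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch (err) {
    return { ok: false, error: err.message };
  }
}

var good = parseJson('{"users": []}');
var bad = parseJson('{ users: [] }'); // unquoted key: not valid JSON

console.log(good.ok, bad.ok);
```

With a guard like this, a syntax error in users.json can be reported the same way as a schema violation instead of crashing the script.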
Closing Thoughts on JSON Schema
JSON Schema is still a draft, but I think Kris has done an outstanding job in creating the draft and coding server and client side validators. JSON validation is often overlooked and the data is wrongly assumed to be correct. The resources for data validation are available -- it's up to you to use them!
Just to make sure that you are aware of http://www.schematron.com/ and the conceptual difference with grammar based schema like XSD.
XML schema is a big part of the XML non-human complexity.
Hope that JSON will not follow the same path !
Wonder if an API wouldn’t be more suited for this purpose. Instead of a JSON Schema, why not a piece of JavaScript that would validate the data?
Good luck !
Yo home boy!
Rad, feels like unit-testing for json-responses.
Hopefully it keeps on developing and can be incorporated as a plugin for moo, jquery and/or yui :)
Thanks for this interesting post!
But I tried the validation with nodejs, the schema and instance files load correctly, the result of the validation is always true even if the users instance contains invalid data. For instance:
{
"users": [
{ "id": 1, "username": "davidwalsh", "numPosts": 404, "realName": "David Walsh" },
{ "name": "Andrei Arshavin" }
]
}
Any idea?
Draft 03 of the JSON Schema spec replaced the “optional” attribute with the “required” attribute. I think that means that the simple schema sample provided above should now be:
{
"type" : "object",
"properties" : {
"users" : {
"type" : "object", // remember that arrays are objects
"items" : { // "items" represents the items within the "users" array
"type" : "object",
"properties" : {
"id": { "type": "number" },
"username": { "type" : "string", "required" : true },
"numPosts": { "type" : "number", "required" : true },
"realName": { "type" : "string" }
}
}
}
}
}
Also note that JSON Schema “is a sub-type of the JSON format”, meaning that comments are not allowed, although they are very handy for explaining what’s going on in this post.
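For anyone with older schemas, the switch described above can be applied mechanically. The sketch below is only an illustration under that reading of the drafts (properties are required unless marked "optional": true in the older style, and optional unless marked "required": true in Draft 03); optionalToRequired is a hypothetical helper name and only the single-schema form of "items" is handled.

```javascript
// Hypothetical helper: rewrite an "optional"-style schema into a
// "required"-style one. Returns a new schema object; the input is not modified.
function optionalToRequired(schema) {
  var out = {};
  // Copy every attribute except the old "optional" flag
  Object.keys(schema).forEach(function (key) {
    if (key !== 'optional') out[key] = schema[key];
  });
  if (schema.properties) {
    out.properties = {};
    Object.keys(schema.properties).forEach(function (name) {
      var prop = schema.properties[name];
      var converted = optionalToRequired(prop);
      // Anything that was not optional before must now say so explicitly
      if (!prop.optional) converted.required = true;
      out.properties[name] = converted;
    });
  }
  if (schema.items && !Array.isArray(schema.items)) {
    out.items = optionalToRequired(schema.items);
  }
  return out;
}

// Usage
var oldStyle = {
  type: 'object',
  properties: {
    id: { type: 'number' },
    realName: { type: 'string', optional: true }
  }
};
console.log(JSON.stringify(optionalToRequired(oldStyle), null, 2));
```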
I tried "serverEnvironmentId": { "type": "number", "required" : true }, but it still passed the test even though that element does not exist in the JSON data.
Thx for implementing JSON schema in Java!
I have built your library. The car.json test (JSONSchemaTest) fails. I suspect the car schema has changed since you ran your test. The test asserts there should be 4 errors but there are 9:
$.adr.description: is missing and it is not optional
$.adr.properties: is missing and it is not optional
$.adr.type: is missing and it is not optional
$.geo.description: is missing and it is not optional
$.geo.properties: is missing and it is not optional
$.geo.type: is missing and it is not optional
$.email: string found, object expected
$.email.value: is missing and it is not optional
$.email.type: is missing and it is not optional
A few other suggestions:
* The test shouldn’t depend on the order of errors in errors[].
* validate() should return a Map of errors so they can be bound to form fields.
* Include a snapshot of the car schema in your resources.
* It would be nice to have a suite which runs all of the tests.
* The format attribute should be implemented so dates/times can validate.
* The API should be standardized w/ the IETF so a programmer can switch impls.
Do you need some help?
Oops I meant to post this on Nico’s blog–please accept my apology.
Great read!
Now I understand this whole JSON thing. Thanks!
This code is unreadable on Chrome
please put dates on your blog entries! :)
Seems, that your schema has errors.
"type" : "object", // remember that arrays are objects
Element "items" is not allowed for "type" : "object"
You need to change it to "type" : "array"
Try to validate it here http://jsonschemalint.com/
Hey fred .. you have to parse schema.json file also after reading.
var schema = JSON.parse(data);
Now it will work :)
I second this, it should be updated in blogpost.
It's a nice tutorial. Just a question:
How to validate the sequence of data in json? For example, ‘id’ and then ‘name’ is the real sequence in schema. If ‘name’ comes first then ‘id’, the validator should tell it as wrong. In XML, there is something like . Kindly provide the equivalent in json. By default, the sequence is not getting validated.
Please help.
Thank You,
Subeesh KK
IIRC, JavaScript objects are unordered key-value pairs. Order shouldn’t matter in JSON schemas, as it’s not taken into account in the notation.
using instead of:
would help so this little script REALLY validates against a schema.
Dude! awesome thanks!
Great post. I agree JSON needs to be used together with schemas when used in serious applications. Seems like we are repeating same mistakes as done when using XML data in a unserious way.
I have always thought “WOW” when I saw XML Schemas, but never actually used them for manual validation as it was hard to use. I hope that methods like the one you describe above for validation will be used now, when sharing data using JSON.
Are there methods to define regular expression based types? Or even function based validation in JSON schemas? Unlike XSD (XML Schemas), JavaScript would allow code to decide if data is valid. XML cannot do that natively, and would need some external language to help validate using methods/functions.
That would allow for very complex (yet easy to implement) validation :-)
Looking forward to see validation spreading around the JSON world!
I personally use http://indicative.adonisjs.com/ for flat schema validations on server.
I'm worried about the size of JSON data constrained to the schema; judging from the examples above, the data size seems to grow too much. It would be good to have a normalised version of the data.
Another great tool for JSON validator and treeview is http://jsontuneup.com
Although this is an old post, it has helped me a lot. Thanks David.
Don’t know if it will be useful to someone else, but to fix the thing I changed:
thanks for your work
Hi,
I'm trying to validate the JSON data for a field: when I post it to the Java side, it should display an error if it is not proper JSON data. Can anyone suggest how?