Archive for August, 2014

Working with JSON Data from SQL

The MongoDB support was introduced in 12.10xC2, bringing many cool things to Informix, one of them being JSON and BSON data types. Putting all the NoSQL and MongoDB story aside, these new data types enable us to work with semi-structured data directly from SQL, thanks to several new built-in functions. Of course, you could do the same with XML documents, but it took a while before all the necessary functions became available in Informix, and working with XML is still more complex than working with JSON, because of the difference of those two data formats.

In order to put the data in a JSON column you can use genBSON function, or simply cast a text to a JSON type. Here’s an example – a tourist office database table storing various places people could visit. One table with JSON data could be used to store data of many different places – cities, regions, islands, landmarks etc. So the table could be defined as:

CREATE TABLE places (
  place_id SERIAL,
  numberOfVisitorsPerYear INT,
  place BSON
);

The place column could also be of JSON type, but if you want to perform more meaningful queries on a table, stick to BSON. There are some BSON functions we can use, and BSON can be cast to JSON data.

Rows could be inserted via plain SQL:

INSERT INTO places VALUES (0, 500000, '{city: "Zagreb"}'::JSON);

Note that last value needs to be cast to JSON in order to be able to run queries on it with the bson_value functions. Here are some other data rows with various attributes describing places:

INSERT INTO places VALUES (0, 600000, '{city: "Pula", country: "Croatia", population: 57000}'::JSON);
INSERT INTO places VALUES (0, 20000, '{mountain: "Velebit", country: "Croatia", height: 1757}'::JSON);
INSERT INTO places VALUES (0, 1000000, '{national_park: "Plitvice", country: "Croatia"}'::JSON);

Simplest way to find out what is stored in a table is to execute query like this one:

SELECT place_id, numberOfVisitorsPerYear, place::JSON
FROM places

which will return these results:

place_id numberOfVisitorsPerYear (expression)
1 500000 {“city”:”Zagreb”}
2 600000 {“city”:”Pula”,”country”:”Croatia”,”population”:57000}
3 20000 {“mountain”:”Velebit”,”country”:”Croatia”,”height”:1757}
4 1000000 {“national_park”:”Plitvice”,”country”:”Croatia”}

However, the idea is to be able to search within JSON data. For that purpose, there are some new functions we can use:

  • bson_value_int(column, bson_attribute) – returns an integer value of the bson_attribute stored in a specified column of the row
  • bson_value_bigint(column, bson_attribute) – returns a bigint value of the bson_attribute stored in a specified column of the row
  • bson_value_double(column, bson_attribute) – returns an double value of the bson_attribute stored in a specified column of the row
  • bson_value_lvarchar(column, bson_attribute) – returns an lvarchar value of the bson_attribute stored in a specified column of the row

Here are some query examples:

-- find all destinations in Croatia 
SELECT *, place::JSON FROM places 
WHERE bson_value_lvarchar (place, 'country') = 'Croatia'; 

-- find all destinations without specified county 
SELECT *, place::JSON FROM places 
WHERE bson_value_lvarchar (place, 'country') IS NULL; 

-- find all mountains higher than 1000 meters 
SELECT *, place::JSON FROM places 
WHERE bson_value_lvarchar (place, 'mountain') IS NOT NULL 
AND bson_value_int (place, 'height') > 1000; 

-- find all national parks in Croatia 
SELECT *, place::JSON FROM places 
WHERE bson_value_lvarchar (place, 'country') = 'Croatia' 
AND bson_value_lvarchar (place, 'national_park') IS NOT NULL; 

There is another new function, genBSON, which generates BSON/JSON data from a relational table’s data, depending on a query. It can be used to return JSON directly from a query or to insert a JSON data in a column. The Informix Knowledge Center support page for this function is informative, along with some examples, so I’m not going to repeat it all here. As a continuation to our example, if a tourist office already has a relational table named cities in its database, then this data could be imported in a places table with a single SQL statement:

-- cities relational table 
CREATE TABLE cities (
  city_id SERIAL, 
  city CHAR(30), 
  population INT, 
  country CHAR(30), 
  numberOfVisitorsPerYear INT
); 

-- copy the cities data into places table: 
INSERT INTO places 
SELECT 0, numberOfVisitorsPerYear, genBSON(ROW(city, country, population), 0, 1)::JSON
FROM cities; 

Or, if we don’t want to have a copy of cities in places, a view on structured and semi-structured data could be made (this one returning only JSON data):

CREATE VIEW places_and_cities (place) AS 
  SELECT place::JSON FROM places 
  UNION 
  SELECT genBSON(ROW(city, country, population), 0, 1)::JSON FROM cities; 

In conclusion, with JSON capabilities at hand, it’s now pretty simple to mix structured and semi-structured data in a single database. But before we do it, we should make sure there is a proper need to design our data model that way, bearing in mind there are numerous advantages of having relational data in a relational database.

Advertisements

, , ,

Leave a comment