NoSQL Zone is brought to you in partnership with:

Rickard Oberg is popular among Java developers. He has given seminars at all main Java conferences world wide. He worked as an architect at JBoss and other popular OpenSource Java frameworks, and wrote a book on RMI. In recent years, he has become famous as an Aspect Oriented Programming (AOP) crusader. He has worked with bleeding edge AOP in a portal product that has become a great commercial success, and is currently working on Qi4j at Jayway. Rickard is a DZone MVB and is not an employee of DZone and has posted 16 posts at DZone. You can read more from them at their website. View Full User Profile

Creating a JDBC driver for Neo4j

01.16.2012
| 6299 views |
  • submit to reddit

When it comes to NOSQL databases, one of the key advantages is that they allow you to structure your data in a way that better resembles your domain, and also allows you to use query languages where you can express things that are either really awkward or slow with SQL. However, one of the advantages that relational databases have is that they can be accessed from lots of tools using JDBC, as a standard API. So what would happen if a NOSQL database, like Neo4j, also had a JDBC driver? I decided to find out!

Neo4j has a REST API that allows distributed access, and a query language called Cypher which can be used for ad-hoc queries. Since the result of such queries are iterables of maps with key-values, they can be reasonably easily converted to the JDBC ResultSet concept. So I created a thin JDBC API implementation using the Restlet framefork that delegates to this REST API, and sends query strings to the Cypher endpoint. The JSON requests and responses were handled using the Jackson JSON library.

The next question was, if a client tried to list tables and their columns, what should be returned? The approach I took, for now, is to introduce “type nodes”, i.e. nodes that are not instances in the domain model, but instead represent meta information about the model. To model tables I introduced the type node itself, and also property nodes that contain name of the property and its type. Instances of these types would then be related to their type node with the IS_A relationship. See figure below for an example:


The JDBC driver can now ask for a list of tables with the following Cypher query, expressed using the DSL described in the previous post:

start(node("n", 0)).
match(path().from("n").out("TYPE").to("type")).
returns(properties("type.type"));


This basically means: “From the start node, follow the TYPE relationship to find all type nodes, and then return their names”. To list the properties of a particular type node a parameterized query can be used. This is done for the DatabaseMetaData.getColumns JDBC API call, like so:

public ExecuteWithParameters getColumns(String typeName)
{
return start(node("n", 0)).
       match(path().from("n").out("TYPE").to("type").
       link().out("HAS_PROPERTY").to("property")).
       where(prop("type.type").eq(param("typename"))).
       returns(properties("type.type", "property.name", "property.type")).
          parameter("typename", typeName);
}

 

Notice that the above creates a parameterized query and also supplies it with the concrete parameter, so that the result can be easily executed.

Quirks mode

While the JDBC driver can be used as-is to execute ad-hoc Cypher queries, to be usable from normal SQL-aware tools there is a little more work required. Specifically, most of these tools will send standardized SQL requests to get data, such as “SELECT * FROM Person”, which will not work with this JDBC driver, as there is no SQL support.

What I did here is to create a “quirks mode” which recognizes these calls and converts them into the Cypher equivalents. By doing this I managed to get the tools outlined below to work. This means that if you want to try a tool not explicitly listed below the discovery part of the tool probably won’t work properly, whereas sending query strings and viewing results should work. Please send me a note if you have a tool that you would want to have supported!

DbVisualizer

With this in place I could now access my Neo4j database using various JDBC tools. Here is a screenshot from DbVisualizer, which was the first tool I tried using it with:

And here’s another screenshot showing how to send ad-hoc Cypher queries:

LibreOffice

When it comes to using a database as a reporting tool, one of the simplest thing you can do is use one of the Office packages and connect to a database and use the data for charts and spreadsheets. Since LibreOffice has pretty good JDBC connectivity I tried it out, and here’s the result:

ODBC in Windows

While having a JDBC driver is great, not all tools that work with databases use JDBC. Some use ODBC instead, and since there is a ODBC-JDBC Gateway available from Easysoft I wanted to try this out. After installing this software it was really easy to set up a connection, and then connect to it using a standard ODBC tool. Here’s a screenshot from what that looks like:

IntelliJ

Lastly I tried using the JDBC driver with IntelliJ, my Java IDE of choice. This worked out really well, and with some configuration it even allows me to enter values for parameterized queries, which is nice. Here’s what it looks like:

JDBC as applet

What about JavaScript usage? While you can access Neo4j directly from JavaScript using REST calls there are still some advantages to using the JDBC driver in the web browser, as the JDBC API abstracts things likes types and other metadata. Because of this I made a packaging of the driver as a Java applet, so that it can be included in HTML pages and called through Javascript. You can see an example of this here.

Future ideas

At the moment the driver only supports reads, since Cypher, as the underlying query language only supports read operations. Once Cypher has been updated with support for modifying operations it should be possible to add this to the JDBC driver.

If/when Neo4j gets a binary connection protocol, it will be interesting to try and support that instead of the REST API. That should allow for higher performance, both due to serialization efficiency and connection management.

Conclusion

While most NOSQL databases, such as Neo4j, provide a non-relational way to store and query data, in this case it was possible to create a JDBC driver that can expose that non-relational data in a way that works reasonably well with the JDBC API. This makes it possible to access Neo4j from a whole host of tools, and makes it possible to more easily make reports from its data, as well as debug the data itself.

For more details, see the GitHub project page to download and install the driver. If you want to get started quickly you can download the test database that I created for the above screenshots from here.

 

Source: http://rickardoberg.wordpress.com/2011/12/19/creating-a-jdbc-driver-for-neo4j/

Published at DZone with permission of Rickard Oberg, author and DZone MVB.

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)