
Mitch Pronschinske is the Lead Research Analyst at DZone. Researching and compiling content for DZone's research guides is his primary job. He likes to make his own ringtones, watches cartoons/anime, enjoys card and board games, and plays the accordion. Mitch is a DZone Zone Leader and has posted 2578 posts at DZone. You can read more from them at their website.

How to Use JPA and JDO in HBase

Thanks to Google App Engine's work with DataNucleus, GAE users have enjoyed JPA and JDO support.  For its storage system, GAE uses Google's (NoSQL) BigTable.  HBase, part of the Apache Hadoop project, is a distributed, column-oriented storage system modeled after BigTable.  There are some usage restrictions, but it's generally pretty easy to store data in BigTable using JPA and JDO.  What you may not know is that HBase can support these same standard APIs in a homegrown system.

For developers who don't want to host their applications or store their data at Google, HBase provides a viable (and Apache community supported) option for building your own open source system.  As on GAE, JDO and JPA go through DataNucleus to persist objects in HBase.  To install HBase, just read the documentation, which covers the common pitfalls.  Setup is not very difficult.  Next, Matthias Wessendorf explains how to use JPA with HBase.  You start with a regular persistence.xml file that lists your classes and holds the actual configuration:


<property name="datanucleus.ConnectionURL" value="hbase"/>
<property name="datanucleus.ConnectionUserName" value=""/>
<property name="datanucleus.ConnectionPassword" value=""/>
<property name="datanucleus.autoCreateSchema" value="true"/>
<property name="datanucleus.validateTables" value="false"/>
<property name="datanucleus.Optimistic" value="false"/>
<property name="datanucleus.validateConstraints" value="false"/>
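If you'd rather keep these settings in code, the same keys and values can be collected in a map. The following is only a minimal sketch: the class name HBasePersistenceProperties and its build() method are made up for illustration, and whether every one of these settings can be overridden this way depends on your DataNucleus version.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper; the class and method names are illustrative only. */
final class HBasePersistenceProperties {
    private HBasePersistenceProperties() {}

    /** Returns the same DataNucleus settings as the persistence.xml properties above. */
    static Map<String, String> build() {
        // LinkedHashMap keeps the entries in the order they are listed in the XML.
        Map<String, String> props = new LinkedHashMap<>();
        props.put("datanucleus.ConnectionURL", "hbase");
        props.put("datanucleus.ConnectionUserName", "");
        props.put("datanucleus.ConnectionPassword", "");
        props.put("datanucleus.autoCreateSchema", "true");
        props.put("datanucleus.validateTables", "false");
        props.put("datanucleus.Optimistic", "false");
        props.put("datanucleus.validateConstraints", "false");
        return props;
    }
}
```

JPA's Persistence.createEntityManagerFactory has an overload that takes a persistence unit name plus a Map of properties, so a map like this could be passed there instead of (or on top of) the XML entries.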

In most cases, you just add @Entity to your classes and work around any limitations.  Once your data model is complete, you can start using the EntityManager directly, for example:
EntityManagerFactory emf = Persistence.createEntityManagerFactory(...);
EntityManager entityManager = emf.createEntityManager();
EntityTransaction entityTransaction = entityManager.getTransaction();


More often, you'll want to move the JPA-specific code into a data access object (DAO).
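To illustrate the DAO idea, here is a rough sketch rather than code from the original post. Customer, CustomerDao and InMemoryCustomerDao are all hypothetical names, and the implementation is an in-memory stand-in so that the example runs without a JPA provider on the classpath. The point is only that callers depend on the interface, which keeps the persistence-specific code in one place.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical entity for this sketch; a real one would carry @Entity and @Id. */
class Customer {
    private final String id;
    private final String name;

    Customer(String id, String name) {
        this.id = id;
        this.name = name;
    }

    String getId() { return id; }
    String getName() { return name; }
}

/** Hypothetical DAO contract: callers see only this interface, never JPA types. */
interface CustomerDao {
    void save(Customer customer);
    Optional<Customer> findById(String id);
}

/**
 * In-memory stand-in (an assumption made so the sketch is runnable).
 * A JPA-backed implementation would instead wrap save() in
 * begin()/persist()/commit() on an EntityTransaction and implement
 * findById() with EntityManager.find().
 */
class InMemoryCustomerDao implements CustomerDao {
    private final Map<String, Customer> rows = new LinkedHashMap<>();

    @Override
    public void save(Customer customer) {
        // Keyed by id, so saving the same id twice overwrites the earlier row.
        rows.put(customer.getId(), customer);
    }

    @Override
    public Optional<Customer> findById(String id) {
        return Optional.ofNullable(rows.get(id));
    }
}
```

With this split, swapping the in-memory class for a JPA-backed one should not require changes in the calling code, since it only ever refers to CustomerDao.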

During the Maven build, you'll have to enhance the bytecode of your entity classes.  Fortunately, DataNucleus offers a Maven plugin for this.

This approach has significant benefits when backing a normal Java EE application with HBase.  Because Java EE applications typically use JPA for persistence, integrating them is a lot easier.  You can also use the 'native' HBase API to read and write data in a JPA/JDO-managed HBase table, but the code is not as simple.


Matthias Wessendorf replied on Tue, 2010/03/30 - 7:42am


 I'd appreciate if you could link to the original version of this blog, located here:

 It would be nice if you'd use my URL ( instead of the nofluffjuststuff thing.


Matthias Wessendorf
