New JPA Performance Benchmark

11.10.2010

This article presents a new open source database performance benchmark for JPA that covers Hibernate, EclipseLink, DataNucleus, OpenJPA and ObjectDB.

My name is Ilan Kirsh and I am the founder of ObjectDB Software, a provider of ObjectDB, a high performance commercial Java object database that natively supports both JPA and JDO.

I have created a comprehensive new benchmark for measuring JPA provider performance. As part of this, I collected results for many different combinations of JPA implementations (Hibernate, EclipseLink, DataNucleus, OpenJPA) and leading RDBMS products (MySQL, PostgreSQL, Derby, H2, HSQLDB, SQLite), as well as ObjectDB.

You can see the results at jpab.org

Try out the site – you can compare products and rank them according to the results.

Why a Benchmark Focused on JPA?

The Java Persistence API (JPA) is the new standard for working with databases in Java. Unlike older persistence solutions (JDBC, EJB CMP), JPA supports direct persistence and retrieval of POJOs (Plain Old Java Objects), which increases development productivity.

There are many open source benchmark programs for testing database performance. However, most of the benchmark programs for Java are based on JDBC and cannot be used for comparison of Java Persistence API (JPA) providers, such as Hibernate, EclipseLink and ObjectDB.

The few JPA benchmarks that have been published so far were very limited in scope. They were restricted to a single RDBMS and to a few very specific database operations.

A good benchmark for JPA must support effective comparison of many combinations of JPA providers and databases, because a JPA provider might work well with one DBMS but become very slow with another (as indeed shown in the benchmark results).

Why do I care? I wanted a transparent, repeatable set of tests to compare ObjectDB with the other products out there. I also wanted something that people could directly critique and try for themselves.

Benchmark Results

In addition to ObjectDB, 4 ORM implementations have been tested (Hibernate, EclipseLink, OpenJPA, DataNucleus) with 6 different RDBMS (MySQL, PostgreSQL, Derby, H2, HSQLDB, SQLite), some of them in both client-server and embedded modes.

The results can be used in different ways. First, you can browse the results and use filters to focus on a specific test.

For example, the following chart presents results of a specific test - persisting simple entity objects in batches of 5000 per transaction:

Second, you can view result summary for:

For example, the following chart presents the results of SQLite in the benchmark (with Hibernate and EclipseLink):

Finally, you can produce a head to head comparison of any two tested combinations, e.g. comparison of Hibernate with Derby in server mode vs embedded mode:

JPA Database Comparison

The Benchmark Program

Let me explain a little about the main criteria I used in creating the benchmark. First, it should work with any JPA 2 implementation. To keep implementation-specific code from influencing the results, exactly the same code (standard JPA) must be used for all the tested implementations.

Second, the benchmark should work with any database with JPA support and it should be able to generate separate results for every combination of tested JPA/DBMS, e.g. Hibernate/MySQL, Hibernate/PostgreSQL, EclipseLink/MySQL, etc.

Third, the tests should be easy to run and configure, and it has to be easy to add new tests.

In addition, the benchmark has to be open source. On the jpab.org site, you can download the full sources for all the tests, try them yourself, view and critique the code.

More details are provided in the Test Description and FAQ pages.
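
As a rough illustration of the first two criteria (a sketch only, not the benchmark's actual source), the test body below is written once against a small interface and run unchanged for each JPA/DBMS label. `Store`, `CountingStore` and `persistInBatches` are hypothetical names invented for this example; in a real JPA test, `EntityManager.persist` together with `EntityTransaction.begin` and `commit` would play the role of `Store`. The batch size of 5000 mirrors the persist test mentioned under Benchmark Results.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class BatchPersistSketch {

    /** Hypothetical stand-in for the few EntityManager/EntityTransaction calls the loop needs. */
    public interface Store {
        void begin();
        void persist(Object entity);
        void commit();
    }

    /** In-memory Store that only counts calls, so the sketch runs without any database. */
    public static class CountingStore implements Store {
        public int commits = 0;
        public int persisted = 0;
        private boolean active = false;

        @Override
        public void begin() {
            if (active) throw new IllegalStateException("transaction already active");
            active = true;
        }

        @Override
        public void persist(Object entity) {
            if (!active) throw new IllegalStateException("no active transaction");
            persisted++;
        }

        @Override
        public void commit() {
            if (!active) throw new IllegalStateException("no active transaction");
            active = false;
            commits++;
        }
    }

    /** Persists all entities, one transaction per batch; returns the number of transactions used. */
    public static int persistInBatches(Store store, List<?> entities, int batchSize) {
        if (batchSize <= 0) throw new IllegalArgumentException("batchSize must be positive");
        int transactions = 0;
        for (int start = 0; start < entities.size(); start += batchSize) {
            int end = Math.min(start + batchSize, entities.size());
            store.begin();
            for (Object entity : entities.subList(start, end)) {
                store.persist(entity);
            }
            store.commit();
            transactions++;
        }
        return transactions;
    }

    public static void main(String[] args) {
        List<Integer> entities = new ArrayList<>();
        for (int i = 0; i < 12_000; i++) entities.add(i);

        // The same test code runs for every combination; only the label (and, in a
        // real run, the Store behind it) changes. Results are kept per combination.
        Map<String, Integer> transactionsPerCombination = new LinkedHashMap<>();
        for (String provider : new String[] {"Hibernate", "EclipseLink"}) {
            for (String dbms : new String[] {"MySQL", "PostgreSQL"}) {
                Store store = new CountingStore();
                transactionsPerCombination.put(provider + "/" + dbms,
                        persistInBatches(store, entities, 5000));
            }
        }
        System.out.println(transactionsPerCombination);
    }
}
```

Keeping the loop dependent only on the interface is one way to make sure no provider gets special-cased code, which is the point of the first criterion.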

ObjectDB Performance

One of the clear outcomes of this testing is that ObjectDB outperforms the other JPA/DBMS combinations, often by an order of magnitude. These results initially startled me.

"Haha", I hear you mutter. "He would say that". It’s a reasonable response -- after all, I created the tests.

The key point is that the tests are open, transparent and the source is available. Please run the tests yourself and verify the results. There’s no magic there – I want to get as much feedback and make it as fair as possible.

When I started selling ObjectDB in 2003, I didn't attach too much importance to performance. The original purpose of developing ObjectDB was merely to provide an easy to use database that can store objects (and graphs of objects) with no need for object relational mapping.

However, over the years feedback from customers told me that performance is actually the main attraction of ObjectDB. Many ObjectDB users have conducted their own benchmarks, and many of them have sent me the results. Oddly enough, I actually learned about the performance gap between ObjectDB and other products from users.

So, why is ObjectDB faster? I put it down to the fact that there are a lot less layers involved. With a JPA ORM and an RDBMS, the provider must convert the objects, process the metadata etc, and then write out the SQL. With ObjectDB, the process is far more direct. What I didn’t realize is that this directness seems to improve performance by 10x.

So, frankly, one of my main motivations for developing a transparent JPA benchmark was to present the performance capabilities of ObjectDB. My aim is to be completely transparent, and in that spirit I am making all the results and test code available. The FAQ explains the limitations of the benchmark and in which situations the results are relevant and in which they are not. In addition, performance is not critical in every application, so in many cases using a slower JPA/DBMS combination is not an issue.

Note that the benchmark should also be useful for anyone who is only interested in RDBMS and ORM based JPA implementations and not in an object database. In that case it is fully objective, since I don't have a preference for any particular ORM or RDBMS product.

Call for Help in Tuning

All the results that are currently presented on the benchmark website reflect the default configuration for all the tested products (with one exception that is explained in the FAQ). Because the tests are relatively simple, the default configuration is expected to work well. It also seems fair to use the default configuration for all the products.

However, I would like to run a second phase, in which all the products are run with optimized tuning for this benchmark. We have had some preliminary success for OpenJPA - by adjusting parameters we were able to get results much closer to the high end of the ORM range and much higher test pass ratio (initial results are presented on http://temp.jpab.org). This next phase will require expert help from the community. If you can help in tuning one or more of the tested JPA/DBMS combinations please send an email to feedback at jpab dot org.

About ObjectDB

ObjectDB is a high performance commercial object database for Java, offering native support for JPA and JDO. One of its major selling points is its phenomenal performance. All license prices (per server, per site and OEM) are available online. We also offer free licenses to open source projects, individual developers and small startup companies. You can also download a free edition of ObjectDB.

Published at DZone with permission of its author, Ilan Kirsh.

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)

Comments

Julien Iguchi-c... replied on Wed, 2010/11/10 - 3:04am

This standalone article is nonsense until you add some explanations for the graphs: please update it with some mentions like "Comparison of JPA/Database speed - the averages (normalized score, higher is better)" like in JPAB website.

Furthermore, without explanations about why your database is better, this article seems empty and more provocative than anything else. Please at least introduce a link to the last entry of our FAQ.

Ilan Kirsh replied on Wed, 2010/11/10 - 3:20am in response to: Julien Iguchi-cartigny

Thank you for your comment. You are right - understanding the results requires visiting the jpab.org website. I included many links to the website, including to the FAQ, but thanks for adding a link to the explanation why ObjectDB is faster at the end of the FAQ.

I did provide a brief explanation in the ObjectDB Performance section ("I put it down to the fact that there are a lot less layers involved. With a JPA ORM and an RDBMS, the provider must convert the objects, process the metadata etc, and then write out the SQL. With ObjectDB, the process is far more direct."), but indeed, more details are provided in the FAQ.

Julien Iguchi-c... replied on Wed, 2010/11/10 - 3:26am in response to: Ilan Kirsh

Thank you, seems interesting now ;-)

Mladen Girazovski replied on Wed, 2010/11/10 - 4:03am

Where did you configure the ConnectionPool for DataNucleus, since it doesn't use one by default?

Ilan Kirsh replied on Wed, 2010/11/10 - 4:12am in response to: Mladen Girazovski

The current results reflect out of the box default configuration. I am keen to try DataNucleus with optimized configuration but for this I will need some help from DataNucleus experts. If you think you can help with tuning - please contact me.

However, I wonder whether connection pool will make a big difference because the tests do not use short term database connections, and anyway, the time of connecting to the database is excluded from the time measurement.
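
A minimal sketch of what "excluded from the time measurement" can mean in code. This is not the benchmark's actual timer; `timeWorkOnly` and the sleep durations are hypothetical and only illustrate starting the clock after the connection step has finished.

```java
import java.util.function.LongSupplier;

public class TimedSectionSketch {

    /**
     * Runs setup outside the timed window, then returns the clock difference
     * (nanoseconds when used with System::nanoTime) spent in work only.
     */
    public static long timeWorkOnly(Runnable setup, Runnable work, LongSupplier clock) {
        setup.run();                     // e.g. connecting to the database: not measured
        long start = clock.getAsLong();
        work.run();                      // e.g. the persist/retrieve operations: measured
        return clock.getAsLong() - start;
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        long nanos = timeWorkOnly(
                () -> sleep(50),         // pretend that connecting takes about 50 ms
                () -> sleep(10),         // pretend that the test body takes about 10 ms
                System::nanoTime);
        System.out.println("measured ms: " + nanos / 1_000_000);
    }
}
```

With a timer shaped like this, a connection pool can only change the result if the measured work itself opens new connections, which is the question raised above.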

Mladen Girazovski replied on Wed, 2010/11/10 - 5:40am in response to: Ilan Kirsh

As mentioned, DataNucleus does not use a ConnectionPool per default. Most other benchmarks that used DataNucleus were not taking care of that, and thus the results were useless and Andy Jefferson gets mad :)

 This link describes how to set up a ConnectionPool: http://www.datanucleus.org/products/accessplatform_1_0/rdbms/connection_pooling.html

 

ceteris paribus - everything else is plain marketing BS ;)

Andrew McVeigh replied on Wed, 2010/11/10 - 5:52am

(disclaimer: i own 2 licenses of Objectdb, but I'm very happy with it)

the results are certainly quite startling.  I've used objectdb for a commercial project in the past, and found it very quick.  it surprised me at the time, the performance seemed very high even in server mode.  i guess since my background is in object databases, i shouldn't find it surprising, but i found it an order of magnitude faster even than db4o at the time.

it will be interesting to see whether vendors of the ORM tools are able to tune their results to bring the performance of their systems closer (in spirit if not numbers) to objectdb's.  if i read the article correctly, that is what you are asking them for?

i take your point about the performance increase being due to less layers, but there might be something else going on in terms of the way objects are stored in your system that is making it have a different (better?) performance profile.


 

Andrew McVeigh replied on Wed, 2010/11/10 - 4:55pm in response to: Mladen Girazovski

> ceteris paribus - everything else is plain marketing BS ;)

for our less classically minded readers, you are basically saying this is not an "apples for apples" test.

it's a fair point, but consider also that (a) vendors often hide behind the "but you haven't tuned it fully" argument and (b) what is a like for like test when you are comparing very different products that have the same interface but different implementations.

(a) in particular is often true to a point, but with deep caveats.  in fact, many of the commercial conversations i've had with vendors (I am a consumer of column-oriented databases for large scale market data) tend to end with this as a hanging question.  i.e. them: "you haven't tuned it fully, that's why it scores lowly".  me: "ok, tell me what parameters to add".  them: no contact.

if the guy turns on connection pooling (why is it not on by default?) and the results are still not as good, do we start saying "oh, but you haven't turned on prepared statements", or "the schema is not fully denormalised" or "you haven't used the latest snapshot" and so on...  i speak from experience on this from a consumer side -- we have just spent ages tuning a Sybase IQ installation and they literally went through all of this for ages...

it gets to the point, like with Oracle, where they prohibit you from publishing any benchmarks ever!

as i read it also, Ilan is asking for help tuning the benchmarks for the large number of combinations.  is it possible to recommend the optimum tuning parameters for the combinations you speak of? the code looks to be available.

Ilan Kirsh replied on Wed, 2010/11/10 - 7:16am in response to: Andrew McVeigh

I've spent a bit of time puzzling over why ObjectDB is so fast compared to the ORM/RDBMS combinations. i.e. Is there something intrinsic in what I've implemented that makes it faster? Certainly the performance results surprised me at the start, I hadn't expected there to be such a large difference.

My investigations, for what they are worth, seem to show that most RDBMS products have been designed to minimize I/O operations but are not so good at minimizing CPU processing. Particularly, the layers that I mentioned (ORM/JDBC/SQL/RDBMS) are not very efficient in terms of CPU.

I believe that processing SQL data on a field by field basis to be much slower than processing complete objects as almost atomic elements in ObjectDB.

So, the performance of an ORM/RDBMS should get closer to the performance of ObjectDB when the bottleneck is disk activity or if we have a very slow network - but only if the object model is also relatively simple (see below)...

Regardless, memory today is cheap and when using ObjectDB disk activity can be reduced by a large RAM cache (and in many applications even by keeping the entire database in memory or on SSDs). I believe that these trends mean that an ORM/RDBMS combination cannot get close to the speed of ObjectDB if disk is not the bottleneck.

Finally, the complexity of the object model is important. The ability of ObjectDB to store a graph of objects as one unit instead of separating it to different tables and rows - makes the performance gap even larger.

Nicolas Seyvet replied on Wed, 2010/11/10 - 2:44pm in response to: Ilan Kirsh

OpenJPA has failed many of the tests, and from the errors it looks as if it failed to connect to the DB:

org.apache.openjpa.persistence.PersistenceException: There were errors initializing your configuration: org.apache.openjpa.util.UserException: A connection could not be obtained for driver class "com.mysql.jdbc.Driver" and URL "jdbc:mysql://localhost:3306/jpab1169457926". You may have specified an invalid URL.

I was glad to see that R2 seems to have included a fix for this. Would it be possible to add tests towards MySQL Cluster?

Ilan Kirsh replied on Wed, 2010/11/10 - 6:49pm

Obviously the high failure ratio of OpenJPA in the tests indicates a problem. I double checked the benchmark code to verify that the problem is not in the benchmark.

The OpenJPA experts on their forum are very cooperative, and they found that some of the problems have been fixed since the latest OpenJPA release:

http://openjpa.208410.n2.nabble.com/JPAB-results-tc5693298.html#a5706903

The OpenJPA_R2 run uses the same benchmark code but with a snapshot of OpenJPA 2.1 that fixes some of these problems, and indeed the results are better and with fewer failures.

Regarding MySQL cluster - I have to check but maybe it could be tested with the current benchmark code, by only modifying the properties files.

Cristina Belderrain replied on Wed, 2010/11/10 - 10:38pm

Ilan,

if I understand you well, you're comparing two completely different things... You're comparing ORM (in fact, a pair consisting of an OR mapper like Hibernate plus a RDBMS like MySQL) with an object database that, by definition, doesn't need ORM. Am I that wrong?

Of course, ObjectDB, being an object, not a relational, database, shows a much better performance than the pairs I've just mentioned as there's no translation / mapping at all involved in its operation. Don't you think such an advantage is not only expected but even obvious given the way an object database deals with objects?

On the other hand, I would really like to hear what you have to say about the theoretical (to me, at least) interoperability between the various JPA 2.0 implementations. Were you actually able to just change the provider name and its properties in persistence.xml and then use the very same test code after the provider was replaced? I'm asking because my experience regarding this subject has been just awful... That's true, I'm talking about JPA 1.0 implementations, so things might be getting better now that JPA 2.0 implementations are replacing those.

Thanks so much,

Cristina

Ilan Kirsh replied on Thu, 2010/11/11 - 12:50am in response to: Cristina Belderrain

Hi Cristina,

You are right - it is expected that an object database will be faster than a combination of ORM and RDBMS, but the benchmark exposes a very large performance gap, at least larger than what I expected.

You are also right about the problems in switching between different JPA implementations. In theory it should be simple. In practice, however, there are issues also with JPA 2.

Actually the benchmark demonstrates this point by presenting the test failures of every ORM/RDBMS combination. Some popular combinations such as Hibernate/MySQL pass all the tests. Other combinations fail on some tests - but all the tests use pure JPA.

So this benchmark can also be useful for selecting a more stable JPA/DBMS combination and not just for selecting a faster JPA/DBMS combination.

Andrew McVeigh replied on Thu, 2010/11/11 - 6:26am in response to: Cristina Belderrain

> Of course, ObjectDB, being an object, not a relational, database, shows a much better performance than the pairs I've just mentioned as there's no translation / mapping at all involved in its operation. Don't you think such an advantage is not only expected but even obvious given the way an object database deals with objects?

This is a really interesting point from my perspective.  I have worked a lot with object databases in the past - things like objectstore, versant etc.  they certainly were fast (we used to do over 20k transactions a second on versant on sparc hardware in telecomms in the '90s) *but* they had a fundamental limitation -- you couldn't easily traverse against the direction of the object pointers.  so, for instance, the versant sql interface was fatally limited - if you did a join against the pointers it would run out of memory or take ages.  Reporting tools for these systems were virtually impossible to make.

JPA forces a different view though, and I was surprised to see that an object database can support it efficiently.  In essence, the JPA query language forces an SQL view of the world with selects etc.  hence my thinking that objectdb must be a bit "relational" under the covers.  certainly in my use of it, it seems unconstrained by the usual object database restrictions.  i very strongly suspect you could build a SQL engine inside or on top of it -- possibly even as a translation from SQL into JPQL.

What I find fascinating is that objectdb supports JPQL *as well* as keeping the traditional object database performance improvements. I have no idea if the performance increase is simply because of fewer layers, or some other aspect of the insides.  What is further interesting about the graphs above is that the y-scale is logarithmic, which downplays the actual performance advantages of objectdb!  If you look at the linear graphs, the difference is very stark - close to 10x.  It's almost like the author of objectdb is embarrassed by the impressive performance of his own system compared to the ORMs...
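
To see why a logarithmic y-axis compresses the gap, here is a small arithmetic sketch. The two scores are made-up numbers chosen only to be 10x apart; they are not results from jpab.org.

```java
public class LogScaleSketch {

    /** Bar height, in decades, on a base-10 logarithmic axis. */
    public static double logHeight(double score) {
        return Math.log10(score);
    }

    public static void main(String[] args) {
        double slow = 100.0;   // hypothetical score
        double fast = 1000.0;  // hypothetical score, 10x higher

        // On a linear axis the taller bar is ten times the height of the shorter one.
        System.out.println("linear ratio: " + fast / slow);

        // On a log axis the same pair differs by a single decade (3 versus 2).
        System.out.println("log-axis gap: " + (logHeight(fast) - logHeight(slow)));
    }
}
```

So a 10x difference shows up as one extra grid step on the log chart, which is why the linear view looks so much starker.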

Now, i've previously had a number of arguments with people like Gavin King about object databases and their advantages.  Objectdb seems a lot closer to gavin's conceptual model of how such a system should work, like a hybrid between an SQL model and an object model where a foreign key acts like an object pointer.   However, i'd be very interested in seeing if an ORM can even come close to objectdb in performance.

So, in a nutshell, it seems that objectdb isn't "your father's object database".  it's a different beast with the performance advantages of native object mapping *and* the advantages of SQL-like data independence.  that's why i think it is unusual and possibly unique.

(disclaimer: i own 2 copies of objectdb, and use it in a commercial product.  however, i'm not affiliated with objectdb in any other way.  i found out about it when i needed a faster database for my JDO work.  i persist complex UML2 models in it for a team version of my commercial product)

 

Thomas J. Clancy replied on Thu, 2010/11/11 - 7:06am

Hmm... This doesn't seem biased in any way, does it? Oh wait... of course it's biased.

Andrew McVeigh replied on Thu, 2010/11/11 - 7:43am in response to: Thomas J. Clancy

> This doesn't seem biased in any way, does it? Oh wait... of course it's biased.

Everything's biased in its own way --  it's impossible to avoid *any* sort of bias.  That's what makes benchmarking so damned difficult.  Want to profile one JPA provider against others? -- what you choose and what you emphasise will automatically bias towards a certain result or technology.  No one likes benchmarks particularly, but they are a necessary evil -- and this one has done the right thing by making it open source and asking for tuning parameters.

So, do you care to expand on this "bias"? Do you have suggestions to remove it?  Do you have any insight in a way to avoid bias when benchmarking 2 obviously different technologies (object database versus ORM/sql) which share the same interface?

Jessie Mear replied on Wed, 2011/09/07 - 6:46am

To make it simple, one view needs to perform some cleanup when it is unloaded, but that cleanup erases some data that is needed for another view. java programmers

Carla Brian replied on Wed, 2012/08/01 - 6:00pm

JPA requires Java 5 or higher, as it makes heavy use of new Java language features such as annotations and generics. - Mercy Minisrries
