SQL Zone is brought to you in partnership with:

More than 10 years of experience in Enterprise Applications development, and Project Management (since June 2001). Specialties: Enterprise Applications, J2EE, J2SE, Spring Framework, Spring Security, Spring Webflow, JPA and JDBC, JMX, JAXB, JUnit; XML, XSLT and XSL-FO; HTML/CSS, Javascript, JSON, jQuery/jQuery UI; RDBMS (Oracle, SQL Server, MySQL), PL/SQL; UML; Mobile Applications, mostly in Objective-C targeted iPhone/iPad Michal is a DZone MVB and is not an employee of DZone and has posted 12 posts at DZone. You can read more from them at their website. View Full User Profile

JPQL - Pagination on Oracle Database with Hibernate

08.27.2012
| 10471 views |
  • submit to reddit

In your daily work, you rely on many different libraries, trusting they will serve you well, being perfect piece of code ... do you? ... really?! Then it's time to realize that you are perfectly wrong :) Increasing complexity of code leads to new possibilities of making errors :) Many of them are lurking in the libraries used by you, even if they are used for years by thousands of developers.

Let's find some example. Suppose that we are using JPA, and have an entity named Employee, which contains at least two properties: name and id. Suppose that we want to display all employees ordered by name, and paginated. To fetch them from the underlying database we will need JPQL query like this:

 select e from Employee e order by e.name

We will paginate it using setFirstResult and setMaxResults methods of javax.persistence.Query interface. Now we need a JPA provider and database to make it work, let's choose Hibernate, and Oracle (10+).

At first try everything works perfectly :) - but let's assume that we have employees sharing the same name, for ex. 20 of them having name 'Smith' (identifiers between 1 and 20), 10 having name 'Donovan' (identifiers between 21 and 30) and 10 having name 'Johnson' (identifiers between 31 and 40) - total 40 employees. Let's try to display 5 employees on single page, and see what will happen:

Page 1 - employees having ID: 21, 25, 24, 23, 22 - all having name 'Donovan' - good :)
Page 2 - employees having ID: 26, 25, 24, 23, 22 - all having name 'Donovan' - but 4 of them were already displayed on first page (!)
Page 3 - employees having ID: 31, 35, 34, 33, 32 - here comes the 'Johnson' name - good again :)
Page 4 - employees having ID: 36, 35, 34, 33, 32 - 'Johnson' again, and again 4 of them were already displayed on third page (!!)
Page 5 - employees having ID: 1, 17, 18, 19, 20 - here comes the 'Smith' name - good again :)
and finally the real surprise - Pages 6, 7 and 8 contains same employees - having ID: 16, 17, 18, 19, 20

Don't you think something is wrong here ?! ;) Well, the reason of this strange error is visible when you check SQL queries generated by Hibernate:

select * from (
    select
        employee0_.EMPLOYEE_ID as EMPLOYEE1_0_, employee0_.DATE_OF_BIRTH as DATE2_0_,
        employee0_.EMPLOYEE_NAME as EMPLOYEE3_0_
    from EMPLOYEE employee0_
    order by employee0_.EMPLOYEE_NAME
    ) where rownum <= 5

for the first 5 rows and:

select * from (
    select
        row_.*, rownum rownum_
    from (
        select
            employee0_.EMPLOYEE_ID as EMPLOYEE1_0_, employee0_.DATE_OF_BIRTH as DATE2_0_,
            employee0_.EMPLOYEE_NAME as EMPLOYEE3_0_
        from EMPLOYEE employee0_
        order by employee0_.EMPLOYEE_NAME
    ) row_
    where rownum <= 10)
where rownum_ > 5

for the rows 6 - 10 (and similar for next pages).

What the heck?! - you may say - These beautiful SQL queries are suggested for pagination on Oracle's website - see Tom Kyte's article: On ROWNUM and Limiting Results - sure :) - but the one who implemented it in Hibernate didn't read this article too deeply, skipping this important part:

One important thing about using this pagination query is that the ORDER BY statement should order by something unique. If what you are ordering by is not unique, you should add something to the end of the ORDER BY to make it so.

As you see, correct JPQL query leads to invalid SQL query for Hibernate / Oracle combination, and the error shows himself only for some combinations of data in the database - it is very difficult to spot because of that.

Do you still think that libraries used by you are perfect? :)

PS: This article has been created thanks to one of my Colleagues - Joanna Głowińska - her awesome work on SQL queries in some of our projects lead me to the above thoughts.
 

Published at DZone with permission of Michal Jastak, author and DZone MVB. (source)

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)

Comments

Tomasz Wermiński replied on Tue, 2012/08/28 - 3:40am

This is rather a problem of a wrong SQL query - not a Hibernate/JPA. Results without any ORM provider are very similar to yours.

On PostgreSQL I've just created table with 20 Smiths (ids: 1-20), then 10 Donovans (ids: 21-30), and 10 Johnsons (ids: 31-40).

Then I wrote the first query:

select * from tmp_employee order by name limit 5;
and I got:

  id | name
----+---------
22 | Donovan
23 | Donovan
21 | Donovan
25 | Donovan
24 | Donovan

with the second query:

 

select * from tmp_employee order by name offset 5 limit 5;

 

I got:

id | name
----+---------
27 | Donovan
21 | Donovan
22 | Donovan
30 | Donovan
29 | Donovan

Each page contains rows with id 21 and 22 - and it's not a surprise.

The conclusion is only one - above SQL query (with ordering by non unique column) isn't correct. Developer has to know that he needs to order by unique column in his pagination query.

Karl Peterbauer replied on Tue, 2012/08/28 - 5:26am

I'd rather say: Do you still think that the database used by you is perfect? ;-)

Michal Jastak replied on Tue, 2012/08/28 - 9:36am in response to: Tomasz Wermiński

@Tomasz - I agree with you, this is problem with wrong SQL query, but please notice, that at the very beginning there is correct JPQL query (select e from Employee e order by e.name) which is converted by Hibernate (used JPA provider) into wrong SQL query - the developer operates on the JPA/JPQL level, assuming that JPA provider will correctly transform the JPQL query into SQL ... 

Michal Jastak replied on Tue, 2012/08/28 - 9:43am in response to: Karl Peterbauer

@Karl - :), indeed - the underlying database isn't perfect :) - but its limitations are known - see for ex. Tom Kyte's article:On ROWNUM and Limiting Results - the problem is that Hibernate Developers ignored it, and thus we all have to worry about it :(

Gavin King replied on Tue, 2012/08/28 - 7:37pm

Well, Michal, I'm afraid your analysis is quite wrong, though it did take me a few seconds of thought to figure out exactly how, and I admit that you were probably led astray by the gloss that Hibernate books/tutorials usually give on setFirstResult()/setMaxResults(), so I guess you can't be blamed for the error.

The issue here is that setFirstResult()/setMaxResults() don't actually do pagination as such. All they do is, as their names suggest, limit the results returned from execution of a Query. No more, no less. Indeed, I think the way the JPA Query API is structured makes pretty clear that this is what is going on. If it were actual pagination of the results of a single query execution, then you would expect a quite differently-structured API, with some kind of special Result object.

Now, you can use setFirstResult()/setMaxResults() to implement pagination by applying them to a query that you know always returns its results in the same order. But your example query here is underconstrained from this point of view—there is no way you could reasonably expect the results of different executions of this query to return its results in the same order.

To see this a bit more clearly, imagine an even less-constrained query:

from Employee e

 

I don't think anyone would expect that multiple executions of this query would return the query results in the same order, and so the expectation must be that you simply can't paginate this query. And so the same applies to your query. So the answer is: you can only use these methods to implement pagination is you have a query that is sufficiently constrained that its results always come back in the same order. This is pretty easy to achieve, for example:

from Employee e order by e.name, e.id

 

The other lesson to take away from this is: always experiment and understand the actual semantics of APIs for yourself, instead of just accepting the gloss given by a book/tutorial, or even by the API doc itself, which very often skips over some details and preconditions.

Finally, be careful! There are even worse issues lurking under the covers of setFirstResult()/setMaxResults()! For example, if you try to apply these operations to a query with a join fetch in it, then you definitely wont get the results you're looking for! (This is documented in the spec, and isn't a bug, by the way.) There are some problems that JPA/Hibernate simply can't solve, given the limitations of SQL.

Cheers, Gavin

Michal Jastak replied on Wed, 2012/08/29 - 1:02am in response to: Gavin King

@Gavin - First of all :) - What I'm speaking about is JPA/JPQL usage with Hibernate as JPA provider - API referred by me can be found at javax.persistence.EntityManager and javax.persistence.Query and has nothing to do with Hibernate API. Also the number of Hibernate books / tutorials read by me has nothing to do with this example :)

JPQL query used by me: select e from Employee e order by e.name - produces results being Employee entity, not the rows on the database layer, thus I can assume that setFirstResult/setMaxResults methods from javax.persistence.Query API will return Employees from specific range, and can be used for pagination in this case - unless you have any other suggestion how we could implement pagination using JPA API.

As a developer I can assume that if I use ordering of any kind in JPQL the results will be ALWAYS returned in the same order, regardless of the database content (until database content will be changed). This is my understanding of word ordering - maybe different than your own. To be clear, I don't expect that JPQL query without any ordering at all (mentioned by you) will return results ordered anyhow.

If for some reason JPA/JPQL requires the ordering to be done over property having unique values it should be clearly stated in specs.

Finally the join fetch example mentioned by you at the end of comment - I agree that setFirstResult/setMaxResults will lead to amazing results here, because they will be applied on the SQL level, where it is possible to have more rows than expected, thanks to joins. How does it compare to my example? - I'm not speaking about all possible queries and their pagination, but the one specific :)

Have you asked yourself why we are using JPA? - one of the reasons will be probably - stop worrying about the database details, and focus on the higher level of abstraction - entities modelling our world, not the rows and columns. 

How can we do it, if JPA provider used by us works as expected for one database (MySQL), and doesn't for another (Oracle)? We are chained to those ugly rows and columns, while still dreaming about the entities sky above ...

regards for one of the most important persons in Hibernate community, from the simple developer like me :)

Gavin King replied on Wed, 2012/08/29 - 6:44pm in response to: Michal Jastak

What I'm speaking about is JPA/JPQL usage with Hibernate as JPA provider 

Michal, yes, I'm aware that you're referring to the JPA Query API. FTR, in this case there is no difference in semantics between the two APIs.

Also FTR, I am substantially responsible for the original design of both these APIs, so I think I understand the intended semantics pretty well. ;-)

 thus I can assume that setFirstResult/setMaxResults methods from javax.persistence.Query API will return Employees from specific range, and can be used for pagination in this case

No, you can't assume that. Sorry. Your query doesn't fully specify the order of the result set, and therefore you can't paginate it. It's really as simple as that. To prove your above assertion, you would have to find the section in the JPA spec which says that any arbitrary JPA query can be paginated using setFirstResult/setMaxResults. As far as I know there is no such language in the spec. (It's remotely possible that there is such language, and I'm unaware of it, but I doubt it.)

unless you have any other suggestion how we could implement pagination using JPA API. 

I already suggested a solution above: modify your JPAQL so that it includes the employee primary key field in the order by clause.

As a developer I can assume that if I use ordering of any kind in JPQL the results will be ALWAYS returned in the same order, regardless of the database content (until database content will be changed).

I honestly have no idea where you get this idea from. Some orders are total orders, some orders are partial orders. In your case you have a partial order. If you add the primary key, as I suggested, you will have a total order. Again, unless you can show me some language in the JPA spec that supports the assertion you make above, then there is simply no basis for asserting it. 

 This is my understanding of word ordering - maybe different than your own. To be clear, I don't expect that JPQL query without any ordering at all (mentioned by you) will return results ordered anyhow.

Your or my understanding of the word "ordering" is pretty much rrelevant here. What matters is what the specification explicitly requires of the JPA implentation. That's why we have specs&mdash;so that people don't have to argue over stuff like this. In fact, the word "order" is itself quite vague. My advice is to be careful to use the term "total order" when you mean a total order, and "partial order" when you mean a partial order. The JPAQL and SQL order by clauses can be used to express both partial and total orders.

If for some reason JPA/JPQL requires the ordering to be done over property having unique values it should be clearly stated in specs. 

Spec writers almost always take the view that a behavior not explicitly required is implicitly not required. It's usually much easier to specify positive requirements than unspecify non-requirements. Of course, we usually do highlight cases which we think will confuse people, but in this case I guess we didn't think confusion would arise since the behavior is exactly what anyone with an SQL background would expect.

 How can we do it, if JPA provider used by us works as expected for one database (MySQL), and doesn't for another (Oracle)? 

Of course, you should always test your code on all backends, since differences in behavior are to be expected. And you shoudn't depend on semantics that you imagine are in the spec without checking the spec first.

 We are chained to those ugly rows and columns, while still dreaming about the entities sky above ...

Our advice, repeated over and over ad nauseam, is that taking this approach to the database is almost certain to lead you into problems. The database is not some nasty 'orrible implementation detail of your gorgeous Java entities, rather, it is an equally-valid (or perhaps equally-broken) representation of the real world or at least of your mental model of the real world. ORM exists to bridge the two representations, and to use it effectively, you need to be able to freely switch between thinking of the world in terms of objects, and thinking of them in terms of relations. One of the things that has helped me through my whole career is that I learned early to be able to juggle multiple representations/models/paradigms/points of view at once.

Believe it or not, being able to quickly switch to a relational model of a thing sometimes helps me understand and improve my object model of it.

Michal Jastak replied on Thu, 2012/08/30 - 1:07am in response to: Gavin King

@Gavin - Yes, I'm aware that you are one of the persons responsible for Hibernate and JPA APIs creating :)  

You wrote, that I cannot assume setFirstResult/setMaxResults can be used for pagination:

No, you can't assume that. Sorry. Your query doesn't fully specify the order of the result set, and therefore you can't paginate it. It's really as simple as that. To prove your above assertion, you would have to find the section in the JPA spec which says that any arbitrary JPA query can be paginated using setFirstResult/setMaxResults. As far as I know there is no such language in the spec. (It's remotely possible that there is such language, and I'm unaware of it, but I doubt it.)

 Let me cite JPA 2.0 Spec then, page 106

The Query and TypedQuery APIs are used for the execution of both static queries and dynamic queries. These APIs also support parameter binding and pagination control. 

and page 121:

The effect of applying setMaxResults or setFirstResult to a query involving fetch joins over collections is undefined. 

The last one is applicable to your join fetch example from the first comment.

Do you still believe that I cannot use setFirstResult / setMaxResults for the pagination in my example?

From the other side - let's look at the ordering in JPQL - see JPQL Language Reference - Section 10.2.8 - JPQL ORDER BY Clause  - there is not a single word about partial and total order mentioned by you, and not a single word about the necessity of using only total order - don't you think it should be mentioned in specs if that's so important for us?

There is also one important thing there:

The ordering of the query result is preserved in the result of the query method if the ORDER BY clause is used. 

Don't you think all of the above proves my example is reasonable, and should be correctly handled by the library which claims to be JPA provider?

You wrote also:

Spec writers almost always take the view that a behavior not explicitly required is implicitly not required. It's usually much easier to specify positive requirements than unspecify non-requirements. Of course, we usually do highlight cases which we think will confuse people, but in this case I guess we didn't think confusion would arise since the behavior is exactly what anyone with an SQL background would expect. 

I agree that writing specs is not a simple task, but don't you think that so important thing should be there? It's like you wrote in specs: You can divide a number by another, and forgot to specify that the second one cannot be zero :)

I would also say that API creators should include only those methods in their API which are necessary, intuitive and well explained - if we have methods which usage can be tricky, they should be extremely well explained. Especially if they will be used by many other developers. 

Of course, you should always test your code on all backends, since differences in behavior are to be expected. And you shoudn't depend on semantics that youimagine are in the spec without checking the spec first. 

IMHO this is bug in Hibernate, which is not compliant with JPA Specs at this point.

Using JPA I'm assuming that it's JPA provider dirty work to avoid differences between the backends and give me a level of abstraction a little higher than SQL :) - otherwise I'll stay with raw SQL, and will be responsible for my own bugs only.

Andrew Thorburn replied on Thu, 2012/08/30 - 3:04am

A minor note: I've had issues in the past with DB2 returning rows in a different order after a REORG, even if the table itself hasn't actually changed (not using JPA, but still). As such, if you need the rows returned in a guaranteed, *repeatable* order, then make sure that you always include a total order. Even if you get lucky and your partial order returns the rows in the same order over multiple selects, there's no guarantee (depending on the database) that they will continue to be returned in that same order over the long term. So *always* use a total order.

 In this case, you're lucky and it's failing upfront (rather than days or months down the track!).

Karl Peterbauer replied on Thu, 2012/08/30 - 3:04am in response to: Michal Jastak

@Michal:

IMHO this is bug in Hibernate, which is not compliant with JPA Specs at this point.

Nope. The whole story boils down to the simple fact that Hibernate's strategy for implementing setFirstResult()/setMaxResults() on Oracle is somewhat bold. Blame Oracle for not supporting a simple LIMIT clause. If you have a better strategy you are free to implement your own Oracle dialect.

Using JPA I'm assuming that it's JPA provider dirty work to avoid differences between the backends and give me a level of abstraction a little higher than SQL :) - otherwise I'll stay with raw SQL, and will be responsible for my own bugs only.

JPA provides a level of abstraction which is dramatically higher than SQL, and it does a good job in smoothing out many vendor-specific details. But I have to second Gavin: JPA is not a silver bullet for solving all subtle differences between backends, and it does not claim to do so.

Get used to it: There are way too many subtle differences to guarantee identical behavior on all backends. For example, MySQL happily executes the trivial, but flawed query "SELECT DISTINCT name FROM Employees ORDER BY id", while PostgreSQL refuses to do so, since the order-by clause does not appear in the SELECT list. 

Nilanjan Mhatre replied on Thu, 2014/11/27 - 1:42am

check this out

http://www.jroller.com/sjivan/entry/hibernate_and_oracle_pagination_gotcha

You need to have a unique key as a secondary sorting criteria. The results will not change, but pagination will work.

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.