SQL Zone is brought to you in partnership with:

Eyal Lupu is a Java/JEE software architect, consultant and trainer. Working as a software professional since 1993 Eyal is specialized in architecting, designing and building infrastructures and frameworks for software development teams. Eyal is also specialized in technology leadership of globally distributed development teams. He blogs at http://blog.eyallupu.com Eyal is a DZone MVB and is not an employee of DZone and has posted 6 posts at DZone. You can read more from them at their website. View Full User Profile

Hibernate - Tuning Queries Using Paging, Batch Size, and Fetch Joins

06.09.2008
| 148195 views |
  • submit to reddit

Final Tuning - BatchSize

Does it mean that in the case of paging we shouldn't use a join fetch? usually it does (unless your page size is very close to the actual number of records). But even if you use a simple select this is a classic case for using the @BatchSize annotation.

If my session/entity manager has 100 customers attached to it than, be default, for each first access to one of the customers' order collection Hibernate will issue a SQL statement to fill that collection. At the end I will execute 100 statements to fetch 100 collections. You can see it in the log:

Hibernate: /* select c from Customer c order by c.id */ select customer0_.id as id0_, customer0_.ccNumber as ccNumber0_, customer0_.name as name0_, customer0_.fixedDiscount as fixedDis5_0_, customer0_.DTYPE as DTYPE0_ from CUSTOMERS customer0_ order by customer0_.id limit ? offset ?

Hibernate: /* load one-to-many par2.Customer.orders */ select orders0_.customer_id as customer4_1_, orders0_.id as id1_, orders0_.id as id1_0_, orders0_.customer_id as customer4_1_0_, orders0_.description as descript2_1_0_, orders0_.orderId as orderId1_0_ from ORDERS orders0_ where orders0_.customer_id=?
Hibernate: /* load one-to-many par2.Customer.orders */ select orders0_.customer_id as customer4_1_, orders0_.id as id1_, orders0_.id as id1_0_, orders0_.customer_id as customer4_1_0_, orders0_.description as descript2_1_0_, orders0_.orderId as orderId1_0_ from ORDERS orders0_ where orders0_.customer_id=?
Hibernate: /* load one-to-many par2.Customer.orders */ select orders0_.customer_id as customer4_1_, orders0_.id as id1_, orders0_.id as id1_0_, orders0_.customer_id as customer4_1_0_, orders0_.description as descript2_1_0_, orders0_.orderId as orderId1_0_ from ORDERS orders0_ where orders0_.customer_id=?
............
Hibernate: /* load one-to-many par2.Customer.orders */ select orders0_.customer_id as customer4_1_, orders0_.id as id1_, orders0_.id as id1_0_, orders0_.customer_id as customer4_1_0_, orders0_.description as descript2_1_0_, orders0_.orderId as orderId1_0_ from ORDERS orders0_ where orders0_.customer_id=?
Hibernate: /* load one-to-many par2.Customer.orders */ select orders0_.customer_id as customer4_1_, orders0_.id as id1_, orders0_.id as id1_0_, orders0_.customer_id as customer4_1_0_, orders0_.description as descript2_1_0_, orders0_.orderId as orderId1_0_ from ORDERS orders0_ where orders0_.customer_id=?
Hibernate: /* load one-to-many par2.Customer.orders */ select orders0_.customer_id as customer4_1_, orders0_.id as id1_, orders0_.id as id1_0_, orders0_.customer_id as customer4_1_0_, orders0_.description as descript2_1_0_, orders0_.orderId as orderId1_0_ from ORDERS orders0_ where orders0_.customer_id=?





The @BatchSize annotation can be used to define how many identical associations to populate in a single database query. If the session has 100 customers attached to it and the mapping of the 'orders' collection is annotated with @BatchSize of size n. It means that whenever Hibernate needs to populate a lazy orders collection it checks the session and if it has more customers which their orders collections need to be populated it fetches up to n collections. Example: if we had 100 customers and the batch size was set to 16 when iterating over the customers to get their number of orders hibernate will go to the database only 7 times (6 times to fetch 16 collections and one more time to fetch the 4 remaining collections - see the sample below). If our batch size was set to 50 it would go only twice.

@OneToMany(mappedBy="customer",cascade=CascadeType.ALL, fetch=FetchType.LAZY)
    @BatchSize(size=16)
    private Set<Order> orders = new HashSet<Order>();


And in the log:

Hibernate: /* select c from Customer c order by c.id */ select customer0_.id as id0_, customer0_.ccNumber as ccNumber0_, customer0_.name as name0_, customer0_.fixedDiscount as fixedDis5_0_, customer0_.DTYPE as DTYPE0_ from CUSTOMERS customer0_ order by customer0_.id limit ? offset ?
Hibernate: /* load one-to-many par2.Customer.orders */ select orders0_.customer_id as customer4_1_, orders0_.id as id1_, orders0_.id as id1_0_, orders0_.customer_id as customer4_1_0_, orders0_.description as descript2_1_0_, orders0_.orderId as orderId1_0_ from ORDERS orders0_ where orders0_.customer_id in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: /* load one-to-many par2.Customer.orders */ select orders0_.customer_id as customer4_1_, orders0_.id as id1_, orders0_.id as id1_0_, orders0_.customer_id as customer4_1_0_, orders0_.description as descript2_1_0_, orders0_.orderId as orderId1_0_ from ORDERS orders0_ where orders0_.customer_id in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: /* load one-to-many par2.Customer.orders */ select orders0_.customer_id as customer4_1_, orders0_.id as id1_, orders0_.id as id1_0_, orders0_.customer_id as customer4_1_0_, orders0_.description as descript2_1_0_, orders0_.orderId as orderId1_0_ from ORDERS orders0_ where orders0_.customer_id in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: /* load one-to-many par2.Customer.orders */ select orders0_.customer_id as customer4_1_, orders0_.id as id1_, orders0_.id as id1_0_, orders0_.customer_id as customer4_1_0_, orders0_.description as descript2_1_0_, orders0_.orderId as orderId1_0_ from ORDERS orders0_ where orders0_.customer_id in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: /* load one-to-many par2.Customer.orders */ select orders0_.customer_id as customer4_1_, orders0_.id as id1_, orders0_.id as id1_0_, orders0_.customer_id as customer4_1_0_, orders0_.description as descript2_1_0_, orders0_.orderId as orderId1_0_ from ORDERS orders0_ where orders0_.customer_id in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: /* load one-to-many par2.Customer.orders */ select orders0_.customer_id as customer4_1_, orders0_.id as id1_, orders0_.id as id1_0_, orders0_.customer_id as customer4_1_0_, orders0_.description as descript2_1_0_, orders0_.orderId as orderId1_0_ from ORDERS orders0_ where orders0_.customer_id in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: /* load one-to-many par2.Customer.orders */ select orders0_.customer_id as customer4_1_, orders0_.id as id1_, orders0_.id as id1_0_, orders0_.customer_id as customer4_1_0_, orders0_.description as descript2_1_0_, orders0_.orderId as orderId1_0_ from ORDERS orders0_ where orders0_.customer_id in (?, ?, ?, ?)


Back to our test case. In my example setting the batch size to 100 looks like a nice tuning opportunity. And indeed when setting it to 100 the total execution time dropped to 188 millis (that's an 132 (!!!) times faster than worse result we had). The batch size can also be set globally by setting the hibernate.default_batch_fetch_size property for the session factory.

From http://www.jroller.com/eyallupu/

Published at DZone with permission of Eyal Lupu, author and DZone MVB.

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)

Comments

Koen VT replied on Mon, 2008/06/09 - 9:42am

I had a similar problem once. Since our database has millions of rows and we needed optimal performance, I had no choice but to use a native query. This is an example for Oracle, where the first 100 customers are retrieved, together with their orders (if any):

SELECT
A1.*,
ORD.*
FROM (
SELECT
ROWNUM RNUM,
A2.*
FROM(
SELECT
CUST.ID,
CUST.NAME
FROM
CUSTOMER CUST
ORDER BY CUST.NAME ASC
) A2
WHERE ROWNUM < 100
) A1
LEFT OUTER JOIN
ORDER ORD
ON
ORD.IDCUST = A1.ID
WHERE
RNUM >= 0;

In fact, this query returns the 100 customers in 100 records or more (if there are customers with more than one order)!

I'm not a JPA expert, but I know JPA has support for native queries. Sometimes, for optimal performance, you have no other choice.

I agree that a native query (in this case for Oracle) is not portable to other databases, but the same is true for the @BatchSize annotation at a certain level. It only works with Hibernate and you have to rewrite the code when you want to use OpenJPA, Toplink or whatever as a JPA implementation.

S. Radojcin replied on Sun, 2009/08/16 - 4:18am

"I have in my database 18,998 customer records, each with few orders."

What if you have a _lot of orders for each customers, and want to use pagination for those orders (for example, in order to display orders in a separate data table, with dataScroller, as ussual) ? How to determine total number of orders for a particular customer ? I suppose that 

c.getOrders().size();

will be too expensive, or I am wrong ?

 

Further, how to load just , let say, 50 orders for a particular customer ? Iterating throughoutc.getOrders(), or execute separate Query, with setFirstResult()/setMaxResult() ?

- regards, Cvele

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.