
Looking Forward to JPA 2.0

04.17.2008

The current way to deal with this modeling scenario is to add, for each of the relationships in the identifier, an additional integer @Id attribute to the entity to represent the foreign key of the related entity. The new object attribute is completely extraneous and is redundantly mapped to the same join column as the relationship. Additionally, one of the two mappings must be read-only to avoid both of them trying to insert into the same column at create-time.
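For illustration, the pre-2.0 workaround for the Project example shown below might look something like this sketch (the extra deptId attribute name is made up; the relationship's join column is made read-only so that only one of the two mappings writes to D_ID):

@Entity
@IdClass(ProjectId.class)
public class Project {
    @Id String name;

    // Extraneous foreign key attribute, present only so it can be part of the id.
    @Id
    @Column(name = "D_ID")
    int deptId;

    // Maps to the same D_ID column as the id attribute above, so this mapping
    // is made read-only to keep both from inserting into the column.
    @ManyToOne
    @JoinColumn(name = "D_ID", insertable = false, updatable = false)
    Department dept;
}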

It would be better if the relationship itself could be designated as part of the identifier, avoiding the redundancy of a superfluous foreign key attribute. In JPA 2.0 this is how such an entity is modeled: an @Id annotation (or the equivalent element in the XML mapping descriptor) is placed on the relationship attribute.

The primary key class would still be required in the usual cases, where the id is composed of entity state plus a relationship foreign key, or of multiple foreign keys. The primary key class would include the foreign key attribute(s), of course, because the value is needed to retrieve an entity. The naming of the primary key class attribute is also consistent with the existing rules, whereby the attribute in the PK class must match the corresponding one in the entity, but the type of the attribute in the PK class would be integral instead of the entity type of the relationship target. A straightforward example illustrates this much better than any number of words.

@Entity
@IdClass(ProjectId.class)
public class Project {
    @Id String name;

    @Id
    @ManyToOne
    @JoinColumn(name = "D_ID")
    Department dept;
}

public class ProjectId {
    String name;
    int dept;

    // As with any id class, equals() and hashCode() based on these fields are also required.
}

Orphan Removal

Aggregation of entities, or parent-child relationships, is not all that uncommon in models. Although it was possible to model this scenario manually, unfortunately there was no specific support for it in JPA 1.0. A new orphanRemoval attribute is being added to the @OneToOne and @OneToMany relationship annotations to mark a relationship as a parent-child relationship, making it easier to implement and more explicit to model.

@Entity
public class Department {
    @Id int id;

    // mappedBy names the owning attribute in Project (the dept relationship above)
    @OneToMany(mappedBy = "dept", orphanRemoval = true)
    Collection<Project> projects;
}

If, as in the example above, the orphanRemoval attribute is set to true, then when a department is shut down its projects will also be removed. In general, two behaviors apply when orphan removal is enabled:

First, the cascade REMOVE option will automatically apply to the relationship, regardless of whether or not it was specified in the set of cascaded operations. If the parent is removed then the child will also be removed.

Second, if the relationship from the parent to the child is severed, meaning that the pointer from the parent to the child is nulled out, the child will be automatically removed. For example, setting a parent-child collection to null will cause all of the entities that were in the collection to be removed from the database when the transaction commits.
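In code, the two behaviors play out roughly as in the following sketch, which assumes the Department mapping above plus a getProjects() accessor that the snippet does not show, and an active transaction:

// Behavior 1: REMOVE cascades to the projects even if CascadeType.REMOVE
// was never listed on the relationship.
Department dept = em.find(Department.class, 1);
em.remove(dept);

// Behavior 2: severing the relationship orphans the projects, so they are
// deleted from the database when the transaction commits.
Department other = em.find(Department.class, 2);
other.getProjects().clear();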

A final note about orphan removal is that the children must be treated as privately owned by the parent. In other words, a child must never be owned by more than one parent (precisely why it is only supported by the OneToX relationships), and if a child is disowned by its parent it cannot be reassigned to a new parent. A new instance must be created in the context of its new parent.

Published at DZone with permission of its author, Mike Keith.

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)


Comments

Mike Dee replied on Mon, 2008/04/21 - 9:29pm

I fail to see the appeal of these Java persistence libraries. Perhaps someone can enlighten me. I won't use EJB 2.1 as an excuse. That is when I first came across the notion of container-managed persistence. Everyone seems to think that was awful. It was, but it took too much work to make it do anything useful anyway.

First, some of these persistence libraries seem to replace SQL with XML. Personally, I'd rather just do SQL. Why add another layer of indirection in there? This is kind of analogous to Struts and the many other frameworks compared to Wicket. Wicket just lets you manipulate HTML from straight Java. Nice and simple, without layers of stuff that don't add any value.

Second, does JPA make it any easier to do my job? Many of these types of libraries simply make it easy to do something that is already easy (in SQL). I do some pretty complex queries - queries with many joins, unions, self joins, conditional SQL, etc. Does it make that any easier? Or maybe JPA is meant for the "corporate developer" - someone who just needs to whip up a quick and dirty in-house app and doesn't know much SQL?

Mike Dee

David Chamberlin replied on Tue, 2008/04/22 - 7:25am in response to: Mike Dee

Persistence frameworks have a long history which, as you point out in your reference to EJB 2.1, has not always been a smooth one. However, given the number of more or less well developed initiatives in the space, there is clearly a perceived need for something to help bridge the gap between object-oriented languages and relational data stores. There is no single driving force for this need, but any list of reasons would have to include:

  • Ease of use: the marshalling of data returned from a relational database into memory structures in any programming language can be laborious. 
  • Modelling preferences: many developers prefer to design models in the OO domain (using class diagrams) and this is probably more natural (except for RDBAs) than designing in the relational domain.  It should be possible to add persistence to an object model that has already been designed.
  • Not wanting to have to build skills in SQL: relational algebra, of which SQL is an implementation, needs significant training investment.
  • Caching of results returned from a database for performance reasons.
  • Mapping updates made to objects in memory to the necessary updates in the database.
  • Creating clean APIs to database models - persistence frameworks obviate the need for external patterns to do this.

The fact that there are so many ORM solutions should not therefore be a surprise, with so many different conflicting requirements to be met. What JPA did amazingly well was to get an important set of these implementations to agree on a core set of functionality. It was initially hard to see how a single specification could be extracted from the quite different ORM approaches of Hibernate, TopLink and KODO, so to do this was a major achievement.

In my opinion, however, this technical success does not seem to have been reflected by a significant move towards JPA as the persistence solution of choice. A good number of developers who were previously using Hibernate have continued to use the Hibernate native API (some in the mistaken belief that because Hibernate supports JPA they are now JPA compliant). Those developers who were using JDO have tended to continue using JDO, arguing that JPA and JDO will converge.

If the changes promised in JPA 2.0 do one thing, they should address the reasons for this lack of take-up, and Mike K's focus on improved portability between providers and a better set of core features is important in realizing this goal. However, I would take issue with the choice of features described in the article.

Surely these features should be driven by projects that are planning to make full use of JPA's portability, that is, projects offering middleware functionality that want to deploy into organizations using differing JPA providers.  For me, this would deprioritize work that makes it easier to map legacy schemas, and prioritize work that extends the power and flexibility of the mapping and queries for raw performance when needed.

Examples (from JBoss) of this kind of middleware project would be Seam or jBPM, but there must be hundreds of other examples that fall into the same camp.  The JPA 2.0 designers need to listen to the portability stories of these projects to understand where the real challenges lie.

There is a close parallel to the work going on with JPA and the development a decade ago of ODBC/JDBC drivers.  Few at that time would have predicted that the differing feature sets of different database providers could be represented through a single API, but today very few developers choose to use native APIs to access databases.  The work on JPA should be looking to follow the same path.

 

Mike Keith replied on Thu, 2008/04/24 - 11:11am in response to: David Chamberlin

David,

I don't know if you are the one in a bubble or if I am, or maybe we both are in separate bubbles, but I'm not sharing your experience. I speak at many conferences and Java user groups and am seeing large numbers of people moving from proprietary APIs offered by Hibernate and TopLink to the JPA APIs offered by those products. Once they migrate to the JPA layer of those implementations some of them actually try different vendors out (either because they hit a particularly annoying bug, or are missing a feature, or their tea leaves tell them to).

There are some who have told me that they would like to move, but can't until a specific feature is supported in JPA (which is not currently there in 1.0). We are spending a good deal of time adding these kinds of features into 2.0 so that people that want to use JPA can use it. There are others who don't feel the need to move, choosing instead to stay with the proprietary APIs. Reasons for doing this vary from not being prepared to make the effort and pay the price of migrating, to not even recognizing the risk associated with having a vendor dependency. Companies that do make use of standards, though, are indeed starting new projects using JPA, and doing this en masse. Give this spec a couple more years and you will see it everywhere, of that I am sure.

As for portability, if you have seen any of my portability talks then you will know that it is a subject that is near and dear to my heart, and other group members feel the same way. We are definitely trying to reduce the ways that people can (knowingly or unknowingly) end up having to make non-zero code changes to port from one vendor to another. However, we are encouraged by the people that have changed vendors and reported that they didn't find it to be a big deal. Obviously it not only works, but seems to be the more common case.

I really liked your analogy of JDBC and JPA, and I believe that point will not be too long in coming.

-Mike
Pro EJB 3: Java Persistence API

Robin Bygrave replied on Mon, 2008/04/28 - 1:47am in response to: Mike Keith

Hi Mike,

I too struggle with the choice of features talked about for JPA 2.0.

There are two real performance/scalability-related features, and I can't understand why they don't seem to be addressed.

Partial Objects:

Are there any plans to change the query language to support Partial Objects? This not only has a big performance and scalability payoff but greatly eases the design decisions for "wide" entities. Is this being addressed in JPA 2.0?

Large Query support / Query Listener:

When you do query.getResultList() all the object graphs are loaded into the persistence context. This doesn't work too well if you want to process lots of data/object graphs on an object-graph-by-object-graph basis. Other ORMs support a per-object-graph persistence context (a la query listener/callback). Any plans for addressing this in JPA 2.0?

Obviously it would be too hopeful to address other issues such as control of isolation level, access to java.sql.Connection, fixing the mapping of Enums, or support for SQL3 table inheritance?

Any of these issues on the table for 2.0?

Thanks, Rob.

Mike Keith replied on Mon, 2008/04/28 - 8:31am in response to: Robin Bygrave

Rob,

I honestly haven't had that many people ask about partial objects, I think mainly because:

a) you can already set as many or as few of the object state and relationship attributes to be lazy
b) you can load subsets of entity data as non-entities and use those for specialized partial object views
c) although large numbers of columns do occasionally happen, it is not overly common

As a result we have not spent much time on this. We have had some discussion on more dynamic fetching, though, which would open up the door to yet another option.
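To make option (a) concrete, it is just the standard lazy-fetching hints; a minimal sketch with made-up entity and attribute names (LAZY on a basic attribute is only a hint that the provider may choose to ignore):

@Entity
public class Customer {
    @Id int id;

    String name;  // regular state, fetched with the entity

    // Wide or rarely used state can be left out of the default fetch.
    @Basic(fetch = FetchType.LAZY)
    @Lob
    byte[] photo;

    // Relationship attributes can likewise be marked lazy.
    @OneToMany(mappedBy = "customer", fetch = FetchType.LAZY)
    Collection<CustomerOrder> orders;
}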

Not sure exactly what you are asking for on the query side. JPA supports pagination and gives you the ability to execute a query multiple times to sequence through the result, but it doesn't have built-in support for large collections or logical collection cursors. Is that what you mean? Haven't seen listener/callback-based query responses since back in the object database days and haven't had a single person ask about that one (okay, now I have had one :-).
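As a rough sketch of the pagination approach just mentioned, a chunked loop like the one below keeps the persistence context from growing without bound during batch-style processing; the page size and the process() call are placeholders:

Query query = em.createQuery("SELECT p FROM Project p ORDER BY p.name");
int pageSize = 500;
for (int first = 0; ; first += pageSize) {
    @SuppressWarnings("unchecked")
    List<Project> chunk = query.setFirstResult(first)
                               .setMaxResults(pageSize)
                               .getResultList();
    if (chunk.isEmpty()) {
        break;
    }
    for (Project p : chunk) {
        process(p);  // placeholder for the per-object-graph work
    }
    em.flush();
    em.clear();      // detach the processed chunk so it can be garbage collected
}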

All comments and suggestions are encouraged, though, and can also be made to the group feedback list:

    persistence-feature-requests [suffixed by: at-sun-dot-com]

-Mike
Pro EJB 3: Java Persistence API

Robin Bygrave replied on Tue, 2008/04/29 - 7:33am

BTW: I have replied to you but my reply is "under moderation". I included some links so perhaps that was why... if my reply doesn't show up tomorrow I'll try posting it again.

Rob. 

 

Joe Farmer replied on Wed, 2008/04/30 - 12:20pm

I like these kinds of "cheerleading" articles. They really make you feel good. Nice and warm: relax, big brothers will take care of you.

In reality, though, ORM first kills your performance by executing hundreds of unnecessary queries, and after that tries to fix some of the problems it created by caching (in many cases unnecessary, and killing your memory if you do not have unlimited resources), or by lazy loading, which is an anti-pattern in the majority of cases. If you also take into account its over-synchronized object model, which actually makes it next to impossible to scale an application even with the best clustering products, then you really feel comforted.

Mike Keith replied on Wed, 2008/04/30 - 12:25pm

Little brother,

The vast majority of applications that use ORM find that it not only increases performance but reduces the code they have to manage to keep their Java objects in sync with the database. That actually does make people feel warm and fuzzy because it means that a huge chunk of functionality just got lifted off their shoulders. They get even warmer when they realize that the new functionality represents over a decade of experience and is likely going to be better than what their whole team of developers would produce in a year.

Of course, not every technology is right for every application, and clearly ORM is not a perfect fit for some, but the claims that you make are not only ridiculously false, but must be based on either simple naivete/lack of experience, or a lack of competence. Sounds like you really do need a big brother to come and hold your hand while explaining the facts...

Joe Farmer replied on Wed, 2008/04/30 - 1:01pm in response to: Mike Keith

"The vast majority of applications that use ORM" and contain up to 1000 records in their tables or run on 200 CPU boxes ...

ORM is a perfect solution for a prototype, or for an enterprise system that can afford enormous investments in infrastructure. It reduces the amount of code, but that does not mean the application becomes easier to maintain. In many cases it is quite the opposite, since you lose control over it. Plus, now not only do you have to have a good knowledge of SQL, but you also have to predict how your OQL or whatever translates into SQL. That really makes your life easier.

I do not argue that ORMs have some advantages when it comes to functionality, but the means by which they are achieved are wrong, IMHO. Yes, ORMs have a decade of experience, but their ideology is still tainted by ideas rooted in EJB 1.

And Mike, please do not take it personally, even though I appreciate your offer to hold my hand.

Mike Keith replied on Wed, 2008/04/30 - 1:16pm in response to: Joe Farmer

[quote] I do not argue that ORMs have some advantages when it comes to functionality, but the means by which they are achieved are wrong, IMHO. Yes, ORMs have a decade of experience, but their ideology is still tainted by ideas rooted in EJB 1. [/quote]

That is not only patently untrue, but would be a disruption of space and time if it were. ORM was developed years before EJB persistence was even released.

The reality is that the company that I worked for at the time actively tried (and failed) to get the EJB group to pursue simple ORM principles. EJB decided to go its own way, a way in complete opposition to the POJO model that the ORM products were and are still subscribing to.

Joe Farmer replied on Wed, 2008/04/30 - 1:40pm in response to: Mike Keith

Alright, I will elaborate. I mean the ideas that caching can compensate for an inefficient SQL model, single-instance-of-an-object-in-the-cluster locking, the belief that you can forget that the underlying storage is a relational database, and so on. You are probably right, those ideas existed even before EJBs, which does not make them any better. JPA is way superior to its previous incarnations, but it still does not make me jump up in celebration. It is moving in the right direction, but just too slowly.

Mike Keith replied on Wed, 2008/04/30 - 6:57pm in response to: Joe Farmer

I honestly don't know where you came to those conclusions, but they do not represent the philosophies of ORM.

a) Caching absolutely can't and shouldn't compensate for a bad data model. The thing is, ORM doesn't have any business creating the data model. That is the domain of the DBA. An ORM maps object state to an existing data model that is finely tuned by data modeling professionals.

b) ORM has nothing to do with clustering. Although some ORM products choose to also have a clustering solution they vary according to the product and strategy. I don't know of one that enforces an exclusive lock on a single instance in a cluster, but sometimes you get what you pay for...

c) No ORM developer should ever forget what the underlying data store is, nor should they be ignorant of database concepts. ORM does not dumb down access from the object model to the db; it provides an extensive framework that makes the necessary features available to knowledgeable users so that people do not have to keep reinventing the same wheel over and over again.

Note that JPA is not something that you can compare to ORM. It is no worse and no better than ORM. It is a spec that encapsulates and standardizes what the ORM industry has been doing since the beginning (and remember that EJB was not ORM).

I hope this clears things up a little for you. Not that you are going to run out and try the next ORM product you bump into, but I am hoping that at least you won't go around spreading falsehoods...

-Mike

Joe Farmer replied on Wed, 2008/04/30 - 9:44pm in response to: Mike Keith

Strangely enough, I agree with everything you just said (except for me spreading falsehoods :)). What I wrote does not apply to JPA per se, but rather to the ORM/EJB type of approach and implementations (stateful entity beans and all that...). Unfortunately, many people do not realize all their complexities and pitfalls. And there are just too many articles that blindly talk about how great those technologies are - reducing the number of lines of code, improving performance, scalability - which in any concrete case may not be true. I would like to hear more of a balanced view, with pros and cons. There are always many ways to achieve the same goal.

Robin Bygrave replied on Wed, 2008/04/30 - 11:01pm in response to: Mike Keith

Mike,

In regard to the feedback email address... I'd question how well that is working, because I have already used it to give my feedback, but perhaps it's not getting through?

anyways... back to reply to your feedback from ages ago about partial objects and large queries... 

In regards to the large query stuff... check Ibatis RowHandler or Ebean QueryListener... these are used in batch-type processing where you want to process a lot of object graphs without loading the whole lot into the persistence context at once. IMO JPA needs this for use in batch processing tasks.

In regards the partial objects stuff...

The way I see it, your a) means one man's partial object is another man's lazy load... hence the need for a dynamic (query language) approach.

Does b) mean return Object[] with scalar types of entity objects? If so then this means the code change from a full object to a partial object is a big code change (not nice at all).

With c) it should be noted that you get other benefits from partial objects. For instance, partial objects with very few properties can be very performant, as they can be satisfied by reading just DB indexes without requiring the DB to read data blocks (hence very fast). The ORM design also improves... wherever you would use a secondary table property you can use a partial object, with several benefits.

The other thing you should check out is "AutoFetch". It would be great if JPA took that up. In short, AutoFetch will automatically tune your query (fetch joins, and used properties for partial objects if supported) for more optimal performance. Looking down the track, I'd say this is a very significant idea for ORM.

 Cheers, Rob. 

Jaikiran Pai replied on Sat, 2008/05/03 - 12:14am

Mike,

You mention that there are 3 steps to make an attribute be accessed through its property instead of its field (using the @Access annotation). The 3rd step being:

3. Finally, we mark the corresponding field as @Transient so that the default access mode does not try to map it in addition to the property, and thereby map the same state twice.

Don't you think that marking a field as @Transient is redundant? When there is an @Access(PROPERTY) being used on an attribute to override the class-level @Access(FIELD), shouldn't the persistence provider be smart enough to not map it twice? The @Transient annotation, in this scenario, looks redundant to me.

 

Mike Keith replied on Mon, 2008/05/05 - 5:32pm in response to: Robin Bygrave

[quote=rbygrave]In regard to the feedback email address... I'd question how well that is working, because I have already used it to give my feedback, but perhaps it's not getting through?

[/quote]

Really? Hmm. Try jsr-317-edr-feedback [atSunDotCom]

[quote]In regards to the large query stuff... check Ibatis RowHandler or Ebean QueryListener... these are used in batch-type processing where you want to process a lot of object graphs without loading the whole lot into the persistence context at once. IMO JPA needs this for use in batch processing tasks.

[/quote]

Yeah, but Ibatis doesn't have to provide the same level of relationship support / persistence context. In any case, I guess it's doable, it just doesn't seem to be a great fit with the existing JPA model. Send an email to the group and we can take a look at what it might look like, though.

[quote]Does b) mean return Object[] with scalar types of entity objects? If so then this means the code change from a full object to a partial object is a big code change (not nice at all).

[/quote]

No, it means that I can do a SELECT new MyClass(e.state1, e.state2, ...) FROM MyEntity e WHERE ...

I have not actually tried passing in an entity class as a non-persistent class, but it might work, too.
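A minimal sketch of that constructor-expression style, where ProjectSummary and its package are hypothetical (any non-entity class with a matching constructor works):

public class ProjectSummary {
    private final String name;
    private final int deptId;

    public ProjectSummary(String name, int deptId) {
        this.name = name;
        this.deptId = deptId;
    }
}

// The target class is referenced by its fully qualified name in the query.
List<?> summaries = em.createQuery(
        "SELECT NEW com.example.ProjectSummary(p.name, p.dept.id) " +
        "FROM Project p")
    .getResultList();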

[quote]The other thing you should check out is "AutoFetch". It would be great if JPA took that up. In short, AutoFetch will automatically tune your query (fetch joins, and used properties for partial objects if supported) for more optimal performance. Looking down the track, I'd say this is a very significant idea for ORM.

[/quote]

I would put this in the category of something that you might expect as a vendor extension, but is probably not ready for standardization, yet.

-Mike
Pro EJB 3: Java Persistence API

Mike Keith replied on Mon, 2008/05/05 - 5:37pm in response to: Jaikiran Pai

[quote=jaikiran]Don't you think that marking a field as @Transient is redundant? When there is an @Access(PROPERTY) being used on an attribute to override the class-level @Access(FIELD), shouldn't the persistence provider be smart enough to not map it twice? The @Transient annotation, in this scenario, looks redundant to me.

[/quote]

The problem is that there is no way for the provider to know which field is being overridden. How could it know that  the pNum field is the one that is being remapped by the getPhoneNumberForDb() property method?
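For readers following the thread, the mapping being discussed might look something like this sketch; pNum and getPhoneNumberForDb() come from the example under discussion, while the rest of the class and the column name are made up, and the annotations follow the @Access approach the article describes (field access as the class default, @Access(PROPERTY) on the property, and @Transient on the field):

@Entity
public class Employee {
    @Id int id;   // placing @Id on a field makes FIELD the default access type

    // Marked @Transient so the default field access does not map this state a second time.
    @Transient
    String pNum;

    // This one attribute is switched to property access and mapped through the getter.
    @Access(AccessType.PROPERTY)
    @Column(name = "PHONE_NUM")
    public String getPhoneNumberForDb() {
        return pNum;  // could transform the value on its way to the database
    }

    public void setPhoneNumberForDb(String num) {
        this.pNum = num;
    }
}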

Robin Bygrave replied on Tue, 2008/05/06 - 12:37am in response to: Mike Keith

[quote=mkeith]

Really? Hmm. Try jsr-317-edr-feedback [atSunDotCom]

[/quote]

Yeah, I can try again... but I note the problems we are having with this discussion, which leads me to believe that one-way email feedback on complex ideas is difficult. For example, I have already sent an email on the large query issue; is it possible that no one understood?

[quote]

Yeah, but Ibatis doesn't have to provide the same level of relationship support / persistence context.

[/quote]

Yes, but Ebean DOES. This then becomes a "per object graph persistence context" and yes, it handles one-to-many relationships in the query. It's more than doable because I have already done it in Ebean - aka the Ebean QueryListener. I can also say from personal experience that it is a very useful feature for batch programming requirements. You may not want it for JPA and that's fine by me.

[quote]

No, it means that I can do a SELECT new MyClass(e.state1, e.state2, ...) FROM MyEntity e WHERE ...

[/quote]

Yeah, interesting. Of course you want MyClass to be an entity (for lazy loading and persisting), you do not want to use a constructor, and you want to be able to have a partial object on ANY object in the object graph (aka joined objects). Just to be more direct... Ebean ORM has much better partial object support IMO, and in implementing the partial object support that is in Ebean I found real concerns with JPQL (around the select clause) - I hope for JPA's sake I'm wrong on that. Of course, if the JPA EG is not that fussed about partial objects... it doesn't really matter.

[quote]

"AutoFetch"...

[/quote]

Fair enough, great that you have taken time to look at it.

Thanks, Rob.

Jaikiran Pai replied on Sat, 2008/05/24 - 2:08am in response to: Mike Keith

[quote=mkeith][quote=jaikiran]Don't you think that marking a field as @Transient is redundant? When there is an @Access(PROPERTY) being used on an attribute to override the class-level @Access(FIELD), shouldn't the persistence provider be smart enough to not map it twice? The @Transient annotation, in this scenario, looks redundant to me.

[/quote]

The problem is that there is no way for the provider to know which field is being overridden. How could it know that the pNum field is the one that is being remapped by the getPhoneNumberForDb() property method?

[/quote]

 

 

You are right :)

I missed the fact that a getPhoneNumber() property method need not actually be referring to a field named phoneNumber.

Colbert Philippe replied on Thu, 2010/07/01 - 10:48am

Error or possible shortcoming in the JPA 2 specification...

Hi Mr. Mike Keith! My name is Colbert Philippe. I bought your book Pro JPA 2 and learned that you participated in the JPA 2 specification. I am currently using JPA 2 instead of Hibernate because many Cloud systems now support JPA 2.

I ran into a situation where I need to embed 3 levels of classes inside one another. The JPA 2 specification can support one level of embedding but not more than that. I think that is a shortcoming of the specification, since Hibernate supports that feature. If you know a way around this, please let me know, but I haven't found any. I will let you know if I find other shortcomings in JPA 2.

Colbert Philippe

Carla Brian replied on Wed, 2012/06/13 - 12:11pm

Changes throughout the platform make the development of enterprise Java technology applications much easier, with far less coding. - Joe Aldeguer
