
I have a passion for talking to people, identifying problems, and writing software. When I'm doing my job correctly, software is easy to use, easy to understand, and easy to write... in that order. Michael is a DZone MVB and is not an employee of DZone and has posted 52 posts at DZone. You can read more from them at their website.

Avoid Hibernate Anemia and Reduce Code Bloat

08.14.2014

One of my beefs with Hibernate as an ORM is that it encourages anemic domain models that have no operations and are simply data structures. This, coupled with Java's verbosity, tends to make code unmaintainable (when used by third-party systems) and causes developers to focus on THINGS instead of ACTIONS. For example, take the following class, which illustrates part of a flight booking at an airline:

public class Flight {
    public Date start;
    public Date finish;
    public long getDuration() {
        return finish.getTime() - start.getTime();
    }
}

This is the core "business" requirement for a use case in this model in terse Java. From an OO perspective, start and finish are attributes, and getDuration is an operation (one that we happen to believe is mathematically derived from the first two fields). Of course, due to training and years of "best practices" brainwashing, most folks will immediately and mindlessly follow the JavaBean convention, making all the member variables private and "just generating" the getters and setters. That makes the same functional unit above look like the following:

public class Flight {
    private Date start;

    public Date getStart() {
        return start;
    }

    public void setStart(Date start) {
        this.start = start;
    }

    private Date finish;

    public Date getFinish() {
        return finish;
    }

    public void setFinish(Date finish) {
        this.finish = finish;
    }
    public long getDuration() {
        return finish.getTime() - start.getTime();
    }
}

Wait, we're not done yet. If we want duration to be persisted, we'll move the logic to another class and add a getter and setter for the stored value:

public class Flight {
    private Date start;

    public Date getStart() {
        return start;
    }

    public void setStart(Date start) {
        this.start = start;
    }

    public Date getFinish() {
        return finish;
    }

    public void setFinish(Date finish) {
        this.finish = finish;
    }

    private Date finish;

    private long duration;

    public long getDuration() {
        return this.duration;
    }

    public void setDuration(long input) {
        this.duration = input;
    }
}

public class FlightHelper {
    public static long getDuration(long finish, long start) {
        return finish - start;
    }

}

This "Helper" or "Business Delegate" pattern is yet another area where things go wonky very quickly. Usually, to keep things "pure", folks will put all logic in the helper (or delegate, I'm not sure if there's a difference) and the model will have no logic. This makes it very difficult to track down where the logic actually lives when troubleshooting. In addition, having a field that is both computed and stored is fraught with potential for errors, because nothing forces the stored value to stay in sync with the fields it was computed from. Java folks will typically make the case that this class is really a Data Transfer Object (DTO)... OK, fine, but that's like saying an elephant is actually an herbivore...
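To make the stale-value risk concrete, here is a minimal sketch. The names StoredDurationFlight, DurationHelper, and StaleDurationDemo are hypothetical stand-ins for the bean and helper above, and the timestamps are made up for illustration.

```java
import java.util.Date;

// Hypothetical stand-in for the bean above: duration is stored, not derived.
class StoredDurationFlight {
    private Date start;
    private Date finish;
    private long duration;

    Date getStart() { return start; }
    void setStart(Date start) { this.start = start; }
    Date getFinish() { return finish; }
    void setFinish(Date finish) { this.finish = finish; }
    long getDuration() { return duration; }
    void setDuration(long duration) { this.duration = duration; }
}

// Same logic as the article's helper, renamed here to avoid a clash.
class DurationHelper {
    static long getDuration(long finish, long start) {
        return finish - start;
    }
}

class StaleDurationDemo {
    public static void main(String[] args) {
        StoredDurationFlight flight = new StoredDurationFlight();
        flight.setStart(new Date(0L));
        flight.setFinish(new Date(3_600_000L)); // one hour after start
        flight.setDuration(DurationHelper.getDuration(
                flight.getFinish().getTime(), flight.getStart().getTime()));

        // The flight is rescheduled, but the caller forgets to call the helper again.
        flight.setFinish(new Date(7_200_000L));

        long actual = flight.getFinish().getTime() - flight.getStart().getTime();
        System.out.println("stored=" + flight.getDuration() + " actual=" + actual);
        // prints "stored=3600000 actual=7200000"
    }
}
```

Nothing in the bean stops the second setFinish call from leaving the stored duration out of date, which is the kind of error the derived getDuration in the first listing cannot produce.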

But wait...it gets worse...

What I often see happen in Java circles is a death spiral of bloat in the interest of "best practices". A typical next step is that folks realize that serializing Hibernate objects to remote servers or tiers that don't have access to Hibernate becomes a huge challenge, due to Hibernate's technique of replacing the real object with a dynamic proxy. To get around this, developers invariably create another layer of DTOs or "Value Objects" as well as a mapping layer to map between these two domains.
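As a rough sketch of what that extra layer tends to look like, assume a persistent FlightEntity, a serializable FlightDto, and a hand-written FlightMapper. All three names are hypothetical, and no Hibernate APIs are used so the snippet stays self-contained.

```java
import java.io.Serializable;
import java.util.Date;

// Hypothetical persistent class; in a real system Hibernate would manage
// (and possibly proxy) instances of it.
class FlightEntity {
    private Date start;
    private Date finish;

    Date getStart() { return start; }
    void setStart(Date start) { this.start = start; }
    Date getFinish() { return finish; }
    void setFinish(Date finish) { this.finish = finish; }
}

// Hypothetical DTO: a plain serializable copy intended for other tiers.
class FlightDto implements Serializable {
    private static final long serialVersionUID = 1L;
    private Date start;
    private Date finish;

    Date getStart() { return start; }
    void setStart(Date start) { this.start = start; }
    Date getFinish() { return finish; }
    void setFinish(Date finish) { this.finish = finish; }
}

// Hypothetical mapping layer: every field is copied by hand, in both directions.
class FlightMapper {
    static FlightDto toDto(FlightEntity entity) {
        FlightDto dto = new FlightDto();
        dto.setStart(new Date(entity.getStart().getTime()));
        dto.setFinish(new Date(entity.getFinish().getTime()));
        return dto;
    }

    static FlightEntity toEntity(FlightDto dto) {
        FlightEntity entity = new FlightEntity();
        entity.setStart(new Date(dto.getStart().getTime()));
        entity.setFinish(new Date(dto.getFinish().getTime()));
        return entity;
    }
}
```

Two fields of real information have turned into three classes, and each new field has to be added in four places.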

In conversation with most Java developers about "why are we doing it this way?", I get blank stares. The best answer I've heard is "because that's the way we do it", or often a link to a web site explaining how to do it and why, which ultimately is just a clever way of saying "I don't know". Crafty individuals will then start talking about Java patterns and all sorts of other artificial explanations that never explain "why", but simply re-endorse "how".

A way to mitigate this problem is to start decomposing application components functionally and realize that data persistence is in fact a first-order operation in most systems. This means that persisting data should be an atomic, single-step operation (hint: if you need a transaction manager, the call is NOT atomic). Additionally, putting these operations behind web services means that persisting data becomes an internal responsibility and not something a caller needs to know or care about.

Put another way, hide our persistence layer behind an API and don't create superfluous classes that need to be shared with third parties. So, in the example above, you could do something like:

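import java.util.Date;

public class FlightService {
    public Date getStart(long id) {
        // ...implementation here...
        throw new UnsupportedOperationException();
    }

    // Creates a flight and returns its identifier
    public long createFlight(Date start, Date finish) {
        // ...implementation here...
        throw new UnsupportedOperationException();
    }

    // Updates both dates in one call and returns the resulting duration
    public long setStartAndFinish(long id, Date start, Date finish) {
        // ...implementation here...
        throw new UnsupportedOperationException();
    }

    public Date getFinish(long id) {
        // ...implementation here...
        throw new UnsupportedOperationException();
    }

    public long getDuration(long id) {
        // ...implementation here...
        throw new UnsupportedOperationException();
    }
}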

This preserves the idiomatic Java, plus enables us to completely hide the implementation details from the caller. Yes, it introduces a transaction and granularity problem that we immediately need to solve... and should force us (unless we really want to do it the hard way) to start thinking about the API contract for atomic operations. I think this is the important distinction and shouldn't be forgotten. Worry about what your design is supposed to DO first because, at the end of the day, the OPERATION is more important than the MODEL.
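One way the FlightService outline above could be filled in is sketched below. It assumes an in-memory map in place of a real persistence layer. InMemoryFlightService and its private Times holder are hypothetical names, and the choice to reject unknown ids with an IllegalArgumentException is an assumption, not something the outline specifies.

```java
import java.util.Date;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical in-memory implementation; a real one would hide a database here.
class InMemoryFlightService {
    // Immutable pair, so a flight's start and finish are always swapped together.
    private static final class Times {
        final long startMillis;
        final long finishMillis;

        Times(Date start, Date finish) {
            this.startMillis = start.getTime();
            this.finishMillis = finish.getTime();
        }
    }

    private final Map<Long, Times> flights = new ConcurrentHashMap<>();
    private final AtomicLong nextId = new AtomicLong(1);

    // Creates a flight and returns its identifier.
    public long createFlight(Date start, Date finish) {
        long id = nextId.getAndIncrement();
        flights.put(id, new Times(start, finish));
        return id;
    }

    // Replaces both dates in a single map operation and returns the new duration.
    public long setStartAndFinish(long id, Date start, Date finish) {
        Times updated = new Times(start, finish);
        if (flights.replace(id, updated) == null) {
            throw new IllegalArgumentException("Unknown flight id: " + id);
        }
        return updated.finishMillis - updated.startMillis;
    }

    public Date getStart(long id) {
        return new Date(find(id).startMillis);
    }

    public Date getFinish(long id) {
        return new Date(find(id).finishMillis);
    }

    // Duration is always derived, so it cannot drift from start and finish.
    public long getDuration(long id) {
        Times times = find(id);
        return times.finishMillis - times.startMillis;
    }

    private Times find(long id) {
        Times times = flights.get(id);
        if (times == null) {
            throw new IllegalArgumentException("Unknown flight id: " + id);
        }
        return times;
    }
}
```

Callers only ever see identifiers, dates, and durations. Because setStartAndFinish swaps one immutable value, no caller can observe a flight with a new start and an old finish, which is the sort of contract question the granularity problem forces us to answer up front.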

Published at DZone with permission of Michael Mainguy, author and DZone MVB. (source)

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)

Comments

Scott Franson replied on Thu, 2014/08/14 - 5:47pm

What the code DOES is critical for the application. After all, an application that does nothing or does stuff incorrectly has little (or no) worth. However, the MODEL is paramount to understanding and communicating what the business does. It is the shared knowledge and understanding that crosses boundaries (code and team). As far as the justification for different service-level (business) models and persistence-level (data) models: there is naturally considerable overlap, but a business model may not be able to be persisted in an efficient manner. A separate data model offers the flexibility to intelligently optimize the persistence of that data without affecting the business model. As everything has a consequence, DTO / mapping code may be the price one pays for such isolation. 

Yannick Majoros replied on Sat, 2014/08/16 - 3:57am

Well, as much as I agree on some points, there doesn't seem to be any academic consensus on the fact that pure data structures are evil and that business operations should be in the same class as data. There are some people saying so, and others thinking otherwise. 

The term "anemic" is used here and there, but there is no scientific evidence that this should be avoided. What's called "anemic" here could be just seen as low coupling. You could have other structures (even if not 100% as you describe) that would maintain high cohesion, too.

I'd add that having all business logic in the same classes as pure data could have some negative impact too:

  • it's easy to break the single responsibility principle; should a cookie know how to bake itself?
  • instead of favoring high cohesion, it often ends up putting business logic that really uses multiple concepts, randomly in one of the involved classes

What's described here as a solution to avoid difficult-to-maintain code rapidly becomes even more difficult in my eyes. Now, this is a subject that's much more difficult than it seems, and I'm still looking for convincing answers. That's the reason I'm posting here, as I did on occasion ( http://stackoverflow.com/questions/19520614/anemic-data-model-daos-authoritative-reference ). Are there other experiences, interesting facts?

Robin Varghese replied on Wed, 2014/08/27 - 12:42pm in response to: Yannick Majoros

Just looking at the above code, I feel that dealing with primitives or language constructs at the API level, rather than data structures, exposes your code to API changes. For instance, you might need to overload the createFlight method several times over to support new scenarios. The use of a VO gives you a way to represent the Flight properties as a single component.

Also, if you are using the full power of Hibernate, you tend to build complex object graphs that map very closely to your database. The VO gives us an opportunity to focus on or abstract the representation, preserving the properties we need and ignoring the complexity.

I do agree, and have seen the complexity that comes from a ton of classes, but if you look at the advantages (separation of layers, data layer isolation, ability to define granularity of higher layers), I think it is worth it.
