Creator of the Apache Tapestry web application framework and the Apache HiveMind dependency injection container. Howard has been an active member of the Java community since 1997. He specializes in all things Tapestry, including on-site Tapestry training and mentoring, but has lately been spreading out into fun new areas including functional programming (with Clojure), and NodeJS. Howard is a DZone MVB and is not an employee of DZone and has posted 81 posts at DZone. You can read more from them at their website. View Full User Profile

Who Wants The Func? Gotta Have That Func!

06.09.2010
| 3537 views |
  • submit to reddit

I've been entranced by the concept of laziness since I first really considered it while teaching myself a bit of Haskell. Laziness is the idea that no computation takes place until it is actually needed ... an idea that is common in the functional programming world and one that works best with immutable data.

Why immutable? This has been covered extensively elsewhere, but the gist is that when you have any kind of mutable data (any field that can ever change its value), you add time as an invisible input to your expressions. Literally, the time that any single expression is evaluated relative to other changes to mutable state will affect the outcome of the expression, often in non-predictable ways. In the mathematical world, a function will always return the same value for the same inputs ... in the fuzzy, dirty world of Object Orientation, a method invocation may return different values at different times based on mutable state. Not necessarily mutable state in the object being invoked, but in some other object, somewhere, that the invoked object depends on.

This is why parallel programming in the OO world seems so hard. It requires locks on mutable data, which brings its own problems, such as deadlocks. It can feel like a tottering house of cards.

But remove mutability from this picture and an entirely different world emerges. Functions do behave as functions; same inputs: same result. Side effects disappear, because there's no mutable state. Evaluation of expressions is no longer linked to time: it can be evaluated in parallel threads, or can be deferred until absolutely needed.

That last bit is the laziness. Laziness is a way to bootstrap your code up to a simpler, clearer expression of your algorithms ... once you embrace laziness, you can see that a good amount of the code you write (using mutable data especially) is a case of premature optimization.

Back to Tapestry; as far back as Tapestry 4.0 (where HiveMind and the use of Inversion of Control and Dependency Injection where introduced), Tapestry's internal code has had many functional characteristics. The base unit of work in the Tapestry IoC container is an interface, not a function ... but often those interfaces have a single method. That makes the interfaces look a lot like a function, ready for the kind of composition possible in a functional programming language. Sure, it's a bit clumsy compared to a real functional programming language ... but the power is still there.

Tapestry 5 uses these features to handle a lot of Aspect Oriented Programming concepts; for instance, services are lazily instantiated, and they can be decorated and advised to provide cross-cutting concerns. In fact, Tapestry uses functional composition extensively for all kinds of meta-programming.

Meanwhile outside the realm of Tapestry, my exposure to Clojure has really sold me on the functional approach, and I take to immutable data structures like a warm, comforting blanket. I miss all that when I'm working with ordinary Lists and Sets from Collections API.

Given that Tapestry does a lot of complex things, I started work on a simple functional library. What I've created is not nearly as complex as Functional Java; I think it does less, but does it more cleanly. It's more focused.

The idea is that you'll create a Flow from some source (usually, a Collection). You can then map, filter, reduce, append, concatenate, and iterate the values inside the Flow. Further, the Flows are lazy (as with Haskell and Clojure); all evaluation is deferred until absolutely necessary, and it's thread safe. You can also have infinite flows.

It all starts in a static class F (for Functional) that has the initial factories for Flows. This example uses the F.range() method to create a Flow for a range of integers:

System.out.println(F.range(1, 100).filter(F.gt(10)).map(
  new Mapper() {
   public Integer map(Integer value) {
    return value * 2;
   }
  }).take(3).toList());

When executed, this code prints the following: [22, 24, 26]. That is, it dropped the values less than or equal to 10; it then multiplied each remaining value by 2 and converted the first three to a list.

  • F.range() creates a lazy Flow of integers in the range (from 1 to 99; the upper range is non-inclusive).
  • filter() is a Flow method that keeps only some values, based on a Predicate
  • F.gt() is a static factory method, it creates a Predicate that compares a Number value from the Flow against the provided value
  • map() is a Flow method that is applied to each value in the Flow
  • take() takes a limited number of values from the front of the Flow
  • toList() converts the Flow into a non-modifiable List

Here we mapped from Integer to Integer, but it would have been possible to map to a different type. At each stage, a new (immutable) Flow object is created.

What about laziness? Well if we modify the code a bit:

System.out.println(F.range(1, 100).filter(F.gt(10)).map(
  new Mapper() {
   public Integer map(Integer value) {
    System.out.println("** mapping " + value);
    return value * 2;
   }
  }).take(3).toList());

The new output is:
** mapping 11
** mapping 12
** mapping 13
[22, 24, 26]

... in other words, although we write code that makes it appear that the entire Flow is transformed by the map() call, the reality is that individual values from the original flow are mapped, just once, as needed. The code we write focuses on the flow of transformations, from the input, to the final result: "start with the range, retain the values greather than 10, multiply each by two, keep just the first three".

Does this make a difference? Not with trivial cases like this example. The functional code could be rewritten in standard Java as:

List<Integer> result = new ArrayList<Integer>();
for (int i = 1; i < 100; i++)
{
  if (i >= 10)
  {
    result.add(i * 2);
  }
}

result = result.subList(0, 3);

System.out.println(result);

Yes, this code is shorter, but it does more work (computing many doubles values that are not needed). We could do some extra work to keep a count of result values and end the loop earlier, but that increases the cyclomatic complexity even further. The extra work is not a big deal here, but if the transformations were more expensive (say, re-drawing images in different sizes, or reading data from a database) the work not done unnecessarily would become quite significant.

And is the traditional Java code really shorter? What if we create a reusable factory function:

public static Mapper<Integer,Integer> multiplyBy(final int multiplicand)
{
  return new Mapper<Integer, Integer>() 
  {
    public Integer map(Integer value)
    {
      return value * multiplicand;
    }
  };
}

Then our original Flow expressions becomes:

System.out.println(F.range(1, 100).filter(F.gt(10)).map(multiplyBy(2)).take(3).toList());

... meaning that, once we have a good collection of these Mapper and Predicate factory methods, we can have the efficient, lazy code and it will be more concise and readable.

Anyway, tapestry-func is a work in progress, but it's very promising, and already being used in both Tapestry 5.2, and in some of my clients' code.

 

From http://tapestryjava.blogspot.com/2010/06/who-wants-func-gotta-have-that-func.html

Published at DZone with permission of Howard Lewis Ship, author and DZone MVB.

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)

Tags:

Comments

Howard Lewis Ship replied on Wed, 2010/06/09 - 9:54am

Just to head off discussions that have also been occurring on my blog ... yes, I'm aware of several other "functional Java" libraries. The point of this code was to have something tailored to Tapestry's needs, and to eliminate external dependencies. Still, I'm quite proud of how the laziness has been implemented.

Michael Eric replied on Wed, 2012/09/26 - 3:45pm

what you have is precisely what Google Collections does (and i guess op4j). Yet you want to ignore it....

you have to agree that it's a little bit un-lazy. but as long as you're happy to keep enhancing tapestry, we can't complain too much.

debian

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.