DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Please enter at least three characters to search
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

The software you build is only as secure as the code that powers it. Learn how malicious code creeps into your software supply chain.

Apache Cassandra combines the benefits of major NoSQL databases to support data management needs not covered by traditional RDBMS vendors.

Generative AI has transformed nearly every industry. How can you leverage GenAI to improve your productivity and efficiency?

Modernize your data layer. Learn how to design cloud-native database architectures to meet the evolving demands of AI and GenAI workloads.

Related

  • Top 10 C# Keywords and Features
  • Enhancing Code Clarity With Python Namedtuples
  • In-Depth Guide to Using useMemo() Hook in React

Trending

  • Securing the Future: Best Practices for Privacy and Data Governance in LLMOps
  • Driving DevOps With Smart, Scalable Testing
  • Memory Leak Due to Time-Taking finalize() Method
  • System Coexistence: Bridging Legacy and Modern Architecture
  1. DZone
  2. Software Design and Architecture
  3. Performance
  4. Do it in Java 8: Automatic memoization

Do it in Java 8: Automatic memoization

By 
Pierre-Yves Saumont user avatar
Pierre-Yves Saumont
·
Sep. 30, 14 · Tutorial
Likes (19)
Comment
Save
Tweet
Share
68.2K Views

Join the DZone community and get the full member experience.

Join For Free

Memoization is a technique used to speed up functions. Memoization may be done manually. It may also be done automatically. We can find many examples of automatic memoization on Internet. In this article, I will show how Java 8 makes it very easy to memoize functions.

What is memoization

Memoization consist in caching the results of functions in order to speed them up when they are called several times with the same argument. The first call implies computing and storing the result in memory before returning it. Subsequent calls with the same parameter imply only fetching the previously stored value and returning it.

How to apply memoization

Memoization may be applied manually by hard coding it in every function that may benefit from it. If it takes a long time to compute the return value, memoization will speed up the program. For functions that take less time to evaluate than fetching the previously stored value from memory, memoization is clearly not a good option.

Hard coding memoization by hand in each function is not a good option neither because it is repeating the same principle again and again. That is why automatic memoization is desirable.

What to memoize

Memoization applies to functions. Prior to Java 8, Java had no functions. However, it was perfectly possible to define some. Furthermore, we used to create “functional” methods, that is methods taking an argument and returning a value based only upon this argument. These kind of method may benefit from memoization. By the way, there is a match between functional methods and functions. For example, the following method:


Integer doubleValue(Integer x) {
  return x * 2;
}

corresponds to:


Integer doubleValue(Integer x) {
  if (cache.containsKey(x)) {
    return cache.get(x);
  } else {
    Integer result = x * 2;
    cache.put(x, result) ;
    return result;
  }
}

In Java 8, we can make this much cleaner:


Map<Integer, Integer> cache = new ConcurrentHashMap<>();

Integer doubleValue(Integer x) {
  return cache.computeIfAbsent(x, y -> y * 2);
}

Our function may be modified to use the same technique:

Function<Integer, Integer> doubleValue = x -> cache.computeIfAbsent(x, y -> y * 2);

This is pretty simple, but it has two main drawbacks:

  • We have to repeat this modification for all functions.
  • The map we use is exposed and could potentially be modified by another thread having nothing to do with the function.

The second problem is quite easy to address. We may put the method or the function in a separate class, including the map, with private access. For example, for the method case:


public class Doubler {
  private static Map<Integer, Integer> cache = new ConcurrentHashMap<>();

  public static Integer doubleValue(Integer x) {
    return cache.computeIfAbsent(x, y -> y * 2);
  }
}

We may then instantiate that class and use it each time we want to compute a value:


Integer y = Doubler.doubleValue(x);

With this solution, the map is no longer accessible from outside.

We can't do the same for functions because functions are anonymous classes and such classes may not have static members. One possibility would be to pass the map to the function as an additional argument. This may be done through a closure:

class Doubler {
  private static Map<Integer, Integer> cache = new ConcurrentHashMap<>();

  public static Function<Integer, Integer> doubleValue = x->cache.computeIfAbsent(x, y -> y * 2);
}

We can use this function as follows:

Integer y = Doubler.doubleValue.apply(x);

This gives no advantage compared to the “method” solution. However, we may also use this function in more idiomatic examples, such as:


IntStream.range(1, 10).boxed().map(Doubler.doubleValue);

This is equivalent to using the method version with the following syntax:


IntStream.range(1, 10).boxed().map(ThisClass::doubleValue);

The main problem is that while solving the second issue, we have made the first one more acute, which make automatic memoization more desirable.

Automatic memoization: requirements

What we need is a way to do the following:

Function<Integer, Integer> f = x -> x * 2;
Function<Integer, Integer> g = Memoizer.memoize(f);

so that we may use the memoized function as a drop in replacement for the original one. All values returned by function g will be calculated through the original function f the first time, and returned from the cache for all subsequent accesses. By contrast, if we create a third function:

Function<Integer, Integer> f = x -> x * 2;
Function<Integer, Integer> g = Memoizer.memoize(f);
Function<Integer, Integer> h = Memoizer.memoize(f);

the values cached by g will not be returned by h. In other words, g and h will use separate caches.

Implementation

The Memoizer class is quite simple:


public class Memoizer<T, U> {

  private final Map<T, U> cache = new ConcurrentHashMap<>();

  private Memoizer() {}

  private Function<T, U> doMemoize(final Function<T, U> function) {
    return input -> cache.computeIfAbsent(input, function::apply);
  }

  public static <T, U> Function<T, U> memoize(final Function<T, U> function) {
    return new Memoizer<T, U>().doMemoize(function);
  }
}

Using this class is also extremely simple:


Integer longCalculation(Integer x) {
  try {
    Thread.sleep(1_000);
  } catch (InterruptedException ignored) {
  }
  return x * 2;
}
Function<Integer, Integer> f = this::longCalculation;
Function<Integer, Integer> g = Memoizer.memoize(f);

public void automaticMemoizationExample() {
  long startTime = System.currentTimeMillis();
  Integer result1 = g.apply(1);
  long time1 = System.currentTimeMillis() - startTime;
  startTime = System.currentTimeMillis();
  Integer result2 = g.apply(1);
  long time2 = System.currentTimeMillis() - startTime;
  System.out.println(result1);
  System.out.println(result2);
  System.out.println(time1);
  System.out.println(time2);
}

Running the automaticMemoizationExample method will produce the following result:

2
2
1000
0

We can now make memoized function out of ordinary ones by just calling a single method!

What about functions with several arguments?

Short answer: nothing. There are no such things in this world as functions with several arguments. Functions are applications of one set (the source set) to another set (the target set). So, they simply can't have several arguments.

But this does not solve our problem. What is the functional equivalent to a method with several arguments?

Long answer: what people generally consider as functions with several arguments are in fact either:

  • Functions of tuples
  • Function returning functions returning functions … returning a result

In either cases, we are only concerned with functions of one argument, so we can easily use our Memoizer class.

Using functions of tuples would probably be the simplest choice... if Java had tuples! We could of course write tuples. But to store tuples in maps, we would have to implement equals and hashcode for them, plus we would have to define tuples for two elements (pairs), tuple for three elements, and so on. Who knows where to stop?

The second option is much easier. It is based upon currying, which means applying each argument one after the other instead of applying them as a whole (the tuple).

Currying a function is very easy. The only problem, in Java 8, is that writing the types is really cumbersome.

Currying a “function of two arguments” (in fact a function of a pair) is easy once you master the type. Java has in fact a shortcut for functions of tuple2 which is called BiFunction. We will take this as an example. The two following functions are equivalent (from the result point of view):


BiFunction<Integer, Integer, Integer> h = (x, y) -> x + y;
Function<Integer, Function<Integer, Integer>> hc = x -> y -> x + y;

Not considering the types, there are very little differences. In the first case, the two arguments are put between parentheses, separated by a comma, which is, by the way, how tuples are written in most languages which have them!

Remove the parentheses and separate the arguments with an arrow and you get the curried version.

We can only regret that we have to write the type as:

Function<Integer, Function<Integer, Integer>>

when other languages use a simplified syntax such as:

Integer -> Integer -> Integer

From this, it is easy to memoized this curried version, although we cant use the same simple form as previously. We have to memoize each function:


Function<Integer, Function<Integer, Integer>> mhc = 
        Memoizer.memoize(x -> Memoizer.memoize(y -> x + y));

Same thing for a function of (a tuple of) 3 arguments (which by the way has no equivalent in Java):


Function<Integer, Function<Integer, Function<Integer, Integer>>> f3 =
         x -> y -> z -> x + y - z;
Function<Integer, Function<Integer, Function<Integer, Integer>>> f3m = 
        Memoizer.memoize(x -> Memoizer.memoize(y -> Memoizer.memoize(z -> x + y – z));

Here is an example of using this memoized function “of three arguments”:


Function<Integer, Function<Integer, Function<Integer, Integer>>> f3 =
      x -> y -> z -> longCalculation(x) + longCalculation(y) - longCalculation(z);
  Function<Integer, Function<Integer, Function<Integer, Integer>>> f3m =
      Memoizer.memoize(x -> Memoizer.memoize(y -> Memoizer.memoize(z ->
          longCalculation(x) + longCalculation(y) - longCalculation(z))));

  public void automaticMemoizationExample2() {
    long startTime = System.currentTimeMillis();
    Integer result1 = f3m.apply(2).apply(3).apply(4);
    long time1 = System.currentTimeMillis() - startTime;
    startTime = System.currentTimeMillis();
    Integer result2 = f3m.apply(2).apply(3).apply(4);
    long time2 = System.currentTimeMillis() - startTime;
    System.out.println(result1);
    System.out.println(result2);
    System.out.println(time1);
    System.out.println(time2);
  }

This example produces the following output:

2
2
3002
0

showing that the first access to method longCalculation has taken 3000 milliseconds and the second has return immediately.

On the other hand, using a function of tuple may seem easier once you have the Tuple class defined. Here is an example of Tuple3:

public class Tuple3<T, U, V> {

  public final T _1;
  public final U _2;
  public final V _3;

  public Tuple3(T t, U u, V v) {
    _1 = Objects.requireNonNull(t);
    _2 = Objects.requireNonNull(u);
    _3 = Objects.requireNonNull(v);
  }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof Tuple3)) return false;
    else {
      Tuple3 that = (Tuple3) o;
      return _1.equals(that._1) && _2.equals(that._2) && _3.equals(that._3);
    }
  }

  @Override
  public int hashCode() {
    return _1.hashCode() + _2.hashCode() + _3.hashCode();
  }
}

Using this class, we may rewrite the previous example as:

Function<Tuple3<Integer, Integer, Integer>, Integer> ft = x -> longCalculation(x._1) + longCalculation(x._2) - longCalculation(x._3);
Function<Tuple3<Integer, Integer, Integer>, Integer> ftm = Memoizer.memoize(ft);

public void automaticMemoizationExample3() {
  long startTime = System.currentTimeMillis();
  Integer result1 = ftm.apply(new Tuple3<>(2, 3, 4));
  long time1 = System.currentTimeMillis() - startTime;
  startTime = System.currentTimeMillis();
  Integer result2 = ftm.apply(new Tuple3<>(2, 3, 4));
  long time2 = System.currentTimeMillis() - startTime;
  System.out.println(result1);
  System.out.println(result2);
  System.out.println(time1);
  System.out.println(time2);
}

Conclusion

Memoizing is about maintaining state between function calls. A memoized function is a function which behavior is dependent upon the current state. However, it will always return the same value for the same argument. Only the time needed to return the value will be different. So the memoized function is still a pure function if the original function is pure.


However, there is a kind of function that may pose a problem: recursive functions that call themselves several times with the same argument may not be memoized this way. This will be addressed in a next article.

Memoization Tuple

Opinions expressed by DZone contributors are their own.

Related

  • Top 10 C# Keywords and Features
  • Enhancing Code Clarity With Python Namedtuples
  • In-Depth Guide to Using useMemo() Hook in React

Partner Resources

×

Comments
Oops! Something Went Wrong

The likes didn't load as expected. Please refresh the page and try again.

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends:

Likes
There are no likes...yet! 👀
Be the first to like this post!
It looks like you're not logged in.
Sign in to see who liked this post!