Parallelism in Spring

03.10.2009

I've been wanting to post about parallelism in Spring for ages. It's an interesting topic that doesn't get the attention it deserves (if any at all), probably because an IoC container, and Spring in particular, really shines at managing dependencies, a task that intuitively promotes serial processing, and also because JEE APIs (say Servlet or EJB) hide the need for it. There is one specific area where, no matter what, you'll be looking for concurrency, and that is data retrieval. As long as a couple of different resources are involved, or even a single one if the requested data items are independent, there are efficiency gains in processing the several connections in parallel. A common case in today's environments would be, for example, calling web services.

The Java EE standard does not really offer any API to manage concurrency once inside a request (though the request itself is served from a thread pool). In fact, it's open for discussion whether the standard forbids opening new threads in a web context (it's specifically banned for EJBs). WebSphere and WebLogic proposed an alternative called CommonJ, aka the WorkManager API. It's a very good alternative when running under those application servers. Spring offers another, arguably even more powerful, option with the TaskExecutor abstraction. It's sometimes preferable in a Spring environment because it can use CommonJ as the underlying API, but it can also use the Java 5 Executor framework (among others), making the switch just a matter of changing a couple of configuration lines.

Let's review how to use the framework. Our only prerequisite is to have at least two data retrieval services already configured as dependencies of a third bean. All the data retrieval services must share a common interface; I can recommend something like the Command pattern here (beware that this approach is not fully followed below, to better showcase inbound data processing). At this point we're going to replace the individual dependencies with a collection, and add init and destroy methods and an executor (let's start with a JDK 5 implementation):

public class ParallelService implements InitializingBean, DisposableBean {
   private List<Command> commands; // injected by Spring; every service shares the Command interface
   private ExecutorService executor;
}

With our current implementation based on Java 5 executors, we need to start up the thread pool in the initialization method and shut everything down when the Spring context is closed:

public void afterPropertiesSet() throws Exception {
   executor = Executors.newFixedThreadPool(commands.size());
}

public void destroy() throws Exception {
   executor.shutdownNow(); // Improve this as much as liked
}
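The "improve this" comment in destroy() could be addressed in several ways. Here is a minimal sketch of one of them, assuming Java 8 or later: stop accepting new work, give running tasks a grace period, and only then interrupt them. The class name GracefulShutdown and the timeout parameter are hypothetical and not part of the original service.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of the original article.
final class GracefulShutdown {

    private GracefulShutdown() { }

    /**
     * Stops accepting new tasks, waits up to timeoutSeconds for running ones,
     * then falls back to shutdownNow(). Returns true if the pool terminated.
     */
    static boolean shutdown(ExecutorService executor, long timeoutSeconds) {
        executor.shutdown(); // no new tasks; already submitted ones keep running
        try {
            if (!executor.awaitTermination(timeoutSeconds, TimeUnit.SECONDS)) {
                executor.shutdownNow(); // interrupt whatever is still running
                return executor.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
            }
            return true;
        } catch (InterruptedException e) {
            executor.shutdownNow();
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newFixedThreadPool(2);
        executor.submit(() -> System.out.println("retrieving data..."));
        System.out.println("terminated: " + shutdown(executor, 5));
    }
}
```

A destroy() method could then delegate to a helper like this instead of calling shutdownNow() directly, so that in-flight retrievals get a chance to finish.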

We just need to handle the concurrent execution now. It's easy to do with the Future management of asynchronous tasks. Another alternative is to submit all tasks and await termination (see ExecutorService):

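public void execute(Data data) throws InterruptedException, ExecutionException {
   Set<Future<?>> tasks = new HashSet<Future<?>>(commands.size());
   // Submit every command first so they all run in parallel
   for (Command command : commands)
      tasks.add(executor.submit(new RunCommand(command, data)));
   // Block until each task is done; get() throws ExecutionException if a task failed
   for (Future<?> future : tasks)
      future.get();
   //Other stuff to execute after all data has been retrieved
}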
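A related variant hands all tasks to the executor in one call with ExecutorService.invokeAll, which blocks until every task has completed. The sketch below assumes Java 8 or later and uses Callable tasks that return a String; the class name InvokeAllSketch and the sample tasks are hypothetical stand-ins for real data retrieval services.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch, not part of the original article.
final class InvokeAllSketch {

    private InvokeAllSketch() { }

    /** Runs every task on the executor and returns the results in task order. */
    static List<String> retrieveAll(ExecutorService executor, List<Callable<String>> tasks) {
        try {
            List<String> results = new ArrayList<>();
            // invokeAll blocks until all tasks have completed
            for (Future<String> future : executor.invokeAll(tasks)) {
                results.add(future.get()); // task is already done at this point
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A task failed", e.getCause());
        }
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newFixedThreadPool(2);
        List<Callable<String>> tasks = new ArrayList<>();
        tasks.add(() -> "customers"); // stand-ins for web service calls
        tasks.add(() -> "orders");
        System.out.println(retrieveAll(executor, tasks));
        executor.shutdown();
    }
}
```

Compared with the explicit Future loop, this trades a little flexibility (all tasks must share a result type) for less bookkeeping.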

The code above just creates a collection of Future objects to check when the jobs have finished. The tricky part is creating the concurrent job from a custom service and passing it the required data (if needed). A nested class wrapper will suffice:

private static class RunCommand implements Runnable {
   private final Data data;
   private final Command command;
   public RunCommand(Command command, Data data) {
      this.data = data;
      this.command = command;
   }
   public void run() {
      command.execute(data);
   }
}
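One thing worth keeping in mind with this wrapper is that every command receives the same Data instance, so Data is written from several threads at once. The article does not show Data or Command, so the shapes below are purely hypothetical: a Data holder backed by a ConcurrentHashMap and a one-method Command interface, assuming Java 8 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: Data and Command are assumed shapes, not the article's types.
final class SharedDataSketch {

    /** Thread-safe holder that several commands can fill concurrently. */
    static final class Data {
        private final Map<String, String> values = new ConcurrentHashMap<>();
        void put(String key, String value) { values.put(key, value); }
        String get(String key) { return values.get(key); }
    }

    /** Common interface shared by all data retrieval services. */
    interface Command {
        void execute(Data data);
    }

    /** Runs all commands against one shared Data and waits for them to finish. */
    static Data runAll(List<Command> commands) {
        Data data = new Data();
        ExecutorService executor = Executors.newFixedThreadPool(commands.size());
        try {
            List<Future<?>> tasks = new ArrayList<>();
            for (Command command : commands) {
                tasks.add(executor.submit(() -> command.execute(data)));
            }
            for (Future<?> task : tasks) {
                task.get(); // wait for each command to complete
            }
            return data;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A command failed", e.getCause());
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Command> commands = new ArrayList<>();
        commands.add(data -> data.put("customers", "3 rows")); // stand-in services
        commands.add(data -> data.put("orders", "7 rows"));
        Data result = runAll(commands);
        System.out.println(result.get("customers") + ", " + result.get("orders"));
    }
}
```

If the real Data object is a plain bean with unsynchronized setters, having each command write to its own distinct fields, or return its result instead, may be safer choices.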

Well, that was pretty easy indeed. Right now we have a perfectly valid way to invoke beans in parallel. This approach has pros and cons. Among the pros we have independence from Spring APIs (imagining, of course, that the Spring interfaces are replaced by their matching XML attributes), but we are also limited to a Java 5 environment. If we don't mind introducing a dependency on Spring itself, we can transform the source code to use the TaskExecutor framework:

public class ParallelService {
   private TaskExecutor taskExecutor;
   private List<Command> commands; // injected by Spring, as before
   public void setTaskExecutor(TaskExecutor taskExecutor) {
      this.taskExecutor = taskExecutor;
   }
}

And now the init and destroy methods are substituted by some XML configuration:

<bean class="org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor">
   <property name="corePoolSize" value="5" />
   <property name="maxPoolSize" value="10" />
   <property name="queueCapacity" value="25" />
</bean>

But notice that not all implementations of the TaskExecutor interface allow tracking the progress of a task once scheduled for execution!

public void execute(Data data) {
   // Fire and forget: TaskExecutor.execute() returns no handle to wait on
   for (Command command : commands)
      taskExecutor.execute(new RunCommand(command, data));
}
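Because execute() returns nothing, the caller carries on immediately after handing over the jobs. If the calling thread still needs to wait for all of them, one possible approach is a CountDownLatch that each job decrements when it finishes. The sketch below is written against java.util.concurrent.Executor, which declares the same execute(Runnable) method (and which Spring's TaskExecutor extends from Spring 3.0 onward); the class name LatchSketch and the timeout are hypothetical, and Java 8 or later is assumed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not part of the original article.
final class LatchSketch {

    private LatchSketch() { }

    /**
     * Hands every job to the executor and blocks until all of them have run
     * or the timeout expires. Returns true if all jobs finished in time.
     */
    static boolean executeAndWait(Executor executor, List<Runnable> jobs, long timeoutSeconds) {
        CountDownLatch done = new CountDownLatch(jobs.size());
        for (Runnable job : jobs) {
            executor.execute(() -> {
                try {
                    job.run();
                } finally {
                    done.countDown(); // count down even if the job throws
                }
            });
        }
        try {
            return done.await(timeoutSeconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        List<Runnable> jobs = new ArrayList<>();
        jobs.add(() -> System.out.println("calling service A"));
        jobs.add(() -> System.out.println("calling service B"));
        System.out.println("all finished: " + executeAndWait(pool, jobs, 5));
        pool.shutdown();
    }
}
```

Note that a latch only signals completion; a job that fails still counts down, so any error reporting has to be handled inside the job itself.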

From http://internna.blogspot.com/

Published at DZone with permission of its author, Jose Noheda.

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)

Comments

Yaozong Zhu replied on Tue, 2009/03/10 - 7:01am

To be honest, this essentially is all about the usage of the new concurrency framework introduced by Java 5. Spring doesn't add any value into it.

Andrew Perepelytsya replied on Tue, 2009/03/10 - 8:43am

for (Future future : tasks)
future.get();

The above is prone to be suboptimal, as e.g. out of 20 tasks, 5 can be really slow, and block the loop from completion sooner. Switching to juc's CompletionService is a better solution.

Jose Noheda replied on Tue, 2009/03/10 - 10:18am in response to: Yaozong Zhu

@Yaozong: Yes and no. The purpose when I first wrote the article was to highlight the fact that Spring beans can and should be called in parallel in many situations. Spring, as a framework, adds some value to the mix, but we both agree that it's limited.

Jose Noheda replied on Tue, 2009/03/10 - 10:19am in response to: Andrew Perepelytsya

@Andrew: Very true. The idea here was to block until all tasks had finished. If you don't want to wait at all, or want to consume results as they become available, there are other approaches, of course.

Monica Walker replied on Sat, 2009/04/18 - 2:27pm

I totally agree with you that there's a specific area where no matter what you'll be looking for concurrency and that is data retrieval. I am currently working on a project using JavaFX and will update you on my progress.

Sumana Datta replied on Thu, 2010/02/04 - 2:19pm

I am using the task executor, but the program execution is not waiting for all the tasks to be completed and immediately starts running the next line.
