Entrepreneur. Creator of Groovy++. Alex is a DZone Zone Leader and has posted 31 posts at DZone.

Why Scala actors are slower compared to Jetlang and/or Groovy++ messaging

02.24.2010

After this article was published, Hossam Karim suggested several improvements that significantly improved the performance of the Scala benchmark. That forced me to remove "15-20 times slower" from the initial title of the article and to include the updated code below.

Asynchronous message passing is an old and great idea. Instead of synchronizing objects and dealing with deadlocks, we try to isolate objects and let them exchange messages. Erlang proved it to be an extremely powerful tool applicable to a wide range of highly scalable, mission-critical applications.

In the JVM landscape, big interest in this approach started with the introduction of Scala actors. My personal (maybe wrong) impression is that the main interest in Scala grew precisely because of the actors library.

But do Scala actors perform well? When you exchange millions of messages internally, you want to be sure that it happens as fast as possible.

In Erlang, message passing and process scheduling are built into the virtual machine. On the JVM, switching contexts between threads and blocking on queues are expensive operations. This was the question we tried to understand while designing the Groovy++ message passing architecture.

We started with a benchmark, which shocked us. The goal of this article is to describe this benchmark and show three different implementations: one for Scala, one for the beautiful Jetlang library (http://code.google.com/p/jetlang/) using Groovy++, and one for our own prototype of Groovy++ messaging.

While the prototype implementation we have in the Groovy++ standard library seems to be noticeably faster than the one in Jetlang (20-40%), this does not necessarily mean a lot. Both implementations are very similar in spirit, and most probably some ideas from our implementation can speed up Jetlang or, vice versa, support for must-have Jetlang features that we have not implemented yet can slow ours down.

The point is that Scala actors are at least 10 times slower. On some variations of the benchmark we saw them being 15-20 times slower.

So what does the benchmark do?

We want to measure the average speed of message sending and receiving, so we chose a variation of the well-known thread ring benchmark.

  • We have 10000 actors indexed 0..9999
  • When an actor receives a message, it forwards the message to the next actor in the row
  • We start by sending the string "Hi" to the first 500 actors in the row

So a bit less than 5M messages are sent and received (the message injected at actor i is delivered 10000 - i times, and the latch in the code is sized at 10000 * 500). We measure how long it takes to construct all these objects and to send and receive the messages.
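The message count can be checked directly. The following is only a small sketch in plain Java; the class and method names (`RingMessageCount`, `deliveriesFrom`, `totalDeliveries`) are made up for illustration and do not appear in the benchmark. It counts the deliveries caused by each injected message and shows why the benchmark code below calls `cdl.countDown()` an extra `i` times for the message injected at index `i`: the latch is sized at 10000 * 500, but the message injected at index `i` never visits the first `i` actors.

```java
class RingMessageCount {

    // A message injected at actor `start` is received by start, start+1, ..., actors-1.
    static long deliveriesFrom(int start, int actors) {
        return actors - start;
    }

    // Total deliveries when one message is injected at each of the first `injected` actors.
    static long totalDeliveries(int actors, int injected) {
        long total = 0;
        for (int i = 0; i < injected; i++) {
            total += deliveriesFrom(i, actors);
        }
        return total;
    }

    public static void main(String[] args) {
        int actors = 10000;
        int injected = 500;
        long latchSize = (long) actors * injected;          // what the benchmark puts in the latch
        long delivered = totalDeliveries(actors, injected); // what the actors really count down
        long compensation = latchSize - delivered;          // extra countDown() calls made by the sender
        System.out.println("latch size:   " + latchSize);
        System.out.println("deliveries:   " + delivered);
        System.out.println("compensation: " + compensation);
    }
}
```

With the article's parameters this gives 4,875,250 deliveries against a latch of 5,000,000, so the sending loop has to count down the remaining 124,750 itself.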

Here is the code using Jetlang. Formally speaking, we should implement it in pure Java, but since we know that code produced by Groovy++ should perform as fast as code produced by javac, we chose the more expressive and less verbose language. It was very nice that Jetlang primitives mapped easily onto Groovy++ syntax.

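import java.util.concurrent.CountDownLatch
import java.util.concurrent.Executors
import java.util.concurrent.TimeUnit
import org.jetlang.channels.MemoryChannel
import org.jetlang.fibers.PoolFiberFactory

def start = System.currentTimeMillis()

def pool = Executors.newFixedThreadPool(Runtime.runtime.availableProcessors())
def fiberFactory = new PoolFiberFactory(pool)

def channels = new MemoryChannel[10000]
// one count per channel per injected message; the shortfall is compensated below
CountDownLatch cdl = [channels.length*500]
for (i in 0..<channels.length) {
    def fiber = fiberFactory.create()
    def channel = new MemoryChannel()
    channel.subscribe(fiber) {
        // forward to the next channel in the row, then record the delivery
        if (i < channels.length-1)
            channels[i+1].publish(it)
        cdl.countDown()
    }
    channels [i] = channel
    fiber.start()
}
for(i in 0..<500) {
    channels[i].publish("Hi")
    // the message injected at index i never visits the first i channels
    for (j in 0..<i)
        cdl.countDown()
}

assert(cdl.await(100,TimeUnit.SECONDS))
pool.shutdown()
println (System.currentTimeMillis() - start)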

Groovy++ message passing is very similar to Jetlang. The main difference is that we chose (at least so far) not to separate fibers (message consumers) from message channels (where messages are published). This lets us use some simplified data structures internally and may be the reason why the Groovy++ implementation is a bit faster than Jetlang's. We may change this later.

def start = System.currentTimeMillis()
def pool = new ChannelExecutor(Runtime.runtime.availableProcessors())
def channels = new ExecutingChannel [10000]
CountDownLatch cdl = [channels.length*500]
for (i in 0..<channels.length) {
    ExecutingChannel channel = {
        if (i < channels.length-1)
            channels[i+1] << it
        cdl.countDown()
    }
    channel.executor = pool
    channels [i] = channel
}

for(i in 0..<500) {
    channels[i] << "Hi"
    for (j in 0..<i)
        cdl.countDown()
}

assert(cdl.await(100,TimeUnit.SECONDS))
assert(pool.shutdownNow().empty)
pool.awaitTermination(0L,TimeUnit.SECONDS)
println(System.currentTimeMillis()-start)
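For readers who want to try the same ring without Jetlang or Groovy++, here is a rough sketch in plain Java using only `java.util.concurrent`. It is not the code that was measured in this article and no timing is claimed for it. The names `JavaCallbackRing`, `Node`, `publish` and `onMessage` are invented for the example. Like the Groovy++ version, it merges the channel and the consumer into one object whose callback runs on a shared thread pool. Unlike a real fiber or channel, this simplified `Node` does not guarantee that messages sent to the same node are processed one at a time and in order; the benchmark does not depend on that, but a real library has to provide it.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class JavaCallbackRing {

    /** One ring element: channel and consumer merged into a single callback object. */
    static final class Node {
        private final int index;
        private final Node[] ring;
        private final ExecutorService pool;
        private final CountDownLatch cdl;

        Node(int index, Node[] ring, ExecutorService pool, CountDownLatch cdl) {
            this.index = index;
            this.ring = ring;
            this.pool = pool;
            this.cdl = cdl;
        }

        /** Hands the message to the pool; the callback runs later on a worker thread. */
        void publish(Object message) {
            pool.execute(() -> onMessage(message));
        }

        /** Forwards to the next node in the row (if any), then records the delivery. */
        private void onMessage(Object message) {
            if (index < ring.length - 1) {
                ring[index + 1].publish(message);
            }
            cdl.countDown();
        }
    }

    /** Runs the ring; returns true if all deliveries completed within the timeout. */
    static boolean run(int nodes, int injected, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch cdl = new CountDownLatch(nodes * injected);
        Node[] ring = new Node[nodes];
        for (int i = 0; i < nodes; i++) {
            ring[i] = new Node(i, ring, pool, cdl);
        }
        for (int i = 0; i < injected; i++) {
            ring[i].publish("Hi");
            // same compensation as in the article: message i skips the first i nodes
            for (int j = 0; j < i; j++) {
                cdl.countDown();
            }
        }
        try {
            return cdl.await(100, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        boolean done = run(10000, 500, Runtime.getRuntime().availableProcessors());
        System.out.println("completed=" + done + " in " + (System.currentTimeMillis() - start) + " ms");
    }
}
```

The sketch keeps the article's structure (array of 10000 elements, latch of 10000 * 500, compensation loop) so that it can be compared line by line with the Groovy++ code above.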

And here is the code with Scala actors.

import java.util.concurrent.{CountDownLatch, TimeUnit}
import scala.actors.{Actor, Scheduler}
import scala.actors.Actor._

object FiberRingScala {
  def main(args: Array[String]) {
    val start = System.currentTimeMillis()
    val channels = new Array[Actor](10000)
    val cdl = new CountDownLatch(channels.length * 500)

    var i: Int = 0
    while (i < channels.length) {
      // copy the index into a val: the actor body is a closure and would
      // otherwise read whatever value the var i holds when a message arrives
      val index = i
      val channel = actor {
        loop {
          react {
            case x:Any =>
              if (index < channels.length -1)
                channels(index+1) ! x
              cdl.countDown()
          }
        }
      }
      channels(i) = channel
      i += 1
    }
    i = 0
    while (i < 500) {
      channels(i) ! "Hi"
      var j : Int = 0
      while (j < i) {
        cdl.countDown()
        j = j+1
      }
      i += 1
    }

    cdl.await(1000, TimeUnit.SECONDS)
    Scheduler.shutdown

    println(System.currentTimeMillis() - start)
  }
}

Seems to be very similar to the code with Jetlang and Groovy++ messaging, right?

Benchmarking results (milliseconds, smaller is better):

2155 - Jetlang

1682 - Groovy++

47911 - Scala

Something is very wrong with Scala performance. It seems to be 22 times slower than Jetlang. We believe the reason is that Scala tries to emulate Erlang behaviour instead of using a messaging model that fits the JVM naturally.
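The ratios follow directly from the three timings above. As a quick check, here is a tiny Java sketch (the class name `BenchmarkRatios` and the method `slowdown` are made up for this example) that divides the reported numbers: Scala comes out about 22 times slower than Jetlang and about 28 times slower than Groovy++, while Jetlang is roughly 1.3 times slower than Groovy++, which agrees with the 20-40% mentioned earlier.

```java
class BenchmarkRatios {

    // Timings reported in the article, in milliseconds.
    static final long JETLANG_MS = 2155;
    static final long GROOVYPP_MS = 1682;
    static final long SCALA_MS = 47911;

    // How many times slower the first timing is than the second.
    static double slowdown(long slowerMs, long fasterMs) {
        return (double) slowerMs / fasterMs;
    }

    public static void main(String[] args) {
        System.out.printf("Scala vs Jetlang:     %.1fx%n", slowdown(SCALA_MS, JETLANG_MS));
        System.out.printf("Scala vs Groovy++:    %.1fx%n", slowdown(SCALA_MS, GROOVYPP_MS));
        System.out.printf("Jetlang vs Groovy++:  %.2fx%n", slowdown(JETLANG_MS, GROOVYPP_MS));
    }
}
```

These are single-run wall-clock numbers that include object construction, so the ratios are only indicative.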

Here is an optimization invented by Vaclav Pech (creator and lead of the brilliant GPars library) that speeds up the Scala code twofold, to 22484, which is still 13 times slower than Groovy++ messaging. The very fact that such an optimization helps, and helps so much, makes us think that something is wrong in the Scala approach to actors.

import java.util.concurrent.{CountDownLatch, TimeUnit}
import scala.actors.{Actor, Scheduler}
import scala.actors.Actor._

object FiberRingScala {
  def main(args: Array[String]) {
    val start = System.currentTimeMillis()
    val channels = new Array[Actor](10000)
    val cdl = new CountDownLatch(channels.length * 500)

    var i: Int = 0
    while (i < channels.length) {
      // freeze the index: the actor body runs later, possibly after i has changed
      val index = i
      val channel = actor {
        handle(index, channels, cdl)
      }
      channels(i) = channel
      i += 1
    }
    i = 0
    while (i < 500) {
      channels(i) ! "Hi"
      var j : Int = 0
      while (j < i) {
        cdl.countDown()
        j = j+1
      }
      i += 1
    }

    cdl.await(1000, TimeUnit.SECONDS)
    Scheduler.shutdown

    println(System.currentTimeMillis() - start)
  }

  // react, handle one message, then register the same reaction again
  def handle(i:Int, channels:Array[Actor], cdl:CountDownLatch) :Unit = {
    react {
      case x:Any =>
        if (i < channels.length -1)
          channels(i+1) ! x
        cdl.countDown()
        handle(i, channels, cdl)
    }
  }
}

What has changed compared to the previous version? Almost nothing really: we replaced loop/react with a plain react whose reaction method calls react again. Not very intuitive but understandable (if you are the creator of a message passing library, like Vaclav or yours truly). It seems we saved one flow control exception (just a guess). The point is that in Erlang you deal with lightweight processes. On the JVM you get callback objects almost for free, but the illusion of a lightweight process becomes extremely expensive.

Don't ask me what a flow control exception is. I like the idea as an interesting animal in the zoo of concurrency tricks, but I don't want it to become too popular.
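For the curious, the general trick can still be sketched. The Java below is not Scala's implementation and makes no claim about its internals; `ControlFlowExceptionSketch`, `Suspend`, `react` and `handleLoop` are hypothetical names. It only illustrates the idea: a `react`-like call stores the handler for the next message and then throws a stackless exception, so it never returns normally and the current call stack is abandoned. A scheduler catches the exception and later invokes the stored handler with the next message.

```java
import java.util.ArrayList;
import java.util.List;

class ControlFlowExceptionSketch {

    /** Thrown only to unwind the stack; it carries no stack trace and signals no error. */
    static final class Suspend extends RuntimeException {
        Suspend() {
            super(null, null, false, false); // no suppression, no writable stack trace
        }
    }

    interface Handler {
        void handle(Object message);
    }

    private Handler pending;                 // handler waiting for the next message
    int unwinds = 0;                         // how many times the stack was unwound
    final List<Object> received = new ArrayList<>();

    /** Registers the handler for the next message, then abandons the current stack. */
    void react(Handler handler) {
        pending = handler;
        throw new Suspend();
    }

    /** Same shape as the recursive handle method above: react, process, react again. */
    void handleLoop() {
        react(message -> {
            received.add(message);
            handleLoop();
        });
    }

    /** A toy scheduler: starts the body, then feeds messages to the pending handler. */
    void run(Object... messages) {
        try {
            handleLoop();
        } catch (Suspend s) {
            unwinds++;
        }
        for (Object message : messages) {
            Handler handler = pending;
            pending = null;
            try {
                handler.handle(message);
            } catch (Suspend s) {
                unwinds++;
            }
        }
    }

    public static void main(String[] args) {
        ControlFlowExceptionSketch actor = new ControlFlowExceptionSketch();
        actor.run("a", "b", "c");
        System.out.println(actor.received + " delivered with " + actor.unwinds + " stack unwinds");
    }
}
```

In this toy version every delivered message costs one throw and one catch, plus one more for the initial registration. A plain callback object, as in the Jetlang and Groovy++ samples, handles a message and simply returns.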

That's it for today. You can find the source code and libraries used in this article at http://code.google.com/p/groovypptest/source/browse/#svn/trunk/JetLangSample/src/org/mbte/groovypp/samples/jetlang

Please let us know if you have an idea how to speed up any of the samples above.

I hope it was interesting. Till next time.

UPDATE: The code below, suggested by Hossam Karim, executed in 5221 ms on my setup.

I will probably post a separate article on why this happens. My first impression is that, as usual, a smart scheduler helps a lot. It might also be that Reactor has a much smarter implementation.

package org.mbte.groovypp.samples.jetlang

import java.util.concurrent.{TimeUnit, CountDownLatch}
import scala.actors.Actor._
import scala.actors.scheduler.ForkJoinScheduler
import scala.actors.Reactor
import scala.actors.Scheduler

class FiberRingActor
  (i: Int, channels: Array[Reactor], cdl: CountDownLatch)
    extends Reactor {
  def act = FiberRingScala2.handle(i, channels, cdl)
  override def scheduler = FiberRingScala2.fjs
}

object FiberRingScala2 {
  val fjs = new ForkJoinScheduler(2, 2, false)
  fjs.start()

  def main(args: Array[String]) {
    val start = System.currentTimeMillis()
    val channels = new Array[Reactor](10000)
    val cdl = new CountDownLatch(channels.length * 500)

    var i: Int = 0
    while (i < channels.length) {
      /*
      val channel = actor {
        handle(i, channels, cdl)
      }
      */
      val channel =
        new FiberRingActor(i, channels, cdl)
      channel.start
      channels(i) = channel
      i += 1
    }
    i = 0
    while (i < 500) {
      channels(i) ! "Hi"
      var j : Int = 0
      while (j < i) {
        cdl.countDown()
        j = j+1
      }
      i += 1
    }

    cdl.await(1000, TimeUnit.SECONDS)
    Scheduler.shutdown

    println(System.currentTimeMillis() - start)
  }

  def handle(i:Int, channels:Array[Reactor], cdl:CountDownLatch) :Unit = {
    react {
      case x:Any =>
        if (i < channels.length -1)
          channels(i+1) ! x
        cdl.countDown()
        handle(i, channels, cdl)
    }
  }
}

Published at DZone with permission of its author, Alex Tkachman.

Comments

peter kovac replied on Wed, 2010/02/24 - 11:31am

you should probably run your benchmark against akka actors or lift actors, since both are more lightweight than the standard lib version.

Sean John replied on Wed, 2010/02/24 - 12:25pm

There's something I don't like about the name of your articles, you always bash other languages. First "How come that Groovy++ overperform Java?", now this. Why do you do this? Publicity?

Peter Bliznak replied on Wed, 2010/02/24 - 12:43pm

The "Scala is 20 times slower" title is pure BS. Why don't you go and submit this to a Scala-specialized forum where folks can confirm it or show you what is wrong with the code - instead of putting it on a Java forum knowing chances are nobody is going to challenge it.... cheap publicity IMHO.

Cej Hah replied on Wed, 2010/02/24 - 12:55pm

This place is getting close to jumping the shark. I'm sooo sick of coming on here and reading about how Groovy++ is going to rule the world. It's not. Let it be. If you want to go off in a corner and do your thing, that's cool, but let the rest of us have a break. We get it, Groovy++ is WAY faster than just plain Groovy and it's also WAAAY faster than anything else and we all need to jump on board. Never mind things such as being a well designed language, support, reliability, etc.

Ben Courliss replied on Wed, 2010/02/24 - 12:57pm in response to: Peter Bliznak

So rather than pointing out what about the Scala code is slow, you just criticize the author and refute his arguments with nothing but opinion? Are you afraid that maybe he's right? Why don't you prove him wrong?

I know next to nothing about Scala or Groovy++ messaging (although I do know some Groovy). I can't help but think that the more information out there, the better it is for BOTH communities. And trust me, there's no need to post specifically on a Scala-specific forum. There's nothing preventing those users from coming here to refute this article, which BTW, is an excellent article. The author made a good attempt at proving his point with benchmarks and sample code. And for the record, he IS right that Scala is 20 times slower in this benchmark. The numbers don't lie. If you can speed up the Scala code, then prove it rather than hiding behind ad hominem attacks and strawman arguments.

Jihed Amine replied on Wed, 2010/02/24 - 1:00pm

I really don't like these groovy fanboys posts bashing other languages. I don't like the spirit of this post.

Peter Bliznak replied on Wed, 2010/02/24 - 1:10pm in response to: Ben Courliss

I cannot speed it up with my knowledge of Scala, BUT a true expert most likely can. So the author should have first gone to a Scala-specific forum, posted it there, discussed it there and then started writing the article - then it would be worth reading (a less bombastic title would be nice too). And as for strawman arguments: I said this is the Javalobby site, not that many Scala guys come here, so whatever is inside this article must by the majority be taken with a leap of faith. Sure, anybody can post anything here, so what is next, "Chicken soup 15 times stronger than pure water"?

Oliver Schweitzer replied on Wed, 2010/02/24 - 1:17pm

Hi! I'll leave the evaluation if the benchmark implementations are valid and comparable to others with more expertise with the languages and their libraries.

Just some general questions:

What Scala version did you use? What command line parameters did you use for compilation?

What Groovy++ version did you use? What command line parameters did you use for compilation?

What JVM did you use? Which command line options for the VM?

What OS did you use?

What hardware did you run your benchmarks on?

Alex Cruise replied on Wed, 2010/02/24 - 1:16pm

Here's the thread on scala-user where people have been discussing this article. Suffice it to say that the Scala solutions would be much more performant if they were written in a different (more idiomatic) style.

Peter Bliznak replied on Wed, 2010/02/24 - 2:02pm

BTW ... for fun during launch I took your code, plugged it into Eclipse - running Scala 2.8 Beta1-pre-release - and got 8250 ms .... somehow you managed to get 47911 ms ??? Seems like there is something really fishy about this timing.

Hossam Karim replied on Wed, 2010/02/24 - 4:26pm

I don't really know on what kind of machine you were able to get such figures; anyway, on my lousy laptop I got 2810 ms with the following modified code:

import java.util.concurrent.{TimeUnit, CountDownLatch}
import scala.actors.Actor._
import scala.actors.scheduler.ForkJoinScheduler
import scala.actors.{Reactor, Scheduler}

class FiberRingActor
  (i: Int, channels: Array[Reactor], cdl: CountDownLatch)
    extends Reactor {
  def act = FiberRingScala.handle(i, channels, cdl)
  override def scheduler = FiberRingScala.fjs
}

object FiberRingScala {
  val fjs = new ForkJoinScheduler(2, 2, false)
  fjs.start()

 def main(args: Array[String]) {
  val start = System.currentTimeMillis()
  val channels = new Array[Reactor](10000)
  val cdl = new CountDownLatch(channels.length * 500)

  var i: Int = 0
  while (i < channels.length) {
    /*
    val channel = actor {
      handle(i, channels, cdl)
    }
    */
    val channel =
      new FiberRingActor(i, channels, cdl)
    channel.start 
    channels(i) = channel
    i += 1
  }
  i = 0
  while (i < 500) {
    channels(i) ! "Hi"
    var j : Int = 0
    while (j < i) {
      cdl.countDown()
      j = j+1
    }
    i += 1
  }

  cdl.await(1000, TimeUnit.SECONDS)
  Scheduler.shutdown

  println(System.currentTimeMillis() - start)
 }

 def handle(i:Int, channels:Array[Reactor], cdl:CountDownLatch) :Unit = {
  react {
    case x:Any =>
      if (i < channels.length -1)
        channels(i+1) ! x
      cdl.countDown()
      handle(i, channels, cdl)
  }
 }
}

Alex Tkachman replied on Thu, 2010/02/25 - 1:41am in response to: Hossam Karim

Hossam, thanks a lot. This modified code executed on my machine in 5221 ms. Use of a non-standard scheduler makes a lot of difference. I will write a separate post about it later today.

Jevgeni Kabanov replied on Thu, 2010/02/25 - 1:47am

Actors are a bit of a high-level abstraction with a lot of implicit stuff happening behind the scenes. What you see here is the result of hitting the thread-local for lookup and exceptions for unwinding stack. There are ways to sacrifice this level of abstraction for performance to get similar results. Look in scala-user thread for details.

Alex Tkachman replied on Thu, 2010/02/25 - 2:13am in response to: Jevgeni Kabanov

Jevgeni, that is exactly my point. I have no goal of bashing Scala. My goal is to find the right solution for message scheduling, and I am a big believer that flow control exceptions are evil (same for thread-locals used uncontrollably).

Oliver Schweitzer replied on Thu, 2010/02/25 - 9:43am in response to: Alex Tkachman

Ok, so to reach that goal you wrote a weak article about a badly setup microbenchmark with an inflammatory title and thought "Ok, maybe some Scala pr0s and even one of the language and library designers will read this and give me a really good solution!" . It seems to have worked out fine!

Brandon Nokes replied on Thu, 2010/02/25 - 12:05pm

I don't see any reason for people to make inflammatory remarks regarding the author's intent here. Is there a problem with making objective comments without bashing the author or the other members? It's one thing to disagree, it's another to make accusatory statements. If you disagree or don't like what the author has to say, use the vote buttons, that's what they are there for. There is no need to make personal attacks on anyone here at DZone.

Let's try to keep this more professional and a little less argumentative.

Jason Marshall replied on Thu, 2010/02/25 - 5:05pm in response to:

I think being "more professional and a little less argumentative" IS the issue here.  The hard part of putting yourself out there, writing wise, is figuring out whether your intent is communicated by what you've written.

In this case, our author has confessed that it wasn't.  That's his fault, not the readers'.  

You can't post an article to a high profile development site questioning the viability of someone else's technology, and expect anyone to take it as anything else than what it really is: taking potshots. "Friends" don't air grievances in public until they've tried to have them addressed in private.  With friends like these, who needs enemies?

Here's my not-so-friendly take on the Scala situation:

Scala could reimplement their actor model, and most of these concerns would go away. Sounds like maybe they have some new incentive to do so. What blocks me from using it, where I have concerns, is some of the more esoteric aspects of the language semantics. Notably, their mixin implementation. I've seen the diagrams for calling precedence and I'm still left entirely confused. And people generally consider me to be a fairly clever fellow. I have no idea how I'd explain it to a junior developer.

That's a tougher one to fix, because there's no backward compatible way to 'dumb down' the implementation so that it can be explained to newcomers.  They'll have to rip out functionality to get that to be simpler to understand.  Some existing users are likely to feel a little betrayed by that, and I don't envy them that conversation if and when it comes.

 

 

 

 
