Ant is a freelance Java architect and developer. He has been writing a blog and white papers since 2004 and writes about anything he finds interesting, related to Java or software. Most recently he has been working on enterprise systems involving Eclipse RCP, Google GWT, Hibernate, Spring and J2ME. He believes very strongly in being involved in all parts of the software life cycle. Ant is a DZone MVB and is not an employee of DZone and has posted 26 posts at DZone.

Node JS and Server side Java Script

03.07.2011

Let's start right at the beginning. Bear with me, it might get long...

The following snippet of Java code could be used to create a server which receives TCP/IP requests:

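import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;

class Server implements Runnable {
    static final int PORT = 30032; // example value; any free port will do

    public void run() {
        try {
            ServerSocket ss = new ServerSocket(PORT);
            while (!Thread.interrupted()) {
                Socket s = ss.accept(); // blocks until a client connects
                s.getInputStream();  // read from this
                s.getOutputStream(); // write to this
            }
        } catch (IOException ex) { /* ... */ }
    }
}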


This code runs as far as the line with ss.accept(), which blocks until an incoming request is received. The accept method then returns and you have access to the input and output streams in order to communicate with the client.

There is one issue with this code. Think about multiple requests coming in at the same time. You are dedicated to completing the first request before making the next call to the accept method. Why? Because the accept method blocks. If you decided you would read a chunk off the input stream of the first connection, and then be kind to the next connection and accept it and handle its first chunk before continuing with the original (first) connection, you would have a problem, because the accept method blocks. If there were no second request, you wouldn't be able to finish off the first request, because the JVM blocks on that accept method. So, you must handle an incoming request in its entirety, before accepting a second incoming request.

This isn't so bad, because you can create the ServerSocket with an additional parameter, called the backlog, which tells it how many requests to queue up before refusing further connections. While you are busy handling the first request, subsequent requests are simply queued up.
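As a rough sketch of the backlog parameter: the two-argument ServerSocket constructor takes the port and the requested queue length. The class name BacklogServer, the port 0 (which lets the OS pick a free port) and the backlog of 50 are just illustrative assumptions, and the OS is free to treat the backlog as a hint rather than a hard limit.

```java
import java.io.IOException;
import java.net.ServerSocket;

class BacklogServer {
    // Opens a listening socket and asks the OS to queue up to `backlog`
    // pending connections before refusing further ones.
    static ServerSocket open(int port, int backlog) throws IOException {
        return new ServerSocket(port, backlog);
    }

    public static void main(String[] args) throws IOException {
        ServerSocket ss = open(0, 50); // assumed values, see above
        try {
            System.out.println("listening on port " + ss.getLocalPort());
        } finally {
            ss.close();
        }
    }
}
```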

This strategy would work, although it's not really efficient. If you have a multicore CPU, you will only be doing work on one core. It would be better to have more threads, so that the load can be balanced across the cores (watch out, this is JVM and OS dependent!).

A more typical multi-threaded server gets built like this:

import java.io.IOException;
import java.net.ServerSocket;

class Server implements Runnable {
    public void run() {
        try {
            ServerSocket ss = new ServerSocket(PORT);
            while (!Thread.interrupted()) {
                // one thread per socket connection; every thread created
                // this way will essentially block for I/O.
                // Handler is a Runnable (not shown) that services one Socket.
                new Thread(new Handler(ss.accept())).start();
            }
        } catch (IOException ex) { /* ... */ }
    }
}

The above code hands off each incoming request to a new thread, allowing the main thread to accept new incoming requests while the spawned threads handle individual requests. This code also balances the load across CPU cores, where the JVM and OS allow it. Ideally, we probably wouldn't create a thread per new request, but rather hand off the request to a thread pool executor (see the java.util.concurrent package). On the other hand, there are times when a thread per request is required. If the conversation between server and client is longer lasting (rather than a simple HTTP request that is typically serviced in anything from milliseconds to seconds), then the socket can stay open. Examples of when this is required are chat servers, VOIP, or anything else where a continual conversation is needed. But in such situations, the above code, even though it is multi-threaded, has its limits. Those limits are actually because of the threads! Consider the following code:

public class MaxThreadTest {

    static int numLive = 0;

    public static void main(String[] args) {
        while (true) {
            new Thread(new Runnable() {
                public void run() {
                    numLive++;

                    System.out.println("running " + Thread.currentThread().getName() + " " + numLive);
                    try {
                        Thread.sleep(10000L);
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }

                    numLive--;
                }
            }).start();
        }
    }
}


This code creates a bunch of threads, until the process crashes. With a 64 MB heap, it crashed (out of memory) after around 4000 threads, while testing on my Windows XP Thinkpad laptop. I upped the heap size to 256 MB and Eclipse crashed while in debug mode... I started the process from the command line and managed to open 5092 threads, but it was unstable and unresponsive. Interestingly, I upped the heap size to 1 GB, and then I could only open 2658 threads... This shows that I don't really understand the OS or JVM at this level! Anyway, if we were writing a system to handle a million simultaneous conversations with one thread per conversation, we would probably need over two hundred servers. But theoretically, we could reduce our costs to less than 10% of that, because we are allowed to open just over 65,000 connections per server (well, say 63,000 by the time we account for all the ports used by the OS and other processes). We could theoretically get away with just having 16 servers per million simultaneous connections.
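The server counts are nothing more than a division rounded up. Here is a small sketch of that arithmetic; the class and method names are made up for illustration, and the per-server capacities (roughly 5,000 threads, roughly 63,000 usable ports) are the rough figures from the paragraph above, not measurements of anything else.

```java
class ServersNeeded {
    // Servers needed to hold `connections` simultaneous conversations when
    // one server can hold at most `perServer` of them (rounded up).
    static long serversNeeded(long connections, long perServer) {
        return (connections + perServer - 1) / perServer;
    }

    public static void main(String[] args) {
        // thread per conversation, about 5,000 threads per server
        System.out.println(serversNeeded(1000000L, 5000L));
        // non-blocking I/O, about 63,000 connections per server
        System.out.println(serversNeeded(1000000L, 63000L));
    }
}
```

With those assumed capacities the two results are 200 and 16 servers, which is where the "less than 10%" comes from.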

The way to do this is to use non-blocking I/O. Since Java 1.4 (released in 2002), the java.nio package has been around to help us. With it, you can create a system which handles many simultaneous incoming requests using just one thread. Roughly, it works by registering with the OS to get events when something happens, for example when a new connection is ready to be accepted, or when one of the clients sends data over the wire.

With this API, we can create a server, which is, sadly, a little more complicated than those above, but which handles lots and lots of sockets all from one thread:

public class NonBlockingServer2 { // needs imports from java.io, java.net, java.nio, java.nio.channels, java.nio.charset and java.util
    static int numConns = 0, numTimeRequests = 0; // simple counters
    public static void main(String[] args) throws IOException {
        System.out.println("Starting NIO server...");
        Charset charset = Charset.forName("UTF-8");
        CharsetDecoder decoder = charset.newDecoder();
        CharsetEncoder encoder = charset.newEncoder();

        ByteBuffer buffer = ByteBuffer.allocate(512);

        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.socket().bind(new InetSocketAddress(30032));
        server.configureBlocking(false);
        SelectionKey serverkey = server.register(selector, SelectionKey.OP_ACCEPT);

        boolean quit = false;
        while (!quit) {
            selector.select(); //blocks until a registered channel is ready (OP_ACCEPT or OP_READ)
            Set<SelectionKey> keys = selector.selectedKeys();
            for (SelectionKey key : keys) {
                if (key == serverkey) {
                    if (key.isAcceptable()) {
                        SocketChannel client = server.accept();
                        if (client != null) { //can be null if there's no pending connection
                            client.configureBlocking(false);
                            SelectionKey clientkey = client.register(selector,
                                    SelectionKey.OP_READ); //register for the read event
                            numConns++;
                        }
                    }
                } else {
                    SocketChannel client = (SocketChannel) key.channel();
                    if (!key.isReadable()) {
                        continue;
                    }
                    int bytesread = client.read(buffer);
                    if (bytesread == -1) {
                        //end of stream: the client has closed the connection
                        key.cancel();
                        client.close();
                        continue;
                    }
                    buffer.flip();
                    String request = decoder.decode(buffer).toString();
                    buffer.clear();

                    if (request.trim().equals("quit")) {
                        client.write(encoder.encode(CharBuffer.wrap("Bye.")));
                        key.cancel();
                        client.close();
                    } else if (request.trim().equals("hello")) {
                        String id = UUID.randomUUID().toString();
                        key.attach(id);
                        String response = id + "\r";
                        client.write(encoder.encode(CharBuffer.wrap(response)));
                    } else if (request.trim().equals("time")) {
                        numTimeRequests++;
                        String response = "hi " + key.attachment() + " the time here is " + new Date() + "\r";
                        client.write(encoder.encode(CharBuffer.wrap(response)));
                    }
                }
            }
        }
        System.out.println("done");
    }
}


The above code is based on that found here. By reducing the number of threads being used, and not blocking, but rather relying on the OS to tell us when something is up, we can handle many more requests. I tested some code very similar to this to see how many connections I could handle. Windows XP proved its high reliability when, reproducibly and consistently, more than 12,000 connections led to blue screens of death! Time to move to Linux (Fedora Core). I had no problems creating 64,000 clients all simultaneously connected to my server. Let me rephrase... I didn't have problems having the clients simply connect and keep the connection open, but getting the server to also handle just 100 requests a second caused problems. Now 100 requests a second on a web server, on hardware which was a 5 year old cheap Dell laptop, sounds quite impressive to me. But on a server with 64,000 concurrent connections, that means each client making a request only about every ten minutes! Not very good for a VOIP application... The connection times also got worse, from around 3 milliseconds with 500 concurrent connections to 100 milliseconds with 60,000 concurrent connections.
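The "every ten minutes" figure is simple arithmetic, sketched below. The class and method names are invented for illustration; the inputs (64,000 clients, 100 requests a second, and the 4 minute interval used by the client listing further down) come from this article.

```java
class RequestRate {
    // Seconds between two requests of the same client when `clients`
    // connections share a total throughput of `requestsPerSecond`.
    static double secondsBetweenRequests(int clients, double requestsPerSecond) {
        return clients / requestsPerSecond;
    }

    // Total requests per second when each of `clients` connections sends
    // one request every `intervalSeconds`.
    static double requestsPerSecond(int clients, double intervalSeconds) {
        return clients / intervalSeconds;
    }

    public static void main(String[] args) {
        System.out.println(secondsBetweenRequests(64000, 100.0)); // seconds per client
        System.out.println(requestsPerSecond(64000, 240.0));      // load at a 4 minute interval
    }
}
```

64,000 clients at 100 requests a second works out to 640 seconds per client, a little over ten minutes, and a 4 minute interval corresponds to roughly 267 requests a second.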

So, perhaps I had better get to the point of this posting? A few days ago, I read about Voxer and Node.js on The Register. I had difficulty with this article. Why would anyone want to build a framework for Javascript on the server? I have developed plenty of rich clients, and have the experience to understand how to do rich client development. I have also developed plenty of rich internet apps (RIA), which use Javascript, and I can only say, it's not the best. I'm not some script kiddie or script hacker who doesn't know how to design Javascript code, and I understand the problems of Javascript development well. And I have developed lots and lots of server side code, mostly in Java, and appreciate where Java out-punches Javascript.

It seems to me that the developers of Node.js, and those following it and using it, don't understand server development. While writing in Javascript might initially be quicker, the lack of tools and libraries in comparison to Java makes it no contest in my opinion.

If I were a venture capitalist, and knew my money was being spent on application development based on newly developed frameworks, instead of extremely mature technologies, when the mature technologies suffice (as shown with the non-blocking server code above), I would flip out and can the project.

Maybe though, this is why I have never worked at a start up!

To wrap up, let's consider a few other points. Before anyone says that the performance of my example server was poor because it's just Java which is slow, let me comment. First of all, Java will always be faster than Javascript. Secondly, using top to monitor the server, I noticed that 50% of the CPU time was spent by the OS working out what events to throw, rather than Java handling those requests.

In the above server, everything runs on one thread. To improve performance, once a request comes in, it could be handed off to a thread pool to respond. This would help balance load across multiple cores, which is definitely required to make the server production ready.
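A minimal sketch of that hand-off, under a few assumptions: the selector thread has already read and decoded a complete request, the names HandOff and respondAsync are invented, and the "unknown command" reply is not part of the server above. A real server would also have to cope with writes that cannot complete immediately on a non-blocking channel, which this sketch simply retries in a loop.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SocketChannel;
import java.nio.charset.Charset;
import java.util.Date;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

class HandOff {
    static final Charset UTF8 = Charset.forName("UTF-8");

    // Called from the selector thread once a request has been decoded.
    // The reply is built and written on a pool thread, so the selector
    // thread can go straight back to select().
    static Future<?> respondAsync(ExecutorService pool,
                                  final SocketChannel client,
                                  final String request) {
        return pool.submit(new Runnable() {
            public void run() {
                String response = "time".equals(request.trim())
                        ? "the time here is " + new Date() + "\r"
                        : "unknown command\r"; // assumed fallback reply
                ByteBuffer out = UTF8.encode(response);
                try {
                    while (out.hasRemaining()) {
                        client.write(out); // may write only part of the buffer
                    }
                } catch (IOException ex) {
                    // the client went away; nothing more to do in this sketch
                }
            }
        });
    }
}
```

The pool itself could be created once at start-up, for example with Executors.newFixedThreadPool sized to the number of cores.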

While I'm at it, here is a quote from Node JS's home page:

"But what about multiple-processor concurrency? Aren't threads necessary to scale programs to multi-core computers? Processes are necessary to scale to multi-core computers, not memory-sharing threads. The fundamentals of scalable systems are fast networking and non-blocking design—the rest is message passing. In future versions, Node will be able to fork new processes (using the Web Workers API ) which fits well into the current design."

Actually, I'm not so sure... Java on Linux can spread threads across cores, so individual processes are not actually required. And the above statement just proves that Node JS is not mature for building really professional systems - I mean come on, no threading support?!

So, in the interests of completeness, here is the client app I used to connect to the server:

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.UnknownHostException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Random;
import java.util.Timer;
import java.util.TimerTask;

public class Client {

    private static final int NUM_CLIENTS = 3000;

    static Timer serverCallingTimer = new Timer("servercaller", false);

    static Random random = new Random();

    /**
     * this client is asynchronous, because it does not wait for a full response before
     * opening the next socket.
     */
    public static void main(String[] args) throws UnknownHostException, IOException, InterruptedException {

        final InetSocketAddress endpoint = new InetSocketAddress("192.168.1.103", 30032);
        System.out.println(new SimpleDateFormat("HH:mm:ss.SSS").format(new Date()) + " Starting async client");

        long start = System.nanoTime();
        for (int i = 0; i < NUM_CLIENTS; i++) {
            startConversation(endpoint);
        }

        System.out.println(new SimpleDateFormat("HH:mm:ss.SSS")
                .format(new Date())
                + " Done, averaging "
                + ((System.nanoTime() - start) / 1000000.0 / NUM_CLIENTS)
                + "ms per call");
    }

    protected static void startConversation(InetSocketAddress endpoint) throws IOException {
        final Socket s = new Socket();
        s.connect(endpoint, 0/*no timeout*/);
        s.getOutputStream().write(("hello\r").getBytes("UTF-8")); //protocol dictates \r is end of command
        s.getOutputStream().flush();

        //read response
        String str = readResponse(s);
        System.out.println("New Client: Session ID " + str);

        //send a request at regular intervals, keeping the same socket! eg VOIP
        //we cannot use this thread, it's the main one which created the socket
        //simply create another task to be carried out by the scheduler at a later time

        //the interval below is 4 minutes, otherwise the server gets REALLY slow handling
        //so many requests. This is equivalent to ~260 reqs/sec

        serverCallingTimer.scheduleAtFixedRate(
                new ConversationContainer(s, str),
                random.nextInt(240000/*in the next 4 mins*/),
                240000L/*every 4 mins*/);
    }

    private static String readResponse(Socket s) throws IOException {
        InputStream is = s.getInputStream();
        int curr = -1;
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        while ((curr = is.read()) != -1) {
            if (curr == 13) break; //protocol dictates a carriage return (\r) is the end of a response
            baos.write(curr);
        }
        return baos.toString("UTF-8");
    }

    private static class ConversationContainer extends TimerTask {
        Socket s;
        String id;

        public ConversationContainer(Socket s, String id) {
            this.s = s;
            this.id = id;
        }

        @Override
        public void run() {
            try {
                s.getOutputStream().write("time\r".getBytes("UTF-8")); //protocol dictates \r is end of command
                s.getOutputStream().flush();

                String response = readResponse(s);

                if (random.nextInt(1000) == 0) {
                    //log roughly one response in a thousand; logging everything would kill our server!
                    System.out.println(id + " - server time is '" + response + "'");
                }

            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
}

 

 

From http://blog.maxant.co.uk/pebble/2011/03/05/1299360960000.html

Published at DZone with permission of Ant Kutschera, author and DZone MVB.

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)

Comments

Jose Maria Arranz replied on Mon, 2011/03/07 - 4:25am

Ant, try removing your -Xms/-Xmx options and adding -Xss64k (it is hard to explain, but I think the more memory given to -Xms/-Xmx, the less is left for thread stacks). I tried your example and allocated around 23000 concurrent threads in a 32-bit JVM on Windows XP (I'm sure this number can be bigger on 64-bit). 23000 concurrent clients doing something is a big number in my opinion :)

And you can always use a thread pool to address a bigger number of connections instead of 1-thread per connection.
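The thread pool variant mentioned here (and in the article) might look roughly like this. It is only a sketch: the class name PooledServer, the pool size passed in by the caller, and the placeholder handler that replies "ok\r" and closes the socket are all assumptions made for illustration.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class PooledServer implements Runnable {
    private final ServerSocket ss;
    private final ExecutorService pool;

    PooledServer(ServerSocket ss, int poolSize) {
        this.ss = ss;
        this.pool = Executors.newFixedThreadPool(poolSize);
    }

    public void run() {
        try {
            while (!Thread.interrupted()) {
                final Socket s = ss.accept();  // blocks until a client connects
                pool.execute(new Runnable() {  // a pooled thread services the socket
                    public void run() { handle(s); }
                });
            }
        } catch (IOException ex) {
            // accept() fails once the server socket is closed: treat as shutdown
        } finally {
            pool.shutdown();
        }
    }

    // Placeholder handler (an assumption): replies "ok\r" and closes the socket.
    static void handle(Socket s) {
        try {
            OutputStream out = s.getOutputStream();
            out.write("ok\r".getBytes("UTF-8"));
            out.flush();
        } catch (IOException ex) {
            // the client went away
        } finally {
            try { s.close(); } catch (IOException ignored) { }
        }
    }
}
```

The pool caps the number of threads, so connections beyond the pool size wait in the executor's queue instead of each getting a thread of its own.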

In my opinion the single-magic-non-blocking thread approach is deeply wrong, I've written too much about this topic here, in a real system/app you are hardly going to effectively process 23000 concurrent users doing something useful (for instance some DB actions), this number may be interesting in Comet where most of the clients are stalled waiting for something (and threads are mostly stopped).

You cannot get rid of threads. In NIO servers, for instance Netty, end user code is executed by a conventional thread pool, because a NIO thread CANNOT be delegated to end user code: almost anything done by end user code can block the NIO thread, ruining the performance of the complete server.

NIO and all-in-the-same-thread processing can be fine if you have absolute control of the code and all tasks are extremely non-blocking, and as you are going to discover in the Java example of my article, extremely non-blocking is a myth (at least in Java). And by the way, modern NIO servers are multi-threaded today, to use the cores as much as possible.

People of Node.js are realizing this, Web Workers may come to rescue Node.js, but Web Workers is basically the same as... threads!!

 

Jose Maria Arranz replied on Mon, 2011/03/07 - 4:42am

By the way, use AtomicInteger, otherwise numLive is not reliable (anyway it does not change very much the results).

    static AtomicInteger numLive = new AtomicInteger( 0 );

    ...

     numLive.incrementAndGet();

    ...

     numLive.decrementAndGet();
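Put together, the AtomicInteger version of the counter could look like the sketch below. The class name LiveCounter and the runThreads helper are made up for illustration; unlike the article's test it starts a fixed number of threads and waits for them, so that it terminates.

```java
import java.util.concurrent.atomic.AtomicInteger;

class LiveCounter {
    static final AtomicInteger numLive = new AtomicInteger(0);

    // Starts `n` short-lived threads that each bump the counter up and then
    // down again, waits for all of them, and returns the final counter value.
    static int runThreads(int n) throws InterruptedException {
        Thread[] threads = new Thread[n];
        for (int i = 0; i < n; i++) {
            threads[i] = new Thread(new Runnable() {
                public void run() {
                    numLive.incrementAndGet(); // atomic, unlike numLive++
                    Thread.yield();
                    numLive.decrementAndGet();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        return numLive.get();
    }
}
```

Because every increment is matched by a decrement and both are atomic, the counter is back at zero once all threads have finished.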

 

Sylvain Brocard replied on Mon, 2011/03/07 - 8:01am

If you want to understand why more heap means less threads: outofmemoryerror-unable-to-create-new-native-thread

Jose Maria Arranz replied on Mon, 2011/03/07 - 8:25am in response to: Sylvain Brocard

Right I was supposing something very similar to this equation :)

Xmx + MaxPermSize + (Xss * number of threads) = max memory for a process on the OS

and sure, max memory for a process on the OS = 2 GB in 32-bit JVMs

Thanks!
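Rearranging the equation quoted above gives a rough upper bound on the number of threads. The sketch below only does that arithmetic; the class name, the 64 MB permgen size and the 64 KB stack size are assumed figures, and a real JVM needs native memory for other things too, so actual limits will be lower.

```java
class ThreadBudget {
    // Rough upper bound on the thread count:
    // (process limit - Xmx - MaxPermSize) / Xss, all sizes in bytes.
    static long maxThreads(long processLimit, long xmx, long maxPermSize, long xss) {
        return (processLimit - xmx - maxPermSize) / xss;
    }

    public static void main(String[] args) {
        long KB = 1024L;
        long MB = 1024L * KB;
        // Assumed: 2 GB process limit, 64 MB permgen, 64 KB thread stacks.
        System.out.println(maxThreads(2048 * MB, 64 * MB, 64 * MB, 64 * KB));   // 64 MB heap
        System.out.println(maxThreads(2048 * MB, 1024 * MB, 64 * MB, 64 * KB)); // 1 GB heap
    }
}
```

With these assumed figures the bound drops from 30720 threads with a 64 MB heap to 15360 with a 1 GB heap, which matches the direction of the effect Ant saw: more heap, fewer threads.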

 

Matt Olsen replied on Mon, 2011/03/07 - 10:36am

Very good point about server side programming with JavaScript and JavaScript in general. Going forward, no programming language will find a meaningful, enduring place without some way of easily handling threads.  We're entrenched in JavaScript on the browser due to some unfortunate history, and threading is a weak spot that needs to be addressed there as well.  And forking is so 90s.

Ant Kutschera replied on Mon, 2011/03/07 - 4:00pm

Hi guys, thanks for the good comments.

I too think 20 thousand threads is a huge amount for a server, but then I have no experience of building a server for say VOIP where I can imagine you might want something like that?  Not sure really, what could be the other options?

The nice thing about your -Xss option is that it shows that you can have a large number of threads in a server, so there is less need to go to the OS (event) based solution.

What I really don't get is what possesses someone to think "I know, I'm not getting anywhere with C++ or Java or any other well supported mature platform, so I'll build something brand new based on Javascript..."

If the requirement was to have an API like this (taken from nodejs.org):

http.createServer(function (req, res) {
res.writeHead(200, {'Content-Type': 'text/plain'});
res.end('Hello World\n');
}).listen(8124, "127.0.0.1");
console.log('Server running at http://127.0.0.1:8124/');

 

then why not build that in C, C++, .NET or Java, passing in a function pointer or anonymous inner class?

I don't like flaming, but people who think like this drive me nuts.  What I find really dangerous about node.js is that non-programmers who know Javascript (ie people who only program PHP front ends which connect directly to MySQL and hack their Javascript) will pick it up thinking they can now pretend to be real programmers and build non-maintainable, non-scalable solutions, wasting tons of money along the way.

</rant> :o)

Jose Maria Arranz replied on Mon, 2011/03/07 - 4:08pm in response to: Ant Kutschera

Diversity is fine, and Node.js is ok for people who like JavaScript way of life in server.

JavaScript on the server is not new: many years ago it was tried by Netscape Server, JScript was always an option in the Microsoft ASP world, and Aptana's defunct technology (I don't remember the name) was based on JS on top of Firefox.

That said, I don't buy and I don't believe in the "extreme scalability with the magic thread" advertisements of the Node.js crew; in my opinion it is not fair play.

 

Serguei Mourachov replied on Tue, 2011/03/08 - 12:01am

Ant, in your NIO example you have to explicitly remove processed keys from the "keys" set. Otherwise selector.select() will always return immediately, eating your CPU cycles. This is quite a common mistake for people who are new to NIO :)

Mike P(Okidoky) replied on Tue, 2011/03/08 - 12:40am

Instead of people making assumptions about whether it's better to spawn thread, queue tasks in a thread pool executor, and/or to utilize nio, would it not be smarter to have an algorithm that does everything, but tweaks towards the best blend by dynamically adjusting the balances at runtime while measuring overall through put as it moves processing around?

Ant Kutschera replied on Tue, 2011/03/08 - 5:22am in response to: Serguei Mourachov

Hi Serguei,

With key.cancel(), or how?  Thanks for the tip!  That could explain why it slows down as I add more connections.

Ant Kutschera replied on Tue, 2011/03/08 - 5:49am

On http://www.java2s.com/Code/Java/Network-Protocol/Nonblockserver.htm and http://www.java2s.com/Code/Java/Network-Protocol/FingerServer.htm, there is no call to close the key - are these examples wrong?

Ant Kutschera replied on Tue, 2011/03/08 - 6:04am

If I call key.cancel(), then no subsequent calls work.  The client thread blocks while waiting for a response to the "time" request, and the key isn't selected on the server side.

 

Are you sure about cancelling the key? Or should that only be done when closing the connection? That's what happens on line 50 of the NonBlockingServer2 listing.

Serguei Mourachov replied on Tue, 2011/03/08 - 12:22pm

Ant, replace

for (SelectionKey key : keys) { //... key processing code }

with

for (Iterator<SelectionKey> it = keys.iterator(); it.hasNext();) { SelectionKey key = it.next(); it.remove(); //...key processing code }

Notice, in both your examples  keys are removed from the selectedKeys set
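For reference, a self-contained sketch of a select loop that removes each key as it is processed. It is not the article's server: the class name KeyRemovalServer, the echo behaviour and the maxEvents parameter (used only so the loop terminates) are assumptions made to keep the example small.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

class KeyRemovalServer {
    // Accepts connections on an already bound `server` and echoes whatever
    // clients send, until `maxEvents` key events have been processed.
    // Returns the number of events handled.
    static int serve(ServerSocketChannel server, int maxEvents) throws IOException {
        Selector selector = Selector.open();
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);
        ByteBuffer buffer = ByteBuffer.allocate(512);
        int handled = 0;
        try {
            while (handled < maxEvents) {
                selector.select(); // blocks until at least one channel is ready
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove(); // the selector never clears this set itself
                    handled++;
                    if (key.isAcceptable()) {
                        SocketChannel client = server.accept();
                        if (client != null) {
                            client.configureBlocking(false);
                            client.register(selector, SelectionKey.OP_READ);
                        }
                    } else if (key.isReadable()) {
                        SocketChannel client = (SocketChannel) key.channel();
                        buffer.clear();
                        if (client.read(buffer) == -1) { // client closed the connection
                            key.cancel();
                            client.close();
                        } else {
                            buffer.flip();
                            while (buffer.hasRemaining()) {
                                client.write(buffer); // echo the bytes back
                            }
                        }
                    }
                }
            }
        } finally {
            selector.close();
        }
        return handled;
    }
}
```

Without the it.remove() call a key that was ready once stays in the selected set, so the loop keeps revisiting channels that have nothing new to offer.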

Ant Kutschera replied on Tue, 2011/03/08 - 2:18pm

Ah right, I wondered why they did that.  These examples need inline comments ;-)

Ant Kutschera replied on Thu, 2011/03/10 - 4:53pm in response to: Serguei Mourachov

Removing the keys improves the performance dramatically!

Andrew Pennebaker replied on Sat, 2011/03/12 - 8:27pm

Fail. The code is Java, NOT "Node JS Server side Java Script". And time servers haven't been relevant since the 80's.

Ant Kutschera replied on Sun, 2011/03/13 - 4:17am in response to: Andrew Pennebaker

@Andrew: I can see how at first glance, you might think the title isn't relevant, and you read this article hoping to find something about how great Javascript on the server is.

However, I think you have missed the point, perhaps because I have conveyed it badly.  That point is, that to go and develop a Javascript library / framework, which lets you program Javascript on the server, is in my opinion not a step forward for our society of developers.

For the press/media to get hold of Node JS and hail it as "the new PHP" or similar, is not only ridiculous, in my opinion, but also dangerous, because it encourages many inexperienced developers to go and learn a tool which will not serve them in the long run.  In my experience, Javascript is not suitable for developing complex server applications, partly because the language has no compiler to tell you when you have made mistakes, and partly because the language makes it hard/impossible for tools to fully support it, for example a lack of exhaustive auto-complete.  Your only choice is to work with a debugger, and those are not as good as the ones available for Java.

The reasons given for developing Node JS were that other options were unsatisfactory.  I am simply showing that, with a very well supported and mature tool set available, I don't think that statement is true.  While I haven't compared Javascript performance to that of Java, I feel it is not necessary to make my point.  I chose a random bit of functionality (a "time server" as you call it), so that the clients would make successive requests, not because I was trying to build something modern.  Ideally I would have built a VOIP server, which is why Node JS was built, but I feel that would be overkill and unnecessary for the point this article is trying to make.

On the other hand, this has been posted to the Java Lobby at DZone, so the fact that the code is Java makes it very relevant.  And I do discuss the content of the title, making that title relevant to the article.  In fact, the title is the very reason I wrote the article.

I hope that clears it up for you.

Peter Lind replied on Fri, 2011/03/25 - 1:46am

While I haven't compared Javascript performance to that of Java, I feel it is not necessary to make my point.

You're saying: don't take up node.js, it's a step backwards. You're saying that without actually testing node.js and what it might have to offer for this scenario. On the other hand, you work yourself through several Java examples, showing stats for performance and having ideas for upping the performance. You don't think this is just a little biased, a little unfair towards node.js? Would you take seriously any criticism of java that noted that it's probably the most anal retentive language ever made, if the author then stated he hadn't in fact programmed using java?

In short: why would you expect anyone but hardcore java fanboys to take this seriously in any way?

Ant Kutschera replied on Sun, 2011/05/15 - 3:43pm in response to: Peter Lind

Hi Peter,

None of this is about Java being better than node.js, or node.js being better than Java.  It's not about performance either.

What this article is, is me making the point that instead of taking the time to learn existing languages, libraries, APIs or frameworks, software people go to great lengths to reinvent the wheel.

I understand why they do it - to some degree I often do it myself, because being a developer has a large creative element to it.  Being creative and reinventing a new wheel is great fun!

But it's expensive. How many hours are people investing learning node.js?  How many companies are paying for their architects to adopt node.js, because they think it's cool, rather than because it can really solve their problems?

I don't need to learn node.js to make the statement that it is not as mature as Java, C++, .NET, etc.  That is a fact.  And because it's immature, it cannot have solved all the problems which Java, C++ and .NET have solved over the years.  As an example, when I wrote this post, node.js was single threaded and didn't support forking of parallel processes.  What about security?  What about a transaction manager for handling multiple resources?  What about deployment of standardised (web) applications?

So I could take the time to learn node.js to help me maybe solve one problem like a server needing to keep many connections open simultaneously.  But then later, when I need concurrency, or need a good security model, I'd be stuck, because the immature platform means I have to develop all that stuff myself, or wait for the community to do it, or help the community.

Alternatively, I could take the time to look into the more mature platforms out there, and choose a suitable one which can indeed solve my current problem, as well as solve more problems which I may face as I continue to develop my customer's software.

Mature platforms like Java, C++ and .NET are very hard to learn, but there is a reason: they can solve 99% of your problems.  Immature platforms are easy to learn, because the API is tiny.  But these platforms can't solve most of your problems.

Does that help any non Java fanboys to take this article more seriously?  How about if I agree that Java as a language contains some nasties?

Let's try a different way of looking at it.

You want a fast car.  Going out and buying a top of the range Mercedes will cost you a lot and it has so many features, that reading the manual will take too long.  You just want a little go kart.  So you go and buy the parts and make your little go kart.  It's great, for whizzing around a private track.

But then, your wife (the boss) wants to ride with you.  Oops, no second seat.  What about the kids?  She wants you to take the family to the beach, and you need to take the highway to get there.  Once there, the carpark isn't safe, so you need a way to lock up the vehicle.  The kids need in-car entertainment!  Doh!  Maybe it would have been easier (albeit not as fun) to make the investment in a fast Merc?  (Let's not get hung up on whether a Mercedes is better than a BMW or go kart please.)

Ant Kutschera replied on Mon, 2011/05/16 - 7:12am

For those interested in performance, I did find these interesting benchmarks and follow-ups:

http://www.olympum.com/java/quick-benchmark-java-nodejs/

http://www.olympum.com/java/java-aio-vs-nodejs/

That blog goes on to discuss the political problems which node.js needs to overcome, which I hadn't even thought of.  So as well as being slower in terms of performance (as shown in the above blog) and as well as suffering from immaturity, node.js has political issues too, as shown by the follow up postings you can find at the above two links.

In a few years, node.js might indeed be a contender.  But in the meantime, why are we, as a society, putting effort into this thing?  I don't get it.
