.NET Zone is brought to you in partnership with:

James is an open source developer working on Play framework. His passion is making web development simpler, through using the right tools in the right way for the right job. James is a DZone MVB and is not an employee of DZone and has posted 14 posts at DZone. You can read more from them at their website. View Full User Profile

Comments Complement Code

11.14.2012
| 4575 views |
  • submit to reddit

 

Today I read this quote:

Good code is its own best documentation. As you’re about to add a comment, ask yourself, ‘How can I improve the code so that this comment isn’t needed?’

I just want to say, it's a load of rubbish. Take a look at the following code:

def toCharArray(
     decoder: CharsetDecoder = Charset.forName("UTF-8").newDecoder()
  ): Enumeratee[Array[Byte], Array[Char]] = new Enumeratee[Array[Byte],Array[Char]] {
 
  def step[A](inner: Iteratee[Array[Char], A], partialChar: Option[Array[Byte]] = None)(in: Input[Array[Byte]]):
      Iteratee[Array[Byte], Iteratee[Array[Char], A]] = {
    in match {
      case EOF => partialChar.map(_ => Error[Array[Byte]]("EOF encountered mid character", EOF))
        .getOrElse(Done[Array[Byte],Iteratee[Array[Char],A]](inner, EOF))
 
      case Empty => Cont(step(inner, partialChar))
 
      case El(data) => {
        val charBuffer = CharBuffer.allocate(data.length + 1)
        val byteBuffer = partialChar.map({ leftOver =>
          val buffer = ByteBuffer.allocate(leftOver.length + data.length)
          buffer.mark()
          buffer.put(leftOver).put(data)
          buffer.reset()
          buffer
        }).getOrElse(ByteBuffer.wrap(data))
 
        decoder.decode(byteBuffer, charBuffer, false)
 
        val leftOver = if (byteBuffer.limit() > byteBuffer.position()) {
          Some(byteBuffer.array().drop(byteBuffer.position()))
        } else None
 
        val decoded = charBuffer.array().take(charBuffer.position())
        val input = if (decoded.length == 0) Empty else El(decoded)
 
        inner.pureFlatFold {
          case Step.Cont(k) => Cont(step(k(input), leftOver))
          case _ => Done(inner, Input.Empty)
        }
      }
    }
  }
 
  def applyOn[A](inner: Iteratee[Array[Char], A]) = Cont(step(inner))
}

If you know iteratees and you know Scala, it's pretty obvious what this does. It converts a stream of byte arrays into a stream of char arrays, taking into the consideration the possibility that one character may be split across multiple byte arrays. Structurally it is purely functional, however the actual decoding is not, it uses the high performance Java CharBuffer and ByteBuffer classes to do the decoding, which are mutable, and arguably this is necessary since this enumeratee is a place where performance matters. I wrote it, and in my opinion it's not badly written, though if you can see anything that could be improved, please let me know.

So, tell me, on line 14, why do I allocate a char buffer of the incoming byte array length plus one? What is the reason for the plus one? When I first wrote it, I didn't have the plus one there, I didn't think it was needed. You see, when converting an array of bytes to an array of UTF-16 Java characters, at most, 8 bytes will become 8 characters, right? 8 bytes could become 4 characters, if those characters were multi byte characters, the number of chars needed might be less than the number of bytes being decoded, but it can never be more, right? One byte can't become multiple UTF-16 chars, so why would I ever need 9 characters for 8 bytes?

Now maybe you might criticise my code because the +1 is actually a magic number, and if I gave it a name, then that would explain everything. Well, let's give it a name, and reasonable a name (I could give it a two hundred character long name and that might explain everything but you can hardly call two hundred character long variable names good code. Well, maybe you can in Java, but not in Scala). So I'll create a val PotentialMultiCharOffset = 1. Does that help you at all? Do you know what it's for? Why is it 1? Why is it added, why don't I multiply by 2? If you do know the reason behind it, then hats off to you, you are a genius. But for the rest of us, we don't know. It as only after I wrote comprehensive unit tests for the code that I found the bug (I've heard other people say that unit tests are not necessary for functional code, another fallacy).

Let me show you the comment that is above that line of code:

// The +1 here is very important, it is there for the case when there are
// 3 bytes of a 4 byte character in the partialChar array, and so this data
// should contain the final byte, but that one byte will become 2 Chars.
val charBuffer = CharBuffer.allocate(data.length + 1)

Understand it now? Was there any way that I could have written the code that would have explained that? Was there any variable name that I could have given it that would have explained it better than that comment? No, it just needed a simple comment explaining its purpose. Without the comment, you'd be sitting there wondering why on earth I had added 1, you might have even thought "this is allocating more memory than needed, I'll just optimise this" and you would have injected a bug. In this case, a comment is aptly suited to making the code understandable. The comment complements the code. It is necessary and the best way of describing it.

And the fact is that we come across things every day where some really obscure edge case means we have to do some otherwise obscure behaviour. Maybe in a world of higher order logic this isn't the case, but we work in a world of far less than perfect protocols with edge cases that are impossible to memorise, where optimising an equals comparison to return early when you encounter the first character that isn't equal is a security vulnerability, where bugs in other software that our software has to interface to means we have to do counter intuitive things to work around their issues, and where some things are just plain hard to get your heard around, and sometimes a little plain English (or whatever language you speak) just does that little bit to helping you or the next developer make sense of it all.

Comments complement code. Good code does not negate the need of comments. Good code includes comments where comments are needed.


Published at DZone with permission of James Roper, author and DZone MVB. (source)

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)

Comments

Cristian Vasile... replied on Thu, 2012/11/15 - 3:10pm

Good code is its own best documentation. As you’re about to add a comment, ask yourself, ‘How can I improve the code so that this comment isn’t needed?’
Good code does not negate the need of comments. Good code includes comments where comments are needed.
Would you be surprised to tell you that I agree with both statements :) ? The first statement doesn't say you should never use comments.

I see comments as an indicator of code that is hard to read (by itself). Many times code can be refactored to remove the comment:

Why have this:
Date date; // birth date
when you can (and should) have this instead:
Date birthDate;
The need for performance often makes the code harder to read - and in need of comments. Which is one of the reasons why you should use a profiler before making performance optimizations - if the code is not called often enough, the performance optimization doesn't really matter, but you end-up with code that is hard to read and maintain. It's a different story if you are writing a library, of course (you don't know how often the code is going to be invoked).

Non-obvious things, like you have in your example, needs comments.
But too many times comments are used to document bad code, instead of fixing it.

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.