Mitch Pronschinske is the Lead Research Analyst at DZone. Researching and compiling content for DZone's research guides is his primary job. He likes to make his own ringtones, watches cartoons/anime, enjoys card and board games, and plays the accordion. Mitch is a DZone Zone Leader and has posted 2576 posts at DZone.

10 Scala Programming Pitfalls

01.21.2010

Scala is great for highly scalable, component-based applications that support concurrency and distribution.  It leverages aspects of object-oriented and functional programming.  This JVM-based language gained most of its clout when it was announced that Twitter was using it.  If used correctly, Scala can greatly reduce the code needed for applications.

For the Scala programmer, DZone has gathered these common code-writing pitfalls.  These tips come from Daniel Sobral, a Scala enthusiast who has managed Java software development projects and participated in the FreeBSD project.

1.  Syntactic Mistake

Thinking of "yield" as something like "return".  People will try:

for(i <- 0 to 10) {
  if (i % 2 == 0)
    yield i
  else
    yield -i
}

Where the correct expression is:

for(i <- 0 to 10)
  yield {
    if (i % 2 == 0)
      i
    else
      -i
  }
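One way to see why "yield" is not "return": the whole for/yield is a single expression that produces a collection, roughly what a call to map would give. The sketch below is only an illustration; the names YieldExample, signed, and mapped are made up for it.

```scala
object YieldExample {
  // The for/yield is one expression; its value is a new collection.
  val signed = for (i <- 0 to 10) yield {
    if (i % 2 == 0) i else -i
  }

  // Roughly the same computation written with map.
  val mapped = (0 to 10).map(i => if (i % 2 == 0) i else -i)
}
```

Because yield belongs to the for expression itself, it cannot appear inside the branches of the if.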

 

2. Misusage and Syntactic Mistake

Using scala.xml.XML.loadXXX for everything. This parser will try to access external DTDs, strip comments, and similar things. There is an alternative parser in scala.xml.parsing.ConstructingParser.fromXXX.

Also, forgetting spaces around the equal sign when handling XML. This:

val xml=<root/>

really means:

val xml.$equal$less(root).$slash$greater

This happens because operators are fairly arbitrary, and Scala relies on the fact that, in a valid identifier, alphanumeric characters must be separated from non-alphanumeric characters by an underscore. That is what lets it accept expressions such as "x+y" without treating them as a single identifier. Note that "x_+" is a valid identifier, though.

So, the way to write that assignment is:

val xml = <root/>
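The identifier rule described above can be tried without any XML. This is a small sketch; IdentifierExample and the value names are invented for illustration.

```scala
object IdentifierExample {
  // "x_+" is a single identifier: letters, an underscore, then operator characters.
  val x_+ = 41
  val y = 1

  // With spaces, this reads as the identifier x_+, the operator +, and y.
  val sum = x_+ + y

  // Without an underscore, x+y is three tokens even with no spaces.
  val x = 2
  val tight = x+y
}
```

The spaces around "=" and "+" matter for the same reason they matter in the XML assignment: operator characters that touch each other are read as one token.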



3.  Misusage Mistake

Using the trait Application for anything but the most trivial use.  The problem with:

object MyScalaApp extends Application {
  // ... body ...
}

is that the body executes inside the singleton object's initializer.  First, singleton initialization is synchronized, so your program can't interact with other threads while the body runs.  Second, the JIT compiler won't optimize it, making your program slower than necessary.

By the way, no interaction with other threads means you can forget about testing GUI or Actors with Application.
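A common way to avoid the problem is to define an explicit main method, so that the program body runs as an ordinary method call rather than during object initialization. A minimal sketch, in which the greeting helper is hypothetical and exists only to give the program something to do:

```scala
object MyScalaApp {
  // Hypothetical helper, present only so the example does something.
  def greeting(name: String): String = "Hello, " + name

  // The body runs inside main, not inside the object's initializer.
  def main(args: Array[String]): Unit = {
    println(greeting(args.headOption.getOrElse("world")))
  }
}
```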


4. Misusage Mistake

Trying to pattern match a regex against a string assuming the regex is not anchored:

val r = """(\d+)""".r
val s = "--> 5 <---"
s match {
  case r(n) => println("This won't match")
  case _ => println("This will")
}

The problem here is that, when pattern matching, Scala's Regex acts as if it were anchored at the beginning and end with "^" and "$".  The way to get that working is:

val r = """(\d+)""".r
val s = "--> 5 <---"
r findFirstIn s match {
  case Some(n) => println("Matches 5 to "+n)
  case _ => println("Won't match")
}

Or just make sure the pattern matches any prefix and suffix:

val r = """.*(\d+).*""".r
val s = "--> 5 <---"
s match {
  case r(n) => println("This will match the first group of r, "+n+", to 5")
  case _ => println("Won't match")
}
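In later Scala versions (2.10 and up) the standard library's Regex also offers unanchored, which keeps the extractor syntax while dropping the implicit anchors. The sketch below assumes such a version; RegexExample and firstNumber are names invented for illustration.

```scala
object RegexExample {
  // unanchored makes the extractor search inside the string
  // instead of requiring the whole string to match.
  val number = """(\d+)""".r.unanchored

  def firstNumber(s: String): Option[String] = s match {
    case number(n) => Some(n)
    case _         => None
  }
}
```

This also sidesteps a subtlety of the ".*" workaround: a greedy leading ".*" leaves only the last digit for the group, so a multi-digit number would be captured incompletely.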



5. Misusage Mistake

Thinking of var and val as fields.

Scala enforces the Uniform Access Principle, by making it impossible to refer to a field directly. All accesses to any field are made through getters and setters. What val and var actually do is define a field, a getter for that field, and, for var, a setter for that field.

Java programmers will often think of var and val definitions as fields, and are surprised to discover that they share a namespace with methods, so the names can't be reused. What shares the namespace is the automatically defined getter and setter, not the field. Many then look for a way to reach the field directly to get around that limitation, but there is none, or the Uniform Access Principle would be violated.

Another consequence of it is that, when subclassing, a val can override a def. The other way around is not possible, because a val adds the guarantee of immutability, and def doesn't.
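A short sketch of a val overriding a def; the class names are invented for illustration.

```scala
object UniformAccessExample {
  class Base {
    def id: Int = 1
  }

  // A val may override a def: callers cannot tell the difference,
  // and the val adds the guarantee that the value never changes.
  class Derived extends Base {
    override val id: Int = 2
  }

  // Accessed through the Base type, the overriding val is still used.
  val viaBase: Base = new Derived
}
```

Declaring "override def id" in a subclass of a class that has "val id" is rejected by the compiler, which is the asymmetry described above.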

There's no guideline on what to call private getters and setters when you need to override them for some reason. Scala compiler and library code often uses nicknames and abbreviations for private values, as opposed to fullyCamelNamingConventions for public getters and setters. Other suggestions include renaming, singletons inside an instance, or even subclassing. Examples of these suggestions:

Renaming

class User(val name: String, initialPassword: String) {
  // "lazy var" is not legal Scala: the salt is a lazy val, the stored password a var
  private lazy val salt = scala.util.Random.nextInt()
  private var encryptedPassword = encrypt(initialPassword, salt)

  private def encrypt(plainText: String, salt: Int): String = { ... }
  private def decrypt(encryptedText: String, salt: Int): String = { ... }

  def password = decrypt(encryptedPassword, salt)
  // the setter has to store the newly encrypted value
  def password_=(newPassword: String) = encryptedPassword = encrypt(newPassword, salt)
}

Singleton

class User(initialName: String, initialPassword: String) {
  private object fields {
    var name: String = initialName
    var password: String = initialPassword
  }
  def name = fields.name
  def name_=(newName: String) = fields.name = newName
  def password = fields.password
  def password_=(newPassword: String) = fields.password = newPassword
}

Alternatively, use a case class, which automatically defines methods for equality, hashCode, etc., which can then be reused:

class User(name0: String, password0: String) {
  // the field must be called password, since it is accessed as fields.password below
  private case class Fields(var name: String, var password: String)
  private object fields extends Fields(name0, password0)

  def name = fields.name
  def name_=(newName: String) = fields.name = newName
  def password = fields.password
  def password_=(newPassword: String) = fields.password = newPassword
}

Subclassing

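// Customer exposes name through an explicit getter and setter,
// so that a subclass can override the setter.
class Customer(name0: String) {
  private var _name = name0
  def name = _name
  def name_=(newName: String): Unit = _name = newName
}

class ValidatingCustomer(name0: String) extends Customer(name0) {
  require(name0.length >= 5, "too short")

  override def name_=(newName: String): Unit =
    if (newName.length < 5) sys.error("too short")
    else super.name_=(newName)
}

val cust = new ValidatingCustomer("xyz123")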



6. Misusage Mistake

Forgetting about type erasure. When you declare a class C[A], a trait T[A] or a function or method m[A], A is not present at run-time. That means, for instance, that any type parameter will actually be compiled as AnyRef, even though the compiler ensures, at compile time, that the types are respected.

It also means that you can't use type parameter A at run-time. For instance, this won't work:

def checkList[A](l: List[A]) = l match {
  case _ : List[Int] => println("List of Ints")
  case _ : List[String] => println("List of Strings")
  case _ => println("Something else")
}

At run-time, the List being passed doesn't have a type parameter. Also, List[Int] and List[String] will both become List[_], so only the first case will ever be called.

You can get around this, to some extent, by using the experimental feature Manifest, like this:

def checkList[A](l: List[A])(implicit m: scala.reflect.Manifest[A]) = m.toString match {
  case "int" => println("List of Ints")
  case "java.lang.String" => println("List of Strings")
  case _ => println("Something else")
}
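In Scala 2.10 and later, ClassTag is the usual replacement for Manifest in this role. The sketch below assumes such a version; ErasureExample and describeList are names invented for illustration, and the function returns a description instead of printing so that the result is easy to check.

```scala
import scala.reflect.{ClassTag, classTag}

object ErasureExample {
  // The context bound asks the compiler to pass along a ClassTag for A,
  // which carries the erased element type to run time.
  def describeList[A: ClassTag](l: List[A]): String = {
    val tag = classTag[A]
    if (tag == ClassTag.Int) "List of Ints"
    else if (tag == classTag[String]) "List of Strings"
    else "Something else"
  }
}
```

The tag reflects the static type at the call site, so even an empty List[Int] is reported as a list of Ints.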



7. Design Mistake

Careless use of implicits. Implicits can be very powerful, but care must be taken not to use implicit parameters of common types or implicit conversions between common types.

For example, making an implicit such as:

implicit def string2Int(s: String): Int = s.toInt

is a very bad idea because someone might use a string in place of an Int by mistake.  In cases where there's use for that, it's simply better to use a class:

case class Age(n: Int)
implicit def string2Age(s: String) = Age(s.toInt)
implicit def int2Age(n: Int) = new Age(n)
implicit def age2Int(a: Age) = a.n

This will let you freely combine Age with String or Int, but never String with Int.

Likewise, when using implicit parameters, never do something like:

case class Person(name: String)(implicit age: Int)

Not only will this make conflicts between implicit parameters more likely, but it might also result in an implicit age being passed unnoticed to something expecting an implicit Int that means something else. Again, the solution is to use specific classes.
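A minimal sketch of an implicit parameter with a dedicated type; DefaultAge, describe, and the surrounding object are hypothetical names chosen for illustration.

```scala
object ImplicitParamExample {
  // A dedicated wrapper type, so the implicit cannot be confused
  // with any other Int that happens to be in scope.
  final case class DefaultAge(years: Int)

  def describe(name: String)(implicit age: DefaultAge): String =
    name + " (" + age.years + ")"

  implicit val defaultAge: DefaultAge = DefaultAge(30)

  // The compiler fills in defaultAge here; an implicit Int in scope would be ignored.
  val described: String = describe("Ann")
}
```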

Another problematic implicit usage is being operator-happy with them. You might think "~" is the perfect operator for string matching, but others may use it for things like matrix equivalence, parser concatenation, etc. So, if you are going to provide them, make sure it's easy to isolate the scope of their usage.


8. Design Mistake

Badly designing equality methods. Specifically:

  • Trying to change "==" instead of "equals" (which gives you "!=" for free).

  • Defining it as

def equals(other: MyClass): Boolean

        instead of

override def equals(other: Any): Boolean
  • Forgetting to override hashCode as well, to ensure that if a == b then a.hashCode == b.hashCode (the reverse proposition need not be valid).

  • Not making it commutative: if a == b then b == a. Particularly think of subclassing -- does the superclass know how to compare against a subclass it doesn't even know exists?  Look up canEqual if needed.

  • Not making it transitive: if a == b and b == c then a == c.
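The points above can be put together in one small class. This is a sketch of the commonly used canEqual pattern, with a hypothetical Point class; the hash formula is just one reasonable choice.

```scala
object EqualityExample {
  class Point(val x: Int, val y: Int) {
    // Subclasses that add state override canEqual to keep equality symmetric.
    def canEqual(other: Any): Boolean = other.isInstanceOf[Point]

    // Takes Any, so it overrides equals rather than overloading it.
    override def equals(other: Any): Boolean = other match {
      case that: Point => (that canEqual this) && x == that.x && y == that.y
      case _           => false
    }

    // Equal points must produce equal hash codes.
    override def hashCode: Int = 31 * x + y
  }
}
```

Since "==" delegates to equals, overriding equals this way gives correct "==" and "!=" without touching either operator.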



9. Usage Mistake

On Unix/Linux/*BSD, giving your host a name (as returned by hostname) that is not declared in your hosts file. A telltale sign is that the following command doesn't work:

ping `hostname`

In such cases, neither fsc nor scala will work, though scalac will. That's because fsc stays running in the background and listens for connections on a TCP socket to speed up compilation, and scala uses it to speed up script execution.


10. Style Mistake

Using while. It has its uses, but, most of the time, a solution with a for-comprehension is better.

Speaking of for-comprehensions, using them to generate indices is a bad idea too. Instead of:

def matchingChars(string: String, characters: String) = {
  var m = ""
  for(i <- 0 until string.length)
    if ((characters contains string(i)) && !(m contains string(i)))
      m += string(i)
  m
}

Use:

def matchingChars(string: String, characters: String) = {
  var m = ""
  for(c <- string)
    if ((characters contains c) && !(m contains c))
      m += c
  m
}
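The same function can also be written without the mutable accumulator. The version below is a sketch of one alternative, not the article's own code: it keeps the characters of string that occur in characters, then drops repeats.

```scala
object MatchingCharsExample {
  // Keep characters that appear in `characters`, then remove duplicates,
  // preserving the order of first occurrence.
  def matchingChars(string: String, characters: String): String =
    string.filter(c => characters.exists(_ == c)).distinct
}
```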

If one needs to return an index, the pattern below can be used instead of iterating over indices. It might be better applied over a projection (Scala 2.7) or view (Scala 2.8), if performance is a concern.

def indicesOf(s: String, c: Char) = for {
  (sc, index) <- s.zipWithIndex
  if c == sc
} yield index
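A sketch of the view-based variant mentioned above, assuming Scala 2.8 or later; IndicesExample is a name invented for illustration. The view avoids building the intermediate collection of (character, index) pairs.

```scala
object IndicesExample {
  // The view pairs characters with indices lazily; only the final
  // list of matching indices is materialized.
  def indicesOf(s: String, c: Char): List[Int] =
    s.view.zipWithIndex
      .filter { case (sc, _) => sc == c }
      .map { case (_, index) => index }
      .toList
}
```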

 

Comments

Alex Miller replied on Fri, 2010/01/22 - 12:43pm

Thanks, nice list. As a Scala rookie, the one thing I found myself screwing up repeatedly (and getting no useful error message for) was forgetting to put the "=" in a function definition: def foo() { ... } instead of def foo() = { ... }

Christian Harms replied on Sat, 2010/01/23 - 4:11am

In the first example, why use the if construct?
 
for(i <- 0 to 10)
  yield i*(1-(i%2)*2)
 
Have fun. 

Cej Hah replied on Sun, 2010/01/24 - 4:26pm in response to: Christian Harms

Because no one wants to debug this line of utter garbage?  Consider writing code like writing English.  If you can't spell it out plainly, you shouldn't be doing it.

 

Geoffrey Bays replied on Fri, 2010/05/14 - 4:13pm

When playing around with Scala briefly, the thing that bit me was that all method parameters were immutable by default, whether var or val, and there was not a way to make them mutable (or is there?). So I would have to copy the immutable param to another var if I needed to return an altered value. This is undoubtedly safer, like the noxious 'const' in C++, but a shock for a Java programmer.
