Jason Baldridge is an associate professor of computational linguistics at the University of Texas at Austin. He has been actively involved in open source software development for over 14 years (including founding OpenNLP), and regularly codes in Scala, Java, Python, R and Perl.

First steps in Scala for beginning programmers, Part 2

12.23.2011

This is the second in a planned series of tutorials on programming in Scala for first-time programmers, with specific reference to my Fall 2011 course Introduction to Computational Linguistics. You can see the other tutorials here on this blog; they are also listed on the course’s links page.

This tutorial focuses on Tuples and Lists, which are two constructs for working with groups of elements. You won’t get much done without the latter, and the former are so incredibly useful that you will probably find yourself using them a lot.

Tuples

We saw in the previous tutorial how a single value can be assigned to a variable and then used in various contexts. A Tuple is a generalization of that: a collection of two, three, four, or more values. Each value can have its own type.

scala> val twoInts = (3,9)
twoInts: (Int, Int) = (3,9)

scala> val twoStrings = ("hello", "world")
twoStrings: (java.lang.String, java.lang.String) = (hello,world)

scala> val threeDoubles = (3.14, 11.29, 1.5)
threeDoubles: (Double, Double, Double) = (3.14,11.29,1.5)

scala> val intAndString = (7, "lucky number")
intAndString: (Int, java.lang.String) = (7,lucky number)

scala> val mixedUp = (1, "hello", 1.16)
mixedUp: (Int, java.lang.String, Double) = (1,hello,1.16)

The elements of a Tuple can be recovered in a few different ways. One way is to use a Tuple when initializing some variables, each of which takes on the value of the corresponding position in the Tuple on the right side of the equal sign.

scala> val (first, second) = twoInts
first: Int = 3
second: Int = 9

scala> val (numTimes, thingToSay, price) = mixedUp
numTimes: Int = 1
thingToSay: java.lang.String = hello
price: Double = 1.16

Scala peels off the values and assigns them to each of the single variables. This becomes very useful in the context of functions that return Tuples. For example, consider a function that provides the left and right edges of a range when you give it the midpoint of the range and the size of the interval on each side of the midpoint.

scala> def rangeAround(midpoint: Int, size: Int) = (midpoint - size, midpoint + size)
rangeAround: (midpoint: Int, size: Int)(Int, Int)

Since rangeAround returns a Tuple (specifically, a Pair), we can call it and set variables for the left and right directly from the function call.

scala> val (left, right) = rangeAround(21, 3)
left: Int = 18
right: Int = 24

Another way to access the values in a Tuple is by position, using “_n” where n is the position of the item you want, counting from 1.

scala> print(mixedUp._1)
1
scala> print(mixedUp._2)
hello
scala> print(mixedUp._3)
1.16

The syntax on this is a bit odd, but you’ll get used to it.
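Here is a small sketch that puts the two ways of getting at Tuple values side by side. The function quotientAndRemainder and the variable names are hypothetical, made up for this illustration rather than taken from the course material.

```scala
// Hypothetical helper (not from the tutorial): whole-number division that
// returns the quotient and the remainder together as a Pair.
def quotientAndRemainder(dividend: Int, divisor: Int): (Int, Int) =
  (dividend / divisor, dividend % divisor)

// Way 1: unpack both positions into named variables in one step.
val (quotient, remainder) = quotientAndRemainder(17, 5)   // 3 and 2

// Way 2: keep the Tuple whole and pick out positions with _1 and _2.
val divided = quotientAndRemainder(17, 5)
val sameQuotient = divided._1    // 3
val sameRemainder = divided._2   // 2
```

Unpacking tends to read better when you want every value, while _1 and _2 are handy when you only need one of them.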

Tuples are an amazingly useful feature in a programming language. You’ll see some examples of their utility as we progress.

Lists

Lists are collections of ordered items that will be familiar to anyone who has done any shopping. Tuples are obviously related to lists, but they are less versatile in that they must be created in a single statement, they have a bounded length (at most 22 elements), and they don’t support operations that perform computations on all of their elements.

In Scala, we can create lists of Strings, Ints, and Doubles (and more).

scala> val groceryList = List("apples", "milk", "butter")
groceryList: List[java.lang.String] = List(apples, milk, butter)

scala> val odds = List(1,3,5,7,9)
odds: List[Int] = List(1, 3, 5, 7, 9)

scala> val multinomial = List(.2, .4, .15, .25)
multinomial: List[Double] = List(0.2, 0.4, 0.15, 0.25)

We see that Scala responds that a List has been created, along with brackets around the type of the elements it contains. So, List[Int] is read as “a List of Ints” and so on. This is to say that List is a parameterized data structure: it is a container that holds elements of specific types. We’ll see how knowing this allows us to do different things with Lists parameterized by different types.

We can also create Lists with mixtures of types.

scala> val intsAndDoubles = List(1, 1.5, 2, 2.5)
intsAndDoubles: List[Double] = List(1.0, 1.5, 2.0, 2.5)

scala> val today = List("August", 23, 2011)
today: List[Any] = List(August, 23, 2011)

Types are sometimes autoconverted, such as converting Ints to Doubles for intsAndDoubles, but often there is no obvious generalizable type. For example, today is a List[Any], which means it is a List of Anys — and Any is the most general type in Scala, the supertype of all types. It’s sort of like saying “Yeah, I have a list of… well, you know… stuff.”
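One way to convince yourself of what Scala inferred is to assign a list to a second variable with the type written out explicitly. This is only a sketch; the names widened and stuff are made up for the illustration.

```scala
// Ints mixed with Doubles: the Int literals are converted to Doubles.
val intsAndDoubles = List(1, 1.5, 2, 2.5)
val widened: List[Double] = intsAndDoubles   // compiles: every element is a Double

// A String mixed with Ints: there is no specific type that covers both,
// so the list can only be treated as a list of very general "stuff".
val today = List("August", 23, 2011)
val stuff: List[Any] = today                 // compiles: everything is an Any

// The price of generality: an element pulled back out is only known to be an Any.
val month: Any = today(0)
```

If you swap the annotation on stuff to List[Int], the compiler should reject it, because "August" is not an Int.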

Lists can also contain Lists (and Lists of Lists, and Lists of Lists of Lists…).

scala> val embedded = List(List(1,2,3), List(10,30,50), List(200,400), List(1000))
embedded: List[List[Int]] = List(List(1, 2, 3), List(10, 30, 50), List(200, 400), List(1000))

The type of embedded is List[List[Int]], which you can read as “a List of Lists of Ints.”

List methods

Okay, so now that we have some lists, what can we do with them? A lot, actually. One of the most basic properties of a list is its length, which you can get by using “.length” after the variable that refers to the list.

scala> groceryList.length
res19: Int = 3

scala> odds.length
res20: Int = 5

scala> embedded.length
res21: Int = 4

Notice that the length of embedded is 4, which is the number of Lists it contains (not the number of elements in those lists).
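If you do want the number of elements inside the inner lists, here is one possible sketch. It uses flatten, a List method not otherwise covered in this tutorial, which joins the inner lists into a single list.

```scala
val embedded = List(List(1, 2, 3), List(10, 30, 50), List(200, 400), List(1000))

val outerLength = embedded.length        // 4: the number of inner Lists
val firstInnerLength = embedded(0).length // 3: the elements of the first inner List

// flatten turns the List[List[Int]] into a plain List[Int].
val flattened = embedded.flatten         // List(1, 2, 3, 10, 30, 50, 200, 400, 1000)
val totalElements = flattened.length     // 9
```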

The notation variable.method indicates that you are invoking a function that is specific to the type of that variable on the value in that variable. Okay, that was a mouthful. Scala is an object-oriented language, which means that every value has a set of actions that comes with it. Which actions are available depends on its type. So, above, we called the length method that is available to Lists on each of the list values given above. You may not have realized it in the previous tutorial, but you were using methods when you added Ints or concatenated Strings; it’s just that Scala allows us to go without “.” and parentheses in certain cases. If we don’t drop them, here’s what it looks like.

scala> (2).+(3)
res25: Int = 5

scala> "Portis".+("head")
res26: java.lang.String = Portishead

What is going on is that Ints have a method called “+” and Strings have a different method called “+”. They could have been called “bill” and “bob”, but that would be harder to remember, among other things. Ints have other methods, such as “-”, “*”, and “/”, that Strings don’t have. (Note: I’m now returning to omitting the “.” and parentheses.)

scala> 5-3
res27: Int = 2

scala> "walked" - "ed"
<console>:8: error: value - is not a member of java.lang.String
"walked" - "ed"

Scala complains that we tried to use the “-” method on a String, since Strings don’t have such a method. On the other hand, Ints don’t have a method called length, while Strings do.

scala> 5.length
<console>:8: error: value length is not a member of Int
5.length
^

scala> "walked".length
res31: Int = 6

With Strings, length returns the number of characters, whereas with Lists, it is the number of elements. The String length method could have been called “numberOfCharacters”, but “length” is easier to remember and it allows us to treat Strings like other sequences and think of them similarly.
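To underline that the two notations are the same method call written two ways, here is a small sketch. It also uses max, a method on Ints that this tutorial does not otherwise cover, to show that methods with ordinary names can be written in the operator style too.

```scala
// Symbolic method names: dot-and-parentheses style versus operator style.
val dotMinus = (7).-(2)
val infixMinus = 7 - 2               // both are 5

val dotConcat = "Portis".+("head")
val infixConcat = "Portis" + "head"  // both are "Portishead"

// The same works for a method with an ordinary name, such as max on Ints.
val dotMax = (3).max(9)
val infixMax = 3 max 9               // both are 9
```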

Let’s return to Lists and what we can do with them. “Addition” of two lists is their concatenation and is indicated with “++”.

scala> val evens = List(2,4,6,8)
evens: List[Int] = List(2, 4, 6, 8)

scala> val nums = odds ++ evens
nums: List[Int] = List(1, 3, 5, 7, 9, 2, 4, 6, 8)

We can add a single item to the front of a List (that is, prepend it) with “::”.

scala> val zeroToNine = 0 :: nums
zeroToNine: List[Int] = List(0, 1, 3, 5, 7, 9, 2, 4, 6, 8)

We can also sort a list with sorted, reverse it with reverse, and do both in sequence.

scala> zeroToNine.sorted
res42: List[Int] = List(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)

scala> zeroToNine.reverse
res43: List[Int] = List(8, 6, 4, 2, 9, 7, 5, 3, 1, 0)

scala> zeroToNine.sorted.reverse
res44: List[Int] = List(9, 8, 7, 6, 5, 4, 3, 2, 1, 0)

What the last line says is “take zeroToNine, get a new sorted list from it, and then reverse that list.” Notice that calling these functions never changes zeroToNine itself! That is because List is immutable: you cannot change it, so all of these operations return new Lists. This property of Lists brings with it many benefits that we’ll return to later.
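You can check the immutability claim for yourself with a sketch like the following, which saves each result under a new name and then looks at the original again. The names ascending, descending and withTen are made up for the illustration.

```scala
val zeroToNine = List(0, 1, 3, 5, 7, 9, 2, 4, 6, 8)

// Each operation hands back a brand new List.
val ascending = zeroToNine.sorted            // List(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
val descending = zeroToNine.sorted.reverse   // List(9, 8, 7, 6, 5, 4, 3, 2, 1, 0)
val withTen = zeroToNine ++ List(10)         // the original elements followed by 10

// After all of that, zeroToNine is still List(0, 1, 3, 5, 7, 9, 2, 4, 6, 8).
val stillTheSame = zeroToNine
```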

Note: immutability is different from the val/var distinction. It is common to think that a val variable is immutable, but that is not quite right: a val is fixed, meaning it cannot be reassigned, which says nothing about whether the value it refers to can change. The following examples all involve immutable Lists, but the fixed variable is a val while the reassignable variable is a var.

scala> val fixed = List(1,2)
fixed: List[Int] = List(1, 2)

scala> fixed = List(3,4)
<console>:8: error: reassignment to val
fixed = List(3,4)
^

scala> var reassignable = List(5,6)
reassignable: List[Int] = List(5, 6)

scala> reassignable = List(7,8)
reassignable: List[Int] = List(7, 8)
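To see the other side of the distinction, here is a sketch that pairs a val with a collection that can change. It assumes ArrayBuffer from Scala's standard library, a mutable collection that this tutorial does not otherwise cover; the variable names are made up.

```scala
import scala.collection.mutable.ArrayBuffer

// val + mutable collection: the variable is fixed, but the contents can change.
val buffer = ArrayBuffer(1, 2)
buffer += 3                        // buffer is now ArrayBuffer(1, 2, 3)

// var + immutable List: the List never changes, but the variable can be
// pointed at a different List.
var shopping = List("milk")
val original = shopping
shopping = "apples" :: shopping    // a new List; original is still List(milk)
```

So val and var are about the variable, while mutable and immutable are about the value it refers to.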

One of the things one frequently wants to do with a list is access its elements directly. This is done via indexation into the list, starting with 0 for the first element, 1 for the second element, and so on.

scala> odds
res48: List[Int] = List(1, 3, 5, 7, 9)

scala> odds(0)
res49: Int = 1

scala> odds(1)
res50: Int = 3

Starting with 0 for the index of the first element is standard practice in computer science. It might seem strange at first, but you’ll get used to it fairly quickly.

We can of course use any Int expression to access an item in a list.

scala> zeroToNine(3)
res63: Int = 5

scala> zeroToNine(5-2)
res64: Int = 5

scala> val index = 3
index: Int = 3

scala> zeroToNine(index)
res65: Int = 5

If we ask for an index that is equal to or greater than the number of elements in the list, we get an error.

scala> odds(10)
java.lang.IndexOutOfBoundsException: 10
at scala.collection.LinearSeqOptimized$class.apply(LinearSeqOptimized.scala:51)
at scala.collection.immutable.List.apply(List.scala:45)
at .<init>(<console>:9)
at .<clinit>(<console>)
at .<init>(<console>:11)
at .<clinit>(<console>)
at $export(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:592)
at scala.tools.nsc.interpreter.IMain$Request$$anonfun$10.apply(IMain.scala:828)
at scala.tools.nsc.interpreter.Line$$anonfun$1.apply$mcV$sp(Line.scala:43)
at scala.tools.nsc.io.package$$anon$2.run(package.scala:31)
at java.lang.Thread.run(Thread.java:680)

Looking at all that, you might be thinking “WTF?” It’s called the stack trace, and it gives you a detailed breakdown of where problems happened in a bit of code. For beginning programmers, this is likely to look overwhelming and intimidating. You can safely gloss over it for now, but before long you will need to use the stack trace to identify problems in your code and address them.
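If you would rather ask whether an index is valid before using it, here is one possible sketch. It relies on isDefinedAt and lift, two List methods this tutorial does not otherwise cover; lift wraps its answer in Some when the index exists and gives None when it does not.

```scala
val odds = List(1, 3, 5, 7, 9)

// Ask whether an index exists instead of triggering an exception.
val hasIndexFour = odds.isDefinedAt(4)   // true: valid indices run from 0 to 4
val hasIndexTen = odds.isDefinedAt(10)   // false

// lift gives back the element wrapped in Some, or None if the index is out of range.
val maybeSecond = odds.lift(1)           // Some(3)
val maybeEleventh = odds.lift(10)        // None
```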

Another useful method is slice, which gives you a sublist from one index up to, but not including, another.

scala> zeroToNine
res55: List[Int] = List(0, 1, 3, 5, 7, 9, 2, 4, 6, 8)

scala> zeroToNine.slice(2,6)
res56: List[Int] = List(3, 5, 7, 9)

So, the slice gave us a list with the elements from index 2 (the third element) up to index 5 (the sixth element).
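Another way to think about slice is as dropping some elements from the front and then taking some of what remains. The sketch below uses drop and take, two List methods this tutorial does not otherwise cover. It also shows that slice, unlike direct indexing, quietly stops at the end of the list when the second index is too large.

```scala
val zeroToNine = List(0, 1, 3, 5, 7, 9, 2, 4, 6, 8)

val sliced = zeroToNine.slice(2, 6)              // List(3, 5, 7, 9)

// The same result: skip the first 2 elements, then keep the next 4 (6 minus 2).
val droppedThenTaken = zeroToNine.drop(2).take(4) // List(3, 5, 7, 9)

// Slicing from index 0 is the same as taking from the front.
val firstThree = zeroToNine.slice(0, 3)          // List(0, 1, 3)

// An end index past the end of the list is not an error for slice.
val lastTwo = zeroToNine.slice(8, 100)           // List(6, 8)
```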

Returning briefly to Strings: List methods other than length work with them too.

scala> val artist = "DJ Shadow"
artist: java.lang.String = DJ Shadow

scala> artist(3)
res0: Char = S

scala> artist.slice(3,6)
res1: String = Sha

scala> artist.reverse
res2: String = wodahS JD

scala> artist.sorted
res3: String = " DJSadhow"

On lists that contain numbers, we can use the sum method.

scala> odds.sum
res59: Int = 25

scala> multinomial.sum
res60: Double = 1.0

However, if the list contains non-numeric values, sum isn’t valid.

scala> groceryList.sum
<console>:9: error: could not find implicit value for parameter num: Numeric[java.lang.String]
groceryList.sum
^

What is going on is some very cool and useful automagical behavior by Scala involving implicits. We’ll come back to that later, but for now you can happily use sum on Lists of Ints and Doubles.
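For the curious, here is a rough sketch of the kind of requirement the error message is talking about. The function total is hypothetical and is not how sum is actually written in the Scala library; it only illustrates the idea that the compiler must be able to find a Numeric for the element type.

```scala
// Hypothetical function: it only accepts a List[T] if Scala can find
// a Numeric[T], which describes how to add values of type T.
def total[T](items: List[T])(implicit num: Numeric[T]): T =
  items.foldLeft(num.zero)(num.plus)

val intTotal = total(List(1, 3, 5, 7, 9))        // 25
val doubleTotal = total(List(0.5, 0.25, 0.25))   // 1.0

// total(List("apples", "milk")) would not compile, because Scala does not
// provide a Numeric[String].
```

Don't worry if the implicit part looks mysterious for now; the point is just that Ints and Doubles come with the needed description of addition, and Strings do not.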

One thing we often want to do with lists is obtain a String representation of their contents in some visually useful way. For example, we might want a grocery list to be a String with one item per line, or a list of Ints to have a comma between each element. The mkString method does just what we need.

scala> groceryList.mkString("\n")
res22: String =
apples
milk
butter

scala> odds.mkString(",")
res23: String = 1,3,5,7,9
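mkString also comes in two other forms that are worth knowing: one with no arguments, and one that takes a start, a separator and an end. A short sketch, with variable names made up for the illustration:

```scala
val groceryList = List("apples", "milk", "butter")
val odds = List(1, 3, 5, 7, 9)

// Any String can serve as the separator.
val spoken = groceryList.mkString(" and ")   // "apples and milk and butter"

// Three arguments: what goes before, between, and after the elements.
val bracketed = odds.mkString("[", ", ", "]") // "[1, 3, 5, 7, 9]"

// No arguments: the elements are simply run together.
val squashed = odds.mkString                  // "13579"
```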

Want to know if a list contains a particular element? Use contains on the list.

scala> groceryList.contains("milk")
res4: Boolean = true

scala> groceryList.contains("coffee")
res5: Boolean = false

And now we arrive at Booleans, another of the most important basic types. They play a major role in conditional execution, which we’ll cover in the next tutorial.

There are actually many more methods available for lists, which you can see by going to the entry for List in the Scala API. API stands for Application Programming Interface — in other words a collection of specifications for what you can do with various components of the Scala programming language. I’m going to do my best to give you the methods you need for now, but eventually you will need to be able to look at the API entries for Scala types to see what methods are available, what they do and how to use them.

Some of the most important methods on Lists we haven’t covered are map, filter, foldLeft, and reduce. We’ll come back to them in detail later, but for now here is a teaser that should give you an intuitive sense of what they do.

scala> val odds = List(1,3,5,7,9)
odds: List[Int] = List(1, 3, 5, 7, 9)

scala> odds.map(1 + _)
res6: List[Int] = List(2, 4, 6, 8, 10)

scala> odds.filter(4 < _)
res7: List[Int] = List(5, 7, 9)

scala> odds.foldLeft(10)(_ + _)
res8: Int = 35

scala> odds.filter(6 > _).map(_.toString).reduce(_ + "," + _)
res9: java.lang.String = 1,3,5
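The function arguments in the teaser are written very compactly. As a rough guide to reading them, here is a sketch of the same four computations with each little function spelled out with named parameters. The parameter and variable names are my own choices for the illustration.

```scala
val odds = List(1, 3, 5, 7, 9)

// map: build a new list by applying a function to every element.
val plusOne = odds.map(x => x + 1)                 // List(2, 4, 6, 8, 10)

// filter: keep only the elements for which the function gives true.
val overFour = odds.filter(x => x > 4)             // List(5, 7, 9)

// foldLeft: start from 10 and add each element to a running total.
val foldedTotal = odds.foldLeft(10)((runningTotal, x) => runningTotal + x)   // 35

// filter, then map, then reduce: keep the small ones, turn each into a String,
// and glue the Strings together two at a time with a comma in between.
val joined = odds.filter(x => x < 6)
  .map(x => x.toString)
  .reduce((left, right) => left + "," + right)     // "1,3,5"
```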

Now we’re getting functional. :)

Copyright 2011 Jason Baldridge

 

From http://bcomposes.wordpress.com/2011/08/24/first-steps-in-scala-for-beginning-programmers-part-2/

Published at DZone with permission of Jason Baldridge, author and DZone MVB.
