I'm a software architect/consultant in Boulder, Colorado. I started blogging at: http://wayne-adams.blogspot.com/, but have since started the new blog, Data Sceintist in Training which will cover the areas I am learning as part of my own "big data" training, including the Hadoop family, frameworks like OpenStack, the R programming language, column-oriented databases and the art of examining data sets for useful patterns. The original blog will be kept alive with Java-specific posts. Wayne is a DZone MVB and is not an employee of DZone and has posted 35 posts at DZone. You can read more from them at their website. View Full User Profile

Haskell From an OO Developer's Perspective, Part 2

10.31.2011
| 3161 views |
  • submit to reddit
Today I'm looking at Haskell type definition and the use of pattern-matching in functions. Pattern-matching is much more an integral feature of FP, as opposed to OO. But first...

Haskell type definition

According to the HaskellWiki, an algebraic data type is a type which specifies the shape of the elements. One way to approach this topic, if you tend to think in OO, is 1) think C structs, and 2) try to forget behavior, for a moment. Even in OO, we've all been there -- sometimes you really want something like a struct just to organize some data, and the only behavior (in an OO sense, meaning "methods") that you really want is a set of accessors/mutators (getters/setters). So you either swallow a little hard and make the attributes public, or you blast out boilerplate getters/setters. Meanwhile, some people write frameworks of code that automagically expose your attributes, and so on.

Right now we just want to define a data structure. We'll be manipulating the structure with functions later, so we don't care about what it "does". It's just there. Here's an example of a data-type declaration in Haskell; I'm using it to describe some breakpoints I've serialized from a Java program:
-- file:  c:/HaskellDev/typesDemo.hs

type Label = String
type FileName = String
type LineNumber = Int
type MethodName = String
type ExceptionName = String
type Description = String

data Breakpoint = LineNumberBreakpoint Label FileName LineNumber Description |
MethodBreakpoint Label FileName MethodName Description |
ExceptionBreakpoint Label ExceptionName Description
deriving (Show)
This isn't the simplest-possible example, but it's representative of a useful structure and allows me to introduce a few things at once. First, focus on the data keyword; data is the beginning of the definition of an algebraic data type and is followed by the name of the type (which must start with a capital letter). This type name is followed by an equals sign and the type-constructor definition. "deriving (Show)" isn't critical to understanding the type; it provides information ghci needs to output the values we'll create based on this type definition.

The type constructor definition is composed of one or more value constructors. These value constructors must also be labeled by a name beginning with a capital letter and are separated by the pipe "|" symbol. Value constructors and type constructors reside in different "namespaces", so to speak, so if you have a really simple data type, you can use the same label for both the type definition name and the value constructor name and not worry about collisions between the two names. For example:
type User = User String String String Int
Back to our example, let's first get out of the way the list of types at the beginning of the source file. These types are called type synonyms in Haskell and are nothing elaborate; they are available to improve the readability of your code. So, if I want to describe a breakpoint definition as consisting of a label, the name of the file, the line number in the code, and a short description, in Haskell I could say
data Breakpoint = Breakpoint String String Int String
or I could improve the readability/understandability of the definition with some type synonyms:
type Label = String
type FileName = String
type LineNumber = Int
type Description = String

data Breakpoint = Breakpoint Label FileName LineNumber Description
and that's what I've done in our example. In addition, I've defined three different breakpoint types: one based on line number, one based on method call (e.g. method-entry, method-exit) (yes, we're using Haskell to analyze breakpoints in a Java class!), and one based on thrown exceptions.

Going back to our example, the next question would be "what do you do with this type definition?" At this point I want to launch the Glasgow interpreter/debugger, ghci. Once this is up and running, I can load my file. Note that ghci loads by default from the directory in which it was launched; you can set the directory from which it loads with :cd dirName:
*Main> :cd c:/HaskellDev
*Main> :load typesDemo.hs
[1 of 1] Compiling Main             ( typesDemo.hs, interpreted )
Ok, modules loaded: Main.
*Main>
I now have three value constructors for my Breakpoint data type, and I can create instances of each. I do this by referencing the name of the value constructor and providing the arguments that it expects, as specified in the type definition. Additionally, in ghci, to assign a variable a value, we need to use the let keyword. First, imagine that we want to create an exception breakpoint, but all we remember is the name of its value constructor. In ghci, you can get the type info for this value constructor:
*Main> :type ExceptionBreakpoint
ExceptionBreakpoint
:: Label -> ExceptionName -> Description -> Breakpoint
An explanation of the chain of -> symbols will have to wait until a later day, as it is a little foreign to the OO mind. But for now we can use the output to remind ourselves what we need to supply to the value constructor. Let's create an ExceptionBreakpoint, output its value, then use ghci's :type to output its type:
*Main> let bp3 = ExceptionBreakpoint "NPE" "java.lang.NullPointerException" "Break on all NullPointerExceptions"
*Main> bp3
ExceptionBreakpoint "NPE" "java.lang.NullPointerException" "Break on all NullPointerExceptions"
*Main> :type bp3
bp3 :: Breakpoint
Note that :type returns the actual type of the breakpoint, not the name of the value constructor which was used to build it. There are times when we want to know which value constructor was used -- an excellent segue to the 2nd topic of this post.

Pattern-matching in functions

Suppose I want to write a function that extracts the description of a breakpoint definition and further, reminds me what type of breakpoint it is. In Haskell, you accomplish this with pattern matching. In the following fragment:
describeBreakpoint (ExceptionBreakpoint _ _ desc) = desc
I have defined a function -- describeBreakpoint -- which matches only on an ExceptionBreakpoint. I've written it to match on a tuple which consists of
  • the name of the value constructor
  • two wildcards ("_", placeholders) for the values I'm not currently interested in (the Label and the ExceptionName)
  • a parameter for the description, desc.
After the equals sign is the value returned from the function, which is just the description. Applying this function to the breakpoint we created earlier, I get:
     *Main> describeBreakpoint bp3
    "Break on all NullPointerExceptions"
This process is called deconstruction, as we've essentially reversed the process we used to create bp3, by using a pattern which looks like the pattern used to construct the breakpoint value.

Haskell allows us to define a function as a series of predicate/value pairs. For example, we can update our function to extract the description of any Breakpoint as follows:
   describeBreakpoint (LineNumberBreakpoint _ _ _ desc) = desc
  describeBreakpoint (MethodBreakpoint _ _ _ desc) = desc
  describeBreakpoint (ExceptionBreakpoint _ _ desc) = desc
If we did not do this, we would get an error if we tried to use the function to describe a LineNumberBreakpoint:
 *Main> let bp4 = LineNumberBreakpoint "parseParam" "Parser.java" 79 "Break at beginning of parse"
   *Main> describeBreakpoint bp4
   "*** Exception: typesDemo.hs:15:1-56: Non-exhaustive patterns in function describeBreakpoint
Note that in general FP functions prefer that your matching patterns be exhaustive and will complain if they exhaust the list without a match. In this case the message should be self-explanatory. After updating our function as described earlier (and, reloading in ghci, which will require that you re-define the breakpoints), we get the expected answers:
*Main> :load typesDemo.hs
[1 of 1] Compiling Main             ( typesDemo.hs, interpreted )
Ok, modules loaded: Main.
*Main> let bp3 = ExceptionBreakpoint "NPE" "java.lang.NullPointerException" "Break on all NullPointerExceptions"
*Main> let bp4 = LineNumberBreakpoint "parseParam" "Parser.java" 79 "Break at beginning of parse"
*Main> let bp5 = MethodBreakpoint "initBean" "Parser.java" "init()" "Break at parser init() method"
*Main> describeBreakpoint bp3
"Break on all NullPointerExceptions"
*Main> describeBreakpoint bp4
"Break at beginning of parse"
*Main> describeBreakpoint bp5
"Break at parser init() method"
Note that the matches above treat the value as a simple tuple, where the first element is the name of the value constructor and subsequent elements are simply the values passed to the value constructor. Pattern matching can be performed on any tuple, in general. For example, suppose we write a function that returns the 2nd element of a 4-tuple:
     secondElement (a, b, c, d) = b
and then try this:
*Main> secondElement ("Washington", "Lincoln", "Fillmore", "Roosevelt")
"Lincoln"
That's great. Now look at this:
*Main> secondElement ("Washington", "Lincoln", "Fillmore")

:1:15:
   Couldn't match expected type `(t0, t10, t20, t30)'
               with actual type `(t1, t2, t3)'
   In the first argument of `secondElement', namely
     `("Washington", "Lincoln", "Fillmore")'
   In the expression:
     secondElement ("Washington", "Lincoln", "Fillmore")
   In an equation for `it':
       it = secondElement ("Washington", "Lincoln", "Fillmore")
While there is a second element in this list, it is a 3-tuple and there is no pattern in the function secondElement which matches a 3-tuple. We could add another match in the function definition to handle a 3-tuple (and so on), but there are better ways to do this, which we may investigate in later posts.

As with all other topics I'll be investigating in these early posts, this one is far from exhaustive, and it isn't intended to be. Again, this discussion reflects what stands out to me, as an OO developer, looking at all of this for the first time. Next post: more on thinking in FP.

 

From http://wayne-adams.blogspot.com/2011/10/haskell-from-oo-developers-perspective_29.html

Published at DZone with permission of Wayne Adams, author and DZone MVB.

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)

Tags: