Mark is a graph advocate and field engineer for Neo Technology, the company behind the Neo4j graph database. As a field engineer, Mark helps customers embrace graph data and Neo4j building sophisticated solutions to challenging data problems. When he's not with customers Mark is a developer on Neo4j and writes his experiences of being a graphista on a popular blog at http://markhneedham.com/blog. He tweets at @markhneedham. Mark is a DZone MVB and is not an employee of DZone and has posted 524 posts at DZone. You can read more from them at their website. View Full User Profile

Writing a Java function in Clojure

11.28.2009
| 6398 views |
  • submit to reddit

A function that we had to write in Java on a project that I worked on recently needed to indicate whether there was a gap in a series of data points or not.

If there were gaps at the beginning or end of the sequence then that was fine but gaps in the middle of the sequence were not.

null, 1, 2, 3 => no gaps
1, 2, 3, null => no gaps
1, null, 2, 3 => gaps

The Java version looked a bit like this:

public boolean hasGaps(List<BigInteger> values) {
Iterator<BigInteger> fromHead = values.iterator();
while (fromHead.hasNext() && fromHead.next() == null) {
fromHead.remove();
}

Collections.reverse(values);

Iterator<BigInteger> fromTail = values.iterator();
while (fromTail.hasNext() && fromTail.next() == null) {
fromTail.remove();
}

return values.contains(null);
}

We take the initial list and then remove all the null values from the beginning of it, then reverse the list and remove all the values from the end.

We then check if there's a null value and if there is then it would indicate there is indeed a gap in the list.

To write this function in Clojure we can start off by using the 'drop-while' function to get rid of the trailing nil values.

I started off with this attempt:

(defn has-gaps? [list]
let [no-nils] [drop-while #(= % nil) list]
no-nils)

 Unfortunately that gives us the following error!

Can't take value of a macro: #'clojure.core/let (NO_SOURCE_FILE:16)

It thinks we're trying to pass around the 'let' macro instead of evaluating it – I forgot to put in the brackets around the 'let'!

I fixed that with this next version:

(defn has-gaps? [list]
(let [no-nils] [drop-while nil? list]
no-nils))

But again, no love:

java.lang.IllegalArgumentException: let requires an even number of forms in binding vector (NO_SOURCE_FILE:23)

The way I understand it the 'let' macro takes in a vector of bindings as its first argument and what I've done here is pass in two vectors instead of one.

In the bindings vector we need to ensure that there are an even number of forms so that each symbol can be bound to an expression.

I fixed this by putting the two vectors defined above into another vector:

(defn has-gaps? [list]
(let [[no-nils] [(drop-while nil? list)]]
no-nils))

We can simplify that further so that we don't have nested vectors:

(defn has-gaps? [list]
(let [no-nils (drop-while nil? list)]
no-nils))

The next step was to make 'no-nils' a function so that I could make use of that function when the list was reversed as well:

(defn has-gaps? [list]
(let [no-nils (fn [x] (drop-while nil? x))]
(no-nils list)))

 I then wrote the rest of the function to reverse the list and then check the remaining list for nil:

(defn has-gaps? [list]
(let [[no-nils] [(fn [x] (drop-while nil? x))]
[nils-removed] [(fn [x] ((comp no-nils reverse no-nils) x))]]
(some nil? (nils-removed list))))

The 'comp' function can be used to compose a set of functions which is what I needed.

It seemed like the 'nils-removed' function wasn't really necessary so I inlined that:

(defn has-gaps? [list]
(let [no-nils (fn [x] (drop-while nil? x))]
(some nil? ((comp no-nils reverse no-nils) list))))

The function can now be used like this:

user=> (has-gaps? '(1 2 3))
nil
user=> (has-gaps? '(nil 1 2 3))
nil
user=> (has-gaps? '(1 2 3 nil))
nil
user=> (has-gaps? '(1 2 nil 3))
true
I'd be intrigued to know if there's a better way to do this.
References
Published at DZone with permission of Mark Needham, author and DZone MVB. (source)

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)

Tags:

Comments

Joerg Wassmer replied on Sat, 2009/11/28 - 4:08am

I think your Java code is much better, because for readers it is more simple to understand what it does.

ff aaa replied on Sat, 2009/11/28 - 4:18am

Umm. if you mean that any null value in the list indicates (except head and tail) that it has a gap, why not use a simple iteration from 1 to length-2 and check for null and return early if any? what am i missing?

Asiri Rathnayake replied on Sat, 2009/11/28 - 6:50am

How about,

public boolean hasGaps(List<BigInteger> values) {
int sequenceCount = 0;
boolean insideSequence = false;
for (BigInteger number : values) {
if (number != null) {
if (!insideSequence) {
// Start of a new sequence.
sequenceCount++;
insideSequence = true;
}
} else {
insideSequence = false;
}
}
return (sequenceCount > 1);
}

Haven't tested it myself though, just thought your java code is doing too much.

- Asiri

Artur Biesiadowski replied on Sat, 2009/11/28 - 7:10am

Java method is mutating passed array, so I thought it should trim, reverse and return info if there are any gaps. As far as I understand clojure version is not mutating passed array (they are generally immutable AFAIK), so it is different program.

As for iterating from 1 to n-2, it is not so obvious, because they might be multiple nulls in front or in the end. I think that 'proper java' implementation for non-mutating version (as opposed to 'proper functional' implementation), would use very simple state machine or something like


Iterator it = values.iterator()

while (it.hasNext() && it.next() == null) {}

while (it.hasNext() && it.next() != null) {}

while (it.hasNext() && it.next() == null) {}

return it.hasNext();
It is not defined what to do on list containing only nulls - here it will return that gap exists. If it is supposed to do otherwise, extra check is needed in the middle.

Only one iteration, no reverse, no recursion - pure, ugly java :)

Andy Leung replied on Sat, 2009/11/28 - 8:07am

You are kidding me with your logic just to check null in between, and that is the reason for you to introduce clojure? I think you are good at programming but may need some practises in using example. Your problem is simple with one iteration required if I understand your original problem correctly. Set return value to default false, run a for loop to check if there is null from element start+1 until length-2, if there is null, you can return true immediately.

Christophe Grand replied on Sat, 2009/11/28 - 8:48am

Here is my clojure version:

(defn has-gaps? [s] 
  (->> s (drop-while nil?) (drop-while (complement nil?)) (not-every? nil?)))

 or, using clojure 1.0:

(defn has-gaps? [s] 
  (not-every? nil? (drop-while (complement nil?) (drop-while nil? s))))

Marc Stock replied on Sat, 2009/11/28 - 4:34pm

In groovy, you can just do this:

def list = [1, 2, null, 3]
list.any{ it == null }​


Hard to get more clear than that.

Artur Biesiadowski replied on Sat, 2009/11/28 - 7:00pm in response to: Marc Stock

Marc, I think you missed the point. For [null,1,2,3,null] you are supposed to return 'no gaps'.

Marc Stock replied on Sun, 2009/11/29 - 4:29am in response to: Artur Biesiadowski

Yikes, my bad. Should of read the post more carefully.

Moldskred replied on Sun, 2009/11/29 - 7:41am

 
public boolean hasGaps( Object[] values ) {
	boolean seqStart = false;
	boolean possibleGap = false;
	boolean gapFound = false;
	
	for( Object value : values ) {
		gapFound = possibleGap && value != null;
		if( gapFound ) { break; }
		possibleGap = seqStart && value == null;
		seqStart = seqStart || value != null;			
	} // end for -- until gap found
	
	return gapFound;
} // end hasGaps( )

Artur Sobierajczyk replied on Sun, 2009/11/29 - 7:49am

Python/Jython version:


from itertools import dropwhile

def has_gaps(seq):
    start_notnone = dropwhile(lambda x: x is None, seq)
    middle = dropwhile(lambda x: x is None, reversed(list(start_notnone)))
    return any(x is None for x in middle)

(why the "code" tag is not working??)

Andreou Dimitris replied on Sun, 2009/11/29 - 10:40am

Artur solution is by far the best, I doubt scripting fanboys saw that coming, and it is quite embarrassing to see this happening! Indeed, an single pass over an Iterator is actually expressive enough to accept all regular languages modulo alternaton (i.e. concatenation and Kleene star). The problem itself is accepting the language defined by "[null]*[^null]*[null]*", in hopefully clear enough pseudocode. If alternation was needed, a single Iterator wouldn't do, since repeatable reads would be required. One should check Scala's parser combinators, which can easily express such languages and parse them (no, the input tokes doesn't have to be "char" or strings, they can be whatever, such as java.lang.Integer instances as this example).

Artur Sobierajczyk replied on Sun, 2009/11/29 - 5:15pm

Andreou, you must be coming from the academic side of computing. You talk about complexity of the solution in very theoretical terms. What matters in practice is simplicity in terms of understanding it by human. Declarative style is better than imperative. Loops are hard to analyse an buggy. Artur's solution looks nice but it uses C-like hacks (changing of state in condition check in "while") and is deeply bound to the Java Iterator API - eg. version for arrays will look much different.

Endre Varga replied on Mon, 2009/11/30 - 5:18am in response to: Artur Sobierajczyk

Very theoretical?? I think recognizing simple patters with state-machines is very elementary. At least in my university it was a first semester standard task to design a state-machine that recognizes in an incoming continuous stream of bits two predefined bit patterns, handling overlapping occurences as well. We had to minimize it and prove its minimality, then implement it with logical gates and elementary flip-flops. I cannot imagine anything "very theoretical" about this. Also I am dealing with simulations and simulation data, which could be huge. Just today I am facing several GB of data to process. Reversing a list will not scale. It will not scale even if you just reverse the iterator, and keep the list intact, because the disk access prefers continous read and write. And I do not think that processing large series is something extraordinary in the land of programming. Arturs solution is actually very easy to understand, maybe three lines of comments would make it more clear, but for me, the scripting solutions presented here are inferior in clarity. Do not get me wrong, I understand the benefits of scripting and functional languages, and I use them daily (I write a lot of Mathematica code which is very Lisp like). However, I would never do a large project completely in functional style.

Artur Sobierajczyk replied on Mon, 2009/11/30 - 12:44pm in response to: Endre Varga

If you value a solution in terms of expressiveness in some mathematical theory (be it computation complexity or automata theory), you are a computer scientists, not a software engineer. And remember that premature optimization is Evil. Of course if you are mathematician writing software for other mathematicians, using such vocabulary and valuations is OK. For "normal" people it isn't.

Honey Monster replied on Mon, 2009/11/30 - 3:12pm

return values.SkipWhile(x=>x==null).SkipWhile(x=>x!=null).SkipWhile(x=>x==null).Any();

That would be the C# version.

Moldskred replied on Tue, 2009/12/01 - 2:08am in response to: Artur Sobierajczyk

Artwar, so that I appreciate the expressiveness of regular expressions and SQL makes me a computer scientist?Hardly. Rather, if you don't value a solution in terms of expressiveness then you're not a software engineer but only a programmer. Software engineering is where theory meets practice -- you need a grounding and an appreciation of the theory to do engineering. And while, yes, premature optimization is an evil, choosing a more appropriate algorithm is not premature optimization. That's just common sense.

Endre Varga replied on Wed, 2009/12/02 - 3:09am in response to: Artur Sobierajczyk

Using the right tool for the right job -- using state machine for recognizing a regular language -- is not premature optimization. It is actually a rule of thumb. You should not even think about it, but do it automatically. Maybe you should stop using the commutativity and associativity rules of addition and multiplication over natural numbers, because they are "some mathematical theory"? Of course if you are a script zealot writing software for other script zealots, using such vocabulary and valuations is OK. For "normal" engineer it isn't. And never call me a mathematician again.

Kasim Tuman replied on Tue, 2009/12/08 - 12:31pm

As an ardent believer os KISS principle, I am just sharing my implementation to point out that there is easy way of doing this:

(defn drop-nils-from-head-tail
  [coll]
  (reverse (drop-while nil? (reverse (drop-while nil? coll)))))

(defn has-gaps?
  [coll]
  (some nil? (drop-nils-from-head-tail coll)))

 
It is not the shortest implementation but clean and maintainble. A clojure newb like me can understand and implement it easly.

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.