# Thursday Code Puzzler: Efficient Duplicate Detection

Thursday is code puzzler day here at DZone. The idea is simple: solve the coding problem as efficiently as you can, in any language or framework that you find suitable.

*Note: Even
though there really is nothing stopping you from finding a solution to
this on the internet, try to keep honest, and come up with your own
answer. It's all about the participation!*

Do you have code puzzlers that you'd like to share with the DZone community? If so, please submit here.

**Efficient Duplicate Detection**

Give an array of 1001 elements, which includes the numbers 1 to 1000 inclusive in random order, plus one duplicate number, write the most efficient algorithm to detect the duplicate number. Make sure you only access each item in the array a maximum of once.

Catch up on all our previous puzzlers here.

## Comments

## Mark Priest replied on Wed, 2012/09/19 - 11:56pm

## Daniel Seidler replied on Thu, 2012/09/20 - 3:23am

Java

Call:## Mohammad Aq replied on Thu, 2012/09/20 - 3:45am

Daniel code might be a tad more efficent.

Here's mine in Java too:

static int getDuplicate(int[] array) {

## Vijay Nathani replied on Thu, 2012/09/20 - 4:48am

Groovy code using open source library Guava

## Vijay Nathani replied on Thu, 2012/09/20 - 4:51am

Daniel's algorithm in Groovy

## Nils Rudolph replied on Thu, 2012/09/20 - 6:36am

The puzzle say to solve it in the most efficient way. Daniels solution is pretty fast but it always iterate over the whole array. So i tried a different solution:

}

I did a little micro benchmarking. In comparision to Daniels Solution it is 3 times faster if duplicated numbers are at the front at the array but it is also 3 times slower if they are at the end.

With randomized arrays Daniels Solution performs 2/3 of the tries faster.

I tried to combine both algorithms, but Daniels algorithm was always faster in the average case.

## Ewald Horn replied on Thu, 2012/09/20 - 9:06am in response to: Nils Rudolph

Ah.

I had this as an interview question once, after solving it, the interviewer asked me if I could improve it - I couldn't.

Here is what he then showed me:

Good old XOR can be used if it's just numbers, the final result will be in the last position of the array. If you do the bitwise math, it starts making sense, but that's a LOT of 11111111 and 00000000 math!

## Daniel Seidler replied on Thu, 2012/09/20 - 8:43am in response to: Nils Rudolph

## Vijay Nathani replied on Thu, 2012/09/20 - 9:27am in response to: Ewald Horn

## Ewald Horn replied on Thu, 2012/09/20 - 9:35am in response to: Vijay Nathani

## Vijay Nathani replied on Thu, 2012/09/20 - 9:40am in response to: Ewald Horn

The algorithm seems to work only if the duplicate is 1. e.g. See code below.

println duplicate([250,2,250,3].toArray()) //prints 1. Wrong.

## Ewald Horn replied on Thu, 2012/09/20 - 10:13am in response to: Vijay Nathani

I just checked again:

It works fine in Java 1.6.0_33 - could it be a language difference?

## Balazs Jakab replied on Thu, 2012/09/20 - 10:14am

## I prefer Ewald's solution, but of course if will only work if the numbers satisfies the preconditions: "numbers 1 to 1000 inclusive in random order, plus one duplicate number". So if You add negative numbers or skip any of them from the list, it won't :)

Though I guess it could be a bit improved, since we could calculate the XOR-ed value of number 1-1000 (which is 1000 btw) in advance and store it as a constant in the code. Then we just need to XOR all the values in the array, at the end XOR the precalculated constant and voila, the result is the duplicated number.

## Ewald Horn replied on Thu, 2012/09/20 - 10:19am in response to: Balazs Jakab

Hi Balazs, thank you, I can't take credit for the solution though - it's just something I learned about along the way. Is there no way to do the same for negative numbers as well?

## Balazs Jakab replied on Thu, 2012/09/20 - 10:44am in response to: Ewald Horn

Hi!

I guess this algorithm works for negative numbers as well. You just need a continuous interval of numbers, for example -499 to 500 (The precalculated XOR-ed value for this interval is -8)

The trick is that if You XOR a number with itself the result is 0. Also if You XOR something with 0 the result is not changed. So in Your original solution all the numbers are XOR-ed twice (once from the array of numbers and once as the index), except of course the duplicated number which appear three times. Since pairs "kills" each other the full XOR-ed result is the duplicated number.

## Ewald Horn replied on Thu, 2012/09/20 - 10:50am in response to: Balazs Jakab

## Jagannathan Asokan replied on Fri, 2012/09/21 - 5:27am

## simon harrer replied on Fri, 2012/09/21 - 5:58am

## Vijay Nathani replied on Fri, 2012/09/21 - 6:20am in response to: Jagannathan Asokan

A set will be more efficient e.g. Groovy code

println duplicateNumber(new ArrayList(1..1000).plus(1,10)) //prints 10

## Eric Jablow replied on Fri, 2012/09/21 - 2:57pm

In Java:

## Steve Ck replied on Sun, 2012/09/23 - 11:46pm in response to: Vijay Nathani

## Vijay Nathani replied on Mon, 2012/09/24 - 12:22am in response to: Steve Ck

But the problem says that numbers can be in random order e.g. [ 5, 1, 3, 2, 10, 3]

## Balazs Jakab replied on Mon, 2012/09/24 - 2:36am in response to: Vijay Nathani

Yes, they can be in random order, but all the numbers for a continuous range should be included. So if You're using numbers from 1 to 10, in Your list You should have 11 numbers: the 10 numbers 1, 2, ... 10 and one duplicate in any order. Your example misses numbers 4, 6, 7, 8, 9, so the algorithm with XORs wont work.

## Vijay Nathani replied on Mon, 2012/09/24 - 3:19am in response to: Balazs Jakab

## Balazs Jakab replied on Mon, 2012/09/24 - 6:10am in response to: Vijay Nathani