
Arthur Charpentier, ENSAE, PhD in Mathematics (KU Leuven), Fellow of the French Institute of Actuaries, professor at UQàM in Actuarial Science. Former assistant professor at ENSAE ParisTech, associate professor at Ecole Polytechnique and assistant professor in economics at Université de Rennes 1. Arthur is a DZone MVB, is not an employee of DZone, and has posted 155 posts at DZone.

# Are These "Staggering" Odds Really So Staggering?

01.04.2013

I was supposed to take a holiday break, but Frédéric, a professor in Tours, came back to me this morning with an intriguing question. He asked me what the odds were that the Champions League draw would produce exactly the same pairings in the practice draw and in the official one, an occurrence the Daily Mail describes as "staggering" at 2,000,000 to 1 odds.

To be honest, I don’t know much about soccer, so here is what happened, with the practice draw (on the left, on December 19th) and the official one (on the right, on December 20th),

Clearly, the pairs are identical, but not the order. Actually, at first, I was surprised that even which team plays at home first was identical. But (it seems that) the teams that play at home first are the ones that finished second in the previous stage of the competition.

And to be more specific about those draws, the pairs were obtained using real urns and real balls, so it is pure randomness (again, as far as I understood), but with very specific rules. For instance, two teams from the same country cannot play against each other at this stage, and teams that finished first in the previous round can only play against teams that finished second. Actually, Frédéric sent me an xls file with a possibility matrix.

Let us find all possible sets of pairings, regardless of which team plays at home first (again, we do not care here, since the order is defined by the rule mentioned above). Doing the maths might have been a bit complicated, with all those constraints. With a small piece of code, it is possible to list all possible pairings for those eight games. Let us import our possibility matrix,

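```
# Number of teams at this stage
n=16
# Possibility matrix: "NON" marks a pairing that is not allowed
uefa=read.table(
 "http://freakonometrics.blog.free.fr/public/data/uefa.csv",
 sep=",",header=TRUE)
# Column j holds the indices of the teams that cannot face team j (0 elsewhere)
LISTEIMPOSSIBLE=matrix(
 (rep(1:n,n))*(uefa[1:n,2:(n+1)]=="NON"),n,n)
```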
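
To see what the `rep(1:n,n)*(...=="NON")` construction produces, here is a minimal sketch on a made-up 4-team matrix. The `toy` matrix and the "OUI" label for allowed pairings are assumptions for illustration only; the actual csv file may use a different label, and only "NON" matters in the code above.

```r
# Hypothetical 4-team possibility matrix ("NON" = pairing not allowed).
n_toy <- 4
toy <- matrix(c("NON", "OUI", "NON", "OUI",
                "OUI", "NON", "OUI", "NON",
                "NON", "OUI", "NON", "NON",
                "OUI", "NON", "NON", "NON"),
              n_toy, n_toy, byrow = TRUE)

# rep(1:n, n) runs down each column (R stores matrices column by column),
# so entry [i, j] becomes i when the pair (i, j) is forbidden and 0 otherwise.
impossible_toy <- matrix(rep(1:n_toy, n_toy) * (toy == "NON"), n_toy, n_toy)

impossible_toy[, 1]   # teams that cannot face team 1: 1 and 3 (zeros are padding)
```

The zeros are harmless later on, because team indices start at 1 and a 0 never matches any of them.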

I can fix the first team (in my list, the fourth one is the first team that finished second). Then I look at all its possible opponents,

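```
# First home team: the fourth team in the list
a1=4
# Keep the elements of x that do not appear in table
"%notin%" <- function(x, table){x[match(x, table, nomatch = 0) == 0]}
# Possible opponents for a1
posa2=(1:n)%notin%LISTEIMPOSSIBLE[,a1]
```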
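
As a quick check of what `%notin%` returns, here is a small sketch on made-up values (the vectors are illustrative, not taken from the csv file). The operator keeps the elements of its left-hand side that are absent from the right-hand side, so the padding zeros of the impossibility matrix are simply ignored.

```r
# Same definition as above, repeated so that the snippet runs on its own.
"%notin%" <- function(x, table) x[match(x, table, nomatch = 0) == 0]

# Teams 2 and 5 are excluded; the zeros do not match any team index.
remaining <- (1:6) %notin% c(0, 2, 0, 5)
remaining   # 1 3 4 6
```

For a vector without duplicates such as `1:n`, base R's `setdiff(x, table)` should give the same result.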

Then consider the second team that finished second (the sixth one in my list), and look at all its possible opponents for this second game, i.e. excluding the team that was already drawn and those that are not allowed,

```
b1=6
# a2 is the opponent already drawn for a1
posb2=(1:n)%notin%c(LISTEIMPOSSIBLE[,b1],a2)
```

Etc. So, given the list of home teams,

```
a1=4
b1=6
c1=8
d1=9
e1=12
f1=14
g1=15
h1=16
```

consider the following loops,

```
s=0      # number of valid draws found so far
M=NULL   # one row per valid draw
posa2=(1:n)%notin%c(LISTEIMPOSSIBLE[,a1])
for(a2 in posa2){
 posb2=(1:n)%notin%c(LISTEIMPOSSIBLE[,b1],a2)
 for(b2 in posb2){
  posc2=(1:n)%notin%c(LISTEIMPOSSIBLE[,c1],a2,b2)
  for(c2 in posc2){
   posd2=(1:n)%notin%c(LISTEIMPOSSIBLE[,d1],a2,b2,c2)
   for(d2 in posd2){
    pose2=(1:n)%notin%c(LISTEIMPOSSIBLE[,e1],a2,b2,c2,d2)
    for(e2 in pose2){
     posf2=(1:n)%notin%c(LISTEIMPOSSIBLE[,f1],a2,b2,c2,d2,e2)
     for(f2 in posf2){
      posg2=(1:n)%notin%c(LISTEIMPOSSIBLE[,g1],a2,b2,c2,d2,e2,f2)
      for(g2 in posg2){
       posh2=(1:n)%notin%c(LISTEIMPOSSIBLE[,h1],a2,b2,c2,d2,e2,f2,g2)
       for(h2 in posh2){
        s=s+1
        V=c(a1,a2,b1,b2,c1,c2,d1,d2,e1,e2,f1,f2,g1,g2,h1,h2)
        cat(s,V,"\n")
        M=rbind(M,V)
}}}}}}}}
```
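
The eight nested loops above all follow the same pattern, so they could also be written as a recursion. The following is only a sketch of that idea on a made-up 6-team example; the function `enumerate_pairings`, the `allowed` matrix and its constraints are hypothetical and are not part of the original computation.

```r
"%notin%" <- function(x, table) x[match(x, table, nomatch = 0) == 0]

# Hypothetical example: teams 1-3 finished first, teams 4-6 finished second.
n_toy <- 6
allowed <- matrix(FALSE, n_toy, n_toy)
allowed[4, c(1, 2)] <- TRUE        # team 4 may face 1 or 2
allowed[5, c(2, 3)] <- TRUE        # team 5 may face 2 or 3
allowed[6, c(1, 2, 3)] <- TRUE     # team 6 may face 1, 2 or 3
allowed <- allowed | t(allowed)    # the constraint is symmetric
# Same layout as LISTEIMPOSSIBLE: column j lists the forbidden opponents of j.
impossible_toy <- matrix(rep(1:n_toy, n_toy) * (!allowed), n_toy, n_toy)

# Draw an opponent for the first home team, then recurse on the others.
enumerate_pairings <- function(home, impossible, drawn = integer(0)) {
  if (length(home) == 0) return(matrix(drawn, nrow = 1))
  n <- nrow(impossible)
  candidates <- (1:n) %notin% c(impossible[, home[1]], drawn)
  out <- NULL
  for (opp in candidates) {
    out <- rbind(out, enumerate_pairings(home[-1], impossible, c(drawn, opp)))
  }
  out   # one row per valid draw, columns = opponents of the home teams
}

M_toy <- enumerate_pairings(home = 4:6, impossible_toy)
nrow(M_toy)   # 3 valid draws in this toy example
```

With the real data, the call would presumably take the eight home teams listed earlier and `LISTEIMPOSSIBLE`, and each row would hold the eight opponents in that order.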

With the print option, we end up with

```
5461 4 13 6 11 8 5 9 2 12 10 14 3 15 7 16 1
5462 4 13 6 11 8 5 9 2 12 10 14 7 15 1 16 3
5463 4 13 6 11 8 5 9 2 12 10 14 7 15 3 16 1
```

i.e.

```
> nrow(M)
[1] 5463
```

possible sets of pairings (the list can be found here, where the numbers are the same as the ones in the csv file). This corresponds to the probability mentioned in a comment on the article cited previously, dailymail.co.uk/…. So, treating all those outcomes as equally likely, the probability of getting exactly the same outcome in the practice and the official draws was (in %)

```
> 100/nrow(M)
[1] 0.01830496
```

Which is not that small when we think about it….
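
To put that number next to the newspaper's figure, here is a small arithmetic sketch. It takes the count of 5463 from the enumeration above and, as a simplifying assumption, reads "2,000,000 to 1" as roughly one chance in 2,000,000.

```r
n_pairings <- 5463              # number of valid draws found by the enumeration
p_same <- 1 / n_pairings        # probability of one given draw, if all are equally likely

100 * p_same                    # in percent, about 0.0183
ratio <- 2e6 / n_pairings       # how far the quoted odds are from 1 in 5463
ratio                           # about 366
```

So, under these assumptions, the coincidence would be a few hundred times more likely than the quoted odds suggest.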

And if someone has a mathematical expression for this probability, I am interested. The only reliable method I found was to list all possible sets of pairings (the csv file is available if someone wants to check). But I am not satisfied…
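
One standard way to at least write the count down (a remark added as a sketch, with notation that is not in the original post): let $A$ be the $8\times 8$ matrix with $A_{ij}=1$ if the $i$-th team that finished second is allowed to face the $j$-th team that finished first, and $A_{ij}=0$ otherwise. The number of valid draws is then the permanent of $A$, and, if all valid draws are taken as equally likely, the probability of repeating a given draw is its inverse:

$$
N \;=\; \operatorname{perm}(A) \;=\; \sum_{\sigma \in S_8} \prod_{i=1}^{8} A_{i,\sigma(i)},
\qquad
\mathbb{P}(\text{same draw twice}) \;=\; \frac{1}{N}.
$$

This mostly restates the enumeration: each permutation $\sigma$ with a nonzero product is one row of `M`, and permanents of general 0/1 matrices are known to be hard to compute, so this is probably not the closed form one would hope for.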

Published at DZone with permission of Arthur Charpentier, author and DZone MVB. (source)
