[The original title for this post was *“The story of two lovers, joint densities and the beauty of statistics”*. I dropped “joint densities” from the title.]

Recently a friend of mine gave me a puzzle. The puzzle is as follows:

Many years ago, two people who were secretly in love were traveling with a large group of friends who did not know about it. The group was touring Europe by bus, stopping in major cities and staying in hotels. In each hotel, every person got a separate room along one long hallway. Every night, one of the lovers would sneak out of their room and go to the other’s room for some innocent cuddling. The puzzle is: if the hallway is `L` meters long and the rooms are uniformly spread along it, what is the average distance one of them needs to tiptoe to reach the other?

Depending on how you approach it, this can be an easy or a relatively hard problem. We will solve it two ways: first with a simulation, and then by validating the simulation with an analytical solution.

Let’s first simplify the problem. Assume the location of one of them is $X$ and the other is located at $Y$, each measured as a fraction of the hallway length. We want to find the average value of $Z = |X - Y|$.

```python
from random import random

sims = pow(2, 26)  # number of simulations (about 67M)

# Draw the two room positions independently and uniformly on [0, 1)
X = [random() for i in range(sims)]
Y = [random() for i in range(sims)]

# Distance between the two rooms in each simulated night
Z = [abs(x - y) for x, y in zip(X, Y)]
print("E[Z] =", sum(Z) / len(Z))
```

```
Result:
E[Z] = 0.333323189834
```

The histogram of $Z$ values is shown below. $Z$ follows a triangular distribution.
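As a rough check on the triangular shape, the sketch below bins simulated values and compares each bin's estimated density with $f_Z(z) = 2(1 - z)$ on $[0, 1]$. That density is the standard result for the absolute difference of two independent uniform variables; it is not stated in the post itself, and the function names, sample size, and seed are illustrative assumptions.

```python
import random


def empirical_density(n_samples=200_000, n_bins=10, seed=0):
    """Estimate the density of Z = |X - Y| in each bin, for X, Y ~ U(0, 1)."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatable runs
    counts = [0] * n_bins
    for _ in range(n_samples):
        z = abs(rng.random() - rng.random())
        # z lies in [0, 1); min() guards the upper edge of the last bin
        counts[min(int(z * n_bins), n_bins - 1)] += 1
    width = 1.0 / n_bins
    # Convert counts to a density: fraction of samples divided by bin width
    return [c / (n_samples * width) for c in counts]


def triangular_density(z):
    """Assumed theoretical density of Z on [0, 1]."""
    return 2.0 * (1.0 - z)


if __name__ == "__main__":
    n_bins = 10
    for i, d in enumerate(empirical_density(n_bins=n_bins)):
        mid = (i + 0.5) / n_bins
        print(f"z = {mid:.2f}: empirical {d:.3f}, 2(1 - z) = {triangular_density(mid):.3f}")
```

Because the assumed density is linear, its average over a bin equals its value at the bin midpoint, so the two printed columns should agree up to sampling noise.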

If R is your favorite language then you might find the rgl and akima libraries pretty useful here. You can produce visualizations of Z values (vs X and Y) using R.

```r
sims <- 100000
df <- data.frame(matrix(ncol = 3, nrow = sims))
names(df) <- c("X", "Y", "Z")
df$X <- runif(n = sims)
df$Y <- runif(n = sims)
df$Z <- abs(df$X - df$Y)
cat("E[Z] = ", mean(df$Z))

library(akima)
im <- with(df, interp(X, Y, Z))
with(im, image(x, y, z))

library(rgl)
x <- im$x
y <- im$y
z <- im$z
persp3d(x, y, z, col = "skyblue")
```

This problem can be solved in 6 lines of Python or R (perhaps two lines if you don’t mind a compressed notation). But how can we find a closed-form solution for this?
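For the curious, a compressed version might look like the sketch below. It assumes a much smaller sample size than the `pow(2,26)` used earlier, and it feeds a generator to `sum` so that no samples are stored in memory.

```python
from random import random

n = 1_000_000  # assumed sample size, chosen only to keep the run short
print("E[Z] =", sum(abs(random() - random()) for _ in range(n)) / n)
```

The printed value varies from run to run because no seed is set, but it should land close to one third.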

**Analytical solution**

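We have $X \sim U(0, 1)$ and $Y \sim U(0, 1)$. We want to find the average value of $Z = |X - Y|$. Additionally, we know that $X$ and $Y$ are independent. The probability density functions of $X$ and $Y$ are $f_X(x) = 1$ and $f_Y(y) = 1$ on $[0, 1]$. Since the two variables are independent, the joint density is the product $f_X(x) f_Y(y)$, and the expectation takes the form:

$$E[Z] = \int_0^1 \int_0^1 |x - y| \, f_X(x) \, f_Y(y) \, dx \, dy$$

By replacing $f_X(x) = 1$ and $f_Y(y) = 1$ we have:

$$E[Z] = \int_0^1 \int_0^1 |x - y| \, dx \, dy$$

We can now remove the absolute value by conditioning on the value of $x$ relative to $y$. Meaning, if $x \geq y$ then $|x - y| = x - y$, otherwise $|x - y| = y - x$.

Therefore:

$$E[Z] = \int_0^1 \left[ \int_0^y (y - x) \, dx + \int_y^1 (x - y) \, dx \right] dy = \int_0^1 \left[ \frac{y^2}{2} + \frac{(1 - y)^2}{2} \right] dy$$

This is equal to $\frac{1}{6} + \frac{1}{6} = \frac{1}{3}$. So the two lovers would, on average, walk one third of the hallway to see each other.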
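The closed-form value can also be cross-checked without random numbers. The sketch below approximates the double integral of $|x - y|$ over the square with a midpoint rule. The function name, the grid size, and the 30-meter hallway are illustrative assumptions, and the `hall_length` parameter assumes both positions are uniform on $[0, L]$, so the joint density is $1 / L^2$.

```python
def expected_distance(hall_length=1.0, n=400):
    """Midpoint-rule approximation of E|X - Y| for X, Y ~ U(0, hall_length)."""
    h = hall_length / n
    # Midpoints of n equal cells along the hallway
    points = [(i + 0.5) * h for i in range(n)]
    total = sum(abs(x - y) for x in points for y in points)
    # Each grid cell has area h * h; the joint density is 1 / hall_length**2
    return total * h * h / hall_length ** 2


if __name__ == "__main__":
    print(expected_distance())      # close to 1/3 for a unit hallway
    print(expected_distance(30.0))  # close to 10 for a hypothetical 30 m hallway
```

Scaling the hallway scales the answer linearly, which is why the unit-length result of one third carries over to a hallway of any length.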