Fog Creek Software
Discussion Board




noodles

Isn't there a simpler solution to "noodles"? The question asks for the average number of loops. I would read that as being the midpoint between the most and fewest loops. The most possible loops is N and the fewest is 1, so the average is (N-1)/2.

Kevin
Tuesday, October 08, 2002

Not exactly.  You have determined the bounds on the number of loops (1 and n).  But your answer, (n - 1) / 2, doesn't have those same bounds.

Consider what happens when you have only two noodles. 

Round 1:  Choose an end.  That leaves three other ends:
1 / 3 of the time:  You choose the other end of the same noodle, creating a loop.
2 / 3 of the time:  You choose an end from the other noodle.

Round 2:
No matter what, a loop is created: only one strand remains (either the untouched second noodle or the two noodles joined into one), and tying its two ends closes it.

Let Loops[n,k] be the probability mass function where n is the number of noodles, and k the number of loops.

The expected value E[Loops[2,k]]
= Loops[2,1] * 1 + Loops[2,2] * 2
= (2/3) * 1 + (1/3) * 2
= 4/3
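
If you want to convince yourself of that 4/3, here is a minimal Monte Carlo sketch (my own illustration, not part of the puzzle; the name simulate_loops and the trial count are arbitrary).  It assumes the usual process: while any loose strand remains, hold one end and tie it to a uniformly random other end.

import random

def simulate_loops(n_noodles, trials=100_000):
    """Estimate the expected number of loops when tying n_noodles noodles."""
    total_loops = 0
    for _ in range(trials):
        strands = n_noodles   # loose (un-looped) strands still in the pot
        loops = 0
        while strands > 0:
            # Hold one end; the second end is uniform over the remaining
            # 2*strands - 1 ends, and exactly one of those is the other
            # end of the strand we are holding.
            if random.randrange(2 * strands - 1) == 0:
                loops += 1        # tied a strand to itself: one more loop
            # Either way (self-tie or merge with another strand), the
            # number of loose strands drops by one.
            strands -= 1
        total_loops += loops
    return total_loops / trials

print(simulate_loops(2))   # hovers around 1.333, i.e. 4/3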

Your function, on the other hand, predicts an answer of 1/2!  How could that be?  You'd need a probability mass function something like:
Loops[2, 0] = .5
Loops[2, 1] = .5
Loops[2, 2] = 0

Yet, the probability of creating zero loops should be zero!

In general form:
Loops[2,0] * 0 + Loops[2,1] * 1 + Loops[2,2] * 2 = 1/2
Loops[2,0] + Loops[2,1] + Loops[2,2] = 1

Clearly, Loops[2,0] must be > 0 for any mass function satisfying both equations (if it were zero, the expected value would be at least 1).

You might be tempted to alter your answer slightly so it can never be less than 1 (e.g. (n + 1) / 2, or n).  But any linear answer grows much too quickly.

Think about when you have a gigantic pot with one million noodles.  Clearly, creating one million loops almost never happens.  Even for just 10 noodles, your chance of creating 10 loops is 1 in 654729075.

As for the actual answer: when m noodles remain, the tie you make closes a loop with probability 1 / (2m - 1) (the matching end is one of the 2m - 1 remaining ends), and by linearity of expectation these probabilities simply add up over the n ties:

E[Loops(n, k)] = Sum(i=1->n) {1 / (2i - 1)}
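
The sum is trivial to evaluate exactly; a small sketch of my own (the name expected_loops is made up), using Python's Fraction:

from fractions import Fraction

def expected_loops(n):
    """Exact expected number of loops for n noodles: Sum 1/(2i - 1)."""
    return sum(Fraction(1, 2 * i - 1) for i in range(1, n + 1))

print(expected_loops(2))           # 4/3, matching the two-noodle case above
print(float(expected_loops(10)))   # ~2.13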

E[Loops(n, k)] can be written in closed form by using psi(z), the digamma function (the logarithmic derivative of the Gamma function).

For integer n (which is all we need), there's the convenient identity:
psi(n + .5) = -gamma - 2ln(2) +2 Sum(i=1->n){1 / (2i - 1) }
psi(n + .5) = -gamma + H(n - .5)

where gamma is Euler's constant and H(z) is the harmonic number.
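
That identity is easy to sanity-check numerically.  A throwaway sketch, assuming SciPy is available (scipy.special.digamma is psi; the helper odd_harmonic is my own name):

import math
from scipy.special import digamma

EULER_GAMMA = 0.5772156649015329   # Euler's constant, gamma

def odd_harmonic(n):
    """Sum(i=1->n) { 1 / (2i - 1) }."""
    return sum(1.0 / (2 * i - 1) for i in range(1, n + 1))

for n in (1, 2, 10, 100):
    lhs = digamma(n + 0.5)
    rhs = -EULER_GAMMA - 2 * math.log(2) + 2 * odd_harmonic(n)
    print(n, lhs, rhs)   # the two columns agree to floating-point precision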

So,
E[Loops(n, k)] = .5 psi(n + .5) + ln(2) + .5 gamma
E[Loops(n, k)] = .5 (-gamma + H(n - .5)) + ln(2) + .5 gamma = .5 H(n - .5) + ln(2)

H(z) can be approximated by:
H(z) ~ ln(z) + gamma

So, we can approximate E[Loops(n,k)] by:
E[Loops(n,k)] ~ .5 (ln(n - .5) + gamma) + ln(2)

This diverges very slowly:
E[Loops(10, k)] ~ 2.1 loops
E[Loops(100, k)] ~ 3.2 loops
E[Loops(1,000, k)] ~ 4.4 loops
E[Loops(10,000, k)] ~ 5.6 loops
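
Those figures are easy to reproduce by comparing the exact sum with the approximation (again a quick sketch of mine, not from the post):

import math

EULER_GAMMA = 0.5772156649015329

def exact(n):
    """Exact expectation: Sum(i=1->n) 1 / (2i - 1)."""
    return sum(1.0 / (2 * i - 1) for i in range(1, n + 1))

def approx(n):
    """Approximation: .5 (ln(n - .5) + gamma) + ln(2)."""
    return 0.5 * (math.log(n - 0.5) + EULER_GAMMA) + math.log(2)

for n in (10, 100, 1000, 10000):
    print(n, round(exact(n), 2), round(approx(n), 2))
# 10     2.13  2.11
# 100    3.28  3.28
# 1000   4.44  4.44
# 10000  5.59  5.59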

Analogously, your chances of creating n loops from a pot of n noodles decrease extraordinarily quickly:

Loops(n, n) = Product(i=1->n) { 1 / (2i - 1) }

For integer n, we have the identity:

Gamma(.5 + n) = sqrt(pi) * Product(i=1->n) { 2i - 1 } / 2^n

So,
Loops(n, n) = sqrt(pi) / (Gamma(.5 + n) * 2^n)

This quickly converges to 0:

Loops(1, 1) = 1
Loops(10, 10) ~ 1.5 * 10^-9 ( = 1 / 654729075 from before)
Loops(100, 100) ~ 1.5 * 10^-187
Loops(1000, 1000) ~ 1.3 * 10^-2867
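
Those magnitudes underflow ordinary floating point, but the closed form above makes them easy to reproduce via log-Gamma (my own sketch; log10_all_loops is an invented name):

import math

def log10_all_loops(n):
    """log10 of Loops(n, n) = sqrt(pi) / (Gamma(n + .5) * 2^n)."""
    ln_p = 0.5 * math.log(math.pi) - math.lgamma(n + 0.5) - n * math.log(2)
    return ln_p / math.log(10)

for n in (1, 10, 100, 1000):
    print(n, log10_all_loops(n))
# 1    -> 0.0      (probability 1)
# 10   -> -8.8     (about 1.5 * 10^-9)
# 100  -> -186.8   (about 1.5 * 10^-187)
# 1000 -> -2866.9  (about 1.3 * 10^-2867)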

Jason De Biasio
Sunday, October 27, 2002
