Fog Creek Software
Discussion Board

sum it up

I'm unhappy with the solution for the "two repeated numbers" section of "sum it up". The solution given certainly works, but it involves additional algorithms (summing the squares and the quadratic formula) and it requires cycling through the entire sequence twice (once to get c, the other to get the sum of the squares).

I think that it can be done using only the summing technique that was used to get the single repeated number. Here's how:

Note: we are looking for two repeated numbers: a and b

1) Do the summing technique to get c. We know that c = a + b

2) Call 'a' the smaller of the two unknowns. We know that a <= c/2 (if it weren't, then it would be 'b'. Think about it.)

3) Now we know that there is one repeated number <= c/2 so we just do the summing technique up to c/2 (or [c-1]/2 if c is odd). (note: this is easy of the series is in ascending order. If the series is in random order, then discard x if x > c/2)

4) Doing the summing technique up to c/2 gives us 'a'. Having 'a' and 'c' gives us 'b'.

caveat: if a = b = c/2, then doing the summing technique the second time will retun c to us. So if we get the strange result that our repeated number is larger than our upper bound, then we know thet a=b=c/2.

1) Code reuse. Pull the summing technique into a seperate function and use it twice.

2) Tighter code due to a smaller number of algorithms.

3) Don't need to remember/look up the quadratic formula.

4) If the series is in ascending or descending order, then we can cycle through a potentially much smaller portion of the series to determine a and b then by calculating the sum of the squares. Even if the series is in random order, at least we aren't dealing with big numbers and we can just discard numbers by a simple compare instead of doing calculations on all numbers in the series.

Winky Chin
Monday, February 11, 2002

Since the array is already sorted, why not just compare each number with it's previous number and store the repeating number in an array. It is still o(n). Am I missing something ??

Wednesday, February 13, 2002

puzzlepal: if your reasoning seems sound, you should then examine the premise.  It seems obvious to me from the way the problem is stated that you do not get to assume the sequence is sorted.  The poser would certainly know that the problem is trivial if that is the case.

winky chin: I looked at your solution, and I don't see why it wouldn't work.  Nice job.  However, IMO, everyone should know the quadratic formula by heart.  :-)

(I once wrote a routine to factor quartic equations.  Now that's definitely something you wouldn't want to memorize.)

Paul Brinkley
Thursday, February 14, 2002

Thanks. I actually do have the quadratice formula memorized (my dad drilled it into my head in junior high).

The real problem is that my computer doesn't have it memorized. Either I have to add that formula into the code, or I have to do what you did...pull it out into a seperate function--a solution you didn't seem to enjoy ;)

Winky Chin
Thursday, February 14, 2002

Wait a second, there's a flaw in the solution presented.

We have a constraint that the amount of memory used cannot depend on n.  But the solution requires that you have some variable to store the sum.  How many bits of storage do you use to store the sum?  No matter how many bits you use, there will be some number n for which the sum will exceed the values representable by that many bits.  For example, if you use 16 bits, then you cannot store the sum if n >= 363, because then the sum will be at least 65704 (if I did my math right).  So the larger n is, the more bits of storage you have to use.  Granted, it's not a linear progression of required storage space like the bit vector solution, but the amount of storage required is still related to n.

Erick C. Jones
Monday, March 4, 2002

Eric: admittedly, the larger the sum, the greater the number of bits needed to store the sum. AFAIK, this concern is not what is generally meant by the constraint that the amount of memory used cannot depend on n.

If you have x bits set aside to hold the sum, then you can handle (by rough estimate) sums where n = 2^(x/2). So yes the memory is dependant on n, but it is a logarithmic relation.

Solution: set aside 1024 bits to hold the sum and check that n is not > 2^512 before running the rest of the program. If n > 2^512, then throw an error saying that the interviewer is being unreasonable. ;)

At any rate, using only the summing technique will certainly require fewer bits than the "sum of the squares" technique that is given in the "official" solution.

Winky Chin
Tuesday, March 5, 2002

*  Recent Topics

*  Fog Creek Home