Small SQL challenge
This is a very little, but interesting problem I've once came across. It has a lot of poor obvious solutions, and an elegant, but very non-straightforward one. For those of you doing hiring, this may make a good addition to your interview questions arsenal.
So, you have a table with two columns: "from" and "to". Let's say for clarity that it represents intercity roads. Because of the way the data was produced, for evey record, there's always a "mirror" record, which has "from" and "to" values vice versa. That is, if there's a
[From: Washington, To: Paris] record, there's guranteed to be [From: Paris, To: Washington]. There is also some amount of records in which both fields are identical [From: Paris, To: Paris]. These records are unique.
The task is to write a query to remove all "duplicate" records. That is, from [Paris, Washington], and [Washington, Paris] we only want one to remain, doesn't matter which. [Paris, Paris] must, of course, just remain, since there're no other records representing the same information.
The point is I'm just wondering how quickly people on this board can figure out the pretty way.
Egor
Tuesday, April 20, 2004
About 5 seconds: "delete from thetable where (from > to)" (with a suitable operator that does an alphabetic comparison of the two fields).
Christopher Wells
Tuesday, April 20, 2004
About 5 seconds? WOW! Hey, everybody..."super programmer" is in the hiz-ouse!
Color Me Impressed - Not!
Tuesday, April 20, 2004
Not a very good answer but...
select * from table a
where a.primarykey in (
select a.a.primarykey
from table a, table b where a.from = b.to
and b.from = a.to
minus
select a.p from table a, table b
where a.from = b.to
);
SC
Tuesday, April 20, 2004
Christopher, you rock. The least I've gotten was about 30 minutes!
Egor
Tuesday, April 20, 2004
Oops, should be...
select * from table a
where a.primarykey in (
select a.primarykey
from table a, table b where a.from = b.to
and b.from = a.to
minus
select a.primarykey from table a, table b
where a.from = b.to
);
SC
Tuesday, April 20, 2004
In other words:
isn't "with a suitable operator that does an alphabetic comparison of the two fields"
the important part?
Color Me Impressed - Not!
Tuesday, April 20, 2004
BTW, is there a DBMS that doesn't do alphabetical comparison on CHAR type with the > operator? What does SQL standard say?
To say the truth, in my case the data was lower case (not cities, domain-specific stuff).
Egor
Tuesday, April 20, 2004
DELETE FROM routes r1
WHERE EXISTS(SELECT NULL FROM Routes r2
WHERE r1.from = r2.to AND r1.to = r2.from AND r2.from <> r2.to)
I agree that Christopher's solution is shorter and neater, but I have learned not to trust any assumptions people put forward about their data :-). My statement will handle the case where there are rows for which the reverse doesn't exist. And the question with Christopher's solution has a high aha-factor, which makes it an unsuitable interview question.
x
Wednesday, April 21, 2004
Seems I am not the only one falling for the old "do my homework for me please".
Just me (Sir to you)
Wednesday, April 21, 2004
<<
DELETE FROM routes r1
WHERE EXISTS(SELECT NULL FROM Routes r2
WHERE r1.from = r2.to AND r1.to = r2.from AND r2.from <> r2.to)
I agree that Christopher's solution is shorter and neater, but I have learned not to trust any assumptions people put forward about their data :-). My statement will handle the case where there are rows for which the reverse doesn't exist. And the question with Christopher's solution has a high aha-factor, which makes it an unsuitable interview question. >>
With all my respect to the author - it is not a solution, it removes the rows with the reverse wich should be in the table.
The other solution provided above used "MINUS" which is useful with ORacle , bit does not work with SQL SERVER , SYBASE .
LI
Wednesday, April 21, 2004
Recent Topics
Fog Creek Home
|