![]() |
SORT WHAT IS THE BEST WAY TO SORT A VERY LARGE FILE, IF THIS FILE IS TOO LARGE FOR THE MEMORY?
JOE
Send the output to the card punch. Take your stack of cards to the machine operator and tell her to run them through the sort process. Wait about 4 hours and go back to get your card stack now in sorted order. Feed them back into the punch card reader and print it all out on 11x17 fanfold paper to verify.
old_timer
Use the Bayes-Knorr algorithm with a Inverse-Tree-Traverse pattern. Google should give some more info on this.
Dennis Forbes
How old is your computer that it doesn't even type in lower case?
Bill Rushmore
Don't listen to Dennis, he's just trying to trick you.
Captain McFly
What does your data look like? Is it random or does it follow some known distribution? Does the sort need to be stable or not? You can probably use something like bucket sort (or a recursive bucket sort like quicksort). You just need to work out a reasonable scheme for storing the buckets (since they can be larger than your memory capacity).
Kalani
A toolshed or an abandoned warehouse are both excellent places to stash excess buckets until you need them again.
muppet
The great implementation advice on JOS is the main reason I keep coming back.
Bored Bystander
Don't let anyone fool you! There is only one sort: BUBBLE! But make sure to run it twice in case one slips by.
Capn' Kirk
I believe the method of superior sorting algorithm you seek is "bogosort". Here is a link for more information: http://en.wikipedia.org/wiki/Bogosort
anon-y-mous cow-ard
I thought the obvious answer would be to buy more memory?
Ankur
Undo your Caps Lock and everything will sort fine, my friend.
If you need to sort ONE very large file, then it means it has only ONE element, which means it's SORTED in any way you want.
"Bogosort"
Babu Jones
What does it mean to sort a very large file? Are you trying to arrange the bytes in the file in ascending order?
name withheld out of cowardice
As Bored Bystander said, it's also a good idea to examine how you use the file in the first place. If you write infrequently but read frequently, for example, you should store the file in sorted form (in this case your time to sort will obviously be O(1) but your insert time will be something like O(log n)).
Kalani
JOE -
Yet another anon
To give a somewhat serious answer, if faced with such a situation I would chop the file into X parts and then sort and persist each of the X parts using conventional high efficiency sorting algorithms. Imagine having 5 sorted decks of cards with the lowest cards up - it's fairly easy and efficient from there to build it into one sorted deck.
Dennis Forbes
Pfft. O(1) is weak. I'm working on a O( n - n - 1 ) sort which actually sorts the file before you create it. I'm having a little bit of trouble with duplicate values, though.
Captain McFly
Note that JOE has not checked back. Guys like JOE expect the phone to ring with a specific answer in response.
External mergesort? Clearly some people have retroactively stolen my idea. I'm going to have to go get a patent and seek justice for this wrong.
Dennis Forbes
The obvious answer is not to sort it and go home and drink a beer - it will sort itself out, given enough time.
coward
"retroactively stolen my idea"
Yet another anon
If you're sorting lines in the file, maybe you have enough memory to store pointers to the start of each line in the file (if not the actual lines themselves). In this case you could use any decent sort algorithm just reading the relevant lines from disk when you need to compare them...
Matt
The OP's question reeks of some sort of low-level computer assignment - maybe a summer school high school computer class.
Dennis Forbes
Bubble sort, of course, as invented by my acquaintance Keith, the parking lot attendant, who assures me that it is the most efficient sort algorithm of all time.
E. Naeher
I have a book at home about mathematical cranks -- it is quite amazing the number of people who are thoroughly convinced that (a) they invented "casting out nines", and (b) they could make millions of dollars by patenting their invention and selling it to... someone.
Eric Lippert
Mr. Lippert, how scriptin' going?
Sorry, I didn't realize that this was probably a joke until I saw the previous post on sorting.
Kalani
+++We had a programmer who to get a random number from 1 to 10 kept would keep calling our random function (which returned 0-65535) until the return value was between 1 and 10+++
Did you know, your caps lock uses up CPU and ram? So things will go much faster if you capitalize properly.
Anon to protect the guilty
Assuming this question is for real....
Peter
JOE, just a quick idea.
Alex
Sorry, Dennis, I just swooped right down and replied. Missed your answer.
Alex
>> if (x < 3) { break; }
Alex
SORT. I JUST WANT TO SORT.
WOULD BE SORTER
Buy a Mac.
S. Jobs
Would be sorter:
Anon to protect the guilty
The right sort for your sort is a good sort. Google "porn" and don't come back.
caltrop
This would be childsplay in Delphi.
Just me (Sir to you)
HELP ME SORT.
WOULD BE SORTER
"HELP ME SORT."
Anon to protect the guilty
|