Fog Creek Software
Discussion Board




If not files & directories, then what?

In a prior post, someone mentioned that they didn't like files & directories.  If not files and directories, what would be the alternative?  (Hopefully not a super-registry.)

Nat Ersoz
Thursday, February 27, 2003

Idea: a dimensional model

I.e., you get your object (data) by querying in more than one dimension.

It's just an extension of the current model, where there is a single dimension: the directory hierarchy.

Imagine a file belonging at the same time to multiple hierarchies, some created automatically from metadata (e.g. creator, version, owner) and others perhaps built by parsing content against ontologies.
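
A rough sketch of that idea in Python, for anyone who wants something concrete to poke at. The file records, field names and the hierarchy() helper are all invented for illustration; this is not how any real filesystem does it.

```python
from collections import defaultdict

# Hypothetical file records: a name plus a few metadata fields.
FILES = [
    {"name": "report.doc", "owner": "alice", "version": "2", "project": "budget"},
    {"name": "notes.txt", "owner": "bob", "version": "1", "project": "budget"},
    {"name": "plan.doc", "owner": "alice", "version": "1", "project": "launch"},
]


def hierarchy(files, *dimensions):
    """Group files into nested dicts keyed by the given metadata fields, in order."""
    if not dimensions:
        # No dimensions left: this level is a leaf holding the file names.
        return sorted(f["name"] for f in files)
    first, rest = dimensions[0], dimensions[1:]
    groups = defaultdict(list)
    for f in files:
        groups[f[first]].append(f)
    return {key: hierarchy(members, *rest) for key, members in sorted(groups.items())}


# The same three files seen through two different "directory trees".
by_owner = hierarchy(FILES, "owner", "project")
by_project = hierarchy(FILES, "project", "version")
```

Each file shows up in both trees without being copied; the trees are computed from the metadata rather than stored.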

That would be fun

R Chevallier
Thursday, February 27, 2003

I am always confused by this kind of discussion.  I can't see the fundamental difference between a registry and a filesystem.  They are both hierarchical information stores.

Some people think that you can do away with the hierarchy and find things by content or tags.

Personally I organise my tasks by directory, and the structure keeps my work day structured.

And not all registries are as problematic as the windows one - google skyos.

nice
Thursday, February 27, 2003

Maybe a DB driven filesystem? :)

ls or dir becomes

SELECT FILE FROM FILESYSTEM
WHERE FILENAME ='THISFILE'

This has already been done. I think VMS had a filesystem that was really a relational database. I might be wrong on this one though.
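
For the curious, the query above can be tried against a toy table with Python's built-in sqlite3 module. This is only a sketch: the table name, columns and sample rows are assumptions made up here, not the schema of any real database-backed filesystem.

```python
import sqlite3

# An in-memory database standing in for the "filesystem" (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE filesystem (filename TEXT, owner TEXT, size INTEGER)")
conn.executemany(
    "INSERT INTO filesystem VALUES (?, ?, ?)",
    [("THISFILE", "patrik", 1024), ("THATFILE", "nat", 2048)],
)


def find_by_name(name):
    """Return every row whose filename matches exactly, like ls/dir on one file."""
    rows = conn.execute(
        "SELECT filename, owner, size FROM filesystem WHERE filename = ?", (name,)
    )
    return rows.fetchall()
```

Calling find_by_name("THISFILE") returns the single matching row; adding more columns to the WHERE clause is where this starts to do things a plain directory listing cannot.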

Patrik
Thursday, February 27, 2003

One of the next versions of windows will feature a sql based file system supposedly. Pretty interesting if it ever goes through. It's been rumored for the "next windows release" for the past 5 years.

Ian Stallings
Thursday, February 27, 2003

Sets, subsets, and elements.  Explicitly defined and implicitly defined via properties.  Just like math class :P

Georg Cantor
Thursday, February 27, 2003

It's BeOS that had the file system that was a relational database. Every file could have arbitrary extra columns.

Before you get all esoteric, usability research has constantly shown that most users don't understand folders within folders. Outlines and hierarchies are extremely logical to programmers but are not well grasped by the general public. If I had a buck for every programmer that got in trouble thinking users love outlining and hierarchies...

Joel Spolsky
Thursday, February 27, 2003

Joel,

>most users don't understand folders within folders.

Is there any usability research on how the general public understands a DB-driven filesystem? I mean, in such a filesystem you could do away with all hierarchies and just find your stuff using personally meaningful tags that you slap on the files. Think Post-It notes.

The general public which makes up the female parts of my family would love the Post-IT-filesystem ;-)

Patrik
Thursday, February 27, 2003

"If I had a buck for every programmer that got in trouble thinking users love outlining and hierarchies... "

Are you trying to tell us you don't? <g>

And while we're on the topic... What is the most efficient (prevalent?) data structure or method used to represent hierarchical data?

Root >> Filing Cabinets >> Folders >> Files

The sinus meds have kicked in!

B#
Thursday, February 27, 2003

The file/directory structure uses a hierarchical model (tree). If links are allowed, then the underlying model is a network (graph). Simple OSs would use even less than a tree, e.g. a list: a single root directory with files inside. OO and relational models fall into the graph-like category too.

Alternatively we can use maps to build a library-like file system: repository + catalog mappings. I guess the broader category here is "hashing/mapping".

Associative networks are an extreme case of mappings, but the way they operate makes them unsuitable for OS-like applications (just think: some have non-deterministic behavior).

Cheers
Dino

Dino
Thursday, February 27, 2003

OK, simple example.

I'm a programmer (common ground, right?), on a not-too-complex project

In today's world, I have a group of C language files + headers, in a directory.  There likely are some common project headers in a common directory somewhere.  These are part of a larger project where my working directory is one level deep from the top level Makefile.  Make is merely a synonym for "dependency based rule thing".

Top
+----- Common Headers
+----- My working dir
+----- Others...

So, I'm a user.  How does this simplify this user's life?  Perhaps it's too simple, so it doesn't?

Anyhow, need some illustrative examples.

Nat Ersoz
Thursday, February 27, 2003

Really rich metadata. Filesize etc. could still be there as metadata, but also information about which category of data this file contains, which project it is associated with, who wrote it etc. etc. (There are probably better examples of metadata for files, I haven't thought much about this yet.).

In other words, the "filesystem" would be organized in terms of metadata, much more like a human's memory I suppose. The metadata could be automatically generated or manually entered by someone. I suppose a directory hierarchy could be implemented as metadata as well, so if you wanted directories you'd be free to use them.

This would work well with a database-based filesystem as well I suppose.

I'm just thinking out loud. =)

Karl
Thursday, February 27, 2003

A recent article on osnews.com touched on this topic and generated some interesting discussion as well.

http://www.osnews.com/story.php?news_id=2762

Brian B
Thursday, February 27, 2003

DataCubes :-)

Prakash S
Thursday, February 27, 2003

BeOS was not in any way relational. It just allowed extended attributes.

I think before trying to come up with a new design for file systems, the problems with the current ones should be identified. Until then it is pure speculation, which will not lead to any good design. Personally I see nothing wrong with hierarchical directories.

Btw, if someone is confused by subfolders he might very well not use them.

Passater
Thursday, February 27, 2003

What I'm hoping these future file systems do is build in flexibility to handle all the things that we might want in the future but can't think of now.

What I mean by this is rather than saying: "here's how we're going to organize files", instead build a system that makes it easy to overlay organizational structures on top of the files themselves.

Then, if you want to look at your files in hierarchical folders you use the "folder viewer". If you want to look at your files as filtered attributes, use the "attribute viewer". If you want to look at your files as a related web of items, use the "relationship viewer". Etc, etc. And make it easy for programmers to create new "viewers" (with flexible meta and meta-meta-data and lots of API hooks).

What I'm getting at is that filesystem programmers should probably not try to guess what the next big thing will/should be and instead make a system that easily supports the innovations of the future. I'd like to see the filesystem of the future clearly separated between the underlying storage of data and meta-data and the views of that storage.
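
Here is a minimal sketch of that separation in Python: one flat store of records, with the "viewers" as plain functions layered on top. The store layout, field names and both viewer functions are hypothetical, purely to illustrate the split between storage and views.

```python
# Hypothetical underlying storage: records keyed by an opaque id, no hierarchy.
STORE = {
    1: {"name": "a.c", "path": "proj/src", "author": "bill"},
    2: {"name": "a.h", "path": "proj/include", "author": "bill"},
    3: {"name": "todo.txt", "path": "proj", "author": "sammy"},
}


def folder_view(store):
    """'Folder viewer': map each folder path to the names filed under it."""
    view = {}
    for record in store.values():
        view.setdefault(record["path"], []).append(record["name"])
    return view


def attribute_view(store, attribute, value):
    """'Attribute viewer': names of all records whose attribute equals value."""
    return sorted(
        record["name"] for record in store.values() if record.get(attribute) == value
    )
```

Adding a new viewer means writing another function over STORE; the storage itself never has to change.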

Did that explanation make sense?

Bill Tomlinson
Thursday, February 27, 2003

I'm 95% happy with the way file storage works now. The other 5% is that I want symlinks, dammit!

Philo

Philip Janus
Thursday, February 27, 2003

Bill, I think that makes absolute sense and I wonder what innovations were not available because current filesystems weren't powerful enough to support elegant abstractions.

sammy
Thursday, February 27, 2003

Bill makes an excellent point.  There are two problems here.  The first one is the internal structure of the system and the other is how this structure is presented to users/client systems.  Using a set or relational model as the backbone of the system is useful internally, but definitely not something I'd expose raw to any but the most sophisticated of users.

Georg Cantor
Thursday, February 27, 2003

Actually, I think one of my biggest gripes about current file systems is that the presentation is basically identical to the implementation.  They are tightly bound together and you have to jump through hoops to change either one independently of the other.

Georg Cantor
Thursday, February 27, 2003

Agreed with Karl about really rich metadata.

So if a file is mis-filed you can still find it by doing a google-like search on it.

Or if you had a file 3 years ago and all you can remember was it started with xyz.

Etc.

anon
Thursday, February 27, 2003

check it out:

find / -type f -exec grep -l xyz {} +

If it's ASCII, it'll find it.  Just need a Unicode grep, and I'm there.
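
One possible stand-in for the missing Unicode grep, sketched in Python. It searches the raw bytes of every file for the text encoded a few different ways. The function name and the default list of encodings are my own choices, and it reads each file fully into memory, so treat it as a toy rather than a replacement for find and grep.

```python
import os


def files_containing(root, text, encodings=("utf-8", "utf-16-le", "utf-16-be")):
    """Return paths under root whose raw bytes contain text in any listed encoding."""
    needles = [text.encode(encoding) for encoding in encodings]
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            try:
                with open(path, "rb") as handle:
                    data = handle.read()
            except OSError:
                # Unreadable files (permissions, vanished files) are skipped.
                continue
            if any(needle in data for needle in needles):
                matches.append(path)
    return sorted(matches)
```

Usage would be something like files_containing("/home/me", "xyz"), where the path is whatever directory you want to search.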

Nat Ersoz
Thursday, February 27, 2003

Yes, search is key. Which is quicker: using google to find a file in the universe of the internet, or using your filesystem to find a file on your local constrained machine?

You can store all your files in Outlook today. (One of the many things it is designed to do.) Outlook has categories, sort by date/length/keyword. And by subject. And it has this weird journal thing which you have to remember to turn off (well, it's off by default now) which lets you search by when you looked at a file, not just when it was last saved.

What I find interesting is that I sometimes use hierarchy (I have folders for each project and move mail there), but often resort to flat views (search sent or deleted items). This is just for mail, I never put documents there, but frankly most of my 'documents' are either mail messages or pointed to by mail messages.

mb
Thursday, February 27, 2003

What you've got there is basically a grep-based query language.  A relational sort of file system would have exactly that sort of thing available, but hopefully much more powerful, because more (perhaps arbitrary) metadata would be available.  A file system that supported inter-file relationships explicitly would be more powerful still.

Grep++
Thursday, February 27, 2003

One problem with 'metadata' filesystem is how to specify the metadata. For example, if the data is organized by say, the Author name, then someone has to *first* provide that value. Which means that when a file is being saved, a whole bunch of extra information has to be provided, while today, with file system, you just select the directory (most probably the same one as before) and give a name to the file and you are done. Laziness always wins :-)

For example, all office documents today already support a rich set of metadata (File->Properties), but how often do we use that ?

satya
Thursday, February 27, 2003

I am glad others mentioned it; the BeOS file system was (is?) years ahead. Everything the hyped new MS file system will do I had while running BeOS.
The metadata on every file was great. The files knew what program created them, when, their size, and anything else you wanted to throw in. There was no idiot mapping of files to programs using three character extensions. In fact the address book was a directory with each address stored as metadata and a 0 length file.
The BeOS file system used 64 bit offsets, so files could be over a petabyte.
It was also a journaling file system. That means you did not have to wait through a file system check if you kicked the plug out of the wall. It was actually hard to lose data.
There was a sort of file manager that could index through all of the metadata. Doing a find took microseconds, no need for animations to tell users the computer is busy.
I had all this in 1997/98 and I miss it.

Doug Withau
Thursday, February 27, 2003

According to the self-proclaimed "genius" David Gelernter, TIME is the most basic dimension, therefore users will relate better to the computer if all of their stuff is organized by a stream of time.

See more here: http://www.scopeware.com/

Personally, I hate it.  I like files and folders.  Screw the idiots that can't figure it out, I'm not catering to stupid people, I do programming for my own enjoyment.

Wayne
Thursday, February 27, 2003

Here are a few actual examples of using a BeOS-like database-driven file system.  BeOS works very much like this (I've simplified the explanation a bit for clarity).

I have all my MP3 files tagged and sitting in various folders on my hard drives.  I want to listen to all my techno music (which may be actually sitting in the Anime, Fun Background Music, and other folders).

I right-click on the desktop and select "New Query".  A query window pops up.  I make the following selections:

File Type: mp3
Field: "Genre" = "techno"

When I click the OK button, a new window pops up displaying all the MP3 files that match these criteria.  Because all these files are stored as a database, the list is completely populated within several seconds.  I save this query, naming it "Techno Music".

I then remember that a friend e-mailed me some time in the past week with a request for help.  I create another new query, that looks like this:

File Type: e-mail
Field1: "From" contains "george@stine", AND
Field2: "Date" is greater than "one week ago"

Boom; I see a list of all the e-mails I've received from my friend in the past week.
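
The two queries above can be mimicked in a few lines of Python. This is only an illustration of attribute queries in general, not of the actual BeOS query API: the records, the query() function, the predicate helpers and the fixed NOW timestamp are all made up for the example.

```python
from datetime import datetime, timedelta

NOW = datetime(2003, 2, 27, 12, 0)  # fixed "current time" so the example is repeatable

# Hypothetical files, each carrying a type plus arbitrary attributes.
FILES = [
    {"name": "track01.mp3", "type": "mp3", "Genre": "techno"},
    {"name": "track02.mp3", "type": "mp3", "Genre": "jazz"},
    {"name": "help.eml", "type": "e-mail", "From": "george@stine.example",
     "Date": NOW - timedelta(days=2)},
    {"name": "old.eml", "type": "e-mail", "From": "george@stine.example",
     "Date": NOW - timedelta(days=30)},
]


def equals(field, value):
    return lambda f: f.get(field) == value


def contains(field, text):
    return lambda f: text in f.get(field, "")


def newer_than(field, cutoff):
    return lambda f: field in f and f[field] > cutoff


def query(files, file_type, *predicates):
    """Names of files of the given type for which every predicate holds (AND)."""
    return [
        f["name"]
        for f in files
        if f["type"] == file_type and all(p(f) for p in predicates)
    ]


techno = query(FILES, "mp3", equals("Genre", "techno"))
recent_mail = query(
    FILES, "e-mail",
    contains("From", "george@stine"),
    newer_than("Date", NOW - timedelta(days=7)),
)
```

A "saved query" is then nothing more than the stored arguments to query(), re-run whenever the window is opened.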

The OS comes with a little daemon you can set up to download e-mail automatically in the background.  I've set up a simple query that displays all the e-mail received in the past three days.  I leave that open on my desktop, and I now have a complete e-mail solution.  Double-clicking on any e-mail opens the e-mail, obviously.

If I accidentally close my Techno Music query window, I can always right-click on the desktop, select "Saved Queries" and find it in the list.

Brent P. Newhall
Thursday, February 27, 2003

Just to play devil's advocate for a second...what happens when the file you're looking for wasn't properly categorized? Let's say one day I want to throw a note about a project, write it up in notepad, and save it as a readme file in the project directory. How would that work in a database scenario? I might have a few hundred readme's lying on my harddrive.

I do have a couple solutions in mind, but I'm curious how this BeOS would handle something like that.

The DA
Thursday, February 27, 2003

Karl,

The human memory is an associative one. Information is stored in the weights and thresholds  of neural networks in ways we don't really understand or at least predict.

As a plus, the brain is capable of changing the semantics of symbols in the process of thinking, which is a very important process. For example, when we read "APPLE", after recognizing the letters A P L E and putting them together as APPLE, our mind can replace the symbol with the apple as a concept: what an apple means to each one of us.

Point is, this way of storing information is convenient only for "brain like" computers and completely useless for "computer like" computers. Maybe quantum computers will use this sort of memory; however, a quantum database (and querying on such a database) would be quite different from our current database technologies.

Cheers
Dino

Dino
Thursday, February 27, 2003

>>>> Before you get all esoteric, usability research has constantly shown that most users don't understand folders within folders. Outlines and hierarchies are extremely logical to programmers but are not well grasped by the general public.<<<<<<

Alan Cooper made much of this in "The Inmates Are Running the Asylum". His example was of Mary leaving her computer for lunch with the "My Documents" folder open. The sysadmin came round and did some routine maintenance and left the computer with the Explorer window open at the C drive. Mary returned and panicked to find her documents gone.

I've seen users (on a multi-user machine) actually end up with four My Documents folders and be overjoyed to find that they actually still had everything in place - normally in about three different versions.

I discussed this once with one of the lecturers who do the "Intro to Computers" course at our technical college. We agreed that we could throw away the whole course but that one thing that really needed teaching was the concept of files and directories because few picked it up intuitively.

Stephen Jones
Thursday, February 27, 2003

All the above comments indicate that you techies have no role to play in any meaningful debate about the way forward.

Realist
Thursday, February 27, 2003

Realist,

Judging by your contribution to this discussion, it's obvious you do.

Patrik
Thursday, February 27, 2003

MSFT has considerably poisoned the files/directories paradigm, both for developers and users:

1. Adopting backslash '\' instead of the historic forward slash '/' for directory delimiters.  Very nice.  The nice thing:  it affects MSFT more than anyone else.

2. Changing the "My Documents" location (i.e. the default location of a Word document) with each new version of Windows and sometimes Office.  It's here, it's there.  Now go try to find "My Documents" from the command line.

Is it really the file/directory paradigm that is broken, or is it how it's being used?

Nat Ersoz
Thursday, February 27, 2003

Brent, thanks for your description of searching for things using BeOS. It's a real shame Windows doesn't do this as well.

What happens if you want to classify music as belonging to multiple genres? For instance, "Ghost In The Shell" might be "Anime" and "Techno". Fatboy Slim might come under "Norman Cook" and "Dance" whereas Beats International might be under "Norman Cook" and "Funk". Apologies if you don't recognise the music, but I'm sure you get the idea.

I'm assuming that the database-driven filesystem is designed to solve just this sort of problem. But (and this is the main question) how do you move files between computers / over a network? Everyone will have a different way of categorising, therefore different metadata. How do you keep your classifications? How do you ensure that someone else's classification system doesn't have tags that clash with yours?

One solution would be to have agreed metadata schemes for various types of structure (MP3 collection, sourcecode, etc). But then doesn't that become as restricting as the current filesystem designs?

Adrian Gilby
Thursday, February 27, 2003

>> Adopting backslash '\' instead of the historic forward slash '/' for directory delimiters.  Very nice.  The nice thing:  it affects MSFT more than anyone else.

That's such a non-issue...  what difference does it make?  And the win32 api lets you use either forward or back slashes.
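
A quick way to see the two separators treated alike is Python's pathlib, which models Windows path rules without needing a Windows machine. Note this shows Python's own parsing of Windows-style paths, not a call into the Win32 API, and the example path is made up.

```python
from pathlib import PureWindowsPath

# The same (hypothetical) location written with each separator.
forward = PureWindowsPath("C:/Users/nat/My Documents/report.doc")
backward = PureWindowsPath(r"C:\Users\nat\My Documents\report.doc")

# Both spellings parse to the same components and compare equal.
same_path = forward == backward
components = forward.parts
```

Here same_path is True, and printing either object shows the backslash form, which is only a display convention.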

Brian
Thursday, February 27, 2003

'Folders', 'Directories', 'Files' are just words for an abstraction.

1000 years from now, assuming the existence of the human race, the same abstraction will be in place. It might be called something different, but it will be the same.

Realist
Thursday, February 27, 2003

Nat,
The backslash is a historic relic from CP/M (DOS is not UNIX-based but CP/M-based). Also note that the internal Win32 filesystem APIs can and do use the forward slash.

A Software Build Guy
Thursday, February 27, 2003

'They can and do use the forward slash to delimit directories'
is my complete statement

A Software Build Guy
Thursday, February 27, 2003

Nat,

I don't know that the file/directory approach is broken.  To me, it's just a little limiting.  The file system presents a single major view of your information, but it would be nice to be able to maintain several views at once - several logical organizations built on top of a single physical data store.  All the other stuff that has been mentioned is nice too (powerful queries, extensible metadata, etc), but for me it would be a huge win to even be able to construct several hierarchies without having to copy files.

One thing that really bugs me, though, is using what essentially amounts to an address in places where one would normally want a name/identifier.  What is commonly called a filename is actually a file address.  If the file moves, that filename no longer represents the file.  In some cases, this is a good thing, but there are many cases where it is not correct behavior.  This would be like everyone referring to you by your street address instead of your name.  It would be nice if each file had a permanent, unique identifier associated with it so that you could access the file no matter where it went.  URLs exhibit the same problem, which is one reason people have to employ all the whacky mapping schemes or put real identifiers into the URL so they can be parsed out later.
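
To make the name-versus-address distinction concrete, here is a small Python sketch of a registry that hands out permanent identifiers and tracks where each file currently lives. The FileRegistry class and its methods are hypothetical; real systems that do this keep the mapping inside the filesystem itself.

```python
import uuid


class FileRegistry:
    """Maps permanent file identifiers to the file's current address (path)."""

    def __init__(self):
        self._path_by_id = {}

    def register(self, path):
        """Assign a new permanent identifier to the file at path."""
        file_id = uuid.uuid4().hex
        self._path_by_id[file_id] = path
        return file_id

    def move(self, file_id, new_path):
        """Record that the file now lives somewhere else; its identifier is unchanged."""
        if file_id not in self._path_by_id:
            raise KeyError(file_id)
        self._path_by_id[file_id] = new_path

    def locate(self, file_id):
        """Return the current address for a permanent identifier."""
        return self._path_by_id[file_id]
```

Anything that stored the identifier instead of the path keeps working after a move, which is exactly what a plain filename cannot offer.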

Chicken Sandwich
Thursday, February 27, 2003

Interesting that BeOS is mentioned here but not the Mac. HFS had folders, but it also had metadata (filetype != filename, and  a limited # of other attributes). Files were numbered (I think, I don't know the details), not named. So you could move or rename a file and many things still worked (including aliases).

In OS X they broke much of this, since it's really Unix. Funny, the filesystem UI in OS X is much harder for me to use than either modern versions of Windows or ten year old versions of MacOS.

mb
Friday, February 28, 2003

There's a good paper on this topic at http://www.reiserfs.org/whitepaper.html

MattF
Friday, February 28, 2003

\ vs / is a non-event except for those poor people like me who use the commandline occasionally:

* Try tab-complete with /

* Try to convince some apps to tell the difference between / meaning root (of drive, after all, this is windows) instead of denoting a command switch (what happened to - and --?)

If you are swapping machines constantly, the differences between unixes aren't very pronounced.  Minor things.  But the difference between windows and unix is crazy.  I now have lots of one-line batchfiles to execute the dos equiv when I type unix commands like "ls".

Back to database filesystems:

ReiserFS is well worth a look.
As is the AFS used by AtheOS.
As is this discussion here: http://www.theregister.co.uk/content/4/24648.html

nice
Friday, February 28, 2003

The file/folders hierarchy was good for small sets of simple things, but it breaks down quite quickly. As others have said before, the limitation of having to define a single taxonomy for classifying all your information is inconvenient to put it mildly. For most users "Start/Search" and "Start/Documents" have long since replaced the manual browsing down the folders in Explorer. For received files, we just leave them in Outlook since it is far more flexible to find them that way.
The file system as a DB is just a way of making these methods of access more universal. You look for that market report that contains the 2000 figures for the New York branch, that was edited by John and forwarded by you to the regional directors.
We can imagine many user interfaces to this, but one could just be everything we have now, only “search files or folders” will be very much more powerful integrating all separate search facilities you have now in the different applications.

Just me (Sir to you)
Friday, February 28, 2003

Nice,

in the light of research that shows most users fail to understand the folder within folder concept, the direction of the slash on the commandline certainly seems to qualify as a non-issue.

Just me (Sir to you)
Friday, February 28, 2003

So far everybody has been concentrating on the implementation details. 

Can anybody think of what METAPHOR could replace folders and files?

I think that people think spatially.  Many effective memory mapping techniques involve placing items spatially.

This is the problem with computer interfaces: they all exist on a little computer screen, crowded on top of each other.  It's all recursive, with things nested inside other things.  For most users it's just too confusing.

I think the idea of different 'rooms' for activities is good.  Within the room you have work surfaces and tools.  There were a lot of ideas like this when VR was all the rage.

I think this is the way forward for users.  Obviously developers would demand a completely different view of everything.

Ged Byrne
Friday, February 28, 2003

I'd like to see a natural language interface.  I don't usually want to view all my files and browse all over the place when I interact with the file system.  I don't even really care where the stuff gets stored.  It would be nice to have a list of recently used files and then a natural language query engine that  could go retrieve stuff that I specify in my query.  "Find all the files used in version 3 of the gigatron project."  Even better if the system could interact with me to refine the search and perhaps learn some of my vocabulary.  Someday :)

Chicken Sandwich
Friday, February 28, 2003

There are systems that don’t have a traditional folder (tree like structure), and you can still organize the files and data very well indeed.

A great example of this is the Pick operating system. I have often mentioned this database system on this BBS.  While the Pick system now runs on Linux, Unix, and Windows NT, for many years it ran as a native OS. That OS was NOT tree/folder based. In fact, any time you listed some files, you were using the query processor to do so. There is no tree structure in that system.

This of course meant that you could use the same selection commands to select source code files and copy them as you could with records.

So, for example, to copy some records between two databases you could go

Select PersonalNames where City = “Calgary”
>15 items selected
copy PersonalNames
(to: OldNames
>15 records copied.

For source code, the same commands could be used, but source code in Pick DID NOT have field attributes defined (you could not define additional fields). The reason is that source code was in fact a record in a database like everything else (the first line of code would be considered field1, etc.). When you compiled the code (it used p-code), the compiled code was actually stored in the dictionary (field) def file.

The system did not have a tree structured dir. However, it did go one level deep for files. As it turns out, that one level deep was rather nice.

A listfiles (equivalent of a dir in windows) from that system would look like:
(I am actually running pick as I type this).

Listfiles
You get:
http://www.attcanada.net/~kallal.msn/test/listf.gif

I can also drill that ONE level down  to any of the above files. So to list the HotelRates File.

Listfiles dict HotelRates 
You get:
http://www.attcanada.net/~kallal.msn/test/listfdict.gif

That gives me the files for that database. Note that 3 items out of 13 are displayed. The other items in that database are field defs. We can query that data to get those with:

Listdict dict hotelrates
You get:
http://www.attcanada.net/~kallal.msn/test/ld.gif

Note that every single one of the commands I am typing in are actually a shell command to a query processor. I could list/show what each command above actually gets translated into, but that is not needed.

And, finally, here is an actual query on the data table:

List hotelrates nameOfType
You Get:
  http://www.attcanada.net/~kallal.msn/test/Ilist.gif

The above shows just the first two records.

Note that only TWO records are displayed, but each is multi-valued. In fact, the roomtype is actually a lookup join to a lookup table called roomtypes. Note how the one record has several values for each field.

Thus, a very neat, and very different system. It is very much like going to a different planet. Each record in the system can in fact be a whole table. Hence the name: a multi-value, or multi-dimensional, system. The tables can go ONLY one more dimension deep here.

However, most developers avoid the 3rd dimension in Pick (or jBase, or IBM’s UniVerse database, since those databases are all Pick compatible, and the above commands will work in all of them). I think it is quite telling that we tend to avoid the 3rd dimension.

Having developed software in a non tree like system is certainly a treat. I wish all of you could experience development in a non tree type system at least once in your computing career. I think it rounds out one’s computing viewpoint, and gives one a different view on how to organize data and files. It is kind of like a trip to Europe! It has been a great experience for me.

However, I have to say that most systems today do present data as a tree type system, and that is totally fine with me. Even if the data/files are all stored in a database, it will still most likely be presented to the user as a tree like structure to browse. That seems to work the best right now.

Remember, let’s not confuse a general file search ability in an OS (which all systems should have) with the idea of a tree like dir structure. In searching, we don’t care about that tree structure (so BeOS or Pick is OK). However, for ORGANIZING our own files, a tree structure is most certainly welcome.

Good searching is a separate concept from that of a tree structured dir being useful.  You can have one without the other.

While new users have trouble with tree like structures, once learned, they use a tree structure very well indeed. Let’s not shoot the trees because we don’t have good search abilities for our files. Trees are a great way to organize our files.

Albert D. Kallal
Edmonton, Alberta Canada
Kallal@msn.com

Albert D. Kallal
Friday, February 28, 2003

Ged,

MS has hinted at this VR style interface before. Technically it is certainly feasible on today's machines, and I believe MSR has had some stuff on this. It is a nice addition to browsing, but it will not be a replacement for the unified searching that I think would be the most significant advance. To call that just an implementation does not do it justice. Think of it as the wizardly secretary sitting in your VR office world that will find all the stuff for you on the basis of your descriptions of any kind of attribute, content, relation or usage pattern or whatever else you fancy.
Would you bother cleaning up your desk if stuff was never in the way and retrieval was just a matter of "ask and ye shall find"?

Just me (Sir to you)
Friday, February 28, 2003

How well are all the non-directory file systems going to work in a multiple-user environment?

Stephen Jones
Friday, February 28, 2003

"How well are all the non-directory file systems going to work in a multiple-user environment?"

There is no need not to have traditional file systems, as long as they are for the computer, not the human. The computer can easily manage all the intricacies of the file system.

But, for retrieval, humans need something that more closely fits our mental models. Multiple-user issues are simply one more factor to be taken into consideration while designing a better human experience.

Practical Geezer
Friday, February 28, 2003

"What happens if a file isn't categorized?"

Well, it already has some metadata applied to it -- its name, its date of creation, its file type, etc.  You can always search for all files with "README" in the name field, with a file type of "plain text".

Incidentally, BeOS stores the file's type as an attribute, using the MIME type standard.  So, when you save a new file in a text editor, the editor will set the new file's type to "text/plain".  This makes searching for files very easy.  Also, BeOS assigns plain English labels to each MIME type, so for example, instead of searching for "audio/x-mpeg", you can search for "MP3 File".

"What happens if you want to classify music in multiple categories?"

Here's my solution; I don't know if it's the best:

Let's say I have a techno piece from the anime series "serial experiments lain."  I set its Genre attribute to "techno,anime".  My query for techno music searches for all MP3 files where the Genre *includes* the text "techno".
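
One wrinkle with a plain "includes" test on a comma-separated attribute is that it also matches longer words that merely contain the text. A small Python sketch of a stricter check, splitting the attribute into separate values first. The has_genre() helper is hypothetical and says nothing about how BeOS itself evaluates queries.

```python
def has_genre(genre_attribute, wanted):
    """True if wanted is one of the comma-separated genres, ignoring case and spaces."""
    genres = [genre.strip().lower() for genre in genre_attribute.split(",")]
    return wanted.lower() in genres


# Plain substring matching versus matching whole genre values.
substring_hit = "techno" in "technopop"        # substring test: matches by accident
exact_hit = has_genre("technopop", "techno")   # split test: no match
multi_hit = has_genre("techno, anime", "anime")
```

Whether the looser or the stricter behaviour is what you want depends on how tidy your tags are.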

"How do you resolve clashes when transferring files across a network?"

In BeOS, if you're copying a file between BeOS systems, the attributes remain.  If you're transferring to a non-BeOS system, the attributes are lost.  I don't know of any simple solution to this.  A standard would have to be written.

IMHO, attributes are worth having even if they're not cross-platform.

Incidentally, if you're interested in a modern alternative OS that has a BeOS-like file system, try Syllable (http://syllable.sourceforge.net/).  I understand that ReiserFS for Linux has this capability, but I'm not familiar with its status.

Brent P. Newhall
Friday, February 28, 2003

What would be wrong with a single flat directory
with a search engine? You could of course add
categories, like meta tags in HTML. These categories
could be hierarchical.

valraven
Friday, February 28, 2003

---"What would be wrong with a single flat directory
with a search engine?"===

The number of times you would want to save a file with the same name as another, for one thing. At least with separate directories you can do that.

Also, performance problems. The Register article mentioned further up, which deals with BeOS, covers that.

Stephen Jones
Friday, February 28, 2003

Stephen Jones raises a good point.  Having no directories creates its own problems.

One solution is to use both directories and queries; that's what BeOS does.

Brent P. Newhall
Friday, February 28, 2003

Brent, I'm not surprised that the extended attributes are lost when copying to a different OS. I was more interested in how the metadata is "merged" when copying between BeOS machines. For instance, my definition of "genre" for MP3 files might be completely different to your definition. What happens when we transfer MP3 files to one another? MP3s are perhaps a trivial example but I'm sure there would be other more serious areas where problems would occur.

What I'm trying to get across is that there's no accepted standard for how you classify your files, and therefore everyone's classification systems will be incompatible. Database-like "any attributes you like" file storage surely needs to have some imposed metadata structure to avoid this problem?

Adrian Gilby
Friday, February 28, 2003


---"What would be wrong with a single flat directory
with a search engine?"===

>>The number of times you would want to save a file with the same name as another, for one thing. At least with separate directories you can do that.

No, there is no problem entering a person with the same name into a database twice. You folks are still thinking in terms of file names, and not a database-driven system. If two names are the same, be it a file name or a person's name, a database handles this with no problem. In fact, for tracking revisions, or a nested undo/go-back document system, it can’t be beat. Why number 10 revisions of the same document by tacking a number onto the end? Why name them Sales1, Sales2, Sales3, etc.? Why not just keep the documents with the SAME NAME, all in order by last edited?  A database system is MUCH BETTER in this regard.
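To make the idea concrete, here is a small sketch using Python's built-in sqlite3 module. The `documents` table, its columns, and the sample rows are all invented for the example; it only shows that a table keyed by an internal id can hold several documents with the same name and return them by last-edited time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE documents (
        id     INTEGER PRIMARY KEY,  -- internal key; the name need not be unique
        name   TEXT NOT NULL,
        edited TEXT NOT NULL,        -- ISO-style timestamp, sorts as text
        body   TEXT NOT NULL
    )
    """
)

# Three revisions, all saved under the same name.
rows = [
    ("Sales", "2003-02-26 09:00", "first draft"),
    ("Sales", "2003-02-27 14:30", "added Q1 figures"),
    ("Sales", "2003-02-28 11:15", "final"),
]
conn.executemany(
    "INSERT INTO documents (name, edited, body) VALUES (?, ?, ?)", rows
)

# All documents named "Sales", most recently edited first.
revisions = conn.execute(
    "SELECT edited, body FROM documents WHERE name = ? ORDER BY edited DESC",
    ("Sales",),
).fetchall()
print(revisions[0])
```

Going back a revision is then just a matter of picking the next row down, with no Sales1/Sales2 naming needed.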

>>Also, performance problems.

Well, for large numbers of data sets and files, a database is MUCH faster than searching a stupid dir-based system. A spreadsheet is fine for 5 to 10,000 items; beyond that, a database runs absolute circles around it. There is no contest.

In fact, most of the new high-performance file systems are indexed, and are really a database now anyway. Throwing in a database engine is a natural progression of this.

We can use a database now because we don’t have the restrictions on memory and resources that most “dir” based systems were developed under.

A database is no good when you have a 16K Apple II, or a TRS-80 system.

When you give that database engine enough memory and processing, it begins to easily win hands down as compared to a simple file based dir system on disk.

With the average pc starting to push 100,000 files, a database engine is the ONLY WAY to go.

The folks in Redmond have a very good grasp of this problem. The DIRection they are going makes total sense. Do the other OS vendors realize this?

Does Sun see this?

Will the open source community be caught off guard on this issue also?

Albert D. Kallal
Edmonton, Alberta Canada
kallal@msn.com

Albert D. Kallal
Saturday, March 01, 2003

With BeOS, lots of attributes are defined by the OS automatically.  All MP3 files have attributes for Artist, Album, Title, Track, Year, Comment, Genre, Rating, etc.  If you create or download a new MP3 file, those attributes are created on that file automatically (even if they're all blank to begin with).

If I add an attribute named "My Little Attribute" to a file, I can copy it to other machines and that attribute will stay on the file and the OS on the other machine will see it and be able to manipulate it.
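A toy Python model of that behaviour, purely as a sketch: the OS-defined attribute names come from the list above, while `new_file`, `copy_file`, and the dictionary representation are assumptions made for the example, not the BeOS API.

```python
# Attributes assumed to be predefined for MP3 files, per the list above.
DEFAULT_ATTRIBUTES = {
    "audio/x-mpeg": (
        "Artist", "Album", "Title", "Track",
        "Year", "Comment", "Genre", "Rating",
    ),
}

def new_file(name, mime_type):
    """Create a file record with blank standard attributes for its type."""
    attrs = {"name": name, "type": mime_type}
    for key in DEFAULT_ATTRIBUTES.get(mime_type, ()):
        attrs[key] = ""
    return attrs

def copy_file(f):
    """Copy a file record; every attribute travels with it."""
    return dict(f)

song = new_file("track01.mp3", "audio/x-mpeg")
song["My Little Attribute"] = "hello"   # user-defined attribute
copied = copy_file(song)
print(sorted(copied))
```

In this model the standard attributes give everyone the same field names, while custom ones simply ride along with the file. What the values mean is still up to each user.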

Brent P. Newhall
Saturday, March 01, 2003
