Fog Creek Software
Discussion Board




Difference between word doc files and RTF files?

Anybody know what the difference is?

Anon
Wednesday, April 21, 2004

.doc = Microsoft Word Document (usually)
.rtf = Rich Text Format Document

Word can read all RTF, but not all RTF readers can read DOC files.

Greg Hurlman
Wednesday, April 21, 2004

Duh!

One format is proprietary and one is open?

You can not legally reverse-engineer the format of .DOC files; however, I think you can license the format from MS.

Patrik
Wednesday, April 21, 2004

>>You can not legally reverse-engineer the format of .DOC files

Of course you can. If you're thinking of the DMCA, it allows for reverse engineering for interoperability.

RocketJeff
Wednesday, April 21, 2004

"The Rich Text Format (RTF) Specification provides a format for text and graphics interchange that can be used with different output devices, operating environments, and operating systems. Version 1.8 of the specification contains the latest updates introduced by Microsoft Office Word 2003. RTF uses the American National Standards Institute (ANSI), PC-8, Macintosh, or IBM PC character set to control the representation and formatting of a document, both on the screen and in print. With the RTF Specification, documents created under different operating systems and with different software applications can be transferred between those operating systems and applications."


Download the latest specification from http://www.microsoft.com/downloads/details.aspx?familyid=ac57de32-17f0-4b46-9e4e-467ef9bc5540&displaylang=en

Just me (Sir to you)
Wednesday, April 21, 2004

RTF is a markup format (like HTML) and (relatively) standardised, while DOC is basically just a dump of Word's internal data structures (i.e. it's binary and changes at the whim of MS). DOC is much richer than RTF, which can only handle a small subset of Word's features. Now that Word 2003 uses XML, I guess RTF is basically obsolete.
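To make the "markup format" point concrete, here is a minimal sketch in Python that builds a tiny RTF document as plain text. The helper names (rtf_escape, make_rtf) are made up for illustration, and the sketch assumes ASCII-only input; real RTF normally also carries a font table and other header groups, which most readers should tolerate being absent.

```python
def rtf_escape(text: str) -> str:
    # Backslash and braces are RTF syntax, so each needs a leading backslash.
    return "".join("\\" + ch if ch in "\\{}" else ch for ch in text)


def make_rtf(plain: str, bold: str) -> str:
    # {\rtf1 ...} is the outer document group, \ansi selects the character set,
    # and {\b ...} is a nested group whose contents are rendered bold.
    return "{\\rtf1\\ansi " + rtf_escape(plain) + "{\\b " + rtf_escape(bold) + "}}"


doc = make_rtf("Plain text, then ", "bold text")
print(doc)  # {\rtf1\ansi Plain text, then {\b bold text}}
```

The whole document is readable in any text editor, which is the contrast with the binary DOC format.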

choox
Wednesday, April 21, 2004

Worth noting that Word will auto-read an .rtf file even if it's been saved with a .doc extension.
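One plausible way for a program to do this is to look at the first bytes of the file instead of trusting the extension. The sketch below assumes that approach (how Word itself decides is not stated in this thread). RTF files begin with the text {\rtf, and binary .doc files are OLE2 compound files that begin with the bytes D0 CF 11 E0 A1 B1 1A E1. The function names are hypothetical.

```python
RTF_MAGIC = b"{\\rtf"                               # start of every RTF document
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")      # OLE2 compound file header


def sniff_format(head: bytes) -> str:
    """Guess the format from the leading bytes of a file."""
    if head.startswith(RTF_MAGIC):
        return "rtf"
    if head.startswith(OLE2_MAGIC):
        # Binary .doc, but also .xls, .ppt and other OLE2 containers.
        return "ole2"
    return "unknown"


def sniff_file(path: str) -> str:
    # Eight bytes are enough to cover both signatures.
    with open(path, "rb") as f:
        return sniff_format(f.read(8))
```

For example, sniff_file("report.doc") would return "rtf" for an RTF file that was merely renamed to .doc (the file name here is only an example).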

Peter Ibbotson
Wednesday, April 21, 2004

Word will also auto-read an RTF if you associate that file type with Word.

Steve Barbour
Wednesday, April 21, 2004

Slight tangent, but I read that Microsoft copyrighted their new XML/DOC format for MS Office (whatever-the-hell-current-version-system-they-are).

Does that comment about reverse engineering for interoperability apply then?  I'm thinking about products like OpenOffice/Star Office.

Lee
Wednesday, April 21, 2004

Lee: It isn't a copyright issue for MS's XML formats but one of patents: http://news.com.com/2100-1013_3-5146581.html

Under the DMCA (Digital Millennium Copyright Act), reverse engineering is permitted for interoperability.

MS is trying to patent the XML formats for its office applications. Patents, copyrights and trademarks have all been lumped together as "Intellectual Property", but they are all different and covered by different laws.

RocketJeff
Wednesday, April 21, 2004

Copyright applies to copying, as in duplication, and is assigned by default to the author of any creative work. I.e., only the author has the right to make copies of the work.

Reverse engineering is completely safe. Cracking, on the other hand, is not, for the same reason.

To protect something against reverse engineering you need a patent.

Eric Debois
Wednesday, April 21, 2004

...and even if there are patents, you can still reverse engineer; you just can't build a work-alike based on what you learn.

Eric Debois
Wednesday, April 21, 2004

Under DMCA the _only_ thing you need to do to protect against reverse engineering is encrypt it.  To qualify as encryption, you need only employ a mechanism which prevents plain text viewing.  Can someone say rot13?
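For anyone who hasn't met rot13: it shifts each letter 13 places, so applying it twice gives back the original. A quick sketch using Python's standard codecs module shows how little "protection" it is; whether something this weak would count legally is the claim above, not something the code can settle.

```python
import codecs

# rot13 replaces each letter with the one 13 places further along the alphabet.
secret = codecs.encode("Hello, World", "rot_13")
print(secret)    # Uryyb, Jbeyq

# Applying the same transform again undoes it; there is no key.
restored = codecs.encode(secret, "rot_13")
print(restored)  # Hello, World
```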

With the new XML format and its patent, along with the DMCA, interoperability will not be allowed without licensing. A legal fix to a pesky problem.

Anonanonanon
Wednesday, April 21, 2004

No one seems to have mentioned that .doc is really a family of formats...

and MS reserves the right to add to the family ad nauseam.


Wednesday, April 21, 2004

I have not seen OpenOffice run into legal troubles, yet it reads the Word 2000 format perfectly, I think.

foo
Wednesday, April 21, 2004

There are lots of 3rd party libraries that read and write .doc files (well, all the Office files, really... we use the Excel reader and writer from Aspose in .NET, and it's excellent).

Brad Wilson (dotnetguy.techieswithcats.com)
Wednesday, April 21, 2004

RTF changes too... this just popped up on the 'recently added to MSDN' list:

Version 1.8 of the [RTF] specification contains the latest updates introduced by Microsoft Office Word 2003.

http://www.microsoft.com/downloads/details.aspx?familyid=ac57de32-17f0-4b46-9e4e-467ef9bc5540

mb
Wednesday, April 21, 2004

So does anyone else think things have gotten waaaaay out of control if an _XML format_ can be patented?

Kyralessa
Wednesday, April 21, 2004

Yeah, it's ridiculous. Saving text in an existing text format is a patent? WTF?

Chris Nahr
Thursday, April 22, 2004

Yes. Everything digital is just [01]*, right? What's the big deal?

Just me (Sir to you)
Thursday, April 22, 2004

Don't be an idiot. The idea of encoding print formatting as ASCII markup along with the text itself has been around for decades. Off the top of my head here's TeX, troff, SGML and DocBook, HTML, and RTF. Trying to patent another slightly different flavor is ridiculous, there's no innovation whatsoever.

Chris Nahr
Thursday, April 22, 2004

"Don't be an idiot."

I tried, but I can't help it.
What I was commenting on is the "in an existing format" part. It was probably an overreaction.
I'm probably sensitive to this since I am surrounded by so many sunnyboys around here who think that, once they have declared that the format for the data will be "XML", they have actually made a contribution.

"Hey Bert, we need a good format for streaming these graph data files over a fast but very unstable network. Usually they are heavily clustered, with dense connections within but sparsely interconnected. Any ideas?"
"Yeah, use XML"

Just me (Sir to you)
Thursday, April 22, 2004

I'd guess it's not the fact that it's XML that's patented, but rather what's expressed using XML?

"Engineering diagrams on blueprints have been around for decades: so if it's on a blueprint then it can't be innovative or patentable."

Christopher Wells
Thursday, April 22, 2004

Strawmen are flying low these days... When you submit a patent application with a blueprint, you want to patent the _content_ of the blueprint, whatever contraption is depicted in it.

What Microsoft wants, to stay with this image, is to patent a _style_ of blueprint. Like differently shaped arrowheads and what they should indicate.

Chris Nahr
Friday, April 23, 2004

I was hoping that, if they were patenting, then they were patenting content.

A copy of the patent application is here: http://v3.espacenet.com/textdoc?DB=EPODOC&IDX=EP1376387&QPN=EP1376387

It does look like prior art to me, in general. Maybe their motive is defensive then? To try, and then to be able to say "Our failing showed that all this isn't patentable: so, don't you come claiming that we're violating a patent of yours."

Or, perhaps the 'devil' is in the details: for example ...

    [0047] Other information may also be included within the document that is not needed by the word-processing program. According to one embodiment of the invention a "hints" element is included that allows external programs to easily be able to recognize what a particular element is, or how to recreate the element. For example, a specific number format may be in a list and used by the external program to recreate the document without knowing the specifics of the style.

... and then argue about whether this "hints" element is prior art.

Christopher Wells
Saturday, April 24, 2004


Can anyone give me a link where I can find the .doc 97/2000/XP specification? Will it be available on the Internet or not?

phani
Wednesday, June 16, 2004
