Fog Creek Software
Discussion Board




Format for a log file

Hi: I am writing a simple class to log messages and errors into a text file. I'm thinking of logging the messages in XML format:

<Log>
<File> xyz.cpp, line no </File>
<Message> ... </Message>
<Time>.... </Time>
</Log>

Is there anything that I may be missing or is there a standard format for a log file?

Bob
Monday, March 15, 2004

XML is okay, but since your log is probably strictly two-dimensional (i.e. every row has the same columns), it will be somewhat verbose -- the column names are repeated (twice, no less) on each row.

On both Windows and Unix you can use built-in logging facilities (event viewer and syslog, respectively) which have the advantage of an entire ecosystem of utilities that play well with them. For example one day you may want to get SNMP traps associated with logs, or get error events emailed to you. There are a ton of problems with event logging which you need to think about:

* same message repeats a million times, filling up the log ... you need to get the dupes removed and replaced with a comment like "previous message was repeated 983 times"
* logging things like "out of disk space" error messages, when the error log itself is on disk ... you need to preallocate space for a log and allow old messages to scroll off
* putting input that comes from a user into a log ... frequently has the potential to be a security hole unless you are very, very careful about the length of the data that was handed to you and very, very careful that all your downstream tools will be able to parse it
* multithreaded issues ... if one thread writes half a line to the log file and gets suspended, will the next thread, which needs to write a line, corrupt the file?
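
The first problem in the list, duplicate suppression, can be sketched in a few lines. This is only an illustration of the idea, not how syslog or the Windows event log implements it; the class name `DedupingLog` and the `sink` callable are hypothetical names made up for the example.

```python
class DedupingLog:
    """Collapse runs of identical messages into a single summary line.

    Hypothetical helper for illustration; `sink` is any callable that
    accepts one line of text (for example, list.append or a file writer).
    """

    def __init__(self, sink):
        self.sink = sink
        self.last = None      # most recent distinct message written
        self.repeats = 0      # extra times that message has been seen

    def log(self, message):
        if message == self.last:
            self.repeats += 1  # swallow the duplicate and just count it
            return
        self.flush()           # report any pending repeat count first
        self.sink(message)
        self.last = message

    def flush(self):
        """Emit the pending repeat count, if there is one."""
        if self.repeats:
            self.sink(f"previous message was repeated {self.repeats} times")
            self.repeats = 0


lines = []
log = DedupingLog(lines.append)
for msg in ["disk full", "disk full", "disk full", "recovered"]:
    log.log(msg)
log.flush()
```

Only the message text is compared here. A real logger would probably also flush the count on a timer, so a long run of duplicates does not stay invisible until something different is logged.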

You could do a dissertation on this one topic.

Joel Spolsky
Fog Creek Software
Monday, March 15, 2004

Maybe this would make more sense, and cut down on redundancy of labels.

<Log time="..." file="..">[message]</Log>
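
A rough sketch of writing that one-line form, assuming Python and its standard `xml.sax.saxutils` helpers. The function name `format_entry` is made up for the example. The escaping matters because the message may contain `<`, `&` or line breaks, which would otherwise break the markup or the one-entry-per-line property.

```python
from xml.sax.saxutils import escape, quoteattr


def format_entry(time, file, message):
    """Return one <Log .../> entry on a single line (hypothetical helper)."""
    # quoteattr adds the surrounding quotes and escapes the attribute value.
    # escape handles &, < and >; the extra mapping keeps embedded line
    # breaks from splitting the entry across lines.
    body = escape(message, {"\n": "&#10;", "\r": "&#13;"})
    return f"<Log time={quoteattr(time)} file={quoteattr(file)}>{body}</Log>"


entry = format_entry("2004-03-15 23:56:32", "xyz.cpp:144", "a < b\nnext line")
```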

Bleh
Monday, March 15, 2004

Heck you could probably write an operating system. :) An RDBMS is nothing more than a glorified logging system.

Li-fan Chen
Monday, March 15, 2004

Just because you can do something doesn't mean you should (:

Bleh
Monday, March 15, 2004

Look into log4j if this is for Java; a lot of these issues have already been addressed for you in a cross-platform way.  Even if you're not using Java, it's probably an excellent resource for learning more about the types of problems you'll run into.

Mr. Nobody
Monday, March 15, 2004

There's no reason to use XML. I'd use a concise format, so you can see as much information as possible on a single screen.

(2004/03/15 23:56:32) [xyz.cpp:144] Posting to JoS

If you wanted a program to read the log file for some reason, it's trivial to parse.
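
As a sketch of how little parsing that takes, here is one way to read the line above in Python. The regular expression assumes exactly the layout shown (parenthesised timestamp, bracketed `file:line`, then free text); the names `LINE_RE` and `parse_line` are made up for the example.

```python
import re
from datetime import datetime

# Assumed layout: (YYYY/MM/DD HH:MM:SS) [file:line] message
LINE_RE = re.compile(
    r"^\((?P<time>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})\) "
    r"\[(?P<file>[^:\]]+):(?P<line>\d+)\] "
    r"(?P<message>.*)$"
)


def parse_line(text):
    """Return a dict of fields, or None if the line does not match."""
    m = LINE_RE.match(text)
    if m is None:
        return None
    return {
        "time": datetime.strptime(m["time"], "%Y/%m/%d %H:%M:%S"),
        "file": m["file"],
        "line": int(m["line"]),
        "message": m["message"],
    }


entry = parse_line("(2004/03/15 23:56:32) [xyz.cpp:144] Posting to JoS")
```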

Julian
Tuesday, March 16, 2004

When should you write </Log> to the file?

It might be better to use <td class="LogLevel1">blah</td> with associated <tr> etc and use a browser to view the logfile directly. You can use css or dhtml to provide filtering, colours etc.

Justin
Tuesday, March 16, 2004

You will have to define your requirements first.

Should the log file be human readable? Should the log file be parseable? Does it have to be XML, so you can put on the product box that your application is XML enabled? Is the log file format constant or will you have different fields for each message? Is the log structure flat, tree-like or even more complex? How much data will be added to the log per second? How much data will be added two years from now? How do you rotate data? Do you need multi-threaded access to the log data? Will there be reporting tools that need to sort data or make queries? Do any reporting tools need to be live?

Depending on your answers to the above, a database, the OS log system, a text format or an XML format could each be the best candidate.

Personally, I like to keep things as simple as possible. For simple logs nothing beats UNIX-type text files that can be easily monitored using a tail command. But I like Bleh's solution too, because it both uses XML and has only one message per line, making it easier to parse at the cost of loosing just a bit of readability.

Jan Derk
Tuesday, March 16, 2004

loose lose

I am a bilingual illiterate. I can't read or write in two languages.

Jan Derk
Tuesday, March 16, 2004

and there's log4net for .NET, though I haven't used it or log4j. they apparently handle some formatting and output.

another reason XML may be useful is aggregation. an example is nant and cruisecontrol.net. nant can output either 'human friendly' or XML format (it uses log4net). nunit can output XML (same thing). cruisecontrol reads in the output of the various things & presents it all in one place.

and if you get your really important events in a format designed for aggregation (e.g. RSS) you can read them in your favorite 'news' reader. one inbox and all that. an existing XML format lets you write an XSL to do this. Other formats (including, say, SQL) can of course do the same thing, just slightly different code.

mb
Tuesday, March 16, 2004

A log file should be informationally dense, self-explanatory and readable with minimal, if any, special tools.

E.g.:

10 Jun 2003 11:58:36 [Module] [Error] message...

The goal being that after a fubar, one can open the log file in vi if it's Unix, any random text editor in Mac OS X or Windows or the Windows Event Viewer. Not with "MyApp Event Viewer". Why reinvent the wheel? Even more importantly, why go for a log format that cannot easily be scripted against?
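
To illustrate the scripting point, here is a small Python sketch that tallies entries per module and level. It assumes every line follows the example layout above (date, time, bracketed module, bracketed level, message); `DENSE_RE` and `count_by_module_and_level` are hypothetical names, and the sample lines are invented for the demonstration.

```python
import re
from collections import Counter

# Assumed layout: "10 Jun 2003 11:58:36 [Module] [Level] message..."
DENSE_RE = re.compile(
    r"^(?P<time>\d{1,2} \w{3} \d{4} \d{2}:\d{2}:\d{2}) "
    r"\[(?P<module>[^\]]*)\] \[(?P<level>[^\]]*)\] (?P<message>.*)$"
)


def count_by_module_and_level(lines):
    """Count matching lines per (module, level); skip anything else."""
    counts = Counter()
    for line in lines:
        m = DENSE_RE.match(line.rstrip("\n"))
        if m:
            counts[(m["module"], m["level"])] += 1
    return counts


sample = [
    "10 Jun 2003 11:58:36 [Net] [Error] connection reset",
    "10 Jun 2003 11:58:37 [Net] [Info] reconnecting",
    "10 Jun 2003 11:58:39 [Net] [Error] connection reset",
    "this line is not in the expected format",
]
counts = count_by_module_and_level(sample)
```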

Alex
Tuesday, March 16, 2004

In my experience, mission-critical applications rely utterly on their log file. You don't want to be in a situation where, with your job on the line, you don't have access to your normal tools when the application has crashed, assigned the money to the wrong person, and so on.

For instance, you may be remote, and have to telnet/ssh into your system, only to find your verbose log file is hard/impossible to analyze.

An informationally dense textual log file as recommended earlier is definitely preferable. Bear in mind that it is easy to script a process to convert the log file into a suitable xml format.
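
A conversion script of that kind might look roughly like the following Python sketch. It assumes the dense "date time [Module] [Level] message" layout suggested earlier in the thread. The element and attribute names (`Logs`, `Log`, `time`, `module`, `level`) are made up for the example, not a standard schema.

```python
import re
import xml.etree.ElementTree as ET

# Assumed input layout: "10 Jun 2003 11:58:36 [Module] [Level] message..."
TEXT_RE = re.compile(
    r"^(?P<time>\d{1,2} \w{3} \d{4} \d{2}:\d{2}:\d{2}) "
    r"\[(?P<module>[^\]]*)\] \[(?P<level>[^\]]*)\] (?P<message>.*)$"
)


def text_log_to_xml(lines):
    """Convert matching text lines to one XML document; skip the rest."""
    root = ET.Element("Logs")
    for line in lines:
        m = TEXT_RE.match(line.rstrip("\n"))
        if m is None:
            continue
        entry = ET.SubElement(
            root, "Log", time=m["time"], module=m["module"], level=m["level"]
        )
        entry.text = m["message"]  # ElementTree escapes <, > and & on output
    return ET.tostring(root, encoding="unicode")


xml_text = text_log_to_xml(["10 Jun 2003 11:58:36 [Net] [Error] a < b"])
```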

Chris
Wednesday, March 17, 2004

Ideally the logfile writer should warn the user when space runs low, but is not quite full.  For example, emailing/paging the admins in a server app.  Running out of disk space is not just bad for the log but will nuke other operations.
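
A minimal sketch of such an early warning, assuming Python and the standard `shutil.disk_usage` call. The 10% threshold, the function names and the `notify` callback are placeholders; a server app would plug its email or paging code in where `print` is used by default.

```python
import shutil


def free_fraction(path):
    """Fraction of the disk holding `path` that is still free (0.0 to 1.0)."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:       # defensive: avoid dividing by zero
        return 0.0
    return usage.free / usage.total


def check_space(path, warn_below=0.10, notify=print):
    """Call `notify` and return False when free space is below the threshold."""
    fraction = free_fraction(path)
    if fraction < warn_below:
        notify(f"warning: only {fraction:.1%} of disk free at {path}")
        return False
    return True
```

The logger could call `check_space` every so often (say, on each rotation) rather than on every write, since the check itself touches the file system.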

Or in an iTunes-like app, it could pop up an unobtrusive message.  Because almost the entire effect of such an app is to eat hard drive space.

This is why I would like to see the Conditions System migrate to other languages, like Python or C#.  It is quite possible for a library to signal warnings that the app handles (say, by contacting admins), or ignores, without obliterating the stack.  Or breaking modularization.

Tayssir John Gabbour
Wednesday, March 17, 2004

* multithreaded issues ... if one thread writes half a line to the log file and gets suspended, will the next thread, which needs to write a line, corrupt the file?

Just to be clear here - this isn't an issue unless you do something odd in building this kind of an application.

Always open the file in append mode (it's a log file, you don't ever need to seek in it), and always write your output in a single call of less than 4k. (Really, the C PIPE_BUF value, which is typically 4k. Strictly speaking, PIPE_BUF is the POSIX atomicity limit for pipes rather than regular files, so treat it as a conservative rule of thumb.)

As far as I can tell, that gets you effectively atomic writes with no interleaving on most operating systems and local file systems. Network file systems such as NFS are the usual exception, since they may not implement append mode atomically.
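
In Python, the append-mode, single-write approach might be sketched like this. The helper names `append_line` and `demo` are made up for the example, and the lack of interleaving is what local file systems do in practice rather than something this code enforces.

```python
import os
import tempfile
import threading


def append_line(path, line):
    """Append one line using a single write() call on an O_APPEND descriptor."""
    data = (line.rstrip("\n") + "\n").encode("utf-8")
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)
    try:
        written = os.write(fd, data)   # one call for the whole line
        if written != len(data):       # short writes are rare for small lines
            raise OSError("short write to log file")
    finally:
        os.close(fd)


def demo(n_threads=4, n_lines=50):
    """Have several threads append to one temp file; return the lines read back."""
    fd, path = tempfile.mkstemp(suffix=".log")
    os.close(fd)

    def worker(tid):
        for i in range(n_lines):
            append_line(path, f"thread {tid} line {i}")

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    with open(path, encoding="utf-8") as f:
        lines = f.read().splitlines()
    os.remove(path)
    return lines
```

Opening and closing the file on every call keeps the sketch short; a long-running logger would more likely keep one append-mode descriptor open and reuse it.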

Ryan Anderson
Wednesday, April 07, 2004
