Fog Creek Software
Discussion Board




Reading text files

I have an application which parses documents. The documents are received in XML format then processed into a database. The database has a web interface for the users.

For admin purposes, on occasion we have to review the contents of the received XML documents (usually tracking down parsing bugs). In the admin interface I currently have a file listing, but the file names don't say much. I'd like to display some content from each file.

Is there a better way than simply opening each file and reading the info I need?

Philo

Philo
Saturday, May 03, 2003

Do you need to look at the XML files because they're not being parsed into the database correctly? Otherwise couldn't you just write a query once it's in the database?

Most server-side scripting languages have file handling capabilities. You can probably write a script that opens the file and parses it, or find one that does and modify it for your needs.

www.marktaw.com
Saturday, May 03, 2003

Think Google.

Instead of showing a list of file names, show a Google-type listing with the filename, plus the first few lines of each XML file and any other info that might help identify what you're looking at.

So instead of:

FuzzyFile.xml
IHateClearNames.xml
Something.xml

You get:

FuzzyFile.xml - Created 5 January 2003 - 23.4KB
<?xml version="1.0" encoding="ISO-8859-1" ?>
<atag>
  <contents>I like reading XML.</contents>
</atag>

IHateClearNames.xml - Created 6 February 2003 - 12.6KB
<?xml version="1.0" encoding="ISO-8859-1" ?>
<atag>
  <contents>Whatever.</contents>
</atag>

Something.xml - Created 7 February 2003 - 16.0KB
<?xml version="1.0" encoding="ISO-8859-1" ?>
<anothertag>
  <contents>Bla.</contents>
</anothertag>
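
A rough sketch of how such a listing could be generated, in Python purely for illustration (the thread doesn't say what the admin interface is written in). The function names preview and listing are made up, the ISO-8859-1 encoding is assumed from the sample declarations above, and the date shown is the last-modified time because a true creation time isn't available portably.

```python
import os
import time
from itertools import islice


def preview(path, max_lines=4):
    """Return a header line plus the first few lines of one file."""
    info = os.stat(path)
    # st_mtime is the last-modified time, used here as a stand-in for "created".
    stamp = time.strftime("%d %B %Y", time.localtime(info.st_mtime))
    header = "%s - Modified %s - %.1fKB" % (
        os.path.basename(path), stamp, info.st_size / 1024)
    # Assumption: files are ISO-8859-1, as in the sample declarations.
    with open(path, encoding="ISO-8859-1") as f:
        # islice stops after max_lines, so large files are not read in full.
        lines = [line.rstrip("\r\n") for line in islice(f, max_lines)]
    return "\n".join([header] + lines)


def listing(directory, max_lines=4):
    """Return previews for every .xml file in a directory, sorted by name."""
    names = sorted(n for n in os.listdir(directory)
                   if n.lower().endswith(".xml"))
    return "\n\n".join(preview(os.path.join(directory, n), max_lines)
                       for n in names)
```

Only the first max_lines lines of each file are read, so even a long file listing should stay cheap enough for an admin-only page.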

Jan Derk
Saturday, May 03, 2003

Mark - exactly. When in doubt, look at the untouched source. :-)

I'm just making sure I'm not missing something regarding flipping through each and every file to get the header data. I don't think so, but I wanted to double-check.

Luckily it's an admin-only function.

Philo

Philo
Saturday, May 03, 2003

Does the OS have anything?
Windows Explorer has a preview for files. I think it does some caching (certainly for image thumbnails in preview mode). However, I don't know the details or whether there is an API to get at this cached info. (It would probably be an Explorer (aka shell) method.)

If you want to be clever, write your own pseudo-XML parser to try to extract some interesting nodes; otherwise they'll all say <?xml version="1.0"?>. And why pseudo-XML? Because you're dealing with broken files, right?
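
One way a pseudo-XML extractor like that might look, as a minimal sketch in Python. It uses a regular expression instead of a real parser so that an unclosed or malformed file still yields something; the name interesting_nodes and the "leaf elements with non-blank text" rule are just assumptions about what counts as interesting.

```python
import re

# Matches <name ...>text</name> where the text contains no nested markup.
LEAF = re.compile(r"<([A-Za-z_][\w:.-]*)(?:\s[^<>]*)?>([^<>]+)</\1\s*>")


def interesting_nodes(text, limit=3):
    """Return up to `limit` (tag, value) pairs for leaf elements with text."""
    found = []
    for match in LEAF.finditer(text):
        value = match.group(2).strip()
        if value:  # skip elements that hold only whitespace
            found.append((match.group(1), value))
            if len(found) == limit:
                break
    return found


broken = '<?xml version="1.0"?>\n<atag>\n  <contents>Bla.</contents>\n<oops>'
print(interesting_nodes(broken))  # [('contents', 'Bla.')]
```

This is not a real XML parser: it ignores CDATA, comments and entities, so it's only suitable for a quick preview, not for deciding whether the data is good.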

mb
Sunday, May 04, 2003

Just a crazy suggestion, but I used XMLspy at my old job when working with XML files... It may help you out here... Also, are there no XML validators?

www.marktaw.com
Sunday, May 04, 2003

I'm validating the XML against the schema - that would be trapped as an error. The problem is when there's bad data.

It becomes an issue whenever we integrate a new company. :-)

Philo

Philo
Sunday, May 04, 2003

Gee, no wonder you were sacked.

Clutch Cargo
Sunday, May 04, 2003

Clutch, care to elaborate?

Philo

Philo
Sunday, May 04, 2003

Hey Clutch, you wouldn't happen to be a band from Brooklyn, would you?

www.marktaw.com
Sunday, May 04, 2003
