Fog Creek Software
Discussion Board




Database Written-Content Storage

When you're building an application that stores press releases submitted via an HTML form, and the target distribution for the release is the web, at what point do paragraph tags and other formatting get added?

This is probably common knowledge, but I'd never thought about it until now, and now I need to know. What's the easiest and most efficient plan: ask the users to add paragraph tags to the content, or parse the release looking for excess whitespace to replace with paragraph tags?

IS THERE an easier way, or a more standardized way of accomplishing this? I'm using J2EE if it matters.

Mark
Friday, April 25, 2003

Too bad on the J2EE...
There's an ASP.Net control called RichTextBox www.richtextbox.com that allows users to WYSIWYG the formatting, then presents HTML encoded text for storage. It's a win/win - the users love the interface, and it's zero work for me to simply stuff the string in the database.

So my recommendation would be to try to duplicate this functionality - hopefully there's a J2EE component you can use?

Philo

Philo
Friday, April 25, 2003

You can achieve the same functionality via Internet Explorer using DHTML / Javascript.  Do a search for "execCommand" in the MS Knowledgebase, the links there will explain how to turn a DIV or SPAN into an editable area... fairly complex but cool when you get it running...

Actually, here's a link to some examples: http://www.devarticles.com/art/1/90/1

surreal
Friday, April 25, 2003

The problem with the RichTextBox control (we use it) is that it only works on IE (and I think only IE/Win32, but I don't have a Mac handy to verify that).

Brad Wilson (dotnetguy.techieswithcats.com)
Friday, April 25, 2003

You know, I forget now who has it, but I've actually tried and successfully used a Flash component that is a WYSIWYG editor. Given that Flash penetration is even better than IE penetration, and it works on any platform (not just ASP.NET), that might be worth investigating.

Brad Wilson (dotnetguy.techieswithcats.com)
Friday, April 25, 2003

To answer the poster's question:

In my company's content management software, all the formatting gets added just before the content gets displayed.  In the database, there is no HTML.

There are a couple of reasons for doing it that way:

1) Separates the content from the presentation; allows you to change the presentation without changing the content.

2) Editing; unless your users edit in HTML, storing it as HTML is not a good idea.
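
A minimal Java sketch of the "format just before display" approach described above. The class and method names are hypothetical, and it assumes that a blank line in the stored text marks a paragraph break; a site that treats every line break as a paragraph would split differently. Because the database holds plain text, the sketch also escapes HTML-significant characters before wrapping each paragraph.

```java
import java.util.regex.Pattern;

/** Hypothetical helper: turns stored plain text into HTML paragraphs at display time. */
final class ParagraphFormatter {

    // After line endings are normalized to \n, a paragraph break is a newline,
    // optional spaces or tabs, another newline, and any further whitespace.
    private static final Pattern PARAGRAPH_BREAK = Pattern.compile("\\n[ \\t]*\\n\\s*");

    private ParagraphFormatter() {}

    /** Escapes the characters that are significant in HTML text. */
    static String escapeHtml(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    /** Wraps each blank-line-separated paragraph of the plain text in p tags. */
    static String toHtmlParagraphs(String plainText) {
        if (plainText == null) {
            return "";
        }
        // Normalize CRLF and lone CR so the pattern only has to deal with \n.
        String normalized = plainText.replace("\r\n", "\n").replace('\r', '\n').trim();
        if (normalized.isEmpty()) {
            return "";
        }
        StringBuilder html = new StringBuilder();
        for (String paragraph : PARAGRAPH_BREAK.split(normalized)) {
            html.append("<p>").append(escapeHtml(paragraph)).append("</p>\n");
        }
        return html.toString();
    }
}
```

A JSP or servlet would call something like ParagraphFormatter.toHtmlParagraphs(storedText) when writing the page, so changing the presentation later only means changing this one method.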

Wayne Venables
Friday, April 25, 2003

There's a free simple cross-browser (IE5+, Mozilla 1.3+) rich text editor at http://www.kevinroth.com/rte/demo.htm which should work if you want your users to be able to edit in HTML.  Then you don't have to perform any post-processing.  The downside (if there is one) is that the release is stored in HTML form.  Not a biggie though since you can run it through an XSLT transform and do whatever you want with it.  (For example change "P" tags to "DIV class=" tags to format according to your website's stylesheet.)

-Thomas

Thomas
Friday, April 25, 2003

If you're going to store text from a simple HTML textbox in the database, all you have to do is replace each CrLf with a <br /> tag and you'll get the paragraph breaks in the appropriate places.

Now if you're using some kind of WYSIWYG control to make things bold, italic, different colors, etc, then it would seem that you would have to store the actual HTML in the database, or some kind of metadata to show where formatting tags should be placed.
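
Since the original poster is on J2EE, the CrLf replacement for the plain-textbox case could look roughly like this in Java. The class name is made up for illustration, and the sketch assumes the text has already been HTML-escaped, since it does no escaping of its own.

```java
/** Hypothetical helper for the plain-textbox case: line breaks become br tags. */
final class LineBreaks {

    private LineBreaks() {}

    /**
     * Replaces each CRLF, lone CR, or lone LF with a single XHTML-style break tag.
     * The CRLF alternative is listed first so a Windows line ending yields one tag, not two.
     */
    static String toBrTags(String plainText) {
        if (plainText == null) {
            return "";
        }
        return plainText.replaceAll("\\r\\n|\\r|\\n", "<br />");
    }
}
```

Browsers differ in which line ending a textarea submits, which is why the sketch handles all three rather than CrLf alone.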

Chris
Friday, April 25, 2003

Mozile produces clean, valid XHTML. Gecko-based browsers only!

http://mozile.mozdev.org/index.html

fool for python
Friday, April 25, 2003

Why not split up the idea of a press release into a few different components?

1) Title
2) Header
3) Release Date
4) Company Name
5) Paragraph 1
...
N) Paragraph N

You can have a button to add paragraphs, etc.  This way, you can create some different styles based on section + some sort of user input.  If you really want to get fancy, you can show the users a preview of the styles they have applied so far.


You should not mix your Model (the text + sections) with your View (the markup)!  This way, you can easily model the document in the database, and include markup attributes associated with the text.  Splitting up the Model logic from View logic is just a good idea no matter what type of environment you are using.
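
One way to picture the Model/View split above, as a rough Java sketch. All class, field, and CSS class names here are invented for illustration; the "Header" component from the list is left out to keep it short, and the HTML produced is just one possible view.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Model: the press release content only, with no markup. Names are illustrative. */
final class PressRelease {
    private final String title;
    private final LocalDate releaseDate;
    private final String companyName;
    private final List<String> paragraphs = new ArrayList<>();

    PressRelease(String title, LocalDate releaseDate, String companyName) {
        this.title = title;
        this.releaseDate = releaseDate;
        this.companyName = companyName;
    }

    /** Appends one paragraph of plain text and returns this object for chaining. */
    PressRelease addParagraph(String text) {
        paragraphs.add(text);
        return this;
    }

    String getTitle() { return title; }
    LocalDate getReleaseDate() { return releaseDate; }
    String getCompanyName() { return companyName; }
    List<String> getParagraphs() { return Collections.unmodifiableList(paragraphs); }
}

/** View: one possible HTML rendering; a different view could emit plain text or XML. */
final class PressReleaseHtmlView {

    private PressReleaseHtmlView() {}

    static String render(PressRelease release) {
        StringBuilder html = new StringBuilder();
        html.append("<h1>").append(escape(release.getTitle())).append("</h1>\n");
        html.append("<p class=\"dateline\">")
            .append(escape(release.getCompanyName()))
            .append(", ")
            .append(release.getReleaseDate()) // LocalDate prints in ISO form, e.g. 2003-04-25
            .append("</p>\n");
        for (String paragraph : release.getParagraphs()) {
            html.append("<p>").append(escape(paragraph)).append("</p>\n");
        }
        return html.toString();
    }

    // Minimal escaping for element content; the ampersand must be replaced first.
    private static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

With this shape, each paragraph can map to its own database row or column, and restyling the release touches only the view class.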

Joe
Friday, April 25, 2003

Thanks to everyone so far, you guys have helped me narrow my options down considerably. I already knew I shouldn't mix content with markup, but I wasn't sure how the rest of the world deals with it, and if there was a standard practice, I would have gone the road most travelled.

I already have separate fields for all of the components of the release; they're stored in separate columns in the db. It's just the paragraph thing that was/is messing me up. I wasn't sure if it would be wasteful to replace all of the crlfs with open and close p tags on every page request. Actually, I'm still not sure, but that's the way I'm headed.
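
If the per-request replacement ever did turn out to be too costly, one option (not something suggested in this thread, just a sketch under that assumption) would be to keep the formatted HTML in memory and rebuild it only when a release is edited. The class name, the numeric release id, and the formatter function are all hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical in-memory cache of formatted releases, keyed by a numeric release id. */
final class RenderedReleaseCache {
    private final Map<Long, String> htmlById = new ConcurrentHashMap<>();
    private final Function<String, String> formatter;

    /** The formatter is whatever turns stored plain text into HTML, e.g. a crlf-to-p-tag step. */
    RenderedReleaseCache(Function<String, String> formatter) {
        this.formatter = formatter;
    }

    /** Formats the stored text on the first request for an id and reuses the result afterwards. */
    String htmlFor(long releaseId, String storedText) {
        return htmlById.computeIfAbsent(releaseId, id -> formatter.apply(storedText));
    }

    /** Call after a release is edited so the next request re-formats it. */
    void invalidate(long releaseId) {
        htmlById.remove(releaseId);
    }
}
```

For press-release-sized text, the string replacement itself is usually cheap compared with the database read, so measuring first is probably worthwhile before adding a cache like this.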

Thanks Again.

Mark
Friday, April 25, 2003

You shouldn't mix content with markup, but in some situations the markup is as much content as the content. And if the users want to define markup on the fly, then you have to split the markup and text to store it, then recombine them to display it, and sometimes you have to ask "why?"  [grin]

Philo

Philo
Friday, April 25, 2003

What's wrong with just using the HTML PRE tag?

Somebody
Friday, April 25, 2003
