Fog Creek Software
Discussion Board





Problems with UTF-8 (English)

The CityDesk editor seems to be messing up symbols like ® and ©. When I go from HTML mode to Normal view, the entities seem to be actually converted to literal characters, rather than just rendered. Exact steps:

Edit a page in HTML mode.
Enter the entity <&reg;> to represent the R in a circle.
Switch to Normal mode.
Observe the beautifully rendered R in a circle.
Switch back to HTML mode.
Be surprised to see that the &reg; has now been replaced by an R in a circle.
Publish the site.
Open the generated HTML files with a text editor.
Be dismayed to see that there is now a bunch of crud (random high-ASCII chars) in the text file where the circle-R used to be. :-(
View the site in a web browser (IE or Mozilla Firebird).
Be dejected as you observe the crud in the browser too.

Note, this happens whether I use the &reg; or the &#174; form of the entity.
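
For what it's worth, the "crud" in the steps above is consistent with UTF-8 bytes being read as a single-byte encoding. Here is a minimal Python sketch, assuming the published file contains the literal ® character encoded as UTF-8 and that whatever reads it falls back to ISO-8859-1 (both are assumptions, not something confirmed in this thread):

```python
# U+00AE REGISTERED SIGN, assumed to be written literally into the published file.
text = "\u00ae"

# In UTF-8 this single character becomes two bytes: 0xC2 0xAE.
utf8_bytes = text.encode("utf-8")

# A reader that assumes ISO-8859-1 (Latin-1) maps each byte to its own character,
# so the one symbol turns into two: U+00C2 followed by U+00AE.
misread = utf8_bytes.decode("latin-1")

# Decoding with the correct encoding gives the original symbol back.
correct = utf8_bytes.decode("utf-8")

print(utf8_bytes, repr(misread), repr(correct))
```

If the extra character in front of the symbol matches what shows up in the text editor or browser, the file itself is probably fine and the reader is simply using the wrong encoding.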

Any thoughts? 

(And, yes, I'm using <meta http-equiv="Content-Type" content="text/html;charset=utf-8" /> as the first thing in the head section, although I don't know if that's relevant here.)

Environment: Win2K Pro, IE 6

Scott B lank steen
Friday, November 21, 2003

I see that this behavior seems to be intentional.  If it actually worked, I guess I could accept the explanation given at http://www.fogcreek.com/CityDesk/kb/troubleshooting/NamedEntitiesareReplaced.html.  However, given that it is not working in two modern browsers, I question why it behaves this way.  I'd consider it a bug, actually.  I have had to resort to the workaround of defining variables for the entities that I want to use.

Pretty annoying. 

Scott B lank steen
Friday, November 21, 2003

When you view the page in, e.g., Internet Explorer, what encoding does IE say it is using? (Look at View > Encoding to find out.) If IE says that it is using anything but Unicode (UTF-8), there is probably an error either in your HTML or in the web server configuration.

Since you say that you have specified UTF-8 as the encoding in the HTML (using that meta tag), I don't think the HTML is the problem. Perhaps the server tells the browser that the site uses a completely different encoding? If the server sends a charset in its Content-Type header, the browser uses that in preference to the meta tag, so a mismatch there would produce exactly this kind of garbage. I've experienced similar problems due to this.

Henrik Jernevad
Friday, November 21, 2003

I had this problem.  I was telling the browser UTF-8 in my HTML code in CityDesk, but the web server header was telling the browser IsoLatin or some such...
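
A rough way to check for this kind of mismatch is to compare the charset in the server's Content-Type header with the one in the page's meta tag. Under HTML 4.01 the HTTP header's charset takes priority over the meta declaration. The Python sketch below works on plain strings; the function names and the sample header value are made up for illustration, and the regex is a simplification rather than a full HTML parser:

```python
import re
from typing import Optional

# Matches charset=... in either an HTTP header value or a meta tag.
_CHARSET_RE = re.compile(r"charset\s*=\s*[\"']?([\w.:-]+)", re.IGNORECASE)


def charset_from_header(content_type: Optional[str]) -> Optional[str]:
    """Return the charset named in a Content-Type header value, if any."""
    if not content_type:
        return None
    match = _CHARSET_RE.search(content_type)
    return match.group(1).lower() if match else None


def charset_from_meta(html: str) -> Optional[str]:
    """Return the charset named in the first meta tag that declares one."""
    for tag in re.findall(r"<meta\b[^>]*>", html, re.IGNORECASE):
        match = _CHARSET_RE.search(tag)
        if match:
            return match.group(1).lower()
    return None


def effective_charset(content_type: Optional[str], html: str) -> Optional[str]:
    """The header's charset wins when present; otherwise fall back to the meta tag."""
    return charset_from_header(content_type) or charset_from_meta(html)


# The meta tag quoted earlier in this thread, plus a hypothetical server header.
page = '<head><meta http-equiv="Content-Type" content="text/html;charset=utf-8" /></head>'
header = "text/html; charset=ISO-8859-1"

if charset_from_header(header) != charset_from_meta(page):
    print("Mismatch: server says", charset_from_header(header),
          "but the page says", charset_from_meta(page))
```

If the two disagree, either fix the server configuration or make the page match what the server announces.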

David Burch
Tuesday, November 25, 2003
