Bug when importing from Word

Cut and paste a Word 2000 document (in French, so the ' character appears very often) into the article view. Everything is OK; the pasted doc looks fine.

Switch to HTML view, then come back to normal view.

the ' char is replaced by a   char in the whole text. If you go to the html view, the ' char has been replaced too.

The bug doesn't appear if you "paste without formatting", but in that case you lose all the useful formatting of your document (bulleted lists, indented lists, and so on), and putting it back is painful.

Config info: Office 2000, Windows 95, French version, IE 5.00.2314 on the machine

Vincent Bénard
Thursday, April 11, 2002

... and if you cut and paste from WordPad, then switch to HTML view and come back to normal view, this bug doesn't appear, but:
"..."  (three dots attached) is replaced by "&"
the œ char is replaced by a square.

Vincent Bénard
Thursday, April 11, 2002

Does this help?

Thursday, April 11, 2002

Thanks for the suggestion.

Alas, no, it doesn't solve the problem.

The charset is specified at the template level, and the template I used for my test was the standard one, which has no "charset" defined in the <head> section.

I've added a reference to the ISO 8859-1 charset in the simple template, but it didn't change anything. The problem appears before previewing or publishing, so the "body" has not yet been merged with the template when this phenomenon can be observed. (Of course, after publishing, the bad characters remain.)

Vincent Bénard
Thursday, April 11, 2002

This is similar to a problem I'm having at work. I have a very kludgy workaround that may or may not work for you.

Export the Word doc to HTML. Do a mass search & replace on the HTML to replace each problem character with the proper ISO-compliant code (e.g. &#345;).

Then bring it into CityDesk. It should work, but I haven't tried that last step.
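
For what it's worth, the search & replace step could be scripted instead of done by hand. The following is only a rough sketch, not something tested against CityDesk: it assumes the exported HTML has already been read into a Python string, and the function name to_numeric_entities is made up for illustration. It turns every non-ASCII character into a numeric character reference.

```python
def to_numeric_entities(html: str) -> str:
    """Replace every non-ASCII character with a numeric character reference.

    ASCII characters (including the plain ' apostrophe) are left untouched.
    """
    # "xmlcharrefreplace" is a standard Python codec error handler that
    # emits &#NNNN; for each character the target encoding cannot hold.
    return html.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


if __name__ == "__main__":
    # Hypothetical sample: curly apostrophe, oe ligature, ellipsis.
    sample = "l\u2019\u0153uvre\u2026"
    print(to_numeric_entities(sample))
```

Note that accented letters such as é are converted as well, which is harmless in HTML but makes the source harder to read.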

Mark W
Thursday, April 11, 2002

Seems to be a config problem.

Under Windows 98 (running on a Virtual PC layer), with IE 5.5 and Word 97, the problem doesn't appear. The Word import works perfectly in that environment.

So does the issue come from Office 2000 or from Windows 95? (Or IE 5.00?)

(NB: the CityDesk version is 1.0.27 on the working configuration and 1.0.29 on the non-working one. I don't think it's important?)

Vincent Bénard
Thursday, April 11, 2002

œ … ‘ ’ '

Are you sure that the problem is ' and not ‘ or ’ which Word often uses instead of ' as in ‘quotes’ or even characters with ' built into them like à or á?
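
One way to check which character is actually in the text is to list the ones that do not exist in ISO 8859-1. This is just a sketch under the assumption that the trouble is encoding-related (nothing in this thread confirms that about CityDesk), and the function name non_latin1_chars is invented for the example. The curly quotes, the ellipsis and œ all live in Windows-1252 but not in ISO 8859-1, while ', à and á are in both.

```python
import unicodedata


def non_latin1_chars(text: str) -> dict:
    """Map each character that ISO 8859-1 cannot encode to a description."""
    found = {}
    for ch in text:
        try:
            ch.encode("iso-8859-1")
        except UnicodeEncodeError:
            # Record the code point and its official Unicode name.
            found[ch] = f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNKNOWN')}"
    return found


if __name__ == "__main__":
    # The test characters from this post, plus two accented letters.
    sample = "\u0153 \u2026 \u2018 \u2019 ' \u00e0 \u00e1"
    for ch, description in non_latin1_chars(sample).items():
        print(ch, description)
```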

I'm not able to replicate this problem. What is your default windows font? Mine is Times New Roman. Also, what font(s) are used in your document?

Mark W
Thursday, April 11, 2002

Mark, you're right.

The problem comes from the ’ and not from the '.

But I don't know how to prevent Office from putting this garbage into my texts.

My system fonts are the standard default fonts (Times New Roman, enabled by default in Office and most MS programs). The phenomenon appears with docs in Times New Roman or Arial; I didn't try others.

Vincent Bénard
Friday, April 12, 2002

In Word, go into Tools, AutoCorrect, and on the AutoFormat tab uncheck the option to turn straight quotes into "smart quotes".

Mike Gunderloy
Friday, April 12, 2002

Hypothetically and not having tested this, is it possible to do a search & replace for variations on ' that might be hanging things up in CityDesk?
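
Equally untested against CityDesk, here is a minimal sketch of what such a search & replace might look like if the text were run through a small script before pasting. The mapping and the name plain_punctuation are assumptions for illustration; it only covers the curly quotes and the ellipsis mentioned in this thread.

```python
# Hypothetical mapping from Word's "smart" punctuation to plain ASCII.
REPLACEMENTS = {
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark (the French apostrophe case)
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2026": "...",  # horizontal ellipsis
}

# str.maketrans accepts a dict whose keys are single characters.
_TABLE = str.maketrans(REPLACEMENTS)


def plain_punctuation(text: str) -> str:
    """Return text with smart quotes and ellipses replaced by ASCII."""
    return text.translate(_TABLE)


if __name__ == "__main__":
    print(plain_punctuation("l\u2019article \u201ctest\u201d\u2026"))
```

The œ ligature is deliberately left alone here, since replacing it with "oe" changes the French spelling; it could be added to the mapping if that trade-off is acceptable.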

Mark W
Friday, April 12, 2002

