Fog Creek Software
Discussion Board




Bzzzt!

Imagine this: It is 1992-ish and you are writing an HTML page.  The browser keeps rejecting your HTML without rendering it.  You get a hot head, let out some steam and quit trying.

Now imagine this: It is 1992-ish and you are writing an HTML page.  The browser accepts all HTML you throw at it and you simply observe how it rendered the HTML until it is rendered to your liking.

This is not the same as compiling a C program and having the compiler catch the errors until it compiles correctly.  C is a strict programming language.  HTML is a simple markup language to be used by the common person.

Bzzzt!
Monday, July 26, 2004

if a common person can't figure out how to type tags in pairs, and nest them properly, then they probably also have trouble wiping themselves after they shit.

muppet
Monday, July 26, 2004

The problem being if one browser accepts your dodgy HTML but others don't, you're in trouble.

Matt
Monday, July 26, 2004

Nope.  That's not a problem either.

When you make a parser for HTML, how would you account for missing tags?  Come on people.  This is trivial.  You accept the fact that the tags are missing, you accept the content of those tags and you move on.
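As a rough illustration of that "accept it and move on" approach, here is a minimal sketch using Python's standard html.parser module, which does not complain about missing end tags.  The names TextCollector and visible_text are made up for this example, and this is nowhere near what a real browser does.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects text content and ignores whether tags are balanced."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Called for every run of text, regardless of unclosed tags around it.
        self.chunks.append(data)


def visible_text(markup):
    """Return the text of the markup, tolerating missing end tags."""
    collector = TextCollector()
    collector.feed(markup)
    collector.close()  # flush any text still buffered by the parser
    return "".join(collector.chunks)


# Neither <p> nor <b> is ever closed, yet the content still comes through.
print(visible_text("<p>Hello <b>world<p>again"))
```

The parser never raises on the unbalanced input; it just hands over whatever content it finds.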

You also test your HTML in the browsers it will be used in most.

I would like to see a sample of some HTML with missing tags that different browsers render differently.  I'd bet it can't be done.

Bzzzt!
Monday, July 26, 2004

Average users will select browsers they perceive to "work better". That will be the most "accepting" one. Hence there will always be a bias towards acceptance rather than strictness.

Strictly conforming browsers will be perceived as "buggy". End users make the choice, whether developers like it or not.

sgf
Monday, July 26, 2004

The original poster's point is not mutually exclusive with the thing Joel quoted. I can conceive of both being true statements.

This is an engineering issue, marked by tradeoffs and shifting politics. As the number of endusers creating html on notepad decreases (as a %) and error-intolerant apps increase, things will shift.

Tayssir John Gabbour
Monday, July 26, 2004

>> "Average users will select browsers they perceive to "work better". That will be the most "accepting" one. Hence there will always be a bias towards acceptance rather than strictness."

Nope.  Average users use the browser they are given.  It is the browser's responsibility to be flexible enough to account for errors in the HTML.

I'm not old enough to remember but in 1992 were there HTML validators?  Were there tools to generate HTML?  I'd bet the prevalent HTML editing tools were text editors.  Now you try to write an HTML page in a text editor without errors?

Bzzzt!
Monday, July 26, 2004

I write my HTML in a text editor CURRENTLY without errors, what was the problem in 1992?

In 1992 one did most of one's web browsing with a terminal window.  It wasn't terribly difficult to parse HTML then, I'd imagine.

muppet
Monday, July 26, 2004

I don't have a problem with the browser rejecting my HTML; I just want it to tell me why.  Something like "Missing a </b> tag" rather than "Unspecified HTML syntax error."
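A minimal sketch of what such a message could look like, again with Python's html.parser.  TagChecker, check and the VOID set are hypothetical names for this example, and the checker is deliberately naive: it treats every non-void element as needing an end tag, so it will also flag end tags that HTML allows you to omit (such as </p> or </li>).

```python
from html.parser import HTMLParser

# Elements that never take an end tag in HTML.
VOID = {"area", "base", "br", "col", "embed", "hr", "img",
        "input", "link", "meta", "source", "track", "wbr"}


class TagChecker(HTMLParser):
    """Keeps parsing no matter what, but records unbalanced tags."""

    def __init__(self):
        super().__init__()
        self.stack = []     # (tag, line) for each element still open
        self.problems = []  # human-readable messages

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.stack.append((tag, self.getpos()[0]))

    def handle_endtag(self, tag):
        if tag in VOID:
            return  # covers <br/> style tags, which need no bookkeeping
        if tag not in [open_tag for open_tag, _ in self.stack]:
            self.problems.append(f"Unexpected </{tag}> tag on line {self.getpos()[0]}")
            return
        # Pop back to the matching start tag; anything popped on the way
        # was opened inside it and never closed.
        while self.stack:
            open_tag, line = self.stack.pop()
            if open_tag == tag:
                break
            self.problems.append(f"Missing a </{open_tag}> tag (opened on line {line})")


def check(markup):
    """Return a list of messages about unbalanced tags (empty if none)."""
    checker = TagChecker()
    checker.feed(markup)
    checker.close()
    # Whatever is still open at the end of the input was never closed.
    for open_tag, line in reversed(checker.stack):
        checker.problems.append(f"Missing a </{open_tag}> tag (opened on line {line})")
    return checker.problems


print(check("<p>Some <b>bold text</p>"))
```

The page could still be rendered as usual; the messages are just extra information for whoever wrote it.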

Kyralessa
Monday, July 26, 2004

Well that's a brave statement muppet.  How often do you run your perfect HTML through a validator and how often does it validate perfectly?  Furthermore, how complex is it?  As complexity increases so do errors.

Bzzzt!
Monday, July 26, 2004

+++ I don't have a problem with the browser rejecting my HTML; I just want it to tell me why.  Something like "Missing a </b> tag" rather than "Unspecified HTML syntax error." +++

I don't see why this couldn't be done.  The PHP interpreter gives pretty specific errors like this.

muppet
Monday, July 26, 2004

http://validator.w3.org/check?uri=http%3A%2F%2Ftest.madebymonkeys.net

well apparently the w3c validator throws up on urls containing query strings, thinking that they're & codes for various characters.  I find this hilarious.

it also spits at alt='' tags in images.  Ah well.

The XHTML is still valid and still very renderable.  I'm not overly concerned with the uber-nitpickiness of the w3c.  I still don't think browsers ought to render utterly foobared html, with missing end tags and the like.

muppet
Monday, July 26, 2004

The browser should have a special debugging mode where it would report errors like that.  Obviously that type of error wouldn't be user friendly.

Bzzzt!
Monday, July 26, 2004

Browsers have to render HTML in any way they can simply because a browser is an end user tool.  If a browser were an IDE then it would be different.

Bzzzt!
Monday, July 26, 2004

Muppet, read those messages.  Things like "If you want to use a literal ampersand in your document you must encode it as "&amp;" (even inside URLs!)." ought to give you the hint that it is your page that is wrong, not the W3C validator!

<a href="hmmm" alt="wtf? alt in an &lt;a&gt;??">since when did this need an alt?</a>...

muppet by name, muppet by nature?

i like i
Monday, July 26, 2004

oh I read it, I just think it's freaking retarded to encode ampersands in URLs as &amp;.  What's the point?  They're not intended to be viewed, they're obviously inside a tag attribute.  Why is it 'standard' to type four unnecessary characters in every URL with a query string?

It's nonsense.

muppet
Monday, July 26, 2004

I have an alt in an <a> ?  really?

oops.

muppet
Monday, July 26, 2004

s/b title=""

I'm pretty sure I have that in the current iteration of the (not yet uploaded) code.

anyway back OT, if browsers enforced strict HTML then there wouldn't be such a glut of unqualified "web developers" around.

muppet
Monday, July 26, 2004

> anyway back OT, if browsers enforced strict HTML then there wouldn't be such a glut of unqualified "web developers" around.

where would that leave you, muppet?

muppet by another name, muppet all the same
Monday, July 26, 2004

ha

I didn't say 'w3c' html.  :)  Who put them in charge, anyway?  Oh right, no one did!

muppet
Monday, July 26, 2004

"anyway back OT, if browsers enforced strict HTML then there wouldn't be such a glut of unqualified "web developers" around."

And the web (and LaTeX) would be all the rage among physicists sharing their research with one another.


Monday, July 26, 2004

you honestly think that it takes a university educated scientist to properly mark up HTML?  Give me a break.  You must congratulate yourself daily on your genius.

muppet
Monday, July 26, 2004

I disagree with the original poster.

Writing HTML that is sloppy and invalid is like writing C without using Lint, and turning off all warnings. Yes, the code works some of the time, but you are betting your page/program on the fact that one sloppy browser will never stop being sloppy. Or that your code will do what you want, instead of what you say. The DWIM (do what I mean) programming language has not yet been invented, nor will it ever be.

Peter
Monday, July 26, 2004

"you honestly think that it takes a university educated scientist to properly mark up HTML?  Give me a break.  You must congratulate yourself daily on your genius."

You missed the point.  I'm not saying it would take a scientist to properly mark up HTML.  I'm saying that the fact that you could mess it up and still have a page that looked decent to show your friends is why it took off the way it did.


Monday, July 26, 2004

Browsers have to be purposely "sloppy" (they really aren't sloppy) in their parsing of the HTML in order to accommodate the end user.  HTML is purposely not a compiled, strictly typed language like C.  It is simply a mark up language.

C and HTML are apples and oranges.

Bzzzt!
Monday, July 26, 2004

Remember that all HTML markup is just suggestions to the rendering engine; it can ignore all HTML and display plain text if it wants.  From that perspective, Bzzzt has a good point; render what can be rendered, ignore the rest.

Tom H
Monday, July 26, 2004

"I don't have a problem with the browser rejecting my HTML; I just want it to tell me why.  Something like "Missing a </b> tag" rather than "Unspecified HTML syntax error.""

Internet Explorer does this right now for XML.  Gives you the line number the error occurred on, type of error, and even a bit of the text -- not too different from an error from a C compiler.
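To give a feel for that kind of strict reporting (this is not IE's parser, just Python's standard xml.etree.ElementTree as a stand-in), here is a small sketch.  first_xml_error is a made-up helper name.

```python
import xml.etree.ElementTree as ET


def first_xml_error(text):
    """Return (line, column, message) for the first well-formedness
    error in text, or None if the text parses cleanly."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        line, column = err.position  # where the parser gave up
        return line, column, str(err)
    return None


# The <b> element is never closed, so the </p> on line 3 does not match.
bad = "<p>\n  <b>bold text\n</p>"
print(first_xml_error(bad))

# Properly nested markup produces no error at all.
print(first_xml_error("<p><b>bold text</b></p>"))
```

Unlike a lenient HTML parser, the XML parser stops at the first problem and refuses to hand back a document.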

And yes, this is a good thing.  Strict = Better in most cases.

Almost Anonymous
Monday, July 26, 2004

> what was the problem in 1992?

Here's a guess.  Was the problem that maybe 8 people had ever heard of HTML in 1992?

Alleck
Monday, July 26, 2004

Once again I would argue that comparing XML to HTML is like comparing apples to oranges.  HTML is a simple text markup language.  XML is also a markup language but its use is not the same as HTML.

XML deals with data.  Anytime you have something dealing with data it is better to be strict.  HTML is not this way.

Bzzzt!
Monday, July 26, 2004

Bzzzt - you misunderstand. See my post just next to this on the list for an example of where one browser accepts bad HTML that other browsers don't, causing problems for the careless.

Matt
Monday, July 26, 2004

(I wasn't talking about missing closing tags)

Matt
Monday, July 26, 2004

"The XHTML is still valid and still very renderable.  I'm not overly concerned with the uber-nitpickiness of the w3c.  I still don't think browsers ought to render utterly foobared html, with missing end tags and the like."

Erm, no. Not only is it not valid, it's not well-formed (due to the unescaped ampersand).

E. Naeher
Monday, July 26, 2004

It's beyond me why browsers for the most part don't display an icon that indicates the correctness of the page. If a valid page was met with a smiley face and an invalid page with a frowny face, you would still get all the benefits of "loose" parsing but people would have a visible incentive to fix problems. Of course, clicking on the frowny face would reveal the exact error messages. I submit that if this had been standard practice in browsers since Day 1, we wouldn't have all these problems with invalid HTML.

With respect to the ampersand in a URL issue, the ampersand needs to be escaped because the URL could contain characters not in the document character set. These characters would have to be represented by an entity, indicated by an ampersand. If ampersands that are actually ampersands are not escaped, then there is no way to tell the difference between an actual ampersand and the beginning of an entity.
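A small sketch of that ambiguity, using Python's html module as a stand-in for an entity-decoding parser.  The URL is invented for the example, and real browsers apply extra rules inside attribute values, so they may leave this particular case alone; the point is only that an unescaped ampersand is open to interpretation.

```python
import html

# A made-up URL whose second query parameter happens to be named "copy".
url = "search?lang=en&copy=2"

# Decoding entities in the raw text turns "&copy" into the copyright sign,
# which silently corrupts the query string.
decoded_raw = html.unescape(url)

# Escaping the ampersand first removes the ambiguity...
escaped = html.escape(url)

# ...and decoding the escaped form gives back exactly the original URL.
round_trip = html.unescape(escaped)

print(decoded_raw)
print(escaped)
print(round_trip == url)
```

So the four extra characters are what lets a parser tell "this is a literal ampersand" apart from "an entity starts here".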

Isaac Morland
Monday, July 26, 2004

E. Naeher  -

stand back, you're getting Zealot-smell on me

muppet
Monday, July 26, 2004
