08.19.99
Oh, to Structure Code
Aug 19 Thu (06 PM)
(This is the start of a new WebDev article I want to write. Your comments are appreciated.)
Today I read an article about XHTML on http://www.Builder.com, pointed to from http://www.scriptingnews.com. In a very short piece, it covered the major points concerning XHTML: what it is, why it is, and why web developers should know about it. It was a nice little article. Then I read the ‘Feedback’ message board.
http://buzz.builder.com/cgi-bin/WebX?13@@.ee7b2bd/0
Good god. I didn’t get through the whole archive, but what I saw gave me a good scare. And it reminded me of something I had noticed before. A deep, untold secret that most people don’t know, and most web developers desperately need to be told: poor HTML code crashes browsers.
This is what threw me for a loop:
Does anyone else go “ick” at the idea of well-formed HTML? I mean, HTML escaped the rigidity of SGML, and got rid of the concept that every tag was a container and needed a close tag.
It actually made me sick to my stomach. Closing tags and well-formed HTML are so very important. Very, very important.
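To make this concrete, here are two tiny made-up fragments. The first leaves its list items (and the list itself) open; the second closes everything it opens:

    <ul>
    <li>flour
    <li>chocolate
    <li>eggs

    <ul>
      <li>flour</li>
      <li>chocolate</li>
      <li>eggs</li>
    </ul>

A browser will draw both as the same bulleted list. But only the second fragment tells the parser exactly where each item ends. The first one makes the parser figure that out on its own.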
If you pick apart the pieces of a web browser, you’ll find that there’s an ‘engine’ that tries to figure out what each character in an HTML text file means. It’s called a parser. For a great look at the pieces of code in a web browser, see this article on Gecko from WebTechniques magazine: http://www.webtechniques.com/archives/1999/03/gessner/
Imagine you were trying to follow a recipe for a cake, but the steps were out of order. Now imagine that you’re really stupid and have no idea how to cook, but all you have is this recipe. I think you’d fail. You might come out with a lump of flour, chocolate and eggs, but few would call the results of that recipe a cake.
In the same way, poorly formed HTML is like a mixed-up recipe. But programmers have done their best to write parsers that will interpret these poorly written HTML recipes anyway.
However, by making guesses and assumptions (something computers are _not_ cut out to do), the parsers don’t always get things right. That may not be a visible problem when it comes to displaying things on a screen, but a parser is more than its graphical output. It’s a computer program that manipulates _lots_ of data, bits, and bytes.
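Here’s the kind of guesswork I mean, with another made-up fragment. The close tags overlap instead of nesting:

    <b>This is bold, <i>this is bold italic,</b> and what is this?</i>

The </b> arrives before the <i> has been closed. Should those last few words still be italic? No specification blesses this markup, so every parser has to invent its own answer, and different browsers have been known to invent different ones.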
My theory is that parsers are written by humans. Given this premise, I logically deduce that since humans are fallible, their parsers are fallible as well. Everyone with me?
If the parser doesn’t work perfectly, those bits and bytes might get loose and get scattered all over your RAM. Programmers call this memory corruption, and its cousin, memory that gets allocated and never freed, is a ‘memory leak’. What happens quickly after either one? Crashes. Big ol’ honking Crashes.
I noticed a few years ago that when I was working on certain pages of HTML, Netscape would crash. It wouldn’t crash the first time the page was rendered; it might happen the eighth or twelfth time. Similar problems would crop up with frames. Sometimes the page wouldn’t crash the browser right away, but the browser would crash later on some unrelated web site.
However, once I cleaned up the HTML, the crashes stopped. I imagine the parser was no longer running into bugs in its programming. The amount of programming that goes into rendering a well-structured HTML document is _minuscule_ compared to the amount a browser needs to clean up poorly coded HTML. The odds say that bigger pieces of code have more bugs. Since these bugs don’t necessarily rear their ugly heads in the displayed page, they get overlooked. But they’re there. And they’re wreaking havoc.
No piece of software should crash. But since we don’t live in _that_ perfect world, web developers need to take responsibility for their code. HTML may not be programming, but poor HTML can be just as hazardous as poor programming when the two are brought together on a computer screen. Keeping your HTML correctly structured is part of being a responsible web developer.
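And if you’re wondering what ‘correctly structured’ looks like in XHTML terms, it boils down to a few habits: lowercase tag names, quoted attribute values, and a close for every tag you open, even the empty ones. A rough sketch (the image file name is made up for illustration):

    <p>Every tag gets closed, even the empty ones,
    like <br /> and
    <img src="cake.jpg" alt="a cake" />.</p>
    <!-- cake.jpg is a made-up example file -->

Pages written this way still display fine in today’s browsers, and the parser never has to guess.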