ara pehlivanian

Web Standards, Web Culture, Web Everything.™

HTML or XHTML: A purist’s dilemma

Whenever possible, I like to do things the right way. Proper implementation of anything—though it may take more effort initially—returns dividends in the form of higher performance, reliability, scalability, re-usability, and so on and so forth. This principle applies to life in general and isn’t limited to web development (read: Tacoma Narrows Bridge). It is however at the core of this article and the focus of the issue I am wrestling with.

For some time now, I was under the impression that XHTML was simply “better HTML,” thanks to its inherent well-formedness, what with it being written in XML and all. But it seems I was mistaken (and I’m willing to admit it). Recently I’ve been reading articles by Anne van Kesteren, and they’ve got me wondering whether I just jumped on the XHTML bandwagon without even asking myself that one all-important question: “Do I really need this?”

Had I taken the time to explore the issue a little, I would have found out that in order to deliver XHTML to a client properly, I should be doing it via the application/xhtml+xml MIME type. Currently my documents are delivered as text/html, so all I have to do is change the MIME type, right? Not really. While Firefox supports it, IE doesn’t. A document delivered to IE as application/xhtml+xml causes the browser to display a download window, because it doesn’t recognize that MIME type as a document format it can render.

I could rig my web server to deliver the document as text/html for IE, but then I’d have to ask myself: “Is the effort worth it?” In other words, what advantage would XHTML bring me for the effort I’d have to expend in properly implementing it, especially since IE would still be getting the incorrect MIME type? The answer, unfortunately, is: none. There really aren’t any major semantic differences between HTML and XHTML, so I’d only need the latter if I had to extend HTML with MathML or some other custom markup language. For more on this, read Anne’s post entitled “MIME types matter; DOCTYPEs don’t.”

Now, some of you may be thinking, “Big deal, it still validates as XHTML even if the MIME type says it’s HTML and not XHTML.” Well, that would be a bad assumption to make, since according to W3C HTML WG Chair Steven Pemberton, “…documents served as text/html should be treated as HTML and not as XHTML.” This means that even though I work very hard at writing well-formed XHTML, in reality I’ve been doing exactly what I’ve been trying hard to avoid: I’ve been writing tag soup! That’s right: in essence, I’ve been delivering HTML documents made of malformed HTML.

My dilemma is the following: Do I switch to HTML 4.01 and get used to uppercase tags that don’t self-close? Do I stay with XHTML 1.0 Strict and keep delivering it with the incorrect MIME type? Do I rig Apache to deliver my XHTML with the application/xhtml+xml MIME type and abandon visitors with older browsers (like IE 6)? Or do I find a happy medium that delivers application/xhtml+xml to conforming browsers and text/html to everyone else?
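
That last option amounts to content negotiation on the browser’s Accept request header. Here is a rough sketch of the idea in Python. It is not what this site runs, and the function names parse_accept and choose_mime_type are made up for illustration. It assumes a browser counts as “conforming” only if it lists application/xhtml+xml explicitly and doesn’t rank it below text/html; a bare */* wildcard is not taken as support.

```python
def parse_accept(header):
    """Map each media type in an Accept header to its q-value (default 1.0)."""
    prefs = {}
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unreadable q-value as "not acceptable"
        prefs[media_type] = q
    return prefs


def choose_mime_type(accept_header):
    """Return application/xhtml+xml only for clients that ask for it by name
    (hypothetical policy); everyone else gets text/html."""
    prefs = parse_accept(accept_header or "")
    xhtml_q = prefs.get("application/xhtml+xml", 0.0)
    html_q = prefs.get("text/html", 0.0)
    if xhtml_q > 0 and xhtml_q >= html_q:
        return "application/xhtml+xml"
    return "text/html"


# Illustrative headers, not captured from real browsers.
print(choose_mime_type("text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"))
print(choose_mime_type("image/gif, image/jpeg, */*"))
```

Whatever picks the type, a response that varies by Accept header should also send Vary: Accept so that caches don’t hand the XHTML version to a browser that can’t read it.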

What do you think?

9 Comments

  1. Molly E. Holzschlag says:

    Actually, there are great benefits to using XHTML 1.0 in the here and now. It’s easier to learn (and teach), it’s great in multi-team environments, and if you ultimately will be tapping into extensibility, your documents are more flexible.

    XHTML 1.0 can be sent as text/html - that’s according to the specs. This was done for backward compatibility. Now, XHTML 1.1 sent as text/html would be tag soup, but XHTML 1.0 sent as text/html is fine.

    Anne is one smart guy and I respect him a great deal, but this is an area he and I will never agree on, although HTML 5.0 is pretty intriguing and his work on that is impressive. Steven is also one savvy fellow, and what he’s saying in that email isn’t that XHTML (1.0) can’t be served as text/html, but that it should be treated as HTML by browsers rather than as XHTML, to support the goals of backward compatibility.

  2. Ara says:

    XHTML 1.0 can be sent as text/html - that’s according to the specs.

    You see, I must have missed that.

    …what he’s saying in that email isn’t that XHTML (1.0) can’t be served as text/html but should be treated as HTML…

    Okay, but if it’s treated as HTML, then wouldn’t it be tag soup? Or at the very least, broken? Because according to the HTML spec, the end tag for <img> is forbidden, which would seem to rule out <img />: “Start tag: required, End tag: forbidden.”

  3. I’ve converted to application/xhtml+xml | ara pehlivanian—Web Standards, Web Culture, Web Everything.™ says:

    [...] So I’ve gone ahead and decided what to do about my dilemma. I’m now serving my pages using the application/xhtml+xml MIME type to compliant browsers and the text/html MIME type to everything else. In order to be able to do this, I installed a wonderful little plugin called WP Content Negotiator written by Admiral Justin. It feels like I just got my first tattoo. It’s a combination of feeling like “wow, I did it!” and “oh boy, I hope I don’t regret this.” [...]

  4. mattur says:

    there are great benefits to using XHTML 1.0 in the here and now. It’s easier to learn (and teach),

    Debatable, imho. The rapid uptake of the web suggests HTML has had no particular difficulties being taught or learnt.

    it’s great in multi-team environments,

    Oh come on, this is pure rubbish! It might work with newbies, but any tech-literate person is going to have trouble with such a ludicrous statement.

    and if you ultimately will be tapping into extensibility, your documents are more flexible

    Agreed - if the only way you can do something is by using XHTML then use it. Otherwise, don’t - you can always automagically transform good HTML to XHTML at a later date, should the requirement ever arise. Sooner or later WASP are going to have to admit this… I’d suggest sooner would be the better option.

  5. Tilde says:

    My dilemma is the following: Do I switch to HTML 4.01 and get used to uppercase tags that don’t self-close?

    Waaaaiiit, wait, wait, wait, wait, wait, wait.

    Wait.

    HTML can be lowercase, you know. It can also be very well-formed and consistent-looking and all those things that are required by XHTML.

    It can even be well structured and semantically sound and well designed, something that is not required by XHTML.

  6. Ara says:

    Tilde: I realize that now. But through this whole experience I’ve come to the conclusion that I want my site to be a place where I can mess around with different techs that appeal to my obsessive-compulsive side. I also misunderstood the W3C’s “forbidden closing tag” statement to mean that you couldn’t self-close <img> and <br>, when really they literally meant no closing tags, i.e.: <br></br>

  7. bart says:

    I completely agree with Molly.

    The reason for XHTML being better in team environments is that everyone uses the same coding conventions because of the strict syntax rules. With HTML, some people may quote numeric attributes, some may not, and others may replace them with CSS.

  8. Ara says:

    Bart: Yeah, but without rigorous validation in your development process, you can still get away with awful markup in XHTML. Unless you deliver the document as application/xhtml+xml, nobody’s holding you accountable. However, if you do deliver it as application/xhtml+xml, then a single well-formedness error makes the page fail with a parse error instead of rendering, so you’re held (severely) accountable for sloppy code.

    It’s not that I don’t agree with you. XHTML (the spec) demands stricter markup; however, without anyone making sure you stick to those rules, you may as well be writing plain ol’ HTML.

    Having said that, HTML also requires that you follow at least some rules, and even those can be broken (and routinely are) by sloppy developers. I honestly think that in the end, the quality of a page’s markup relies more on the quality of the skills of the person who’s doing the marking up, rather than the spec they’re supposed to follow.

  9. The Morning-ish Post » Blog Archive » HTML versus XHTML - the battle rages says:

    [...] Further reading: What is XHTML anyway? http://www.alistapart.com/articles/xhtml/ That’s not XHTML, that’s tag soup IE7 blog post on XHTML and MIME type Anne van Kesteren has written a lot about this topic [...]
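
The accountability point from comment 8 is easy to see in miniature. The following Python sketch feeds the same sloppy fragment to a strict XML parser and to a forgiving HTML tokenizer. It uses Python’s standard library as a stand-in for a browser’s two parsing modes, which is only an approximation; the sample markup and the helper names xml_accepts and html_tags are invented for the illustration.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical fragment: <em> is never closed and <br> is not self-closed.
SLOPPY = "<p>An unclosed <em>emphasis and a stray break<br></p>"
TIDY = "<p>Closed <em>emphasis</em> and a proper break<br /></p>"


class TagCollector(HTMLParser):
    """Record every start tag seen, without complaining about nesting."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def xml_accepts(markup):
    """True if a strict XML parser considers the markup well-formed."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


def html_tags(markup):
    """Start tags a lenient HTML tokenizer reports for the same markup."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


print(xml_accepts(SLOPPY))   # the XML parser rejects the sloppy fragment
print(html_tags(SLOPPY))     # the HTML tokenizer carries on regardless
print(xml_accepts(TIDY))
```

The strict parser is the one that “holds you accountable”; the lenient one quietly accepts both fragments, which is roughly what happens to XHTML served as text/html.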

    © 2005-2008, Ara Pehlivanian.
