The Trouble With hCard and Microformats in General

If you haven’t already heard of Microformats, they are a set of simple, open data formats built upon existing and widely adopted standards. Basically, they allow you to use a combination of class names to mark up data in your page along the same lines as existing data formats. So for example, you’d mark up contact info on a page using class names based on the popular vCard format. The resulting markup would be an hCard. In essence, Microformats aren’t in the business of reinventing the wheel; they’re all about reuse of existing patterns, and therein lies their genius.

<span class="tel">
<span class="type">home</span>:
<span class="value">+1.415.555.1212</span>
</span>

An example of a tel property from an hCard, from: http://microformats.org/wiki/hcard

There is, however, one glaring problem with Microformats, and I ran into it head-on when I was marking up a page in French. Since property values need to be in the clear (in this case, “home”), and those values need to follow an established format (in this case vCard), you can’t use any language other than English. Yep, you heard me, Microformats (at least hCard) are English only. So much for i18n.

<span class="tel">
<span class="type">Téléphone</span>:
<span class="value">+1.415.555.1212</span>
</span>

An example of an invalid hCard due to a property value in French

I think the defining difference between hCard and vCard is that the former has to show its property values in cleartext on the page, whereas the latter, as far as I can tell, does not. In other words, you can exchange vCards containing English property values using Japanese applications: since the vCard is simply a file, the Japanese app can open it up, read the properties in English and then display them with Japanese labels.
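As a rough illustration of the vCard case described above, here is a minimal Python sketch of how an application might read an English type value from a vCard line and show a localized label instead. The function name `localized_tel`, the `LABELS_JA` table and its Japanese strings are all hypothetical, and the parsing is deliberately simplified (one TYPE parameter, no line folding); it is not how any particular address book works.

```python
# Hypothetical label table for illustration only; a real application
# would ship its own translations.
LABELS_JA = {"home": "自宅", "work": "勤務先", "cell": "携帯"}


def localized_tel(line, labels):
    """Split a simple vCard TEL line into (localized label, number).

    Assumes a single-line property such as
    "TEL;TYPE=home:+1.415.555.1212" with at most one TYPE parameter.
    """
    name_and_params, _, value = line.partition(":")
    tel_type = "voice"  # fallback used by this sketch when no TYPE is given
    for param in name_and_params.split(";")[1:]:
        key, _, val = param.partition("=")
        if key.upper() == "TYPE":
            tel_type = val.lower()
    # The stored value stays English; only the displayed label changes.
    return labels.get(tel_type, tel_type), value


print(localized_tel("TEL;TYPE=home:+1.415.555.1212", LABELS_JA))
```

The stored data never leaves English; the translation happens purely at display time, which is exactly what markup that is itself the display cannot do.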

I tried discussing the issue with the Microformats community, but the results were far from conclusive. I was told that the abbr element could be used to remedy this situation; however, the spec doesn’t discuss this in terms of i18n. Rather, the intended use of abbr is to differentiate between human-readable and machine-readable formats of the same content. For example, dates:

<abbr title="2008-04-15T00:00:00">April 15th, 2008</abbr>
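To make the abbr suggestion concrete, here is a small Python sketch of a consumer that reads the type of a tel property, preferring the title attribute when the element is an abbr. It assumes parsers treat the title as the machine-readable value, in the same way as for the date above. The class `TypeValueParser`, the helper `type_values` and the French label "Domicile" are made up for illustration, and the parsing only handles flat, non-nested markup.

```python
from html.parser import HTMLParser


class TypeValueParser(HTMLParser):
    """Collect the value of every element whose class list includes "type"."""

    def __init__(self):
        super().__init__()
        self.values = []
        self._capturing = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "type" not in (attrs.get("class") or "").split():
            return
        if tag == "abbr" and attrs.get("title"):
            # Assumed abbr behaviour: title is the machine-readable value.
            self.values.append(attrs["title"])
        else:
            # Otherwise the visible text is the value.
            self._capturing = True
            self.values.append("")

    def handle_data(self, data):
        if self._capturing:
            self.values[-1] += data

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the current capture.
        self._capturing = False


def type_values(markup):
    parser = TypeValueParser()
    parser.feed(markup)
    parser.close()
    return [value.strip().lower() for value in parser.values]


french = ('<span class="tel"><abbr class="type" title="home">Domicile</abbr>: '
          '<span class="value">+1.415.555.1212</span></span>')
print(type_values(french))
```

Under that assumption the reader sees "Domicile" while the consumer sees "home". Whether the spec actually intends abbr to be used this way for i18n is the open question.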

If I’m way off base here, or if Microformats have since evolved to include i18n, please by all means let me know in the comments.
