Friday, January 27, 2006
The future of HTML, Part 2: XHTML 2.0
25 Jan 2006
In this two-part series, Edd Dumbill examines the various ways forward for HTML that Web authors, browser developers, and standards bodies propose. This series covers the incremental approach embodied by the WHATWG specifications and the radical cleanup of XHTML proposed by the W3C. Additionally, the author gives an overview of the W3C's new Rich Client Activity. Here in Part 2, Edd focuses on the work in progress at the W3C to specify the future of Web markup.
In the previous article in this series, I described why HTML is due for an update, both to fix past problems and to meet the growing requirements of the tasks to which Web pages and applications are put. I explained the work of the Web Hypertext Application Technology Working Group (WHATWG), a loose collaboration of browser vendors, in creating their Web Applications 1.0 and Web Forms 2.0 specifications.
In this article, I'll examine the work of the World Wide Web Consortium (W3C) in creating the next-generation version of their XHTML specification, and also their response to the demand for "rich client" behavior exemplified by Ajax applications.
The W3C has four Working Groups that are creating specifications of particular interest:
- HTML (now XHTML)
- XForms
- Web APIs
- Web Application Formats
You can find links to each of these in Resources. This article mainly focuses on the work of the HTML Working Group, but it is worth discussing each of the others to give some context as to how their work will shape the future of the Web.
XForms are the W3C's successor to today's HTML forms. They are designed to have richer functionality and to pass their results as an XML document to the processing application. XForms are modularized, so you can use them in any context, not just attached to XHTML. XForms' key differences from HTML forms are:
- XForms separate user interface presentation from data model definition.
- XForms can create and consume XML documents.
- XForms are device independent. For example, you can use the same form in a voice browser and on a desktop browser.
- XForms allow validation and constraining of input before submission.
- XForms allow multi-stage forms without the need for scripting.
As it is a modularized language, XHTML 2.0 imports XForms as a module for its forms functionality.
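To make the split between data model and user interface concrete, here is a minimal sketch of an XForms form. The element and attribute names follow XForms 1.0 as I understand it; the instance data, the id, and the submission URL are hypothetical, and the host-language wrapper is left out.

```xml
<!-- Data model: the XML instance the form edits, plus its constraints.
     This part typically lives in the head of the host document. -->
<xf:model xmlns:xf="http://www.w3.org/2002/xforms">
  <xf:instance>
    <contact xmlns="">
      <name/>
      <email/>
    </contact>
  </xf:instance>
  <!-- Constraint checked before submission, with no scripting -->
  <xf:bind nodeset="name" required="true()"/>
  <!-- Hypothetical endpoint; the filled-in instance is sent as XML -->
  <xf:submission id="send" method="post"
                 action="http://example.org/contact"/>
</xf:model>

<!-- User interface: controls refer to nodes in the instance by path.
     This part lives in the body of the host document. -->
<xf:group xmlns:xf="http://www.w3.org/2002/xforms">
  <xf:input ref="name"><xf:label>Name</xf:label></xf:input>
  <xf:input ref="email"><xf:label>E-mail</xf:label></xf:input>
  <xf:submit submission="send"><xf:label>Send</xf:label></xf:submit>
</xf:group>
```

Because the controls only say which node they edit, not how they should look, the same model could in principle be presented by a desktop browser or a voice browser.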
The W3C's Web APIs Working Group is charged with specifying standard APIs for client-side Web application development. The first and most familiar among these is the XMLHttpRequest functionality at the core of Ajax (which is also a technology that the WHATWG has described). These APIs will be available to programmers through ECMAScript and any other languages supported by a browser environment.
Additional APIs being specified are likely to include:
- An API for dealing with the browser Window object
- DOM Level 3 Events and XPath specifications
- An API for timed events
- APIs for non-HTTP networking, such as XMPP or SIP
- An API for client-side persistent storage
- An API for drag and drop
- An API for monitoring downloads
- An API for uploading files
While these APIs do not need to be implemented in tandem with XHTML 2.0, browsers four years in the future will likely integrate them both to provide a rich platform for Web applications.
XHTML 2.0 is one part of the Web application user interface question, but not the totality. Technologies such as Mozilla's XUL and Microsoft's XAML have pushed toward a rich XML vocabulary for user interfaces.
The Web Application Formats Working Group is charged with the development of a declarative format for specifying user interfaces, in the manner of XUL or XAML, as well as the development of XBL2, a declarative language that provides a binding between custom markup and existing technologies. XBL2 essentially gives programmers a way to write new widgets for Web applications.
The purpose of XHTML 1.0 was to transition HTML into an XML vocabulary. It introduced the constraints of XML syntax into HTML: case-sensitivity, compulsory quoted attribute values, and balanced tags. That done, XHTML 2.0 seeks to address the problems of HTML as a language for marking up Web pages.
In his presentation at the XTech 2005 conference in Amsterdam (see Resources), the W3C's Steven Pemberton expressed the design aims of XHTML 2.0:
- Use XML as much as possible: Where a language feature already exists in XML, don't duplicate or reinvent it.
- Structure over presentation: Thanks to CSS stylesheets, you no longer need explicitly presentational tags in HTML.
- Make HTML easier to write: Remove some of the needless idiosyncrasies of HTML.
- More accessibility, device independence: Make as few assumptions as possible about the way a document will be read.
- Improved internationalization.
- Better forms: Long overdue improvements are required!
- Reduce the need for scripting: Include typical scripting usages in HTML itself.
- Better semantics: Make it easier to integrate HTML with semantic Web applications.
These aims certainly appear pretty laudable to anybody who has worked with HTML for a while. I'll now take a deeper look at some ways in which they were achieved in XHTML 2.0.
When I was a newcomer to HTML many years ago, I remember experiencing a certain amount of bemusement at the textual structural elements in the language. Why were there six levels of heading, and when was it appropriate to use each of them? Also, why didn't the headings somehow contain the sections they denoted? XHTML 2.0 has an answer to this, with the new <section> and <h> (heading) elements: a section contains its heading together with its content, and the level of a heading follows from how deeply its section is nested.
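The original listing is not reproduced here, so the following is only an illustrative sketch of nested sections, with made-up headings and text:

```xml
<!-- Illustrative fragment; headings and text are invented. -->
<section>
  <h>The future of markup</h>
  <p>Introductory text for the whole document.</p>
  <section>
    <!-- A second-level heading purely by virtue of nesting -->
    <h>Background</h>
    <p>Text of the subsection.</p>
    <section>
      <!-- A third-level heading, again with the same element -->
      <h>Earlier attempts</h>
      <p>Text of the sub-subsection.</p>
    </section>
  </section>
</section>
```

Moving the inner section elsewhere in the document, or into another document, requires no change to its headings.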
This is a much more logical arrangement than in XHTML 1.0, and will be familiar to users of many other markup vocabularies. One big advantage for programmers is that they can include sections of content in a document without the need to renumber heading levels.
You can then use CSS to style these headings. While it is to be expected that browsers' default style sheets for XHTML 2.0 will predefine some of these rules, written explicitly they would rely on descendant selectors, in the manner of the XHTML 2.0 specification: h for a top-level heading, section h for a second-level heading, section section h for a third-level heading, and so on.
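As a rough sketch of such rules, assuming they are embedded in the document with a style element as in XHTML 1.0, and with font sizes chosen only for illustration:

```xml
<!-- Illustrative only: the sizes are assumptions, not normative values. -->
<style type="text/css">
  h                 { font-weight: bold; font-size: 200% }
  section h         { font-size: 150% }  /* second-level heading */
  section section h { font-size: 120% }  /* third-level heading */
</style>
```

Each additional level of nesting adds one more section to the selector, so the most specific rule that matches wins.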
Another logical anomaly in XHTML 1.0 is that you must close a paragraph in order to use a list. In fact, you must close it to use any block-level element (blockquotes, preformatted sections, tables, etc.). This is often an illogical thing to do when such content can justly be used as part of the same paragraph flow. XHTML 2.0 removes this restriction. The only thing you can't do is put one paragraph inside another.
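A minimal sketch of what the relaxed content model allows, using invented text: the list sits inside the paragraph rather than interrupting it.

```xml
<!-- Illustrative fragment; not valid in XHTML 1.0, where the list
     would force the paragraph to be closed first. -->
<p>Before you begin, make sure you have:
  <ul>
    <li>a text editor</li>
    <li>a browser that understands the markup</li>
  </ul>
  and then carry on with the first exercise.
</p>
```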
The <img> tag in HTML is actually pretty inflexible. As Pemberton points out, it does not include any fallback mechanism except alt text (hindering adoption of new image formats), the alt text can't be marked up, and the longdesc attribute never caught on due to its awkwardness. (longdesc is used to give a URI that points to a fuller description of the image than given in the alt attribute.)
XHTML 2.0 introduces an elegant solution to this problem: Allow any element to have a src attribute. A browser will then replace the element's content with the content at the URI. In the simple case, this is an image. But nothing says it can't be SVG, XHTML, or any other content type that the browser is able to render.
The <img> tag itself remains, but now can contain content. The new operation of the src attribute means that the alt text is now the element's content, and so it can carry markup of its own.
This is especially good news for languages such as Japanese, whose Ruby annotations (see Resources) require inline markup that was previously impossible in attribute values.
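A small sketch of both ideas, with hypothetical file names and text. The srctype attribute is my reading of the working drafts of the time and may be spelled differently in other drafts.

```xml
<!-- The element content plays the role of the old alt text,
     and can now contain inline markup. -->
<img src="harbour.png">An aerial view of the <em>old</em> harbour</img>

<!-- Any element can carry src: if the browser can render the resource
     it replaces the content, otherwise the text is shown. -->
<p src="harbour.svg" srctype="image/svg+xml">
  The harbour lies just north of the railway station.
</p>
```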
XHTML 2.0 offers a more generic form of image inclusion in the <object> element, which you can use to include any kind of object -- from images and movies to executable code like Flash or Java technology. This allows for a neat technique to handle graceful degradation according to browser capability; you can embed multiple <object> elements inside each other. For instance, you might have a Flash movie at the outermost layer, an AVI video file inside that, a static image inside that, and finally a piece of text content at the center of the nested objects. See the XHTML Object Module (linked in Resources) for more information.
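The nesting described above might be sketched like this. The file names are hypothetical, and the src and srctype attribute names are my reading of the working drafts of the time.

```xml
<!-- A browser renders the outermost object it can handle and
     ignores everything nested inside it. -->
<object src="tour.swf" srctype="application/x-shockwave-flash">
  <object src="tour.avi" srctype="video/x-msvideo">
    <object src="tour.png" srctype="image/png">
      A guided tour of the harbour, starting at the lighthouse.
    </object>
  </object>
</object>
```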
HTML has long had some elements with semantic associations, such as <address>. The problem with these is that they are few and not extensible. In the meantime, some have attempted to use the class attribute to give semantics to HTML elements. This is stretching the purpose of class further than it was designed for, and can't be applied very cleanly due to the predominant use of the attribute for applying CSS styling. (Some argue about this assertion of the purpose of class, but the latter point is undeniable.)
Moving beyond these ad-hoc methods, XHTML 2.0 introduces a method for the specification of RDF-like metadata within a document. RDF statements are made up of triples (subject, property, object). For instance, in English you might have the triple: "my car", "is painted", "red".
The about attribute acts like rdf:about, specifying the subject of an RDF triple -- it can be missing, in which case the document itself will be the subject. The property attribute is the URI of the property referred to (which can use a namespace abbreviation given a suitable declaration of the prefix; more detail is available in the XHTML 2.0 Metainformation Attributes Module, see Resources).
Finally, the third value in the triple is given by the content of the element to which the about and property attributes are applied -- or if it's empty, the value of the content attribute. A simple usage, familiar from existing uses of the HTML <meta> tag, is specifying a creator in the page header.
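The original listings are not reproduced here, so the following fragments are only a sketch. The author name and heading text are invented, the dc prefix is bound to the Dublin Core element set, and the unprefixed title value is assumed to resolve to XHTML 2.0's own vocabulary.

```xml
<!-- In the head. With no about attribute, the subject is the document:
     (this document, dc:creator, "A. N. Author") -->
<meta xmlns:dc="http://purl.org/dc/elements/1.1/"
      property="dc:creator">A. N. Author</meta>

<!-- In the body. The heading text is rendered as usual and also
     serves as the object of the triple:
     (this document, title, "Harbour walks") -->
<h property="title">Harbour walks</h>
```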
The property attribute can also be applied in the body of the document. Placed on a heading with the title property as its value, it denotes the heading as the XHTML 2.0 title of the document, and specifies it as the inline heading. Finally, an end to writing the title out twice in every document!
Thanks to a simple transforming technology called GRDDL (Gleaning Resource Descriptions from Dialects of Languages -- see Resources), you now have a single standard for extracting RDF metadata from XHTML 2.0 documents.
XHTML 2.0 has plenty of other changes, many of which are linked in with the parallel development of specifications such as XForms. I don't have room to cover them all here. Regardless, it's certainly a marked leap from XHTML 1.0.
A few other new toys in XHTML 2.0
Fed up with writing <pre><code>? Now you can use the new <blockcode> element.
To help with accessibility requirements, XHTML 2.0 now has a role attribute.
Browsers currently support some navigation of focus through the Tab key, but it can be arbitrary. The new nextfocus and prevfocus attributes let you specify the order in which focus moves.
However deep the changes in advanced features, XHTML 2.0 is still recognisably HTML. Although it has new elements, a lot of XHTML 2.0 will work as-is. The <h1> to <h6> elements were carried through as a compatibility measure, as was <img>.
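The smaller additions in the sidebar above (a code block element, a role attribute, and explicit focus order) might look like the sketch below. The element and attribute names blockcode, role, nextfocus, and prevfocus are taken from my reading of the XHTML 2.0 working drafts; the ids, link targets, and code sample are hypothetical, and details such as the spelling of the id attribute varied between drafts.

```xml
<!-- A navigation list marked up with a role, with focus order
     stated explicitly rather than left to the browser. -->
<ul role="navigation">
  <li><a id="nav-home" href="index.html" nextfocus="nav-news">Home</a></li>
  <li><a id="nav-news" href="news.html" prevfocus="nav-home">News</a></li>
</ul>

<!-- A block of code without nesting code inside pre. -->
<blockcode>
for item in items:
    print(item)
</blockcode>
```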
Resources
Learn
- Read the first article in this two-part series on the future of HTML (developerWorks, December 2005).
- Reference the XHTML 2.0 specification.
- Get the latest news on developments with XHTML -- visit the W3C HTML Working Group.
- Visit the W3C XForms page, which includes information on the XForms Working Group.
- The W3C Web APIs Working Group is charged with specifying standard APIs for client-side Web application development.
- The W3C Web Application Formats Working Group is charged with the development of a declarative format for specifying user interfaces.
- Read Steven Pemberton's XTech 2005 presentation: "XHTML2: Accessible, Usable, Device Independent and Semantic."
- Learn more about Ruby annotations, which are used in Japanese and Chinese to provide pronunciation guides.
- The Metainformation Attributes Module of XHTML 2.0 supports the specification of RDF metadata in HTML documents.
- You can use the Object Module of XHTML 2.0 to include arbitrary objects.
- Want to extract RDF triples from XHTML 2.0 documents? Check out the Gleaning Resource Descriptions from Dialects of Languages (GRDDL) transformation technology.
- The W3C's note on XHTML Media Types describes best practices for serving XHTML from your Web site. In particular, XHTML 2.0 should not be served as text/html, as is possible with XHTML 1.0 crafted to the HTML Compatibility Guidelines.
- Microformats are a way to make human-readable elements in Web pages carry semantics that computers can interpret too. They are a bridge between today's HTML-based ad-hoc semantics and tomorrow's RDF-compatible XHTML 2.0 metadata.
Get products and technologies
- Take a look at the X-Smiles browser, an experimental platform with early (and sometimes only partial) support for many of the W3C's new client technologies, including XHTML 2.0, SVG, XForms, and SMIL.
Edd Dumbill is chair of the XTech conference on Web and XML technologies, and is an established commentator and open source developer with Web and XML technologies.
source: http://www-128.ibm.com/developerworks/xml/library/x-futhtml2.html?ca=dgr-lnxw01XHTML2