XHTML Design Guide

Document Specifications

2. Hypertext Document Specifications

2.1 DTD
2.2 HTML 3.2
2.3 HTML 4.01
2.4 XHTML 1.0
2.5 XML
2.6 Standards and Compliance
2.7 Validation


2.1 Document Type Definitions (DTD)

A DTD is a "Document Type Definition", which specifies the syntax (grammatical structure) of a web page. It lets the software that reads a page's code know which set of document rules that code follows.

Document Type Definition (as it relates to HTML)
An HTML DTD describes in precise, computer-readable language the allowed syntax and grammar of HTML markup.

There are various levels and versions of HTML in use on the Web, each described by its own DTD. Validating an HTML document involves checking its markup against the DTD it declares and reporting any markup errors.

DOCTYPE is a declaration that should go at the beginning of any valid HTML document.

<!DOCTYPE HTML PUBLIC
       "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML>
<HEAD>
<TITLE>HTML Validation</TITLE>
</HEAD>
<BODY>
<H1>HTML Validation</H1>
...
</BODY>
</HTML>
				

The Document Types currently valid and in use for HTML documents come in several flavors: HTML 3.2, HTML 4, XHTML 1.0 and XML.

The sections below list the most commonly used definitions.


2.2 HTML 3.2 (specs)

HTML 3.2, affectionately known as Wilbur at the W3C, is an important, widely accepted and widely implemented standard that remains in use today. HTML 3.2 became a W3C Recommendation in January of 1997.

HTML 3.2 Document Type Definitions (DTDs)

HTML 3.2 DTD

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
				

This declares the document to be HTML 3.2. HTML 3.2 is well supported by most browsers in use. However, HTML 3.2 has limited support for style sheets and no support for HTML 4.0 features such as frames and internationalization.

Choose a flavor of HTML depending on what kind of audience you will be targeting.

Older flavors of HTML like Wilbur have the nice benefit of being almost universally browsable. That is, pretty much anyone nowadays should be able to view a page written in HTML 3.2 as its creator intended, regardless of browser, OS or hardware.

If you are unsure whether a given tag will work with either of the two major browsers, use the excellent reference table from Blooberry.com to make sure the tags you are using will work.


2.3 HTML 4.01 (specs)

HTML 4.0 Strict:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN">
				

The Strict DTD includes elements and attributes that have not been deprecated or do not appear in framesets. Use this DTD when you're doing all of your formatting in Cascading Style Sheets (CSS). In other words, you aren't using <font> and <table> tags to control how the browser displays your documents.
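
As a rough illustration of the Strict approach, the sketch below keeps all presentation in a style sheet. It uses the HTML 4.01 form of the Strict declaration, which differs from the 4.0 one only in the version number and the added DTD URL. The class name and colors are made up for the example.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
        "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<title>Strict example</title>
<style type="text/css">
  /* Presentation lives in the style sheet, not in the markup */
  h1 { color: #003366; font-family: Arial, sans-serif; }
  p.note { font-size: small; }
</style>
</head>
<body>
<h1>Styled with CSS</h1>
<p class="note">No font element or bgcolor attribute is needed.</p>
</body>
</html>
```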

HTML 4.0 Transitional:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
				

Use this when you need to take advantage of HTML's presentational features because many of your readers don't have the latest browsers that understand Cascading Style Sheets.
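
For comparison, here is a minimal sketch of the kind of presentational markup the Transitional DTD still permits. It uses the HTML 4.01 form of the declaration; the colors and font names are arbitrary choices for the example.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
        "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<title>Transitional example</title>
</head>
<!-- bgcolor and text are presentational attributes: deprecated, but allowed here -->
<body bgcolor="#ffffff" text="#000000">
<center>
<h1><font face="Arial" color="#003366">Styled without CSS</font></h1>
</center>
<p><font size="2">Older browsers can render this formatting.</font></p>
</body>
</html>
```

The same document would be rejected under the Strict DTD, which does not define the font and center elements or the bgcolor attribute.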

HTML 4.0 Frameset:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Frameset//EN">
				

Use this when you want to use HTML Frames to partition the browser window into two or more frames.
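
A frameset document might look roughly like the sketch below, shown with the HTML 4.01 form of the declaration. The file names menu.html and content.html are hypothetical placeholders, and the noframes section gives browsers without frame support something to display.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN"
        "http://www.w3.org/TR/html4/frameset.dtd">
<html>
<head>
<title>Frameset example</title>
</head>
<!-- Two columns: a narrow menu frame and a wide content frame -->
<frameset cols="25%,75%">
  <frame src="menu.html" name="menu" title="Navigation menu">
  <frame src="content.html" name="content" title="Main content">
  <noframes>
    <body>
    <p>This site uses frames. <a href="content.html">View the content page</a>.</p>
    </body>
  </noframes>
</frameset>
</html>
```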



2.4 XHTML 1.0 (specs)

XHTML 1.0 became an official W3C Recommendation on January 26, 2000. A W3C Recommendation means that the specification is stable, that it has been reviewed by the W3C membership, and that the specification is now a Web standard.

XHTML 1.0 is the first step toward a modular and extensible web based on XML (Extensible Markup Language). It provides the bridge for web designers to enter the web of the future, while still being able to maintain compatibility with today's HTML 4 browsers. It is the reformulation of HTML 4 as an application of XML. It looks very much like HTML 4, with a few notable exceptions, so if you're familiar with HTML 4, XHTML will be easy to learn and use.

XHTML is very similar to HTML in structure, being almost identical to HTML 4.0; however, XHTML is a stricter and cleaner version of HTML.

XHTML in a Nutshell

  • Element and attribute names must be written in lowercase.
  • All tags, including empty elements, must be closed.
  • All attribute values must be quoted.
  • Elements must nest, not overlap.
  • All documents must have a doctype declaration.
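
The rules above can be sketched with a small before-and-after fragment. The first half is loose markup of the kind HTML browsers tolerate; the second half is the same content rewritten to follow the XHTML rules. The file name logo.gif is a placeholder.

```html
<!-- Loose markup that HTML browsers tolerate, but that breaks the XHTML rules:
     uppercase names, unclosed P and BR, unquoted values, overlapping B and I -->
<P>First line<BR>
<IMG SRC=logo.gif ALT=Logo>
<B><I>bold italic</B></I>

<!-- The same content following the XHTML rules -->
<p>First line<br />
<img src="logo.gif" alt="Logo" />
<b><i>bold italic</i></b></p>
```

The space before the closing slash in empty elements such as br and img is a common convention that helps HTML 4 browsers cope with the XML-style syntax.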

There are many excellent walkthroughs of the XHTML vs. HTML differences available for your perusal online through the links and resources below.

Minimalist XHTML document

<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>simple document</title>
</head>
<body>
<p>a simple paragraph</p>
</body>
</html>

To learn more about XHTML, try taking a quick and painless tour of XHTML School. If you know HTML and feel pretty comfortable writing it, you should be able to start turning out XHTML pages in a matter of minutes!

Check out the extensive XHTML Standards, Document Specifications, and Resources for more information on XHTML and its applications.

XHTML 1.0 Strict:

<!DOCTYPE html PUBLIC
"-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
				

Use this when you want really clean markup, free of presentational clutter. Use this together with Cascading Style Sheets.

XHTML 1.0 Transitional:

<!DOCTYPE html PUBLIC
"-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
				

Use this when you need to use presentational markup in your document. Most of us will be using the Transitional DTD for quite some time, because we don't want to limit our audience to users with browsers that support CSS.

XHTML 1.0 Frameset:

<!DOCTYPE html PUBLIC
"-//W3C//DTD XHTML 1.0 Frameset//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">
				

The Frameset DTD includes everything in the transitional DTD plus frames.



2.5 XML (specs)

XML: an "eXtensible Markup Language"
The W3C began to release recommendations for XML early in 1998. Since then many drafts have changed, and XML is starting to look like the network document language of the future web. What does this mean to the humble HTML developer? Probably not a whole lot.

If you want to learn more about XML, XML Basics from FineTuning.com is a good starting point for beginning your journey into the land of XML.



2.6 Standards and Compliance

It is important to keep in mind when discussing standards that not all web pages adhere to them. Probably only a very small percentage of the documents floating around the internet today adhere to any standards whatsoever! Alan Richmond, in HTML Standards Compliance - Why Bother?, argues the following: "By adhering to the standards you maximise the accessibility of your work to the widest range of user agents, and therefore, users."

Recently, a global initiative formed that encourages web authors to adhere to the standards laid out by the W3C. Compliance with standards in HTML production is scattered at best. One of the main voices in this argument has been the WSP, or Web Standards Project.

WSP Baseline Standards Proposal
"Creating multiple versions of the same Web page because of incompatibilities among browsers is wasteful and self-defeating for Web developers and their clients. The alternative is to try to resolve the incompatibilities by often complicated workarounds that are costly for developers and their clients - at the cost of preventing Web pages from being flexible enough to be used by emerging television-based and PDA-based browsers."

2.7 Real HTML Validation

Genuine HTML validators employ SGML parsers to check a document's syntax against a document type definition (DTD). HTML standards issued by the W3C specify DTDs for checking the validity of HTML documents.
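
To make this concrete, here is a small sketch of a document that declares the HTML 4.01 Strict DTD but does not conform to it. The comments mark the problems a DTD-based validator would be expected to report; the exact wording of the messages varies from tool to tool, and chart.gif is a placeholder file name.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
        "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<!-- Error: the required TITLE element is missing from HEAD -->
</head>
<body>
<!-- Error: FONT is not defined in the Strict DTD -->
<font color="red">Warning</font>
<!-- Error: IMG is missing its required ALT attribute -->
<p><img src="chart.gif"></p>
</body>
</html>
```

Most browsers would still display this page, which is why checking by eye is no substitute for checking against the DTD.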

There are other programs, commonly called "lints" or "linters," that also check HTML documents. However, lints generally do not use an SGML parser with a DTD, but instead use a simpler, less formal parser. Lints do not find all errors caught by a real validator, and they often report false errors.

Lints are useful tools for tracking down problems other than invalid HTML. For example, a lint might point out that the OBJECT element is poorly supported among current browsers or that the FONT element is considered harmful. Reports from lints are subjective--they reflect the opinions of the lint's developer. Reports from validators are objective--they simply tell you what your errors are according to HTML standards.

Using a lint is not a substitute for real HTML validation. New developments on the Web--such as the growing use of XML and the strong standards focus of Mozilla--continue to demonstrate the need for valid HTML. Using just a lint is not enough.

Unfortunately, some programs claim to be "HTML validators" when they are really lints. In an attempt to avoid confusion, here are separate lists of true validators and lints. As you can see, many of the lints generate bogus errors for this document, which is valid HTML 4.0 Strict and compatible with any browser.

Real Validators

WDG HTML Validator
This validator is a bit more user-friendly than the W3C's. It supports the new XHTML DTDs. If a document fails to include a DOCTYPE, the validator assumes HTML 4.01 Transitional.
W3C HTML Validation Service
This is an easy-to-use HTML validation service based on an SGML parser. It checks HTML documents for compliance with W3C HTML Recommendations and other HTML standards. You can test this set of documents via the "Valid XHTML 1.0!" button at the bottom of the page. Clicking it shows the validation results. If the result is "No errors found", then we're doing well.

Lints

§ Further Reading §




The Onaicul Project
Valid XHTML 1.0! Under OpenContent License Made with CSS
contact: <luciano$onaicul,com>