6.1 HTML Document Structure

A proper HTML document begins with a prolog, just as an XML document does. The prolog consists of the Document Type Declaration (DOCTYPE), which for HTML 4.01 comes in three versions. Each version restricts or extends the markup that may be used. Early versions of HTML used some markup that has since become deprecated (outdated and superseded by newer constructs) or obsolete. Any deprecated elements used in this chapter are noted as such. The three DOCTYPE declarations are:

  • <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
    This declaration is for strict pages that use no deprecated elements or attributes and no framesets.

  • <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
    This declaration allows all the elements and attributes of the strict declaration and also permits deprecated markup. Most of the deprecated elements are for styling text in the HTML document.

  • <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN" "http://www.w3.org/TR/html4/frameset.dtd">
    This declaration is for documents that use frames. It allows all of the transitional elements and adds framesets and frames.
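
As a minimal sketch of where the prolog sits, the following page places the strict declaration on its first line, ahead of the root element. The title, the character encoding, and the paragraph text are placeholders chosen for illustration.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<!-- The DOCTYPE above is the prolog; the root element follows it -->
<html lang="en">
<head>
  <!-- Placeholder encoding and title, assumed for this example -->
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  <title>Strict example</title>
</head>
<body>
  <!-- Strict pages avoid deprecated styling markup such as <font> or <center> -->
  <p>This page uses no deprecated elements, attributes, or framesets.</p>
</body>
</html>
```

Swapping the first line for the transitional or frameset declaration changes which elements a validator will accept; the rest of the skeleton stays the same for non-frame pages.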

XHTML has similar Document Type Declarations:

  • <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

  • <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

  • <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">
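
For comparison, a minimal XHTML 1.0 Strict page might look like the sketch below. The xmlns attribute on the root element is something XHTML 1.0 requires beyond what is described here; the title, encoding, and paragraph text are again placeholders.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<!-- XHTML is XML: element names are lowercase and every element is closed -->
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <!-- Empty elements end with " />" (placeholder encoding and title) -->
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <title>XHTML Strict example</title>
  </head>
  <body>
    <p>Every start tag has a matching end tag.</p>
  </body>
</html>
```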

HTML and XHTML documents must have a root element, html, to be well formed. This root element may carry attributes that further describe the document, and browsers may render the page differently based upon them. The version attribute specifies the HTML version named in the DOCTYPE; it is deprecated in HTML 4.01 because the DOCTYPE already carries that information. The lang attribute gives the base language of the page. The dir attribute works with the lang attribute to specify the direction in which the language is natively read. Its value can be left to right (LTR) or right to left (RTL). The dir attribute should not be confused with the <dir> element, which has been deprecated.

<html version="4.01" lang="EN" dir="LTR|RTL">
<!-- comments are the same in HTML as in XML -->
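
With concrete values filled in, a root element for a right-to-left page could be sketched as follows. The choice of Arabic, the sample text, and the title are assumptions made for illustration; the English paragraph shows lang and dir overriding the base values on a single element.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<!-- Base language Arabic, read right to left (illustrative choice) -->
<html lang="ar" dir="rtl">
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  <title>Direction example</title>
</head>
<body>
  <!-- Inherits lang="ar" and dir="rtl" from the root element -->
  <p>مرحبا</p>
  <!-- Overrides the base language and direction for this paragraph only -->
  <p lang="en" dir="ltr">Hello</p>
</body>
</html>
```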