See these sample files for additional XHTML information and lots of informative XHTML examples.

History of (X)HTML

SGML and XML are meta-languages, which means they can be used to describe other (markup) languages.

Tim Berners-Lee specified the first version of HTML, and based it on SGML.

Later, a much-simplified version of SGML, namely XML, was used to rewrite HTML as XHTML.

The timeline and relationships look something like this:

SGML     -> HTML -> HTML 2.0 -> HTML 3.2 -> HTML 4.0 -> HTML 4.01 -> HTML 5
 |          1990    1995        1997        1997        1999         ??
 |           |
 V           V
XML 1.0 -> XHTML 1.0 -> XHTML 1.1 -> XHTML 2
            2000         2001        (working group now disbanded)

XHTML Rules

  1. There must be only one "root" element, that is, only one html element.
  2. All tag names must be in lowercase.
  3. All non-empty elements must have a closing tag, and all empty elements must have this form: <tag />
  4. All attributes must have values, which must be enclosed in quotes. (That is, "attribute minimization" is not allowed.) Example: checked="checked"
  5. Proper nesting of elements is enforced: <tag1><tag2> ... </tag2></tag1>
    Part of "proper nesting" are these rules:
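
Rules 1, 3, 4 and 5 above are essentially the XML well-formedness rules, so a generic XML parser can check them. The following is a minimal sketch in Python, assuming the standard xml.etree.ElementTree module; the function name is_well_formed and the sample markup are made up for illustration. It does not check rule 2 (lowercase names) and it does not validate against the XHTML DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as XML (rules 1, 3, 4 and 5)."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample that follows the rules.
good = (
    '<html><body><p>Line one<br />'
    '<input type="checkbox" checked="checked" /></p></body></html>'
)

# Hypothetical samples, each breaking exactly one rule.
samples = {
    "two root elements": "<html></html><html></html>",
    "unclosed empty element": "<html><body><p>Line one<br></p></body></html>",
    "minimized attribute": '<html><body><input type="checkbox" checked /></body></html>',
    "unquoted attribute value": "<html><body><table border=5></table></body></html>",
    "improper nesting": "<html><body><b><i>text</b></i></body></html>",
}

print("good:", is_well_formed(good))
for label, markup in samples.items():
    print(label + ":", is_well_formed(markup))
```

A parser of this kind only reports whether the markup is well-formed; the W3C Validator is still needed to check a document against the XHTML rules proper.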

XHTML Element Groupings

Basic Page Elements (all required for page validation)


Elements That Require Opening and Closing Tags and Have Content

Block-Level Elements with Content
Inline Elements with Content
Elements with Content That Can Be Block-Level or Inline

Empty Elements (that have no content)

Block-Level Elements without Content
Inline Elements without Content

Additional Useful Elements

XHTML Entities

XHTML uses some symbols in a special way. For example, it encloses tags in angle brackets (the "less than" and "greater than" signs, if you like). This makes it difficult to use those symbols in the "normal" way, that is, as less than or greater than symbols. For example, if you write an expression like a<b, how is XHTML to know if you mean that a is less than b, or if <b is the start of an opening bold tag?

Symbols like this are referred to as meta-characters, and they are represented by a special syntax that starts with an ampersand sign (&), is terminated by a semicolon (;), and in between has a sequence of letters that serves as an acronym or shorthand for the symbol in question. This representation is called an XHTML entity. XHTML entities can also be used for inserting unusual characters into a web page, and hence there are a great many of them. Here are just a few of the more common ones:

&lt;    for  <
&gt;    for  >
&amp;   for  &
&copy;  for  ©
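
If page content is generated by a program, the replacement of meta-characters by entities can be automated. Here is a small sketch that assumes Python and its standard html module; the sample text is made up for illustration.

```python
import html

# Text containing all three markup meta-characters.
text = "a<b && c>d"

# quote=False leaves quotation marks alone and escapes only &, < and >.
escaped = html.escape(text, quote=False)
print(escaped)  # a&lt;b &amp;&amp; c&gt;d

# unescape() goes the other way, turning entities back into characters.
print(html.unescape("&copy;"))  # ©
```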

Form Elements

The following combination of specific and generic markup shows the structure of a typical form and many of the form controls (or "widgets") and their attributes that you might see in such a form:

<form action="?" method="get|post">

  <input type="text" name="?" size="#" maxlength="#" value="...displayed text..." />
  
  <input type="radio" name="same_as_others_in_group" value="?" checked="checked" />
  
  <input type="checkbox" name="same_as_others_in_group" value="?" checked="checked" />
  
  <input type="submit" name="?" value="text on button" />
  
  <input type="reset" name="?" value="text on button" />
  
  <input type="button" name="?" value="text on button" onclick="someAction()" />
  
  <input type="password" name="?" size="#" maxlength="#" value="?" />
  
  <input type="file" name="?" size="#" maxlength="#" value="?" accept="MIME type" />
  
  <input type="hidden" name="?" value="?" />
  
  <textarea name="?" cols="#" rows="#">...displayed text... </textarea>
  
  <select name="?" size="#" multiple="multiple">
    <option value="?" selected="selected">...displayed text...</option>
    ...more options...
  </select>
    
  <fieldset>
    <legend>text for fieldset legend</legend>
    -related group of form controls-
  </fieldset>
  
  <label>label for implicitly labeled element</label>
  
  <label for="idOfLabeledElement">label for explicitly labeled element</label>
</form>
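
To see why the name and value attributes matter, it helps to look at what a browser sends when such a form is submitted with method="get": each successful control contributes a name=value pair to the query string appended to the action URL. The sketch below imitates that encoding with Python's standard urllib.parse module. The field names and values are hypothetical, and the browser's real encoding covers more cases (file uploads, for instance).

```python
from urllib.parse import parse_qs, urlencode

# Hypothetical name/value pairs from a submitted form.
fields = [
    ("username", "Ada Lovelace"),  # text input
    ("size", "large"),             # the checked radio button in its group
    ("topping", "cheese"),         # two checked checkboxes that share a name
    ("topping", "olives"),
]

query = urlencode(fields)
print(query)  # username=Ada+Lovelace&size=large&topping=cheese&topping=olives

# On the receiving side, controls that share a name arrive as a list.
print(parse_qs(query)["topping"])  # ['cheese', 'olives']
```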

Notes:

Deprecation

The following elements are deprecated in HTML 4.01 and/or XHTML 1.0, and missing altogether from HTML 5:

The following elements are not deprecated in HTML 4.01 and/or XHTML 1.0, but are nevertheless missing from HTML 5:

The following element is deprecated in HTML 4.01 and/or XHTML 1.0, but has been resurrected in HTML 5:

(X)HTML Attributes

An XHTML attribute is a kind of modifier for an XHTML element. It appears in the opening tag of an element, immediately following the tag name. For example, in <table border="5">, border is an attribute of the table element. As illustrated by this attribute, in XHTML every attribute must have a value, and that value must be enclosed in quotes (single or double). Also, an element may have several attributes, separated by whitespace. If there are multiple attributes, their order is irrelevant.
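
The points about quoting and ordering can be seen by reading attributes with a parser. This is a minimal sketch assuming Python's standard xml.etree.ElementTree module; the summary attribute and its value are added only for illustration.

```python
import xml.etree.ElementTree as ET

# Same element written twice: different attribute order, different quote style.
first = ET.fromstring('<table border="5" summary="Sales by region"></table>')
second = ET.fromstring("<table summary='Sales by region' border='5'></table>")

print(first.tag)                      # table
print(first.attrib["border"])         # 5
print(first.attrib == second.attrib)  # True
```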

Attribute Categories

Here is one way to categorize (X)HTML attributes, with some examples of each:

Validation

If you wish to validate your XHTML documents as strict XHTML, then each document should begin like this:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<!-- filename.html -->

<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
  <title>Your Title</title>
</head>
<body>
...

When we say "begin like this", we mean it, in the sense that if you do not have your DOCTYPE as the first thing in your file, the browser may switch into something called quirks mode, which you do not want to happen if you have written (or hope you have written) a standards-compliant web site.
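
A quick automated check of this requirement might look like the following Python sketch. The function name doctype_comes_first is made up, and the sketch assumes that leading whitespace before the DOCTYPE is harmless; it only looks at how the text starts and does not inspect the DOCTYPE itself.

```python
def doctype_comes_first(document: str) -> bool:
    """Return True if the first non-whitespace content is a DOCTYPE declaration."""
    return document.lstrip().startswith("<!DOCTYPE")


doctype = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"\n'
    '    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\n'
)
comment = "<!-- filename.html -->\n"

print(doctype_comes_first(doctype + comment))  # True
print(doctype_comes_first(comment + doctype))  # False: the comment comes first
```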

SSI (Server-Side Includes) and the base Element

A server-side include uses the comment-like construct shown below in a file of markup. When the server processes that file, it replaces the construct with the content of file.html from the subdirectory called common.

<!--#include virtual="common/file.html"-->

Be sure to get the syntax of the above line exactly right. The smallest error may cause SSI not to work. In particular, watch out for extraneous blank spaces.
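
To make the replacement step concrete, here is a rough Python imitation of what an SSI-enabled server does with this directive. It is only a sketch: the function name expand_includes is made up, a dictionary stands in for the files on disk, and a real server handles more directives and more syntax variations than this one regular expression.

```python
import re

# Matches the include directive exactly as written above; optional
# whitespace is allowed only just before the closing "-->".
INCLUDE = re.compile(r'<!--#include virtual="([^"]+)"\s*-->')


def expand_includes(markup: str, files: dict[str, str]) -> str:
    """Replace each include directive with the content of the named file."""
    return INCLUDE.sub(lambda match: files[match.group(1)], markup)


# Hypothetical stand-in for the files on the server.
files = {"common/file.html": "<p>Shared footer</p>"}

page = '<body>\n<!--#include virtual="common/file.html"-->\n</body>'
print(expand_includes(page, files))

# A stray space after "<!--" turns the directive into an ordinary comment,
# so nothing is replaced.
broken = '<!-- #include virtual="common/file.html"-->'
print(expand_includes(broken, files))
```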

This mechanism, along with an appropriate use of the base element, is extremely useful for accomplishing the following tasks:

Additional Notes

  1. Be aware that HTML 5 is quickly "coming down the pike", and you should be making yourself aware of the new facilities it provides. It will be some time before it is both fully and widely supported, but even now most browsers support some of its capabilities. Cross-browser inconsistencies, however, make it somewhat "dangerous" to make use of too much HTML 5 as yet, unless you wish to go to the additional trouble of incorporating markup that tests for feature availability.

Best Practices

  1. Make your DOCTYPE declaration the first item in your file, and make your file-identifying comment the second item in your file.
  2. Validate your files with the W3C Validator. If you are using the Firefox browser with the Web Developer toolbar installed (for example), this is a single-click operation.
  3. Prefer charset=utf-8, though other character encodings are possible.
  4. For simplicity, put all files containing common markup in a single subdirectory (perhaps called common).

Potential Gotchas

  1. SSI will not work unless your server is SSI-enabled. Check with your administrator to make sure it is enabled if you plan to use it.