HTML tutorial and page design guide
Hardcopy featured a 10-year-old boy one night back in the mid-1990s. His psychotic mother wouldn't take her meds and was beating him up. He wanted to live with his father but the judge wouldn't change his custody arrangement. So the 10-year-old kid built a Web site to encourage Internetters to contact the judge in support of a change in custody.
If you think that you need professional help to build a static HTML Web site, tell yourself "The abused 10-year-old got his site to work; I think I can, too."
You May Already Have Won $1 Million
Then again, maybe not. But at least you already know how to write legal HTML:
My Samoyed is really hairy.
That is a perfectly acceptable HTML document. Type it up in a text editor, save it as index.html, and put it on your Web server. A Web server can serve it. A user with Netscape Navigator can view it. A search engine can index it.
Suppose you want something more expressive. You want the word really to be in italic type:
My Samoyed is <I>really</I> hairy.
HTML stands for Hypertext Markup Language. The <I> is markup. It tells the browser to start rendering words in italics. The </I> closes the <I> element and stops the italics. If you want to be more tasteful, you can tell the browser to emphasize the word really:
My Samoyed is <EM>really</EM> hairy.
Most browsers use italics to emphasize, but some use boldface, and browsers for ancient ASCII terminals (e.g., Lynx) have to ignore this tag or come up with a clever rendering method. A picky user with the right browser program can even customize the rendering of particular tags.
There are a few dozen more tags in HTML. You can learn them by choosing View Source from a Web browser when visiting sites whose formatting you admire. You can also work through a comprehensive HTML guide, e.g., http://www.w3schools.com/html/html_reference.asp (Web) and HTML & XHTML: The Definitive Guide by Musciano and Kennedy (O'Reilly, 2002; print).
Armed with a big pile of tags, you can start strewing them among your words more or less at random. Though browsers are extremely forgiving of technically illegal markup, it is useful to know that an HTML document officially consists of two pieces: the head and the body. The head contains information about the document as a whole, such as the title. The body contains information to be displayed by the user's browser.
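A minimal sketch of that two-part structure, reusing the Samoyed sentence from earlier. The title text is a placeholder chosen for illustration.

```html
<HTML>
<HEAD>
<!-- The head describes the document as a whole. -->
<TITLE>My Samoyed</TITLE>
</HEAD>
<BODY>
<!-- The body holds what the browser displays. -->
My Samoyed is <EM>really</EM> hairy.
</BODY>
</HTML>
```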
Another structural issue is that you should try to make sure that you close every element that you open. So if your document has a <BODY> it should have a </BODY> at the end. If you start an HTML table with a <TABLE> and don't have a </TABLE>, a Web browser may display nothing. Elements can nest, but you should close the most recently opened before the rest, e.g., for something both boldface and italic:
My Samoyed is <B><I>really</I></B> hairy.
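For contrast, here is a sketch of the same sentence with the closing tags in the right and the wrong order. Most browsers will probably render both the same way, but only the first is properly nested.

```html
<!-- Properly nested: the inner <I> closes before the outer <B>. -->
My Samoyed is <B><I>really</I></B> hairy.

<!-- Overlapping and technically illegal: <B> closes while <I> is still open. -->
My Samoyed is <B><I>really</B></I> hairy.
```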
Something that confuses a lot of new users is that the <P> element used to surround a paragraph has an optional closing tag </P>. Browsers by convention assume that an open <P> element is implicitly closed by the next <P> element. This leads a lot of publishers (including lazy old me) to use <P> elements as paragraph separators.
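The two styles can be sketched side by side. The paragraph text is filler invented for illustration.

```html
<!-- Separator style: a bare <P> between paragraphs, never explicitly closed. -->
First paragraph of text.
<P>
Second paragraph of text.

<!-- Container style: each paragraph explicitly opened and closed. -->
<P>First paragraph of text.</P>
<P>Second paragraph of text.</P>
```

In the separator style the first block of text is not inside a <P> element at all, which is one reason the container style is the more defensible of the two.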
Here's the HTML template from which documents at philip.greenspun.com start out:
<body bgcolor=white text=black>
by <a href="/">Philip Greenspun</a>, revised April 1, 2003
yet more text
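The listing above shows only part of the template. A fuller sketch, pieced together from the walkthrough that follows, might look roughly like this. The title, headline, subhead, and filler text are placeholders assumed for illustration and should not be taken as the original wording.

```html
<html>
<head>
<!-- Placeholder title; shown in the window title bar and in bookmarks. -->
<title>Document Title</title>
</head>
<body bgcolor=white text=black>
<h2>Document Title</h2>
by <a href="/">Philip Greenspun</a>, revised April 1, 2003
<hr>
introductory text
<h3>First Subhead</h3>
more text
<p>
yet more text
<h3>Second Subhead</h3>
concluding text
<hr>
<!-- Signature: the address is wrapped in a mailto anchor. -->
<a href="mailto:email@example.com"><address>email@example.com</address></a>
</body>
</html>
```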
Let's go through this document piece by piece (see the figure for how it looks rendered by a browser).
The <HTML> element at the top says "I'm an HTML document". Note that this tag is closed at the end of the document. It turns out that this tag is unnecessary. We've saved the document in the file "basic.html". When a user requests this document, the Web server looks at the file's ".html" extension and adds a MIME header to tell the user's browser that this document is of type "text/html".
The <HEAD> element's primary purpose in this document is so that one can legally use the <TITLE> element to give this document a name. Whatever text is placed between <TITLE> and </TITLE> will appear at the top of the user's browser window, on the menu that pops up when the user clicks on the Back button, and in his bookmarks menu should he bookmark this page. After closing the head with a </HEAD>, the body of the document is opened with a <BODY> element, to which are added some optional parameters to set the background to white and the text to black. Some Web browsers default to a gray background, and the resulting lack of contrast between background and text is sufficiently offensive that it may be worth changing the default colors. This is a violation of some of the principles articulated in this book because it potentially introduces an inconsistency in the user's experience of the Web. However, one need not feel too guilty about it because (1) a lot of browsers use a white background by default, (2) enough other publishers set a white background that white pages won't seem inconsistent, and (3) it doesn't affect the core user interface the way that setting custom link colors would.
Just below the opening <BODY> tag, there is a headline, size 2, wrapped in an <H2> element. This will be displayed to the user at the top of the page. One could alternatively use <H1>, but browsers typically render that in a ridiculously huge font. Underneath the headline, it makes sense to indicate authorship, link to a parent work, and specify the revision date. The authorship link shows that someone is taking responsibility for the content. The link to the parent work, e.g., a book table of contents if the file is one chapter, helps users who've landed on this page from a public search engine. The revision date is important because Web pages often linger forgotten by the author but still available to the public long after they are obsolete. Notice in this example that the authorship phrase "Philip Greenspun" is a hypertext anchor, which is why it is wrapped in an <A> element. The <A HREF= says "this is a hyperlink." If the reader clicks anywhere from here up to the </A>, the browser should send him to the root page on the server ("/").
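As a sketch of a headline block that carries all three items (authorship, parent work, revision date) for a page that is one chapter of a book: the chapter title, the book title, and the index.html target are hypothetical names made up for this example.

```html
<H2>Chapter Title</H2>
<!-- Link to the parent work, for readers arriving from a search engine. -->
a chapter of <A HREF="index.html">Book Title</A>
<!-- Authorship link and revision date. -->
by <A HREF="/">Philip Greenspun</A>, revised April 1, 2003
```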
After the headline, author, and optional navigation, the template adds a horizontal rule tag: <HR>. Don't overuse these big lines across the window: Real graphic designers use whitespace for separation. This template uses <H3> headlines in the text to separate sections and <HR>s at the very top to separate the document contents from the headline information and at the very bottom to separate the document contents from the author's email link.
Underneath the last <HR>, the document is signed with "email@example.com". The <ADDRESS> element usually results in an italics rendering. Readers expect that they can scroll to the bottom of a browser window and find out who is responsible for what they've just read. Note that this one is wrapped in an anchor tag. If the user clicks on the anchor text (my email address), the browser will pop up a "send mail to email@example.com" window. It is generally a good idea to wrap every email address on a Web page in a "mailto" link. Sadly, in the Age of Spam, it may not be a good idea to put any email address on a Web page. An alternative to the author's personal email address would be a form that a reader could use to send a message to the author or editor.
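Both options might be sketched as follows. The /contact URL and the field names are hypothetical; a form like this does nothing until someone writes a server-side program to receive the submission and forward it.

```html
<!-- Option 1: signature wrapped in a mailto anchor. -->
<A HREF="mailto:email@example.com"><ADDRESS>email@example.com</ADDRESS></A>

<!-- Option 2: a contact form, so no address appears on the page. -->
<FORM METHOD="POST" ACTION="/contact">
Your email address: <INPUT TYPE="text" NAME="from_address" SIZE="40">
<P>
Message:
<BR>
<TEXTAREA NAME="message" ROWS="8" COLS="60"></TEXTAREA>
<P>
<INPUT TYPE="submit" VALUE="Send">
</FORM>
```

The form costs more to set up, but it keeps the address out of reach of programs that harvest Web pages for email addresses.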