Basic HTML

Introduction

This lesson covers the absolute basics of HTML. After completing it, you should be able to create a basic, linear web page, with a heading and some text.

What is HTML?

HTML (Hypertext Markup Language) is a relatively simple document markup language. It is not a programming language. An HTML file, usually representing a single page on the Web, simply contains the text content of that page, and a few special markers that indicate different types of text such as headings.

HTML files are plain text

HTML files are written in plain text, which means they contain only letters, numbers, and some symbols (e.g. = ! + % ^). There is no inherent formatting in the plain-text format. A Web browser reads this plain text and detects the special symbols and words that affect the display, so that the final result can contain formatting.

You can view the original plain text with a simple text editor like Windows Notepad. This is how you'll see the HTML "code" as you create it. When you're learning to create Web sites it is best to work directly with the code, to make sure you fully understand how everything works. Once you become experienced, you may decide that a good visual editor (i.e. not FrontPage), which will allow you to work directly with the formatted output, could save time - but you'll still need the understanding you achieved from working with the raw HTML.

Example

Take a look at this very simple example Web page. While you're there, view its source using the instructions on the page. This will show you how the page was created.

Keep the source available so that you can refer to it as you read the rest of this section. Gradually, each part of it will become clear.
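If you don't have the example page to hand, the following is a rough sketch of what the source of such a page might look like. The title and paragraph text here are invented for illustration and are not the actual content of the course example.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN"
  "http://www.w3.org/TR/REC-html40/strict.dtd">
<html>
<head>
<title>A very simple page</title>
</head>
<body>

<h1>A very simple page</h1>

<p>This is the first paragraph of text.</p>

<p>This is the second paragraph. It is
written across two lines of the source.</p>

</body>
</html>
```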

Tags

"Tags", special words within triangle brackets (< >), are the basis of HTML markup.

Tagspotting

A tag is a word or sequence of characters within triangle brackets. If you look at the source of the example given previously, you should see numerous tags.

For example, there is an <html> tag; a <title> tag; an <h1> tag; and so on.

Pairs of tags

You probably noticed that tags tend to occur in pairs. First there is a "title" tag, then after some text, there is a "/title" tag to match. The same pattern occurs with all the other tags in that example except the !doctype tag.

The tag name on its own (<title>) is an opening tag. With the / symbol (</title>), it is a closing tag. Whatever occurs in between those two tags (text or other tags) is said to be "contained by" the tags. Tags have some effect on whatever they contain.
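As a small illustration of opening tags, closing tags, and containment, here is a fragment (not a complete page) with invented text. The comments, written between <!-- and -->, are ignored by the browser.

```html
<!-- HEAD contains another pair of tags: TITLE -->
<head>
<!-- opening tag, contained text, closing tag -->
<title>My first page</title>
</head>
```

Here the text "My first page" is contained by the TITLE tags, and the TITLE tags are in turn contained by the HEAD tags.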

Simple tags

The example given contained three simple tags.

<h1>

The H1 tag means "top-level heading". It is used to mark the most important title on a page. As you saw, the text marked within the H1 was shown by the browser as a large heading. (H1 is not always displayed like that; H1 simply indicates that the text is an important heading, and the large serif font that appeared is the default font for important headings. We'll see how to change the display of page elements, including headings, in a later class.)

<p>

The P tag means "paragraph". Paragraphs are very important in HTML. Each paragraph of text should be contained within P tags.

If you look at the source of the example, you may notice an interesting fact: the additional spaces and line breaks included in the source were completely ignored. When you put more than one space or line break between words, HTML ignores this and treats it as a single space. So to divide text into paragraphs, you need to use the P tag; hitting return a few times will have no effect.
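The fragment below is a small sketch of this behaviour, using invented text. The extra spaces and blank lines inside the first paragraph make no difference to what the browser shows; only the P tags create the break between the two paragraphs.

```html
<p>These     words   are
spread over


several lines.</p>

<p>This is a separate paragraph.</p>
```

A browser displays the first paragraph as the single run of text "These words are spread over several lines.", followed by the second paragraph on its own.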

<title>

The TITLE tag is used to assign a title to the page. This title will be used wherever a Web browser (or sometimes another Web site, for example a search engine) needs to display a title for a page.

For example, titles appear in the browser's title bar. If you bookmarked the example page, the bookmark would also be labelled with that information from the title tag.

Boilerplate HTML

Some HTML content is standard throughout most pages and rarely needs to be changed. You could copy this from a previously-written page (or from that example source) when you start a new page.

Web browsers will still be able to display your pages even if you miss out some of these tags, but it's good practice to include all of them.

<!doctype ...

DOCTYPE is a special tag (strictly speaking it is a declaration rather than a tag, which is why it has no closing tag); note that it begins with an exclamation mark. It is only used at the start of a document. Technically, it indicates (to browsers and to other tools) which version of HTML is in use.

The DOCTYPE tag in the example refers to the "strict" version of HTML 4.0, which is defined on the World Wide Web Consortium (W3C)'s web site. However, you don't need to understand this tag; it can simply be copied into the start of each web page.
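For reference, the declaration for the strict version of HTML 4.0 is usually written as shown below. This is given on the assumption that the example page uses the standard form; if the example's source differs, copy the line from there instead.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN"
  "http://www.w3.org/TR/REC-html40/strict.dtd">
```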

<html>

The HTML tag indicates where the HTML document begins and ends. (!DOCTYPE is not technically part of the HTML document, which is why it's outside the HTML tag.)

<head>

HEAD contains the header of the HTML document. The header is used for information that isn't actually shown as part of the visible page but may be used in some other way.

The document TITLE is part of the header, because it isn't shown as part of the main page (even though browsers may display it in the title bar). Other information is sometimes included, such as a description of the page's contents, or a list of keywords that might help to index the page.
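As a sketch, a header carrying a title, a description, and keywords might look like this. The META tag used for the description and keywords is not covered in this lesson, and the name and wording are made up for illustration.

```html
<head>
<title>About Jane Example</title>
<!-- not shown on the page itself; may be used by search engines -->
<meta name="description" content="A short page about Jane Example.">
<meta name="keywords" content="Jane Example, biography">
</head>
```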

<body>

The BODY tag contains the displayed portion of the page (here, the heading and a few paragraphs of text). Everything that is displayed in the main page will be included within this tag.
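Putting the four boilerplate pieces together gives a skeleton along the following lines, which could be used as a starting point for a new page. It assumes the strict HTML 4.0 DOCTYPE described above, and the placeholder title is only an example.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN"
  "http://www.w3.org/TR/REC-html40/strict.dtd">
<html>
<head>
<title>Page title goes here</title>
</head>
<body>

<!-- headings and paragraphs go here -->
<p>Page content goes here.</p>

</body>
</html>
```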

Summary

HTML is a simple language based on text and tags.

Tags are short words or codes, inside triangle brackets. They usually occur in pairs; a closing tag has the same word as the opening tag, but begins with a / symbol (within the triangle brackets). Whatever text or tags occur between the opening and closing tags are contained by the tags, and are affected by the meaning of the tags.

Some useful tags are H1 (marks a top-level heading), P (marks a paragraph of text), and TITLE (gives a title to the Web page).

Some tags (!DOCTYPE, HTML, HEAD, and BODY) should be included in every web page. These specify the type of the document and divide it into two sections (the header and the body), but it isn't important to completely understand these tags.

Try it for yourself

Individual exercise

Based on the example site, try to create a page about yourself using a text editor. Begin with your name as the heading. Write a short paragraph underneath that, giving your date and place of birth and where you currently live.

Save the file and make sure its name ends with ".html". (If you use Windows, you'll have to make Windows show file extensions: from a folder window, choose "View", "Options", then the "View" tab, and make sure the checkbox "Hide extensions for known file types" is not checked. After this, you may need to rename your file so that it ends with ".html".)

Then drag it into your Web browser to view it. Drag it back into your text editor to correct any mistakes.

Once you've achieved that, try to extend the page slightly. Create two subheadings (for example, "appearance" and "occupation"), and write a paragraph of text under each one. You can use the H2 tag for subheadings; it works exactly like the H1 tag.
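The body of the extended page might be laid out roughly as follows. This is only a sketch of the structure with made-up details; the DOCTYPE, HTML, HEAD, and BODY tags still need to surround it, and the text should of course be your own.

```html
<h1>Jane Example</h1>

<p>I was born on 1 January 1980 in Exampleton, and I now live
in Sampleville.</p>

<h2>Appearance</h2>

<p>A paragraph describing appearance goes here.</p>

<h2>Occupation</h2>

<p>A paragraph describing occupation goes here.</p>
```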

References

If you wish to get more advanced information about HTML, you might want to use some of the following reference sites. These are references and aren't too helpful at explaining things to new users, though. They may be more useful to you later on in the course.

I don't really recommend you use these sites to learn more HTML because it is very easy to pick up "bad habits"; a lot of the HTML tags that exist really shouldn't be used. As the course progresses I'll cover more of the language and will also show the "correct" ways to do some of the things you might want to do using the more dubious HTML tags (e.g. FONT; avoid this tag at all costs).

HTML 4 defines around 90 tags (more if you count browser-specific ones), and you really don't need to use more than a handful of them in producing a Web site. The course Web site only uses 18 different tags: !DOCTYPE, HTML, HEAD, TITLE, BODY, H1 .. H4, P, IMG, UL, LI, A, LINK, STYLE, DIV, and SPAN. All these tags (and probably a few more) will be covered during the course, as well as the CSS language that you use to add graphical style to the HTML structure.

HTMLHelp.com

http://www.htmlhelp.com/

A useful and cleanly designed reference site. Contains a full reference to HTML 4.0 (the current standard which we're using) and CSS. If you click the "offline versions" link, you can download the references either in HTML format, or if you use Windows, in convenient Windows Help format.

Index Dot HTML

http://www.blooberry.com/indexdot/html/index.html

Good reference covering all HTML tags (and CSS too) along with information about browser support. Slightly ugly, but contains a lot of useful information.