LESSON 1 - 1 -- THE MINIMAL PAGE
OK, first, understand that HTML stands for HyperText Markup Language. The trick is that we are going to use only text characters to tell the browser how to display text characters. =) This means we need to make a small number of characters "special". Since the original people working on HTML were familiar with SGML -- Standard Generalized Markup Language -- they decided to use the SGML special characters. Then they went ahead and used a simplified version of SGML, since that made HTML easy to explain to other researchers who were already stuck using SGML. Also, they could reuse programs they had already written.
Don't worry about all that except to be glad they didn't use full-blown SGML and to understand that their laziness is a virtue. Their being constructively lazy has made the web what it is: simple to use, simple to explain and simple to write programs for. Thus, it all exploded like mad.
NOW WE MOVE ON TO THE JUICY STUFF! Here is what HTML tags look like:
"<html>" starts an HTML document. "</html>" ends the whole page. Normally the only thing in an HTML document that falls outside these tags is a special SGML tag that tells all the SGML and HTML readers which version of HTML you are using. That tag looks like "<!DOCTYPE ...>" where the "..." is a whole bunch of special text. This tag has no closing tag.
Inside the "<html>" and "</html>" there are two sections marked out with opening and closing tags just like these. The "<head>" and "</head>" tags contain the head section where we tell about the document that follows. There shouldn't be any real document text in this section; it's just for data about the document, like the name of the document and the name of the author and stuff of that sort. The "<body>" and "</body>" tags contain all the real document text. Everything that the document will display is here.
Now I'll show you three markup tags, one for the "HEAD" section and two for the "BODY" section. Follow this little bit and read the section below it where I explain how the special characters are handled and you'll understand everything I used to make this page appear the way it did! Hit viewsource after this and you shouldn't see anything you don't get.
The first tag we'll teach you is the "<title>" tag. It has a closing tag, "</title>". It goes in the "HEAD" section and the text inside the two tags becomes the text you see in the title bar of the window that the page is loaded in. If you look at the viewsource for this page you'll see a line like this: "<title>Lesson 1 - 1 -- The minimal page</title>"
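Here is a rough sketch of how the pieces described so far nest inside each other. The "<!DOCTYPE ...>" line is left out on purpose, since this lesson doesn't cover how to write one, and the body text is a placeholder made up for this illustration. The bits between "<!--" and "-->" are comments that the browser skips; they are only there as notes.

```html
<html>
<!-- The head holds data about the document, such as its title. -->
<head>
<title>Lesson 1 - 1 -- The minimal page</title>
</head>
<!-- The body holds everything the document displays. -->
<body>
Everything the reader sees goes here.
</body>
</html>
```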
The next tag we teach is the simplest and most used tag there is. It is a "BODY" tag and it looks like "<p>". Yup, just a "P". This is the paragraph tag and it is placed at the beginning of every paragraph. There is a closing "</p>" tag but you can leave it off. Whenever you start a new paragraph you are obviously done with the last one! This is so simple that it got built in from day one. Less work for you is good, right?
Now you might be saying, "Hey, a paragraph tag? Well then what good is this fancy return/enter key on my keyboard?" Well, you could wear it as an earring if you didn't need it for other programs. One of the tricks of SGML and HTML is that all whitespace is the same. A return, a space or a tab, they all get treated the same. The text is word-wrapped when it is longer than the browser is wide. You can type any number of spaces, tabs and returns between words and the browser will treat all of them as ONE SPACE. If the word after a space hits the right edge of the browser it treats the space as a line return and puts the rest of the text on the next line. More on this later but for now accept that the only way to break up chunks of text is with markup. Returns are just whitespace.
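To make the whitespace rule concrete, here is a small sketch of some "BODY" text. The sentences are made up for illustration. Despite the odd spacing and the returns in the source, a browser should show this as exactly two paragraphs, each with single spaces between the words:

```html
<p>This    sentence   has
lots of      extra
whitespace in it.
<p>This one starts a new paragraph
because of the second tag, not because of the return.
```

The first "P" container ends on its own when the second "<p>" shows up, so no "</p>" is needed.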
One more tag and we are done. This tag is the big doozy, the reason the web is hot at all. "<a>". Yup, a simple "A". It has a closing tag, "</a>", but that isn't enough for the HYPERTEXT LINK TAG! Nope, we have to tell the browser where the link is going. To do that we have to add an option to the "A" tag. The option is the "HREF=" option and the value that follows the "=" is the address of the web page that the link will take us to. Confused? Let me put it together for you. Making the word "Hostile" link to my homepage at "http://www.hostile.org/" is this simple: '<a href="http://www.hostile.org/">Hostile</a>'
That is all there is. Really. Everything else is just fancy pants stuff and prettifying. Options and details. Wimpy stuff. This is most of the real web though. Paragraphs of text and links to more pages. Read the rest of this page for more detail that will help you understand how I made the special characters appear so you could see the details of the HTML tags. There is a summary at the bottom of the page, like all the lessons. There is an example for you that will save you digging through this big file for details. I cheat and use an extra tag on the example so that's a bonus right? Viewsource it and see!
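Putting the three tags together, a whole page might look something like the sketch below. The title and the paragraph text are invented for this illustration; only the link to "http://www.hostile.org/" comes from the lesson, and the "<!DOCTYPE ...>" line is left out because it hasn't been covered yet.

```html
<html>
<head>
<title>My first page</title>
</head>
<body>
<p>This is my very first web page.
<p>If you get bored, go visit
<a href="http://www.hostile.org/">Hostile</a>
and learn some more.
</body>
</html>
```

Save it in a plain text file with a name ending in ".html" and open that file in your browser. You should get two paragraphs, with the word "Hostile" in the second one showing up as a link.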
MORE DETAIL!
I told you the HTML guys grabbed everything they could from the SGML guys, right? The special characters the SGML people picked were the "<" and the ">", known collectively as the "angle brackets" and individually as the "less than symbol" or "left angle" and the "greater than symbol" or "right angle". They use these characters to enclose their markup so you can tell it apart from the real text to be displayed.
Obviously, we now have a problem. How can we ever use angle brackets for anything else? Sure we don't use them much but when we need them, we need them bad. Now we either need to do something special inside angle brackets that means "print a greater than symbol" or "print a less than symbol" or find another way to mark them up. At this point the guys who had said "What is inside angle brackets is markup, not text!" stuck to their guns. They rightly knew that it would only confuse if sometimes the markup caused real text to magically appear.
Thus we pick another character, in this case the "&" known as the "and symbol" or "ampersand". We define it as a special character that describes other characters in the text. Thus, "&amp;" becomes an ampersand when the text is displayed. Now we can use "&lt;" and "&gt;" to be our special text that becomes the left and right angle brackets. Notice that "amp" is short for "ampersand", "gt" is short for "greater than" and "lt" is short for "less than". We also made the ";" special at the end of an ampersand string or "entity". It marks the end of the name of the special character. That way the computers don't get confused if we type "&lt;doggy". Without the semicolon, as in "&ltdoggy", the computer would never know if we wanted a less than in front of "doggy" or some weird "ltdoggy" symbol that it had never been programmed with. Their laziness in programming future-proofed HTML: if we ever want to add a "&ltd;" symbol that means "limited" we won't screw up everybody who just wanted a less than symbol in front of their dog. =)
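As a quick sketch of entities in action, here is how you might write text that shows a tag to the reader instead of handing it to the browser. The sentences are made up for illustration:

```html
<p>Start every paragraph with &lt;p&gt; &amp; keep typing.
<p>To show an entity itself, encode its ampersand: &amp;lt; displays as &lt;.
```

The browser should display the first paragraph as "Start every paragraph with <p> & keep typing." In the second paragraph, the "&amp;" turns into a plain ampersand, so the reader sees the letters of the entity rather than a less than symbol.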
Now that we have our special characters we decide on some tags we use to mark sections of our document and how we want them handled. All of our tags are going to be of the form "<sometext>" which is pretty simple, right? Let's make another quick rule: the tags aren't case sensitive. Upper or lowercase is fine, the browser treats it all the same.
Now some tags mark a section of text -- they contain text to mark up -- and so they need a closing tag. So we make a simple rule: the closing tag for any tag is the same word inside angle brackets, only with a "/" in front of the word. Like this: "</sometext>", which is also really simple.
Now some tags need to be able to take options in order to tell the browser what to do with a marked-up section. In this case we make another rule: the tag name ends with the right angle bracket or with whitespace, like a blank space from the "space bar" or a return from the "enter key" or "return key" or even a tab from the "tab key". Basically anything we would think of as the end of a "word". Everything after the whitespace should look like this: option="value". Whatever follows the "=" needs to be in quotes. So a whole fancy tag might look like this: '<sometag option1="a value"> some text here </sometag>'.
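Here is a short sketch of those rules at work, using the link from earlier in this lesson. All three paragraphs below should give the same link in a browser: the second one types the tag and option names in uppercase, and the third one ends the tag name with a return instead of a space.

```html
<p><a href="http://www.hostile.org/">Hostile</a>
<p><A HREF="http://www.hostile.org/">Hostile</A>
<p><a
href="http://www.hostile.org/">Hostile</a>
```

Only the tag and option names ignore case. The value inside the quotes is handed along as you typed it.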
REVIEW SUMMARY
* HTML is HyperText Markup Language
* SGML is HTML's evil ancestor.
* The first line of an HTML document should be a "<!DOCTYPE ...>" tag.
* I haven't told you how to make a "<!DOCTYPE ...>" tag yet, so there.
* The whole document after the "!DOCTYPE" tag is contained by a "<html>" and a "</html>" tag pair.
* All non-text data goes into the "<head>" and "</head>" container.
* All text data goes into the "<body>" and "</body>" container.
* Unless you are nuts, you have spotted that the "HEAD" goes in before the "BODY".
* I usually show the tags in lowercase when they are in brackets but I type them in all uppers when I formally name them.
* Container tags have a closing tag that is the same as the opening tag but with a "/" slash preceding the text.
* The "TITLE" container goes in the "HEAD" and sets the text title in the browser window bar.
* The "P" -- paragraph -- container goes in the "BODY".
* Everybody gets to be lazy and leave the closing tag off of the <p> tag!
* The "A" -- anchor -- container goes in the "BODY".
* The "A" tag needs an option called "HREF=" to tell the browser where to go when you click on the link text in the container.
* You have no idea why "A" stands for anchor yet.
* Options in tags are called "attributes".
* The "value" of the attribute is the bit after the "=".
* The "value" of an attribute should ALWAYS be in double quotes!
* Whitespace is one or more spaces, returns, and tabs.
* All whitespace winds up being treated as a single space.
* The browser word wraps long text by breaking at whitespace.
* "<", ">", and "&" are special.
* When we want a special character to be seen by the user and not the browser we encode it in the text with a "&".
* Ampersand "entities" are constructed like this:
*** a "<" is encoded "&lt;".
*** a ">" is encoded "&gt;".
*** a "&" is encoded "&amp;".
* The semicolon at the end of the entity is important. Don't leave it off!
* There are 225 '"' -- doublequotes -- in this file, inclusive. =)
Now, head back to the Main tutorial page, go back to Lesson 0, or move on to Lesson 2.