PPI - Lesson 5

Internet Publishing

"Materials used in this course are the property of the author. These lessons may be used only by course participants for self-study purposes. Application for permission to use these materials for other educational purposes such as for teaching or as a basis for teaching should be directly submitted to the author."

5. More on HTML and Tools

The material in this lesson is covered in Chapters 10, 11, 12 and 13 of the text. You may use Chapter 12 as a reference for the tools you will be using. It is not necessary to learn all of these tools, but you may find it helpful to skim the chapters so that you know what the tools can accomplish. Things develop quickly here -- new tools appear daily!

This lesson deals with the following topics:

Organization of Information for Publication
Layout
More about HTML Coding
Tools
Placing Information on Servers
Exercise

This lesson was first published on 8/04/97.

5.1 Organization of Information for Publication

A common problem with hypertext is that the user can easily get lost in hyperspace. The reasons are many. The most important of these is that the hypertext mechanisms, in principle, are "one-way streets". It is very simple to steadily choose new links without knowing ahead of time if they lead to the information one is looking for.

Most clients, therefore, remember the links you have visited and give you the opportunity to go back one step at a time by using the "Back" button. However, this is not a very efficient way to navigate, and one quickly feels lost.

Possible solutions:

Implementation of guides: These are help mechanisms which we can use to choose the "right" way, that is, the correct URL. In practice, a guide is a search engine where the user enters one or more keywords. The search engine has already explored hyperspace and saved the links it found in a large, searchable database, so the URLs of the matching texts can be sent back to the user. It is common for clients to have their own buttons or menu choices for starting search engines.
- Netscape has the "NetSearch" button, which leads to http://home.netscape.com/home/internet-search.html. Try it!
- Internet Explorer has the "Search" button, which leads to http://home.microsoft.com/access/allinone.asp. Try it!
Otherwise, we are familiar with the general search engines (Alta Vista, HotBot, etc.) which we have used earlier. It may also be of interest to make your own pages searchable with the help of your very own search engine.
Another way to help the user: Make a logical and easily recognizable structure for the information to be published. This can be accomplished, for example, by using structures most users already know. Structures familiar from books are tables of contents and indexes. From calendars we recognize chronological ordering, that is, information sorted by time.
A third way: The user builds up personal navigation aids, for example bookmarks. This assumes that the user can follow a set of links and mark the places that seem interesting.
Use of "frames": This technique has not yet been standardized, and it is not part of HTML 3.2. The extension is nevertheless well supported by software developers. The technique divides the client window into a few frames. It is then possible to click on a link in one frame and have the result appear in another frame. In this way, the user does not lose the starting point while following links.
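The frames technique in the last item can be sketched as follows. This is only a minimal illustration using the FRAMESET, FRAME and NOFRAMES elements and the TARGET attribute from the frames extension; the file names (frameset.htm, menu.htm, start.htm, lesson5.htm) and the frame names are hypothetical and not part of the course material.

```html
<!-- frameset.htm (hypothetical): a narrow menu frame and a wide content frame -->
<HTML>
<HEAD><TITLE>Frames sketch</TITLE></HEAD>
<FRAMESET COLS="25%,75%">
  <FRAME SRC="menu.htm" NAME="menu">
  <FRAME SRC="start.htm" NAME="content">
  <NOFRAMES>
  <BODY>
    <!-- Shown only by clients that do not support frames -->
    <A HREF="menu.htm">Menu</A>
  </BODY>
  </NOFRAMES>
</FRAMESET>
</HTML>

<!-- In menu.htm (hypothetical): TARGET names the frame that receives the result,
     so the menu itself stays visible -->
<A HREF="lesson5.htm" TARGET="content">Lesson 5</A>
```

Because the link names the "content" frame as its target, the menu frame keeps the user's starting point on screen.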

(Lessons) (Back to Top) (Jump to 5.2 Layout)

5.1.1 Guides

Lesson 9 deals with indexing and search engines, so these will not be taken up here.

5.1.2 Information Structures

Several "guides" and guidelines have been published on how information should be presented. One of these is the Style Guide for Online Hypertext, created by W3C (W3 Consortium), which handles the official specifications dealing with WWW technology.

5.2 Layout

Read what the Style Guide for Online Hypertext says about the layout of HTML documents. In this document, I have tried to follow most of the guide's principles. Pay special attention to the lines of links inserted before each main section. They are provided to guide the reader through the document in a structured and easily recognized manner.
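A line of navigation links of this kind could be coded roughly as below. This is a sketch, not the actual code of this lesson: the anchor names top and sec5_2 and the file name lessons.htm are assumptions made for illustration.

```html
<!-- Named anchor at the start of the document -->
<A NAME="top"></A>

<!-- Navigation line placed before a main section -->
<P>
(<A HREF="lessons.htm">Lessons</A>)
(<A HREF="#top">Back to Top</A>)
(<A HREF="#sec5_2">Jump to 5.2 Layout</A>)
</P>

<!-- The section heading carries the anchor the "Jump to" link points at -->
<H2><A NAME="sec5_2">5.2 Layout</A></H2>
```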

(Lessons) (Back to Top) (Back to 5.2 Layout) (Jump to 5.4 Tools)

5.3 More about HTML Coding

Here we will look at:

use of tables
use of clickable images

Forms will be discussed in lessons of their own.

5.3.1 Tables

Tables are new in HTML 3.0 and 3.2, and we shall take a quick look at these:

Tables are defined with the tags:

<TABLE>
</TABLE>

If you want the cells in the table to have borders, set the BORDER attribute.

Example:

<TABLE BORDER=1>
<TR><TD>Name</TD><TD>Address</TD></TR>
<TR><TD>Per</TD><TD>5 Main Street</TD></TR>
</TABLE>

Result:

Name	Address
Per	5 Main Street

The table is divided into rows (lines) and columns. Each row is defined by the tags <TR> ... </TR>.
Within each row, one or several cells are defined, each within the tags <TD> ... </TD>. The cells in the same position in each row make up a column.
Each cell may contain formatted text, pictures and the like.

Possible attributes in the TABLE element:

BORDER=x: "x" is the number of pixels making up the width of the border around the table; with a non-zero value, the cells are also drawn with borders.
CELLSPACING=x: Cellspacing states how many pixels separate the cells in a table.
CELLPADDING=x: Cellpadding determines the number (x) of pixels between the border and the contents of the cells.
WIDTH=x: Here, x is the width of the table. This can be stated as an absolute number of pixels, or as a percentage of the current window's width, e.g. WIDTH="70%". If the WIDTH attribute is omitted, the table width adjusts to the contents of the cells.
ALIGN=left | right | center: This attribute aligns the table to the left, right or center of the document.

Examples of Tables

BORDER=1, CELLSPACING omitted, CELLPADDING omitted
BORDER=1, CELLSPACING=10, CELLPADDING omitted
BORDER=1, CELLSPACING=0, CELLPADDING omitted
BORDER=10, CELLSPACING omitted, CELLPADDING omitted
BORDER=10, CELLSPACING=10, CELLPADDING omitted
BORDER=1, CELLSPACING omitted, CELLPADDING=10
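To see the difference between the attributes for yourself, two of the combinations above can be written out as small test tables. This is only a sketch; the cell contents are placeholders, and the WIDTH and ALIGN values in the second table are added purely for illustration.

```html
<!-- Thin border, wide gaps between the cells, default padding -->
<TABLE BORDER=1 CELLSPACING=10>
<TR><TD>....</TD><TD>CELLSPACING=10</TD></TR>
</TABLE>

<!-- Thin border, default spacing, roomy cells; table centered at 70% of the window -->
<TABLE BORDER=1 CELLPADDING=10 WIDTH="70%" ALIGN=center>
<TR><TD>....</TD><TD>CELLPADDING=10</TD></TR>
</TABLE>
```

Loading such a file in your client and changing one value at a time is probably the quickest way to learn what each attribute does.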

If you want a cell to span x columns within its row, use the attribute COLSPAN=x in the TD tag. If you want a cell to span x rows within its column, use the attribute ROWSPAN=x in the TD tag.

Between the TABLE tags we can also have the CAPTION tag which defines the name of the table, e.g. <CAPTION>Price List </CAPTION>.
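A small sketch combining COLSPAN, ROWSPAN and CAPTION is shown here. The cell contents ("Editors", "Windows", "Product A", "Product B") are invented placeholders.

```html
<TABLE BORDER=1>
<CAPTION>Price List</CAPTION>
<!-- This cell spans both columns of the table -->
<TR><TD COLSPAN=2>Editors</TD></TR>
<!-- The first cell spans this row and the next one -->
<TR><TD ROWSPAN=2>Windows</TD><TD>Product A</TD></TR>
<!-- Only one cell here: the first column is already occupied by the ROWSPAN cell -->
<TR><TD>Product B</TD></TR>
</TABLE>
```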

A good description of tables is provided by Ian Graham.

5.3.2 Clickable Images - Imagemaps

You will learn to make clickable images - or imagemaps. Take a look at Chapter 13 in your text or try other resources. These are pictures with "clickable" zones which are linked to new URLs. There are several ways to accomplish this.

Use of the IMG Tag and the ISMAP attribute: This is the standard method in HTML 2.0. The principle is that the "map" which describes the clickable areas is placed on the server. Special rights are needed to set up these kinds of maps. When the user clicks in a clickable image, the co-ordinates of the click are transferred to the server. The server then looks up the map and determines what should happen next.
Use of FIG Tags: This was an HTML 3.0 proposal in which the description of the clickable areas (that is, the map) lies in the HTML code and is therefore handled by the client. This is significant for those doing the publishing, because the map need not be saved as a separate file on the server. FIG tags are supported by Netscape. The proposal also includes mechanisms for so-called overlays in an image, which allow parts of an image to be changed without transferring the whole image again. The FIG tags are not proposed for standardization in HTML 3.2. Therefore, they remain a special tag for Netscape.
Use of the IMG Tag and the MAP Element: This is an extension of the IMG tag with a new attribute, USEMAP, and a new HTML element called MAP, which describes the clickable areas. This makes it possible for the map to be handled on the client side, just as with the FIG element, instead of on the server. See the example below and a good description of the MAP element.
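The first method, with the map on the server, might look roughly like this in the HTML code. The path /cgi-bin/imagemap/buttons.map is a hypothetical example; the real URL depends on how your server is set up to handle map files.

```html
<!-- Server-side imagemap: the image is wrapped in a link, and ISMAP tells the
     client to append the click co-ordinates to the link's URL as ?x,y -->
<A HREF="/cgi-bin/imagemap/buttons.map">
<IMG SRC="buttonrow.gif" ISMAP>
</A>
```

A click at position 25,10 in the image would then request the link's URL with ?25,10 appended, and the server uses the map to decide which document to return.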

Example Using the MAP element:
Displayed below is an image of the command buttons from Netscape.

For the time being, the image is not clickable. Now, I will create explanations for the first three buttons so that one can click on a button to display a help message. In order to do this, I must make a map of the three buttons:

<MAP NAME="buttons">
        <AREA SHAPE="rect" COORDS="0,0,50,40" HREF="back.htm">
        <AREA SHAPE="rect" COORDS="50,0,100,40" HREF="forward.htm">
        <AREA SHAPE="rect" COORDS="100,0,150,40" HREF="home.htm">
        <AREA SHAPE="default" HREF="all.htm">
</MAP>

As the code reveals, the map is given the name buttons, and we have defined links to new URLs for each area. In order to be able to use this map, we use the attribute USEMAP in the IMG tag:

        <IMG SRC="buttonrow.gif" USEMAP="#buttons">

The result of this code follows. Try clicking on the image:

The example above, where the MAP element is handled by the client, is the simplest approach. In Lesson 8, you will also create other variations where the server deals with the map. Read about server-side imagemaps in your own documentation, or from Per's practice server.

	Own Documentation	Identical documentation on Per's Practice Servers
For Win 3.x/httpd 1.4:	About Imagemaps	About Imagemaps
For Win95 / WebSite	About Imagemaps	About Imagemaps

Commonalities for server-side and client-side imagemaps:

One must have an image in the form of a .JPEG or .GIF file.
One must create a map.

There are many programs which make so-called .MAP files for use on servers. The same programs can be used to make the foundation for the contents of the MAP elements in HTML files. The format of the files is fairly simple, so they can be written by hand, but it is good to have a program to find the co-ordinates.

Install a program to create a map, for example, MapEdit. We shall use this in Lesson 8.

You can also use Paint Shop Pro, which is free. See the exercise. For those of you using WebSite, it comes with a program, ImageMap, which can be used for this purpose.

(Lessons) (To Top) (Back to 5.3 More on HTML) (Jump to 5.5 Placing Information on the Server)

5.4 Tools

This topic is taken up in Chapter 12 in the text.

Most of you probably downloaded a program in order to work on the previous lesson. If not, you will find many HTML editors on Stroud's list of software: Windows 3.x editors, Windows95 editors, or from TUCOWS. Some are:

HTML Assistant

This is a simple tool for editing HTML code. In short, it is an editor which shows you all the HTML code as text. The program has buttons which make it simple to insert the appropriate tags and links.

The program can be set up to run a WWW client to test the produced code. Use the menu File - Enter Test Program Name... This is highly recommended to make testing easier.

If you are using the WWW client Netscape to look at HTML code, be aware that Netscape has a cache mechanism. This means that the client keeps a copy of the information it has received, so that it does not need to download the information each time the same URL is requested. This is helpful when static information must be fetched over the network, but it can be very annoying when you are testing and constantly changing the code. The problem can be solved by using the Reload button or menu option in Netscape. Netscape will then fetch the new information from the URL rather than the old information from the cache.

HTML Writer

This is almost exactly like HTML Assistant, described above.

HoTMetaL

This is an editor with a good deal of functionality. I use the PRO version, which costs money and works well. This editor has come out on top in many comparative tests.

General Text Editor

Since HTML code is just text and relatively simple, it is entirely possible to use a regular text editor such as Notepad for MS Windows.

Word IA

Microsoft has released an extension for Word for Windows called Internet Assistant - IA. This is available free-of-charge. It requires Word 6.0a or later.

IA uses an HTML template to produce HTML code. The user will experience the text as formatted (WYSIWYG - What You See Is What You Get) and will not see the HTML code. This document is, for the most part, produced with Word IA.

XEmacs

This UNIX editor has an HTML mode with functionality somewhat surpassing that of HTML Assistant. It is not WYSIWYG; here, you will see the HTML code.

Some editors support validators which check the HTML being written to make sure it conforms to HTML standards. It is very early in the development of HTML tools and we can expect many improvements in this area.

(Lessons) (To Top) (Back to 5.4 Tools)

5.5 Placing Information on the Server

First, we must take a look at identification of machines on the Internet. In order to understand how this is done, a little background on IP numbers and naming must be covered.

5.5.1 Domain Name System and IP Number Distribution

All machines connected to the Internet have a unique IP number which identifies the machine in the network. Usually, we use a name (a domain name) instead of the IP number when we identify a machine. Programs, however, use IP numbers to communicate on the network.

Examples:

The machine oversoul.idb.hist.no has the IP number 158.38.60.250

The machine you are reading this lesson from is called astfgl.idb.hist.no and has the IP number 158.38.61.236. In addition to this name, the machine has the alias www.idb.hist.no. When you use a WWW client and give the name www.idb.hist.no to reach the WWW server, the client must first find the IP number of the WWW server in order to establish contact.

All machines and machine names on the Internet are registered in a database called DNS - Domain Name System. This is actually a distributed database which is located on many machines, or DNS servers. Your machine uses one of these DNS servers to find the connection between machine names and IP numbers on the Internet. You can find out which DNS server you use by looking at how your TCP/IP software is configured.

For everyone using Trumpet TCPMAN, try the following:

Start TCPMAN (or switch to it if it is already running).
Choose File - Setup.

Now you can read the IP number of the Name Server. Write this down or copy it to the clipboard.

Everyone using the TISIP Dial-Up package will probably see 158.38.60.240. Which machine is this? You can find out by using the PING program:

Start the PING program.
Choose the menu Lookup. Place the IP number in the field for Host and click OK.

Now, a message will be sent to the machine you use as a DNS server. The machine answers with its name when it is "Pinged". (NB! It is the IP software on the DNS machine which creates this answer -- not the DNS server which is also located on the same machine.)

Everyone using Windows95's own TCP/IP software can check their IP number and DNS machine by clicking on the Start button and then going to:

Settings/Control Panel/Network/TCP/IP

5.5.2 Modem Connection

When a PC is connected by modem, it does not need to be given an IP number until the moment it is connected to the Internet. This is a smart solution because an IP number can then be assigned to each modem hook-up on the Internet side instead of to every single PC. This reduces the number of IP numbers used and creates fewer problems with duplicate IP numbers, etc.

In working with this course you will install a WWW server (HTTPD) on your own machine. This is done for two reasons:

It becomes a common setup that we can refer to while teaching.
It is least expensive for those who use modems, since one can learn its use without connecting to the Internet.

The effect of using a WWW server which is installed on a machine with a modem connection is that no others can, in practice, use the web server from the Internet. Why not?

In the rest of this text, I will discuss a few issues which ought to be considered when running a WWW server that is to be generally accessible on the Internet, but which are impossible or unnecessary to deal with on a practice server running on a modem-connected machine.

5.5.3 Issues Concerning Identification of Your Server and Information

5.5.3.1 Give the Server a Name

You ought to give your WWW server a name which is logical from the user's point of view.

Example: HiST/IDB (The dept. where I work) runs its WWW server on the machine astfgl.idb.hist.no. This would be impossible for external users to guess. Therefore, we have defined the name www.idb.hist.no as an alternative name on the same machine. The new name (www.idb.hist.no) is called an alias.

Such aliases are defined in DNS and serve as alternative names for the same machine on the Internet. If you are not responsible for DNS at your organization, you must contact whoever is to get such an alias set up.

Using an alias for your server also has other advantages. You can later move your WWW server to another machine. You will then only need to change the DNS so that the alias points to the new machine, and the users will still be able to use the same name.

Testing

A WWW server is identified with the help of a name (or an alias) or its IP number. The same machine can run many services, and the various services can use different TCP ports. The various services have standard ports ("well-known ports") which are normally used.

Examples: Email servers use port 25, FTP servers use port 21, Gopher servers use port 70 and WWW servers use port 80.

If you wish to test your WWW server before you open it on the Internet, you can use another port, for example 8234. You can then test the server and see if it is working as you wish before you move it to port 80. You set the port number in the configuration file for the server.

To reach a WWW server that uses a port other than the standard port, the port number must be included in the URL after the machine name. See the fictitious example below:

http://www.tisip.no:8234/

5.5.3.2 Give the Administrator a Name

In the same way that the WWW server ought to have an alias, the person responsible for the WWW server in your organization should have an impersonal e-mail address. This can be an address of the type webmaster@tisip.no, which is an alias for whoever is currently responsible for the service at your organization.

To set up such an address, you must contact the system manager for the e-mail service at your organization, often reachable as postmaster@your.organization.countrycode.

5.5.3.3 Make an Index

A WWW server always has a "root": the directory where the tree of information on your server starts. The path/filename in a URL is interpreted relative to this directory. In addition to the subdirectories of this root, aliases can be defined which point to other places on the hard disk outside of the original root.

The point here is that the root is the place a user lands if a URL is used without referring to a path or file; that is, URLs of the type:

http://www.tisip.no/
http://pb1.idb.hist.no

The WWW server will in such cases search for a file with the name index.htm or index.html and send this to the client. Every WWW server ought to have such an index file as an entry point. General information identifying the server should be placed here. It should say something about what kind of information can be found and perhaps who the target audience for the information is.
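As a rough illustration, an index file along these lines would identify the server and point to its main information areas. The organization name, the descriptive text and the subdirectory names are invented for this sketch.

```html
<!-- index.htm (hypothetical): placed in the root directory of the WWW server -->
<HTML>
<HEAD><TITLE>Example Organization - WWW server</TITLE></HEAD>
<BODY>
<H1>Welcome to the Example Organization WWW server</H1>
<!-- Say what the server contains and who it is meant for -->
<P>This server publishes course material and project information
for students and staff.</P>
<UL>
<LI><A HREF="courses/index.htm">Courses</A>
<LI><A HREF="projects/index.htm">Projects</A>
</UL>
</BODY>
</HTML>
```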

Try some organizations -- and see if they have a sensible index:

http://www.microsoft.com/

http://www.compaq.com/

5.5.3.4 Create a Well Thought Out Structure for the Information

Think through what kind of information should be placed on the WWW server. Create a directory structure which separates the different information areas, and make a sensible hierarchy of subdirectories which makes the information easy to survey.

5.5.3.5 Create Rules for Designating Responsibility for the Information

In many organizations, there are several people who have responsibility for various areas of information which will be placed on the server. In order to keep the server reasonably updated, it is important to clearly define areas of responsibility for each of these people.

A common process is to create a template/guideline for the publication of information. This way the published information will have a common layout. Each page should be signed by those who are responsible. "Log files" can be kept in the same directory as the HTML files. These logs would contain the name of the person responsible and the history of the current HTML information. Source information used to write the HTML documents can also be saved here.
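Such a template could, for instance, be a skeleton page that every author copies and fills in. The sketch below is one possible layout, not a prescribed one; the placeholder texts and the use of the ADDRESS element for the signature are assumptions.

```html
<!-- template.htm (hypothetical): common skeleton for all pages on the server -->
<HTML>
<HEAD><TITLE>Page title</TITLE></HEAD>
<BODY>
<H1>Page title</H1>
<!-- The page content goes here -->
<HR>
<!-- Signature: who is responsible for the page and when it was last changed -->
<ADDRESS>
Responsible: name of the person responsible,
<A HREF="mailto:webmaster@your.organization.countrycode">webmaster@your.organization.countrycode</A><BR>
Last updated: date
</ADDRESS>
</BODY>
</HTML>
```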

5.6 Exercise

Due date: 21 April 1997

Create a table as shown below:

Clickable images:

Use the image above. You can save it by clicking on it with the right mouse button (if you are using Netscape).

Use Paint Shop Pro, MapEdit or ImageMap Editor as mentioned in 5.3.2 to find the co-ordinates for the three objects (Client, Server and Net) in the picture. Create a .MAP file so that the server can handle the clickable areas. Use the NCSA format in the MAP file. (We shall continue working with this file in Lesson 8.)

Send the HTML code for the table and the contents of the MAP file as the answer to the exercise.

Last updated 7/4/97

Per.Borgesen@idb.hist.no