Distance Learning From NITOL - HiST


Materials used in this course are the property of the author. These lessons may be used only by course participants for self-study purposes. Application for permission to use these materials for other educational purposes such as for teaching or as a basis for teaching should be directly submitted to the author.


Subject: LAN Administration

Lesson: 12 - WWW Server Management


Summary: Web-servers have developed tremendously lately, and they are becoming increasingly easy to set up and administer. With the arrival of servers for Windows NT and Windows 95, being a web-master became easy. In this lesson we will discuss different platforms for web-servers and take a look at a few web-server products, before we investigate one specific server: WebSite 1.1.


Introduction

Web-servers have developed tremendously lately, and they are becoming increasingly easy to set up and administer. With the arrival of servers for Windows NT and Windows 95, being a web-master became easy. In this lesson we will discuss different platforms for web-servers and take a look at a few web-server products, before we investigate one specific server: WebSite 1.1.

Platforms

When choosing a platform for web-servers, Unix has traditionally been the first choice. This is because of Unix's excellent multitasking abilities, which are important for web-servers: a web-server handles several requests simultaneously, which demands a lot of the operating system offering the services. Lately Windows NT and Windows 95 have arrived as popular platforms for WWW servers. This is mostly because these platforms are simple to administer, but also because of their stability and the development of increasingly fast computers. We will now take a look at different platforms for WWW servers, and discuss their advantages and disadvantages.

UNIX

Considering performance, the best choice of platform is without doubt UNIX. The reason, as already mentioned, is UNIX's multi-session abilities. In addition it has good support for script programming; UNIX's standard input (stdin) and standard output (stdout) support this especially well.

DOS or OS/2 with Windows 3.*

There are several different server versions for this platform as well. These are "simpler" types than the UNIX versions, which shows in their capacity: the host has to work noticeably harder.

Administering servers under DOS/Windows is normally done with commands or by editing text files. For example, one text file defines the legal users for a folder, and other text files define the "hot zones" in clickable maps (more about this later). Most of the servers for DOS or OS/2 are freeware.

Windows 95 or Windows NT

Lately several WWW server applications have been released for the Windows NT and Windows 95 platforms. Common to them is their user-friendliness compared to the DOS or UNIX versions. The installation is normally simple: after running the file SETUP.EXE, most things happen automatically. The administration is relatively easy to learn without too much documentation.

Another advantage these server solutions have over the UNIX platform is the simplicity of file management. These solutions use the server's own file system: a Windows NT server uses NTFS, while a Windows 95 system uses the "good old" FAT system. There is no need to transfer files over FTP, as there is between a Windows workstation and a UNIX server.

WWW servers are often free for these platforms, but a few versions demand a small licence fee for commercial use. This is done to support the software developers who work hard to bring users good software. The fee is often around $100 to $500.

Ways of connecting

Permanent lines

The only logical way of connecting for businesses that want to publish on the Internet is a permanent line. As an "information pusher" you want your information available 24 hours a day, 7 days a week. A permanent line guarantees a certain capacity at any time, and the cost is independent of the traffic on the line.

A permanent line is expensive, and the price depends on the capacity wanted and the distance. An alternative to a permanent line is to place your server close to (or at) an established Internet provider that has powerful permanent lines going out. The server is then administered over a dial-up connection, which is considerably cheaper since administration takes little time.

ISDN (or a modem connection) is a possible alternative. The cost of this alternative depends on the traffic going out from the server, and there is no good way to control how long or how much the ISDN line is used.

Intranet

Using a web-server for Intranet solutions means running a web-server internally, without connecting it to the Internet. The Intranet is where the use of the web is growing the most, and an internal web-server might be used for many different internal network services.

Please note that a web-server requires the TCP/IP protocol. Web-servers are made for the Internet, which runs on TCP/IP. Computers in a network need a common protocol to talk to each other, and web-servers use TCP/IP (see lesson 5).

A few web-servers

We will now discuss a few WWW-servers briefly. I have picked a few types for each platform, and for each I will present a short summary containing how they work, their possibilities and the price for those of them that cost anything.

A list of most of the available WWW server software can be found at

http://www.w3.org/pub/WWW/Servers.html or Yahoo - http://www.yahoo.com/Computers_and_Internet/Software/Internet/World_Wide_Web/Servers/.

UNIX

NCSA-server

One of the early servers was the NCSA server. It is reckoned to be one of the most stable, and it is also one of the most used WWW server software solutions. Our institute, the institute for computer engineering (IDB) here at the High School in South Troendelag, uses this software. The NCSA server is very powerful without demanding too many resources from the host computer. It is also relatively secure: it is possible to have password restrictions (as used for the lessons in this class). All administration of the server (for example administering access) is done through text files. Users and passwords, for example, are administered through the file .htaccess, which is placed in the folder that is to be restricted. The NCSA server software is free. More information about NCSA is located at http://hoohoo.ncsa.uiuc.edu/.

APACHE

Apache for UNIX is a good alternative to the NCSA server software. It is also free, and it is also widely used as WWW server software.

More information is located at http://www.apache.org/.

NETSCAPE Fast Track Server 2.0

Netscape has developed this web-server software. It is available both for UNIX and for Windows NT. We will present this software in more detail later on.

PC with Windows 3.1 or WfW3.11

Windows HTTPD

As far as I know, there is only one realistic alternative for DOS computers: the Win HTTPD server (the d in httpd is an abbreviation for daemon, which means a server process in computer communication terminology).

This software costs $99 for commercial use, but is free for private or educational use. Servers running under DOS/Windows have to be small and relatively simple, and Win HTTPD is. The software is a complete server package, by which I mean that it has all the abilities and functions a server needs. It is also well documented, and therefore easy to learn to administer.

More information is located at http://tech.west.ora.com/win-httpd/.

Windows NT or Windows 95

NETSCAPE Fast Track Server 2.0

Netscape has approximately 70-80 % of the market for web-readers. Of course they had to develop a server as well. This makes it possible to work with a well-known user interface when working with the server. Like all other NT server software it is easy to install and easy to administer. It can be administered remotely (from another part of the network), and it has good statistical functions.

Netscape Fast Track Server 2.0 needs NT Server or NT Workstation, and does not run on Windows 95.

The price of a full version of the Fast Track Server 2.0 is $295.

More information is located at http://www.netscape.com/inf/comprod/server_central/product/fast_track/index.html.

Microsoft Internet Information Server (IIS)

Internet Information Server is Microsoft's answer to Netscape's WWW server software. It is integrated as a part of NT Server 4.0. This software is also easy to install and administer. Microsoft also gives its server software a well-known user interface.

IIS is available in version 3.0 for NT Server 4.0. For those of you using NT 3.51, there is a version 1.0. IIS is free of charge.

The abilities of IIS are mostly the same as those of Netscape Fast Track Server (remote administration, statistics and more).

More information is located at http://www.microsoft.com/iis/default.asp.

WebSite 1.1

A good alternative to the two giants in the web market is O'Reilly's WWW server WebSite. An essential difference between the giants and WebSite is that WebSite runs on Windows 95 as well as on Windows NT. Since WebSite is a Windows 95/NT server, installation and management are easy. Many of the functions mentioned above for the giants also apply to WebSite. In addition WebSite has a built-in search engine, and an "image mapper" that makes it easy to make image maps. WebSite is free for educational or test use. For commercial use it costs $500, but this gives you the professional edition, which has extended functions compared to the free version (version 1.1e). Because both Windows 95 and Windows NT can be used as a platform, we will take a closer look at this software.

WebSite 1.1

I will now present a few examples of how to administer a WebSite WWW server. The topics I will discuss may apply to any kind of server software. WebSite is well suited because of its support for Windows 95, which makes it easy to practise.

Access control

One of the more important tasks a web-master has in an Intranet is giving the right people the right access. Access may be limited in two different ways: either by defining which users are allowed to use which folders, or by listing which computers (IP-numbers) are allowed to access the information.

Password protection

WebSite 1.1 comes with an administration tool, WebSite Server Admin. The server is administered from this "filing cabinet" view, and this is where the access control is set up. Under "Users" it is possible to define different users and user groups for the server. In the example below we create the user "Arne", who is to have the password "test".

The user Arne has to be a member of a group. By default he is a member of the group "Users", which is enough.

It is now possible to make folders that may be accessed by the user Arne with the password "test". Without the proper name or password, a message telling you "Authorisation Failed" pops up.

How to choose folders that should be password protected

With WebSite the same tool is used to define which folders are protected and which are public. The next step is to define the folder you want to protect. The figure below shows the contents of "Access Control".


In the field at the top, just below "URL Path or Special Function:", a folder name may be entered after pressing the "new" button. It is possible to declare whether all files in a folder may be listed, or whether a file may only be accessed if its name is entered manually or reached through a hyperlink. For the time being we will ignore the right-hand side of the figure. In the field below we choose which groups or users are allowed access. If a group is chosen, all the users in that group are granted access to that folder, but they still have to log on with their own logon name and password. The group itself does not have a logon name.
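The relationship between users, groups and protected folders can be sketched in a few lines of Python. This is only an illustration of the rules described above, not how WebSite stores or checks them internally; the dictionaries and the function name may_access are my own assumptions, filled with the user "Arne" and the folder /password/ from this lesson.

```python
# Hypothetical data that mirrors the lesson's example, not WebSite's own storage format.
USERS = {"Arne": "test"}                    # logon name -> password
GROUPS = {"Users": {"Arne"}}                # group name -> members
FOLDER_ACCESS = {"/password/": {"Users"}}   # URL path -> allowed groups or users


def may_access(url_path, name, password):
    """Return True if a request for url_path may be served."""
    for folder, allowed in FOLDER_ACCESS.items():
        if url_path.startswith(folder):
            # The folder is protected: the logon name and password must be valid.
            if name not in USERS or USERS[name] != password:
                return False
            # The user must be listed directly or be a member of an allowed group.
            in_group = any(name in GROUPS.get(entry, set()) for entry in allowed)
            return name in allowed or in_group
    return True  # folders without an entry are public
```

Note that the group only grants the right to enter; the check of name and password is always done against the individual user.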

Protection against particular domains or users

WebSite also makes it possible to protect your files against particular groups of users. You might, for example, want to allow only users from inside your own country to access your site, or to keep visitors from certain domains out. It is possible to prevent single users or whole domains from entering your site.

Protection at domain level

It is possible to allow or deny access for users at any level of the DNS tree. If you want to make an internal web-server for your network, choose the folder you want to protect and then click "Deny, then allow". If "all" is listed under "Allow classes", it has to be removed. Then enter the domain you would like to allow (for example idb.hist.no, to allow everyone from our institute to enter your site). Anyone outside this domain will be denied access. The same method is used to deny domains access. It is also possible to deny a single workstation access.
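The "Deny, then allow" rule can be illustrated with a small Python sketch. It assumes that a host matches a domain when its name equals the domain or ends with a dot followed by the domain; how WebSite itself compares names is not described here, and the function names are my own.

```python
def host_in_domain(host, domain):
    """True if host is the domain itself or lies below it in the DNS tree."""
    host, domain = host.lower(), domain.lower()
    return host == domain or host.endswith("." + domain)


def allowed_by_domain(host, allow_domains):
    """'Deny, then allow': everyone is denied unless an allow entry matches."""
    return any(host_in_domain(host, domain) for domain in allow_domains)


# A machine at the institute is let in, a visitor from elsewhere is not.
inside = allowed_by_domain("pc130.idb.hist.no", ["idb.hist.no"])
outside = allowed_by_domain("scooter.pa-x.dec.com", ["idb.hist.no"])
```

Matching on ".idb.hist.no" rather than on the bare text "idb.hist.no" avoids letting in a host that merely has a name ending in the same letters.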

Protection at IP-number level

It is also possible to protect using IP-numbers. This is the better way, because the server already knows the visitor's IP-number, whereas for a domain restriction it must first look the number up in DNS to find the domain name, which takes time. The web-server therefore uses more resources on a domain-level restriction than on an IP-level restriction. To protect a folder using IP-numbers, proceed as outlined above, but enter the IP-number instead of the domain name.

As an example: if I want to protect a folder and grant all employees at IDB access, I have to protect it in a way that allows only the IP-numbers 158.38.61.* to read it. In other words, I swap idb.hist.no from the previous section with 158.38.61. It is also possible to go the other way (to prevent all employees from entering specific pages).

Please note the "Logical OR users and class" box. If this box is marked, it is enough that one of the two ways of getting access (through an authorised user or through a domain/IP-number) is satisfied.
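A pattern such as 158.38.61.* simply means "all addresses that start with these three numbers". The sketch below shows one way to test this with Python's standard ipaddress module. The helper wildcard_to_network and the sample addresses are my own assumptions; WebSite does its own matching.

```python
import ipaddress


def wildcard_to_network(pattern):
    """Turn a trailing-wildcard pattern such as '158.38.61.*' into a network."""
    octets = [part for part in pattern.split(".") if part not in ("*", "")]
    prefix_length = 8 * len(octets)               # three fixed numbers -> /24
    padded = octets + ["0"] * (4 - len(octets))   # fill the open positions with 0
    return ipaddress.ip_network(".".join(padded) + "/" + str(prefix_length))


idb_network = wildcard_to_network("158.38.61.*")
employee_ok = ipaddress.ip_address("158.38.61.17") in idb_network
stranger_ok = ipaddress.ip_address("204.123.2.54") in idb_network
```

The comparison needs no DNS lookup at all, which is why this kind of restriction is cheap for the server.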

Logging of users

WebSite logs everybody that reads anything from the web-server. The log is saved as a text file. This is some of the information stored in the file:

In contrast to most other servers, WebSite can log the domain of the visitor instead of his rather cryptic IP-number. This helps you keep some control over who your visitors are, but it demands more from the server. I will warn you against this feature, because it takes quite a lot of time to query the DNS (Domain Name System) server to find the proper domain name for each number.

In addition to this text file, there is also a graphical interface that works as a monitor, showing you the activity right now and over the last period of time. This is especially interesting if a computer is running as a dedicated web-server, since the system operator can monitor the usage on screen. The figure below shows a screen capture from this graphical interface.

Publications

Many Internet users find it amusing to publish statistics about which domains read their documents. Good behaviour is not to do this. If statistics have to be revealed, never reveal anything more than the top-level domain (the country code). Publishing the entire domain name is much like publishing personal information about users or organisations/businesses, and that is not good! Statistics are for internal use only.

Robots

Some of this material is copied from lessons in the DE-class "Publications on the Internet" (POI).

There are several search engines trying to catalogue the web. The World Wide Web is so big that it is an impossible task to catalogue it all. This makes cataloguing a continuous task.

Robots have been working on the net for a few years now, and they do a lot of good and necessary work that could not have been done without them. It is rather misleading to call them robots. The name gives me the impression of programs moving around in the world, but the robots stay put where they are; instead they contact different WWW servers and read all the documents they can find.

All of this is of course not free. It costs bandwidth. A robot that does not behave properly might overload a server by asking for several documents in a very short period of time (rapid fire). Netscape is also guilty of rapid fire when the client asks for several inline pictures simultaneously.

"Help, I'm being counted " (from lesson 9 in POI written by Fredrik Wilhelmsen)

In Norway we have a children's tale, written by Alf Prøysen, about a goat kid that was able to count. Most of the other animals were unfamiliar with the concept of counting, and were afraid that it was dangerous ("Now he counted you as well!"). I got some of the same feeling when I discovered that my www-server was getting visits from search robots ("Help, I'm being indexed!").

How do you discover a visit from a robot? All transactions leave traces in the log. The log normally just contains a bunch of IP-numbers, but with some training you can recognise foreign numbers accessing dusty old pages no one ever reads anymore. This is how I discovered Scooter. I found Scooter's homepage with some help from an application I have on my computer called "IP-resolver" (a Novell Windows application). I could also have logged on to a UNIX computer and run nslookup (name server lookup). I had the address 204.123.2.54 in my log, and the IP-resolver told me that this number was the same as the address scooter.pa-x.dec.com. So I searched WebCrawler, and found Scooter's web-site at http://scooter.pa-x.dec.com/.

Scooter claims to be a kind and good robot. He informs us that he is registered in a list of robots at http://info.webcrawler.com/mak/projects/robots/active.html. The robot also advises us how to set up a file called "robots.txt" in the root of the web-server so that robots dropping by know where to go and where to keep away. Scooter also claims that it does not find new web-servers: if Scooter visits you, it means that your server is listed in an already known list. That thought is just as scary as the one above: somebody knows where I live! Scooter is used by, among others, Alta Vista, which for the time being is the "hottest" indexed database around.

Because I didn't have any robots.txt file, every robot visiting me and asking for that file generated an error message. I could make a list of all of these by searching my error.log (grep "c:\httpd\htdocs\robots.txt" error.log > sniffers.txt). This way I discovered a visit from The Architext Spider from http://www.atext.com/.
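For those without grep at hand, the same search can be sketched in Python. The two log lines below are invented for the illustration (the real layout of error.log depends on the server), and the function name robot_requests is my own.

```python
# Hypothetical error-log lines; the real log layout depends on the server.
SAMPLE_LOG = [
    r"[12/Mar/1996:10:15:02] 204.123.2.54 File not found: c:\httpd\htdocs\robots.txt",
    r"[12/Mar/1996:10:17:40] 158.38.61.17 File not found: c:\httpd\htdocs\old.htm",
]


def robot_requests(log_lines):
    """Keep only the lines that mention robots.txt, as the grep command does."""
    return [line for line in log_lines if "robots.txt" in line.lower()]


sniffers = robot_requests(SAMPLE_LOG)
```

With a real log, the lines would be read from the file instead, for example with open("error.log") and a loop over its lines.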

Now I have made myself a file called robots.txt. It looks like this:


# /robots.txt for http://pc130.idb.hist.no/
User-agent: * #attention all robots:
Disallow: /~fredrik/lest/ #private notes
Disallow: /temp/ #temporary or not original files


The asterisk (*) after User-agent means that the lines that follow concern all robots, and everything after a hash sign (#) is a comment. Then follows a list of folders the robots are not allowed to enter. This way I avoid having the robot waste time on valueless documents.
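To see how a well-behaved robot would read this file, the rules can be fed to the robots.txt parser in Python's standard library (urllib.robotparser). The two page names notes.html and index.html are invented for the test; the rules are the ones from the file above.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
# /robots.txt for http://pc130.idb.hist.no/
User-agent: * #attention all robots:
Disallow: /~fredrik/lest/ #private notes
Disallow: /temp/ #temporary or not original files
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())   # parse the text directly, no network access

BASE = "http://pc130.idb.hist.no"
# Hypothetical documents: one inside a closed folder, one in the open part.
private_ok = parser.can_fetch("Scooter", BASE + "/~fredrik/lest/notes.html")
public_ok = parser.can_fetch("Scooter", BASE + "/index.html")
```

Keep in mind that robots.txt is only a request. A robot that ignores the file can still read the folders, so it is no substitute for access control.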

Offer your own search engine

If you run a www-server, you can set up a search engine that makes it possible to search html-documents. One such program is Web Server Search for Windows and a copy may be downloaded from http://wgg.com/wgg/best/search.htm. This program is written in Visual Basic, and it runs under Windows 3.11 or better.

Some web-servers offer a utility made for searching your own web. WebSite has such a program. It is named WebIndex, and the figure below is a screenshot from it.

In this screen you choose which folders should be indexed, and you name the index. When you start searching this index, a CGI script is loaded from the server's cgi folder (for example http://abm.idb.hist.no/cgi-bin/webfind.exe). From this form you may search your own web. A screenshot is shown below.

Setting up aliases/redirecting

Aliases

It is possible to set up pointers, or aliases, to folders on the disk. The physical folder structure is then invisible, and a logical structure can be chosen to make things easier for the user.


Example:

The HTML-documents for this test-server are placed under c:\website\htdocs. In addition I have an old web lying around under c:\_d\div\web\fu\prosys. This is an older version of the distance education class project oriented system work. If I wish to make a pointer to this as /fu/, that is possible. I still use the general tool Server Admin, and this time we look at "Mapping".

The logical folder is written in the field "Document URL Path", so I enter /fu/ here. In the field "Directory (…)" I enter the full path to the folder on the disk (c:\_d\div\web\fu\prosys).
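What the server does with such a mapping can be sketched as a lookup from the logical URL path to the physical folder. The sketch assumes that the longest matching alias wins and that every URL path starts with "/"; the function name resolve and the file name index.htm are my own, and WebSite's real rules may differ in detail.

```python
import ntpath  # Windows path rules, usable on any platform

# Logical URL path -> physical folder, as entered under "Mapping".
MAPPINGS = {
    "/": r"c:\website\htdocs",
    "/fu/": r"c:\_d\div\web\fu\prosys",
}


def resolve(url_path):
    """Translate a URL path to a file path using the longest matching alias."""
    prefix = max((p for p in MAPPINGS if url_path.startswith(p)), key=len)
    rest = url_path[len(prefix):].replace("/", "\\")
    return ntpath.join(MAPPINGS[prefix], rest)


old_web = resolve("/fu/index.htm")    # served from the old web
main_web = resolve("/index.htm")      # served from the normal document root
```

The visitor only ever sees /fu/index.htm; the long physical path stays hidden.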


Redirecting to a new URL

It is also possible to map at URL level. If you have moved your documents to a new place, you might wish the user to get the new document automatically instead. A click on "Redirect" brings up the following screen.

The old URL is entered in the field "Original URL", and the new URL in the field "Redirected URL". If a user enters the old URL in a web-reader, he will automatically be sent to the new place. The new URL may also be a URL on another web-server.
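Behind the scenes a redirect is an ordinary HTTP reply that tells the web-reader where to go instead. The sketch below builds such a reply. The URLs are invented, and I assume a permanent redirect (status 301); which status code WebSite actually sends is not covered in this lesson.

```python
# Hypothetical redirect table: old URL path -> new URL (names are invented).
REDIRECTS = {"/old/lesson12.htm": "http://www.example.com/new/lesson12.htm"}


def redirect_response(url_path):
    """Return the HTTP reply for a redirected URL, or None if there is no redirect."""
    new_url = REDIRECTS.get(url_path)
    if new_url is None:
        return None
    # Assumption: a permanent redirect (301); the server may use another status code.
    return "HTTP/1.0 301 Moved Permanently\r\nLocation: " + new_url + "\r\n\r\n"


reply = redirect_response("/old/lesson12.htm")
```

The web-reader reads the Location line and fetches the new URL by itself, which is why the user hardly notices the move.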

Security-holes and mapping

There is a dangerous security hole that it is important to be aware of when making logical connections. If you map a logical URL to a folder that is password protected, the protection is voided.


Example:

The folder /password/ is password protected. If I make a logical connection named /sec_hole/ and map it to the folder c:\website\htdocs\password\ (take a look at the figure below), everybody gains access to the password-protected folder without having to use the password.
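The reason for the hole can be shown with a small sketch. It assumes that the protection is attached to the URL path (as the field "URL Path or Special Function" suggests) and not to the physical folder. The function names and the file secret.htm are my own inventions.

```python
import ntpath  # Windows path rules, usable on any platform

MAPPINGS = {
    "/": "c:\\website\\htdocs\\",
    "/sec_hole/": "c:\\website\\htdocs\\password\\",
}
PROTECTED_URLS = ["/password/"]   # protection is listed per URL path


def resolve(url_path):
    """Translate a URL path to a file path using the longest matching alias."""
    prefix = max((p for p in MAPPINGS if url_path.startswith(p)), key=len)
    return ntpath.join(MAPPINGS[prefix], url_path[len(prefix):].replace("/", "\\"))


def needs_password(url_path):
    """The check looks only at the URL, never at the file behind it."""
    return any(url_path.startswith(p) for p in PROTECTED_URLS)


same_file = resolve("/password/secret.htm") == resolve("/sec_hole/secret.htm")
front_door_locked = needs_password("/password/secret.htm")
back_door_locked = needs_password("/sec_hole/secret.htm")
```

Both URLs end up at the same file, but only one of them asks for a password. Every new mapping should therefore be checked against the list of protected folders.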


Image Maps

Clickable images, or image maps, link an image to several HTML-files. The row of buttons in Netscape is used as an example in WebSite.


Example


The row of buttons in Netscape is copied out as a GIF-file, and each button is mapped to a web page that explains what that button does.



To get this row of buttons to work, the following code has to be written:

<A HREF="knapprad.map">
<IMG SRC="knapprad.gif" ISMAP WIDTH="403" HEIGHT="44"></A>

We can see that the picture is linked to a file named "knapprad.map". This is the map-file, which is made very easily with WebSite. The WebSite package contains an application named MapThis, in which a picture is loaded and the "click-sensitive" areas are marked (take a look at the figure).

In this case rectangles were needed, but it is also possible to use ellipses, circles, points, or freehand figures. When you have chosen one or more areas, these are shown in the "Area list" at the right of the figure. The "Area list" can be switched on or off with a button in MapThis (button No. 9 from the left). When the Area list is on, it is possible to see which HTML-file an area is mapped to, either by editing the area under the list, or by clicking "edit".

If you do not click on any area at all, it is possible to set up a default file that will be loaded. This is done with the button .
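When a picture is marked with ISMAP, the web-reader sends the position of the click to the server as "?x,y" after the URL of the map-file. What the server then does can be sketched as a search through the list of areas. The rectangles, the file names and the function names below are invented for a button row of 403 x 44 pixels; they are not taken from a real map-file.

```python
# Hypothetical areas: (left, top, right, bottom, target file).
AREAS = [
    (0, 0, 49, 43, "back.htm"),
    (50, 0, 99, 43, "forward.htm"),
]
DEFAULT_TARGET = "default.htm"   # loaded when the click hits no area


def parse_click(query):
    """Split the 'x,y' part sent by the web-reader into two numbers."""
    x_text, y_text = query.split(",")
    return int(x_text), int(y_text)


def target_for_click(x, y):
    """Return the file mapped to the first rectangle that contains the click."""
    for left, top, right, bottom, target in AREAS:
        if left <= x <= right and top <= y <= bottom:
            return target
    return DEFAULT_TARGET


# A request for knapprad.map?60,5 would be handled like this:
chosen = target_for_click(*parse_click("60,5"))
```

Circles and freehand figures need a different test than the rectangle comparison used here, but the principle is the same.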

WebSite is special since it does not need to run any special application to treat an image map. This makes such images very easy to work with under WebSite.

Summary

Administration of WWW servers is a bit outside the curriculum of this class. It is still a reality that more and more businesses and organisations install a web-server. There are several reasons for this: some wish to offer information to the rest of the world, while others wish to use it for an internal web. The principles and the administration are the same in either case, and as a system administrator it is important to know them.