Internet Publishing - PPI

"Materials used in this course are the property of the author. These lessons may be used only by course participants for self-study purposes. Application for permission to use these materials for other educational purposes such as for teaching or as a basis for teaching should be directly submitted to the author."


Lesson 2: Outline of Technical Services

Topology in Cyberspace
Intranet
Web Hypertext Terminology
On the Edge of the Web
HTML and SGML
Tasks for this Lesson -- Exercise 2
References and Additional Reading Material
Footnotes

Topology in Cyberspace

Cyberspace refers to the mental construction a person creates when he or she interacts with a computer. The concept was developed by William Gibson, who through his books (Neuromancer, Mona Lisa Overdrive, etc.) created a whole new genre: cyberpunk. Bruce Sterling also had a finger in the pie in cyberpunk's childhood.

When you sit in front of your computer and surf the Net, you find yourself in cyberspace, but cyberspace is actually in your head. Likewise, you find yourself in cyberspace when you play videogames. Cyberspace can be the entire global village, or just located on a standalone PC. We are most interested in services which are connected to the global network.

The global cyberspace consists of a slew of different networks which are used to communicate across the whole world. The Internet is one such network.

Figure 1. Overview of Networks in Cyberspace

Cyberspace consists of among other things:

  1. Private Networks. PAN = Private Area Network. Networks which perhaps only exist in one room or within one building, not connected to another network. (personal web)
  2. Private networks can be connected to the Internet in different ways (PPP, SLIP, Gateway, ...)
  3. FidoNet: Networks of private individuals who send data to one another over telephone lines (or even by diskettes via snail mail). This network is also called toasternet because you don't need top of the line equipment to set up such a network.
  4. BITNET: (Because It's Time Network).
  5. UUCP: Unix to Unix Copy Program

The reason for the explosion of users on the Internet (no, the users haven't exploded, the number has exploded) is that it has become so easy to navigate (surf). The World Wide Web, or just the Web (WWW, W3, W^3), is the Internet's killer application. We will learn a lot about the Web in this course. However, the Web is not the same as the Internet. The Web encompasses all of the resources on the Internet that can be reached with the help of a Web client. So, the Web is a subset of the resources available on the Internet.

Intranet

Intranet is a much used word in recent times. It means Internet technology put to use on a private network that is curtained off from the outside world, either by not connecting the network physically to the Internet or by equipping the network with a firewall. A firewall can be set up in several different ways, but the most common setup allows users on the inside (within the private network) to reach certain resources on the outside, while keeping those outside (on the Internet) from accessing the resources on the private network.
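The firewall policy described above can be sketched as a small decision function. This is only an illustration under stated assumptions: the address range 10.0.0.0/8 for the private network and the set of permitted outbound ports are hypothetical choices, and a real firewall inspects far more than source, destination, and port.

```python
import ipaddress

# Assumption: the private network uses this address range.
INSIDE = ipaddress.ip_network("10.0.0.0/8")

# Assumption: inside users may reach only these services on the outside
# (FTP, HTTP, and NNTP on their well-known ports).
ALLOWED_OUTBOUND_PORTS = {21, 80, 119}


def is_inside(address: str) -> bool:
    """Return True if the address belongs to the private network."""
    return ipaddress.ip_address(address) in INSIDE


def allow(source: str, destination: str, port: int) -> bool:
    """Decide whether a new connection may pass the firewall."""
    if is_inside(source) and is_inside(destination):
        # Traffic that never leaves the private network is not filtered.
        return True
    if is_inside(source):
        # Inside users reach only certain resources on the outside.
        return port in ALLOWED_OUTBOUND_PORTS
    # Anything that originates on the outside is kept out.
    return False
```

With these assumptions, a user at 10.1.2.3 may open a connection to an outside Web server on port 80, while the same outside machine cannot open a connection to 10.1.2.3.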

Extranet

Another way to make use of Internet technology would be to connect several private networks, each of which can be an intranet, together in a logical network, or in a company network if you like. Between the individual networks, the Internet can serve as a connecting link. We could call this an Extranet, even though this is not yet a household word.

Services on the Web

Figure 2: The Web Lies within the Internet

The Web provides access to many services on the Internet. Figure 2 shows a Web client connected to two services: an HTML document and an FTP server. When we refer to the Web, we usually mean all hypertext documents written in HTML which make up the larger web. But the following are also accessible through the Web client:

  1. FTP (File Transfer Protocol): A widely used protocol for transferring files (text, graphics, video) over the Internet. The files are transferred rather than displayed, so the information does not pop up on the screen.
  2. NNTP (Network News Transfer Protocol): A protocol for transferring Usenet news.
  3. Gopher: A protocol for exchanging information in the form of menus and text documents.
  4. Telnet: A way to log into a host on the Internet (ASCII terminal)
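A Web client tells these services apart by the scheme at the start of a URL. The following is a minimal sketch, using Python's standard urllib.parse module, of how a client might work out which service and host a URL points to. The function name locate_service is made up for this illustration; the port numbers are the well-known default ports of each protocol.

```python
from urllib.parse import urlsplit

# Well-known default ports for the services listed above.
DEFAULT_PORTS = {
    "http": 80,
    "ftp": 21,
    "nntp": 119,
    "gopher": 70,
    "telnet": 23,
}


def locate_service(url):
    """Return (scheme, host, port) for a URL (hypothetical helper)."""
    parts = urlsplit(url)
    # Use the port written in the URL, else the protocol's default.
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return parts.scheme, parts.hostname, port


if __name__ == "__main__":
    for url in ("http://www.w3.org/",
                "ftp://ftp.unit.no/pub/bibsys/ewan/",
                "telnet://eros.bibsys.no/"):
        print(locate_service(url))
```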

Web Hypertext Terminology

Hypertext has been with us since Ted Nelson described it during the 1960s. The following terms are often used when speaking about Web-based hypertext:

page
Refers to a single document of hypertext (HTML).
home page
Refers to a page that offers a number of entrances to a local web; refers also to what a person looks upon as his or her main page; often contains information about that person (private or professional).
hotspot
A region on a hypertext document which moves you to another place on the Web when you click on it; these are also called links; we could also call them anchors, because that is what they in fact are: we can click on an anchor and follow the link to the new information.
web (with lowercase w)
A collection of hypertext documents which is looked upon as being a single work; often resides on a single server, but this is not a requirement; often used synonymously with home page.
Web (with uppercase W)
The set of all hypertext documents which are available throughout the world; often used with a wider meaning -- all information accessible through a web-client interface.
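To make the term anchor concrete, here is a small sketch that lists the anchors in a page together with the addresses they link to. It uses Python's standard html.parser module. The class name AnchorCollector and the sample page are invented for this illustration; the two addresses in the sample are taken from elsewhere in this lesson.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect (link text, target) pairs from the anchors in a page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # target of the anchor currently open, if any
        self._text = []     # text seen inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def collect_anchors(html):
    parser = AnchorCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Invented sample page for illustration.
SAMPLE_PAGE = (
    '<p>Read about the <a href="http://www.w3.org/">WWW organization</a> '
    'or <a href="telnet://elib.zib-berlin.de">search a library</a>.</p>'
)

if __name__ == "__main__":
    for text, target in collect_anchors(SAMPLE_PAGE):
        print(text, "->", target)
```

Each pair is one hotspot: the text is what the reader clicks on, and the target is the place on the Web the link leads to.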

On the Edge of the Web

Since the Web as a rule is the gateway (or increasingly the wormhole) to the Internet, it is natural to regard the services we reach through the Web, but which are not on the Web, as the Edge of the Web.

Figure 3: On the Edge of the Web

These are services which are not a part of the Web, but which we can reach through our Web client. We can also look at it in this way:

Figure 4: Clients and Servers in Cyberspace

Figure 4 shows the relationship between client (square with blue top) and server (square with shading) in cyberspace. These maps are not absolute or static.

There are, as a rule, specialized clients for each of these different services. In addition, the number of services available through Web clients will grow steadily. As new tools and services come into being, the Web clients must be constantly renewed. For the services where the Web client cannot be the client itself, a WWW browser can usually start other programs which can act as clients.
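The idea of a browser handing a URL over to another program can be sketched as a lookup table. Everything specific here is an assumption made for illustration: which schemes the browser handles itself, the helper program name "telnet", and the function name plan. The sketch only builds the command; it does not start any program.

```python
from urllib.parse import urlsplit

# Assumption: services this imaginary browser can handle by itself.
BUILT_IN = {"http", "ftp", "gopher", "news"}

# Assumption: helper programs configured by the user, keyed by scheme.
HELPERS = {"telnet": "telnet"}


def plan(url):
    """Decide who acts as client for a URL (hypothetical helper)."""
    parts = urlsplit(url)
    if parts.scheme in BUILT_IN:
        return ("browser", url)
    if parts.scheme in HELPERS:
        # The browser would start this command as a separate program.
        return ("helper", [HELPERS[parts.scheme], parts.hostname])
    return ("unsupported", url)


if __name__ == "__main__":
    print(plan("http://www.w3.org/"))
    print(plan("telnet://elib.zib-berlin.de"))
```

Keeping the helpers in a table mirrors the way the set of services grows: supporting a new service means adding an entry, not rewriting the browser.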

Each person must draw his/her own picture (construct one's own image inside one's head) of the way cyberspace looks.

HTML and SGML

The Web consists of hypertext documents written in HTML. HTML stands for HyperText Markup Language. If you want to see what HTML looks like, choose View/Source in your Web client (browser). You will then be able to see what the HTML code looks like for this document. The documents are flat ASCII files which are transferred from the server to the client with the help of HTTP (the HyperText Transfer Protocol). When the document has been downloaded to your client, you can do what you want with it. You can look at the document, save it, edit it, cut and paste... The only thing you cannot do is put the document back where it came from.
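The transfer itself is a short exchange of text. The sketch below shows, without touching the network, roughly what a client sends and how it might take a reply apart. It is a simplification: the function names and the canned reply are invented for this illustration, and real clients handle many more details of the protocol.

```python
def build_request(path, host):
    """Return the text a client sends to ask for a document."""
    return f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"


def split_response(raw):
    """Split a reply into its status line, headers, and document body."""
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, body


# Invented reply for illustration; a real server sends more headers.
SAMPLE_RESPONSE = (
    "HTTP/1.0 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
    "<html><body><h1>Lesson 2</h1></body></html>"
)

if __name__ == "__main__":
    print(build_request("/", "www.w3.org"))
    status, headers, body = split_response(SAMPLE_RESPONSE)
    print(status)
    print(headers["content-type"])
    print(body)
```

The body is the flat text file described above; what the client does with it afterwards is up to the client.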

Imagine if all the documents sent to a national library had to be available in digitally readable form. What a thought! Most documents have been through a digital form in the production process. However, all this information will not be easily accessible as long as documents continue to be represented in so many different formats. We have enough problems trying to keep the different versions of Microsoft Word separate. Some of us grumble when we receive WordPerfect files and throw ourselves on the floor if we have to come near a diskette from a Macintosh.

What we need is a universal language which can describe every kind of document. The foremost candidate for being this universal language is called SGML (Standard Generalized Markup Language).

Now, there are two ways to look at a document. We can look at the document's contents, that is, what information is contained within the document. This is the manner in which an electronic librarian looks at documents. They should be "searchable" according to content. Librarians want SGML to be a language for organizing and structuring information.

On the other hand, we can look at the way a document appears. This is what marketing people want -- desktop publishing. Since Netscape is a commercial entity, they want to attract commercial sponsors. It is becoming more and more common to see advertising on popular web pages. (Take a break and look at The Dilbert Zone. See what I mean?) HTML has, therefore, not become the general descriptive language that librarians wished for. Those who putter with SGML do not, therefore, wish to be associated with HTML: "... SGML advocates treating HTML like a bastard child - related but better left unmentioned." [Silverman].

Connecting to Databases

At the moment, we see a strong trend to connect Web servers with databases. There are several reasons for this. The most important are:

Today, various database distributors can offer connection and development environments for the webserver.

Data Formats on the Web

Read Chapter 2 in the book, where different data formats and the standard for describing them, MIME, are discussed.
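As a small taste of what MIME does, the sketch below looks up the MIME type that goes with a file name. It uses the built-in table in Python's standard mimetypes module. The function name content_type and the file names are invented for this illustration, and the fallback to application/octet-stream for unknown extensions is a common convention chosen here, not something taken from the book.

```python
import mimetypes

# A private table that uses only the defaults shipped with Python.
_TABLE = mimetypes.MimeTypes()


def content_type(filename):
    """Return the MIME type for a file name (hypothetical helper)."""
    mime, _encoding = _TABLE.guess_type(filename)
    # Assumption: treat unknown formats as arbitrary binary data.
    return mime or "application/octet-stream"


if __name__ == "__main__":
    for name in ("lesson2.html", "figure1.gif", "notes.txt"):
        print(name, "->", content_type(name))
```

A server sends this type along with the document, and the client uses it to decide whether to display the data itself or pass it on to another program.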

Tasks for This Lesson - Exercise 2

Only the presentation part in point 5 of this exercise is obligatory. Questions regarding this exercise should be sent to your assigned teaching assistant. You will find their addresses on the "Registered students page". The deadline for sending in this assignment is 18 March. If you have questions or comments on the rest of the points, you may send them to the discussion forum on a voluntary basis.

1. Read More About What Interests You

You may have found many of the concepts in this lesson unfamiliar, somewhat unfamiliar, not much used in your branch or just a little vague. Try and search the Web for the topics you are unsure about (for example, archie, wais, ftp, etc.). Here are some search engines:

WebCrawler Searching
Alta Vista
The Lycos Home Page: Hunting for WWW Information
Kvasir (Norwegians)
W3 Search Engines: A list of search engines.

2. What is cyberspace and the Web?

Write a definition, or find a definition for cyberspace or the cybernetic landscape which is better than the vague thoughts I referred to in the beginning of this lesson. Give references.

3. What is UUCP and How Does It Work?

From Figure 1 we see that UUCP is not a part of the Internet. This means that UUCP does not use IP (or TCP/IP). Is this correct? Does USENET not, for the most part, use UUCP to transfer news between servers? Write something informative about this. (List your sources.)

4. In Xanadu did Kubla Khan ...

Neither Ted Nelson nor the Xanadu project is dead. Search the Web and find out what you can about both. Write a short summary. Do not forget to mention where and how you found information. (Perhaps I should mention that Khan's name is often misspelled Kahn.)

5. On the Edge of the Web

I assume that you use Netscape or Internet Explorer as your Web client. They do not support Telnet (which is a terminal service) themselves. However, they support both e-mail and newsgroups. The amount of support they provide varies from version to version.

Telnet:

Set up your Web browser to start a Telnet client. Find a Telnet client on your machine. (You have one if you are running a dial-up package. If not, you can get one at ftp://ftp.unit.no/pub/bibsys/ewan/ ).

In Netscape you can use the following URL:

telnet://elib.zib-berlin.de to start a telnet-session to an electronic library. You can log in as user: elib, and you are able to search for articles and programs.

The purpose of this exercise is to show how Web technology interfaces with other services on the Internet.

For those of you who understand Norwegian, you can try:

telnet://eros.bibsys.no/

Log in as user bibsys and password bibsys. Try to find out if Geir Maribu is listed as an author. If you need help, or you want to see if the same information is available within the Web, you can look at the WWW entrance to Bibsys (WWW-inngangen til Bibsys) simultaneously.

News:

Your Netscape client can be a client for News.

In our course we do not use the Internet News, but we're trying the web-based HyperNews for the first time.

Presentation

If you haven't done so yet, please enter the classlist group (also available from the Internet Publishing homepage), and write a short introduction about yourself. Say who you are, what you do, where you live, your interests...

This presentation is the obligatory part of the exercise -- so make sure you sign it with your full name.

References and Additional Reading Material

[Silverman]
David Silverman, "Toward a Universal Library; SGML and the future of electronic documents," Wired, August 1995, vol. 3, no. 8. David Silverman is chief scientist for the Innodata Corporation in New York and vice president of the International SGML User's Group.
SGML
Read more than you want to know about SGML on the SGML Web page at URL: http://www.sil.org/sgml/sgml.html
Not Archie
Archie is a service which you can use to search FTP archives. Archie is a registered trademark. At the program workshop at NTNU, they have written their own search program. It is very good. But it is not Archie. http://ftpsearch.ntnu.no/ftpsearch/
WWW
Read everything about the WWW organization at http://www.w3.org/. This is the root for everything that has to do with the Web. :-)
Stroud
Fascinated by what happens on the Internet? Want to be on top of the latest software, freeware, shareware, etc.? Follow this link to Stroud's fantastic list. Norwegian users can find it (mirror site) at http://www.interlink.no/cwsapps/.

Footnotes

Killer application: an application which suddenly makes everyone see the value of a tool. VisiCalc (visual calculator), the first spreadsheet, was a killer application for PCs. Before VisiCalc, PCs were something hobbyists and engineers played with and put together. VisiCalc made businesses and accountants see the usefulness of PCs.

The Internet consists of machines which are bound to one another by the Internet Protocol (IP).

Ted Nelson was the first to use the word hypertext. Read the gripping summary of the Xanadu project in Wired 3.06: Gary Wolf, "The Curses of Xanadu", June 1995, pp 137-152, 194-202.

Personal Web: In Lesson 1, we talked about two kinds of Web services -- external and internal. We can imagine a third and lower level: The personal level. This is a web where you post things that you normally would not show other people.


07 March 1997 Fredrik Wilhelmsen and Per Borgesen <per@idb.hist.no>