Edited by Luciano Giustini
by Michele Beltrame
In this article I would like to briefly describe CGI (Common Gateway Interface) programming. When you see, on a Web site, things such as access counters, imagemaps or dynamic pages, you can be almost sure that they've been created using CGI technology.
First of all, let's clarify that CGI isn't a programming language but an interface through which the Web client (the browser) can interact with programs located on the Web server. You can write CGI programs in whichever programming language you like best or know best: C, C++, Pascal, Smalltalk, Python, Basic, .... However, the most widely used language for these programs is Perl, a language born on Unix some years ago, for which we should especially thank its creator, Larry Wall. Perl has many advantages: it is easy to learn, it has very powerful string manipulation functions and operators, and it is highly portable and rich in extensions. However, because Perl is an interpreted language, a Perl interpreter must be installed on your server (or on your provider's server). (To be complete, there is also a compiler, written by Malcolm Beattie, but it is still in alpha testing.) All the examples I'll include in these articles are written in Perl. In any case, as I said before, you may use any programming language you like. You can find the Perl interpreter for many operating systems (besides the Unix versions, there are also a Windows NT port and many others) on Tom Christiansen's www.perl.com.
In this article I'll call the CGI programs simply CGIs. This is not exactly correct, but everyone does and it's brief. ;-)
What do I do with CGIs?
There are lots of applications of CGI programming, all of them aimed at allowing greater interaction between user and server, something that normal HTML pages don't offer (apart from links). There are three main applications of CGI programming, from which many of the others derive:
Dynamic pages - Dynamic pages, picturesquely called virtual pages by some people, are HTML pages created on the fly, either in response to a specific request made by the client or because they must display data which changes frequently. An example of a dynamic page is the following:
Michele Beltrame home page
This page has clearly been generated dynamically, because it displays data that was unknown to the creator of the CGI program (in this specific case, the number of hits the page has received and the name of the client).
Forms - One of the most useful applications of CGI technology is the ability to handle online forms filled in by the user. Many of the controls typical of graphical interfaces may be used in a form: radio buttons, check boxes, list boxes, text areas, .... You can see an example of a form inserted in an HTML page in the image. Forms are usually used to collect information from the user and then store it in a database (or send it to a mailbox). However, it's also possible to create dynamic pages from the data the user enters in a form.
Gateways - There are kinds of information, such as the contents of a database, which can't be directly accessed by the client. To get around this problem it is necessary to use gateways, which are simply programs that read a certain file and interpret its contents, translating them into a format readable by the client.
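To make the gateway idea concrete, here is a minimal sketch in Python (one of the languages mentioned above, used here as a stand-in; the examples of this series are in Perl). It assumes a hypothetical data file with one `title|author` record per line. The function name `records_to_html` and the record layout are illustrative assumptions, not part of any real database.

```python
import html
import io


def records_to_html(lines):
    """Translate 'title|author' records into an HTML table (hypothetical format)."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the data file
        title, author = line.split("|", 1)
        # Escape the data so characters such as < and & display literally.
        rows.append(
            "<TR><TD>%s</TD><TD>%s</TD></TR>"
            % (html.escape(title), html.escape(author))
        )
    return "<TABLE>\n" + "\n".join(rows) + "\n</TABLE>"


if __name__ == "__main__":
    # An in-memory stand-in for the file a real gateway would open.
    sample = io.StringIO("First title|First author\nSecond title|Second author\n")
    print(records_to_html(sample))
```

The client never sees the data file itself, only the HTML that the gateway produces from it.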
How the CGI interface works
For those who are interested not just in knowing how to use a program but also in understanding how it works (and I hope there are many of you ;-) ), I'll write a few lines on how a CGI is invoked by the client and how it returns its output to the client. A CGI program is invoked like any other HTML document (it is "requested"): the client sends the server something like the following:
GET /cgi-bin/booksearch.pl HTTP/1.0
The first line contains the name of the requested file (booksearch.pl in the directory /cgi-bin) and the protocol used (HTTP 1.0). The following lines list the formats the client can accept in reply to the request (in this case text files, HTML files, GIFs and JPEGs). The last line gives the client name (Netscape 3.0/Linux in this case). There may be other lines in the request, such as the username, but only the first line is relevant to understanding how CGI works.
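The shape of such a request can be sketched in a few lines of Python (used here for illustration only). The header values, in particular the `User-Agent` string, are assumptions chosen to match the description above, not the exact lines any browser sends. `build_request` and `parse_request_line` are hypothetical helper names.

```python
def build_request(path, accept_types, user_agent):
    """Assemble an HTTP/1.0 GET request as text."""
    lines = ["GET %s HTTP/1.0" % path]
    # One Accept line per format the client can handle.
    lines += ["Accept: %s" % mime for mime in accept_types]
    lines.append("User-Agent: %s" % user_agent)
    # A blank line ends the request headers.
    return "\r\n".join(lines) + "\r\n\r\n"


def parse_request_line(request_text):
    """Split the first line into method, path and protocol."""
    first_line = request_text.split("\r\n", 1)[0]
    method, path, protocol = first_line.split(" ")
    return method, path, protocol


if __name__ == "__main__":
    request = build_request(
        "/cgi-bin/booksearch.pl",
        ["text/plain", "text/html", "image/gif", "image/jpeg"],
        "Mozilla/3.0 (X11; I; Linux)",  # assumed value, for illustration only
    )
    print(request)
    print(parse_request_line(request))
```

The server needs only the three fields returned by `parse_request_line` to decide which file is being requested.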
If the server receives a request for a document that is located in a specific directory (/cgi-bin in this case) or that has a particular extension (it may be .cgi for files located outside the /cgi-bin directory), it doesn't send the document to the client. Instead, it runs it as an executable program (which in fact it is) and sends its output to the client instead of the standard output device (the screen, for instance). For the output to reach the client properly, it should be structured as follows:
HTTP/1.0 200 OK
Date: Sunday, 22-September-96 11:09:00 GMT
As you can see, you should return to the client a document with a full header containing date, time, name of the server program, MIME protocol version, content type and content length. However, in most cases it is enough to return a partial header which just specifies the content type:
The server will complete the header with the missing information. This feature makes the creation of HTML pages much easier, although there are circumstances, which we'll see in a future article, in which full headers have to be used.
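As a minimal sketch of a program that returns only a partial header, here is a Python version (an assumption of this example; the article's own listings are in Perl). The function name `cgi_response` is hypothetical. The one fixed point is that an empty line must separate the header from the document body.

```python
import sys


def cgi_response(body, content_type="text/html"):
    """Return a partial header, the blank separator line, then the body."""
    return "Content-type: %s\n\n%s" % (content_type, body)


if __name__ == "__main__":
    page = "<HTML><BODY><H1>Hello from a CGI</H1></BODY></HTML>\n"
    # The server captures standard output and adds the remaining headers.
    sys.stdout.write(cgi_response(page))
```

If the blank line after the header is missing, the server cannot tell where the header ends and will typically report an error instead of delivering the page.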
For the CGI interface to work properly, you need to make some simple changes to your server configuration. Many of you won't need to touch the configuration files, because your service provider has probably already configured everything. Still, I think the topic deserves a few words. The configuration examples included here are for the Apache HTTP server (you can freely fetch it from http://www.apache.org), but they also work with NCSA httpd. However, I cannot vouch for other servers. ;-)
There are three configuration files (which in most cases are located in /usr/local/etc/apache/conf): httpd.conf, access.conf and srm.conf. The following directives should be included (or changed if they're already there) in httpd.conf:
Let's go on with srm.conf:
The last file we need to look at is access.conf:
All right, everything should be configured by now, so we are ready to begin with some simple CGI applications. The CGI interface sets a number of environment variables which programs can access. These variables provide many types of information. Here is a summary of them; don't worry if you don't understand the purpose of some of them:
Not all servers and not all clients set all of these variables, so some of them may not always be available.
Now let's look at a simple Perl program which uses some of the variables described above:
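Because some variables may be missing, a program should read them defensively. The following Python sketch (Python is an assumption of this example, not the language of the article's listings) collects a handful of variables and substitutes a placeholder for any that are not set. The variable names are standard CGI names; `read_cgi_environment` and the `"(not set)"` placeholder are hypothetical.

```python
import os

# A few variable names defined by the CGI interface.
CGI_VARIABLES = (
    "SERVER_NAME",
    "REQUEST_METHOD",
    "QUERY_STRING",
    "REMOTE_HOST",
    "REMOTE_ADDR",
    "HTTP_USER_AGENT",
)


def read_cgi_environment(environ=None, default="(not set)"):
    """Collect CGI variables, substituting a default for missing ones."""
    if environ is None:
        environ = os.environ
    return {name: environ.get(name, default) for name in CGI_VARIABLES}


if __name__ == "__main__":
    for name, value in read_cgi_environment().items():
        print("%s = %s" % (name, value))
```

Run from a shell rather than through a Web server, this will most likely print the placeholder for every variable, which is a quick way to see which ones your server actually provides.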
You can call this CGI program from any HTML document simply by adding a link to it. For instance, let's suppose that the program is named infoclient.pl and is stored in the directory /cgi-bin. In this case, the line to add is the following:
<A HREF="/cgi-bin/infoclient.pl">Client information</A>
When called, the CGI program will create a new (dynamic) HTML page, similar to the following:
The client host name and the browser name are extracted from the environment variables and then displayed to the user. This example, though very simple, clearly shows the potential of CGI technology.
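The same idea can be sketched in Python (an assumption of this example; the article's own program is in Perl). The function name `build_page` and the wording of the generated page are hypothetical, while `REMOTE_HOST` and `HTTP_USER_AGENT` are the standard CGI variables holding the client host name and the browser name.

```python
import html
import os
import sys


def build_page(environ):
    """Build a small HTML page from the client host and browser names."""
    # Not every server sets these, so fall back to a placeholder.
    host = environ.get("REMOTE_HOST", "unknown host")
    browser = environ.get("HTTP_USER_AGENT", "unknown browser")
    return (
        "Content-type: text/html\n\n"
        "<HTML><HEAD><TITLE>Client information</TITLE></HEAD>\n"
        "<BODY>\n"
        "<P>Your host name: %s</P>\n"
        "<P>Your browser: %s</P>\n"
        "</BODY></HTML>\n"
    ) % (html.escape(host), html.escape(browser))


if __name__ == "__main__":
    sys.stdout.write(build_page(os.environ))
```

Passing the environment in as a parameter, rather than reading `os.environ` inside the function, makes the page builder easy to try out without a Web server.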
I hope I have succeeded in introducing CGI programming. In the next article I'll look at forms in depth, so don't miss the next issue of BETA. Bye!
Michele Beltrame is Webmaster of ItalPro and can be reached on the Internet through the editorial page.
Copyright © 1996 BETA Group. All rights reserved.