10 Jan 1997

CGI

The early development of the World Wide Web came at the CERN Physics Lab in Europe and the NCSA facility in the US. The first generation of Web servers was hosted on Unix machines. Although the formal standards of the Web are not biased to any single system, the convention used by a server to call a database query program was not part of the formal standards. What developed as a de facto standard, the Common Gateway Interface or CGI, was very clearly based on the Unix environment.

The web server calls a CGI program by creating a new process, much as Unix itself launches programs when the user types in a command at a terminal session. There are four major elements of the CGI linkage:

  1. The type of request (GET or POST) and some other information about the server and the requester are passed as Environment variables.
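  2. The "query string" (any data after the "?" that ends the URL) is passed in the QUERY_STRING Environment variable. This is where the FORM data is located on a METHOD=GET request. Only for an old-style index search, where the query string contains no "=", is it also passed as the command line parameters to the main program (argv).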
  3. A METHOD=POST request is accompanied by a byte count of additional data. This byte count can be determined from the CONTENT_LENGTH Environment variable. The CGI program can then read exactly that number of bytes from "standard input", which actually reads the data sent over the network with the request.
  4. Anything the CGI program writes to the standard output file is sent to the browser. Generally, the first line of output should be a Content-Type header that establishes a MIME data type of either text/plain or text/html, then there should be a blank line, and then the rest of the reply.
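
The four elements can be seen together in a minimal sketch of a CGI program, written here in Python. The function name handle_request is hypothetical, and the variable names REQUEST_METHOD, QUERY_STRING, and CONTENT_LENGTH are those of the common CGI convention rather than anything specific to one server. The sketch simply echoes the form fields back as plain text.

```python
from urllib.parse import parse_qs


def handle_request(environ, stdin, stdout):
    """Handle one CGI request (hypothetical helper for illustration)."""
    # 1. The request type arrives as an environment variable.
    method = environ.get("REQUEST_METHOD", "GET")

    if method == "POST":
        # 3. Read exactly CONTENT_LENGTH bytes from standard input.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length).decode("ascii", "replace")
    else:
        # 2. On a GET request the form data is the query string.
        raw = environ.get("QUERY_STRING", "")

    fields = parse_qs(raw)

    # 4. Write the MIME type line, a blank line, then the reply.
    lines = ["Content-Type: text/plain", ""]
    for name in sorted(fields):
        lines.append(f"{name} = {', '.join(fields[name])}")
    stdout.write(("\r\n".join(lines) + "\r\n").encode("ascii", "replace"))
```

A real script would call handle_request(os.environ, sys.stdin.buffer, sys.stdout.buffer) so that the three arguments are the process environment and its standard files. Passing them in as parameters only makes the sketch easy to exercise without a server.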

In a simpler time, with only Unix servers, the CGI interface was simple and efficient. Unix has the special feature that network sessions can be treated as ordinary files. So early generations of Web Servers could simply pass the Client session as the "standard input" and "standard output" files and the CGI program could read and write network data directly.
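
The server side of this linkage can be sketched in the same way. The function run_cgi below is a hypothetical illustration, not the code of any real server: it launches the CGI program as a new process, passes the request details as Environment variables, and feeds the POST data to standard input. Unlike the early Unix servers, it uses pipes rather than handing over the network session itself, so the program's output comes back to the server before anything is sent to the client.

```python
import os
import subprocess


def run_cgi(argv, method, query_string="", body=b""):
    """Run a CGI program as a child process and return its raw output."""
    env = dict(os.environ)
    env["GATEWAY_INTERFACE"] = "CGI/1.1"
    env["REQUEST_METHOD"] = method
    env["QUERY_STRING"] = query_string
    # The byte count is only meaningful when a request body exists.
    env.pop("CONTENT_LENGTH", None)
    if body:
        env["CONTENT_LENGTH"] = str(len(body))
    # input= becomes the child's standard input; stdout=PIPE captures
    # everything the child writes to standard output.
    result = subprocess.run(
        argv, input=body, stdout=subprocess.PIPE, env=env, check=True
    )
    return result.stdout
```

Starting a fresh process for every request is what makes the interface simple, and it is also the cost that matters once the load is no longer light.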

But HTTP got complicated. Now with proxy support and HTTP 1.1 extensions, even a Unix server may need to buffer the data generated by the CGI program to check headers and data type and to manage byte counts. CGI is also a bad design for Windows, where it has had to be modified to support programming in languages like Visual Basic.

CGI is supposed to be universal, but there are still portability problems. The Microsoft IIS Web Server, for example, requires a CGI program to generate the "200 OK" status line and all the response headers. The Netscape Server, on the other hand, will generate these headers automatically, unless the CGI program is explicitly marked to use "non-parsed headers" by the appalling convention of starting the name of the executable with the letters "nph-".
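
The "nph-" convention amounts to a test on the file name. The sketch below is a deliberate simplification with a hypothetical function name. It assumes the only thing a parsing server adds is a "200 OK" status line, whereas a real server would also examine the headers the program wrote and add some of its own.

```python
import os


def finish_response(script_path, cgi_output):
    """Return the bytes a server would send for this CGI output."""
    if os.path.basename(script_path).startswith("nph-"):
        # Non-parsed headers: the program wrote the status line and
        # every header itself, so its output is passed through as is.
        return cgi_output
    # Parsed headers: the server supplies the status line.
    return b"HTTP/1.0 200 OK\r\n" + cgi_output
```

The same program output therefore produces a different reply depending only on what the executable is called, which is why a CGI program written for one convention may not work unchanged under the other.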

Although CGI programs can be written in any language, the most popular computer language for the first generation of CGI programming was Perl. Perl was designed originally to read system output and reformat it in various types of reports. It is, therefore, ideally suited to extract data and report it in HTML format.

A number of vendors provide prepackaged general CGI programs. Usually the system administrator provides the outline of an HTML response and a SQL query to fetch data. The program issues the query and embeds the response in some sort of table. This allows people to use CGI without any real programming.
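
What such a package does can be sketched in a few lines. This is an illustration only, not the design of any vendor's product: the function name render_query and the %TABLE% placeholder are hypothetical, and the sqlite3 module stands in for whatever database the product would really connect to.

```python
import html
import sqlite3


def render_query(conn, template, sql):
    """Run sql and embed the result as an HTML table in template."""
    cursor = conn.execute(sql)
    # Column names from the query become the table headings.
    headings = [col[0] for col in cursor.description]
    rows = ["<tr>" + "".join(f"<th>{html.escape(h)}</th>" for h in headings) + "</tr>"]
    for record in cursor:
        # Escape every value so data cannot be mistaken for markup.
        cells = "".join(f"<td>{html.escape(str(v))}</td>" for v in record)
        rows.append(f"<tr>{cells}</tr>")
    table = "<table>\n" + "\n".join(rows) + "\n</table>"
    # %TABLE% is a hypothetical marker in the administrator's outline.
    return template.replace("%TABLE%", table)
```

The administrator supplies only the two strings, the HTML outline and the SQL query. Everything else is generic, which is why no real programming is needed.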

CGI will remain a universally supported option. Web to database interface products based on CGI will run well under light load on any Web Server package. Alternatives to CGI include:

  1. Proprietary programming interfaces exposed by each Web server for high performance programming in C.
  2. High level programming languages supported on the server to replace the conventional CGI interface with JavaScript (Netscape LiveWire), PL/SQL (Oracle), or Java.


Copyright 1996 PC Lube and Tune -- Distributed Applications and the Web H. Gilbert