WebSite
Windows CGI 1.3a Interface
Version of 18-Feb-96 (See section below for changes)
Status of this Document
This document is an informal specification. It is intended to be used
by both CGI programmers and server implementers. It is not intended
for this specification to enter the Internet standards track, as it is
platform-specific to Microsoft Windows 95 and Windows NT.
Version 1.3a documents the Authenticated Password variable, which is
"current practice", and a new variable Document Root which was
recently added. In addition, the description of the authentication
variables was changed to reflect current practice. They are passed
through the CGI interface whether or not the server used them for
authentication. This "pass-through authentication" is used by many
existing Windows CGI applications. Finally, some additional notes were
added describing current practice for servers that support
multi-homing with separate logical (URL) path spaces.
_________________________________________________________________
Table of Contents
* Overview
* I/O Spooling
* HTML Form Data Decoding
* Launching the CGI program
+ Command Line
+ Launch Method
+ Document Associations
* CGI Data File
+ [CGI] Section
+ [Accept] Section
+ [System] Section
+ [Extra Headers] Section
+ [Form Literal] Section
+ [Form External] Section
+ [Form File] Section
+ [Form Huge] Section
* Example of Form Decoding
* CGI Results Processing
_________________________________________________________________
Overview
A large class of World Wide Web applications are best implemented
using external programs that are controlled by a web server. Examples
include front-ends to business applications which are themselves
subject to frequent changes in business rules. The broad acceptance of
rapid-application development (RAD) tools such as Visual Basic and
Delphi has given rise to the need to use these tools to Web-enable
many kinds of business applications. The widely used Common Gateway
Interface (CGI) uses techniques well suited to the Unix environment. A
different sort of interface is needed to support common Windows RAD
tools for CGI. It is the purpose of this specification to define such
an interface.
I/O Spooling
A key feature of Windows CGI is its spooled exchange of data between
the server and the CGI program. It is essential that the server
provide efficient transfer of data between the spool files and the
network. This means that the server should use memory-mapped
techniques, and minimize the number of separate network I/O requests
used.
The reasons for using spooled I/O are:
* Most RAD packages do not have native network (socket) I/O
capabilities.
* Socket I/O techniques are relatively exotic, and efficient results
require a thorough knowledge of the Win32 network interface. All
input and output would require complex buffering to achieve
acceptable network efficiency.
* Sockets cannot be inherited by a 16-bit program.
* Spooled input (e.g. POST content) can be memory mapped and thus
processed far more efficiently than is possible using
stream-oriented techniques.
* A reference set of spool files may be used for regression testing
and debugging in the RAD development environment.
* Spool files may be retained after a CGI program runs, for
"post-mortem" analysis, also using the RAD environment.
HTML Form Data Decoding
Windows CGI requires that the web server decode HTML form data if
present in a POST request. It is not required that the server decode
form data if it appears in the "query string" portion of a request
URL.
There are two ways in which form data may be sent by a browser
to the server:
URL-Encoded
This is the most common form data format. The contents of form
fields are "escaped" according to the rules in the HTML 1.0
Specification, then concatenated using unescaped ampersand
characters. This URL-encoded data is sent as a stream to the
server, with a content type of
application/x-www-form-urlencoded.
Multipart Form Data
This format has been introduced to permit efficient file
uploading with forms. It may be used without explicitly
including a file upload form field, however. The contents of
the form fields are sent as a MIME multipart message. Each
field is contained within a single part. The content type
indicated by the browser is multipart/form-data.
Compliant servers must decode both form data types.
Launching the CGI program
The server uses the CreateProcess() service to launch the CGI program.
The server maintains synchronization with the CGI program so it can
detect when the CGI program exits. This is done using the Win32
WaitForSingleObject() service, waiting for the CGI process handle to
become signalled, indicating program exit. The server must never use a
shell to execute the CGI program, because doing so can create serious
security risks.
NOTE: The CGI program's process handle becomes signalled before the
process rundown is complete. Reliance on rundown to close files,
inherited handles, etc., can cause obscure synchronization problems.
Command Line
The server must execute a CGI program request by doing a
CreateProcess() with a command line in the following form:
WinCGI-exe cgi-data-file
WinCGI-exe
The complete path to the CGI program executable. The server
does not depend on the "current directory" or the PATH
environment variable. Note that the "executable" need not be a
.EXE file. It may be a document, provided an "association" with
a corresponding executable has been established.
cgi-data-file
The complete path to the CGI data file.
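As a rough illustration of the command-line form above, the following
Python sketch builds the two-element argument list that a server might
hand to a process-creation call without going through a shell. The
function name build_launch_args and the example paths are hypothetical
and are not part of this specification. The ntpath module is used only
so that Windows-style paths are checked the same way on any platform.

```python
import ntpath

def build_launch_args(cgi_exe, cgi_data_file):
    """Return [WinCGI-exe, cgi-data-file] for a shell-free process launch.

    Both arguments must be complete Windows paths, since the server does not
    depend on the current directory or the PATH environment variable.
    """
    for path in (cgi_exe, cgi_data_file):
        if not ntpath.isabs(path):
            raise ValueError("complete path required: %r" % path)
    return [cgi_exe, cgi_data_file]

# Hypothetical paths, for illustration only.
args = build_launch_args(r"C:\WebSite\cgi-win\app.exe", r"C:\TEMP\HS19AF6C.INI")
```

Passing the arguments as a list, rather than as one string interpreted
by a command shell, is one way to respect the rule that a shell must
never be used.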
Launch Method
The server issues the CreateProcess() such that the process being
launched has its main window hidden. The launched process itself
should not cause the appearance of a window nor a change in the
Z-order of the windows on the desktop. The server supports a CGI
program/script debugging mode. If that mode is enabled, the CGI
program is launched such that its window shows and is made active.
This can assist in debugging CGI applications.
Document Associations
The server must honor document associations. If the target of a
Windows CGI request is a document (not an executable), the server must
attempt to find the associated application for the document and launch
the application such that the document is "processed".
_________________________________________________________________
The CGI Data File
The server passes data to the CGI program via a Windows "private
profile" file, in key-value format. The CGI program may then use the
standard Windows API services for enumerating and retrieving the
key-value pairs in the data file.
The CGI data file contains the following sections:
* [CGI]
* [Accept]
* [System]
* [Extra Headers]
* [Form Literal]
* [Form External]
* [Form File]
* [Form Huge]
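For testing outside a Windows RAD environment, the data file can also
be read with a generic INI parser. The sketch below uses Python's
configparser as a stand-in for the Windows private profile services.
The sample contents and the helper name load_cgi_data are assumptions
made for illustration, and the parser's handling of unusual keys may
differ from the Windows API.

```python
import configparser

# Made-up sample of a CGI data file; real files are written by the server.
SAMPLE = """\
[CGI]
Request Method=POST
Content Type=application/x-www-form-urlencoded
Server Name=www.example.com
Server Port=80

[System]
GMT Offset=-28800
Debug Mode=No
"""

def load_cgi_data(text):
    """Parse CGI data file text into {section: {key: value}}."""
    # Only "=" separates key from value; duplicate keys are tolerated.
    parser = configparser.RawConfigParser(delimiters=("=",), strict=False)
    parser.optionxform = str  # keep keys such as "Request Method" as written
    parser.read_string(text)
    return {name: dict(parser.items(name)) for name in parser.sections()}

data = load_cgi_data(SAMPLE)
# A missing key is treated here as an empty string.
method = data["CGI"].get("Request Method", "")
query = data["CGI"].get("Query String", "")
```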
The [CGI] Section
This section contains most of the CGI data items (accept types,
content, and extra headers are defined in separate sections). Each
item is provided as a string value. If the value is an empty string,
the keyword is omitted. The keywords are listed below:
Request Protocol
The name and revision of the information protocol this request
came in with. Format: protocol/revision. Example: "HTTP/1.0".
Request Method
The method with which the request was made. For HTTP, this is
"GET", "HEAD", "POST", etc.
Executable Path
The logical path to the CGI program executable, as needed for
self-referencing URLs. This may vary if the server supports
multi-homing with separate logical path spaces. The server must
provide the physical path equivalent using the logical to
physical mapping for the identity on which the current request
was received.
Document Root
The physical path to the logical root "/". This may vary if the
server supports multi-homing with separate logical path spaces.
The server must provide the physical path to the logical root
for the identity on which the current request was received.
Logical Path
A request may specify a path to a resource needed to complete
that request. This path may be in a logical pathname space.
This item contains the pathname exactly as received by the
server, without logical-to-physical translation.
Physical Path
If the request contained logical path information, the server
provides the path in physical form, in the native object (e.g.,
file) access syntax of the operating system. This may vary if
the server supports multi-homing with separate logical path
spaces. The server must provide the physical path equivalent
using the logical to physical mapping for the identity on which
the current request was received.
Query String
The information which follows the ? in the URL that generated
the request is the "query" information. The server furnishes
this to the back end whenever it is present on the request URL,
without any decoding or translation.
Request Range
Byte-range specification received with request (if any). See
the current Internet Draft (or RFC) describing the byte-range
extension to HTTP for more information. The server must support
CGI program participation in byte-ranging to be compliant with
this Specification.
Referer
The URL of the document that contained the link pointing to
this CGI program. Note that some browsers implement this
incorrectly, so it cannot be relied on.
From
The e-mail address of the browser user. Note that this is in
the HTTP specification but is not implemented in some browsers
due to privacy concerns.
User Agent
A string description of the client (browser) software. Not
generated by all browsers.
Content Type
For requests which have attached data this is the MIME content
type of that data. Format: type/subtype.
Content Length
For requests which have attached data, this is the length of
the content in bytes.
Content File
For requests which have attached data, the server makes the
data available to the CGI program by putting it into this file.
The value of this item is the complete pathname of that file.
Server Software
The name and version of the information server software
answering the request (and running the CGI program). Format:
name/version.
Server Name
The network host name or alias of the server, as needed for
self-referencing URLs. This (in combination with the
Server Port) could be used to manufacture a full URL to the
server, for URL fixups. This may vary if the server supports
multi-homing. The value of this item must be the host name on
which the current request was received.
Server Port
The network port number on which the server is listening. This
is also needed for self-referencing URLs.
Server Admin
The e-mail address of the server's administrator. This is used
in error messages, and might be used to send MAPI mail to the
administrator, or to form "mailto:" URLs in generated
documents.
CGI Version
The revision of the CGI specification to which this server
complies. Format: CGI/revision. For this version, "CGI/1.3
(Win)".
Remote Host
The network host name of the client (requestor) system, if
available. This item is used for logging.
Remote Address
The network (IP) address of the client (requestor) system. This
item is used for logging if the host name is not available.
Authentication Method
The protocol-specific authentication method specified in the
request. If present, this is normally Basic. The server must
provide this whether or not it was used by the server for
authentication.
Authentication Realm
The method-specific authentication realm specified in the
request. If present in the request, the server must provide
this whether or not it was used by the server for
authentication.
Authenticated Username
The username (in the indicated realm) that the client used to
attempt authentication, as specified in the request. If present
in the request, the server must provide this whether or not it
was used by the server for authentication.
Authenticated Password
The password that the client used to attempt authentication, as
specified in the request. If present in the request, the server
must provide this whether or not it was used by the server for
authentication.
NOTE - Current practice on the O'Reilly WebSite servers requires
that the CGI program's name begin with a dollar sign ($) to
have the password supplied through the CGI interface. This is
not required by this specification. It is recommended, however,
as it forces the CGI programmer to do something special to have
the password info exported from within the server's internal
environment.
The [Accept] Section
This section contains the client's acceptable data types found in the
request header as
Accept: type/subtype {parameters}
If the parameters (e.g., "q=0.100") are present, they are passed as
the value of the item. If there are no parameters, the value is "Yes".
Note: The accept types may easily be enumerated by the CGI program
with a call to GetPrivateProfileString() with NULL for the key name.
This returns all of the keys in the section as a null-delimited string
with a double-null terminator.
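The following Python sketch shows one way a CGI program might
interpret the values in this section once they have been read. The
function name accept_quality and the sample entries are hypothetical.
Treating a missing q parameter as 1.0 follows common HTTP practice and
is an assumption, not something this specification states.

```python
def accept_quality(value):
    """Return the q factor for an [Accept] value ("Yes" or a parameter string)."""
    if value.strip().lower() == "yes":
        return 1.0  # no parameters were sent with this type
    for param in value.split(";"):
        name, _, number = param.strip().partition("=")
        if name.strip().lower() == "q":
            return float(number)
    return 1.0  # parameters present, but no q among them (assumed default)

# Hypothetical [Accept] section contents.
accept = {"text/html": "Yes", "image/gif": "q=0.100", "*/*": "q=0.5; mxb=1000"}
ranked = sorted(accept, key=lambda t: accept_quality(accept[t]), reverse=True)
```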
The [System] Section
This section contains items that are specific to the Windows
implementation of CGI. The following keys are used:
GMT Offset
The number of seconds to be added to GMT time to reach local
time. For Pacific Standard Time, this number is -28,800. Useful
for computing GMT times.
Debug Mode
This is No unless the server's "CGI/script tracing" mode is
enabled, then it is Yes. Useful for providing conditional
tracing within the CGI program.
Output File
The full path/name of the file in which the server expects to
receive the CGI program's results.
Content File
The full path/name of the file that contains the content (if
any) that came with the request.
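As a small worked example of the GMT Offset item, the Python sketch
below converts a local time to GMT. Because the offset is defined as
the number of seconds added to GMT to reach local time, GMT is local
time minus the offset. The function name local_to_gmt and the sample
date are illustrative only.

```python
from datetime import datetime, timedelta

def local_to_gmt(local_time, gmt_offset_seconds):
    """Convert a naive local time to GMT using the [System] GMT Offset value.

    The offset is defined as local = GMT + offset, so GMT = local - offset.
    """
    return local_time - timedelta(seconds=gmt_offset_seconds)

# Noon Pacific Standard Time, using the offset quoted above (-28,800 seconds).
gmt = local_to_gmt(datetime(1996, 2, 18, 12, 0, 0), -28800)
```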
The [Extra Headers] Section
This section contains the "extra" headers that were included with the
request, in "key=value" form. The server must URL-unescape both the
key and the value prior to writing them to the CGI data file.
Note: The extra headers may easily be enumerated by the CGI program
with a call to GetPrivateProfileString() with NULL for the key name.
This returns all of the keys in the section as a null-delimited string
with a double-null terminator.
The [Form Literal] Section
If the request is an HTTP POST from an HTTP form (with content type of
application/x-www-form-urlencoded or multipart/form-data), the server
will decode the form data and put it into the [Form Literal] section.
For URL-encoded form data, raw form input is of the form
"key=value&key=value&...", with the value parts in url-encoded format.
The server splits the key=value pairs at the '&', then splits the key
and value at the '=', url-decodes the value string, and puts the
result into key=(decoded)value form in the [Form Literal] section.
For multipart form data, raw form input is in a MIME-style multipart
format, with each field in a separate part. The server extracts the
field name and value from each part and puts the result into key=value
form in the [Form Literal] section.
If the form contains any SELECT MULTIPLE elements, there will be
multiple occurrences of the same key. In this case, the server
generates a normal "key=value" pair for the first occurrence, and it
appends a sequence number to subsequent occurrences. It is up to the
CGI program to know about this possibility and to properly recognize
the tagged keys.
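The URL-encoded decoding steps described above can be sketched in
Python as follows. This is a minimal illustration rather than a
server implementation. The function name decode_urlencoded is
hypothetical, the "_1", "_2" suffix format is inferred from the
sample data in the Example of Form Decoding section, and treating "+"
as an encoded space is an assumption based on common form encoding.

```python
from urllib.parse import unquote_plus

def decode_urlencoded(raw):
    """Split raw form input into (key, value) pairs, tagging repeated keys."""
    seen = {}
    pairs = []
    for item in raw.split("&"):          # split the pairs at '&'
        if not item:
            continue
        key, _, value = item.partition("=")  # split key and value at '='
        count = seen.get(key, 0)
        seen[key] = count + 1
        # The first occurrence keeps the plain key; later ones get _1, _2, ...
        # (suffix format assumed from the sample data).
        tagged = key if count == 0 else "%s_%d" % (key, count)
        pairs.append((tagged, unquote_plus(value)))  # url-decode the value
    return pairs

raw = ("smallfield=123+Main+St.+%23122"
       "&multiple=first+selection&multiple=second+selection")
fields = decode_urlencoded(raw)
```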
The [Form External] Section
If the decoded value string is more than 254 characters long, or if
the decoded value string contains any control characters or
double-quotes, the server puts the decoded value into an external
tempfile and lists the field in the [Form External] section as:
key=pathname length
where pathname is the path and name of the tempfile containing the
decoded value string, and length is the length in bytes of the decoded
value string.
Note: Be sure to open this file in binary mode unless you are certain
that the form data is text!
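The rule for choosing between [Form Literal] and [Form External], and
the layout of an external entry, can be sketched in Python as shown
below. Both function names are hypothetical. "Control characters" is
taken here to mean character codes below 32, which is an assumption.

```python
def needs_external(decoded):
    """True if a decoded value belongs in [Form External], not [Form Literal]."""
    # Assumed definition of control characters: codes below 32 (CR, LF, TAB...).
    has_special = any(ord(ch) < 32 or ch == '"' for ch in decoded)
    return len(decoded) > 254 or has_special

def parse_external_entry(value):
    """Split a 'pathname length' value; the length is the last token."""
    pathname, length = value.rsplit(None, 1)
    return pathname, int(length)

path, size = parse_external_entry(r"C:\TEMP\HS19AF6C.000 300")
```

Splitting from the right keeps the pathname intact even if it contains
spaces.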
The [Form Huge] Section
If the raw value string is more than 65,535 bytes long, the server
does no decoding, but it does get the keyword and mark the location
and size of the value in the Content File. The server lists the huge
field in the [Form Huge] section as:
key=offset length
where offset is the offset from the beginning of the Content File at
which the raw value string for this key is located, and length is the
length in bytes of the raw value string. You can use the offset to
perform a "Seek" to the start of the raw value string, and use the
length to know when you have read the entire raw string into your
decoder.
Note: Be sure to open this file in binary mode unless you are certain
that the form data is text!
The [Form File] Section
If the request is in the multipart/form-data format, it may contain
one or more file uploads. In this case, each file upload is placed
into an external tempfile similar to the form external data. Each such
file upload is listed in the [Form File] section as:
key=[pathname] length type xfer [filename]
where pathname is the pathname of the external tempfile containing the
uploaded file, length is the length in bytes of the uploaded file,
type is the MIME content type of the uploaded file, xfer is the
content-transfer encoding of the uploaded file, and filename is the
original name of the uploaded file. The square brackets must be
included. They are used to delimit the file and pathnames, which may
contain spaces.
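The bracketed layout can be taken apart with a regular expression, as
in the Python sketch below. The function name parse_form_file_entry
and the sample entry are made up for illustration. The pattern assumes
that the type and xfer fields contain no spaces.

```python
import re

# Layout: [pathname] length type xfer [filename]
_FORM_FILE = re.compile(r"^\[(.*?)\]\s+(\d+)\s+(\S+)\s+(\S+)\s+\[(.*)\]$")

def parse_form_file_entry(value):
    """Split a [Form File] value into its five fields."""
    match = _FORM_FILE.match(value.strip())
    if match is None:
        raise ValueError("not a [Form File] entry: %r" % value)
    pathname, length, mime_type, xfer, filename = match.groups()
    return {"pathname": pathname, "length": int(length), "type": mime_type,
            "xfer": xfer, "filename": filename}

# Hypothetical entry; the paths and names are invented for this example.
info = parse_form_file_entry(
    r"[C:\TEMP\HS19AF6C.003] 2048 text/plain binary [C:\My Documents\notes.txt]")
```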
Example of Form Decoding
In the following sample, the form contained a small field, a SELECT
MULTIPLE with 2 small selections, a field with 300 characters in it,
one with line breaks (a text area), and a 230KB field.
[Form Literal]
smallfield=123 Main St. #122
multiple=first selection
multiple_1=second selection
[Form External]
field300chars=C:\TEMP\HS19AF6C.000 300
fieldwithlinebreaks=C:\TEMP\HS19AF6C.001 43
[Form Huge]
field230K=C:\TEMP\HS19AF6C.002 276920
_________________________________________________________________
Results Processing
The CGI program returns its results to the server as a data stream
representing (directly or indirectly) the goal of the request. The
server is responsible for "packaging" the data stream according to
HTTP, and for using HTTP to transport the data stream to the
requesting client. This means that the server normally adds the needed
HTTP headers to the CGI program's results.
The data stream consists of two parts: the header and the body. The
header consists of one or more lines of text, and is separated from
the body by a blank line. The body contains MIME-conforming data whose
content type must be reflected in the header.
The server does not interpret or modify the body in any way. It is
essential that the client receive exactly the data that was generated
by the back end.
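A CGI program might assemble its results along the lines of this
Python sketch: a header line, a blank line, then the body written
byte for byte. The function names are hypothetical, and the use of
CR/LF line endings in the header is an assumption, since this
specification does not state a line terminator.

```python
def build_result(content_type, body):
    """Return the bytes a CGI program would write to its Output File."""
    # One header line, then a blank line separating header from body.
    header = "Content-Type: %s\r\n\r\n" % content_type
    return header.encode("ascii") + body

def write_result(output_file, content_type, body):
    """Write the result stream to the path given by the Output File item."""
    with open(output_file, "wb") as handle:  # binary, so the body is untouched
        handle.write(build_result(content_type, body))

payload = build_result("text/html", b"<H1>Sample Document</H1>\r\n")
```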
Special Header Lines
The server recognizes the following header lines in the results data
stream:
Content-Type:
Indicates that the body contains data of the specified MIME
content type. The value must be a MIME content type/subtype.
URI: (value enclosed in angle brackets)
The value is either a full URL or a local file reference,
either of which points to an object to be returned to the
client in lieu of the body (which the server shall ignore in
this type of result). If the value is a local file, the server
sends it as the results of the request, as though the client
issued a GET for that object. If the value is a full URL, the
server returns a "302 redirect" to the client to retrieve the
specified object directly.
Location:
Same as URI, but this form is now deprecated. The value must not be
enclosed in angle brackets with this form.
Other Headers
Any other headers in the result stream are passed (unmodified) by the
server to the client. It is the responsibility of the CGI program to
avoid including headers that clash with those used by HTTP.
_________________________________________________________________
Direct Return
The server provides for the back end to return its results directly to
the client, bypassing the server's "packaging" of the data stream for
its information protocol. In this case, it is the responsibility of
the CGI program to generate a complete message packaged for HTTP.
The server looks at the results in the Output file, and if the first
line starts with "HTTP/1.0", it assumes that the results contain a
complete HTTP response, and sends the results to the client without
packaging.
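The test described above might be sketched in Python as follows. The
function name is_direct_return and the two sample result streams are
illustrative only.

```python
def is_direct_return(output_bytes):
    """True if the first line of the Output File starts with "HTTP/1.0"."""
    first_line = output_bytes.split(b"\n", 1)[0]
    return first_line.startswith(b"HTTP/1.0")

# A result that needs server packaging, and one that is already complete.
packaged = b"Content-type: text/html\r\n\r\n<P>hello</P>"
direct = b"HTTP/1.0 200 OK\r\nContent-type: text/html\r\n\r\n<P>hello</P>"
```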
_________________________________________________________________
Examples:
* The following example represents a response made by a CGI program
that was invoked by an HTTP server, and consists of an
HTML-formatted body:
--- BEGIN ---
Content-type: text/html <== MIME type of body
<== Header-body separator
<== Body starts here
Sample Document
Sample Document
[... etc.]
--- END ---
* This example represents a redirection response, where the server
is to direct the client to fetch the object indicated by the URL
(using FTP):
--- BEGIN ---
Location: ftp://ftp.netcom.com/pub/www/object.dat <== URL of object
<== Blank line
--- END ---
* This example represents a direct-return response from a CGI
program that was invoked by an HTTP server, where the results
contain a complete HTTP response:
--- BEGIN ---
HTTP/1.0 200 OK <== Start of HTTP Header
Date: Tuesday, 31-May-94 19:04:30 GMT
Server: WebSite 1.0
Content-type: text/html
Last-modified: Sunday, 15-May-94 02:12:32 GMT
Content-length: 4109
<== Header-body separator
A document
[... etc.]
--- END ---
_________________________________________________________________
Robert B. Denny