Simple HTTP server

Author: Adrian Perez <aperez@igalia.com>
License: GPL3
Copyright: 2008-2009 Igalia S.L.

Abstract

Provides a not-so-basic standalone web server. Socket functionality is not provided: requests are read from standard input, responses are sent to standard output, and logging is done to the standard error stream.

Running a webserver

Fortunately, dealing with sockets can be done in pure Bash [1]. Unfortunately, some distributors choose not to enable support for this feature; this is the case for the Bash builds included with Debian and Ubuntu.

[1] http://unixjunkie.blogspot.com/2006/01/two-cool-bash-tricks.html

Using auxiliary tools

Some utilities allow nearly any program that uses the standard input and output streams to communicate over sockets. It is likely that at least one of them is available for your operating system of choice. All the examples below run a simple web server listening on port 8000 at 127.0.0.1 (the loopback network address). Once you have started one of these command lines, point a browser to http://localhost:8000/doc and you will be able to browse the Bill documentation using the simple built-in web server.

Package    Command
ipsvd      tcpsvd 127.0.0.1 8000 ./scripts/bill lib/www/http.bash
ucspi-tcp  tcpserver -q 127.0.0.1 8000 ./scripts/bill lib/www/http.bash
netcat     (while true ; do nc -l -p 8000 -c './scripts/bill lib/www/http.bash'; done)
netpipes   faucet 8000 -H 127.0.0.1 -i -o ./scripts/bill lib/www/http.bash

Note

Commands should be run from the top-level Bill source directory, otherwise they may not work.

Warning

The netcat command is very flaky: it will not accept concurrent connections and will perform poorly in general.
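
Because the server reads the request from standard input and writes the response to standard output, it can in principle be exercised without any socket helper at all. The following is a minimal sketch, not part of Bill: raw_get_request is a hypothetical helper, and the commented pipeline assumes the same invocation used in the command table above.

```shell
# Hypothetical helper (not part of Bill): print a minimal HTTP/1.0 GET
# request for the path given as first argument.
raw_get_request () {
    printf 'GET %s HTTP/1.0\r\nHost: localhost\r\n\r\n' "$1"
}

# Assumed usage, from the top-level Bill source directory:
#   raw_get_request /doc/ | ./scripts/bill lib/www/http.bash
```

The response, if any, would appear on standard output and the log line on standard error.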

Functions

http_response_by_code

http_response_by_code code

Obtains the descriptive text for the given HTTP response status code. The strings returned are those specified in the code list contained in RFC 2068.
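
As an illustration of the idea only (this is not Bill's implementation), a lookup of this kind can be sketched with a case statement. The name sketch_response_by_code is hypothetical, and only a handful of the RFC 2068 codes are listed.

```shell
# Illustrative stand-in, not Bill's implementation: map a status code to
# its RFC 2068 reason phrase. Only a few codes are covered here.
sketch_response_by_code () {
    case $1 in
        200) echo 'OK' ;;
        301) echo 'Moved Permanently' ;;
        302) echo 'Moved Temporarily' ;;
        400) echo 'Bad Request' ;;
        404) echo 'Not Found' ;;
        500) echo 'Internal Server Error' ;;
        *)   return 1 ;;   # unknown code: print nothing and fail
    esac
}
```

Calling sketch_response_by_code 404 prints "Not Found".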

http_mimetype_guess

http_mimetype_guess path

Determines the MIME type of a file using the entries of a hash map named http_mime_mapping, whose default set of entries suffices for serving the Bill documentation. The suffix of the file pointed to by path determines its content type.

For example:

(bill) http_mimetype_guess /etc/sysctl.conf
text/plain
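
Suffix-based guessing of this kind can be sketched as follows. This is a hedged illustration, not Bill's code: a case statement stands in for the http_mime_mapping hash map, the name sketch_mimetype_guess is hypothetical, and both the listed suffixes and the fallback type are assumptions.

```shell
# Illustrative stand-in for a suffix -> MIME type table. The entries and
# the fallback type are assumptions, not Bill's actual mapping.
sketch_mimetype_guess () {
    # ${1##*.} keeps only the text after the last dot in the path.
    case ${1##*.} in
        html|htm) echo 'text/html' ;;
        css)      echo 'text/css' ;;
        txt|conf) echo 'text/plain' ;;
        png)      echo 'image/png' ;;
        *)        echo 'application/octet-stream' ;;
    esac
}
```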

http_error_document

http_error_document [ code [ description ... ] ]

Generates an error document on standard output:

  • code is the HTTP error code (default is 500)
  • description is an arbitrary piece of text which will be inserted pre-formatted in the output.
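
A generator with the same argument convention might look like the sketch below. The markup is an assumption (the real document produced by Bill is not described here), sketch_error_document is a hypothetical name, and the description is inserted without HTML escaping.

```shell
# Illustrative only: emit a small HTML error page on standard output.
# The code defaults to 500; remaining arguments form the description.
sketch_error_document () {
    local code="${1:-500}"
    if [ $# -gt 0 ]; then shift; fi
    printf '<html><head><title>Error %s</title></head>\n' "$code"
    printf '<body><h1>Error %s</h1>\n' "$code"
    if [ $# -gt 0 ]; then
        # "$*" joins all description words with single spaces.
        printf '<pre>%s</pre>\n' "$*"
    fi
    printf '</body></html>\n'
}
```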

http_header

http_header name value

Sends an HTTP header.

http_header_start

http_header_start [ code [ description ] ]

http_date_now

http_date_now

Returns the current GMT time in RFC 1123 format, as needed for response HTTP headers.
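
An RFC 1123 date can be produced with the standard date utility, as in this sketch. The function name sketch_date_now is hypothetical; LC_ALL=C is set so that day and month names come out in English regardless of the user's locale.

```shell
# Print the current time in RFC 1123 format,
# in the style "Sun, 06 Nov 1994 08:49:37 GMT".
sketch_date_now () {
    LC_ALL=C date -u '+%a, %d %b %Y %H:%M:%S GMT'
}
```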

http_body

http_body

Prepares the HTTP connection to send the content body.
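
On the wire, an HTTP response consists of a status line, zero or more header lines, and an empty line that separates the headers from the body. The sketch below shows that framing with hypothetical helper names; how Bill's own http_header_start, http_header and http_body map onto these steps is an assumption.

```shell
# Hypothetical helpers showing HTTP response framing (lines end in CRLF).
sketch_status_line () { printf 'HTTP/1.0 %s %s\r\n' "$1" "$2"; }
sketch_header ()      { printf '%s: %s\r\n' "$1" "$2"; }
sketch_body ()        { printf '\r\n'; }   # empty line: headers end here

# A complete plain-text response built from the pieces above.
sketch_hello_response () {
    sketch_status_line 200 OK
    sketch_header Content-Type text/plain
    sketch_body
    printf 'Hello\n'
}
```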

http_error

http_error [ code [ description ... ] ]

Sends an HTTP error status code and an accompanying HTML document explaining the error. The error document is formatted using http_error_document.

http_redirect

http_redirect uri [ code ]

Sends an HTTP redirection to the given uri. If not supplied, the code will be 302 (a temporary redirect). You can pass 301 for permanent redirects. Keep in mind that using absolute paths in redirects is recommended.
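
A redirect response is just a status line plus a Location header. The following sketch uses the RFC 2068 reason phrases, where 301 is "Moved Permanently" and 302 is "Moved Temporarily". The name sketch_redirect is hypothetical, and the output is not claimed to match what http_redirect sends byte for byte.

```shell
# Illustrative only: emit a bare redirect response for the given URI.
# The status code defaults to 302 when the second argument is omitted.
sketch_redirect () {
    local uri="$1" code="${2:-302}" reason
    case $code in
        301) reason='Moved Permanently' ;;
        302) reason='Moved Temporarily' ;;
        *)   return 1 ;;   # this sketch handles only 301 and 302
    esac
    printf 'HTTP/1.0 %s %s\r\n' "$code" "$reason"
    printf 'Location: %s\r\n\r\n' "$uri"
}
```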

http_log_clf

http_log_clf [ status ]

Logs a line of output to standard error in CLF format:

host ident authuser date request status bytes

Note that the “ident” field will always be empty. The “host” field will be empty when the remote IP address cannot be guessed. The “bytes” field will be empty as well unless the Content-Length is set with http_header.
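
For reference, a CLF line can be assembled with printf as sketched below. This is an illustration under stated assumptions: sketch_log_clf and its positional arguments are hypothetical, and a "-" is written for every missing field, which is the usual CLF convention but not necessarily how Bill renders empty fields.

```shell
# Illustrative only: write one Common Log Format line to standard error.
# Arguments: host, request line, status, bytes. Missing or empty fields
# are logged as "-". The ident and authuser fields are always "-" here.
sketch_log_clf () {
    local host="${1:--}" request="$2" status="${3:--}" bytes="${4:--}"
    local stamp
    stamp=$(LC_ALL=C date '+%d/%b/%Y:%H:%M:%S %z')
    printf '%s - - [%s] "%s" %s %s\n' \
        "$host" "$stamp" "$request" "$status" "$bytes" >&2
}
```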

http_handle_GET

http_handle_GET

Default handler for the HTTP GET method. It does nothing more than serve static files and produce directory listings. If the path resolves to a directory which contains a file named index.html, that file is served instead of a listing.
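
The decision described above can be sketched as a small function that only reports what would be served. The name sketch_resolve_get, its two arguments (a document root and a request path) and its output format are all hypothetical; the sketch does not guard against ".." in the path.

```shell
# Illustrative only: decide what a GET for PATH under DOCROOT would
# serve. Prints "file <path>", "listing <dir>" or "error 404".
sketch_resolve_get () {
    local target="$1$2"
    target=${target%/}                 # drop a single trailing slash
    if [ -d "$target" ]; then
        if [ -f "$target/index.html" ]; then
            echo "file $target/index.html"
        else
            echo "listing $target"
        fi
    elif [ -f "$target" ]; then
        echo "file $target"
    else
        echo "error 404"
        return 1
    fi
}
```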

http_handle_request

http_handle_request [ handler_prefix ]

Serves a single HTTP request. The following variables are set by the function, and their values will be overwritten if already defined. Note that most of them either start with the HTTP_ prefix or share the names of variables used by the CGI interface:

  • REQUEST_METHOD
  • PATH_INFO
  • QUERY_STRING
  • ...and so on.
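
How such variables can be derived from the request line is sketched below. This is not Bill's parser: sketch_parse_request_line is a hypothetical name, and the assumption is that PATH_INFO holds the part of the request URI before the first "?" and QUERY_STRING the part after it.

```shell
# Illustrative only: read one request line from standard input and set
# REQUEST_METHOD, PATH_INFO and QUERY_STRING from it.
sketch_parse_request_line () {
    local line uri
    IFS= read -r line || return 1
    line=$(printf '%s' "$line" | tr -d '\r')   # drop the trailing CR
    REQUEST_METHOD=${line%% *}                 # first word of the line
    uri=${line#* }                             # remove the method...
    uri=${uri%% *}                             # ...and the HTTP version
    PATH_INFO=${uri%%\?*}                      # text before the first "?"
    case $uri in
        *\?*) QUERY_STRING=${uri#*\?} ;;       # text after the first "?"
        *)    QUERY_STRING='' ;;
    esac
}
```

Feeding it the line "GET /doc/index.html?x=1 HTTP/1.0" leaves GET in REQUEST_METHOD, /doc/index.html in PATH_INFO and x=1 in QUERY_STRING.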

The handler_prefix can be used to change how requests are served. It is used to find the functions which serve the different HTTP methods. As an example, one could define:

my_handler_GET () {
    # Do something interesting...
}
my_handler_HEAD () {
    # ...and something *even* more interesting.
}

and then use my_handler as the prefix; the HTTP request handler will then pass GET requests to my_handler_GET and HEAD requests to my_handler_HEAD. This makes the HTTP module easy to reuse.
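
The prefix-based lookup can be sketched as follows. This is an illustration of the dispatch idea rather than Bill's code: sketch_dispatch is a hypothetical name, and the handling of a missing handler is an assumption. The check uses type, which also matches builtins and external commands; in Bash, declare -F would restrict the lookup to functions.

```shell
# Illustrative only: call <prefix>_<method> if such a command exists.
sketch_dispatch () {
    local handler="${1}_${2}"
    if type "$handler" >/dev/null 2>&1; then
        "$handler"
    else
        echo "no handler for method $2" >&2
        return 1
    fi
}

# Example handlers following the naming scheme described above.
my_handler_GET ()  { echo 'handling GET'; }
my_handler_HEAD () { echo 'handling HEAD'; }
```

With these definitions, sketch_dispatch my_handler GET runs my_handler_GET, while an unhandled method such as POST fails with a message on standard error.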