This is the main documentation of v-Octopus. It covers usage on production machines only. Other topics can be found as follows:
- If you want to see how to build vOctopus from scratch, just go to the BuildingVoctopus page;
- If you are a developer interested in understanding the architecture:
- a UML class diagram is provided on the ArchitecturalUMLDesign wiki page, along with descriptions of the main interfaces;
- the list of current open issues is located in the Issues section;
- the complete source revisions are located in the Sources section.
- If you want to learn more about the usage of Aspect-Oriented Programming in v-Octopus to implement cross-cutting concerns, just go to the VOctopusAspects wiki page.
Requirements
Installing and Test Driving
- Get the latest stable version at the Downloads section;
- Unzip the contents of the file to your desired local location;
Ex: /home/username
- Move the htdocs folder to your drive; this is the server's document root folder;
- To change the path to the document root directory, see the Modifying Server Settings section;
- Go to the directory where you unzipped the contents;
- In a command prompt window, type:
- Linux/Unix, Mac: ./vocotpus-start.sh
- Windows: cmd ./vocotpus-start.bat
At this point, you should see the message
V-Octopus Web Serving running, waiting connections on localhost:1025
where 'localhost' and '1025' are, respectively, the 'ServerName' and 'Listen' configuration values. You can change these values by following the Modifying Server Settings section.
- To see a test page, open a browser window and go to http://localhost:1025
After the installation procedure, verify that you have a list of the following directories:
- bin: where all the executables are located;
- cgi-bin: default directory for the cgi-scripts;
- conf: directory with the configuration files httpd.conf and mime.types;
- docs: additional documentation of the server including the Developer's API, screenshots, etc;
- errors: directory containing the customizable Server-Side Include pages for HTTP errors such as 404, 500 and 501 errors;
- icons: the default list of icons for the fancy directory listing;
- logs: directory containing the server logs, such as access.log;
- soa-ws: directory for web services components (experimental);
- tmp: directory used to hold temporary files uploaded through PUT requests;
- htdocs: document root of this server (for historical purposes), containing the main web resources to be served.
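The directory check above can be automated. The following is a minimal sketch in Python (not part of v-Octopus itself); the function name missing_directories is hypothetical, and the directory names are simply the ones listed above.

```python
from pathlib import Path

# Directory names copied from the list above.
EXPECTED_DIRS = ["bin", "cgi-bin", "conf", "docs", "errors",
                 "icons", "logs", "soa-ws", "tmp", "htdocs"]


def missing_directories(install_dir):
    """Return the expected directory names that are absent under install_dir."""
    root = Path(install_dir)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]
```

Calling missing_directories with the location where you unzipped the distribution returns an empty list when every directory is present.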
The vOctopus web server is highly configurable through its INSTALLATION_DIR/conf/httpd.conf configuration file. Its configuration constants are very similar to Apache's, and it also includes some new ones. It is therefore very important that this file exists and contains all the settings needed to load vOctopus properly. In case of any problem, error messages are displayed with directions to resolve the situation.
Access Control
To provide better security, the vOctopus web server performs basic access control when running on any *NIX-based system (Mac, Linux, Unix...). When a client requests a resource, the read permission is verified against the user who started the server, using the regular file system permission scheme. If read permission is not granted, the server returns a response with the 403 status.
In addition, the cgi-bin and soa-ws directories are protected by default.
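The read-permission rule can be illustrated with a short Python sketch. It is not the server's own code; status_for_read is a hypothetical name, and returning 404 for a missing path is an assumption added for completeness. os.access checks the permissions of the user running the process, which mirrors "the user who started the server".

```python
import os


def status_for_read(path):
    """Return the HTTP status a read request for path would map to."""
    if not os.path.exists(path):
        return 404  # assumption: missing resources are reported as Not Found
    if not os.access(path, os.R_OK):
        return 403  # the process owner has no read permission
    return 200
```

Note that a process running as root can read almost every file, so the 403 branch is only reached when the server runs as a regular user.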
Changing httpd.conf
The httpd.conf configuration file is the main customization tool offered by the vOctopus web server: the system administrator uses it to give the server the correct environment information. For instance, the web server root directory, the document root and the CGI-bin directories are defined in this file so that the server can serve the resources requested by any Web browser.
Make sure you save a local backup of the original httpd.conf file, and of any additional ones, before making substantial changes!
The following is a list of the main configuration parameters found in the httpd.conf file:
- ServerRoot: the root directory of your server;
- DocumentRoot: the root directory where the server looks for documents to serve;
- Listen: the port on which the web server listens for HTTP requests;
- LogFile: the location of the server's log file;
- ScriptAlias: CGI script alias used on the server;
- Alias: document path alias;
- MimeTypes: the file containing information about known MIME types;
- Directory tag: a directory under the document root that is protected with a username and password. Users accessing a server path that matches this directory are asked to authenticate against the .htpasswd file located at the place indicated in the Directory tag.
http://v-octopus.googlecode.com/svn/trunk/voctopusHttpd/conf/httpd.conf
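To inspect these parameters programmatically, the file can be read with a few lines of Python. This is a sketch under the assumption that httpd.conf uses Apache-style "Name value" lines, "#" comments and <Directory> blocks, as in the snippets on this page; parse_directives is a hypothetical helper, not part of the server.

```python
import shlex


def parse_directives(text):
    """Collect top-level 'Name value...' directives into a dict of lists.

    Lines inside <...> blocks (such as <Directory>) are skipped.
    """
    directives = {}
    inside_block = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("</"):
            inside_block = False
            continue
        if line.startswith("<"):
            inside_block = True
            continue
        if inside_block:
            continue
        # shlex keeps quoted values such as "/path with spaces/" together.
        parts = shlex.split(line)
        directives.setdefault(parts[0], []).append(parts[1:])
    return directives
```

Each directive name maps to a list because directives such as Alias and ScriptAlias may appear several times.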
Enabling authentication for specific directories with <directory>
When a request is made to the vOctopus web server, it goes through the list of Directory entries defined in httpd.conf and tries to match the requested resource against one of the protected directories. The following is a protected directory definition:
<Directory "$VOCTOPUS_SERVER_ROOT/htdocs/cgi-bin-old">
AuthType Digest
AuthName "Selected Users MD5-secured Directories"
AuthDigestFile $VOCTOPUS_SERVER_ROOT/conf/.digest
Require user marcello,yoon,wmac01
</Directory>
If the requested path is the protected directory or a sub-path of it, the server asks the user to authenticate. A request to such a resource generates a response as follows:
HTTP/1.1 401 Unauthorized
Host: localhost:1025
Date: Thu, Apr 03 2008 03:01:08 PDT
Last-Modified: Sat, Mar 01 2008 02:28:19 PST
Server: vOctopus/0.7
WWW-Authenticate: Basic realm="Selected Users Secure Directories"
Connection: close
Transfer-Encoding: chunked
The browser will then prompt the user for the authentication credentials, as shown in the next screenshot.
After providing the username and password, a new request is sent to the server as follows:
GET /cgi-bin-old/ HTTP/1.1
Host: localhost:1025
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.12) Gecko/20080207 Ubuntu/7.10 (gutsy) Firefox/2.0.0.12
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.8,pt-br;q=0.5,es;q=0.3
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
If-Modified-Since: Sat, Mar 01 2008 02:28:19 PST
Cache-Control: max-age=0, max-age=0
Authorization: Basic bWFyY2VsbG86dXRuMjlvYWQ=
The user is authenticated against the .htpasswd file whose path is specified by AuthDigestFile or AuthUserFile (depending on the AuthType used). vOctopus follows the conventions of the Apache Web Server here, so you can refer to Apache's specification.
If the username and password combination does not match the ones defined by the configuration, a 401 response is sent back to the client, as required by the HTTP protocol. Otherwise, the requested document is displayed as expected.
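The two checks described in this section, matching the path against a protected directory and reading the credentials from a Basic Authorization header, can be sketched in Python as follows. This is an illustration only, not the server's implementation; both function names are hypothetical, and Digest authentication is not covered.

```python
import base64


def is_protected(request_path, protected_dir):
    """True when request_path is protected_dir itself or lies below it."""
    base = protected_dir.rstrip("/")
    return request_path == base or request_path.startswith(base + "/")


def parse_basic_authorization(header_value):
    """Split an 'Authorization: Basic ...' value into (username, password).

    Returns None when the value is not a well-formed Basic credential.
    """
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic" or not token:
        return None
    try:
        decoded = base64.b64decode(token.strip(), validate=True).decode("utf-8")
    except (ValueError, UnicodeDecodeError):
        return None
    username, separator, password = decoded.partition(":")
    if not separator:
        return None
    return username, password
```

The returned pair would then be compared with the entries of the password file; a None result or a mismatch corresponds to the 401 response.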
Mapping URLs to the file system
For each request, vOctopus has to decide which file or resource to serve by combining the URL path with the DocumentRoot path, which is specified in the httpd.conf configuration file.
The server decides whether a requested resource exists as follows:
- Search under the DocumentRoot with the path requested by the client;
- In case the resource is not found, the server checks the hash map of Alias entries, which maps the paths specified in the URL to file system directories. For example, consider the following snippet from the httpd.conf file:
...
Alias /research/ "/home/marcello/mypages/academic-pages/"
Alias /professional/ "/home/marcello/mypages/professional-pages/"
...
...
A request to the URL http://localhost:8866/research/cv.html results in a search for the cv.html file inside the directory indicated by the matching Alias, and the mime.types definitions determine how the file is handled. If the file is not found, a 404 error message is displayed.
- On the other hand, if the request is for a directory, the server returns the fancy directory listing for it.
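The lookup order (DocumentRoot first, then the Alias map) can be sketched in Python. This is a simplified illustration, not the server's code: resolve is a hypothetical function, the exists parameter stands in for a real file system check, and the path normalization is an added safety assumption.

```python
import posixpath


def resolve(url_path, document_root, aliases, exists):
    """Map a URL path to a file system path, or return None (a 404).

    aliases: dict of URL prefix -> directory, as in the Alias lines.
    exists:  callable reporting whether a file system path exists.
    """
    # Normalize so that '..' segments cannot escape the configured roots.
    clean = posixpath.normpath("/" + url_path.lstrip("/"))
    if url_path.endswith("/") and clean != "/":
        clean += "/"
    # Step 1: look under the DocumentRoot.
    candidate = posixpath.join(document_root, clean.lstrip("/"))
    if exists(candidate):
        return candidate
    # Step 2: fall back to the Alias map.
    for prefix, directory in aliases.items():
        if clean.startswith(prefix):
            candidate = posixpath.join(directory, clean[len(prefix):])
            if exists(candidate):
                return candidate
    return None
```

With the Alias lines above, a request for /research/cv.html that is not found under the DocumentRoot resolves to /home/marcello/mypages/academic-pages/cv.html when that file exists.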
Verifying the requests log with LogFile
Every time a request is made, the server writes a log entry to the file whose location is given by the LogFile constant in httpd.conf. Each entry contains basic information about the request and the server's response, along with the date and the client's IP address. The following entries show a set of requests to resources on the server; the format follows the conventions defined by the Apache Web Server.
127.0.0.1 - - [Mar/18/2008:19:30:37 -0700] "GET / HTTP/1.1" 200 4634
127.0.0.1 - - [Mar/18/2008:19:33:10 -0700] "GET /images/image1.gif HTTP/1.1" 200 15343
127.0.0.1 - - [Mar/18/2008:19:33:10 -0700] "GET /images/academic.jpg HTTP/1.1" 200 21815
127.0.0.1 - - [Mar/18/2008:19:33:11 -0700] "GET /favicon.ico HTTP/1.1" 404 1415
127.0.0.1 - - [Mar/18/2008:19:33:11 -0700] "GET /script/showcgi.pl HTTP/1.1" 200 1477
127.0.0.1 - - [Mar/18/2008:19:33:35 -0700] "GET /denied.html HTTP/1.1" 403 169
127.0.0.1 - - [Mar/18/2008:19:33:52 -0700] "GET /notfound.html HTTP/1.1" 404 1416
127.0.0.1 - - [Mar/18/2008:19:34:03 -0700] "GET /cgi-bin2/birthday.pl HTTP/1.1" 200 30
127.0.0.1 - - [Mar/18/2008:19:34:13 -0700] "GET /cgi-bin/birthday.pl HTTP/1.1" 200 30
127.0.0.1 - - [Mar/18/2008:19:34:29 -0700] "GET /cgi-bin/ListDirectories.py HTTP/1.1" 500 1543
127.0.0.1 - - [Mar/18/2008:19:38:46 -0700] "GET /favicon.ico HTTP/1.1" 404 1415
127.0.0.1 - - [Mar/18/2008:19:38:47 -0700] "GET /logs/access.log HTTP/1.1" 200 408878
This log file is important for system administrators and webmasters who need to monitor the files being requested by clients on the Internet. For example, by filtering on the response status codes, administrators can decide to create missing resources (favicon.ico is requested by default by web browsers such as Safari, Internet Explorer and Firefox for eye-candy purposes on the URL), or spot failing executions of and bad requests to CGI scripts and Web Services methods.
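Filtering by status code can be done with a small script. The Python sketch below assumes every entry has exactly the shape of the sample lines above; the regular expression and the function names are illustrative, not something shipped with the server.

```python
import re
from collections import Counter

# Matches lines such as:
# 127.0.0.1 - - [Mar/18/2008:19:33:11 -0700] "GET /favicon.ico HTTP/1.1" 404 1415
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<when>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+)$'
)


def count_by_status(lines):
    """Count log entries per HTTP status code; unparsable lines are ignored."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:
            counts[match.group("status")] += 1
    return counts


def paths_with_status(lines, status):
    """Return the requested paths that produced the given status, in order."""
    paths = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and match.group("status") == status:
            paths.append(match.group("path"))
    return paths
```

For example, paths_with_status(open("logs/access.log"), "404") lists the resources that clients requested but the server could not find (the file path here is only an example).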
CGI
To make CGI programs work properly, the server offers the CgiHandler variable, which gives the complete command used to run a CGI script. In this way, any scripting language can be added easily; scripts are executed as a sub-process of the server, provided a ScriptAlias directive is present in httpd.conf. The following is a list of tested scripting languages with this server:
Here is the configuration snippet from the httpd.conf file. To add another handler, register it using the complete path to the language interpreter and the file extension it should execute.
CgiHandler /usr/bin/perl .pl
CgiHandler /usr/bin/python .py
CgiHandler /usr/bin/ruby .rb
Scripts can be configured either by adding them to the default cgi-bin directory or by using ScriptAlias settings. With ScriptAlias, individual scripts or whole directories can be mapped:
ScriptAlias /cgi-bin/ListDirectories.py "$VOCTOPUS_SERVER_ROOT/cgi-bin/src/edu/sfsu/cs/csc867/msales/voctopus/filesystem/ListDirectories.py"
ScriptAlias /cgi-bin/ "$VOCTOPUS_SERVER_ROOT/cgi-bin/"
ScriptAlias /cgi-bin2/ "$VOCTOPUS_SERVER_ROOT/cgi-bin2/"
ScriptAlias /script/ "$VOCTOPUS_SERVER_ROOT/cgi-bin/"
ScriptAlias /sa1/ "$VOCTOPUS_SERVER_ROOT/cgi-bin/"
ScriptAlias /sa2/ "$VOCTOPUS_SERVER_ROOT/cgi-bin2/"
The successful execution of a script generates a 200 status along with the output of its execution. The ScriptAlias constants in the httpd.conf configuration file allow scripts to be reached under different paths.
On the other hand, when the execution of a script fails, the CGI call generates the 500 status code, with the information retrieved from the exception thrown by the CGI sub-process. Furthermore, if the CGI implementation writes its error messages to the error OutputStream, that error message is used in the 500 response.
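The handler lookup and the 200/500 outcome can be sketched in Python. This is an illustration under stated assumptions, not the server's code: run_cgi is a hypothetical name, a non-zero exit code is treated as a failure, and the 30-second timeout is arbitrary.

```python
import os
import subprocess


def run_cgi(script_path, handlers, stdin_data=b""):
    """Run script_path with the interpreter registered for its extension.

    handlers: dict of extension -> interpreter path (the CgiHandler lines).
    Returns (status, body): 200 with stdout, or 500 with the error output.
    """
    extension = os.path.splitext(script_path)[1]
    interpreter = handlers.get(extension)
    if interpreter is None:
        return 500, b"No CgiHandler registered for " + extension.encode()
    try:
        result = subprocess.run([interpreter, script_path], input=stdin_data,
                                capture_output=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired) as error:
        return 500, str(error).encode()
    if result.returncode != 0:
        # Assumption: the script reports failures on its standard error.
        return 500, result.stderr
    return 200, result.stdout
```

A handlers dictionary equivalent to the CgiHandler snippet above would be {".pl": "/usr/bin/perl", ".py": "/usr/bin/python", ".rb": "/usr/bin/ruby"}.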
v-Octopus Server Features
Supported HTTP Request Methods
The server supports the following request methods:
GET Request Method
The regular request for a file. It is illustrated throughout this documentation by the screenshots.
When the request carries the If-Modified-Since cache header, the caching mechanism can return a response with no body.
Request
GET /icons/folder.gif HTTP/1.1
Host: localhost:1025
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.12) Gecko/20080207 Ubuntu/7.10 (gutsy) Firefox/2.0.0.12
Accept: image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.8,pt-br;q=0.5,es;q=0.3
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://localhost:1025/cgi-bin-old/
If-Modified-Since: Sat, Oct 09 2004 15:01:33 PDT
Response
Processing request for /icons/folder.gif
HTTP/1.1 304 Not Modified
POST Request Method
A POST request is usually produced by a form submission, and it can be tested using any CGI script that receives a form submission. The sub-process that executes the CGI for a POST request must be given the variables and values from the request. Consider the following request:
POST /cgi-bin/showcgi.pl HTTP/1.1
Host: localhost:1025
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.12) Gecko/20080207 Ubuntu/7.10 (gutsy) Firefox/2.0.0.12
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.8,pt-br;q=0.5,es;q=0.3
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://localhost:1025/showcgi.html
Content-Type: application/x-www-form-urlencoded
Content-Length: 32
string=thisIsMyValue&checkbox=on
The request parameters must be decoded from the request body. Attachments are fully supported by reading the boundary tokens; see the PUT request example for that. In this request, the variables are "string" and "checkbox". Their values are passed to the sub-process through the process OutputStream. By default, a CGI script that expects POST variables must read them from the InputStream of its own process (its standard input).
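As an illustration of the decoding step, the body of the request above can be parsed with Python's standard library. This sketch only shows the decoding; decode_form is a hypothetical name and says nothing about how the server itself parses the body.

```python
from urllib.parse import parse_qs


def decode_form(body):
    """Decode an application/x-www-form-urlencoded body into a dict of lists."""
    return parse_qs(body, keep_blank_values=True)


body = "string=thisIsMyValue&checkbox=on"
fields = decode_form(body)
# The body is 32 characters long, matching the Content-Length header above.
print(len(body), fields)
```

Each value is a list because a form may submit the same variable name more than once.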
HEAD Request Method
This request method returns only the information about the requested resource, without including the body. The connection to the server is closed if the persistent connection headers are not provided.
HEAD / HTTP/1.1
HTTP/1.1 200 OK
Date: Wed, Mar 12 2008 15:04:00 PDT
Last-Modified: Wed, Feb 20 2008 04:44:28 PST
Expires: Thu, Mar 12 2009 15:04:00 PDT
Server: vOctopus/0.7
Connection closed by foreign host.
PUT Request Method
A PUT request uploads a file to the temporary directory on the system and then transfers it to the file system path that matches the web path. It can be tested using a form submission. For example, consider the request below.
PUT / HTTP/1.1
User-Agent: Jakarta Commons-HttpClient/3.0
Host: localhost:1025
Expect: 100-continue
Content-Length: 109784
Content-Type: multipart/form-data; boundary=Q45i4LjEIwRjtTYrt3EHmLzwNE-YH5gP8
--Q45i4LjEIwRjtTYrt3EHmLzwNE-YH5gP8
Content-Disposition: form-data; name="hs_err_pid14993.log"; filename="hs_err_pid14993.log"
Content-Type: application/octet-stream; charset=ISO-8859-1
Content-Transfer-Encoding: binary
The file in the body of this request, delimited by the boundary token, is uploaded to a temporary directory on the server and then moved to the Web Root directory, since the request targeted the web root. The client then receives a response describing the outcome of the PUT request; if everything works, the response looks like this:
HTTP/1.1 201 Created
Date: Wed, Mar 12 2008 16:24:04 PDT
Location: /
Server: vOctopus/0.7
Content-Length: 4096
Served 127.0.0.1 in 119ms
Finally, this file can be requested as http://localhost:1025/hs_err_pid14993.log
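Reading the boundary token and the uploaded file name from the headers shown in the PUT example can be sketched in Python as follows. The helper names are hypothetical and the parsing is deliberately simplified; it is not the server's multipart implementation.

```python
import re


def boundary_from_content_type(content_type):
    """Extract the boundary token from a multipart Content-Type value."""
    match = re.search(r'boundary="?([^";]+)"?', content_type)
    return match.group(1) if match else None


def filename_from_disposition(disposition):
    """Extract the filename parameter from a Content-Disposition value."""
    match = re.search(r'filename="([^"]*)"', disposition)
    return match.group(1) if match else None


def split_parts(body, boundary):
    """Split a multipart body (bytes) into the raw parts between delimiters."""
    delimiter = b"--" + boundary.encode("ascii")
    parts = []
    for chunk in body.split(delimiter)[1:]:
        if chunk.startswith(b"--"):
            break  # closing delimiter: no more parts
        if chunk.startswith(b"\r\n"):
            chunk = chunk[2:]
        if chunk.endswith(b"\r\n"):
            chunk = chunk[:-2]
        parts.append(chunk)
    return parts
```

Each returned part still contains its own headers, a blank line, and then the file content, which would be written to the temporary directory before being moved.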
Other request methods (CONNECT, DELETE, OPTIONS, TRACE)
If any of the other request methods defined by the HTTP protocol RFC is requested, a 501 response is returned and the connection is automatically closed by the server.
[marcello voctopusHttpd]$ telnet localhost 1025
Trying 127.0.0.1...
Connected to localhost.
DELETE / HTTP/1.1
HTTP/1.1 501 Not Implemented
Date: Wed, Mar 12 2008 14:55:22 PDT
Server: vOctopus/0.7
Connection closed by foreign host.
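The method check can be summarized with a short Python sketch. The set of supported methods comes from this page; the function name and the 400 status for a malformed request line are assumptions made for the example.

```python
# Methods this page documents as supported.
SUPPORTED_METHODS = {"GET", "POST", "HEAD", "PUT"}


def check_request_line(request_line):
    """Return (method, path, status) for a request line.

    status is None when the method is supported, 501 when it is not, and
    400 (an assumption, not documented here) when the line is malformed.
    """
    parts = request_line.split()
    if len(parts) != 3:
        return None, None, 400
    method, path, _protocol = parts
    if method not in SUPPORTED_METHODS:
        return method, path, 501
    return method, path, None
```

For the telnet session above, check_request_line("DELETE / HTTP/1.1") yields the 501 status.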
Caching Mechanism
Conditional requests are supported, and caching can be enabled or disabled in the httpd.conf configuration file. To apply caching to responses that would otherwise be OK (200), the server checks the request headers before deciding which response to send back to the client. The relevant request headers are:
- If-Modified-Since (paired with Last-Modified): the vOctopus server reads the If-Modified-Since header from the request and compares the modification date of the file against the received date. If the file has not been modified, the server returns the 304 status without any response body. If the file has changed, the 200 status is returned with the complete body. The Last-Modified value is what the received date is checked against.
- Pragma: if the Pragma header is set to "no-cache", the server always returns the 200 status.
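The decision described by these two headers can be written as a small Python function. It is a sketch of the rule only: conditional_status is a hypothetical name, and the dates are passed as already-parsed datetime values because the header date format used by the server is not specified here.

```python
from datetime import datetime


def conditional_status(last_modified, if_modified_since=None, pragma=None):
    """Return 304 when the client's copy is still current, otherwise 200."""
    if pragma is not None and pragma.strip().lower() == "no-cache":
        return 200  # the client explicitly asked for a fresh copy
    if if_modified_since is None:
        return 200  # not a conditional request
    # HTTP dates have one-second resolution, so drop the microseconds.
    if last_modified.replace(microsecond=0) <= if_modified_since:
        return 304
    return 200


file_time = datetime(2004, 10, 9, 15, 1, 33)
print(conditional_status(file_time, if_modified_since=file_time))  # prints 304
```

A 304 result means that only the status line and headers are sent, with no response body.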
Memory Cache to Decrease Server Load
As an experiment, I developed an Aspect that intercepts ASCII-based requests and keeps the file in memory, thus reducing the load on the server file system as well as the load caused by I/O operations. The entire description is available on the CachingAspect wiki page.
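The general idea of keeping served files in memory can be illustrated with the Python sketch below. It is not the Aspect described on the CachingAspect page; the class name, the invalidation by modification time and the hit/miss counters are all assumptions made for the example.

```python
import os


class FileMemoryCache:
    """Keep file contents in memory and reload them when the file changes."""

    def __init__(self):
        self._entries = {}  # path -> (modification time, content)
        self.hits = 0
        self.misses = 0

    def read(self, path):
        """Return the content of path, reading from disk only when needed."""
        mtime = os.path.getmtime(path)
        cached = self._entries.get(path)
        if cached is not None and cached[0] == mtime:
            self.hits += 1
            return cached[1]
        self.misses += 1
        with open(path, "rb") as handle:
            content = handle.read()
        self._entries[path] = (mtime, content)
        return content
```

The second read of an unchanged file is served from memory, so only the first request pays the cost of reading the file content from disk.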