Please read the instructions very carefully. For this assignment, you are to write a simple Web server that only serves files. You will be writing two different versions of the server using two different I/O models: (1) select()-based I/O multiplexing, (2) thread-based blocking I/O. This project is to be done in groups of two students. If you have to work in groups of three or alone, please contact us.
To prevent clients from accessing files above the working directory in which the server is run, you should call chroot. Please read the man page to understand how it works.
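As a rough illustration of the chroot call described above, the following sketch confines the process to a directory. The function name enter_jail is hypothetical and not part of any provided code, and the sketch assumes a Linux-like system where chroot() requires root privileges.

```c
#define _DEFAULT_SOURCE   /* exposes chroot() on glibc under strict -std modes */
#include <assert.h>
#include <unistd.h>

/* Confine the process to `dir`, so that "/" refers to `dir` from now on.
 * Returns 0 on success, -1 on failure with errno set by chdir() or chroot().
 * enter_jail is a hypothetical helper name used only for this sketch. */
int enter_jail(const char *dir)
{
    assert(dir != NULL);

    /* Move into the directory first, so the working directory
     * ends up inside the new root. */
    if (chdir(dir) != 0)
        return -1;

    /* Make the current directory the root of the file system view. */
    if (chroot(".") != 0)
        return -1;

    /* Re-anchor the working directory at the new root. */
    return chdir("/");
}
```

A server would typically call enter_jail(".") once at startup, before accepting connections, and log errno if it returns -1.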
In the first model, you will use select() to serve all clients in a single thread. This is the I/O multiplexing model. Select blocks on multiple file descriptors, so there is no need to create additional threads. In the second model, you will use pthreads instead, serving each client with its own thread.
Clearly there are resource limitations for your web server. One is the maximum number of simultaneous clients, which is limited by the maximum number of open socket descriptors in both models, and additionally by the maximum number of threads in the second model. For this assignment, your server will not need to handle more than 1000 simultaneous clients, so you may hard-code this as an upper limit. (HINT: File descriptors for open files also take up resources. To serve many clients, you may need to restrict the number of simultaneous open file descriptors. This can be done by reading part of a file, closing it, then opening and reading the next chunk when ready. Semaphores may be used to keep track of the number of open file descriptors in the thread model.)
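The hint above could be realized roughly as follows for the thread model. This is only a sketch: file_slots, file_slots_init, and read_chunk are hypothetical names, the limit passed to file_slots_init is a tuning value you would choose yourself, and unnamed POSIX semaphores are assumed to be available (as they are on Linux).

```c
#define _POSIX_C_SOURCE 200809L  /* exposes pread() under strict -std modes */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <semaphore.h>
#include <sys/types.h>
#include <unistd.h>

/* Counts how many more files may be open at the same time. */
static sem_t file_slots;

/* Call once at startup, before any client thread is created. */
int file_slots_init(unsigned int max_open)
{
    return sem_init(&file_slots, 0, max_open);
}

/* Read up to `len` bytes of `path` starting at `offset`, holding a file
 * descriptor only for the duration of the call.
 * Returns the number of bytes read (0 at end of file) or -1 with errno set. */
ssize_t read_chunk(const char *path, off_t offset, void *buf, size_t len)
{
    assert(path != NULL && buf != NULL);

    /* Wait for a free slot; retry if a signal interrupts the wait. */
    while (sem_wait(&file_slots) != 0) {
        if (errno != EINTR)
            return -1;
    }

    ssize_t n = -1;
    int saved_errno;
    int fd = open(path, O_RDONLY);
    if (fd >= 0) {
        n = pread(fd, buf, len, offset);
        saved_errno = errno;       /* close() below may overwrite errno */
        close(fd);
    } else {
        saved_errno = errno;
    }

    sem_post(&file_slots);         /* give the slot back in every case */
    if (n < 0)
        errno = saved_errno;
    return n;
}
```

Each client thread would keep its own offset, call read_chunk repeatedly, and send each chunk before asking for the next one. Compile with -pthread.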
An important part of this assignment is error handling. You must be able to handle all errors associated with socket and file operations. The return value of any function that can report an error should be checked, and an error message should be written to the log file when an error value is returned. In addition, you should always output an error number for socket and file operations if one is available. (The error number can be obtained from the errno variable, which becomes available when you include errno.h at the beginning of your C or C++ file. Do not declare errno yourself as an extern int: on current systems it is typically a macro that expands to a per-thread value, which matters for the thread-based server.) For socket operations, you should also print out the client's IP and port if they are available, and for file operations you should print out the file name if it is available. If the error is server-side, such as failed memory allocation, then you should also call exit() to terminate the server. If the error is client-side, for example if the client unexpectedly closes the connection without receiving the entire file, the server should log the error, then close the client connection and free up any state associated with the session.
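To include the client's IP and port in an error message, the address returned by accept() has to be converted to text. A minimal sketch for IPv4 peers follows. The helper name format_client is hypothetical.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>

/* Write "a.b.c.d:port" for an IPv4 peer into `out`.
 * Returns 0 on success, -1 if the address cannot be converted
 * or the result does not fit into `outlen` bytes. */
int format_client(const struct sockaddr_in *addr, char *out, size_t outlen)
{
    assert(addr != NULL && out != NULL);

    char ip[INET_ADDRSTRLEN];
    if (inet_ntop(AF_INET, &addr->sin_addr, ip, sizeof ip) == NULL)
        return -1;

    /* sin_port is stored in network byte order. */
    int n = snprintf(out, outlen, "%s:%u", ip,
                     (unsigned)ntohs(addr->sin_port));
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

The resulting string can be stored with the per-client state at accept time, so it is still available for logging after the socket has failed.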
In addition to the above errors, the server should disconnect clients that are idle (do not send or receive any data) for two minutes or longer and free up state associated with the connection.
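One way to detect idle clients is to record the time of the last successful send or receive for each connection and compare it against the current time. The structure and function below are a sketch with hypothetical names, not part of the provided header.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

#define IDLE_LIMIT_SECONDS 120   /* two minutes, as required above */

/* Hypothetical per-connection record; only the fields needed here. */
struct client_state {
    int    fd;             /* socket descriptor, -1 if the slot is unused */
    time_t last_activity;  /* updated after every successful send or recv */
};

/* True when the connection has been idle for two minutes or longer. */
bool is_idle(const struct client_state *c, time_t now)
{
    assert(c != NULL);
    return c->fd >= 0 &&
           difftime(now, c->last_activity) >= IDLE_LIMIT_SECONDS;
}
```

In the select() model, passing a timeout to select() lets the loop wake up periodically and run this check over all clients. In the thread model, each thread could apply a receive timeout to its own socket instead.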
The Web server should output information to the log in the following format. (Use '##' to precede each log message and put a newline at the end of each log message so we can tell where they begin and end.)
##[unix time stamp] [error number] Accepting a new connection from client [IP address]:[port].
##[unix time stamp] [error number] client [IP address]:[port] prematurely disconnected.
##[unix time stamp] [error number] Unable to open file [filename] requested by client [IP address]:[port].
##[unix time stamp] [error number] client [IP address]:[port] is too slow, closing connection.
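A log record in this format could be produced by a small helper such as the one sketched here. It assumes that the square brackets in the format above mark placeholders rather than literal characters. The name log_event is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Write one log record: "##<unix time> <error number> <message>\n".
 * `err` is the errno value to report (0 when no error applies).
 * Returns 0 on success, -1 if writing to `log` failed. */
int log_event(FILE *log, time_t now, int err, const char *message)
{
    assert(log != NULL && message != NULL);

    /* time_t has no portable printf conversion, so cast explicitly. */
    if (fprintf(log, "##%lld %d %s\n", (long long)now, err, message) < 0)
        return -1;

    /* Flush so the record is not lost if the server exits right after. */
    return fflush(log) == 0 ? 0 : -1;
}
```

A caller would typically pass time(NULL) and a copy of errno taken immediately after the failing call, since later library calls may overwrite errno.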
HttpTest.py: This is the test client for testing your Web server. You may need to modify the server IP address and port information at the beginning of the file to use it. It is written in Python, and its functionality is documented extensively in the comments. Please look over the six test cases and try to understand what they are doing. If you are interested in learning more about Python, a tutorial is available at http://docs.python.org/tut/tut.html. To run the test file, use the following syntax:
python HttpTest.py [Test Number]
Test Number should be the number of the test case that you wish to run.
File1KB.htm: This is a 1KB html file that is used in some of the HttpTest.py test cases.
File500KB.htm: This is a 500KB html file that is used in some of the HttpTest.py test cases.
The checksum and size of these files are stored in HttpTest.py, and it will display an error if it does not receive a whole file or the checksum does not match.
SelectServer.h: This contains useful structure definitions and a function for parsing a client's HTTP request.
DISCLAIMER: Do not use the client test code on a live web server that you are not running yourself. It is meant to stress-test a web server and could severely degrade the performance of a live web server, potentially causing denial of service.
webServer -p [port number] -l [log file name]
The port number denotes the port on which the web server listens for connections. Normally this would be port 80 for a web server, but you will need to run on a port greater than 1024 if you do not have root privileges. All required output should be written to the log file. If the user specifies "stdout" or "stderr" for the log file, then you should print output to stdout or stderr, respectively.
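The command line above could be handled with getopt(), as in the following sketch. The function names parse_args and open_log are hypothetical, and opening a named log file in append mode is an assumption, since the assignment does not say whether an existing log should be kept.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes getopt() under strict -std modes */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Parse "-p <port> -l <log file>". Returns 0 and fills the outputs on
 * success, -1 if an option is missing, unknown, or malformed. */
int parse_args(int argc, char *const argv[], int *port, const char **log_name)
{
    assert(port != NULL && log_name != NULL);

    int opt;
    *port = -1;
    *log_name = NULL;
    optind = 1;   /* start scanning at the first argument */
    opterr = 0;   /* suppress getopt's own error messages */

    while ((opt = getopt(argc, argv, "p:l:")) != -1) {
        switch (opt) {
        case 'p': {
            char *end;
            long v = strtol(optarg, &end, 10);
            if (end == optarg || *end != '\0' || v < 1 || v > 65535)
                return -1;         /* not a valid port number */
            *port = (int)v;
            break;
        }
        case 'l':
            *log_name = optarg;
            break;
        default:
            return -1;             /* unknown option or missing argument */
        }
    }
    return (*port > 0 && *log_name != NULL) ? 0 : -1;
}

/* Map the special names to the standard streams; otherwise open the file
 * for appending (an assumption). Returns NULL with errno set on failure. */
FILE *open_log(const char *name)
{
    assert(name != NULL);
    if (strcmp(name, "stdout") == 0) return stdout;
    if (strcmp(name, "stderr") == 0) return stderr;
    return fopen(name, "a");
}
```

For example, running webServer -p 8080 -l stderr would yield port 8080 and a log stream equal to stderr.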
You are highly encouraged to use the client code provided to test your Web server. Also, please use your own web browser to test your server to see if you can successfully download and display a file.