The HyperText Transfer Protocol (HTTP) is an application-level protocol used mainly by Web servers and Web browsers [9]. It works as follows. Each document that can be transferred using HTTP is identified by a Uniform Resource Locator (URL). The URL contains the name of the Web server responsible for providing the document and usually also a server-internal path name. When a Web browser wishes to retrieve the document, it sends a request containing the URL to the server named in the URL. The server inspects the request and returns the corresponding document. The document itself can contain references to other documents, which can be followed in the same way if necessary. What is important for us is that, instead of sending the actual document, the Web server can respond with a single URL pointing to another location of the document (for example, on a replica of the server). The server can also rewrite the references inside the document so that they point to the replica. Both techniques perform redirection, since the Web browser starts communicating with the replica and no longer contacts the main server (see Figure 2.3).
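The first technique, answering with a URL instead of the document, is typically realized in HTTP with a 3xx status code and a `Location` header. The following minimal sketch shows how such a response could be built. The host names and the function `redirect_to_replica` are hypothetical and serve only as an illustration; replica selection is not modelled here.

```python
from http import HTTPStatus
from urllib.parse import urlsplit, urlunsplit

# Hypothetical replica host, used only for illustration.
REPLICA_HOST = "replica1.example.org"


def redirect_to_replica(requested_url, replica_host=REPLICA_HOST):
    """Return (status, headers) asking the browser to fetch the same path from a replica."""
    parts = urlsplit(requested_url)
    # Keep scheme, path, query and fragment; replace only the server name.
    location = urlunsplit(
        (parts.scheme, replica_host, parts.path, parts.query, parts.fragment)
    )
    return HTTPStatus.FOUND, {"Location": location}


if __name__ == "__main__":
    status, headers = redirect_to_replica("http://www.example.org/docs/index.html")
    print(int(status), headers["Location"])
```

On receiving such a response, the browser issues a new request to the URL in the `Location` header, so the document itself is transferred by the replica rather than by the main server.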
The main advantage of HTTP-based redirection is that it is extremely easy to deploy. Basically, all that we need is the ability to serve dynamically generated Web pages. Apart from creating the actual content, the generator can also determine an optimal replica and rewrite all internal references so that they point to this replica. Moreover, each reference can be treated separately, which allows each document to be replicated on a different set of replicas. This feature can be crucial for devising an efficient distribution strategy for hosted documents. It is also important that we do not need administrative privileges to run the redirector, nor do we have to run any additional network services. All these features make HTTP-based redirection portable and attractive for massive use.
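The reference rewriting performed by such a page generator can be sketched as follows. This is only an illustration under stated assumptions: the host names, the `REPLICAS` placement table, and the functions `choose_replica` and `rewrite_references` are hypothetical, and the selection policy is a placeholder for whatever metric defines the optimal replica.

```python
import re
from urllib.parse import urlsplit, urlunsplit

MAIN_HOST = "www.example.org"  # hypothetical main server

# Hypothetical placement table: each document has its own set of replicas.
REPLICAS = {
    "/img/logo.png": ["replica1.example.org", "replica2.example.org"],
    "/docs/paper.pdf": ["replica3.example.org"],
}

# Matches double-quoted href and src attributes; a real generator
# would more likely rewrite references while building the page.
REF_RE = re.compile(r'(href|src)="([^"]+)"')


def choose_replica(candidates, client_ip):
    """Placeholder policy: always take the first candidate.

    A real redirector would rank the candidates for this client.
    """
    return candidates[0]


def rewrite_references(html, client_ip):
    """Point each reference to the main server at a replica of that document."""

    def replace(match):
        attr, url = match.group(1), match.group(2)
        parts = urlsplit(url)
        candidates = REPLICAS.get(parts.path)
        # Leave relative references, other sites and unreplicated documents untouched.
        if parts.netloc != MAIN_HOST or not candidates:
            return match.group(0)
        host = choose_replica(candidates, client_ip)
        new_url = urlunsplit(
            (parts.scheme, host, parts.path, parts.query, parts.fragment)
        )
        return f'{attr}="{new_url}"'

    return REF_RE.sub(replace, html)


if __name__ == "__main__":
    page = (
        '<a href="http://www.example.org/docs/paper.pdf">paper</a>'
        '<img src="http://www.example.org/img/logo.png">'
    )
    print(rewrite_references(page, "192.0.2.7"))
```

Because every reference is looked up separately, the two documents in the example end up on different replicas, which mirrors the per-document placement described above.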
HTTP-based redirection has also proved to be efficient. Although the initial document must always be retrieved from the main server, all further communication takes place only between the client and the selected replica, which is likely to offer optimal performance to the client.
As for scalability, the situation is somewhat worse. Because every client must make its first contact with the same single machine, that machine can become a bottleneck as the number of clients increases. However, as long as the initial document can be generated and delivered quickly, the problem is not that serious. The easiest way to ensure that this first transaction is handled quickly is simply to keep the initial document short.
The main drawback of HTTP-based redirection is that it lacks transparency. By receiving a URL that explicitly points to a certain replica, the browser becomes aware of being switched between different machines. This leads to the ``bound reference problem'' described in Chapter 1. Since transparency constitutes our most important non-functional requirement, its absence makes HTTP-based redirection unattractive.
Moreover, with this kind of redirection mechanism we cannot provide the client with more than one replica. Since each rewritten reference has to point to exactly one replica, the client is given no choice, and no alternative if that replica turns out to be unreachable.