The Domain Name System (DNS) is a distributed naming system widely used on the Internet [14]. Basically, it translates machine names into their network IP addresses and vice versa. It is based on its own protocol, supported by millions of DNS servers all over the world. DNS servers continuously exchange information on the correspondence between domain names and network addresses in order to provide their users with one service: name resolution.
A classical DNS usage pattern is as follows. The client wishes to communicate with a certain machine, but it knows only the name of the machine. It needs to translate this name into a network address in order to communicate. Therefore, it asks its local DNS server to resolve the former into the latter. The DNS server, in turn, looks for the answer in its local cache and, if an appropriate record is found, returns the cached address to the client. Otherwise, it contacts other DNS servers, turning itself into a client that looks for the machine address.
There are two methods that the DNS server can use to resolve the query. By using the first of them, called iterative, the DNS server asks its peers for either the result or the address of another DNS server likely to know it. In the former case, the answer is cached locally and then passed to the client. Otherwise, the DNS server issues one more query to the DNS server pointed to by the peer, and the whole procedure repeats itself.
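The iterative procedure, together with the local cache lookup described earlier, can be illustrated with a small simulation. This is only a sketch under stated assumptions: the in-memory `SERVERS` table, the server names, and the functions `ask` and `resolve_iteratively` are hypothetical and do not speak the real DNS protocol; the addresses come from ranges reserved for documentation.

```python
from typing import Dict, Optional, Tuple

# Hypothetical in-memory "servers": each knows some addresses ("records")
# and, for some domain suffixes, a peer likely to know more ("referrals").
SERVERS: Dict[str, dict] = {
    "root": {"records": {}, "referrals": {"org": "org-server"}},
    "org-server": {"records": {}, "referrals": {"example.org": "example-auth"}},
    "example-auth": {"records": {"www.example.org": "192.0.2.10"}, "referrals": {}},
}


def ask(server: str, name: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (address, None) if the server knows the name, else (None, referral)."""
    info = SERVERS[server]
    if name in info["records"]:
        return info["records"][name], None
    # Prefer the most specific (longest) matching suffix.
    matches = [s for s in info["referrals"] if name == s or name.endswith("." + s)]
    if matches:
        return None, info["referrals"][max(matches, key=len)]
    return None, None


def resolve_iteratively(name: str, cache: Dict[str, str],
                        start: str = "root", max_steps: int = 10) -> Optional[str]:
    """Follow referrals from peer to peer until an address is found."""
    if name in cache:                 # answer already cached locally
        return cache[name]
    server = start
    for _ in range(max_steps):
        address, referral = ask(server, name)
        if address is not None:
            cache[name] = address     # cache the answer, then pass it to the client
            return address
        if referral is None:          # nobody to ask next: resolution fails
            return None
        server = referral             # repeat the procedure with the referred peer
    return None
```

Calling `resolve_iteratively("www.example.org", {})` walks from `root` through `org-server` to `example-auth`; a second call with the same cache is answered locally.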
The second method is called recursive. With this method, the DNS server asks its peer for nothing but the machine address. It can therefore be said that all queries issued by clients that are not DNS servers themselves are recursive. The peer can react in three ways. First, it can forward the query (again as a recursive one) to another DNS server. The chain of recursive queries built in this way ends at the DNS server that knows the answer. Alternatively, the peer can decide to find the answer using iterative queries, and then return it. Finally, the peer can simply refuse to handle the recursive query, as supporting recursive queries is resource-intensive.
Note that, unless the answer is cached at one of the intermediate DNS servers, the query eventually reaches the DNS server "authoritative" for the service domain. The authoritative server, in turn, can respond with any address it prefers. In particular, it can respond with the address of the service replica that it finds best for the query. The answer is finally returned to the client, which contacts the replica instead of the service machine (see Figure 2.5). As long as the client uses DNS names to reference the service, and not network addresses, this scheme has the effect of client redirection.
DNS-based redirection has several advantages. The most visible one is that it achieves transparency without losing scalability. It is transparent because the clients are obliged to use the addresses provided by the authoritative DNS server, and cannot establish whether these addresses belong to the home machine of the service or to any of its replicas. It is scalable because DNS, as a distributed name resolution service, has proved to be very efficient, even though the number of people using it has increased tremendously with the growth of the Internet.
Another vital advantage of using DNS to redirect clients is that it is a natural way of informing the clients about the service addresses. It is used by many existing network services, and is very likely to be used by those to come as well. Moreover, DNS is supported by a huge infrastructure of millions of DNS servers, capable of caching the answers our redirector generates. Once we make this infrastructure work for us, both efficiency and availability of our redirector considerably increase.
One more important advantage of DNS is that it allows multiple replica addresses to be returned, enabling the client to choose one of them. This feature was not supported by any of the previously discussed redirection mechanisms.
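How a client might exploit a multi-address answer can be sketched as follows. The selection rule (take the first address that a probe reports as reachable) and the names `pick_replica` and `is_reachable` are assumptions made for illustration; the text does not prescribe any particular client-side policy.

```python
from typing import Callable, Iterable, Optional


def pick_replica(addresses: Iterable[str],
                 is_reachable: Callable[[str], bool]) -> Optional[str]:
    """Return the first address the (hypothetical) probe reports as reachable."""
    for address in addresses:
        if is_reachable(address):
            return address
    return None  # no replica in the answer could be contacted


# Example: the answer lists three replicas, and the first one is down.
answer = ["192.0.2.10", "192.0.2.20", "192.0.2.30"]
down = {"192.0.2.10"}
chosen = pick_replica(answer, lambda address: address not in down)
```

In a real client, the list of addresses would typically come from the system resolver, for instance through Python's `socket.getaddrinfo`, which returns one entry per address found for a name.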
The last advantage of DNS-based redirection is its good maintainability. Deployment of the complete redirection mechanism boils down to launching a single modified DNS server, and subsequently delegating a service domain to this server. From this moment on this server is responsible for answering requests for the service address. No other modification of the DNS infrastructure is necessary.
On the other hand, using DNS-based redirection leads to a few difficulties. The first of them is caused by the fact that DNS queries carry no information about the client that triggered the name resolution. All that the service-side DNS server knows is the network address of the DNS server that asks about the service location. Therefore, we have to assume that clients always use a DNS server that is close to them, and approximate a client's location by that of its DNS server [20]. Whether we consider this to be a drawback or not depends on the accuracy we want to achieve. Studies show that 64% of clients are located in the same network as their DNS servers [12]. Thus, as long as we do not need strict per-client redirection, the location of the client's DNS server approximates the client's location well enough.
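The approximation can be made concrete with a minimal sketch of the decision taken by the service-side DNS server: the only input is the address of the querying DNS server, which stands in for the client. The prefix-to-replica table `REPLICA_BY_NETWORK`, the fallback `HOME_ADDRESS`, and the function `choose_replica` are hypothetical; a real redirector would derive proximity from measurements or routing data rather than from a static table.

```python
import ipaddress

# Hypothetical table: network of the querying DNS server -> closest replica.
# All addresses are taken from ranges reserved for documentation.
REPLICA_BY_NETWORK = {
    ipaddress.ip_network("198.51.100.0/24"): "192.0.2.10",
    ipaddress.ip_network("203.0.113.0/24"): "192.0.2.20",
}
HOME_ADDRESS = "192.0.2.1"  # assumed address of the home machine of the service


def choose_replica(resolver_address: str) -> str:
    """Pick a replica using the querying DNS server's address as the client's location."""
    source = ipaddress.ip_address(resolver_address)
    for network, replica in REPLICA_BY_NETWORK.items():
        if source in network:
            return replica
    return HOME_ADDRESS  # unknown network: fall back to the home machine
```

A client that uses a distant DNS server would be mapped to the replica closest to that server, which is exactly the inaccuracy discussed above.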
Another problem, closely related to the first one, is what happens when a recursive query occurs. To process this kind of query, DNS servers create a chain of queries that ends at the service domain DNS server. The actual difficulty is that the latter knows only the address of the DNS server that immediately precedes it in the chain, and not the origin of the chain. The service domain DNS server therefore has no information about the location of the client (see Figure 2.6). Even if one of the intermediate DNS servers decides to switch to iterative mode, the information about the client location remains uncertain.
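The loss of origin information along a recursive chain can be shown with a toy model. The classes `Forwarder` and `AuthoritativeServer`, their `query` method, and all addresses are hypothetical; the model only tracks which source address each server sees.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AuthoritativeServer:
    address: str
    seen_sources: List[str] = field(default_factory=list)

    def query(self, name: str, source: str) -> str:
        # Only the address of the immediately preceding server is visible here.
        self.seen_sources.append(source)
        return "192.0.2.10"  # assumed replica address


@dataclass
class Forwarder:
    address: str
    upstream: object  # another Forwarder or the AuthoritativeServer

    def query(self, name: str, source: str) -> str:
        # The incoming source is dropped: the forwarded recursive query
        # carries this server's own address instead.
        return self.upstream.query(name, source=self.address)


authoritative = AuthoritativeServer("192.0.2.53")
second_hop = Forwarder("203.0.113.53", upstream=authoritative)
first_hop = Forwarder("198.51.100.53", upstream=second_hop)

# A client at 198.51.100.7 asks the first DNS server in the chain.
reply = first_hop.query("www.example.org", source="198.51.100.7")
```

After this exchange, `authoritative.seen_sources` contains only the address of `second_hop`; neither the client nor the first server in the chain is visible to the authoritative server.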
What helps here is the policy suggested in the DNS protocol specification [14]. According to this policy, the chain of recursive queries should never cross the border of an administrative domain, be it a small school or a large Internet Service Provider. DNS servers located in different administrative domains are strongly recommended to exchange information using iterative queries only. This applies both to the DNS servers that send queries outside their administrative domains and to the ones that receive them. The latter should refuse to handle recursive queries coming from a domain managed by another party (and it is very likely that they would do so anyway, as processing recursive queries requires more computational resources). Note that an administrative domain can cover any number of network domains: the only constraint is that they are all managed by a single institution.
Assuming that the policy is always respected and that every institution cares about having fast network connections between all machines falling under its administration, even in the case of recursive queries the redirector can still approximate the location of the client with the address of the DNS server issuing requests.
Yet another complication can be caused by the caching of answers in DNS. To avoid contacting the service domain DNS server too often, and to resolve repeated queries faster, each answer can be assigned a TTL (Time To Live) value. The TTL specifies the maximum time for which the answer can be stored in the network and considered valid. Thus, as long as a DNS server has such an answer in its cache, it can respond to a client immediately without bothering the service domain DNS server that previously produced the answer. Since regular domain names seldom change, DNS servers usually use large TTL values (the suggested value is 3 days, and the maximum is one week [5,15]). The problem is that all responses of the redirector are time-dependent and can become outdated quickly. If we cannot control their lifetime, we can end up in a situation in which clients are redirected to replicas that are no longer optimal, or that no longer exist. A fundamental issue for our approach is therefore how to choose TTL values. Large values favor caching, and therefore efficiency. Small values, however, favor accuracy.
Fortunately, the problem of TTL selection is simpler than initially expected. According to recent research, caching efficiency is mostly stable for TTL values greater than 10 minutes [11]. Since we can expect replica lifetimes to be at least of the order of hours, choosing TTL values of 10 minutes appears to provide a reasonable trade-off.
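The effect of the TTL on a caching DNS server can be sketched with a small cache whose entries expire. The class `TtlCache`, its methods, and the injectable `clock` parameter are hypothetical; only the 10-minute value is taken from the discussion above.

```python
import time

REDIRECTOR_TTL_SECONDS = 10 * 60  # the 10-minute trade-off discussed above


class TtlCache:
    """Toy answer cache: an entry is served only until its TTL has elapsed."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (address, expiry time)

    def put(self, name, address, ttl_seconds=REDIRECTOR_TTL_SECONDS):
        self._entries[name] = (address, self._clock() + ttl_seconds)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, expires_at = entry
        if self._clock() >= expires_at:
            # Expired: the service domain DNS server must be asked again.
            del self._entries[name]
            return None
        return address
```

With a 10-minute TTL, a replica that disappears can be handed out by intermediate caches for at most another 10 minutes, which is short compared with replica lifetimes of the order of hours.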
Another (relatively minor) problem is that URLs can contain the TCP port number on which the server is expecting requests. Since the client software will open TCP connections to the port number found in the URL, all replicas of the site have to be accessible through the same port. In particular, if the main server serves requests on several ports, all its replicas should serve requests on all of them as well. This limitation may introduce some problems with keeping the configuration of all the replicas consistent. In comparison to keeping the content consistent, however, it is a rather small complication.
The last potential problem is that DNS cannot distinguish between different services located on the same machine. What helps in the case of DNS, however, is that different services hosted by the same machine can be referred to by different DNS names (normally resolved to the same IP address). In this way, if we need to replicate services separately, we can distribute the data of each service independently. Obviously, this is still less flexible than HTTP-based redirection, where we can use a separate placement strategy for each document.
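Per-service replication by name can be sketched as a lookup keyed on the queried DNS name rather than on the machine. The table `REPLICAS_BY_NAME`, the names in it, and the function `addresses_for` are hypothetical examples, not part of any existing DNS server.

```python
# Hypothetical per-service replica sets; both names belong to services
# hosted on the same home machine (assumed address 192.0.2.1).
HOME_ADDRESS = "192.0.2.1"
REPLICAS_BY_NAME = {
    "www.example.org": ["192.0.2.10", "192.0.2.20"],  # widely replicated service
    "ftp.example.org": ["192.0.2.30"],                # replicated separately
}


def addresses_for(name: str) -> list:
    """Return the replica set of the service behind a DNS name."""
    # DNS names are case-insensitive and may carry a trailing dot.
    key = name.lower().rstrip(".")
    return REPLICAS_BY_NAME.get(key, [HOME_ADDRESS])
```

The granularity remains one replica set per name, so documents served under the same name cannot be placed independently.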