
DNS/UDP Support

The situation is much worse in the case of UDP. In essence, Apache does not allow a module to wait for incoming UDP datagrams. Although APR provides some basic routines for UDP-based communication, the entire request-processing subsystem implicitly assumes that all data arrive through TCP connections, and therefore accesses the data via ``read'' and ``write'' calls. Unfortunately, these calls are not suited to UDP, which requires ``recvfrom'' and ``sendto,'' respectively, so that each datagram can be matched with the address of its peer. To solve this problem, we have to extend the original server.
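As a minimal sketch of the difference, the following POSIX C fragment (plain sockets, not Apache or APR code) answers a single datagram using ``recvfrom'' and ``sendto.'' The helper names udp_bind_loopback and udp_echo_once are hypothetical and chosen for illustration only.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create a UDP socket bound to 127.0.0.1 on an
 * ephemeral port. Stores the chosen port (host byte order) in *port_out.
 * Returns the descriptor, or -1 on error. */
int udp_bind_loopback(unsigned short *port_out)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                      /* let the kernel pick a port */

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }
    *port_out = ntohs(addr.sin_port);
    return fd;
}

/* Hypothetical helper: receive one datagram and send it back to its sender.
 * recvfrom reports the peer address, which sendto needs for the reply;
 * plain read/write carry no such address. Returns bytes sent, or -1. */
ssize_t udp_echo_once(int fd)
{
    char buf[512];
    struct sockaddr_in peer;
    socklen_t peer_len = sizeof(peer);

    ssize_t n = recvfrom(fd, buf, sizeof(buf), 0,
                         (struct sockaddr *)&peer, &peer_len);
    if (n < 0)
        return -1;
    return sendto(fd, buf, (size_t)n, 0, (struct sockaddr *)&peer, peer_len);
}
```

The point of the sketch is that every datagram arrives together with its sender's address, which is exactly the information a stream-oriented ``read'' call has no place for.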

First, we need to force Apache to react to incoming UDP datagrams. We can exploit the list of listeners by adding our own UDP socket to it, thus instructing Apache to consider the socket while polling for network events. The listener structure for the socket is created in one of the directive-related configuration hooks, but the socket itself is initialized in the ``post_config'' hook, discussed above. The reason for splitting the work into two phases is that Apache examines the list of listeners after parsing all the configuration directives, but before invoking the ``post_config'' hook. Since Apache initializes some other internal data structures during the examination, adding sockets to the list after the examination does not work. Interestingly, sockets do not have to be configured to pass the examination: it is enough that the appropriate listener structures can be found on the list. By initializing our UDP socket in the ``post_config'' hook we remain consistent with the behavior of Apache, which also creates listener structures in the directive-related configuration hooks, but initializes them after the configuration phase has finished.

Unfortunately, adding the socket alone is not enough. The problem is that Apache expects sockets to have set up a TCP connection. Therefore, if Apache passed our UDP socket to the request-processing subsystem, everything would break down, as the routines used to retrieve and send data via TCP connections differ from those used for communicating via UDP.

Recall from Section 3.1.6 that, after isolating a socket to service, the server first calls an accept function, which can be customized on a per-socket basis, and then passes the accepted socket to the request-processing subsystem. It is tempting to implement the whole DNS server in a customized accept function, and then report some error to avoid calling the request-processing subsystem. There are a few problems, however. Firstly, recall that concurrent calls to the accept function are serialized. If we decided to process incoming UDP datagrams as part of an accept function, we would lose all the benefits of running a concurrent server. Secondly, the original server offers no way of gracefully breaking the accept-then-process sequence: either it succeeds completely, or a critical error is signaled. Without getting into further details, we would have to change the semantics of the accept function. The problem is that it is called by the MPM, and for this reason we would have to change every MPM that could ever be used.

What we propose is modifying the Apache core to associate the socket with an additional ``processing function,'' used to process the accepted socket instead of the request-processing subsystem. A pointer to this function can be added to the listener structure.

Then, we modify the very beginning of the request-processing subsystem as follows. It first checks whether a processing function has been provided for the just-accepted socket, and if so, calls it instead of proceeding with its usual operation. The benefit of having it done here is that there is only one request-processing subsystem: we need only one modification, unlike the case of MPMs.
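The idea can be sketched with a toy model in C. None of the names below come from Apache: toy_listener, toy_process_fn and toy_process_connection are hypothetical stand-ins for the listener structure, the added function pointer, and the entry point of the request-processing subsystem.

```c
#include <stddef.h>

/* Toy stand-ins for Apache's structures; all names are hypothetical. */
typedef struct toy_listener toy_listener;
typedef int (*toy_process_fn)(toy_listener *lr, int sock);

struct toy_listener {
    int sock;                 /* the listening socket */
    toy_process_fn process;   /* NULL: use the ordinary request processing */
    int datagrams_served;     /* bookkeeping for the example only */
};

/* Stand-in for the ordinary TCP request processing. */
static int toy_default_processing(int sock)
{
    (void)sock;
    return 1;
}

/* Example of a custom processing function for a UDP listener. */
int toy_udp_process(toy_listener *lr, int sock)
{
    (void)sock;
    lr->datagrams_served++;
    return 2;
}

/* Entry point of the toy request-processing subsystem: check for a custom
 * processing function first, and fall through to the usual path otherwise. */
int toy_process_connection(toy_listener *lr, int accepted_sock)
{
    if (lr != NULL && lr->process != NULL)
        return lr->process(lr, accepted_sock);
    return toy_default_processing(accepted_sock);
}
```

Listeners that leave the pointer unset behave exactly as before, which is why a single check at the entry point suffices in this model.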

The problem that arises here is where exactly the request-processing subsystem should look for the customized processing function. Since it only knows about the socket descriptor (and not the entire listener structure), it has to be informed about the processing function in a different way. Our solution exploits the connection resource pool. APR allows each pool to be treated as a dictionary-like data structure. Using a special APR call, we can store a pointer to the listener structure in the connection resource pool under a certain key. This is done in the customized UDP accept function. Later on, when the pool is passed to the request-processing subsystem, the latter can retrieve the pointer from the pool using the same key. In this way it can examine the content of the listener structure. In particular, it can check whether a custom processing function has been defined for the socket that is about to be processed.
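The text does not name the APR call; APR's apr_pool_userdata_set and apr_pool_userdata_get offer this kind of keyed storage, and we assume here that they are what is meant. To keep the example free of any APR dependency, the sketch below models the mechanism with a small hypothetical toy_pool type rather than a real APR pool.

```c
#include <stddef.h>
#include <string.h>

#define TOY_POOL_SLOTS 8

/* Hypothetical toy pool: a fixed-size key/value table, not an APR pool. */
typedef struct {
    const char *keys[TOY_POOL_SLOTS];   /* NULL marks a free slot */
    void *values[TOY_POOL_SLOTS];
} toy_pool;

void toy_pool_init(toy_pool *p)
{
    int i;
    for (i = 0; i < TOY_POOL_SLOTS; i++) {
        p->keys[i] = NULL;
        p->values[i] = NULL;
    }
}

/* Store data under key, overwriting an existing entry with the same key.
 * The key string is not copied, so it must outlive the entry.
 * Returns 0 on success, -1 if the table is full. */
int toy_pool_set(toy_pool *p, const char *key, void *data)
{
    int i, free_slot = -1;
    for (i = 0; i < TOY_POOL_SLOTS; i++) {
        if (p->keys[i] != NULL && strcmp(p->keys[i], key) == 0) {
            p->values[i] = data;
            return 0;
        }
        if (p->keys[i] == NULL && free_slot < 0)
            free_slot = i;
    }
    if (free_slot < 0)
        return -1;
    p->keys[free_slot] = key;
    p->values[free_slot] = data;
    return 0;
}

/* Look up key; returns the stored pointer, or NULL if the key is absent. */
void *toy_pool_get(const toy_pool *p, const char *key)
{
    int i;
    for (i = 0; i < TOY_POOL_SLOTS; i++) {
        if (p->keys[i] != NULL && strcmp(p->keys[i], key) == 0)
            return p->values[i];
    }
    return NULL;
}
```

In this model the accept function would call toy_pool_set with a pointer to the listener structure, and the request-processing subsystem would call toy_pool_get with the same key. The two sides need to share only the key string, not any knowledge of each other.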

Another problem we face is caused by the various ways MPMs deal with accepted sockets. Some of them can try to perform some socket-specific operations on the socket returned by the accept function. These operations include ``setsockopt,'' or even ``close,'' if the server is overloaded and cannot service the socket. Therefore, our accept function must return a valid TCP socket, even though in this particular accept-then-process sequence we are only going to use UDP. For this reason, we create a brand new TCP socket in the accept function, and return it to the calling MPM. The socket is destroyed at the beginning of the processing function, as its only purpose is to fool the MPM.
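A minimal POSIX sketch of such a placeholder socket follows. The function names are hypothetical, and TCP_NODELAY is used only as an example of a socket option an MPM might set; the text does not say which options the MPMs actually touch.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create an unconnected TCP socket whose only job is
 * to be a valid descriptor for the caller. Returns -1 on error. */
int make_placeholder_tcp_socket(void)
{
    return socket(AF_INET, SOCK_STREAM, 0);
}

/* The kind of TCP-specific call an MPM might make on an accepted socket
 * (an assumed example). Returns 0 on success. */
int mpm_style_tweak(int fd)
{
    int on = 1;
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on));
}

/* What the processing function does first: discard the placeholder. */
int discard_placeholder(int fd)
{
    return close(fd);
}
```

Because the placeholder is a real TCP socket, both the option call and the ``close'' succeed, even though no connection was ever established on it.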

Figure 3.5: Overtaking UDP datagrams in Apache

Now that we have two separate accept and process functions, called sequentially each time the UDP socket reports the arrival of new data, we can use them as follows. The accept function reads all the datagrams that can be found in the socket buffer. In this way we prevent the socket buffer from overflowing, which could happen if the accept function retrieved only a single datagram each time. Then we register the list of datagrams in the connection pool under a special key, just as in the case of the listener structure.
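Draining the socket buffer can be sketched in plain C as follows. This is an illustration under assumptions rather than the actual implementation: it uses non-blocking ``recvfrom'' calls with the MSG_DONTWAIT flag (available on Linux and the BSDs), allocates with malloc where Apache code would presumably allocate from the pool, and the dgram type and function names are hypothetical.

```c
#include <netinet/in.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

#define MAX_DGRAM 512   /* classic size limit of a DNS message over UDP */

/* Hypothetical list node holding one datagram and its sender. */
typedef struct dgram {
    struct dgram *next;
    struct sockaddr_in peer;
    socklen_t peer_len;
    size_t len;
    unsigned char data[MAX_DGRAM];
} dgram;

/* Read every datagram currently queued on fd without blocking.
 * Returns the list in arrival order and stores its length in *count. */
dgram *drain_datagrams(int fd, size_t *count)
{
    dgram *head = NULL;
    dgram **tail = &head;

    *count = 0;
    for (;;) {
        ssize_t n;
        dgram *d = malloc(sizeof(*d));
        if (d == NULL)
            break;
        memset(&d->peer, 0, sizeof(d->peer));
        d->next = NULL;
        d->peer_len = sizeof(d->peer);

        n = recvfrom(fd, d->data, sizeof(d->data), MSG_DONTWAIT,
                     (struct sockaddr *)&d->peer, &d->peer_len);
        if (n < 0) {        /* buffer empty (EAGAIN) or a real error: stop */
            free(d);
            break;
        }
        d->len = (size_t)n;
        *tail = d;          /* append, preserving arrival order */
        tail = &d->next;
        (*count)++;
    }
    return head;
}

/* Release a list returned by drain_datagrams. */
void free_datagrams(dgram *head)
{
    while (head != NULL) {
        dgram *next = head->next;
        free(head);
        head = next;
    }
}
```

Each node keeps the sender's address next to the payload, so the processing function can later reply without consulting the socket again.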

The processing function reads the list of datagrams registered inside the connection pool. The datagrams are serviced one by one, and all the responses are sent using the same UDP socket. To avoid possible collisions between multiple ``send'' operations issued by different threads, the operations are serialized using a special lock. Since many calls to the processing function can run in parallel, we still have a concurrent server.
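The serialization of sends can be sketched as below. The text does not say which kind of lock is used; this example substitutes a POSIX thread mutex, which covers threads within one process only, and locked_sendto is a hypothetical name.

```c
#include <pthread.h>
#include <sys/socket.h>
#include <sys/types.h>

/* One lock guarding the shared UDP socket (assumed: a pthread mutex). */
static pthread_mutex_t udp_send_lock = PTHREAD_MUTEX_INITIALIZER;

/* Send one response while holding the lock, so that sends issued by
 * different threads never overlap. Returns bytes sent, or -1 on error. */
ssize_t locked_sendto(int fd, const void *buf, size_t len,
                      const struct sockaddr *peer, socklen_t peer_len)
{
    ssize_t n;

    pthread_mutex_lock(&udp_send_lock);
    n = sendto(fd, buf, len, 0, peer, peer_len);
    pthread_mutex_unlock(&udp_send_lock);
    return n;
}
```

Only the send itself is inside the critical section; parsing the query and building the response happen outside the lock, which is what keeps the server concurrent.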

The steps below show how a typical UDP datagram is processed. They are presented in the same way as the previously discussed ``Apache loop,'' and are illustrated in Figure 3.5:

  1. create an empty connection resource pool;
  2. lock the global ``accept_lock;''
  3. poll on the set of sockets extracted from the list of listeners;
  4. take any socket reporting the arrival of new data;
  5. perform accept on the socket, by calling the special accept function associated with the socket;
    1. retrieve any datagrams that reside in the socket buffer;
    2. register the list of datagrams inside the connection pool;
    3. register the socket-related listener structure inside the connection pool;
    4. create a fake TCP socket and return it;
  6. unlock the global ``accept_lock;''
  7. process the socket, by passing it to the request-processing subsystem;
    1. extract the listener structure from the connection pool;
    2. unregister the listener structure from the connection pool;
    3. call the processing function found inside the listener structure;
      1. destroy the fake TCP socket;
      2. extract the list of datagrams from the connection pool;
      3. unregister the list of datagrams from the connection pool;
      4. service all the datagrams;
  8. destroy the connection resource pool.

The reason for unregistering the data (both the listener structure and the list of datagrams) from the connection pool is that some MPMs reuse pools instead of destroying them and creating new ones. If we left the data registered inside the pool, Apache could try to access it again the next time the pool is used. This would be incorrect, as the data associated with the pool should not be accessible after the processing function has terminated.
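The effect of unregistering can be illustrated with a deliberately small toy model, in which the pool is reduced to the two entries discussed above. All names are hypothetical and no APR code is involved.

```c
#include <stddef.h>

/* Toy model of a reusable connection pool holding the two entries. */
typedef struct {
    void *listener;    /* registered by the accept function */
    void *datagrams;   /* registered by the accept function */
} toy_conn_pool;

/* Extract and unregister in one step: return the entry, clear the slot. */
void *toy_take(void **slot)
{
    void *value = *slot;
    *slot = NULL;
    return value;
}

/* Toy processing step. Returns 1 if custom processing ran, or 0 if the
 * pool held no listener and the ordinary path should be taken. */
int toy_process(toy_conn_pool *pool)
{
    void *lr = toy_take(&pool->listener);
    if (lr == NULL)
        return 0;
    (void)toy_take(&pool->datagrams);   /* the datagrams would be serviced here */
    return 1;
}
```

If a pool is reused without a new registration, the second call to toy_process finds both slots empty and takes the ordinary path. Had the slots not been cleared, it would have picked up the stale listener and datagram list from the previous round.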


root 2002-08-27